Cyborg Lisps

Attention Conservation Notice

A look at embedding lisp in other languages and some reasons why you might want to do so. Might be nice if you've already drunk the lisp-flavoured kool-aid or have built an interpreter. Otherwise, it might just be a bunch of techniques and languages you haven't used (yet).

Cyborgs

There is a classic meme™ that floats around wherever lisp nerds congregate:

[meme: how Common Lisp users view users of other programming languages]

One of the things that stands out to me is Clojure's picture: it's of the Borg.

The Borg are an alien group that appear as recurring antagonists in the Star Trek fictional universe. The Borg are cybernetic organisms (cyborgs) linked in a hive mind called "the Collective". The Borg co-opt the technology and knowledge of other alien species to the Collective through the process of "assimilation": forcibly transforming individual beings into "drones" by injecting nanoprobes into their bodies and surgically augmenting them with cybernetic components. The Borg's ultimate goal is "achieving perfection".

I think this is an apt description of the language. It is a cyborg that consists of an organic host (JVM) and a bionic attachment (lisp). This combination allows for more power than either on their own.

Some background on Clojure

Clojure is a dialect of lisp that is relatively new. It came out in 2007, which feels like forever ago, but is quite modern by lisp standards. Here is a snippet of code:

(ns monty-hall-problem) ; shuffle is available in clojure.core

(defn play-game [staying]
  (let [doors (shuffle [:goat :goat :car])
        choice (rand-int 3)
        [a b] (filter #(not= choice %) (range 3))
        alternative (if (= :goat (nth doors a)) b a)]
    (= :car (nth doors (if staying choice alternative)))))

(defn simulate [staying times]
  (let [wins (reduce (fn [counter _]
                       (if (play-game staying) (inc counter) counter))
                     0
                     (range times))]
    (str "wins " wins " times out of " times)))

(simulate true 1000)

Looks pretty lispy!

What makes Clojure stand out is that it is hosted on the JVM. This means that not only does Clojure have its own set of packages, but it can also access packages that are available in other JVM languages like Java, Scala, Groovy, Kotlin, etc. Furthermore, it means that a lot of the hard work that goes into making a language accessible to developers (build tools, basic packages, a reputation for stability) is already baked in. Here is an example of Java interop (taken from Wikipedia):

; create an instance of `java.util.ArrayList` and
; increment its elements with `clojure.core/map`
(def al (doto (java.util.ArrayList.)
          (.add 1)
          (.add 2)
          (.add 3)))

(map inc al)
;; => (2 3 4)

; show a message dialog using Java Swing
(javax.swing.JOptionPane/showMessageDialog
  nil
  "Hello, World!")

This is a very powerful idea, as it means that a language designer can get going much faster by leveraging an entire community's resources and building upon them. In a sense, Clojure is an addition to Java, a biomechatronic enhancement. As a hosted language, Clojure didn't even need to stop at the JVM. It may be implemented on the JVM, but the language itself exists beyond it. There are Clojure implementations in many other environments, including Python and the Ruby VM, and it even compiles to JavaScript (ClojureScript).

Why Bother Embedding Lisp?

Why not just use the original language? Or, if you must, why not make your own domain-specific language with only the features you need?

A Long, Consistent History

Lisp has been around for a long time. From Wikipedia:

[image: timeline of Lisp dialects]

Pragmatically, this means that there are a lot of resources floating around. The core of the language is very stable over time, and maintainers focus on backwards compatibility, so resources from 10+ years back are still valid and can be used to learn the language.

Furthermore, a lot of lisp languages are standardized, providing consistency between implementations. There are generally two big branches of lisp (three if you count cyborgs as a distinct group!): Common Lisp and Scheme. Both have standards (the Common Lisp standard & the R7RS Scheme standard), and standardized features tend to be consistent and implemented in descendant languages.

As an example, if you learned a dialect of Scheme 20 years ago and then started learning Racket today, you would be pleased to find that all the language features you remember using are still available and widely used. You can even load SRFI packages (Scheme Requests for Implementation): portable language constructs and libraries that are carried across the various implementations of Scheme.

As an example of lisp's consistency over time, here is Maclisp code (a dialect dating from the mid-1960s):

(defun handle-cli-msg (ignore)
  (iota ((stream '((cla)) '(cla block)))
    (do ((i 0 (1+ i))) ((= i 8)) (tyi stream))
    (do ((c (tyi stream -1) (tyi stream -1)))
        ((or (= c -1) (= c 3)))
      (format t "~&~3,'0O ~@:C~%" c c)))) ;print out chars seen

(setq cli-message 'handle-cli-msg)

(sstatus cli t)

; --------

(defun eval-cli-msg (ignore) ;alternate handler
  (iota ((stream '((cla)) '(block cla)))
    (do ((i 0 (1+ i))) ((= i 8)) (tyi stream))
    (do ((f (read stream nil) (read stream nil)))
        ((null f))
      (errset (eval f) nil)))) ;Quietly evaluate forms...

;; Assumes the other lisp will have EVAL-CLI-MSG as value of CLI-MESSAGE

(defun eval-in-other-lisp (uname jname s-expression)
  (iota ((stream `((cli) ,uname ,jname) '(out ascii block)))
    (print s-expression stream)))

and Common Lisp code (a language still in use today):

(defun get-answer (library)
  (gethash 'answer library 42))

(defun the-answer-1 (library)
  (format nil "The answer is ~A" (get-answer library)))
;;;; Returns "The answer is 42" if ANSWER not present in LIBRARY

(defun the-answer-2 (library)
  (multiple-value-bind (answer sure-p)
      (get-answer library)
    (if (not sure-p)
        "I don't know"
        (format nil "The answer is ~A" answer))))
;;;; Returns "I don't know" if ANSWER not present in LIBRARY

They both look pretty similar! They both use s-expressions, they both have similar scoping rules, and they share a lot of the same syntax (or lack thereof; s-expressions are essentially abstract syntax trees).

When you factor in standardization, resources and backwards-compatibility you have a programming language that rewards the effort of learning it by still being relevant throughout an entire career. It truly becomes a tool that grows with you, not one that you grow out of. This largely reduces learning new programming languages to understanding underlying semantics rather than syntax.

This connects back to the idea of embedded cyborg lisps because one of the major advantages of embedding lisp is that if you know lisp, you know lisp wherever it is located. You don't need to learn a bunch of new syntax; most of the work is done for you. The semantics of the host language are abstracted away, and you can start hacking much quicker with much less pain. Furthermore, if someone starts a new project in 10 years with a lisp veneer, you are already set to start tinkering with it.

Macros and Language Building

You may be wondering:

"If lisp doesn't break backward-compatibility, then how do they manage to add new features? Surely it must get more difficult over time to add new constructs without breaking old constructs…"

The way that it does this is with macros. A macro (or syntax-transformer) is one of the more characteristic parts of a lisp language. If you embed a lisp on some platform, you almost always get macros along with the implementation. With macros, you can build out the language however you'd like.

As an example of a macro, consider the following:

In general, lisps use prefix notation. This means that to do something like addition, you'd type the following:

(+ 100 200)  ;;=> 300

Suppose that instead you wanted to have the more familiar infix notation. You can implement this change in your language by writing a macro that transforms the syntax:

(defmacro infix
  [inf]
  (list (second inf) (first inf) (last inf)))

(infix (100 + 200)) ;;=> 300
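
Because a macro is just a function from code to code, you can ask the language to show you the rewrite without running it. In Clojure, the built-in macroexpand-1 does exactly that (a quick check, assuming the infix macro above has been defined):

;; show the form the macro produces, without evaluating it
(macroexpand-1 '(infix (100 + 200)))
;;=> (+ 100 200)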

This is a simple macro and it isn't very exciting on its own, but the idea behind it is. All it does is take the expression (100 + 200) as a list containing three elements: 100, +, and 200. Then it rearranges them into the list (+ 100 200). In this form, the language can evaluate the expression and return 300. The exciting idea behind macros is the following:

If you don't like something about a language, just change it.

You’ll come to see yourself as a designer of languages rather than only a user of languages, as a person who chooses the rules by which languages are put together, rather than only a follower of rules that other people have chosen.

Metacircular Evaluation

Let's take the simple macro idea further. What if you could take a very small core of a language, and then use macros to fill out the rest of the language? Not only is it possible, but it is common. When designing a language, it makes sense to start with a few small axiomatic primitives and build everything up. It is a rite of passage in the Scheme world to implement a metacircular evaluator, a programming language that implements itself.

Alan Kay once stated the following about lisp's metacircular nature:

Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.

The code he was talking about is the following:

evalquote[fn;x] = apply[fn;x;NIL]

apply[fn;x;a] =
     [atom[fn] -> [eq[fn;CAR] -> caar[x];
                  eq[fn;CDR] -> cdar[x];
                  eq[fn;CONS] -> cons[car[x];cadr[x]];
                  eq[fn;ATOM] -> atom[car[x]];
                  eq[fn;EQ] -> eq[car[x];cadr[x]];
                  T -> apply[eval[fn;a];x;a]];
     eq[car[fn];LAMBDA] -> eval[caddr[fn];pairlis[cadr[fn];x;a]];
     eq[car[fn];LABEL] -> apply[caddr[fn];x;cons[cons[cadr[fn];
                               caddr[fn]];a]]]

eval[e;a] = [atom[e] -> cdr[assoc[e;a]];
     atom[car[e]] ->
      [eq[car[e];QUOTE] -> cadr[e];
       eq[car[e];COND] -> evcon[cdr[e];a];
       T -> apply[car[e];evlis[cdr[e];a];a]];
     T -> apply[car[e];evlis[cdr[e];a];a]]

evcon[c;a] = [eval[caar[c];a] -> eval[cadar[c];a];
      T -> evcon[cdr[c];a]]

evlis[m;a] = [null[m] -> NIL;
      T -> cons[eval[car[m];a];evlis[cdr[m];a]]]

This is a specification written in M-expressions, but it can be translated to s-expressions. What it does is specify the full-blown lisp programming language in terms of a few simple primitives (car, cdr, cons, eq, atom).
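
To make the same shape concrete in a modern dialect, here is a minimal sketch of a toy evaluator written in Clojure. It is not the Lisp 1.5 definition (it skips LABEL, association-list environments, and plenty of other details), and the name tiny-eval is my own, but the structure is the same: a handful of cases dispatching on the head of the expression.

;; a toy evaluator: symbols look up in an environment map,
;; a few special forms are handled directly, and everything
;; else is treated as a function application
(defn tiny-eval [expr env]
  (cond
    (symbol? expr)    (get env expr)
    (not (seq? expr)) expr ; numbers, strings, etc. evaluate to themselves
    :else
    (let [[op & args] expr]
      (case op
        quote (first args)
        if    (let [[test then else] args]
                (if (tiny-eval test env)
                  (tiny-eval then env)
                  (tiny-eval else env)))
        fn    (let [[params body] args]
                (fn [& vals]
                  (tiny-eval body (merge env (zipmap params vals)))))
        ;; default case: evaluate the operator and operands, then apply
        (apply (tiny-eval op env)
               (map #(tiny-eval % env) args))))))

;; host-language functions can be seeded into the environment
(tiny-eval '((fn [x] (if (> x 1) (* x 2) 0)) 21)
           {'> >, '* *})
;;=> 42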

For more explanation, see this post.

Reducing Boilerplate

Macros help reduce boilerplate code. Here are some examples where you can use an embedded lisp to remove some of it.

Example 1

You are coding in a low-level language and want to be able to abstract away some of the granular details to focus on the bigger picture. You can embed a lisp, make some macros that abstract over the pattern, and get on with your actual goals.

An interesting example of this is the Flux Harmonic project from David Wilson. He creates an embedded lisp in C and uses it to write out complex parts of the graphics layer in his program here.

Example 2

You find yourself writing a lot of boilerplate code that can't be replaced by traditional functions because:

  • the differences between each instance of boilerplate involve special forms: things like if, else, function arguments, assignment, and logical primitives
  • your language doesn't have a nice way of delaying evaluation when passing expressions as arguments

Here is an example of this situation being handled in Racket:

;; a macro that creates a binary search function
(define-syntax (bsearch stx)
  (define xs (syntax->list stx))
  (datum->syntax stx
                 `(λ (ls target)
                    (define (ptr-narrow left right)
                      (let* ([mid (quotient (+ left right) 2)]
                             [mid-val (list-ref ls mid)])
                          (cond [(= mid-val target) ,(cadr xs)]
                                [(> left right) ,(caddr xs)]
                                [(> mid-val target) (ptr-narrow left (sub1 mid))]
                                [else (ptr-narrow (add1 mid) right)])))
                    (ptr-narrow 0 (sub1 (length ls))))))

(define (lref ls ind)
  (if (= ind -1) '() (list-ref ls ind)))

;; a binary search that relies on an if statement
(define (check-bin ls target)
  ((bsearch mid (if (> mid-val target) (sub1 mid) mid)) ls target))

;; a binary search that returns a boolean
(define (search ls target)
  (if (empty? ls) #f ((bsearch #t #f) ls target)))

(define (search-matrix matrix target)
  (search (lref matrix (check-bin (map first matrix) target)) target))

Notice that this macro allowed me to avoid all this boilerplate when defining each binary search function (everything aside from SOMETHING-*-HERE):

(define (ptr-narrow left right)
  (let* ([mid (quotient (+ left right) 2)]
         [mid-val (list-ref ls mid)])
    (cond [(= mid-val target) `SOMETHING-HERE]
          [(> left right) `SOMETHING-ELSE-HERE]
          [(> mid-val target) (ptr-narrow left (sub1 mid))]
          [else (ptr-narrow (add1 mid) right)])))

Example 3

You find yourself consistently doing the same pattern over and over. For example, a common macro found in functional programming languages is the threading macro:

;; deeply nested function compositions
(f (g (h i)))

;; replaced with a thread
(~> i
    h
    g
    f)

This macro unwinds the function composition to make the resulting set of operations more readable by showing the order in which they happen. This way the user can read the code in a linear fashion rather than inside out.
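
Clojure ships this pattern in its core library as the -> and ->> macros. A quick example using the thread-last macro ->> to pipe a collection through a few transformations:

;; each result is passed as the last argument of the next form
(->> (range 10)
     (filter odd?)
     (map #(* % %))
     (reduce +))
;;=> 165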

Another example is from R

# we don't need to reference the underlying dataset or treat the column as a string
mtcars %>%
    filter(qsec > 20)

# normally we would have to do this:
mtcars %>%
    filter(mtcars$qsec > 20)

In any language that supports macros, you can essentially wash away the warts of the language by replacing them with well-thought-out macros. As a result, the final codebase feels sparse. You end up with essentially pure data and business logic left over.

Bilbo

You may be thinking:

Great, you can put a programming language in your programming language. But doesn't this mean we replace our current problem of writing a program in language x with another problem of implementing an entire programming language y to write the program?

  • you, maybe

This sounds like the world's most intensive yak-shave. Thankfully, many languages already have well-established lisps built on top of their runtimes.

Examples of Embedded Lisps

Here are some examples of embedded cyborg lisps. I've also added their logos since some of them have cute mascots (Hy is a cuttlefish!). The titles are in the format language :: host.

Clojure :: JVM

[Clojure logo]

As covered above, Clojure is a lisp that was initially built on the JVM. It has since expanded its domain: it compiles to JavaScript (ClojureScript), runs as a fast-starting native scripting runtime (Babashka), and more.
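
As a taste of how little changes between hosts, ClojureScript interop mirrors the Java interop shown earlier, with the js namespace standing in for the JavaScript host. A minimal sketch, assuming a browser environment ("app" is a hypothetical element id):

;; call into the JavaScript host the same way Clojure calls into Java
(js/console.log "Hello from ClojureScript")

;; method calls use the same dot syntax as the Java examples above
(.getElementById js/document "app")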

Hy :: Python

[Hy logo]

The Hy language is a lisp embedded in Python. It transforms lisp code into Python abstract syntax tree objects, giving the user full access to the Python ecosystem. Similar in spirit to this blog post is the Hy documentation's Why Hy?

Hy (named after the insect order Hymenoptera, since Paul Tagliamonte was studying swarm behavior when he created the language) is a multi-paradigm general-purpose programming language in the Lisp family. It’s implemented as a kind of alternative syntax for Python. Compared to Python, Hy offers a variety of extra features, generalizations, and syntactic simplifications, as would be expected of a Lisp. Compared to other Lisps, Hy provides direct access to Python’s built-ins and third-party Python libraries, while allowing you to freely mix imperative, functional, and object-oriented styles of programming.

Fennel :: Lua

[Fennel logo]

Fennel is a neat little lisp that uses Lua as its host language. This is interesting in an Inception-like way, because Lua is often used as an embedded language itself: you could end up running an embedded lisp in an embedded Lua in C. If you're interested in game development, there is a large Lua community that you could build upon.

Full Lua compatibility: Easily call any Lua function or library from Fennel and vice-versa.

Zero overhead: Compiled code should be just as efficient as hand-written Lua.

Compile-time macros: Ship compiled code with no runtime dependency on Fennel.

Embeddable: Fennel is a one-file library as well as an executable. Embed it in other programs to support runtime extensibility and interactive development.

Janet :: C

[Janet logo]

Janet is a lisp that is embedded in C. You can use it stand-alone (and have it essentially generate C99 code for you), or you can embed it in C with just a single C source file and a single header. Out of the box, it has a bunch of features:

  • Minimal setup - one binary and you are good to go!
  • Builtin support for threads, networking, and an event loop
  • First class closures
  • Garbage collection
  • First class green threads (continuations)
  • Mutable and immutable arrays (array/tuple)
  • Mutable and immutable hashtables (table/struct)
  • Mutable and immutable strings (buffer/string)
  • Macros
  • Tail call optimization
  • Direct interop with C via abstract types and C functions
  • Dynamically load C libraries
  • Lexical scoping
  • REPL and interactive debugger
  • Parsing Expression Grammars built into the core library
  • 500+ functions and macros in the core library
  • Export your projects to standalone executables with a companion build tool, jpm
  • Add to a project with just janet.c and janet.h

Lisp-Flavoured Erlang :: Erlang

[LFE logo]

LFE is a language written by one of the co-creators of Erlang. Lisp-flavoured Erlang is, like it says on the tin, an implementation of Lisp on the Erlang VM and its Open Telecom Platform (OTP) libraries. Essentially, you can think of it as a language that interacts with a very special operating system that excels at massive-scale lightweight concurrency. LFE can be used to make highly distributed, fault-tolerant, real-time, highly available, non-stop systems. That's quite a load of strong adjectives!



