Bicameral, Not Homoiconic

Bicameral, Not Homoiconic 🔗

If you spend enough time reading internet discussions of programming
languages, you’ll learn that Lispy languages have a special property:
they are homoiconic. This property is vested with mystical
powers that both enrich Lisps and debase its competitors.

I have programmed in, and built, Lisps since the late 1980s. My blog
is called “parenthetically speaking”. And yet I’m here to tell you
that this term is mostly nonsense. However, there is something
special—something far less mystical, but also very powerful and
useful—about Lisps. It’s worth understanding what that is and
transporting its essence to other languages.

1 (Weak) Homoiconicity 🔗

What, supposedly, is homoiconicity? You will hear things like: the
property that “a property of some programming languages that allows
programs to be represented as data within the language”, or with
“represented” substituted by “manipulated”, or more simply as
“code as data”.

Let’s tease these apart a bit. Consider the following Python code:

This is clearly a program. But can I represent this as a datum
within the language? Sure:

is a perfectly good representation. (Well, it may be good but
it’s not great; we’ll return to that!) Can I manipulate
it? Sure, I can concatenate strings to create it:

will produce that program, and

will take it apart into constituent pieces.

Does that make Python homoiconic?

Of course, there’s nothing special about Python here. We can use
JavaScript to represent and manipulate JavaScript programs, C to do
the same to C programs, and so on. Essentially, any programming
language with a string datatype seems to be homoiconic. Heck, we
didn’t even need strings: we could just as well have represented the
programs as numbers (e.g., using
Gödel numbering).

One of the traits of a good definition is that it be non-trivial: it
must capture some things but it must also exclude some things. It’s
not clear that this notion of homoiconicity excludes much of anything.

2 (Strong) Homoiconicity 🔗

But there’s a reasonable objection to what we wrote above. All that
we’ve done is written, combined, and taken apart strings. But
strings are not necessarily programs; strings are just strings,
a form of data. Data are data, but programs—entities
that we can run—seem to be a separate thing.

How do we turn data into programs? We do need some language support
for that. We need something that will take some agreed-upon data
representation of a program and treat it like a program,
i.e., do whatever the program would have done. Typically, this is a
function called eval: it evaluates the datum, performing the
effects described in the datum, just as if it were a program. (Note
that eval really treats “data as code”, not “code as
data”.)

So maybe eval is the real characteristic of homoiconic
languages? Maybe. It’s certainly true that eval is a
distinctive feature, and some languages have it while others don’t:
that is, it non-trivially distinguishes between languages. But it’s
worth noting:

Many languages, including Python and JavaScript, have an
eval. If they’re all homoiconic, then clearly this isn’t a
particularly Lispy trait.
eval interacts poorly with its lexical environment,
thereby making it hard to even program with effectively.
We showed that JavaScript’s
eval is not one but four operations and there are eight
contexts that determine which of the four to use. This kind of
complexity is overwhelming.
The complexity might be worth it if eval were a good
idea, but it’s often a bad idea in programs! It makes code statically
invisible, making every other aspect of program management—static
analysis, compilation, security checking, and more—much, much harder
(or, for some important and useful kinds of analysis, impossible).

This seems like a disappointing way to end: homoiconic languages are
ones that have a complex, excessively-powerful feature that we
probably shouldn’t use but is anyway found in many languages that are
not Lispy at all…which certainly doesn’t seem to be a good way to
describe what makes Lispy languages distinctive.

But this just shows why we shouldn’t be talking about homoiconicity at
all. Let’s talk about what’s actually interesting instead.

3 The Parsing Pipeline 🔗

Let’s talk briefly about the classical parsing pipeline. For decades,
we’ve been taught to think of parsing a program as having two phases:
tokenization (sometimes colloquially called “lexing”) followed by
parsing (not colloquially called “yaccing”), and more. What do
they do?

A program is a list, or stream, of characters. A language
implementation needs to first understand what a program is saying
before it can make the program do what it says. That is, it has to
understand the meaning of a program. This process of determining
meaning is usually done in several stages, which make sense once we
understand them in terms of their expressiveness.

A typical dataflow in the front-end of a language implementation looks
like this:

What does this mean?

Just as you don’t read this article at the level of individual
letters, given a program like hello = 1, we don’t want to deal
with it at the level of each letter: h, e, l, …
either. Instead, we’d like to start by thinking of it as having three
tokens: hello, =, and 1. Note that in the
process, we can ignore whether or not the = was surrounded by
spaces (which Python does not require, but some languages
do).We’ll return to the elided spaces later! This is the job
of the scanner.

So the input list or stream of characters gets transformed, by the
scanner, into an output list or stream of tokens. But as a reader, you
still don’t really read the program at the level of tokens; instead,
you assign meaning. That is, you presumably read the above as
“assign hello to 1”.The actual meaning of = in
Python is much more messy. This is the job of the parser: to
create a data structure representing the high-level semantics of the
language. In this case, it might create an abstract syntax tree of a
“variable assignment”-type node with two children, one the variable being
assigned (here, hello) and one the value expression (here, just
1). If the actual expression were more complicated—say the
program were hello = 1 + 2—then there would be a more
complicated tree representing addition with two children.This
being Python, of course the actual meaning of + is also messy.

The exact mechanisms by which this form of parsing happens is somewhat
outside scope here. For many languages, grammars are ambiguous and
top-down parsing strategies—the ones you might first think
of—don’t work, and parsing must instead happen bottom-up. This
process is complicated, algorithmically fascinating, and often bad at
giving good errors. There are very interesting trade-offs here that we
often don’t pay enough attention to.

Not every language implementation follows the above pipeline. People have,
for instance, experimented with
parsing
without using a scanner. Both conceptually and practically, it is
still useful to have this separation:

The scanner usually sits at the lowest level of the Chomsky
hierarchy: it is regular. It is therefore quick and efficient and its
rules are easy to reason about in depth.
The parser, in contrast, is usually context-free or worse,
making it a much more complicated beast.

We should want to take advantage of complexity theory in building our
tools! This also represents a good separation-of-concerns, whereby we
could imagine building multiple parsers—for instance, trading off
speed or space (or both) for better error messages—while reusing the
same scanner.

In what follows, therefore, I will take the scanner as a
given. Thus, the traditional pipeline then has just one stage:
the parser. I therefore call this the unicameral pipeline.

4 The Bicameral Analogy 🔗

Many countries have what is called a bicameral legislature:
“bi” meaning two, “camera” meaning room. That is, there are two
houses of parliament. India, for instance, has the Lok Sabha and Rajya
Sabha; England has the House of Commons and House of Lords; the USA
has the House of Representatives and Senate. In all cases, there is a
“lower” house and “upper” house.

My goal here is not to endorse various class-related associations or
to delve into the actual practices, but rather to employ this as a
useful, rough analogy. A theory of bicamerality is this. The lower
house is a bit more rowdy and boisterous. It filters out some of the
more absurd legislations, but it allows through many that are perhaps
not entirely wise. The upper house is more calm and deliberative; in
legislatures, its members are either appointed or serve longer terms,
so they do not need to bend to every populist will. Therefore, the
upper house usefully serves as a filter.

So: we have two houses. The lower house filters out dreck, but lets
through several bills, some of which are unwise. The upper house
passes judgment on these bills, allowing only the ones it considers
wise. Enough political science, back to computer science.

5 Bicameral Syntax 🔗

As I’ve alluded above, having just a parser gives you a unicameral
syntax. What Lispy languages give you is a bicameral
syntax. But they are not unique in this; other notations, such as XML
and JSON, are also bicameral. To make my point, I will use these other
notations.

First of all, all these notations have to get through a scanner. In
XML, you can’t write without a closing >; that’s just not even a valid opening tag. In JSON, you can’t write “hi without a closing “; that’s just not a valid string. We can think of these as basic tokenization failures. So let’s assume our input passes the scanner and we get a list or stream of tokens.

Even once you’ve written proper tokens, there are still things you
cannot do in XML or JSON. For instance, the following full document is
not legal in XML:

because we didn’t “close” with a ; this
isn’t legal either,

because we didn’t preserve the nesting structure (we should close
bar befor we close foo). The same is true in JSON: we can’t
open a structure without closing it (or vice versa); similarly, we
have to close braces and brackets in the opposite order of how we
opened them.

Observe that we can make these judgments without knowing what
the datum represents. There’s a reason I used names like foo and
bar instead of something meaningful. That’s because it doesn’t
matter: these XML and JSON data are wrong at a basic level. We
say that they fail to be well-formed. This is our lower house.

However, there may be more complicated rules at play. It may be that
a bar really should not reside within a foo; it may be that
every baz requires one or more quuxes. If you’ve ever worked
with an XML or JSON format, you know that there are all kinds of rules
like this (that may or may not have been documented well!) that you
need to follow. For instance, when you read the documentation of a Web
API, it might tell you that a certain kind of request needs to provide
certain fields, but must also not provide certain other
keys. All these are called rules of validity. They build on top
of—indeed, leave unstated—the well-formedness rules, and focus on
what wasn’t checked there. This is the upper house.

This leads to a new pipeline, one where some tool performs an
intermediate check for well-formedness, and another for validity. The
last stage—the one that checks validity and, having confirmed it,
produces terms tagged with their “parts of speech”—is still the
parser. But now there’s one more tool in the middle. I don’t believe
there’s a standard name for it, but there should be; following Lisp
tradition, I call it the reader:

Scaner –> Reader –> Parser

Now, as written, it’s not clear what the output of the reader is. It
could be that the reader just checks the tokens, and then
passes on the same token list or stream to the parser, just as in the
previous pipeline. But this would be a loss! The reader has the
ability to construct an extremely useful intermediate data
structure. The main property the reader looks for is that
“things match up and nest properly”. In other words, the reader is
confirming that the input is a tree. Thus, instead of passing
on a dumb, flat list of tokens to the parser, it constructs trees.

This greatly simplifies the job of the parser for two reasons. At a
more basic level, some of the checking that the parser previously had
to do is already done. More interestingly, tokens are very low-level;
trees are high-level. It’s much easier to write a recursive function
that checks for validity and produces abstract syntax trees when
already given well-formed trees as an input!For this reason,
PLAI calls this the “Primus Inter
Parsers”, a joke that seems lost on most.

Again, complexity theory is a useful guide, and what we have done is
steadily ratchet up the complexity:

The scanner is still regular, with all the attendant benefits
and weaknesses.
The reader is context-free. It does not need to do more
than that.
The parser is now freed of basic context-free checks, and can
focus on other context-free and, in particular,
context-sensitive checks.

And once again, we can exploit this separation of concerns to swap out
other implementations if we need to.

This, then, is a bicameral syntax. The reader is the “lower
house”; it rejects basic mistakes, but still lets through some
invalid things. But it also presents them wrapped up in a cleaner,
simpler structure to work with. The parser is the “upper house”; it
receives this clean structure and can look for deeper flaws in it,
only allowing those things that pass its more exacting scrutiny.

The advantages to a bicameral syntax are many:

We get a greater separation of concerns, making it easier to
both understand and to swap out components.
We get to more gradually walk up the complexity hierarchy.
The reader sits at a key intermediate level, turning tokens into
trees, thereby greatly simplifying the parser’s task while itself
being quite easy to write.

This simplicity of (correct!) implementation matters! It doesn’t
matter as much when you have only one implementation of a
language. However, a lot of our traffic in computing is best
represented as trees: they are much more powerful than strings, but
also simple enough to write and read. It is therefore no surprise that
bicameral syntaxes like JSON have become ubiquitous. Every
language now needs a reader for them, and these readers have to
implement the same language, so simplicity of specification and of
implementation are a virtue. In turn, many systems need to separately
validate a JSON input; again, the simplicity of writing this kind of
(tree-consuming) parser is a big win.

This additional level also has other tooling advantages. It’s not only
language implementations that need to support syntax; so do
editors. It’s a lot easier to support matching, indentation, coloring,
and so on at the well-formedness level. And it’s easy to indent
trees! It’s also easy to traverse them: there are meaningful
generic tree-level operations (“go to the opening”, “go to
the closing”, “go up a level”, “go down a level”, etc.) that can
be applied to all languages with the same well-formedness
characteristic (e.g., JSON). And again, we have lots of editors, and
they each need to support the same syntaxes; bicameral languages, by
providing the intermediate notion of well-formedness, let tools hit
the trifecta of: correct, useful, and relatively easy.

These advantages are offset by one drawback: some people just don’t
like them. It feels constraining to some to always write
programs in terms of trees, rather than more free-form syntax. Even in
this bicameral space, there are trade-offs: XML requires you to repeat
tags at closing, which improves error-checking and catches errors in
refactoring, but makes authoring painful; Lisps don’t require tag
repetition but, traditionally, use only one kind of parenthesis,
making structure harder to see; JSON mixes delimiters while avoiding
closing tags, which seems to be some kind of sweet-spot (because it
still provides some checking against editing errors). Still, what
people are willing to embrace for writing data seems to irk them when
writing programs, leading to the long-standing hatred for Lispy
syntaxes.

6 Back to Lisps 🔗

We started with Lisp, so let’s go back there. What is Lisp? Lisp is a
feeling, an emotion, a sentiment; Lisp is a vibe; Lisp is the dew on
morning grass, it’s the scent of pine wafting on a breeze, it’s the
sound of a cricket ball on a bat, it’s the…oh, wait, where was
I. Sorry.

Okay, seriously. Note that I keep writing “Lispy languages”: both
the “y” and “[language]s” should have stood out by now. What do I
mean by this?

I’m not referring to a particular language: Common Lisp, say. I’m
referring to a family of languages, which includes Common Lisp but
also Scheme and Racket and Clojure and many others. This family is
bound by some cultural antecedents, but the individual languages can
be wildly different. What they share is a philosophy of
syntax. And now that we’ve fleshed out that philosophy, if we are
being generous, we could allow that any bicameral language is
at its essence “Lispy”.

The power of this philosophy is that the language platform can
implement its bicameral syntax once, and well, and lots of languages
can inherit it. We noted above that there are many (data) languages
built atop JSON, each of which has its own validity rules. We can
likewise build many data and programming languages atop any bicameral
syntax; they will share some familial traits but differ quite a bit in
the details.

The bicamerality of Lisps, however, leads to some
misconceptions:

It does not dictate a particular implementation strategy. For
instance, the argument in Lisp sometimes goes, “code is data because
code is represented using s-expressions, which are also Lisp data”. A
full tutorial of what these terms means is beyond what I want to get
into here. What I will just say is that there are Lispy languages
where this is not true at all.

The one I know best is Racket, because I originally built some of
these parts of it. In Racket, code is not represented using
s-expressions. This is because “code” requires many more things: it
needs, for instance, to record the source locations so that it can use
it to report errors;Remember those whitespaces we said we could
ignore? We can’t if we want to report errors!
it needs hygiene information to avoid inadvertent
capture; and more. Therefore, Racket has a notion of
syntax object
(which, yes, is a datum) that captures this much richer notion of
“program source”. Syntax is a separate type that happens to parallel
(and enrich) s-expressions and can be interconverted with them, but it
is not an s-expression. So, while there is a large family of functions
that operate on (say) lists, and s-expressions include lists, none of
those functions will apply to syntax.
People will sometimes say that the read primitive
“parses”. It does not: it reads. It “parses” inasmuch as it
confirms that the input is well-formed, but it is not the parser of
convention—one that determines validity according to
context-sensitive rules, and identifies “parts of speech”—so it is
false to say that Lisps ship with a parser.

To make this concrete, here is a well-formed Lispy term that
read has no problem with: (lambda 1). That is, however,
a syntax error in most Lisps: a determination made by the
parser, not the reader. Of course, nothing prevents us from creating a
new language where that term has some meaning. That language’s parser
would be responsible for making sense out of the pieces.
People sometimes believe you can only have macros if you have a
bicameral syntax. This isn’t true: all you need is a way to represent
code, because macros transform code to code. But as more languages
have realized the benefits of having macros, some have added
procedural macros (an idea that dates back decades in Lisp
lore). Bicameral syntaxes just provide a gateway to gloriously rich
macro systems, because we’ve learned that the well-formedness layer is
an especially good layer atop which to build macros.

7 What About Other Languages?🔗

I want to end with some forward-looking comments.

One of my frequent sources of amusement—and frustration—is to
search for people asking “How do I build a programming language?”
and reading the advice they get. These conversations invariably get
mired in syntax issues. But by the time the novice gets through
building all the knick-knacks of syntax and learning to build parsers,
they’ve usually exhausted their complexity, cognitive, and time
budget. The resulting language either never gets done or is a semantic
mess.

I would like to suggest another path forward: start with a bicameral
syntax. Quickly get to the point where you can think about
semantics. What does your language do? How does one
reason about programs? How does one run them? Do all the hard and
interesting and important parts first.

But, you argue, “Now I have a bicameral syntax! Nobody will want to
program in it!” And that may be true. But I want you to consider the
following perspective:

Syntax is a view.

One of the enduring lessons we’ve learned from software design is the
model-view separation: that is, there are often many useful ways to
view the same thing, some of which we haven’t even dreamt up yet, so
it’s wise to separate an object from how we view it (and coordinate
the different views). This principle seems to virtually never be
applied to programs, but it should! We should view the abstract
syntax as the model—the “one true program”—and every
syntax as a view on it.

So what you should really do is define an abstract syntax, and layer a
light bicameral concrete syntax on top of it. Then build all the other
parts. And then, think about what views you want to provide. Some
people may be happy with the bicameral syntax. Others may want a more
traditional infix syntax. Some may want braces, some may want
significant whitespace. Some users may even want to use blocks—the
ultimate bicameral syntax. You can (and should) view syntaxes as the fun
desserts that you get to design and deploy once the hard work is
done.This is not to say syntax is not hard; it’s actually
harder than most people think, because it requires a delicate
interplay between mathematics and human factors.
Both
F*dging up a Racket and, in much more depth,
Beautiful Racket
illustrate this principle practically and beautifully.

This perspective also gives you a useful artifact: a bicameral syntax
that is a very nice target for programs that need to generate programs
in your language. This is no longer a new idea, so you don’t have to
feel radical: formats like SMT-LIB and WebAssembly text format are
s-expressions for a reason. Our Forge tool supports both an
syntax based on Alloy and one in s-expressions, the former
for (most) people to use, the latter a target to tools that use Forge
as a back-end.

In short, I hope this inspires a new vision of syntax
design. Bicamerality is the best intermediate language we’ve come up
with; it’s driven by both theoretical elegance and practical
experience. For some, the bicameral syntax is also a lovely source
language, but it should be just one of many views onto the core
abstract syntax. Tools will become more reusable, and at the same
time, the syntax wars can end. We can find peace, joy, and
happiness. It is not a coincidence that a pair of parentheses looks
like an embrace.

Bicameral, Not Homoiconic🔗

1 (Weak) Homoiconicity🔗

2 (Strong) Homoiconicity🔗

3 The Parsing Pipeline🔗

4 The Bicameral Analogy🔗

5 Bicameral Syntax🔗

6 Back to Lisps🔗

7 What About Other Languages?🔗

Leave a Reply

Related News

The one change that worked: I set my phone to ‘do not disturb’ three years ago – and have never looked back | Health & wellbeing

Javier Milei Backtracks on $4.4B Memecoin After ‘Insiders’ Pocket $87M

Google defends scrapping AI pledges and DEI goals in all-staff meeting | US news

Critics say Google rules put profits over privacy