The Problem
We have tokens. We need a tree.
// We have this:
[LeftParen, Symbol("+"), Integer(1), Integer(2), RightParen]
// We need this:
CallExpression("+", [Literal(Integer(1)), Literal(Integer(2))])
Why a tree? Because evaluation is recursive. To evaluate (+ (* 2 3) 4)
:
- Evaluate the function:
+
- Evaluate each argument:
(* 2 3)
and4
- Apply the function to the results
But (* 2 3)
is itself a function call, so we recurse. The tree structure mirrors this evaluation order - parent nodes can’t be evaluated until their children are.
Most languages require complex parsing. Lisp doesn’t. Its syntax is already a tree.
The Type System
First, we convert tokens into AST nodes. The key types:
pub enum Tokens {
Bounds(TokenBounds), // (, ), [, ], {, }
Literal(Literal), // 42, "hello", true
Symbol(String), // +, defn, my-var
Key(String), // :name, :age
}
pub enum Form {
Root, // Top level container
Literal(Literal), // Numbers, strings, bools
CallExpression(String), // (function-name ...)
Symbol(String), // Variable references
List, // [1 2 3]
Record, // {:key value}
Key(String), // Record keys
}
The magic happens with the From
trait:
impl From<Tokens> for Form {
fn from(value: Tokens) -> Self {
match value {
Tokens::Bounds(_) => unreachable!(),
Tokens::Literal(l) => Form::Literal(l),
Tokens::Symbol(s) => Form::Symbol(s),
Tokens::Key(k) => Form::Key(k),
}
}
}
Bounds tokens don’t become forms - they control the parsing flow.
The AST
Our tree nodes are represented by the Form
enum:
pub enum Form {
Root,
Literal(Literal),
CallExpression(String),
Symbol(String),
List,
Record,
Key(String),
}
Each variant represents a different syntactic element:
CallExpression
: Function calls like(+ 1 2)
List
: Data lists like[1 2 3]
Record
: Hash maps like{:name "Alice"}
Symbol
: Variables and function namesLiteral
: Numbers, strings, booleans
The Algorithm
We use recursive descent parsing. The algorithm:
- Take one token at a time
- If it’s an opening delimiter: Parse a nested structure
- If it’s a value: Add it to the current node
- If it’s a closing delimiter: Skip it (handled elsewhere)
- Recurse with remaining tokens
pub fn parse(tokens: &[Tokens], mut parent_node: Node<Form>) -> Result<Node<Form>> {
let Some((current_token, remaining_tokens)) = tokens.split_first() else {
return Ok(parent_node);
};
match current_token {
Tokens::Bounds(TokenBounds::LeftParen) => {
// Parse function call: (symbol ...)
let (Tokens::Symbol(function_name), tokens_after_symbol) = remaining_tokens
.split_first()
.ok_or(anyhow!("Unexpected empty parens"))?
else {
return Err(anyhow!("Expected symbol after left paren"));
};
// We'll parse everything between ( and ) into a CallExpression node
parse_delimited_form(
Form::CallExpression(function_name.clone()),
TokenBounds::LeftParen,
tokens_after_symbol,
parent_node,
)
}
Tokens::Bounds(TokenBounds::LeftBracket) => {
// Parse list: [items...]
parse_delimited_form(Form::List, TokenBounds::LeftBracket, remaining_tokens, parent_node)
}
Tokens::Bounds(TokenBounds::LeftCurlyBraces) => {
// Parse record: {:key value ...}
parse_delimited_form(Form::Record, TokenBounds::LeftCurlyBraces, remaining_tokens, parent_node)
}
Tokens::Bounds(_) => parse(remaining_tokens, parent_node),
token => {
// Simple value: add and continue
let form: Form = token.clone().into();
parent_node.append(Node::new(form));
parse(remaining_tokens, parent_node)
}
}
}
When we encounter an opening delimiter, we need to find everything that belongs inside it. That’s where parse_delimited_form
comes in.
Parsing Nested Structures
The parse_delimited_form
function handles nested structures:
fn parse_delimited_form(
form_type: Form,
opening_delimiter: TokenBounds,
tokens_to_parse: &[Tokens],
mut parent_node: Node<Form>,
) -> Result<Node<Form>> {
let (tokens_inside_delimiters, tokens_after_closing) =
find_matching_delimiter(tokens_to_parse, opening_delimiter)?;
// Parse the contents into a new node
let parsed_form = parse(tokens_inside_delimiters, Node::new(form_type))?;
parent_node.append(parsed_form);
// Continue parsing after the closing delimiter
parse(tokens_after_closing, parent_node)
}
This creates a tree structure: the parent gets a child node containing all the parsed contents.
The Power of Mutual Recursion
The magic happens through mutual recursion between parse
and parse_delimited_form
:
parse
encounters an opening delimiter → callsparse_delimited_form
parse_delimited_form
finds the matching closer → callsparse
on the contentsparse
processes the contents, which may contain more delimiters → back to step 1
This mutual recursion naturally handles arbitrary nesting:
Input: (+ (* 2 3) (- 5 1))
parse sees '(' → parse_delimited_form
→ finds matching ')' after "1)"
→ calls parse on "(* 2 3) (- 5 1)"
→ parse sees '(' → parse_delimited_form
→ finds matching ')' after "3)"
→ calls parse on "2 3"
→ parse adds 2 and 3 as children
→ parse sees '(' → parse_delimited_form
→ finds matching ')' after "1)"
→ calls parse on "5 1"
→ parse adds 5 and 1 as children
Each level of nesting creates a new branch in the tree. The recursion depth matches the nesting depth - no special cases needed.
Finding Matching Delimiters
The trickiest part is finding where a nested structure ends:
fn find_matching_delimiter(tokens: &[Tokens], opening_delimiter: TokenBounds)
-> Result<(&[Tokens], &[Tokens])> {
let closing_delimiter = match opening_delimiter {
TokenBounds::LeftParen => TokenBounds::RightParen,
TokenBounds::LeftBracket => TokenBounds::RightBracket,
TokenBounds::LeftCurlyBraces => TokenBounds::RightCurlyBraces,
_ => return Err(anyhow!("Invalid opening delimiter")),
};
let mut nesting_level = 1;
for (index, token) in tokens.iter().enumerate() {
match token {
Tokens::Bounds(delimiter) if *delimiter == opening_delimiter => {
nesting_level += 1;
}
Tokens::Bounds(delimiter) if *delimiter == closing_delimiter => {
nesting_level -= 1;
if nesting_level == 0 {
// Split: tokens inside, tokens after closer
let (inside, including_closer) = tokens.split_at(index);
return Ok((inside, &including_closer[1..]));
}
}
_ => {}
}
}
Err(anyhow!("Unmatched {} - missing closing delimiter",
match opening_delimiter {
TokenBounds::LeftParen => "parenthesis '('",
TokenBounds::LeftBracket => "bracket '['",
TokenBounds::LeftCurlyBraces => "brace '{'",
_ => "delimiter"
}
))
}
We track nesting level. Each opener increases it, each closer decreases it. When it hits zero, we’ve found our match.
The function returns two slices:
- Tokens inside the delimiters (for parsing)
- Tokens after the closing delimiter (to continue)
Complete Example
Let’s parse a function definition:
For a simpler example, (+ 1 2)
:
What We Built
Our parser:
- Transforms flat tokens into a tree structure
- Handles nested expressions recursively
- Validates delimiter matching
- Returns helpful errors for malformed input
The parser leverages Lisp’s simple syntax. No precedence rules, no ambiguity. Just nested lists.
Next: Part 3 - Evaluation