The Problem
We have tokens. We need a tree.
// We have this:[LeftParen, Symbol("+"), Integer(1), Integer(2), RightParen]
// We need this:CallExpression("+", [Literal(Integer(1)), Literal(Integer(2))])Why a tree? Because evaluation is recursive. To evaluate (+ (* 2 3) 4):
- Evaluate the function:
+ - Evaluate each argument:
(* 2 3)and4 - Apply the function to the results
But (* 2 3) is itself a function call, so we recurse. The tree structure mirrors this evaluation order - parent nodes can’t be evaluated until their children are.
Most languages require complex parsing. Lisp doesn’t. Its syntax is already a tree.
The Type System
First, we convert tokens into AST nodes. The key types:
pub enum Tokens { Bounds(TokenBounds), // (, ), [, ], {, } Literal(Literal), // 42, "hello", true Symbol(String), // +, defn, my-var Key(String), // :name, :age}
pub enum Form { Root, // Top level container Literal(Literal), // Numbers, strings, bools CallExpression(String), // (function-name ...) Symbol(String), // Variable references List, // [1 2 3] Record, // {:key value} Key(String), // Record keys}The magic happens with the From trait:
impl From<Tokens> for Form { fn from(value: Tokens) -> Self { match value { Tokens::Bounds(_) => unreachable!(), Tokens::Literal(l) => Form::Literal(l), Tokens::Symbol(s) => Form::Symbol(s), Tokens::Key(k) => Form::Key(k), } }}Bounds tokens don’t become forms - they control the parsing flow.
The AST
Our tree nodes are represented by the Form enum:
pub enum Form { Root, Literal(Literal), CallExpression(String), Symbol(String), List, Record, Key(String),}Each variant represents a different syntactic element:
CallExpression: Function calls like(+ 1 2)List: Data lists like[1 2 3]Record: Hash maps like{:name "Alice"}Symbol: Variables and function namesLiteral: Numbers, strings, booleans
The Algorithm
We use recursive descent parsing. The algorithm:
- Take one token at a time
- If it’s an opening delimiter: Parse a nested structure
- If it’s a value: Add it to the current node
- If it’s a closing delimiter: Skip it (handled elsewhere)
- Recurse with remaining tokens
pub fn parse(tokens: &[Tokens], mut parent_node: Node<Form>) -> Result<Node<Form>> { let Some((current_token, remaining_tokens)) = tokens.split_first() else { return Ok(parent_node); };
match current_token { Tokens::Bounds(TokenBounds::LeftParen) => { // Parse function call: (symbol ...) let (Tokens::Symbol(function_name), tokens_after_symbol) = remaining_tokens .split_first() .ok_or(anyhow!("Unexpected empty parens"))? else { return Err(anyhow!("Expected symbol after left paren")); }; // We'll parse everything between ( and ) into a CallExpression node parse_delimited_form( Form::CallExpression(function_name.clone()), TokenBounds::LeftParen, tokens_after_symbol, parent_node, ) } Tokens::Bounds(TokenBounds::LeftBracket) => { // Parse list: [items...] parse_delimited_form(Form::List, TokenBounds::LeftBracket, remaining_tokens, parent_node) } Tokens::Bounds(TokenBounds::LeftCurlyBraces) => { // Parse record: {:key value ...} parse_delimited_form(Form::Record, TokenBounds::LeftCurlyBraces, remaining_tokens, parent_node) } Tokens::Bounds(_) => parse(remaining_tokens, parent_node), token => { // Simple value: add and continue let form: Form = token.clone().into(); parent_node.append(Node::new(form)); parse(remaining_tokens, parent_node) } }}When we encounter an opening delimiter, we need to find everything that belongs inside it. That’s where parse_delimited_form comes in.
Parsing Nested Structures
The parse_delimited_form function handles nested structures:
fn parse_delimited_form( form_type: Form, opening_delimiter: TokenBounds, tokens_to_parse: &[Tokens], mut parent_node: Node<Form>,) -> Result<Node<Form>> { let (tokens_inside_delimiters, tokens_after_closing) = find_matching_delimiter(tokens_to_parse, opening_delimiter)?;
// Parse the contents into a new node let parsed_form = parse(tokens_inside_delimiters, Node::new(form_type))?; parent_node.append(parsed_form);
// Continue parsing after the closing delimiter parse(tokens_after_closing, parent_node)}This creates a tree structure: the parent gets a child node containing all the parsed contents.
The Power of Mutual Recursion
The magic happens through mutual recursion between parse and parse_delimited_form:
parseencounters an opening delimiter → callsparse_delimited_formparse_delimited_formfinds the matching closer → callsparseon the contentsparseprocesses the contents, which may contain more delimiters → back to step 1
This mutual recursion naturally handles arbitrary nesting:
Input: (+ (* 2 3) (- 5 1))
parse sees '(' → parse_delimited_form → finds matching ')' after "1)" → calls parse on "(* 2 3) (- 5 1)" → parse sees '(' → parse_delimited_form → finds matching ')' after "3)" → calls parse on "2 3" → parse adds 2 and 3 as children → parse sees '(' → parse_delimited_form → finds matching ')' after "1)" → calls parse on "5 1" → parse adds 5 and 1 as childrenEach level of nesting creates a new branch in the tree. The recursion depth matches the nesting depth - no special cases needed.
Finding Matching Delimiters
The trickiest part is finding where a nested structure ends:
fn find_matching_delimiter(tokens: &[Tokens], opening_delimiter: TokenBounds) -> Result<(&[Tokens], &[Tokens])> { let closing_delimiter = match opening_delimiter { TokenBounds::LeftParen => TokenBounds::RightParen, TokenBounds::LeftBracket => TokenBounds::RightBracket, TokenBounds::LeftCurlyBraces => TokenBounds::RightCurlyBraces, _ => return Err(anyhow!("Invalid opening delimiter")), };
let mut nesting_level = 1; for (index, token) in tokens.iter().enumerate() { match token { Tokens::Bounds(delimiter) if *delimiter == opening_delimiter => { nesting_level += 1; } Tokens::Bounds(delimiter) if *delimiter == closing_delimiter => { nesting_level -= 1; if nesting_level == 0 { // Split: tokens inside, tokens after closer let (inside, including_closer) = tokens.split_at(index); return Ok((inside, &including_closer[1..])); } } _ => {} } } Err(anyhow!("Unmatched {} - missing closing delimiter", match opening_delimiter { TokenBounds::LeftParen => "parenthesis '('", TokenBounds::LeftBracket => "bracket '['", TokenBounds::LeftCurlyBraces => "brace '{'", _ => "delimiter" } ))}We track nesting level. Each opener increases it, each closer decreases it. When it hits zero, we’ve found our match.
The function returns two slices:
- Tokens inside the delimiters (for parsing)
- Tokens after the closing delimiter (to continue)
Complete Example
Let’s parse a function definition:
For a simpler example, (+ 1 2):
What We Built
Our parser:
- Transforms flat tokens into a tree structure
- Handles nested expressions recursively
- Validates delimiter matching
- Returns helpful errors for malformed input
The parser leverages Lisp’s simple syntax. No precedence rules, no ambiguity. Just nested lists.
Next: Part 3 - Evaluation