Leaf Language Grammar
This page contains the complete formal grammar for the Leaf programming language in Extended Backus–Naur Form (EBNF).
The notation used:
=— defines a rule|— alternation (either/or){ }— zero or more repetitions[ ]— optional (zero or one)( )— grouping" "— literal terminal? ?— prose description of a terminal class;— end of rule
Program Structure
program = { top_level_item } ;
top_level_item = import_statement
| struct_declaration
| enum_declaration
| interface_declaration
| function_declaration
| type_alias_declaration
| constant_binding ;
Comments and Whitespace
comment = "#" { ? any character except newline ? } [ ? newline ? ] ;
whitespace = ? space, tab, or newline ? ;
Comments are single-line, beginning with # and extending to the end of the line. Whitespace (including newlines) is not significant beyond token separation, except that newlines act as statement terminators when the preceding token can end a statement. See lang/semantics/core.md §2.0.1 for the full continuation-line rules.
Identifiers and Keywords
letter = ? a–z, A–Z ? | "_" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
hex_digit = digit | "a" | "b" | "c" | "d" | "e" | "f"
| "A" | "B" | "C" | "D" | "E" | "F" ;
octal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" ;
identifier = letter { letter | digit } ;
Reserved keywords (may not be used as identifiers):
keyword = "as" | "bool" | "break" | "byte"
| "continue" | "else" | "elseif" | "end"
| "enum" | "false" | "float" | "fn" | "for"
| "gen" | "if" | "implements" | "in"
| "int" | "interface" | "match" | "matches"
| "mut" | "never" | "out" | "pkg"
| "pub" | "return" | "Self" | "self"
| "str" | "struct" | "then" | "true"
| "type" | "uint" | "use" | "where"
| "while" | "yield" ;
Literals
Numeric Literals
numeric_literal = integer_literal | float_literal ;
integer_suffix = "u" | "b" ;
integer_literal = decimal_digits [ integer_suffix ]
| "0" ( "x" | "X" ) hex_digit { [ "_" ] hex_digit } [ integer_suffix ]
| "0" ( "b" | "B" ) ( "0" | "1" ) { [ "_" ] ( "0" | "1" ) } [ integer_suffix ]
| "0" ( "o" | "O" ) octal_digit { [ "_" ] octal_digit } [ integer_suffix ] ;
float_literal = decimal_digits "." [ decimal_digits ] [ exponent_part ]
| decimal_digits exponent_part ;
decimal_digits = digit { [ "_" ] digit } ;
exponent_part = ( "e" | "E" ) [ "+" | "-" ] decimal_digits ;
The u suffix produces a uint literal; the b suffix produces a byte literal. Numeric literals may use underscores as separators (e.g. 1_000_000, 0xFF_AA). The b suffix is lexically distinct from the 0b binary prefix: 0b1010 is a binary int literal, while 42b is a decimal byte literal and 0b1010b is a binary byte literal. When 0 is followed by b and then a binary digit (0 or 1), it is a binary prefix; when b follows the final digit of the literal, it is a byte suffix.
String Literals
string_literal = '"' { string_char | escape_sequence | interpolation } '"'
| '"""' { triple_string_char | escape_sequence | interpolation } '"""' ;
string_char = ? any character except '"', '\', '{', '}' ? ;
triple_string_char = ? any character except '"""' and '{', '}' ? ;
escape_sequence = "\" ( "n" | "t" | "\" | '"' | "{" | "}" )
| "\" "u" hex_digit hex_digit hex_digit hex_digit
| "\" "u{" hex_digit { hex_digit } "}" ;
interpolation = "{" expression "}" ;
String literals support interpolation via {expression}. Triple-quoted strings span multiple lines. Only double quotes are used; single-quote strings do not exist.
Boolean Literals
boolean_literal = "true" | "false" ;
Types
type = primary_type [ "?" ] ;
primary_type = "bool"
| "byte"
| "int"
| "uint"
| "float"
| "str"
| "never"
| "()"
| "Self"
| generic_type
| function_type
| identifier
| "(" type ")" ;
generic_type = identifier "[" type { "," type } "]" ;
function_type = "fn" "(" [ fn_type_params ] ")" "->" type ;
fn_type_params = type { "," type } [ "," "..." type ]
| "..." type ;
T? is syntactic sugar for Option[T]. Built-in generic types include Option[T], Result[T, E], List[T], Map[K, V], Tuple[T1, T2, ...] (2–10 elements), and Generator[Y] / Generator[Y, R] / Generator[Y, R, N].
Type Parameters
type_params = type_parameter { "," type_parameter } ;
type_parameter = [ "in" | "out" ] identifier [ "=" type | ":" interface_bound ] ;
where_clause = "where" constraint { "," constraint } ;
constraint = ( identifier | "Self" ) ":" interface_bound ;
interface_bound = interface_ref { "&" interface_ref } ;
interface_ref = identifier [ "[" type { "," type } "]" ] ;
Variance annotations:
in T— contravariant type parameterout T— covariant type parameter- (unmarked) — invariant
Visibility and Modifiers
visibility = "pub" | "pkg" ;
pub— visible to all modulespkg— visible within the same package- (unmarked) — private to the defining module
Import Statements
import_statement = [ visibility ] "use" import_path ;
import_path = [ "@" ] module_path ;
module_path = identifier { "." identifier }
| identifier { "." identifier } "." "[" identifier { "," identifier } "]"
| identifier { "." identifier } "." "*"
| identifier { "." identifier } "as" identifier ;
- Bare path — imports from the current project (resolved from
src/) @pkgprefix — imports from external packagepkg@stdprefix — imports from the standard librarypub use/pkg use— re-exports with the given visibility (top-level only)- Multi-import:
use math.vector.[Vector, dot] - Alias:
use math.vector.Vector as Vec - Enum variant wildcard:
use Direction.*brings all variants into scope - Scoped import: bare
usemay also appear as a statement inside any block (seescoped_importin the statement grammar)
Top-Level Declarations
Constants
constant_binding = [ visibility ] identifier "=" constant_expression ;
constant_expression = constant_term { ( "+" | "-" ) constant_term } ;
constant_term = constant_unary { ( "*" | "/" ) constant_unary } ;
constant_unary = [ "-" ] constant_primary ;
constant_primary = numeric_literal
| string_literal
| boolean_literal
| "(" constant_expression ")" ;
Constants are module-level bindings whose value is a compile-time expression. The constant_expression grammar is intentionally simplified — constant expressions follow the same operator precedence and associativity rules as runtime expressions (see lang/semantics/values.md §2.12.2). Multiplication and division bind tighter than addition and subtraction: 2 + 3 * 4 evaluates as 2 + (3 * 4) = 14.
Type Aliases
type_alias_declaration = [ visibility ] "type" identifier [ "[" type_params "]" ]
[ where_clause ]
"=" type ;
Interface Declarations
interface_declaration = [ visibility ] "interface" identifier [ "[" type_params "]" ]
[ where_clause ]
{ interface_method }
"end" ;
interface_method = "fn" identifier [ "[" type_params "]" ]
"(" [ interface_params ] ")" [ "->" type ]
[ where_clause ]
[ block_body ] ;
interface_params = ( "self" | "mut self" ) { "," interface_param } [ "," variadic_parameter ]
| interface_param { "," interface_param } [ "," variadic_parameter ]
| variadic_parameter ;
interface_param = identifier ":" type ;
Struct Declarations
struct_declaration = [ visibility ] "struct" identifier [ "[" type_params "]" ]
[ "implements" interface_list ]
[ where_clause ]
{ struct_member }
"end" ;
interface_list = interface_ref { "," interface_ref } ;
struct_member = struct_field | method_declaration ;
struct_field = [ visibility ] identifier ":" type ;
method_declaration = [ visibility ] [ "gen" ] "fn" [ qualified_method_prefix ]
[ "[" type_params "]" ]
method_name "(" [ method_params ] ")" [ "->" type ]
[ where_clause ]
block_body ;
qualified_method_prefix = identifier [ "[" type { "," type } "]" ] "." ;
method_name = identifier | "new" ;
method_params = ( "self" | "mut self" ) { "," parameter } [ "," variadic_parameter ]
| parameter { "," parameter } [ "," variadic_parameter ]
| variadic_parameter ;
The reserved method name new is the constructor. gen fn marks a generator method. qualified_method_prefix disambiguates interface method names (e.g. fn Comparable[T].compare(...) for generic interfaces, or fn Hash.hash(...) for non-generic interfaces). Type arguments are required only when the interface is generic.
Enum Declarations
enum_declaration = [ visibility ] "enum" identifier [ "[" type_params "]" ]
{ enum_variant }
"end" ;
enum_variant = identifier
| identifier "(" variant_fields ")" ;
variant_fields = positional_fields | named_fields ;
positional_fields = type { "," type } ;
named_fields = identifier ":" type { "," identifier ":" type } ;
Block Bodies
Every construct that introduces a body uses the same two-form rule:
block_body = expression (* inline: expression on the same line as the introducer *)
| { statement } "end" ; (* multi-statement: zero or more statements, then "end" *)
The parser selects the inline form when an expression immediately follows the block introducer on the same line; otherwise the multi-statement form is used. See lang/semantics/core.md §2.0.2 for full rules.
Function Declarations
function_declaration = [ visibility ] [ "gen" ] "fn" identifier [ "[" type_params "]" ]
"(" [ parameters ] ")" [ "->" type ]
[ where_clause ]
block_body ;
parameters = parameter { "," parameter } [ "," variadic_parameter ]
| variadic_parameter ;
parameter = [ "mut" ] identifier ":" type
| [ "mut" ] destructure_pattern ":" type ;
variadic_parameter = identifier ":" "..." type ;
gen fn declares a generator function; its return type must be Generator[Y], Generator[Y, R], or Generator[Y, R, N].
Statements
statement = variable_binding
| assignment
| compound_assignment
| function_declaration
| scoped_import
| if_statement
| while_statement
| for_statement
| match_statement
| return_statement
| break_statement
| continue_statement
| expression_statement ;
scoped_import = "use" import_path ;
Variable Bindings
variable_binding = [ "mut" ] binding_target ":" type "=" expression
| [ "mut" ] binding_target "=" expression ;
binding_target = identifier
| destructure_pattern
| "_" ;
The mut modifier makes the binding mutable. A type annotation is optional when the type can be inferred. The binding_target production restricts the left-hand side of variable bindings to valid binding forms (identifiers, destructuring patterns, and the discard placeholder). The full pattern production — which also includes literal patterns and enum patterns — is only used in match arms and matches conditions.
Assignment
assignment = lvalue "=" expression ;
lvalue = identifier
| expression "." identifier
| expression "[" expression "]" ;
Compound Assignment
compound_assignment = lvalue compound_op expression ;
compound_op = "+=" | "-=" | "*=" | "/=" | "%="
| "&=" | "|=" | "^=" | "<<=" | ">>=" ;
Control Flow Statements
if_statement = "if" condition if_branch_body
{ "elseif" condition if_branch_body }
[ "else" if_branch_body ]
[ "end" ] ;
if_branch_body = expression (* inline: no "end"; "else"/"elseif"/"end" follow directly *)
| { statement } ; (* multi-statement: closed by the next "elseif"/"else"/"end" *)
else and elseif implicitly close the preceding branch body. When any branch uses the multi-statement form, end is required at the end of the chain to close the final branch and the entire if construct. When all branches use the inline form, end is omitted.
while_statement = "while" condition block_body ;
for_statement = "for" [ "mut" ] for_pattern "in" expression block_body ;
for_pattern = identifier
| destructure_pattern ;
condition = expression ;
elseif is one word (not else if). matches conditions (e.g., if x matches Some(v)) are parsed as ordinary expressions using the matches comparison operator; binding introduction is a semantic rule, not a grammar distinction (see §3.1).
Match Statement
match_statement = "match" expression
{ match_arm }
"end" ;
match_arm = pattern "then" block_body ;
match requires exhaustive coverage of all possible values. Each arm uses then to introduce its body. Inline arms have no end; multi-statement arms each end with end, and the outer end closes the match construct.
Other Statements
return_statement = "return" [ expression ] ;
break_statement = "break" ;
continue_statement = "continue" ;
expression_statement = expression ;
break and continue affect the innermost enclosing loop only. A return with no expression returns ().
Patterns
pattern = wildcard_pattern
| literal_pattern
| binding_pattern
| enum_pattern
| list_pattern
| map_pattern
| tuple_pattern ;
wildcard_pattern = "_" ;
literal_pattern = [ "-" ] numeric_literal | string_literal | boolean_literal ;
binding_pattern = ? identifier other than "_" ? ;
enum_pattern = qualified_name [ "(" pattern { "," pattern } ")" ] ;
list_pattern = "[" [ list_pattern_items ] "]" ;
map_pattern = "{" [ map_pattern_entries ] "}" ;
tuple_pattern = "(" pattern "," pattern { "," pattern } ")" ;
list_pattern_items = pattern { "," pattern } [ "," "..." identifier ]
| "..." identifier ;
map_pattern_entries = identifier { "," identifier } ;
qualified_name = identifier { "." identifier } ;
Multiple _ bindings in the same scope do not conflict — each occurrence is a distinct discard (see lang/semantics/patterns.md §2.14.6).
Patterns are used in match arms and if/while matches conditions. Enum patterns match a variant and optionally bind its fields (positional). List patterns support rest bindings (...tail). Map/struct patterns bind named fields by their identifier.
Destructuring Patterns
destructure_pattern = map_destructure
| list_destructure
| tuple_destructure ;
map_destructure = "{" destructure_binding { "," destructure_binding } "}" ;
list_destructure = "[" list_destruct_items "]" ;
tuple_destructure = "(" destructure_binding "," destructure_binding { "," destructure_binding } ")" ;
destructure_binding = identifier | "_" ;
list_destruct_items = destructure_binding { "," destructure_binding }
| destructure_binding { "," destructure_binding } "," "..." identifier
| "..." identifier ;
Destructuring extracts fields from structs and maps, elements from lists, or elements from tuples, into local bindings. Tuple destructuring requires at least two elements (matching the 2–10 element tuple constraint).
Expressions
expression = yield_expr | pipe_expr ;
yield_expr = "yield" [ pipe_expr ] ;
pipe_expr = or_expr { "|>" or_expr } ;
or_expr = and_expr { "||" and_expr } ;
and_expr = comparison_expr { "&&" comparison_expr } ;
comparison_expr = bitor_expr [ comparison_op bitor_expr ] ; (* non-associative *)
bitor_expr = bitxor_expr { "|" bitxor_expr } ;
bitxor_expr = bitand_expr { "^" bitand_expr } ;
bitand_expr = shift_expr { "&" shift_expr } ;
shift_expr = additive_expr { shift_op additive_expr } ;
additive_expr = multiplicative_expr { additive_op multiplicative_expr } ;
multiplicative_expr = unary_expr { multiplicative_op unary_expr } ;
unary_expr = postfix_expr | unary_op unary_expr ;
postfix_expr = primary_expr { call_suffix | "?" } ;
comparison_op = "==" | "!=" | "<" | ">" | "<=" | ">=" | "matches" ;
shift_op = "<<" | ">>" ;
additive_op = "+" | "-" ;
multiplicative_op = "*" | "/" | "%" ;
unary_op = "-" | "!" | "~" ;
matches right-hand side is a pattern. When matches is the comparison operator in comparison_expr, the right operand is parsed as a pattern (see the Patterns section), not as a bitor_expr. Patterns and expressions overlap syntactically in most cases, but they differ semantically — for example, Some(x) in pattern context introduces a fresh binding x, whereas in expression context it is a function call. See lang/semantics/values.md §2.12.2 and lang/types.md §3.1 for the full matches semantics.
Operator precedence (tightest binding to loosest — matches lang/semantics/values.md §2.12.2):
| Precedence | Operators |
|---|---|
| 1 | .field, .method(), Type.new(), Enum.Variant |
| 2 | f(), x[i], F[T] |
| 3 | x? (postfix propagation) |
| 4 | -x, !x, ~x (unary prefix) |
| 5 | *, /, % |
| 6 | +, - |
| 7 | <<, >> |
| 8 | & |
| 9 | ^ |
| 10 | \| |
| 11 | ==, !=, <, >, <=, >=, matches (non-associative) |
| 12 | && |
| 13 | \|\| |
| 14 | \|> (pipe) |
| 15 | yield |
Call Expressions and Postfix Operators
postfix_expr (defined above) handles call expressions and the ? propagation operator in a single loop, allowing chains like getData()?.len().
call_suffix = "." identifier [ [ "[" type { "," type } "]" ] "." identifier ] [ "(" [ arguments ] ")" ]
| "." integer_literal
| "[" expression "]"
| "(" [ arguments ] ")" ;
arguments = argument { "," argument } ;
argument = expression
| "..." expression ;
The "." integer_literal form is tuple element access (e.g., pair.0, triple.2). The integer must be a non-negative decimal literal with no suffix, evaluated at compile time, and must be within the tuple's arity.
Bracket disambiguation. The "[" expression "]" suffix in call_suffix is syntactically ambiguous with explicit type arguments (e.g., f[int] could be bracket indexing or a type-parameterized function). The grammar does not distinguish these forms — the parser produces an ambiguous bracket-expression node. During name resolution, the compiler determines the interpretation:
- If the preceding identifier resolves to a type (struct, enum, type alias), the brackets are type arguments.
- If the preceding identifier resolves to a value (variable, function parameter, etc.), the brackets are bracket indexing.
- If the bracket contents include commas (
f[x, y]), the node is always type arguments, since bracket indexing takes exactly one expression.
Normal scoping rules apply — the innermost binding wins. If a local variable shadows a type name, bracket access on that name is indexing, not type arguments. See lang/semantics/functions.md §2.7 for the full disambiguation rule.
Qualified call disambiguation. When the type arguments are omitted in a qualified call (e.g., obj.Hash.hash()), the syntax is ambiguous with chained field/method access. The parser produces an ambiguous node and relies on semantic resolution — if the first identifier resolves to an interface name, the expression is a qualified call; otherwise it is chained access. This is consistent with the bracket disambiguation approach above.
Primary Expressions
primary_expr = numeric_literal
| string_literal
| boolean_literal
| identifier
| qualified_name
| type_name
| primitive_type_name
| "Self"
| "self"
| struct_literal
| list_literal
| map_literal
| tuple_literal
| closure_expr
| if_expr
| match_expr
| "(" ")"
| "(" expression ")" ;
primitive_type_name = "byte" | "int" | "uint" | "float" | "str" | "bool" ;
tuple_literal = "(" expression "," expression { "," expression } ")" ;
type_name = identifier "[" type { "," type } "]" ;
primitive_type_name allows primitive type keywords to appear in expression position, enabling static method calls and constant access such as int.parse("42"), byte.MAX, float.INFINITY, etc. (see stdlib/functions.md). These names are keywords and would not match the identifier production. They combine with call_suffix for field and method access like any other primary expression.
type_name matches a generic type instantiation such as Pair[int, str] or Response[int]. It appears both in primary_expr (for enum variant access like Response[int].Success(42)) and in struct_literal (for generic struct construction). Because type_name begins with identifier "[" ..., it is syntactically ambiguous with bracket indexing — see the bracket disambiguation rule above.
tuple_literal matches parenthesized, comma-separated expressions with at least two elements (e.g., (1, "hello"), (true, 3.14, "yes")). The comma after the first expression disambiguates from grouping ("(" expression ")").
"(" ")" is the unit literal expression. It is unambiguous — empty parens have no expression between them, while grouping requires one and tuple literals require at least two comma-separated expressions. The unit value is used in expressions like Done(()), Ok(()), and return Ok(()).
Struct Literals
struct_literal = ( identifier | type_name | "Self" ) "{" [ struct_fields ] "}" ;
struct_fields = struct_field_init { "," struct_field_init } ;
struct_field_init = identifier ":" expression
| identifier ; (* shorthand: uses local binding of same name *)
Collection Literals
list_literal = "[" [ list_items ] "]" ;
list_items = list_item { "," list_item } ;
list_item = expression | "..." expression ; (* spread *)
map_literal = "{" [ map_entries ] "}" ;
map_entries = map_entry { "," map_entry } ;
map_entry = identifier ":" expression (* bare identifier key = string key *)
| "[" expression "]" ":" expression (* computed key *)
| identifier (* shorthand: key = identifier, value = local binding *)
| "..." expression ; (* spread *)
An empty list [] has type List[never], assignable to any List[T]. An explicit type annotation is required when creating an empty mutable collection.
Closures
closure_expr = "fn" "(" [ closure_params ] ")" [ "->" type ] block_body ;
closure_params = closure_param { "," closure_param } [ "," variadic_parameter ]
| variadic_parameter ;
closure_param = [ "mut" ] identifier [ ":" type ]
| [ "mut" ] destructure_pattern ":" type ;
Closure parameter types may be omitted when they can be inferred from context. Parentheses are always required, even for zero-parameter closures: fn() expr.
if Expressions
if_expr = "if" condition if_branch_body
{ "elseif" condition if_branch_body }
[ "else" if_branch_body ]
[ "end" ] ;
When used as an expression, all branches must produce a value of the same type (or a subtype). A missing else branch produces () and forces the if type to be ().
match Expressions
match_expr = "match" expression
{ match_arm }
"end" ;
When used as an expression, all arms must produce compatible types. match must be exhaustive.
The ? Propagation Operator
The ? postfix operator is defined as part of postfix_expr in the expression grammar above.
Applied to Result[T, E]: unwraps Ok(v) to v, or returns Err(e) early from the enclosing function. Applied to Option[T]: unwraps Some(v) to v, or returns None early. The enclosing function's return type must be compatible.
Generator Functions and yield
Generator functions are declared using the gen modifier on function_declaration (see §Function Declarations above). There is no separate grammar production for generator functions — the [ "gen" ] prefix in function_declaration handles them. When gen is present, the return type must be Generator[Y], Generator[Y, R], or Generator[Y, R, N]; this constraint is enforced by the type checker, not the grammar.
yield is an expression (see yield_expr in the expression grammar). It suspends execution and produces a value of type Y to the caller; the expression itself evaluates to type N (the value passed by the caller on resume). The operand is optional — bare yield without an operand yields () (the unit value) and is valid only when Y = (). This mirrors return without an operand, which returns (). yield is only valid inside gen fn bodies. The three type parameters of Generator are: Y (yield type), R (return type, default ()), N (next-input type, default never).
Source File Structure
source_file = { import_statement } { top_level_item } ;
A .leaf source file begins with optional imports, followed by declarations in any order. Top-level declarations are mutually visible throughout the file regardless of order. Top-level imports may include visibility modifiers (pub use, pkg use) for re-exports. Bare use statements may also appear inside any block as scoped imports (see scoped_import in the statement grammar). The entry point for executable programs is fn main() defined in src/main.leaf.