7.7. PEG parser generator
The PEG module is a parser generator based on
Parsing Expression Grammars.
Define grammars directly in daslang using the parse macro — the compiler
generates a packrat parser at compile time. No external tools, no runtime code
generation.
See PEG-01 — Hello Parser for a hands-on introduction.
7.7.1. Grammar basics
A grammar lives inside a parse(input) { ... } block. The first var
declaration is the entry rule. Each rule(...) adds an ordered alternative:
def my_parser(input : string; blk : block<(val : int; err : array<ParsingError>) : void>) {
parse(input) {
var expr : int
rule(number as n, WS, "+", WS, number as m, EOF) {
return n + m
}
}
}
Rules try alternatives top to bottom. The first match wins (ordered choice — no ambiguity).
7.7.2. Built-in terminals
Terminal |
Description |
|---|---|
|
Exact text match |
|
End of input |
|
End of line ( |
|
Zero or more whitespace (including newlines) |
|
Zero or more tabs/spaces (no newlines) |
|
Any single character |
|
Decimal integer (returns |
|
Double-quoted string literal (returns content) |
|
Floating-point number (returns |
|
Character from ranges: |
|
Any character NOT in the set |
7.7.3. Operators and combinators
Syntax |
Description |
|---|---|
|
Bind the rule result to |
|
Extract matched text as string |
|
One or more repetitions (array) |
|
Zero or more repetitions (array) |
|
Optional (zero or one) |
|
Negative lookahead (no input consumed) |
|
Positive lookahead (no input consumed) |
|
Cut — no backtracking past this point |
|
Print debug message during parsing |
7.7.4. Options
Add inside the parse block:
option(tracing)— print rule-by-rule execution traceoption(color)— ANSI colors in trace outputoption(print_generated)— show generated parser code at compile time
7.7.5. Return types
Rules can return any daslang type: scalars, strings, structs, variants,
tuples, tables, and arrays. Use void? for pattern-only rules that
match characters without producing a value.
7.7.6. Error handling
parse calls a callback $(val; err) where err is an
array<ParsingError>. Place commit after unambiguous prefixes to
get meaningful error messages with position information.
7.7.7. Left recursion
dasPEG supports left-recursive rules via packrat memoization. This is the
standard technique for encoding left-associative operators (e.g. a + b + c).
7.7.8. Performance
Packrat memoization caches results per rule per position — O(n) parsing
Place common alternatives first (PEG tries in order)
Use
PEEKto quickly reject impossible alternativesUse
commitafter unambiguous prefixes for speed and error qualitySet
options stack = 1000000— PEG parsers need more stack than default
All functions and symbols are in “peg” module, use require to get access to it.
require peg/peg
Example:
require peg/peg
def parse_greeting(input : string;
blk : block<(val : string; err : array<ParsingError>) : void>) {
parse(input) {
var greeting : string
rule("Hello, ", "{+letter}" as name, "!", EOF) {
return name
}
var letter : void?
rule(set('a'..'z', 'A'..'Z')) {
return null
}
}
}
[export]
def main() {
parse_greeting("Hello, World!") $(val; err) {
print("name = {val}\n")
}
}
// output:
// name = World
7.7.8.1. Structures
- ParsingError
struct ParsingError
7.7.8.2. Matching primitives
- get_current_char(parser: auto): int
def get_current_char (var parser: auto) : int
- Arguments:
parser : auto
- matches(parser: auto; tmpl: string; strlen: int): bool
def matches (var parser: auto; tmpl: string; strlen: int) : bool
- Arguments:
parser : auto
tmpl : string
strlen : int
- move(parser: auto; offset: auto): auto
def move (var parser: auto; offset: auto) : auto
- Arguments:
parser : auto
offset : auto
- reached_EOF(parser: auto): bool
def reached_EOF (var parser: auto) : bool
- Arguments:
parser : auto
- reached_EOL(parser: auto): bool
def reached_EOL (var parser: auto) : bool
- Arguments:
parser : auto
7.7.8.3. Whitespace
- skip_taborspace(parser: auto): auto
def skip_taborspace (var parser: auto) : auto
- Arguments:
parser : auto
- skip_whitespace(parser: auto): auto
def skip_whitespace (var parser: auto) : auto
- Arguments:
parser : auto
7.7.8.4. Literal matching
- match_decimal_literal(parser: auto): tuple<success:bool;value:int;endpos:int>
Simple lexing of decimal integers, doesn’t check for overflow
- Arguments:
parser : auto
- match_double_literal(parser: auto): tuple<success:bool;double>
Matches doubles in the form of [-+]? [0-9]* .? [0-9]+ ([eE] [-+]? [0-9]+)? The number is not checked to be representable as defined in IEEE-754
- Arguments:
parser : auto
- match_string_literal(parser: auto): tuple<success:bool;string>
Tries to match everything inside “”
- Arguments:
parser : auto
7.7.8.5. Tracing and logging
- log_fail(parser: auto; message: auto): auto
def log_fail (var parser: auto; var message: auto) : auto
- Arguments:
parser : auto
message : auto
- log_info(parser: auto; message: auto): auto
def log_info (var parser: auto; var message: auto) : auto
- Arguments:
parser : auto
message : auto
- log_plain(parser: auto; message: auto): auto
def log_plain (var parser: auto; var message: auto) : auto
- Arguments:
parser : auto
message : auto
- log_success(parser: auto; message: auto): auto
def log_success (var parser: auto; var message: auto) : auto
- Arguments:
parser : auto
message : auto
- tabulate(parser: auto): auto
def tabulate (var parser: auto) : auto
- Arguments:
parser : auto