.. _stdlib_peg:

====================
PEG parser generator
====================

.. das:module:: peg

The PEG module is a parser generator based on Parsing Expression Grammars.
Define grammars directly in daslang using the ``parse`` macro --- the compiler
generates a packrat parser at compile time. No external tools, no runtime
code generation.

See :ref:`tutorial_dasPEG_hello_parser` for a hands-on introduction.

Grammar basics
^^^^^^^^^^^^^^

A grammar lives inside a ``parse(input) { ... }`` block. The first ``var``
declaration is the **entry rule**. Each ``rule(...)`` adds an ordered
alternative:

.. code-block:: das

    def my_parser(input : string; blk : block<(val : int; err : array<ParsingError>) : void>) {
        parse(input) {
            // Entry rule: the first ``var`` declaration in the block
            var expr : int
            rule(number as n, WS, "+", WS, number as m, EOF) {
                return n + m
            }
        }
    }

Rules try alternatives top to bottom. The first match wins (ordered
choice --- no ambiguity).

Built-in terminals
^^^^^^^^^^^^^^^^^^

================================ ====================================================
Terminal                         Description
================================ ====================================================
``"literal"``                    Exact text match
``EOF``                          End of input
``EOL``                          End of line (``\n`` or ``\r\n``)
``WS``                           Zero or more whitespace (including newlines)
``TS``                           Zero or more tabs/spaces (no newlines)
``any``                          Any single character
``number``                       Decimal integer (returns ``int``)
``string_``                      Double-quoted string literal (returns content)
``double_``                      Floating-point number (returns ``double``)
``set(ranges...)``               Character from ranges: ``set('a'..'z', '0'..'9')``
``not_set(chars...)``            Any character NOT in the set
================================ ====================================================

Operators and combinators
^^^^^^^^^^^^^^^^^^^^^^^^^

================================ =============================================
Syntax                           Description
================================ =============================================
``rule as name``                 Bind the rule result to ``name``
``"{rule}" as name``             Extract matched text as string
``+rule``                        One or more repetitions (array)
``*rule``                        Zero or more repetitions (array)
``MB(rule)``                     Optional (zero or one)
``!rule``                        Negative lookahead (no input consumed)
``PEEK(rule)``                   Positive lookahead (no input consumed)
``commit``                       Cut --- no backtracking past this point
``log("msg")``                   Print debug message during parsing
================================ =============================================

Options
^^^^^^^

Add inside the ``parse`` block:

- ``option(tracing)`` --- print rule-by-rule execution trace
- ``option(color)`` --- ANSI colors in trace output
- ``option(print_generated)`` --- show generated parser code at compile time

Return types
^^^^^^^^^^^^

Rules can return any daslang type: scalars, strings, structs, variants,
tuples, tables, and arrays. Use ``void?`` for pattern-only rules that match
characters without producing a value.

Error handling
^^^^^^^^^^^^^^

``parse`` calls a callback ``$(val; err)`` where ``err`` is an
``array<ParsingError>``. Place ``commit`` after unambiguous prefixes to get
meaningful error messages with position information.

Left recursion
^^^^^^^^^^^^^^

dasPEG supports left-recursive rules via packrat memoization. Left recursion
is the standard way to encode left-associative operators (e.g.
``a + b + c``).

Performance
^^^^^^^^^^^

- Packrat memoization caches results per rule per position --- O(n) parsing
- Place common alternatives first (PEG tries in order)
- Use ``PEEK`` to quickly reject impossible alternatives
- Use ``commit`` after unambiguous prefixes for speed and error quality
- Set ``options stack = 1000000`` --- PEG parsers need more stack than the default

All functions and symbols are in the ``peg`` module; use ``require`` to access it.

.. code-block:: das

    require peg/peg

Example:

.. code-block:: das

    require peg/peg

    def parse_greeting(input : string; blk : block<(val : string; err : array<ParsingError>) : void>) {
        parse(input) {
            // Entry rule: returns the text captured between "Hello, " and "!"
            var greeting : string
            rule("Hello, ", "{+letter}" as name, "!", EOF) {
                return name
            }

            // Pattern-only rule: matches one ASCII letter, produces no value
            var letter : void?
            rule(set('a'..'z', 'A'..'Z')) {
                return null
            }
        }
    }

    [export]
    def main() {
        parse_greeting("Hello, World!") $(val; err) {
            print("name = {val}\n")
        }
    }

    // output:
    // name = World

++++++++++
Structures
++++++++++

.. _struct-peg-ParsingError:

.. das:attribute:: ParsingError

struct ParsingError

+++++++++++++++++++
Matching primitives
+++++++++++++++++++

* :ref:`get_current_char (var parser: auto) : int <function-peg_get_current_char_auto_0x67>`
* :ref:`matches (var parser: auto; tmpl: string; strlen: int) : bool <function-peg_matches_auto_string_int_0x22>`
* :ref:`move (var parser: auto; offset: auto) : auto <function-peg_move_auto_auto_0x36>`
* :ref:`reached_EOF (var parser: auto) : bool <function-peg_reached_EOF_auto_0x53>`
* :ref:`reached_EOL (var parser: auto) : bool <function-peg_reached_EOL_auto_0x57>`

.. _function-peg_get_current_char_auto_0x67:

.. das:function:: get_current_char(parser: auto) : int

def get_current_char (var parser: auto) : int

:Arguments: * **parser** : auto

.. _function-peg_matches_auto_string_int_0x22:

.. das:function:: matches(parser: auto; tmpl: string; strlen: int) : bool

def matches (var parser: auto; tmpl: string; strlen: int) : bool

:Arguments: * **parser** : auto
            * **tmpl** : string
            * **strlen** : int

.. _function-peg_move_auto_auto_0x36:

.. das:function:: move(parser: auto; offset: auto) : auto

def move (var parser: auto; offset: auto) : auto

:Arguments: * **parser** : auto
            * **offset** : auto

.. _function-peg_reached_EOF_auto_0x53:

.. das:function:: reached_EOF(parser: auto) : bool

def reached_EOF (var parser: auto) : bool

:Arguments: * **parser** : auto

.. _function-peg_reached_EOL_auto_0x57:

.. das:function:: reached_EOL(parser: auto) : bool

def reached_EOL (var parser: auto) : bool

:Arguments: * **parser** : auto

++++++++++
Whitespace
++++++++++

* :ref:`skip_taborspace (var parser: auto) : auto <function-peg_skip_taborspace_auto_0x61>`
* :ref:`skip_whitespace (var parser: auto) : auto <function-peg_skip_whitespace_auto_0x5b>`

.. _function-peg_skip_taborspace_auto_0x61:

.. das:function:: skip_taborspace(parser: auto) : auto

def skip_taborspace (var parser: auto) : auto

:Arguments: * **parser** : auto

.. _function-peg_skip_whitespace_auto_0x5b:

.. das:function:: skip_whitespace(parser: auto) : auto

def skip_whitespace (var parser: auto) : auto

:Arguments: * **parser** : auto

++++++++++++++++
Literal matching
++++++++++++++++

* :ref:`match_decimal_literal (var parser: auto) : tuple <function-peg_match_decimal_literal_auto_0x7d>`
* :ref:`match_double_literal (var parser: auto) : tuple <function-peg_match_double_literal_auto_0xac>`
* :ref:`match_string_literal (var parser: auto) : tuple <function-peg_match_string_literal_auto_0x90>`

.. _function-peg_match_decimal_literal_auto_0x7d:

.. das:function:: match_decimal_literal(parser: auto) : tuple

Simple lexer for decimal integers; does not check for overflow.

:Arguments: * **parser** : auto

.. _function-peg_match_double_literal_auto_0xac:

.. das:function:: match_double_literal(parser: auto) : tuple

Matches doubles of the form ``[-+]? [0-9]* .? [0-9]+ ([eE] [-+]? [0-9]+)?``.
The number is not checked to be representable as defined in IEEE-754.

:Arguments: * **parser** : auto

.. _function-peg_match_string_literal_auto_0x90:

.. das:function:: match_string_literal(parser: auto) : tuple

Tries to match everything inside a pair of double quotes (``""``).

:Arguments: * **parser** : auto

+++++++++++++++++++
Tracing and logging
+++++++++++++++++++

* :ref:`log_fail (var parser: auto; var message: auto) : auto <function-peg_log_fail_auto_auto_0x44>`
* :ref:`log_info (var parser: auto; var message: auto) : auto <function-peg_log_info_auto_auto_0x4e>`
* :ref:`log_plain (var parser: auto; var message: auto) : auto <function-peg_log_plain_auto_auto_0x3f>`
* :ref:`log_success (var parser: auto; var message: auto) : auto <function-peg_log_success_auto_auto_0x49>`
* :ref:`tabulate (var parser: auto) : auto <function-peg_tabulate_auto_0x3b>`

.. _function-peg_log_fail_auto_auto_0x44:

.. das:function:: log_fail(parser: auto; message: auto) : auto

def log_fail (var parser: auto; var message: auto) : auto

:Arguments: * **parser** : auto
            * **message** : auto

.. _function-peg_log_info_auto_auto_0x4e:

.. das:function:: log_info(parser: auto; message: auto) : auto

def log_info (var parser: auto; var message: auto) : auto

:Arguments: * **parser** : auto
            * **message** : auto

.. _function-peg_log_plain_auto_auto_0x3f:

.. das:function:: log_plain(parser: auto; message: auto) : auto

def log_plain (var parser: auto; var message: auto) : auto

:Arguments: * **parser** : auto
            * **message** : auto

.. _function-peg_log_success_auto_auto_0x49:

.. das:function:: log_success(parser: auto; message: auto) : auto

def log_success (var parser: auto; var message: auto) : auto

:Arguments: * **parser** : auto
            * **message** : auto

.. _function-peg_tabulate_auto_0x3b:

.. das:function:: tabulate(parser: auto) : auto

def tabulate (var parser: auto) : auto

:Arguments: * **parser** : auto