Pick a markdown and code style:

RUSTLR: Bottom-Up Parser Generator for Rust

Rustlr was originally created as a platform to explore advanced ideas in grammars and parsing. It was also designed to be practcial and can compete with other parser generators in terms of features and usability. Among its abilities are

  1. The option of automatically generating the abstract syntax datatypes and semantic actions from hints given in the grammar, in whole or in part. The AST types do not necessarily mirror the format of the grammar: one can declare dependencies that allow multiple non-terminal symbols to share a common type.

  2. The option of generating recursive AST types using the "bumpalo" crate, which enables deep pattern matching on recursive structures by replacing smart pointers with transparent references.

  3. The possibility of using regular-expression style operators such as *, + and ? directly inside the grammar. This is not as easy to do as it sounds as the additional grammar rules required to support them may lead to new ambiguities.

  4. The option of using a larger class of unambiguous grammars compared to traditional LR and LALR, based on Selective Marcus-Leermakers delayed reductions. For example, the following grammar

  S -->  a B c  |  A B d
  B --> b | B b
  A --> a

is unambiguous but is not LR(k) because it cannot decide whether to reduce a to A with a fixed number of lookaheads. However, rustlr can still genereate a deterministic parser for it by automatically transforming it to

  S -->  a B c  |  N d
  B --> b | B b
  N --> a b | a B b

which is an LR(1) grammar. Functional compositions can preserve the semantics of values computed for the original grammar. See the Appendix. Other experimental features include an "unexpected wildcard" symbol.

  1. The ability to interactively train the parser to give better error messages.

  2. Automatically generates a lexical scanner from the declaration of terminal symbols.

Rustlr also implements capabilities found in traditional LR parsing tools including operator precedence and associativity declarations and error recovery mechanisms. Rustlr is designed for the parsing of programming language syntax. It is not designed to parse natural languages, or binary data, although there's also nothing that prevents it from used for those purposes. Rustlr generates parsers for Rust and for F# (Microsoft's version of OCaml) and will target other typed, functional programming languages as these languages often lack choices in parsing tools.

Documentation on docs.rs serves as technical reference but not as tutorial.

The tutorial is organized around a set of examples, with each example explaining a set of more advanced features.

Tutorial Chapters

The chapters in bold listed below are complete. The others provide additional examples and generally contain enough comments to be readable. The first two chapters suffice to give a good working knowledge of the principal capabilities of rustlr.

  1. Introduction: Basic Calculators
  2. Advanced Examples
  3. Interactive Training for Enhanced Error Reporting
  4. Generating Bump-allocated ASTs
  5. Error Recovery Options
  6. Challenging Scenarios
  7. Special Example: Non-context free language, using External State. Link to full crate
  8. Special Chapter: Generating a Parser for F#
  9. Appendix: Experimental Features (Delayed Reductions and the Wildcard Token)

Additional Chapters Under Construction

References