A work in progress token generator library that uses generative and procedural derive macros to create and implement keywords, symbols and other token types and traits.

# tokengen

tokengen uses Rust procedural derive and generative macros to make producing custom tokens and trait impls for lexers and parsers easier. Be warned that this crate is purely for my own research and implementation purposes, but I am happy to receive feedback, issues and pull requests!

## Usage

Import the crate as a git dependency in your `Cargo.toml`. If you wish to use the derive features, enable them like so:

```toml
[dependencies]
tokengen = { git = "...", features = ["derive"] }
```

Generate one or many keywords, symbols and more using the generative or derive macros. These provide all of the necessary implementations, and spans are automatically calculated from each token's start position and the UTF-8 length of the char provided.

```rust
symbols!(
    [EXCLAMATION_MARK, '!'],
    [POUND_SIGN, '#'],
    [OPEN_PARENTHESIS, '('],
    [CLOSE_PARENTHESIS, ')']
);
keywords!([If, "if"], [Else, "else"]);
```
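The span arithmetic described above can be sketched in plain Rust. This is a hand-written illustration of the idea (the `span_end` helper is hypothetical), not the macro's actual expansion:

```rust
// Hypothetical sketch of the span calculation a generated symbol performs:
// the end offset is the start plus the UTF-8 byte length of the char.
fn span_end(start: usize, symbol: char) -> usize {
    start + symbol.len_utf8()
}

fn main() {
    // '!' is one byte in UTF-8, so a symbol starting at offset 3 ends at 4.
    assert_eq!(span_end(3, '!'), 4);
    // A multi-byte char such as 'λ' (2 bytes) shifts the end accordingly.
    assert_eq!(span_end(0, 'λ'), 2);
}
```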

Using the derive macros requires the `derive` feature.

You can then take your symbols and create token groups. Some are provided for you, but everything is there to make a custom one if you wish. Additionally, supplied symbols will be considered part of the pattern of a token. These macros also generate enums for the kind, which is useful for pattern matching, as well as a sum type of all structs generated by the macro. The sum types are named after the macros in non-plural form, i.e. `symbols!` -> `Symbol`.

```rust
symbols!([PLUS, '+'], [HYPHEN, '-'], [SEMICOLON, ';'], [OPEN_PAREN, '(']);
operators!([Addition, [PLUS]], [Subtraction, [HYPHEN]]);
punctuators!([Semi, [SEMICOLON]]);
delimiters!([OpenParen, [OPEN_PAREN]]);
```

Finally, with all of the token types declared, add them to the sum type of your language's token. The identifiers enclosed in the curly braces are the variants of the sum type and must already exist. The following are provided by the library, but can be extended so long as the trait requirements are met. Both the traits and their derive macro components are available.

```rust
generate_token_sum_type!([MyLanguageToken, { Keyword, Operator, Punctuator, Delimiter, Ident }]);
```
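As a rough, hand-rolled illustration of the kind of pattern matching such a sum type enables (the shape and payloads here are hypothetical, not the macro's actual output):

```rust
// Hypothetical, hand-written stand-in for a generated token sum type.
#[derive(Debug, PartialEq)]
enum MyLanguageToken {
    Keyword(String),
    Operator(char),
    Punctuator(char),
}

// The generated kind enums make this sort of dispatch ergonomic.
fn describe(tok: &MyLanguageToken) -> &'static str {
    match tok {
        MyLanguageToken::Keyword(_) => "keyword",
        MyLanguageToken::Operator(_) => "operator",
        MyLanguageToken::Punctuator(_) => "punctuator",
    }
}

fn main() {
    assert_eq!(describe(&MyLanguageToken::Keyword("if".into())), "keyword");
    assert_eq!(describe(&MyLanguageToken::Operator('+')), "operator");
    assert_eq!(describe(&MyLanguageToken::Punctuator(';')), "punctuator");
}
```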

General (manual) implementation of `Span`:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Ident {
    span: SourceSpan,
}
impl Ident {
    pub fn new(src: &str, start: usize, end: usize) -> Self {
        Self {
            span: SourceSpan::new(src, start, end),
        }
    }
}
impl Span for Ident {
    fn src(&self) -> &Arc<str> {
        self.span.src()
    }
    fn start(&self) -> usize {
        self.span.start()
    }
    fn end(&self) -> usize {
        self.span.end()
    }
    fn span(&self) -> &SourceSpan {
        self.span.span()
    }
    fn len(&self) -> usize {
        self.span.len()
    }
}
```
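For illustration, here is a minimal stand-in for `SourceSpan` showing how such a span can recover its length and text; the real type lives in tokengen and its API may differ:

```rust
use std::sync::Arc;

// Minimal stand-in for the crate's SourceSpan, assumed to hold the source
// text plus byte offsets. Not the actual tokengen type.
#[derive(Debug, Clone, PartialEq)]
struct SourceSpan {
    src: Arc<str>,
    start: usize,
    end: usize,
}

impl SourceSpan {
    fn new(src: &str, start: usize, end: usize) -> Self {
        Self { src: Arc::from(src), start, end }
    }
    fn len(&self) -> usize {
        self.end - self.start
    }
    // Slice the spanned text back out of the source.
    fn text(&self) -> &str {
        &self.src[self.start..self.end]
    }
}

fn main() {
    let src = "let answer = 42;";
    let span = SourceSpan::new(src, 4, 10); // covers "answer"
    assert_eq!(span.len(), 6);
    assert_eq!(span.text(), "answer");
}
```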

Derive `Span`:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, tokengen::tokengen_derive::Span)]
pub struct Ident {
    span: SourceSpan,
}
impl Ident {
    pub fn new(src: &str, start: usize, end: usize) -> Self {
        Self {
            span: SourceSpan::new(src, start, end),
        }
    }
}
```

Note: The `Span` derive macro's only requirement is a field where the span is declared. In the future this will probably be expanded to cover more ground, e.g. by allowing an attribute to declare how spans are defined for the derive macro.

Now you have some generated types that automatically implement key features used in most lexing and parsing scenarios!
