Update the tokenizer composability model. #281

Merged (5 commits) on Aug 18, 2024

Conversation

teo-tsirpanis
Owner

While porting the samples to FarkleNeo, it was discovered that the existing tokenizer extensibility model was not adequate to implement the indent-based grammar. This PR makes the following changes:

  • Instead of going through the chain once, the chained tokenizer keeps running all components until:
    • One of them returns true.
    • One of them suspends.
    • No input characters were consumed after all components had the opportunity to run and returned false.
  • Tokenizer implementations are no longer greedy. When they encounter a noise symbol, they must consume its characters and return false instead of continuing until they encounter a terminal.
    • Tokenizers can forgo yielding if they are the only ones in the chain, and an extension method on ParserInputReader<TChar> was added to check for it.
      • Checking for this property is mandatory for tokenizers that can skip the chained tokenizer wrapping.
    • The chained tokenizer wrapping maintains its eager semantics, which means that from the perspective of the parser, if a tokenizer returns false, the parser must still exit and return with more available input.
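
The loop described above can be modeled roughly as follows. This is a simplified, language-neutral sketch in Python and not Farkle's actual C# API: `Step`, `Reader`, `read_word`, `skip_whitespace` and `run_chain` are hypothetical names invented for illustration, and the three outcomes stand in for "returns true", "returns false" and "suspends".

```python
from enum import Enum, auto
from typing import Callable, List, Optional, Tuple


class Step(Enum):
    TOKEN = auto()     # models a component returning true (token found)
    NO_TOKEN = auto()  # models a component returning false
    SUSPEND = auto()   # models a component suspending


class Reader:
    """Hypothetical stand-in for the parser's input reader."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def peek(self) -> str:
        # Returns "" at end of input, which is neither alnum nor space.
        return self.text[self.pos] if self.pos < len(self.text) else ""


Result = Tuple[Step, Optional[str]]
Component = Callable[[Reader], Result]


def read_word(reader: Reader) -> Result:
    """Terminal component: yields a token for a run of alphanumerics."""
    start = reader.pos
    while reader.peek().isalnum():
        reader.pos += 1
    if reader.pos > start:
        return Step.TOKEN, reader.text[start:reader.pos]
    return Step.NO_TOKEN, None


def skip_whitespace(reader: Reader) -> Result:
    """Non-greedy noise handling: consume the noise, then return false
    instead of scanning ahead for the next terminal."""
    while reader.peek().isspace():
        reader.pos += 1
    return Step.NO_TOKEN, None


def run_chain(components: List[Component], reader: Reader) -> Result:
    """Keep running full passes over the chain until a component returns
    true, a component suspends, or a full pass consumes no input."""
    while True:
        pass_start = reader.pos
        for component in components:
            step, token = component(reader)
            if step is not Step.NO_TOKEN:
                return step, token
        if reader.pos == pass_start:
            # Every component ran, returned false, and nothing was consumed.
            return Step.NO_TOKEN, None
```

With the chain `[read_word, skip_whitespace]` and the input `"  foo"`, the first pass only consumes the leading whitespace, and the second pass is what lets `read_word` produce `foo`. A single pass over the chain would have returned false here, which is the situation the new loop is meant to handle.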

The new design was validated locally by successfully running the indent-based grammar checks.

sonarcloud bot commented Aug 18, 2024

@teo-tsirpanis teo-tsirpanis merged commit 206b562 into mainstream Aug 18, 2024
5 checks passed
@teo-tsirpanis teo-tsirpanis deleted the tokenizer-api-update branch August 18, 2024 15:58