Move to ETS #16

Merged 26 commits on Feb 26, 2022
@heywhy heywhy commented Feb 23, 2022

Overview

The changes in this PR let the library store its underlying indexed data in ETS. This improves performance and, most importantly, reduces the amount of data that has to be copied when Elasticlunr.IndexManager.save/1 is called, which was an obvious bottleneck.
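
As a rough illustration of the idea (not the code from this PR), the sketch below keeps postings in an ETS table instead of in process state. The module name `EtsIndexSketch`, the table name, and the `{{term, doc_id}, tf}` row layout are all assumptions made for the example. What gets passed between processes is the table reference, and a lookup copies out only the rows it matches.

```elixir
# Hypothetical sketch; module, table name and row layout are assumptions,
# not Elasticlunr's actual storage format.
defmodule EtsIndexSketch do
  # Creates a public set table and returns its reference.
  def new do
    :ets.new(:index_sketch, [:set, :public, read_concurrency: true])
  end

  # Stores the term frequency for one {term, doc_id} pair.
  def add(table, term, doc_id, tf) do
    true = :ets.insert(table, {{term, doc_id}, tf})
    table
  end

  # Returns a doc_id => tf map for a single term.
  # Only the rows matching the pattern are copied into the caller.
  def postings(table, term) do
    table
    |> :ets.match_object({{term, :_}, :_})
    |> Map.new(fn {{_term, doc_id}, tf} -> {doc_id, tf} end)
  end
end

table = EtsIndexSketch.new()
EtsIndexSketch.add(table, "ets", 1, 2)
EtsIndexSketch.add(table, "ets", 2, 1)
EtsIndexSketch.postings(table, "ets")
```

A `:set` table scans all rows for a partially bound key like this one; a real index would likely pick a key layout or table type suited to its query patterns.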

Also, Task.async_stream/2 is now used when indexing documents, which allows multiple documents to be analyzed and processed concurrently.
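
A minimal sketch of concurrent analysis with `Task.async_stream`, assuming a stand-in analyzer that lowercases and tokenizes a document body. `AsyncIndexSketch`, `analyze/1` and the `%{id: ..., body: ...}` document shape are hypothetical and not taken from the library. The example passes options, so it uses the three-argument form.

```elixir
# Hypothetical sketch; the analyzer and document shape are assumptions.
defmodule AsyncIndexSketch do
  # Stand-in analyzer: lowercase the body and split on non-word characters.
  def analyze(%{id: id, body: body}) do
    tokens = body |> String.downcase() |> String.split(~r/\W+/, trim: true)
    {id, tokens}
  end

  # Analyzes documents concurrently, one task per document,
  # with at most one running task per scheduler.
  def analyze_all(documents) do
    documents
    |> Task.async_stream(&analyze/1,
      max_concurrency: System.schedulers_online(),
      ordered: false
    )
    |> Map.new(fn {:ok, {id, tokens}} -> {id, tokens} end)
  end
end

docs = [%{id: 1, body: "Move to ETS"}, %{id: 2, body: "Async stream"}]
AsyncIndexSketch.analyze_all(docs)
```

`ordered: false` is used here because the results are collected into a map keyed by document id, so arrival order does not matter.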

TODO

  • Decouple underlying index store to use ETS
  • Adjust serialization/deserialization layer to work with these changes
  • Run most indexing functions in an async stream
  • Refactor the disk storage module
  • GitHub Actions are all passing

@heywhy heywhy merged commit 7707595 into master Feb 26, 2022
@heywhy heywhy deleted the dev branch February 26, 2022 10:27
@heywhy heywhy restored the dev branch February 26, 2022 10:27