node-fts

a full-text search engine built in Node.js, using an inverted index as its search data structure.


NOTE: this implementation is not production-ready yet. It only provides in-memory data storage for full-text search, coupled with an express server for external use.

core architecture

the search functionality is implemented using the following pipeline

data pre-processing

before the data is stored, it is cleaned and processed using the following filters:

  • tokenization
  • lowercase filter
  • stopwords filter
  • punctuations filter
  • stemming filter [TODO]

for the detailed implementation, check search.ts.
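The filters listed above can be sketched as a chain of small functions. This is a minimal illustration, not the code in search.ts: the function names, the stopword list, and the ASCII-only punctuation rule are all assumptions. Punctuation is stripped before the stopword check here so that a token such as "The," is still recognised as a stopword.

```typescript
// Hypothetical stopword list; the list used by search.ts may differ.
const STOPWORDS: ReadonlySet<string> = new Set([
  "a", "an", "and", "the", "of", "in", "is", "to",
]);

// Split the text on whitespace into raw tokens.
function tokenize(text: string): string[] {
  return text.split(/\s+/).filter((t) => t.length > 0);
}

// Normalise case so that "Human" and "human" map to the same token.
function lowercaseFilter(tokens: string[]): string[] {
  return tokens.map((t) => t.toLowerCase());
}

// Remove every character that is not a-z or 0-9 (assumes ASCII text that
// has already been lowercased), then drop tokens that became empty.
function punctuationFilter(tokens: string[]): string[] {
  return tokens
    .map((t) => t.replace(/[^a-z0-9]/g, ""))
    .filter((t) => t.length > 0);
}

// Drop very common words that carry little search value.
function stopwordFilter(tokens: string[]): string[] {
  return tokens.filter((t) => !STOPWORDS.has(t));
}

// Full pipeline: tokenize, lowercase, strip punctuation, remove stopwords.
function analyze(text: string): string[] {
  return stopwordFilter(punctuationFilter(lowercaseFilter(tokenize(text))));
}

console.log(analyze("The Quick, brown fox is in the box!"));
// prints [ 'quick', 'brown', 'fox', 'box' ]
```

The same `analyze` function would be applied both to stored documents and to incoming search queries, so that both sides produce comparable tokens.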

creating inverted index

once the tokens are processed, they are added to an inverted index, a data structure that maps each token to the list of documents in which it appears. Currently the server maintains only one index for searching, and it returns the documents that match every token of the search query (the intersection of the per-token document lists).

[TODO]: implement an inverted index with weighted ranks to support union of search query tokens.

for the detailed implementation of the inverted index, check post.ts.
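To make the idea concrete, here is a minimal sketch of an inverted index with intersection-based lookup. It is an illustration under assumptions, not the code in post.ts: the class name `InvertedIndex`, its methods, and the use of numeric document ids are all hypothetical.

```typescript
// Maps each token to the set of document ids that contain it.
class InvertedIndex {
  private readonly postings = new Map<string, Set<number>>();

  // Register every token of a document under that document's id.
  add(docId: number, tokens: string[]): void {
    for (const token of tokens) {
      let ids = this.postings.get(token);
      if (ids === undefined) {
        ids = new Set<number>();
        this.postings.set(token, ids);
      }
      ids.add(docId);
    }
  }

  // Return the ids of documents that contain ALL query tokens.
  search(tokens: string[]): number[] {
    if (tokens.length === 0) return [];
    let result: number[] | undefined;
    for (const token of tokens) {
      const ids = this.postings.get(token);
      // An unknown token means no document can match every token.
      if (ids === undefined) return [];
      result =
        result === undefined
          ? Array.from(ids)
          : result.filter((id) => ids.has(id));
    }
    return (result ?? []).sort((a, b) => a - b);
  }
}

const index = new InvertedIndex();
index.add(1, ["customer", "support", "agent"]);
index.add(2, ["customer", "data", "analyst"]);
index.add(3, ["human", "communications", "representative"]);

console.log(index.search(["customer"]));          // prints [ 1, 2 ]
console.log(index.search(["customer", "data"]));  // prints [ 2 ]
console.log(index.search(["customer", "human"])); // prints []
```

Because every query token narrows the candidate set, adding tokens to a query can only shrink the result, which is the behaviour the intersection approach implies.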

sorting index

documents can be retrieved sorted by any of several fields at high speed, because a dedicated index is maintained and updated for each sortable field.
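One way such a per-field index could work is to keep the documents in an array that is always sorted by that field, inserting each new document at its sorted position and serving pages as slices. The sketch below is only an assumption about the approach: the `Post` shape, the `SortIndex` class, and the zero-based `page` and `limit` semantics are hypothetical, and dates are assumed to be ISO 8601 strings so that string comparison orders them correctly.

```typescript
// Hypothetical document shape; only name and dateLastEdited appear in the
// query examples, the id field is an assumption.
interface Post {
  id: number;
  name: string;
  dateLastEdited: string; // assumed ISO 8601, e.g. "2020-05-01T10:00:00Z"
}

type SortField = "name" | "dateLastEdited";

class SortIndex {
  private readonly field: SortField;
  private readonly sorted: Post[] = [];

  constructor(field: SortField) {
    this.field = field;
  }

  // Binary search for the insertion point keeps the array sorted, so no
  // full re-sort is needed when a document is added.
  insert(post: Post): void {
    const key = post[this.field];
    let lo = 0;
    let hi = this.sorted.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.sorted[mid][this.field] <= key) lo = mid + 1;
      else hi = mid;
    }
    this.sorted.splice(lo, 0, post);
  }

  // Return one page (zero-based). Descending order reads from the end of
  // the same array instead of maintaining a second index.
  page(page: number, limit: number, asc = true): Post[] {
    const start = page * limit;
    if (asc) return this.sorted.slice(start, start + limit);
    const end = this.sorted.length - start;
    if (end <= 0) return [];
    return this.sorted.slice(Math.max(0, end - limit), end).reverse();
  }
}

const byName = new SortIndex("name");
byName.insert({ id: 1, name: "charlie", dateLastEdited: "2020-01-03T00:00:00Z" });
byName.insert({ id: 2, name: "alpha", dateLastEdited: "2020-01-01T00:00:00Z" });
byName.insert({ id: 3, name: "bravo", dateLastEdited: "2020-01-02T00:00:00Z" });

console.log(byName.page(0, 2).map((p) => p.name));        // prints [ 'alpha', 'bravo' ]
console.log(byName.page(0, 2, false).map((p) => p.name)); // prints [ 'charlie', 'bravo' ]
```

The trade-off in this sketch is a slower write (array insertion is linear) in exchange for reads that never sort at query time.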

getting started

requirements

  • docker
  • docker-compose

start the server using the following command

docker-compose -f ./deployment/docker-compose.yaml up -d

sample usage

sort queries

# get all posts at page = 0
# posts ordered by dateLastEdited
localhost:3000/api/post

# get all posts at page = 2
# posts ordered by dateLastEdited
localhost:3000/api/post?page=2

# get all posts at page = 2
# posts ordered by dateLastEdited in descending order
localhost:3000/api/post?asc=false&page=2

# get all posts at page = 2 and limit 15
# posts ordered by dateLastEdited
localhost:3000/api/post?limit=15&page=2

# get all posts at page = 0
# posts ordered by name
localhost:3000/api/post?sortBy=name

# get all posts at page = 2
# posts ordered by name
localhost:3000/api/post?sortBy=name&page=2

# get all posts at page = 2
# posts ordered by name in descending order
localhost:3000/api/post?asc=false&sortBy=name&page=2

# get all posts at page = 2 and limit 15
# posts ordered by name
localhost:3000/api/post?limit=15&sortBy=name&page=2
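The sort queries above can also be assembled programmatically. The helper below is a small client-side sketch: `buildPostUrl` and the `PostQuery` type are hypothetical names, the parameter names are taken from the examples above, and the `http://` scheme and base address are assumptions.

```typescript
// Query parameters seen in the examples; all are optional.
interface PostQuery {
  page?: number;
  limit?: number;
  sortBy?: string;
  asc?: boolean;
}

// Build a request URL for /api/post, skipping parameters that are not set.
function buildPostUrl(
  params: PostQuery,
  base = "http://localhost:3000",
): string {
  const url = new URL("/api/post", base);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(buildPostUrl({}));
// prints http://localhost:3000/api/post
console.log(buildPostUrl({ asc: false, sortBy: "name", page: 2 }));
// prints http://localhost:3000/api/post?asc=false&sortBy=name&page=2
```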

search queries

# search query in name
localhost:3000/api/post?searchIn=name&query=customer

# search query in name with pagination
localhost:3000/api/post?searchIn=name&page=1&query=human

# search with exact query
localhost:3000/api/post?searchIn=name&query="Human Communications Representative"

# search in description with pagination
localhost:3000/api/post?limit=15&searchIn=description&page=1&query=vel

# exact search in description
localhost:3000/api/post?searchIn=description&query="Explicabo quae rerum dolorum nostrum aut"
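When these search URLs are built in code rather than typed into a browser, the quotes and spaces of an exact query need to be URL-encoded. The sketch below shows one way to do that with the standard `URLSearchParams` class; `buildSearchUrl` is a hypothetical helper, and the `http://` scheme and base address are assumptions.

```typescript
// Build a search URL; when exact is true the query is wrapped in double
// quotes, matching the exact-search examples above.
function buildSearchUrl(
  searchIn: "name" | "description",
  query: string,
  exact = false,
  base = "http://localhost:3000",
): string {
  const params = new URLSearchParams({
    searchIn,
    query: exact ? `"${query}"` : query,
  });
  // URLSearchParams encodes '"' as %22 and spaces as '+'.
  return `${base}/api/post?${params.toString()}`;
}

console.log(buildSearchUrl("name", "customer"));
// prints http://localhost:3000/api/post?searchIn=name&query=customer
console.log(buildSearchUrl("name", "Human Communications Representative", true));
// prints http://localhost:3000/api/post?searchIn=name&query=%22Human+Communications+Representative%22
```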

check application logs

all logs generated by the server are extracted by fluentd and dumped here.

Author

Akshit Sadana [email protected]
