The APIs can be accessed and tested via the Swagger UI at http://localhost:8000/docs/.
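The Swagger UI at `/docs` suggests the server is a FastAPI app; if so, the machine-readable schema should also be served at `/openapi.json` (an assumption, not confirmed by this section). A minimal sketch for listing the exposed endpoints once the server is running:

```python
# List the endpoints exposed by the running service, assuming a
# FastAPI-style /openapi.json schema (an assumption; adjust if the
# app serves its schema elsewhere).
import requests

schema = requests.get("http://localhost:8000/openapi.json", timeout=10).json()
for path, methods in schema["paths"].items():
    print(", ".join(method.upper() for method in methods), path)
```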
- Install the dependencies with `pip install -r requirements.txt`.
- Run Neo4j and update the `.env` file (a sample `.env` and a connectivity check are sketched after this list).
- Run `python add_data.py` to load the extracted scientific knowledge graph into the graph database (this takes quite a long time).
- Run `python cluster_and_drop.py` to drop some semantic/syntactic duplicates.
- Run `python gen_vocab.py` to generate a replica of the vocabulary from the graph database.
- Run `python app.py` to serve the endpoints (this also generates the set of embedding vectors from the previous step if it does not exist).
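The exact keys the scripts read from `.env` are not shown here; the following is a minimal sketch assuming the standard Neo4j connection settings. The variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` are assumptions, so check the scripts for the keys they actually read:

```env
# Hypothetical keys -- verify against the scripts before use.
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password
```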
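With those values in place, a quick connectivity check can rule out configuration problems before the long-running `add_data.py` step. This sketch assumes the official `neo4j` Python driver and the `python-dotenv` package, neither of which is confirmed by this section:

```python
# Verify that the Neo4j instance is reachable with the .env credentials.
# Assumes the hypothetical NEO4J_* keys sketched above.
import os

from dotenv import load_dotenv
from neo4j import GraphDatabase

load_dotenv()  # pull the NEO4J_* settings from the .env file
driver = GraphDatabase.driver(
    os.environ["NEO4J_URI"],
    auth=(os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"]),
)
driver.verify_connectivity()  # raises if the database is unreachable
driver.close()
```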
The following files contain the essential fields used for constructing the knowledge graph. You can modify the dataset and scripts to add more information to the graph.

- Metadata of the arXiv dataset retrieved from Cornell-University/arxiv, filtered to only the Computation and Language (cs.CL) category (see the filtering sketch after this list).
- Citations and references for each publication in the arXiv cs.CL dataset.
- The combination of the retrieved metadata from Cornell-University/arxiv and the additional essential fields.
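As referenced above, the cs.CL subset can be rebuilt from the raw Kaggle snapshot along these lines. The input file name `arxiv-metadata-oai-snapshot.json` is the one distributed with Cornell-University/arxiv, and the output path is hypothetical:

```python
# Filter the raw arXiv metadata snapshot down to cs.CL records.
import json

with open("arxiv-metadata-oai-snapshot.json", encoding="utf-8") as src, \
        open("arxiv-cs-cl.jsonl", "w", encoding="utf-8") as dst:
    for line in src:  # the snapshot stores one JSON record per line
        record = json.loads(line)
        # `categories` is a space-separated string, e.g. "cs.CL cs.LG"
        if "cs.CL" in record.get("categories", "").split():
            dst.write(line)
```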