The DBpedia Lookup can be used to index and search the contents of RDF files or databases.
The search engine is based on the Lucene framework. RDF parsing and SPARQL querying are handled by the Apache Jena framework, which supports a wide range of RDF formats.
The general idea behind this indexer is to leverage the power of the SPARQL query language to select specific key-value pairs from a knowledge graph and add them to an inverted index. A user can then search over the values and quickly retrieve the associated keys using fuzzy matching.
In order to create a meaningful index structure, it is important to have a rough understanding of the knowledge graph being indexed and to design the SPARQL queries accordingly.
A Lucene index can be understood as a collection of documents. Each document has a unique ID and can have multiple fields with one or more values each. The document collection is indexed in a way that documents can be found by searching over the values of all or only some fields. The lookup indexer handles the process of converting a knowledge graph into such a document collection.
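As a rough illustration of this document model (not the Lucene or Lookup implementation), the following Python sketch groups hypothetical SPARQL result rows of the form (key, field, value) into documents and then searches them. The field names `label` and `comment` and the sample rows are assumptions made for the example. A linear scan with case-insensitive prefix matching stands in for the inverted index and for Lucene's fuzzy matching.

```python
from collections import defaultdict

# Hypothetical SPARQL result rows: (resource URI, field name, value).
ROWS = [
    ("http://dbpedia.org/resource/Berlin", "label", "Berlin"),
    ("http://dbpedia.org/resource/Berlin", "comment", "Capital of Germany"),
    ("http://dbpedia.org/resource/Leipzig", "label", "Leipzig"),
    ("http://dbpedia.org/resource/Hamburg", "label", "Hamburg"),
    ("http://dbpedia.org/resource/Hamburg", "comment", "Port city in northern Germany"),
]


def build_documents(rows):
    """Group rows by key: one document per URI, each field holding a list of values."""
    documents = defaultdict(lambda: defaultdict(list))
    for key, field, value in rows:
        documents[key][field].append(value)
    return documents


def search(documents, query, fields=None):
    """Return IDs of documents where a word in the selected fields starts with the query."""
    needle = query.lower()
    hits = []
    for doc_id, doc in documents.items():
        for field, values in doc.items():
            if fields is not None and field not in fields:
                continue  # skip fields the caller did not ask to search
            words = [w for v in values for w in v.lower().split()]
            if any(w.startswith(needle) for w in words):
                hits.append(doc_id)
                break  # one matching field is enough for this document
    return hits


documents = build_documents(ROWS)
print(search(documents, "Ber"))                     # search over all fields
print(search(documents, "germ", fields={"label"}))  # restricted to the label field
```

Restricting the search to `label` returns no hits for "germ", while searching all fields also matches the `comment` values. This mirrors the choice between searching over all fields or only some of them.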
The examples folder contains configuration files for a search index over a part of the DBpedia knowledge graph (using https://dbpedia.org/sparql).
It contains:
- a configuration file for the lookup server instance (config.yml)
- a configuration file for the indexing request (dbpedia-resource-indexer.yml)
Run a server instance using the provided configuration in config.yml.
You can run the "Launch Lookup Server" setup from the launch-config.json in Visual Studio Code.
Alternatively, you can use Maven to build a .jar file by issuing
mvn package
and then running the resulting lookup-1.0-jar-with-dependencies.jar file via
java -jar ./target/lookup-1.0-jar-with-dependencies.jar -c ../examples/config.yml
Run the indexing process. Issue the following HTTP request:
curl --request POST \
--url http://localhost:8082/api/index/run \
--header 'Content-Type: multipart/form-data' \
--form [email protected] \
--form values=http://dbpedia.org/resource/Berlin,http://dbpedia.org/resource/Leipzig,http://dbpedia.org/resource/Hamburg
This sends an indexing request to the indexer API, which fetches indexable data for the specified resource URIs from the DBpedia knowledge graph.
Subsequently, the following request should return a result with the DBpedia entry of the city Berlin.
curl "http://localhost:8082/api/search?query=Ber"
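The same search request can be sketched in Python using only the standard library. This is a minimal sketch under two assumptions: the server is running on port 8082 as in the requests above, and the response body is JSON. The helper names `build_search_url` and `search` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8082"  # host and port taken from the curl request


def build_search_url(query, base_url=BASE_URL):
    """Build the search URL, percent-encoding the query string."""
    return f"{base_url}/api/search?{urlencode({'query': query})}"


def search(query, base_url=BASE_URL):
    """Send the search request; assumes the server answers with a JSON body."""
    with urlopen(build_search_url(query, base_url)) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call search("Ber") once the server is running.
    print(build_search_url("Ber"))
```

Building the URL with `urlencode` keeps queries containing spaces or special characters valid, which the plain curl call does not handle on its own.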
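There are two types of configuration files for lookup, each with its own documentation: the server configuration (as in config.yml) and the indexing configuration (as in dbpedia-resource-indexer.yml).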
Questions and suggestions concerning this app and service can be posted in the discussion thread on the DBpedia forums.
To build the Docker image, run:
cd lookup
mvn package
docker build -t lookup .
Do this before running docker compose up.