It looks like the Qleverfile has grown complex enough to warrant a more flexible format. A popular choice would be YAML, which can be validated with JSON Schema and supports arrays and deeply nested structures.
The only problem I see is variable substitution, which I don't think YAML supports out of the box. But since it looks more like scripting than simple interpolation, I think it is independent of how the file is formatted.
Anyway, here's a possible rendition of the Wikidata example:
# Qleverfile for Wikidata, use with the QLever CLI (`pip install qlever`)
#
# qlever get-data  # ~7 hours, ~110 GB (compressed), ~20 billion triples
# qlever index     # ~5 hours, ~20 GB RAM, ~500 GB index size on disk
# qlever start     # a few seconds, adjust MEMORY_FOR_QUERIES as needed
#
# Adding a text index takes an additional ~2 hours and ~50 GB of disk space
#
# Measured on an AMD Ryzen 9 5950X with 128 GB RAM, and NVMe SSD (18.10.2024)

name: &name wikidata

env:
  GET_DATA_URL: https://dumps.wikimedia.org/wikidatawiki/entities
  DATE_WIKIDATA: $$(date -r latest-all.ttl.bz2 +%d.%m.%Y || echo "NO_DATE")
  DATE_WIKIPEDIA: $$(date -r wikipedia-abstracts.nt +%d.%m.%Y || echo "NO_DATE")

data:
  get-data-cmd: >
    curl -LRC - -O ${GET_DATA_URL}/latest-all.ttl.bz2 -O ${GET_DATA_URL}/latest-lexemes.ttl.bz2 2>&1
    | tee wikidata.download-log.txt
    && curl -sL ${GET_DATA_URL}/dcatap.rdf
    | docker run -i --rm -v $$(pwd):/data stain/jena riot --syntax=RDF/XML --output=NT /dev/stdin
    > dcatap.nt
  description: Full Wikidata dump from ${GET_DATA_URL} (latest-all.ttl.bz2 and latest-lexemes.ttl.bz2, version ${DATE_WIKIDATA})

index:
  input-files: [ latest-all.ttl.bz2, latest-lexemes.ttl.bz2, dcatap.nt ]
  input:
    - cmd: lbzcat -n 4 latest-all.ttl.bz2
      format: ttl
      parallel: true
    - cmd: lbzcat -n 1 latest-lexemes.ttl.bz2
      format: ttl
      parallel: false
    - cmd: cat dcatap.nt
      format: nt
  settings:
    languages-internal: []
    prefixes-external: [""]
    locale:
      language: en
      country: US
      ignore-punctuation: true
    ascii-prefixes-only: true
    num-triples-per-batch: 5000000
  stxxl-memory: 10G

server:
  port: 7001
  access-token: *name
  memory-for-queries: 20G
  cache-max-size: 15G
  cache-max-size-single-entry: 5G
  timeout: 600s

runtime:
  system: docker
  image: adfreiburg/qlever

ui:
  config: wikidata
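
To illustrate that the substitution really is a separate step from parsing, here is a rough sketch of how `${NAME}` references could be resolved after the YAML has been loaded. It is not how the QLever CLI does it; it uses Python's standard `string.Template` as a stand-in, which happens to share the `$$` escape for a literal `$`. The `config` dict is a hand-written, abbreviated stand-in for the parsed file, and `interpolate` and `resolve` are hypothetical helper names.

```python
from string import Template

# Hypothetical, abbreviated result of parsing the YAML (e.g. with a YAML library).
config = {
    "env": {
        "GET_DATA_URL": "https://dumps.wikimedia.org/wikidatawiki/entities",
        "DATE_WIKIDATA": '$$(date -r latest-all.ttl.bz2 +%d.%m.%Y || echo "NO_DATE")',
    },
    "data": {
        "get-data-cmd": "curl -LRC - -O ${GET_DATA_URL}/latest-all.ttl.bz2",
        "description": "Full Wikidata dump from ${GET_DATA_URL} (version ${DATE_WIKIDATA})",
    },
}


def interpolate(node, variables):
    """Recursively replace ${NAME} in all string values; '$$' becomes '$'."""
    if isinstance(node, str):
        # safe_substitute leaves unknown or malformed placeholders untouched.
        return Template(node).safe_substitute(variables)
    if isinstance(node, dict):
        return {key: interpolate(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [interpolate(item, variables) for item in node]
    return node


def resolve(config):
    # First pass: unescape the env values themselves ('$$(' becomes '$(').
    variables = interpolate(config.get("env", {}), {})
    # Second pass: substitute the env variables into every string in the file.
    return interpolate(config, variables)


resolved = resolve(config)
print(resolved["data"]["get-data-cmd"])
```

The shell parts (`$(date ...)`, `$(pwd)`) are left as plain text here and would only be evaluated when the command is actually run, which is why I think of them as scripting rather than interpolation.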