This repository is the microservice that implements the HDX Index Adapter functionality.
First, make sure that you have the API gateway running locally.
We're using Docker which, luckily for you, means that getting the application running locally should be fairly painless. First, make sure that you have Docker Compose installed on your machine.
git clone https://github.com/GPSDD/hdx-index-connector.git
cd hdx-index-connector
./adapter.sh develop
```text
You can now access the microservice through the CT gateway.
It is necessary to define these environment variables:
- CT_URL => Control Tower URL
- NODE_ENV => Environment (prod, staging, dev)
This component executes a periodic task that updates the metadata of each indexed RW dataset. The task is bootstrapped
when the application server starts.
The task's implementation can be found on app/src/cron/cron
and the configuration is loaded from the
config files
The field correspondence is based on the metadata object for a single package - I.E. this link.
In the HDX domain, a package
entity may refer to multiple data files - identified as resources
.
Given that this structure does not match directly to the API structure, we use the following logic to map the HDX domain structure to ours:
- Each HDX
package
tentatively corresponds to one API Highways dataset. - Within each
package
, if there's one and onlyresource
withformat
of typeJSON
, we use thatresource
on step 4. If not, we proceed to step 3. - Within each
package
, if there's one and onlyresource
withformat
of typeCSV
, we use thatresource
on step 4. If not, thedataset
status is set tofailed
and no metadata is created. - We combine the
package
data and the selectedresource
data to generate metadata as described in the spec table below.
Field in SDG Metadata | Field in HDX data | Value |
---|---|---|
userId | ||
language | 'en' | |
resource | ||
name | package.title , or the name provided on the create request as fallback |
|
description | package.resource.description |
|
sourceOrganization | package.organization.title |
|
dataDownloadUrl | 'https://data.humdata.org' + package.resource.hdx_rel_url |
|
dataSourceUrl | 'https://data.humdata.org/dataset/' + package.name |
|
dataSourceEndpoint | 'https://data.humdata.org' + package.resource.hdx_rel_url |
|
license | Try to match the value of package.license to one of the accepted licenses, fallback to 'Other' |
|
status | 'published' |
HDX datasets have tags associated with them, which this connector uses to tag the index datasets. The tags are
loaded from the HDX metadata response using the following JSONPath expression: $.result.tags[*].display_name
Additionally, each HDX dataset is tagged with the "HDX API" tag, and a tag to match
the RW API application to which they belong.
No graph tagging is done on HDX datasets.