Align with CSV-Format used in jskos-cli #266

Open
nichtich opened this issue Feb 10, 2025 · 3 comments

@nichtich

We (project coli-conc, also used in NFDI4Objects) support conversion from CSV to SKOS as well, via jskos-convert. jskos-convert supports these column names (≈ table Concepts):

  • notation (skos:notation, also used to build a URI from the namespace) ≈ Concept IRI
  • prefLabel (default language) and prefLabel@xx (arbitrary language code) ≈ Preferred Label + Preferred Label Language Codes
  • altLabel (default language) and altLabels@xx (arbitrary language code) ≈ Alternate Labels
  • scopeNote (default language) and scopeNotes@xx (arbitrary language code) ≈ Definition + Definition Language Code
  • broaderNotation (≈ reverse of Children URIs)
  • level (similar to indentation, but a numerical value instead of spaces)
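
The column conventions listed above can be illustrated with a short Python sketch. This is not how jskos-convert is implemented: the namespace, the sample rows, the default language `en`, and the assumption that `level` starts at 0 for top concepts are all made up for illustration. Only `prefLabel@xx` is used as a language-suffixed column here.

```python
import csv
import io

# Hypothetical namespace, default language, and sample data (illustration only).
NAMESPACE = "http://example.org/voc/"
DEFAULT_LANG = "en"

SAMPLE_CSV = """notation,prefLabel,prefLabel@de,altLabel,scopeNote,broaderNotation
A,Animals,Tiere,Fauna,Living organisms that are not plants,
A1,Mammals,Säugetiere,,,A
"""

LABEL_COLUMNS = ("prefLabel", "altLabel", "scopeNote")


def split_column(name):
    """Split 'prefLabel@de' into ('prefLabel', 'de'); no suffix means default language."""
    base, _, lang = name.partition("@")
    return base, lang or DEFAULT_LANG


def read_concepts(text, namespace=NAMESPACE):
    """Turn CSV rows into simple concept dicts with language maps."""
    concepts = []
    for row in csv.DictReader(io.StringIO(text)):
        # The concept URI is built from namespace + notation.
        concept = {"uri": namespace + row["notation"], "notation": row["notation"]}
        for column, value in row.items():
            base, lang = split_column(column)
            if base in LABEL_COLUMNS and value:
                concept.setdefault(base, {})[lang] = value
        if row.get("broaderNotation"):
            concept["broader"] = namespace + row["broaderNotation"]
        concepts.append(concept)
    return concepts


def broader_from_levels(rows):
    """Derive the broader notation from (notation, level) pairs in document order.

    Assumes level 0 marks a top concept (an assumption, not taken from the tool).
    """
    stack = []  # notation of the most recent concept at each level
    result = {}
    for notation, level in rows:
        del stack[level:]  # drop everything at this level or deeper
        result[notation] = stack[-1] if stack else None
        stack.append(notation)
    return result


concepts = read_concepts(SAMPLE_CSV)
hierarchy = broader_from_levels([("A", 0), ("A1", 1), ("A11", 2), ("A2", 1), ("B", 0)])
```

A table would normally use either `broaderNotation` or `level`, not both; the two functions are shown side by side only to compare the conventions.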

Provenance and Home Vocabulary URI are not supported but could be added.

A separate file with information about the concept scheme must be specified (≈ table Concept Scheme).

@dalito
Member

dalito commented Feb 21, 2025

@nichtich - Hi Jakob! Thanks for reaching out. I had a look at your approach/tools. What I like, and what we should adopt for voc4cat, is to track provenance for mapping creation/editing separate from provenance of concepts/definitions.

So far we align with the Australian vocpub profile (SHACL). But it is a bit limited, especially regarding provenance and ConceptScheme metadata. Do you think it would make sense to cooperate on an (NFDI) vocabulary profile?

The "alignment" of tools could then happen on the profile level, and it would not matter much whether CSV and jskos-convert (JavaScript) or xlsx/Turtle and voc4cat-tool (Python) are used.

@nichtich
Author

Are you suggesting to align only the resulting RDF, not the tabular input format? I thought about having some standardization on the left side of the transformation too:

flowchart LR
  xlsx[voc4cat xlsx: concepts, scheme, mappings] --> voc4cat(voc4cat) 
  voc4cat --> RDF[RDF]
  CSV[CSV with concepts] --> jskos-convert(jskos-convert)
  scheme[JSKOS Concept Scheme] --> jskos-convert
  mappings[CSV with Mappings] --> jskos-convert
  jskos-convert --> JSKOS
  JSKOS -- JSON-LD context --> RDF
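
The last edge of the flowchart (JSKOS to RDF) could look roughly like the following Python sketch. It is a hand-written stand-in, not a JSON-LD processor: the term-to-property mapping below is an assumption covering only three fields, the sample concept is hypothetical, and a real pipeline would apply the JSKOS JSON-LD context with a proper JSON-LD library.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value, lang=None):
    """Format a plain or language-tagged literal (escapes only backslash and quote)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def to_ntriples(concept):
    """Emit N-Triples-style lines for a small subset of a JSKOS-like concept."""
    s = f"<{concept['uri']}>"
    lines = [f"{s} <{RDF_TYPE}> <{SKOS}Concept> ."]
    for notation in concept.get("notation", []):
        lines.append(f"{s} <{SKOS}notation> {literal(notation)} .")
    for lang, label in concept.get("prefLabel", {}).items():
        lines.append(f"{s} <{SKOS}prefLabel> {literal(label, lang)} .")
    for broader in concept.get("broader", []):
        lines.append(f"{s} <{SKOS}broader> <{broader['uri']}> .")
    return lines


# Hypothetical concept, for illustration only.
example = {
    "uri": "http://example.org/voc/A1",
    "notation": ["A1"],
    "prefLabel": {"en": "Mammals"},
    "broader": [{"uri": "http://example.org/voc/A"}],
}
triples = to_ntriples(example)
```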

> track provenance for mapping creation/editing separate from provenance of concepts/definitions.

Besides the mapping type (e.g. skos:exactMatch), that is primarily creator (optionally with IRI and name), created (date of creation), and uri (IRI of the mapping). SSSOM CSV defines some more properties, and we are going to align SSSOM and JSKOS more closely as well.
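
As a rough illustration of the mapping properties just listed, here is a minimal Python sketch of a mapping record that keeps its own provenance. The class, the example IRIs, and the creator name are hypothetical; the nested `from`/`to`/`memberSet` layout is modeled loosely on JSKOS mappings and should be read as an assumption, not as the authoritative format.

```python
from dataclasses import dataclass
from typing import Optional

SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class Mapping:
    source: str                          # IRI of the concept mapped from
    target: str                          # IRI of the concept mapped to
    type: str = SKOS + "exactMatch"      # mapping type
    creator_name: Optional[str] = None   # optional name of the creator
    creator_uri: Optional[str] = None    # optional IRI of the creator
    created: Optional[str] = None        # date of creation (ISO 8601)
    uri: Optional[str] = None            # IRI of the mapping itself

    def to_record(self):
        """Serialize to a JSKOS-like dict, leaving out unset provenance fields."""
        record = {
            "from": {"memberSet": [{"uri": self.source}]},
            "to": {"memberSet": [{"uri": self.target}]},
            "type": [self.type],
        }
        creator = {}
        if self.creator_uri:
            creator["uri"] = self.creator_uri
        if self.creator_name:
            creator["prefLabel"] = {"en": self.creator_name}
        if creator:
            record["creator"] = [creator]
        if self.created:
            record["created"] = self.created
        if self.uri:
            record["uri"] = self.uri
        return record


# Hypothetical example values.
example = Mapping(
    source="http://example.org/voc/A1",
    target="http://example.org/other/mammal",
    creator_name="Jane Doe",
    created="2025-02-10",
)
record = example.to_record()
```

Keeping creator and created on the mapping itself, rather than on the concept, is what makes it possible to export the mappings as a file of their own.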

@dalito
Member

dalito commented Feb 21, 2025

Ah, I see.

Our current table format is structurally still compatible with the former vocexcel. We know that the template structure has various small problems and limitations. Since we don't want to change the template often (ideally only once), we have been collecting ideas for what should change in #124 (and #252) but have not implemented them yet (except that we created example xlsx files for the proposed changes). Hence, we are still flexible enough to change some aspects for better alignment.

Regarding the mappings, your needs (for cocoda) probably go beyond what we need for "our" mappings of just one vocabulary to others. However, we (voc4cat) should try to align with your approach and SSSOM where possible. One goal for us could be to generate files with only the mapping information (e.g. to load into cocoda and extend there).

For us it would be interesting to find out early what else we should consider for our v1.0 template plans. How should we proceed? Set up a Zoom call?
