Data missing in the result (and other problems) #260

jangxx · 2015-11-18T13:53:26Z

So I'm getting this data from an API, but when I parse it with xml2js, fields are added, renamed, and missing.

Input XML:

<?xml version="1.0" encoding="UTF-8"?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"><version>1.1</version><numberOfRecords>1</numberOfRecords><records><record><recordSchema>oai_dc</recordSchema><recordPacking>xml</recordPacking><recordData><dc xmlns:dnb="http://d-nb.de/standards/dnbterms" xmlns="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>Elektronik ohne Ballast : Einf. in d. Schaltungstechnik d. industriellen Elektronik; mit 3 Tab. / von Otto Limann</dc:title>
  <dc:creator>Limann, Otto</dc:creator>
  <dc:publisher>München : Franzis-Verlag</dc:publisher>
  <dc:date>1973</dc:date>
  <dc:identifier xmlns:tel="http://krait.kb.nl/coop/tel/handbook/telterms.html" xsi:type="tel:ISBN">3-7723-5613-3 kart. : DM 30.00</dc:identifier>
  <dc:identifier xsi:type="dnb:IDN">740202677</dc:identifier>
  <dc:subject>20a Technik, Industrie, Gewerbe</dc:subject>
  <dc:format>396 S.</dc:format>
</dc></recordData><recordPosition>1</recordPosition></record></records><nextRecordPosition>2</nextRecordPosition><echoedSearchRetrieveRequest><version>1.1</version><query>3772356133</query><xQuery xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/></echoedSearchRetrieveRequest><extraResponseData><accountOf xmlns="">S*** SRU</accountOf></extraResponseData></searchRetrieveResponse>

The output object is unexpectedly large:

{ searchRetrieveResponse: 
   { '$': { xmlns: 'http://www.loc.gov/zing/srw/' },
     version: [ '1.1' ],
     numberOfRecords: [ '1' ],
     records: 
      [ { record: 
           [ { recordSchema: [ 'RDFxml' ],
               recordPacking: [ 'xml' ],
               recordData: 
                [ { 'rdf:RDF': 
                     [ { '$': 
                          { 'xmlns:gndo': 'http://d-nb.info/standards/elementset/gnd#',
                            'xmlns:marcRole': 'http://id.loc.gov/vocabulary/relators/',
                            'xmlns:lib': 'http://purl.org/library/',
                            'xmlns:owl': 'http://www.w3.org/2002/07/owl#',
                            'xmlns:skos': 'http://www.w3.org/2004/02/skos/core#',
                            'xmlns:rdfs': 'http://www.w3.org/2000/01/rdf-schema#',
                            'xmlns:geo': 'http://www.opengis.net/ont/geosparql#',
                            'xmlns:umbel': 'http://umbel.org/umbel#',
                            'xmlns:rdau': 'http://rdaregistry.info/Elements/u/',
                            'xmlns:sf': 'http://www.opengis.net/ont/sf#',
                            'xmlns:rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
                            'xmlns:dcterms': 'http://purl.org/dc/terms/',
                            'xmlns:bibo': 'http://purl.org/ontology/bibo/',
                            'xmlns:isbd': 'http://iflastandards.info/ns/isbd/elements/',
                            'xmlns:foaf': 'http://xmlns.com/foaf/0.1/',
                            'xmlns:dc': 'http://purl.org/dc/elements/1.1/' },
                         'rdf:Description': 
                          [ { '$': { 'rdf:about': 'http://d-nb.info/740202677' },
                              'rdf:type': [ { '$': { 'rdf:resource': 'http://purl.org/ontology/bibo/Document' } } ],
                              'dcterms:medium': [ { '$': { 'rdf:resource': 'http://rdvocab.info/termList/RDACarrierType/1044' } } ],
                              'owl:sameAs': [ { '$': { 'rdf:resource': 'http://hub.culturegraph.org/resource/DNB-740202677' } } ],
                              'bibo:isbn10': 
                               [ { _: '3772356133',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'rdau:P60521': 
                               [ { _: 'kart. : DM 30.00',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'dc:identifier': 
                               [ { _: '(OColc)74126361',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'dc:title': 
                               [ { _: 'Elektronik ohne Ballast',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'dcterms:creator': [ { '$': { 'rdf:resource': 'http://d-nb.info/gnd/105782297' } } ],
                              'rdau:P60163': 
                               [ { _: 'München',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'dc:publisher': 
                               [ { _: 'Franzis-Verlag',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'rdau:P60333': 
                               [ { _: 'München : Franzis-Verlag, 1973',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'isbd:P1053': 
                               [ { _: '396 S.',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'dcterms:issued': 
                               [ { _: '1973',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'rdau:P60493': 
                               [ { _: 'Einf. in d. Schaltungstechnik d. industriellen Elektronik; mit 3 Tab.',
                                   '$': { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } } ],
                              'bibo:authorList': 
                               [ { 'rdf:Description': 
                                    [ { '$': { 'rdf:nodeID': 'node1a3jmb6mbx1324003' },
                                        'rdf:type': [ { '$': { 'rdf:resource': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq' } } ],
                                        'rdf:_1': [ { '$': { 'rdf:resource': 'http://d-nb.info/gnd/105782297' } } ] } ] } ] } ] } ] } ],
               recordPosition: [ '1' ] } ] } ],
     nextRecordPosition: [ '2' ],
     echoedSearchRetrieveRequest: 
      [ { version: [ '1.1' ],
          query: [ '3772356133' ],
          xQuery: 
           [ { '$': 
                { 'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
                  'xsi:nil': 'true' } } ],
          recordSchema: [ 'RDFxml' ] } ],
     extraResponseData: [ { accountOf: [ { _: 'S*** SRU', '$': { xmlns: '' } } ] } ] } }

Not only are there new elements that aren't present in the original XML, like rdau:P60493, but some, like dc:creator, are missing and replaced with dcterms:creator, which is a link to a resource. Is this correct behavior? Is there an option to parse the XML 'as-is', ignoring the namespaces (or whatever is causing this)?
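
On the narrower question of ignoring namespaces, here is a minimal sketch that drops prefixes such as "dc:" from the keys of an already-parsed result. The helper name stripPrefixes and the sample object are assumptions made for illustration; they are not part of xml2js. The xml2js README also describes a tagNameProcessors option with a stripPrefix processor that is meant to do this at parse time, which is probably the better route if your version has it. Stripping prefixes only renames keys, so it would not by itself explain elements that appear or disappear.

```javascript
// Hypothetical helper (not part of xml2js): recursively drop namespace
// prefixes such as "dc:" from the keys of an already-parsed result.
function stripPrefixes(node) {
  if (Array.isArray(node)) {
    return node.map(stripPrefixes);
  }
  if (node === null || typeof node !== 'object') {
    return node; // strings and other leaf values pass through unchanged
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === '$') {
      out[key] = value; // '$' holds attributes in xml2js output; left as-is
      continue;
    }
    // Caveat: keys that differ only by prefix (dc:creator vs dcterms:creator)
    // collapse into one key, and the later one overwrites the earlier one.
    out[key.replace(/^[^:]+:/, '')] = stripPrefixes(value);
  }
  return out;
}

// Sample shaped like a fragment of the output posted above.
const sample = {
  'dc:title': [
    { _: 'Elektronik ohne Ballast',
      $: { 'rdf:datatype': 'http://www.w3.org/2001/XMLSchema#string' } },
  ],
  'dcterms:creator': [
    { $: { 'rdf:resource': 'http://d-nb.info/gnd/105782297' } },
  ],
};

const stripped = stripPrefixes(sample);
console.log(Object.keys(stripped));
```

Attribute names under '$' are deliberately left untouched here, so 'rdf:datatype' keeps its prefix.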


kaltri-n · 2018-03-22T16:04:31Z

Hi, have you already solved this issue? I use the same API and have the same problems.

jangxx · 2018-03-22T16:13:17Z

No, I switched to marc4js, which was able to parse the data correctly.

Edit: Important to note: I was parsing bibliographic data, which was available in many different formats. The one I tried and failed to parse above was Dublin Core. Not only did I switch to marc4js, I also changed my requests to ask for the data in MARCXML format.

Leonidas-from-XIV · 2018-03-22T20:02:46Z

I am very confused, since there isn't really a way that xml2js would invent new elements.
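
A detail in the data posted above may be relevant here: the input XML declares recordSchema oai_dc, while the parsed output reports recordSchema 'RDFxml', so the two may not come from the same response. The sketch below shows one way to check this on a parsed result before comparing individual fields. The helper name recordSchemas is hypothetical, and the object is a trimmed copy of the posted output, assuming xml2js defaults where child elements become arrays.

```javascript
// Hypothetical helper: collect every recordSchema value from an
// xml2js-style result (default options, so child elements are arrays).
function recordSchemas(parsed) {
  const response = parsed.searchRetrieveResponse || {};
  const schemas = [];
  for (const records of response.records || []) {
    for (const record of records.record || []) {
      schemas.push(...(record.recordSchema || []));
    }
  }
  return schemas;
}

// Trimmed copy of the structure posted in the issue.
const posted = {
  searchRetrieveResponse: {
    records: [
      { record: [{ recordSchema: ['RDFxml'], recordPacking: ['xml'] }] },
    ],
  },
};

const expected = 'oai_dc'; // schema named in the posted input XML
const found = recordSchemas(posted);
const mismatched = found.filter((schema) => schema !== expected);

if (mismatched.length > 0) {
  console.log(`expected ${expected}, got ${mismatched.join(', ')}`);
}
```

If the schemas differ, the parser is most likely being handed a different document than the one being compared against, which would account for both the extra and the missing elements.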

jangxx · 2018-03-28T11:25:14Z

Well, I share your confusion, which is why I raised this issue in the first place.

Leonidas-from-XIV · 2018-03-28T11:26:38Z

Can you post some minimal code to reproduce?

jangxx · 2018-03-31T14:56:31Z

I'm gonna pass that question on to @kaltri-n since they seem to be currently using (or at least trying to use) this library. The code with which I initially stumbled across this issue is long gone.

kaltri-n · 2018-04-03T12:06:35Z

Hi guys,
I failed to parse the data in Dublin Core, and so far I haven't looked at other formats. So, unfortunately, I can't be of much help here :(

jangxx changed the title ~~Result is missing data~~ Data missing in the result (and other problems) Nov 18, 2015
