Namespacemap reconstitution #490

sbarnum · 2023-08-31T18:33:44Z

This is a PR for the proposed namespace map solution addressing all explicitly identified issues (layering of namespace maps, conflicts between peer namespace maps, etc.) with the previous solution.
It also provides more contextual detail around the intended semantics and use for the namespace map construct in the NamespaceMap.md class specification.

davaya · 2023-09-04T10:12:42Z

X-Collection.md says only:

An X collection Element is a collection of Elements that does not contain any other X Collection Elements.
In this way it can be thought of as an outer shell collection of SPDX content without self-recursion that can be used as a content aggregation target for serialization.

There are no use cases or requirements for why, where, or when it is used or what it accomplishes, given the requirement in NamespaceMap.md that:

the namespace map set of prefixes and namespaces MUST be implemented in a given serialization form.

Without a justification, there is no reason to define an X-Collection element. Or it could be defined in the logical model with the requirement that it never be serialized. The amorphous "X" concept represents what we have been discussing, but not what we have yet to discuss: externalMap, which is a critical consideration.
The three options for the X-collection (or X-file) element semantics are:

applies to a specific collection/file/payload of elements in a specific data format (has a verifiedBy property)
applies to a specific logical collection of elements regardless of data format (does not have a verifiedBy property)
can be applied to future collections of elements (X is a subclass of Element, not ElementCollection, and has no element property)

One of those three must be chosen before an X element can be defined.

Currently externalMap is a property of ElementCollection, which means that an Sbom element applies only to a specific payload in a specific format. The Consumer/Producer who reads a payload includes the hash of that payload in the list of elements in the newly-produced BOM. If the Consumer/Producer had read a different payload containing the identical elements he would need to create a different Sbom element.

So if ElementCollection is payload-specific, then its subclass X-Collection is also payload-specific (option 1). At a minimum we want to make ElementCollection serialization-agnostic by removing externalMap from it. But if we include externalMap in X-collection then X-collection is still serialization-specific (option 1).

An instance of the X-collection element cannot be created until all of its externalMap property values are known.

Those values are not known until the Consumer/Producer reads a specific payload, which is why X-collection instances cannot exist in the logical model until the serialized source of an element has been chosen. As Max says, those instances can be created and inserted into the graph by a Consumer/Producer after reading a payload. And as I say, those instances could also be created and inserted into the graph by the producer, but not until the producer has serialized the payload and knows its signature/hash.

This boils down to my original objection from 9 months ago: the logical model should be serialization-agnostic. A logical SBOM can contain two File elements, for a total of three. Serialization independence means its serialized payload would contain 3 elements (SBOM, File1, File2), not be forced to serialize 4 elements (X-collection plus the other three.) Consumers of that payload have the option of creating the X-collection element, but don't have to unless they produce other payloads that depend on (reference) the first payload. In particular, the Consumer could copy those three element values into his new payload instead of referencing them, in which case no X-collection element is ever needed.

goneall

Thanks @sbarnum for writing this up - A few comments to consider for the tech call.

goneall · 2023-09-04T23:21:43Z

model/Core/Classes/NamespaceMap.md

+A serialization MAY choose to use prefixes and namespaces other than the namespace map content.
+A serialization MAY choose to use no prefixes at all and rather use the more verbose full ElementID IRIs.
+
+If utilized the namespace map set of prefixes and namespaces MUST be implemented in a given serialization 


It took me a few passes to parse this statement. At first, it seemed to contradict the "MAY" choose to use, but now I understand that this is describing that we must not replace the native serialization.

Suggest restructuring the sentence to start with the serialization form - something like "The prefix / namespace mapping of the serialization format must always be used to convey the namespace mapping for that serialization ..."

I am not sure if your characterization is aligned to intent here.
Lines 22-26 all go together.
They are saying that if you are going to actually use the namespace map content as prefixes in a given serialization, then how those prefixes are represented varies from one serialization form (e.g., json-ld, xml, etc.) to another, and the representation for any given form is specified in the binding rules (which bind the model to that serialization form). In other words, if you are going to implement them in json-ld, you must use the form for prefixes specified in the json-ld binding rules specification. For serialization forms that natively support prefixes, this is a little more obvious. For serialization forms that do not, where we have to define custom representations, this clause is much more important.

I believe the whole first sentence (lines 22-24) clearly states this point as is.
Does the above explanation clarify or do you still find the sentence unclear?
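As an illustration of the distinction discussed above (using prefixes from a namespace map versus writing full ElementID IRIs), here is a minimal Python sketch. The map contents, the `prefix:local` notation, and the function names `expand` and `compact` are assumptions made for this example; they are not taken from the SPDX binding rules, which define the actual representation for each serialization form.

```python
# Hypothetical namespace map: prefix -> namespace IRI (illustrative values only).
NAMESPACE_MAP = {
    "ex": "https://example.org/spdx/",
    "acme": "https://acme.example/elements/",
}


def expand(element_id, namespace_map):
    """Expand 'prefix:local' into a full IRI; return anything else unchanged."""
    prefix, sep, local = element_id.partition(":")
    # The '//' guard keeps full IRIs such as 'https://...' from being
    # mistaken for a prefixed name if a prefix happens to match a scheme.
    if sep and prefix in namespace_map and not local.startswith("//"):
        return namespace_map[prefix] + local
    return element_id


def compact(iri, namespace_map):
    """Shorten a full IRI using the longest matching namespace, if any."""
    matches = [(ns, p) for p, ns in namespace_map.items() if iri.startswith(ns)]
    if not matches:
        return iri  # no prefix applies: keep the verbose full IRI
    namespace, prefix = max(matches, key=lambda m: len(m[0]))
    return prefix + ":" + iri[len(namespace):]


full = expand("ex:File1", NAMESPACE_MAP)
short = compact(full, NAMESPACE_MAP)
```

A serialization that uses no prefixes would simply emit the result of `expand` everywhere; one that uses the namespace map would emit the result of `compact` in whatever syntax its binding rules prescribe.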

goneall · 2023-09-04T23:23:03Z

model/Core/Classes/NamespaceMap.md

+The namespace map itself is also conveyed as native SPDX content to support clarity, transparency and 
+consistency independent of any particular serialization form.
+
+A given serialization payload (whether file or streaming) MUST NOT contain multiple namespace maps with conflicting mappings.


I take it the above sentence refers to the "native" namespace mapping. If so, this may not be a required statement since the serialization format standards already have this requirement.

Serialization formats that natively support prefixes usually have this requirement though some do not enforce it and rather simply utilize the last defined prefix and ignore any earlier conflicting prefixes.
Serialization formats that do not natively support prefixes and where we will have to define custom prefix representations will have no such rules.

In either case, I would propose that making this statement explicit is highly useful in the SPDX spec.
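To make the two behaviours in this exchange concrete, the following Python sketch contrasts an explicit conflict check (what the proposed MUST NOT clause would call for) with the "last defined prefix wins" behaviour that some formats apply silently. The function names and the representation of a namespace map as a plain dictionary are assumptions for illustration only.

```python
def find_conflicts(namespace_maps):
    """Return {prefix: sorted namespaces} for prefixes bound to more than one namespace."""
    seen = {}
    for ns_map in namespace_maps:
        for prefix, namespace in ns_map.items():
            seen.setdefault(prefix, set()).add(namespace)
    return {p: sorted(n) for p, n in seen.items() if len(n) > 1}


def merge_last_wins(namespace_maps):
    """Mimic formats that keep only the last definition of each prefix."""
    merged = {}
    for ns_map in namespace_maps:
        merged.update(ns_map)  # later maps silently override earlier ones
    return merged


# Two hypothetical namespace maps found in the same payload.
map_a = {"ex": "https://a.example/"}
map_b = {"ex": "https://b.example/", "y": "https://y.example/"}

conflicts = find_conflicts([map_a, map_b])
merged = merge_last_wins([map_a, map_b])
```

With the explicit check, a payload containing both maps would be rejected because `ex` is bound to two namespaces; with last-wins merging, the earlier binding is lost without any signal to the consumer.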

goneall · 2023-09-04T23:29:43Z

model/Core/Classes/NamespaceMap.md

+   This would involve the conflicted mappings issue briefly characterized above this list of use cases.
+6) An SPDX content consumer wishing to maintain consistent prefix use while receiving serialized content that does not include a namespace map but does utilize prefixes, and at some future point reserializing that content.
+   The consumer can simply "wrap" the received content in a collection with a namespace map and specify the prefix to namespace mappings that were actually implemented in the received content.
+7) It should be possible to derive and maintain namespace mapping provenance for content.


Why would this be important?

As content is received, reserialized, and potentially "wrapped" by consumer/producers with new namespace maps, it becomes harder to tell who asserted which namespace maps (prefixes) and in what context.
Understanding who asserted which namespace maps (prefixes) and in what context can help a consumer decide whether to trust that asserted prefixes come from the original producer, and which ones they should use.
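The wrapping and provenance idea in use cases 6 and 7 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `WrappedContent` class, its fields, and the helper names are hypothetical and do not correspond to classes in the SPDX model.

```python
from dataclasses import dataclass


@dataclass
class WrappedContent:
    """Hypothetical wrapper: content plus the namespace map asserted for it."""
    created_by: str      # who asserted this namespace map
    namespace_map: dict  # prefix -> namespace IRI
    content: object      # a list of element IDs, or another WrappedContent


def wrap(content, namespace_map, created_by):
    """Wrap received content in a new collection carrying a namespace map."""
    return WrappedContent(created_by, dict(namespace_map), content)


def prefix_provenance(wrapped, prefix):
    """List (asserter, namespace) pairs for a prefix, outermost wrapper first."""
    trail = []
    node = wrapped
    while isinstance(node, WrappedContent):
        if prefix in node.namespace_map:
            trail.append((node.created_by, node.namespace_map[prefix]))
        node = node.content
    return trail


# The original producer asserts 'ex'; a later consumer re-wraps and adds 't'.
inner = wrap(["ex:File1"], {"ex": "https://example.org/"}, "producer")
outer = wrap(inner, {"ex": "https://example.org/", "t": "https://tools.example/"}, "consumer")

trail = prefix_provenance(outer, "ex")
```

Walking the wrappers from the outside in shows which party asserted each prefix, which is the information a consumer would need to judge whether a prefix originates from the original producer.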

goneall

Couple more comments

goneall · 2023-09-05T00:08:15Z

model/Core/Classes/ElementCollection.md

@@ -19,10 +19,10 @@ An SpdxCollection is a collection of Elements, not necessarily with unifying con
 ## Properties

 - element
-  - type: Element
+  - type: Element and NOT (X-Collection)


Note: the spec parser does not currently support this expression

That does not surprise me.
It will need to though.
This sort of issue is why I strongly prefer having the formal ontology specification (RDFS/OWL/SHACL) be THE ground truth specification rather than a prose form that must be converted.
Specifying this sort of range is simple, native and inherent in the RDFS/OWL/SHACL.
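For readers less familiar with range expressions, the intent of `Element and NOT (X-Collection)` can be approximated in Python as a type check. This is a rough sketch only: the class names mirror the ones in the diff, but the constructors, attribute names, and the `range_violations` helper are assumptions for illustration, not the spec parser's behaviour or the SHACL form.

```python
class Element:
    def __init__(self, spdx_id):
        self.spdx_id = spdx_id


class ElementCollection(Element):
    def __init__(self, spdx_id, element=(), root_element=()):
        super().__init__(spdx_id)
        self.element = list(element)
        self.root_element = list(root_element)


class XCollection(ElementCollection):
    """Stand-in for the X-Collection class discussed in this PR."""


def in_range(value):
    """True if value satisfies 'Element and NOT (X-Collection)'."""
    return isinstance(value, Element) and not isinstance(value, XCollection)


def range_violations(collection):
    """Return IDs of element/rootElement values that fall outside the range."""
    values = [*collection.element, *collection.root_element]
    return [getattr(v, "spdx_id", repr(v)) for v in values if not in_range(v)]


file1 = Element("ex:File1")
nested = XCollection("ex:Inner")
outer = XCollection("ex:Outer", element=[file1, nested], root_element=[file1])

violations = range_violations(outer)
```

Here the nested X-Collection is reported as a violation, which matches the stated rule that an X-Collection does not contain other X-Collection elements.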

goneall · 2023-09-05T00:08:31Z

model/Core/Classes/ElementCollection.md

  - minCount: 1
 - rootElement
-  - type: Element
+  - type: Element and NOT (X-Collection)


Same as above

Same as above for me too. :-)

sbarnum · 2023-09-06T15:58:05Z

X-Collection.md says only:

An X collection Element is a collection of Elements that does not contain any other X Collection Elements.
In this way it can be thought of as an outer shell collection of SPDX content without self-recursion that can be used as a content aggregation target for serialization.

There are no use cases or requirements for why, where, or when it is used or what it accomplishes, given the requirement in NamespaceMap.md that:

We do not typically put "use cases or requirements for why, where, or when it is used or what it accomplishes" in the class specifications.
The class specification for NamespaceMap (NamespaceMap.md) atypically provides this greater level of detail for the namespace map rather than attempting to put it here, at least one level abstracted from its direct application.
That being said, I can see that X-Collection.md could use a brief statement that it is intended to convey a namespace map for a given set of SPDX content and that it MAY (though not MUST) be used as the single outermost enclosing SPDX element in a specific instance of serialization, for simplicity and consistency.

the namespace map set of prefixes and namespaces MUST be implemented in a given serialization form.

The full statement from NamespaceMap.md that this is snippeted from is:
"If utilized the namespace map set of prefixes and namespaces MUST be implemented in a given serialization form (e.g., json-ld or xml) as specified in the binding rules specification for that serialization and utilizing the appropriate inherent or custom specified mechanism for that serialization. The namespace map itself is also conveyed as native SPDX content to support clarity, transparency and consistency independent of any particular serialization form."

This is saying that if you are going to actually use the namespace map content as prefixes in a given serialization, then how those prefixes are represented varies from one serialization form (e.g., json-ld, xml, etc.) to another, and the representation for any given form is specified in the binding rules (which bind the model to that serialization form). In other words, if you are going to implement them in json-ld, you must use the form for prefixes specified in the json-ld binding rules specification. For serialization forms that natively support prefixes, this is a little more obvious. For serialization forms that do not, where we have to define custom representations, this clause is much more important.

Without a justification, there is no reason to define an X-Collection element.

As stated above, I can see value in adding a brief statement to X-Collection.md to clarify its intent, though I do not believe class specifications must or should have to "justify" their existence.

Or it could be defined in the logical model with the requirement that it never be serialized.

No. The entire purpose of namespace maps is that they are serialized as part of the content.

The amorphous "X" concept represents what we have been discussing, but not what we have yet to discuss: externalMap, which is a critical consideration. The three options for the X-collection (or X-file) element semantics are:

I would assert that externalMap is a completely separate and mostly unrelated topic from what we are discussing here, and I am unaware of any currently identified issues with its current implementation.

I would caution against using any names like X-File in relation to X-Collection. X-Collection is NOT intended to deal with a specific instance of serialization.

applies to a specific collection/file/payload of elements in a specific data format (has a verifiedBy property)

applies to a specific logical collection of elements regardless of data format (does not have a verifiedBy property)

can be applied to future collections of elements (X is a subclass of Element, not ElementCollection, and has no element property)

One of those three must be chosen before an X element can be defined.

The very explicit intent of the namespace map is option 2.
It is very explicitly not tied to any specific instance of serialization.

Currently externalMap is a property of ElementCollection, which means that an Sbom element applies only to a specific payload in a specific format. The Consumer/Producer who reads a payload includes the hash of that payload in the list of elements in the newly-produced BOM. If the Consumer/Producer had read a different payload containing the identical elements he would need to create a different Sbom element.

I am confused here. How did externalMap get into this conversation? It is unrelated.

Currently externalMap is a property of ElementCollection, which means that an Sbom element applies only to a specific payload in a specific format.

ExternalMap on ElementCollection makes no such implication on Sbom elements. Sbom elements, or any part of the SPDX model other than the File object, have nothing to do with any specific instance (payload) of serialization.

So if ElementCollection is payload-specific, then its subclass X-Collection is also payload-specific (option 1). At a minimum we want to make ElementCollection serialization-agnostic by removing externalMap from it. But if we include externalMap in X-collection then X-collection is still serialization-specific (option 1).

ElementCollection is definitely not payload specific and neither is X-Collection.

An instance of the X-collection element cannot be created until all of its externalMap property values are known.

Again, I am confused at how externalMap got introduced here or how its meaning and intent got so confused.

Those values are not known until the Consumer/Producer reads a specific payload, which is why X-collection instances cannot exist in the logical model until the serialized source of an element has been chosen. As Max says, those instances can be created and inserted into the graph by a Consumer/Producer after reading a payload. And as I say, those instances could also be created and inserted into the graph by the producer, but not until the producer has serialized the payload and knows its signature/hash.

This boils down to my original objection from 9 months ago: the logical model should be serialization-agnostic. A logical SBOM can contain two File elements, for a total of three. Serialization independence means its serialized payload would contain 3 elements (SBOM, File1, File2), not be forced to serialize 4 elements (X-collection plus the other three.) Consumers of that payload have the option of creating the X-collection element, but don't have to unless they produce other payloads that depend on (reference) the first payload. In particular, the Consumer could copy those three element values into his new payload instead of referencing them, in which case no X-collection element is ever needed.

I agree that the logical model should be serialization-agnostic. It should also be agnostic of any serialization instance.

goneall · 2023-09-20T02:51:07Z

In the namespace meeting on 18 Sept 2023, we decided to move forward with the proposal documented in pull request #491

PR #491 is now merged into this pull request so we can review a single PR before merging into the base.

Note that the branch used for PR #491 was not deleted, so we can refer back to the changes if needed. I also did a merge commit so it would be easy to reconstruct the state of this PR prior to the merge.

Please review and comment on the wording for this proposal.

davaya · 2023-10-17T22:43:55Z

PR #500 includes NamespaceMap in payload data, allowing it to be used independently of SerializedCollection. This does not prevent SerializedCollection from defining NamespaceMap independently of serialization if there are use cases for doing so.

goneall · 2023-10-27T12:22:43Z

@sbarnum - Do you want to update this PR with the decisions from the serialization team?

Move imports from ElementCollection to X-Collection

goneall

One minor typo

model/Core/Classes/X-Collection.md

goneall · 2023-11-02T16:06:10Z

@sbarnum - can you rename the X-Collection to SpdxDocument?

goneall

LGTM - Thanks @sbarnum

goneall · 2023-11-03T17:36:31Z

Note that this is a previous discussion in the PR merged into this: #491

goneall · 2023-11-03T17:43:13Z

Fixes #467

goneall · 2023-11-03T17:45:27Z

Fixes #415

maxhbr · 2023-11-21T17:32:56Z

related: #557

goneall · 2023-11-28T18:59:14Z

@nishakm @maxhbr @zvr - I resolved the merge conflicts - ready for review.

goneall force-pushed the main branch from 9441894 to 2f978cd Compare August 31, 2023 18:49

goneall reviewed Sep 4, 2023

goneall reviewed Sep 5, 2023

goneall mentioned this pull request Sep 5, 2023

Attempt to implement alternative proposal for namespaceMap #491

Merged

davaya mentioned this pull request Sep 20, 2023

Serialization: NamespaceMap, Solutions A and B, and Playground use cases #499

Closed

goneall added the serialization label (Something about the representation of data in bytes) Oct 12, 2023

goneall requested changes Oct 31, 2023

model/Core/Classes/X-Collection.md (outdated, resolved)

goneall approved these changes Nov 3, 2023

goneall added this to the 3.0-rc2 milestone Nov 3, 2023

goneall mentioned this pull request Nov 3, 2023

Clarify namespacemaps used in SPDX documents and collections #403

Closed

goneall linked an issue Nov 3, 2023 that may be closed by this pull request

Proposal: Move data license from CreationInfo to SpdxDocument #467

Closed

This was referenced Nov 3, 2023

Proposal: Move data license from CreationInfo to SpdxDocument #467

Closed

Clarify Bom & SpdxDocument in 3.0 Model #415

Closed

goneall linked an issue Nov 3, 2023 that may be closed by this pull request

Clarify Bom & SpdxDocument in 3.0 Model #415

Closed

nishakm approved these changes Nov 28, 2023

sbarnum added 4 commits November 30, 2023 11:43

Create NamespaceMap.md

8445ff6

Create X-collection.md

bea66eb

Create namespacemap.md

3e331b7

Update NamespaceMap.md

007f4bb

sbarnum and others added 16 commits November 30, 2023 11:43

Update and rename X-collection.md to X-Collection.md

9cc4fe8

Reset the ranges of element and rootElement

cf43d5b

Reset the ranges of element and rootElement to scope out X-Collection

Added very slight detail to Summary and Description

346d22b

Create namespace.md

46e7a8c

corrected "namespacemap" to "namespaceMap"

b8f9e73

Updated model diagram to address namespace reconstitution proposal

63eec3e

Modify the namespacemap_reconstitution to use unserialized X classes

3a1968a

Signed-off-by: Gary O'Neall <[email protected]>

Update proposal per comments

f1a5b16

Signed-off-by: Gary O'Neall <[email protected]>

Remove reference to in memory in NamespaceMap

b6c2be0

Per review comments

Update model/Core/Classes/X-Collection.md

f920823

Update model/Core/Classes/NamespaceMap.md

7d23ec5

The change from "serialization formats" to "serialization" should cover both multiple instances of serialization in a single format or multiple instances in different formats. Co-authored-by: Gary O'Neall <[email protected]>

Update to align with recent consensus of serialization WG.

ee86382

Updated to align to recent consensus in serialization WG

47ad280

Update model/Core/Classes/X-Collection.md

1599636

Changed name X-Collection to SpdxDocument aligned with Tech committee…

65b0704

… consensus

Fix CI failure due to missing space

3c5eaca

Signed-off-by: Gary O'Neall <[email protected]>

goneall force-pushed the namespacemap_reconstitution branch from 7d7de60 to 3c5eaca Compare November 30, 2023 19:58

goneall merged commit 3357c71 into main Nov 30, 2023
1 check passed

goneall deleted the namespacemap_reconstitution branch November 30, 2023 20:09
