Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve extension fields when marshalling/unmarshalling #511

Merged
merged 1 commit into from
Feb 14, 2025

Conversation

cpuguy83
Copy link
Member

This makes it so external callers don't need to handle this themselves.

@cpuguy83 cpuguy83 requested a review from DannyBrito January 31, 2025 18:18
@cpuguy83 cpuguy83 requested a review from a team as a code owner January 31, 2025 18:18
spec.go Outdated Show resolved Hide resolved
@cpuguy83 cpuguy83 marked this pull request as draft February 3, 2025 20:34
@cpuguy83
Copy link
Member Author

cpuguy83 commented Feb 3, 2025

Converting this to draft since I realized it really doesn't help much except for when the extension field value is a core type.

@cpuguy83 cpuguy83 force-pushed the ext_field branch 2 times, most recently from 400a156 to 2576b43 Compare February 3, 2025 22:47
@cpuguy83 cpuguy83 marked this pull request as ready for review February 3, 2025 22:48
Copy link
Contributor

@pmengelbert pmengelbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the above comment. At the very least, we need to give it some serious thought before adding private fields with abstract data types into the main Spec struct.

load.go Outdated
return err
}

parsed, err := parser.ParseBytes(dt, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
parsed, err := parser.ParseBytes(dt, 0)
const noFlags = 0
parsed, err := parser.ParseBytes(dt, noFlags)

load.go Outdated
return err
}

parsedExt, err := parser.ParseBytes(dt, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

spec.go Outdated
Comment on lines 101 to 108

// extRaw is the raw AST of the extension fields in the spec.
// This is used to extract the ext fields in [Spec.Ext]
extRaw *ast.File
// ext is all the ext fields in the spec.
// This gets used when marshalling the spec back to yaml.
// It is used to avoid having to re-parse the raw AST.
ext map[string]interface{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like adding private fields to the spec struct, especially abstract stuff like *ast.File. We worked hard early on to ensure that the spec remained simple data, rather than data plus functionality.

For example, we decided not to implement the Source type as a group of interfaces because we wanted to maintain the spec as simple data. This also means that anyone importing the spec object will be importing goccy/go-yaml/ast.

I recognize that it will be a pain to wrap the spec and change function signatures, but a lot of that could be done with a find/replace operation and having the first member of the wrapper type be the spec:

type specExtensions struct {
        extRaw *ast.File
        ext map[string]interface{}
}

type SpecWithExtensions struct {
        *Spec
        ext specExtensions
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec is just a data type.
Currently it is unable to represent all the data that can go into a spec since we can't dynamically generate fields for extension fields.

These changes make it so Spec can represent that data and make it accessible to those who want it.
Its not changing the actual dalec spec in any, is not exposed outside of this package.

What concerns do you have for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the spec should remain simple data with no dependencies. I think if we want to provide the other information, we should expose some other mechanism to the user.

I don't like taking on a dependency on an external library for the spec data type. For something as important and user-facing as the spec data structure, I want the Dalec maintainers to maintain complete control over it. I want a clear boundary between our structural definition and external code.

I don't like mixing "simple data" (kv pairs with nesting) with more abtract representations of concepts like *ast.File. I don't like it because it makes things more complex. By "more complex", I mean harder to follow and harder to reason about.

This might not be a big deal in isolation. But I think these not-so-big deals add up in the long run, and compromise the maintainability of the project.

What I propose is a second public function, func ParseSpecWithExtensions(b []byte) (SpecWithExtensions, error) that returns a struct that wraps the spec and adds the additional data types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There mere act of even ignoring these fields during unmarshal requires access to the AST or other trickery.
Having the *ast.File in the spec type is purely an optimization (note, it is not part of the spec, just the data type we are using to represent the data the spec already allows for). It could be a []byte, but then we'd need to parse it again anytime someone wants to extract one of these extension fields, and we'd be parsing it to an *ast.File.

We are also already dependent on an external library to even encode and decode the plain spec (even without extensions).
It's there, we don't expose it anywhere (nor does this change expose it), but its there.

This change allows raw calls to yaml.Unmarshal to just work, and should work even if the caller is using a different yaml library.

Signed-off-by: Brian Goff <[email protected]>

WIP: smarter yamle parser use
Copy link
Contributor

@pmengelbert pmengelbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cpuguy83 cpuguy83 merged commit ca7bfe0 into Azure:main Feb 14, 2025
18 checks passed
@cpuguy83 cpuguy83 deleted the ext_field branch February 14, 2025 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants