Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested path for JsonSchema.Net integration #1191

Conversation

gregsdennis
Copy link

@gregsdennis gregsdennis commented Mar 24, 2023

This branch is my recommended approach for using JsonSchema.Net. It will require a few paradigm shifts for the internals of OpenApi.Net, but these changes are necessary to take full advantage of what JsonSchema.Net has to offer.

Opening this as a draft as it's not nearly ready and so that we can discuss the changes.

I think with the approach I have here, we can completely remove the Readers library.

Comment on lines +290 to +292
.Ref("#/components/schemas/Person")
)
.Ref("Person")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the difference between these refs is supposed to be.

The original code is

                                    Reference = new OpenApiReference	
                                    {	
                                        Type = ReferenceType.Schema,	
                                        Id = "Person"	
                                    }	
                                }	
                            },	
                            Reference = new OpenApiReference { Id = "Person" }

What does this reference do? My .Ref() extension creates a $ref keyword, but I don't think that's really what's supposed to be happening here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Components have a self-reference. It's an ugly implementation detail due to the way that we load $refs in as a dummy empty object (because the target of the reference hasn't been parsed yet) and then post-reading we run a reference resolver to replace the dummy objects with the target object. This means the inline object and the component object are actually the same object instance. Therefore, objects in the components collection also need to have a Reference property set or the writers cannot write out the $ref.

This model is going to change OpenAPI.Net v2 to a proxy pattern. We need to do this because in OpenAPI v3.1 you can now have summary and description properties that sit along side the $ref. With the current model we have no place to store that additional information.

Anyway, I think the proxy pattern for handling references is going to be a much better model anyway as it will not require doing the post-reading reference resolution step. It also makes writing much easier. You can see my attempt to implement the proxy pattern here https://github.com/microsoft/OpenAPI.NET/blob/dm/ProxyReference/src/Microsoft.OpenApi/Models/OpenApiSchemaReference.cs However, I did it on OpenAPISchema which is no longer appropriate since 3.1 made the distinction between OpenAPI reference objects and JSON Schema $ref.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I can just remove all of those, probably. I'll need to go through and re-update all of those tests. (I'll probably end up closing this and cherry-picking it to another branch for a different PR.)

Comment on lines +127 to +132
null, // These nulls are invalid for allOf; I'm not sure what they're trying to achieve.
null,
new OpenApiSchema
{
Type = "string"
},
new JsonSchemaBuilder().Type(SchemaValueType.String),
null,
null
}
}
)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defining the BrokenComponents object for this test, but submitting null here is more than broken; it's completely invalid. I'm not sure JsonSchema.Net will even support doing this, and even if it does, you'll likely get a NullReferenceException.

@@ -50,7 +51,12 @@ public OpenApiDocument Read(TextReader input, out OpenApiDiagnostic diagnostic)
return new OpenApiDocument();
}

return new OpenApiYamlDocumentReader(this._settings).Read(yamlDocument, out diagnostic);
var asJsonNode = yamlDocument.ToJsonNode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we are going to make the switch to do Yaml => JsonNodes => OpenAPIDocument. This will facilitate us to use the JsonSchema.Net Examples and hopefully get rid of OpenApiAny and all of the OpenApiPrimitive classes.

var asJsonNode = yamlDocument.ToJsonNode();

diagnostic = null; // TBD
return asJsonNode.Deserialize<OpenApiDocument>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't see how using Deserialize is going to work for us. Using an automatic deserialization mechanism strongly couples the serialized format to the in-memory model. We specifically wrote the deserializer classes so we can deserialize any version of OpenAPI into the "latest" in-memory model.

It is not clear to me from this PR how you are proposing we continue allowing our users to import OpenAPI without caring what version of OAS the input format is. And to clarify, this is one of our most important features.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, this change isn't necessary because this code will be deleted.

Using an automatic deserialization mechanism strongly couples the serialized format to the in-memory model.

How so? I'm doing the actual parsing with YamlSharp; both YAML and JSON are supported through this. It parses into the YamlDocument model, which I then translate to JsonNode. Then I just use the built-in JsonSerializer to deserialize it into OpenApiDocument.

Going the other way, if you want JSON, you just use JsonSerializer.Serialize(openApiDoc). If you want YAML, you use this method which we'll need to expose as an extension.

We specifically wrote the deserializer classes so we can deserialize any version of OpenAPI into the "latest" in-memory model.

This supports the same thing. I can drop a 2.0 or a 3.1 schema into my DevTest test and it still deserializes just fine. Users don't need to care which OpenAPI version they have.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the JsonDeserializer.Deserialize convert body parameters into a requestBody object? How does the JsonSerializer.Deserialize know how to convert produces/consumes into media type objects? We have many of these kinds of translations that System.Text.Json knows nothing about.

Copy link
Author

@gregsdennis gregsdennis Apr 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. The paths use OpenApiParameter which isn't associated with OpenApiRequestBody (only used in components) that I can see.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our V2 Deserializer, we convert parameters that are of type body or form into a request body. See code here:

https://github.com/microsoft/OpenAPI.NET/blob/vnext/src/Microsoft.OpenApi.Readers/V2/OpenApiOperationDeserializer.cs#L126-L138

namespace Microsoft.OpenApi.Readers
{
/// <summary>
/// Provides extensions to convert YAML models to JSON models.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. Thanks! Maggie is currently doing a spike to try and convert the ParseNode classes to work off JsonNode instead of YamlNode. If she is able to do that, then this class will enable us to continue supporting YAML parsing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is basically copied from my YamlToJsonNode package. That lib uses YamlDotNet which is the basis for YamlSharp, so it's the same code for "different" types.

@@ -82,17 +77,17 @@ public class OpenApiHeader : IOpenApiSerializable, IOpenApiReferenceable, IOpenA
/// <summary>
/// Examples of the media type.
/// </summary>
public IDictionary<string, OpenApiExample> Examples { get; set; } = new Dictionary<string, OpenApiExample>();
public IDictionary<string, OpenApiExample> Examples { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was an intentional design choice to ensure collections were non-null so that library users did not have to check if collections had been created before adding properties to the collection. However, due to the performance cost we should do a lazy instantiation of these properties in the get accessor instead of doing it at create time.

Copy link
Author

@gregsdennis gregsdennis Apr 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These initializations cause these properties to serialize. Lazy initialization will do the same. Not using custom JSON converters is preferred, where possible.

For ease-of-use, personally, I think a fluent builder is better. I made JsonSchema immutable, so I needed a separate JsonSchemaBuilder class. If you keep your types mutable, you won't need a separate builder class.

@darrelmiller
Copy link
Member

darrelmiller commented Apr 2, 2023

@gregsdennis Meggie mentioned to me that you had some concern of how the OpenApiSchemaDeserizalizer would be able to parse schema keywords that were not previously known. We using this deserialization pattern extensively in our SDK generation and are able to parse unrecognized properties into a dictionary without any problem. Currently the ParseField does report an error if it finds an unknown property

However, it will be trivial to augment this for parsing of JsonNodes that contain properties of JsonSchema so that we can either handle keywords that have been registered in your JsonSchema keyword registry or if they are completely unknown. This will allow us to handle both the current 3.1 case where unknown keywords are allowed and the planned future case where unknown keywords are not.

/// <returns>A collection of nodes representing the YAML documents in the stream.</returns>
public static IEnumerable<JsonNode> ToJsonNode(this YamlStream yaml)
{
return yaml.Documents.Select(x => x.ToJsonNode());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return yaml.Documents.Select(x => x.ToJsonNode());
return yaml.Documents.Select(static x => x.ToJsonNode());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they're using a compiler version that supports this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Evidence would suggest otherwise

.Where(static r => r.Value.Content != null)

Copy link
Author

@gregsdennis gregsdennis Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, okay. I'm not worried about these nuances right now. Let's please focus on making it work.


Docs: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-9.0/static-anonymous-functions

Removing the static modifier from an anonymous function in a valid program does not change the meaning of the program.

This is an inconsequential, editorial change.

Comment on lines +91 to +97
var node = new JsonArray();
foreach (var value in yaml)
{
node.Add(value.ToJsonNode());
}

return node;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var node = new JsonArray();
foreach (var value in yaml)
{
node.Add(value.ToJsonNode());
}
return node;
return new JsonArray(yaml.Select(static x => x.ToJsonNode()).ToArray());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion allocates an extra array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, sorry I meant this

Suggested change
var node = new JsonArray();
foreach (var value in yaml)
{
node.Add(value.ToJsonNode());
}
return node;
return new JsonArray(yaml.Select(static x => x.ToJsonNode()));

@gregsdennis
Copy link
Author

gregsdennis commented May 20, 2023

We using this deserialization pattern extensively in our SDK generation and are able to parse unrecognized properties into a dictionary without any problem. - @darrelmiller

My concern was less about unknown keywords and more about keywords defined in custom vocabularies, especially those which provide additional functionality, like assertions. If you're supporting 2020-12, you will need to be able to support this somehow, probably through some extension mechanism where people can define their own keyword implementations.

@MaggieKimani1 MaggieKimani1 deleted the branch microsoft:mk/integrate-json-schema-library November 21, 2023 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants