Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 - Primitive parsing for strings as DateTimes is too greedy #2137

Open
captainsafia opened this issue Feb 8, 2025 · 7 comments · May be fixed by #2160
Open

v2 - Primitive parsing for strings as DateTimes is too greedy #2137

captainsafia opened this issue Feb 8, 2025 · 7 comments · May be fixed by #2160
Assignees
Milestone

Comments

@captainsafia
Copy link
Member

captainsafia commented Feb 8, 2025

Both @mikekistler and I have observed this bug while interacting with the new JsonNode-based example APIs in v2. The current logic tries to eagerly parse string JsonNodes as date times and write them as such when emitting the final document:

if (valueKind == JsonValueKind.String && primitive is JsonValue jsonStrValue)
{
if (jsonStrValue.TryGetValue<DateTimeOffset>(out var dto))
{
writer.WriteValue(dto);
}
else if (jsonStrValue.TryGetValue<DateTime>(out var dt))
{
writer.WriteValue(dt);
}
else if (jsonStrValue.TryGetValue<string>(out var strValue))
{
// check whether string is actual string or date time object
if (DateTimeOffset.TryParse(strValue, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind, out var dateTimeOffset))
{
writer.WriteValue(dateTimeOffset);
}
else if (DateTime.TryParse(strValue, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind, out var dateTime))
{ // order matters, DTO needs to be checked first!!!
writer.WriteValue(dateTime);
}
else
{
writer.WriteValue(strValue);
}
}
}

This can cause some rather unintended behaviors. For example, the current implementation casts example strings that are designated as a DateOnly time to DateTime based types. The current implementation will also errenously capture certain string examples (like a float 3.14) as dates.

The following code:

using System.Text.Json.Nodes;
using System.IO;
using Microsoft.OpenApi.Models;
using Microsoft.OpenApi.Writers;

var schema = new OpenApiSchema()
{
	Type = JsonSchemaType.String,
	Example = JsonValue.Create("2024-01-02")
};

var schema2 = new OpenApiSchema()
{
	Type = JsonSchemaType.String,
	Example = JsonValue.Create("3.14")
};

var textWriter = new StringWriter();
var writer = new OpenApiJsonWriter(textWriter);
schema.SerializeAsV31(writer);
System.Console.WriteLine(textWriter.ToString());

textWriter = new StringWriter();
writer = new OpenApiJsonWriter(textWriter);
schema2.SerializeAsV31(writer);
System.Console.WriteLine(textWriter.ToString());

Produces:

{
  "type": "string",
  "example": "2024-01-02T00:00:00.0000000+00:00"
}
{
  "type": "string",
  "example": "2025-03-14T00:00:00.0000000+00:00"
}

When I would expect it to produce:

{
  "type": "string",
  "example": "2024-01-02"
}
{
  "type": "string",
  "example": "3.14"
}
@captainsafia captainsafia changed the title v2 - Primitie parsing for strings as DateTimes is too greedy v2 - Primitive parsing for strings as DateTimes is too greedy Feb 14, 2025
@RachitMalik12
Copy link

RachitMalik12 commented Feb 14, 2025

@baywet what are your thoughts here?

@baywet
Copy link
Member

baywet commented Feb 17, 2025

This is something we've faced in the kiota serialization layers as well.
STJ is greedy sometimes.
The solution we found which:

  • allows configuration of the date formats
  • provides great performance
  • reduces greediness
  • works with trimming

what to use the deserialize method instead

But I don't think we need such a level of complexity here. We can probably drop the first to conditional blocks and add more cases to this block

What do you think of this approach @captainsafia ?

@darrelmiller
Copy link
Member

darrelmiller commented Feb 17, 2025

@baywet Why are we parsing the string at all? Since we dropped the OpenApiAny primitives, we should not be inferring any types beyond JSON primitives.

@baywet
Copy link
Member

baywet commented Feb 17, 2025

That was my thought as well. My assumption here was so we can call the "correct" serialization method (we need the right type). But it does seem superfluous to me as well.

@darrelmiller
Copy link
Member

My vote is to get rid of all that type casting beyond JSON types.

@baywet
Copy link
Member

baywet commented Feb 17, 2025

From looking at the blame for the file, before a fix I implemented there probably to unblock some of the migration work, it looks like @MaggieKimani1 implemented most of it. Maybe she could share additional context we don't have?

@MaggieKimani1 MaggieKimani1 self-assigned this Feb 19, 2025
@MaggieKimani1
Copy link
Contributor

MaggieKimani1 commented Feb 19, 2025

@darrelmiller @baywet I think at the time I was just refactoring the whole codebase to replace the parsing logic with the JSON node "equivalent" and failed to get rid of the type casting code for the primitives since we no longer need it. Doing a PR that corrects this asap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants