Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Includes type forwarding for System.Text.Unicode.Utf8 to the Microsoft.Bcl.Memory library #111292

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

AlexRadch
Copy link
Contributor

Close #52947

This commit introduces the System.Text.Unicode.Utf8 type to the Microsoft.Bcl.Memory library.

It includes type forwarding for Utf8 in Microsoft.Bcl.Memory.Forwards.cs, updates the documentation in PACKAGE.md to include Utf8 functionality, and adds corresponding test cases in Microsoft.Bcl.Memory.Tests.csproj.

The documentation now emphasizes Utf8 alongside Index, Range, and Base64Url, highlighting its role in converting data between UTF-8 and UTF-16 encodings.

This commit introduces the `System.Text.Unicode.Utf8` type to the `Microsoft.Bcl.Memory` library.

It includes type forwarding for `Utf8` in `Microsoft.Bcl.Memory.Forwards.cs`, updates the documentation in `PACKAGE.md` to include `Utf8` functionality, and adds corresponding test cases in `Microsoft.Bcl.Memory.Tests.csproj`.

 The documentation now emphasizes `Utf8` alongside `Index`, `Range`, and `Base64Url`, highlighting its role in converting data between UTF-8 and UTF-16 encodings.
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 10, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 10, 2025
@stephentoub
Copy link
Member

stephentoub commented Jan 11, 2025

Thank you, but this is not the intent of that work item, which is to actually ship those Utf8 APIs downlevel, e.g. in addition to type forwarding, it also needs an implementation for netstandard.

@stephentoub stephentoub added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 11, 2025
@teo-tsirpanis teo-tsirpanis added area-System.Memory and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 11, 2025
Updated `PackageDescription` to include "Utf8" support.

Added new `ItemGroup` for conditional compilation of UTF-8
and Unicode handling files for non-net8.0 frameworks.

Modified visibility and implementations in `Ascii.Utility.Helpers.cs`,
`Utf8.cs`, and `Utf8Utility` based on `MICROSOFT_BCL_MEMORY` define.
@AlexRadch
Copy link
Contributor Author

AlexRadch commented Jan 11, 2025

Thank you, but this is not the intent of that work item, which is to actually ship those Utf8 APIs downlevel, e.g. in addition to type forwarding, it also needs an implementation for netstandard.

I added Utf8 APIs to Microsoft.Bcl.Memory library. I haven't added tests yet. I'll add them too.

Updated `Microsoft.Bcl.Memory.Tests.csproj` to include `UnicodeUtility.cs` and removed .NET 8.0 targeting condition.

Modified `Utf8Tests.cs` by adjusting using directives and enhancing the `DecodeHex` method with conditional compilation for .NET 5.0+.
Added a new property `<DefineConstants>$(DefineConstants);MICROSOFT_BCL_MEMORY</DefineConstants>` to the project file to define a new compilation constant for the project.
@AlexRadch
Copy link
Contributor Author

AlexRadch commented Jan 11, 2025

Thank you, but this is not the intent of that work item, which is to actually ship those Utf8 APIs downlevel, e.g. in addition to type forwarding, it also needs an implementation for netstandard.

@stephentoub I added Utf8 APIs to Microsoft.Bcl.Memory library for older .NET platforms. I've also added tests for older .NET platforms.

@@ -9,7 +9,10 @@

namespace System.Text.Unicode
{
#if SYSTEM_PRIVATE_CORELIB
#if SYSTEM_PRIVATE_CORELIB || MICROSOFT_BCL_MEMORY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is being referenced only by System.Private.CoreLib and Microsoft.BCL.Memory, so the class can always be public.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid to make such changes and let someone else make them as a separate PR.

Comment on lines 99 to 104
#if !MICROSOFT_BCL_MEMORY
public
#else
private
#endif
static byte[] ToUtf8(Rune rune)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if !MICROSOFT_BCL_MEMORY
public
#else
private
#endif
static byte[] ToUtf8(Rune rune)
public static byte[] ToUtf8(Rune rune)

Why can't this always be public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In tests for older versions of .Net where there is no Rune class yet, I create it as an internal class, and the compiler complains that the public API cannot reference internal classes.

AlexRadch and others added 5 commits January 12, 2025 08:29
Co-authored-by: Theodore Tsirpanis <[email protected]>
- Added polyfill for System.Numerics.BitOperations for .NET Standard 2.0.
Removed the `DefineConstants` property from the project file, which included the constant `MICROSOFT_BCL_MEMORY`. This change may impact conditional compilation within the project.
@teo-tsirpanis
Copy link
Contributor

(removing NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) as the reason it was added has been addressed)

@teo-tsirpanis teo-tsirpanis removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 19, 2025
@teo-tsirpanis
Copy link
Contributor

@dotnet/area-system-memory could you review this? Thanks.

/// the data could not be successfully decoded. This pattern provides convenient automatic U+FFFD substitution of
/// invalid sequences while iterating through the loop.
/// </remarks>
private static OperationStatus DecodeFromUtf8(ReadOnlySpan<byte> source, out uint result, out int bytesConsumed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the project include Rune.cs instead of including a copy of the implementation here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in teo-tsirpanis@79ee05d in order to unblock work that depends on this, after waiting for one week. Anyone with permissions can push my commit on the PR's branch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed

Copy link
Contributor

@teo-tsirpanis teo-tsirpanis Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compile errors fixed in teo-tsirpanis@bf6f989.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed

For downlevel frameworks we add `Rune.cs` to `Microsoft.Bcl.Memory`.

(cherry picked from commit 79ee05d)
(cherry picked from commit bf6f989)
@teo-tsirpanis
Copy link
Contributor

CI failures are unrelated.

@akoeplinger akoeplinger requested a review from jkotas March 10, 2025 16:17
@jkotas
Copy link
Member

jkotas commented Mar 10, 2025

I do not have any additional comments. @dotnet/area-system-memory should review and signoff.

@jkotas jkotas requested review from a team and removed request for jkotas March 10, 2025 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Memory community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Create OOB package for Rune, System.Text.Unicode.Utf8, and related APIs
5 participants