bug(developer): .kvks file reader corrupts non-BMP html numeric entities such as &#x12345; #13348

Open
mcdurdin opened this issue Feb 25, 2025 · 1 comment

mcdurdin commented Feb 25, 2025

For example, in the sample below, &#65536; (U+10000) is converted to \u{0000} and &#x12345; (U+12345) is converted to \u{2345}. In both cases only the low 16 bits of the code point survive.

<?xml version="1.0" encoding="utf-8"?>
<visualkeyboard>
  <header>
    <version>10.0</version>
    <kbdname>hex_escape</kbdname>
    <flags/>
  </header>
  <encoding name="unicode" fontname="arial" fontsize="-12">
    <layer shift="">
      <key vkey="K_1">&#x1234;</key>
      <key vkey="K_2">&#256;</key>
      <key vkey="K_3">&#65536;</key>
      <key vkey="K_4">&#x12345;</key>
    </layer>
  </encoding>
</visualkeyboard>
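
The two corrupted values are what you would get if each code point were truncated to 16 bits. The sketch below reproduces that arithmetic. It is an illustration only: the helper names `decodeTruncated` and `decodeFull` are hypothetical, and nothing in this issue confirms that the reader fails in exactly this way.

```typescript
// Hypothetical helpers for illustration; not taken from Keyman or fast-xml-parser.

// String.fromCharCode converts each argument to an unsigned 16-bit value,
// so any code point above U+FFFF loses its high bits.
function decodeTruncated(codePoint: number): string {
  return String.fromCharCode(codePoint);
}

// String.fromCodePoint accepts the full Unicode range (up to U+10FFFF) and
// produces a surrogate pair for non-BMP code points.
function decodeFull(codePoint: number): string {
  return String.fromCodePoint(codePoint);
}

// The four values from the sample: two BMP, two non-BMP.
const sampleCodePoints: number[] = [0x1234, 256, 65536, 0x12345];

for (const cp of sampleCodePoints) {
  const truncated = decodeTruncated(cp).charCodeAt(0).toString(16);
  const full = (decodeFull(cp).codePointAt(0) as number).toString(16);
  console.log(`U+${cp.toString(16)}: truncated=${truncated} full=${full}`);
}
```

The BMP values (&#x1234; and &#256;) come out the same either way, which matches the report that only the non-BMP keys are affected.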

Version: 18.0.198-beta

Arising from investigations in #13263.

@mcdurdin mcdurdin changed the title from "bug(developer): .kvks file reader corrupts non-BMP character escapes such as `&#x12345;'" to "bug(developer): .kvks file reader corrupts non-BMP character escapes such as &#x12345;" Feb 25, 2025
@mcdurdin mcdurdin self-assigned this Feb 25, 2025

mcdurdin commented Feb 25, 2025

@mcdurdin mcdurdin added this to the B18S3 milestone Feb 25, 2025
@mcdurdin mcdurdin moved this to In Progress in Keyman Feb 25, 2025
@mcdurdin mcdurdin moved this from In Progress to Blocked in Keyman Feb 25, 2025
@mcdurdin mcdurdin changed the title from "bug(developer): .kvks file reader corrupts non-BMP character escapes such as &#x12345;" to "bug(developer): .kvks file reader corrupts non-BMP html numeric entities such as &#x12345;" Feb 25, 2025
mcdurdin added a commit that referenced this issue Feb 27, 2025
… XML reader

fast-xml-parser has a bug with numeric entities. See:
  NaturalIntelligence/fast-xml-parser#725

This commit adds a unit test to verify that non-BMP numeric entities
will be parsed correctly. It will fail until we update the
fast-xml-parser dependency.

Relates-to: #13348
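
The expectations such a unit test might encode can be sketched as follows. This is not the test from the commit: `decodeNumericEntities` is a hypothetical stand-in reference decoder, used here only so the example is self-contained, and it does not call the Keyman .kvks reader or fast-xml-parser.

```typescript
// Hypothetical reference decoder for XML numeric character references.
// Handles decimal (&#NNN;) and hexadecimal (&#xHHH;) forms only.
function decodeNumericEntities(text: string): string {
  return text.replace(
    /&#(x[0-9a-fA-F]+|[0-9]+);/g,
    (_match: string, body: string): string => {
      const codePoint =
        body.charAt(0) === "x"
          ? parseInt(body.slice(1), 16)
          : parseInt(body, 10);
      // fromCodePoint keeps the full value; it throws a RangeError
      // for anything above U+10FFFF.
      return String.fromCodePoint(codePoint);
    }
  );
}

// Key contents from the sample .kvks, paired with the text a correct reader
// should produce for each.
const expectedKeys: Array<[string, string]> = [
  ["&#x1234;", "\u{1234}"],
  ["&#256;", "\u{100}"],
  ["&#65536;", "\u{10000}"],
  ["&#x12345;", "\u{12345}"],
];

for (const [input, want] of expectedKeys) {
  const got = decodeNumericEntities(input);
  if (got !== want) {
    throw new Error(`${input} decoded incorrectly`);
  }
}
```

A real test would feed the sample file through the actual reader and compare each key's text against the same expected strings.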
Status: Blocked