Releases: cometkim/unicode-segmenter

[email protected]

06 Mar 19:56
209baf5

Minor Changes

  • 21cd789: Removed deprecated APIs

    • searchGrapheme in unicode-segmenter/grapheme
    • takeChar and takeCodePoint in unicode-segmenter/utils

    These were used internally before, but never from outside.

  • 483d258: Reduced bundle size, while keeping the best perf

    Some details:

    • Refactored to share the same internal code path wherever possible.
    • Removed the pre-computed jump table; losing that optimization was compensated for by other perf improvements.
    • The previous array layout, meant to avoid accidental de-opts, turned out to be overkill. The regular tuple array is well optimized, so I fell back to good old plain binary search.
    • Tried some experiments, such as a new encoding and an Eytzinger layout, for more aggressive improvements, but without success.
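
As a rough illustration of "plain binary search over a regular tuple array", here is a minimal sketch. The table contents, the category numbers, and the name lookupCategory are hypothetical and are not taken from the library's source.

```javascript
// Hypothetical table: sorted, non-overlapping [start, end, category] tuples.
// The ranges and category numbers are illustrative, not the library's data.
const RANGES = [
  [0x0300, 0x036f, 1], // combining diacritical marks
  [0x1f1e6, 0x1f1ff, 2], // regional indicator symbols
  [0x1f600, 0x1f64f, 3], // emoticons
];

// Find the category of a code point with a plain binary search.
// Returns 0 when the code point falls in none of the ranges.
function lookupCategory(cp, table = RANGES) {
  let lo = 0;
  let hi = table.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const [start, end, cat] = table[mid];
    if (cp < start) hi = mid - 1;
    else if (cp > end) lo = mid + 1;
    else return cat;
  }
  return 0;
}
```

Keeping every entry the same shape (three small integers) is one way to keep such an array easy for the engine to optimize, which is in the spirit of the note above.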

[email protected]

07 Dec 18:45
4a9be75

Patch Changes

  • a5f486f: Fix bloat in the NPM package.

    package.tgz was mostly bloated by CommonJS interop and sourcemap.

    However, sourcemap isn't necessary here as it uses sources as is,
    and the CommonJS shouldn't be different.

    This is now fixed by using simpler transpilation for CommonJS entries and removing the sourcemap files.
    Also removed inaccessible entries.

    So the total unpacked package size is down to 135 KB from 250 KB.

    Note: Node.js v22 will stabilize require(ESM), which will allow CommonJS projects to use this package without having to maintain separate entries. I'm very excited about that, and looking forward to it becoming more "common". The first major release may consider ending support for CommonJS entries and TypeScript's "Node" resolution.

[email protected]

29 Nov 17:09
28c3475

Patch Changes

  • 94ed937: Improved perf and bundle size a bit

    It seems that using TypedArray isn't helpful,
    and dereferencing many prototypes may cause de-opts.

    A plain Array is good enough as long as it stays packed.

  • de71269: Update Intl type definition

[email protected]

24 Nov 03:23
f5bf190

Patch Changes

  • 9d688d8: grapheme: Rename countGrapheme() to countGraphemes(). The existing name remains as a deprecated alias.
  • be49399: grapheme: Add splitGraphemes() utility
  • 5e86659: grapheme: add more detail to API JSDoc
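
To show what counting and splitting graphemes means in practice, here is a small sketch built on the standard Intl.Segmenter API. The two functions only borrow the names from the notes above; they are stand-ins written for illustration, and the library's own signatures and return types may differ.

```javascript
// Stand-in implementations built on the standard Intl.Segmenter API.
// They mirror the names from the release notes only; this is not the
// library's code, and its real signatures may differ.
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Yield each grapheme cluster of `text` as a string.
function* splitGraphemes(text) {
  for (const { segment } of segmenter.segment(text)) yield segment;
}

// Count grapheme clusters without building an intermediate array.
function countGraphemes(text) {
  let count = 0;
  for (const _ of segmenter.segment(text)) count += 1;
  return count;
}

// "a", then "e" + combining acute accent, then two regional indicators (a flag).
const sample = "ae\u0301\u{1F1F0}\u{1F1F7}";
console.log(sample.length); // 7 UTF-16 code units
console.log(countGraphemes(sample)); // 3 user-perceived characters
```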

[email protected]

02 Nov 21:20
8d8cd4f

Minor Changes

  • ffb41fb: Code size is significantly reduced; the minified JS is now about half its previous size

    There are also some performance improvements.
    Not by much, but improving size without giving up performance is a huge win.

    • Compress Unicode data more in Base36

    • Changed the internal representation into TypedArray to improve its access pattern.

    • Shrank the grapheme lookup table size.
      This does not impact performance except for some edge cases like Hindi and Demonic, but it does reduce the bundle size.

  • 9e0feca: Update to Unicode® 16.0.0
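
For readers curious how Base36 can shrink a table of code point ranges, here is a minimal sketch. The "gap,length" format and the names encodeRanges and decodeRanges are assumptions made for illustration; the library's actual encoding is not described in these notes and is likely different.

```javascript
// Hypothetical format: each range is stored as two base-36 numbers,
// the gap from the previous range's end and the range's own length.
// This is an illustration, not the library's actual encoding.
function encodeRanges(ranges) {
  let prevEnd = 0;
  const parts = [];
  for (const [start, end] of ranges) {
    parts.push((start - prevEnd).toString(36), (end - start).toString(36));
    prevEnd = end;
  }
  return parts.join(",");
}

// Reverse of encodeRanges: rebuild absolute [start, end] pairs.
function decodeRanges(encoded) {
  if (encoded === "") return [];
  const nums = encoded.split(",").map((s) => parseInt(s, 36));
  const ranges = [];
  let prevEnd = 0;
  for (let i = 0; i < nums.length; i += 2) {
    const start = prevEnd + nums[i];
    const end = start + nums[i + 1];
    ranges.push([start, end]);
    prevEnd = end;
  }
  return ranges;
}
```

Storing deltas keeps the numbers small, and base 36 writes each of them in fewer characters than decimal or hexadecimal would, which is where the savings come from in a scheme like this.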

[email protected]

02 Sep 18:07
66e7f83

Patch Changes

  • 3665cf7: Fix Hindi text segmentation

[email protected]

01 Sep 03:56
c1e6464

Minor Changes

  • 73f5e6b: Significantly reduced bundle size by compressing the data table. The grapheme segmentation library now takes only 6.6kB (gzip) or 4.4kB (brotli)!

Patch Changes

  • b045320: Fix isSMP, and add more plane utils (isSIP, isTIP, isSSP)
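
The plane checks named above can be pictured as follows. This is a sketch written from the Unicode plane layout (each plane spans 0x10000 code points), not the library's implementation; planeOf is a hypothetical helper.

```javascript
// Hypothetical re-implementations based on the Unicode plane layout.
// Each plane covers 0x10000 code points, so the plane index is cp >>> 16.
const planeOf = (cp) => cp >>> 16;

const isSMP = (cp) => planeOf(cp) === 1; // Supplementary Multilingual Plane
const isSIP = (cp) => planeOf(cp) === 2; // Supplementary Ideographic Plane
const isTIP = (cp) => planeOf(cp) === 3; // Tertiary Ideographic Plane
const isSSP = (cp) => planeOf(cp) === 14; // Supplementary Special-purpose Plane

console.log(isSMP("\u{1F600}".codePointAt(0))); // true: U+1F600 is in plane 1
```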

[email protected]

05 Jul 05:54
03d1051

Patch Changes

  • 447b484: Fix polyfill so that it does not override an existing implementation and is assigned as a non-enumerable property
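
The behavior described here can be sketched with Object.defineProperty. The helper name installPolyfill and the stand-in objects are hypothetical; the library's polyfill entry may be structured differently.

```javascript
// Hypothetical helper: install `impl` as target[name] only when it is missing.
function installPolyfill(target, name, impl) {
  if (name in target) return false; // keep any existing implementation
  Object.defineProperty(target, name, {
    value: impl,
    writable: true,
    configurable: true,
    enumerable: false, // hidden from Object.keys and for...in, like built-ins
  });
  return true;
}

// Usage with a stand-in object instead of the real Intl namespace.
const fakeIntl = { Collator: "native" };
installPolyfill(fakeIntl, "Collator", "polyfill"); // skipped, already present
installPolyfill(fakeIntl, "Segmenter", class Segmenter {}); // installed
console.log(Object.keys(fakeIntl)); // [ 'Collator' ]
```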

[email protected]

14 Jun 02:26
6d02503

Patch Changes

  • 04fe2fc: Fix sourcemap reference error

    • Include missing sourcemap files for transformed cjs entries
    • Remove unnecessary transforms for esm entries and remove source map reference

[email protected]

13 Jun 19:29
56b3b74

Minor Changes

  • 657e31a: semi-breaking: removed _cat from grapheme cluster segments because it was useless

    Instead, added _catBegin and _catEnd as the beginning and end categories of a segment, which may be useful for inferring which boundary rules were applied.