Releases: cometkim/unicode-segmenter
[email protected]
Minor Changes
- 21cd789: Removed deprecated APIs
  - searchGrapheme in unicode-segmenter/grapheme
  - takeChar and takeCodePoint in unicode-segmenter/utils

  These were used internally before, but never from outside.
- 483d258: Reduced bundle size while keeping the best performance.

  Some details:
  - Refactored to share the same internal code path wherever possible.
  - Removed the pre-computed jump table; its benefit was compensated for by other perf improvements.
  - The previous array layout, meant to avoid accidental de-opt, turned out to be overkill. Regular tuple arrays are well optimized, so I fell back to good old plain binary search.
  - Tried some experiments, such as a new encoding and an Eytzinger layout, for more aggressive improvements, but without success.
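
For readers unfamiliar with the approach, a plain binary search over a regular tuple array might look roughly like the sketch below. The table contents, the `[start, end, category]` tuple shape, and the names `RANGES` and `findCategory` are assumptions made for illustration; they are not the library's actual internals.

```javascript
// Illustrative table (not the library's data): [start, end, category],
// sorted by start code point, with non-overlapping ranges.
const RANGES = [
  [0x000a, 0x000a, 'LF'],
  [0x000d, 0x000d, 'CR'],
  [0x0300, 0x036f, 'Extend'],
  [0x1f1e6, 0x1f1ff, 'Regional_Indicator'],
];

// Hypothetical helper: find the category of a code point by binary search.
// Falls back to 'Any' when no range contains the code point.
function findCategory(cp, table = RANGES) {
  let lo = 0;
  let hi = table.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const [start, end, cat] = table[mid];
    if (cp < start) {
      hi = mid - 1; // code point lies before this range
    } else if (cp > end) {
      lo = mid + 1; // code point lies after this range
    } else {
      return cat; // start <= cp <= end
    }
  }
  return 'Any';
}
```

For example, `findCategory(0x0301)` lands in the third range, while `findCategory(0x41)` falls through to the default.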
[email protected]
Patch Changes
- a5f486f: Fix bloat in the NPM package.

  package.tgz was mostly bloated by CommonJS interop and sourcemaps.

  However, sourcemaps aren't necessary here since the package uses its sources as is, and the CommonJS entries shouldn't be different.

  Now fixed by using simpler transpilation for the CommonJS entries and removing the sourcemap files. Inaccessible entries were also removed.

  As a result, the total unpacked package size is down to 135 KB from 250 KB.

  Note: Node.js v22 will stabilize require(ESM), which will allow CommonJS projects to use this package without having to maintain separate entries. I'm very excited about that, and looking forward to it becoming more "common". The first major release may consider ending support for CommonJS entries and TypeScript's "Node" resolution.
[email protected]
[email protected]
[email protected]
Minor Changes
- ffb41fb: Code size is significantly reduced; the minified JS is now about half the size.

  There are also some performance improvements. Not that much, but getting an improvement on size without giving up performance is a huge win.

  - Compressed the Unicode data further using Base36.
  - Changed the internal representation to TypedArray to improve its access pattern.
  - Shrank the grapheme lookup table size. This does not impact performance except for some edge cases like Hindi and Demonic, but it does reduce the bundle size.

- 9e0feca: Update to Unicode® 16.0.0
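
As a rough illustration of the Base36 and TypedArray ideas mentioned in ffb41fb, the sketch below decodes a comma-separated Base36 string into a `Uint32Array`. The encoding format, the sample data, and the names `ENCODED` and `decodeBase36` are assumptions for illustration only; the library's real data layout may differ.

```javascript
// Illustrative payload (not the library's data): code points written in
// Base36 and joined by commas. Base36 is shorter than decimal or hex text.
const ENCODED = 'a,d,lc,of';

// Hypothetical helper: decode the string once into a compact typed array.
function decodeBase36(encoded) {
  const parts = encoded.split(',');
  const out = new Uint32Array(parts.length);
  for (let i = 0; i < parts.length; i++) {
    out[i] = parseInt(parts[i], 36);
  }
  return out;
}

const table = decodeBase36(ENCODED);
// table now holds 10 (U+000A), 13 (U+000D), 768 (U+0300), 879 (U+036F)
```

The reverse direction uses `Number.prototype.toString(36)`, for example `(768).toString(36)` gives `'lc'`.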
[email protected]
Patch Changes
- 3665cf7: Fix Hindi text segmentation
[email protected]
[email protected]
Patch Changes
- 447b484: Fix the polyfill so that it does not override an existing implementation, and so that it is assigned as a non-enumerable property
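
A minimal sketch of that polyfill behavior, assuming a generic target object: skip installation when the property already exists, and otherwise define it as non-enumerable with `Object.defineProperty`. The helper name `installPolyfill` is hypothetical and is not part of this package.

```javascript
// Hypothetical helper (not the package's API): install `implementation`
// on `target[name]` only if nothing is there yet.
function installPolyfill(target, name, implementation) {
  if (name in target) {
    return false; // keep the existing (possibly native) implementation
  }
  Object.defineProperty(target, name, {
    value: implementation,
    enumerable: false, // hidden from Object.keys() and for...in
    writable: true,
    configurable: true,
  });
  return true;
}

// Usage on a stand-in object rather than the real Intl global.
const fakeIntl = { Collator: function Collator() {} };
installPolyfill(fakeIntl, 'Segmenter', class Segmenter {});
```

After the call, `'Segmenter' in fakeIntl` is true, but `Object.keys(fakeIntl)` still lists only `Collator`.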
[email protected]
Patch Changes
- 04fe2fc: Fix sourcemap reference error
  - Include missing sourcemap files for transformed CJS entries
  - Remove unnecessary transforms for ESM entries and remove the sourcemap reference
[email protected]
Minor Changes
- 657e31a: semi-breaking: removed _cat from grapheme cluster segments because it was useless.

  Instead, added _catBegin and _catEnd as the beginning/end categories of segments, which may be useful for inferring the applied boundary rules.