v7.0.0

@jamesdbrock jamesdbrock released this 06 Oct 06:57
· 231 commits to main since this release

What's Changed

New package maintainers: @jamesdbrock @robertdp

Full Changelog: v6.0.2...v7.0.0

Unicode correctness by @jamesdbrock in #119

Correctly handle UTF-16 surrogate pairs in Strings. Fundamentally, we change the way we consume the next input character from Data.String.CodeUnits.uncons to Data.String.CodePoints.uncons.
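The difference between the two ways of consuming input can be sketched in TypeScript, since PureScript strings on the JavaScript backend are ordinary UTF-16 JavaScript strings. The helpers `unconsCodeUnit` and `unconsCodePoint` below are hypothetical stand-ins written for illustration; they are not the library's implementation, and they only model the behaviour that the names `Data.String.CodeUnits.uncons` and `Data.String.CodePoints.uncons` suggest.

```typescript
// Hypothetical helpers for illustration only; not part of the library.

// Take one UTF-16 code unit off the front of the input.
// A character outside the Basic Multilingual Plane gets split in half.
function unconsCodeUnit(s: string): { head: string; tail: string } | null {
  if (s.length === 0) return null;
  return { head: s.charAt(0), tail: s.slice(1) };
}

// Take one whole code point off the front of the input,
// keeping a high/low surrogate pair together.
function unconsCodePoint(s: string): { head: number; tail: string } | null {
  if (s.length === 0) return null;
  const hi = s.charCodeAt(0);
  if (hi >= 0xd800 && hi <= 0xdbff && s.length > 1) {
    const lo = s.charCodeAt(1);
    if (lo >= 0xdc00 && lo <= 0xdfff) {
      // Combine the surrogate pair into a single code point.
      const cp = (hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000;
      return { head: cp, tail: s.slice(2) };
    }
  }
  return { head: hi, tail: s.slice(1) };
}

// U+1D11E MUSICAL SYMBOL G CLEF (one code point, two code units), then "x".
const input = "\uD834\uDD1Ex";
const byUnit = unconsCodeUnit(input);   // head is a lone high surrogate
const byPoint = unconsCodePoint(input); // head is the whole code point
```

Consuming by code unit leaves a dangling low surrogate at the front of the remaining input, which is the kind of incorrect handling this release addresses.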

Non-breaking changes

Add primitive parsers anyCodePoint and satisfyCodePoint for parsing CodePoints.

Add the match combinator.

Breaking changes

Move updatePosString to the Text.Parsing.Parser.String module and don't export it.

Change the definition of whiteSpace and skipSpaces to use Data.CodePoint.Unicode.isSpace.

To make this library handle Unicode correctly, it is necessary to either alter the StringLike class or delete it. We decided to delete it. The String module will now operate only on inputs of the concrete String type.

Breaking changes which won’t be caught by the compiler

anyChar will no longer succeed on every input character. It will only succeed on a character in the Basic Multilingual Plane. The new parser anyCodePoint will succeed on any character.
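This change in behaviour can be modelled in a few lines of TypeScript. The functions `anyCodePointModel` and `anyCharModel` are hypothetical names invented for this sketch and are not the library's code. The sketch assumes that a parser returns `null` on failure and that both parsers fail at the end of input.

```typescript
// Hypothetical model for illustration only; not the library's implementation.
type Step<T> = { value: T; rest: string } | null;

// Model of anyCodePoint: consume one whole code point.
// In this model it fails only when no input remains.
function anyCodePointModel(s: string): Step<number> {
  if (s.length === 0) return null;
  const hi = s.charCodeAt(0);
  if (hi >= 0xd800 && hi <= 0xdbff && s.length > 1) {
    const lo = s.charCodeAt(1);
    if (lo >= 0xdc00 && lo <= 0xdfff) {
      const cp = (hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000;
      return { value: cp, rest: s.slice(2) };
    }
  }
  return { value: hi, rest: s.slice(1) };
}

// Model of anyChar: consume one code point, but succeed only if it fits
// in a single UTF-16 code unit (the Basic Multilingual Plane).
function anyCharModel(s: string): Step<string> {
  const step = anyCodePointModel(s);
  if (step === null) return null;
  if (step.value > 0xffff) return null; // outside the BMP: fail
  return { value: String.fromCharCode(step.value), rest: step.rest };
}

const bmpInput = "ab";
const astralInput = "\uD834\uDD1Ex"; // U+1D11E followed by "x"

const charOnBmp = anyCharModel(bmpInput);         // succeeds with "a"
const charOnAstral = anyCharModel(astralInput);   // fails: not a BMP character
const pointOnAstral = anyCodePointModel(astralInput); // succeeds with 0x1D11E
```

Code that relied on anyChar to consume arbitrary input still compiles, but in this model it now fails at the first character outside the Basic Multilingual Plane, which is why the compiler cannot catch the change.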

We keep the Char parsers for backward compatibility and for ergonomic reasons. For example, the parser char :: forall m. Monad m => Char -> ParserT String m Char is usually called with a literal, like char 'a'. It would be annoying to have to call this parser with char (codePointFromChar 'a') instead.

Benchmarks

A benchmark suite was added to this package, and the benchmarks show no performance difference in this release.