v7.0.0
What's Changed
New package maintainers: @jamesdbrock @robertdp
Full Changelog: v6.0.2...v7.0.0
Unicode correctness by @jamesdbrock in #119
Correctly handle UTF-16 surrogate pairs in Strings. Fundamentally, we change the way we consume the next input character from Data.String.CodeUnits.uncons to Data.String.CodePoints.uncons.
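As a rough illustration of why this change matters, the sketch below applies both uncons functions to a string that starts with a character outside the Basic Multilingual Plane. The module name and the value names are made up for this example.

```purescript
module Example.Uncons where

import Data.Maybe (Maybe)
import Data.String.CodePoints (CodePoint)
import Data.String.CodePoints as CodePoints
import Data.String.CodeUnits as CodeUnits

-- "𝄞" is U+1D11E. It lies outside the Basic Multilingual Plane, so it is
-- stored as a surrogate pair (two UTF-16 code units).
clef :: String
clef = "𝄞!"

-- Code-unit view: the head is only the first half of the surrogate pair,
-- and the tail begins with the second half.
byCodeUnit :: Maybe { head :: Char, tail :: String }
byCodeUnit = CodeUnits.uncons clef

-- Code-point view: the head is the whole character U+1D11E and the tail is "!".
byCodePoint :: Maybe { head :: CodePoint, tail :: String }
byCodePoint = CodePoints.uncons clef

-- The two views also disagree on length: 3 code units, 2 code points.
unitCount :: Int
unitCount = CodeUnits.length clef

pointCount :: Int
pointCount = CodePoints.length clef
```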
Non-breaking changes
Add primitive parsers anyCodePoint and satisfyCodePoint for parsing CodePoints.
Add the match combinator.
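A minimal sketch of how the new parsers might be used together. It assumes the type signatures shipped in this release, in particular that match returns the consumed input paired with the inner parser's result, and it uses isAlpha from Data.CodePoint.Unicode. The module name and the value names are made up for this example.

```purescript
module Example.NewParsers where

import Prelude

import Data.CodePoint.Unicode (isAlpha)
import Data.Either (Either)
import Data.String.CodePoints (CodePoint)
import Data.Tuple (Tuple)
import Text.Parsing.Parser (ParseError, Parser, runParser)
import Text.Parsing.Parser.Combinators (skipMany)
import Text.Parsing.Parser.String (anyCodePoint, match, satisfyCodePoint)

-- Consume one whole code point, even when it is a surrogate pair.
firstCodePoint :: Either ParseError CodePoint
firstCodePoint = runParser "𝄞 clef" anyCodePoint

-- Accept a single code point that Unicode classifies as a letter.
letter :: Parser String CodePoint
letter = satisfyCodePoint isAlpha

-- Skip a run of letters and use match to recover the input that was consumed.
-- Under the assumptions above, the String in the Tuple should be "héllo".
word :: Either ParseError (Tuple String Unit)
word = runParser "héllo world" (match (skipMany letter))
```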
Breaking changes
Move updatePosString to the Text.Parsing.Parser.String module and don't export it.
Change the definition of whiteSpace and skipSpaces to use Data.CodePoint.Unicode.isSpace.
To make this library handle Unicode correctly, it is necessary to either alter the StringLike class or delete it. We decided to delete it. The String module will now operate only on inputs of the concrete String type.
Breaking changes which won’t be caught by the compiler
anyChar will no longer always succeed. It will only succeed on a Basic Multilingual Plane character. The new parser anyCodePoint will always succeed.
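Because the compiler will not flag this change, a small check along the following lines may help when auditing existing parsers. It is only a sketch of the behaviour described above; the module name and the value names are made up for this example.

```purescript
module Example.AnyChar where

import Data.Either (isLeft, isRight)
import Text.Parsing.Parser (runParser)
import Text.Parsing.Parser.String (anyChar, anyCodePoint)

-- 'a' is in the Basic Multilingual Plane, so anyChar still accepts it.
bmpSucceeds :: Boolean
bmpSucceeds = isRight (runParser "a" anyChar)

-- "𝄞" (U+1D11E) is outside the Basic Multilingual Plane and does not fit in
-- a Char, so anyChar is expected to fail here as of this release.
astralFailsWithAnyChar :: Boolean
astralFailsWithAnyChar = isLeft (runParser "𝄞" anyChar)

-- anyCodePoint accepts the same input as a single CodePoint.
astralSucceedsWithAnyCodePoint :: Boolean
astralSucceedsWithAnyCodePoint = isRight (runParser "𝄞" anyCodePoint)
```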
We keep the Char parsers for backward compatibility and for ergonomic reasons. For example, the parser char :: forall m. Monad m => Char -> ParserT String m Char is usually called with a literal, like char 'a'. It would be annoying to have to call it as char (codePointFromChar 'a').
Benchmarks
A benchmark suite was added to this package, and the benchmarks show no performance difference in this release.