docs
nitely committed Mar 22, 2024
1 parent 7e227d7 commit aed7a87
Showing 2 changed files with 49 additions and 7 deletions.
50 changes: 43 additions & 7 deletions src/regex.nim
@@ -94,10 +94,9 @@ clears both the x and y flags.
u Unicode support (enabled by default)
x ignore whitespace and allow line comments (starting with #)
.. note::
- All flags are disabled by default unless stated otherwise
+ All flags are disabled by default unless stated otherwise
- The regex accepts passing a set of flags to set global flags:
+ The regex accepts passing a set of flags:
.. code-block::
regexCaseless same as (?i)
@@ -106,10 +105,11 @@ The regex accepts passing a set of flags to set global flags:
regexUngreedy same as (?U)
regexAscii same as (?-u)
regexExtended same as (?x)
- regexArbitraryBytes treat both the regex and the input text as arbitrary byte sequences
+ regexArbitraryBytes treat both the regex and the input text
+ as arbitrary byte sequences
.. note::
- Read the `Match arbitrary bytes`_ section
+ Read the `Match arbitrary bytes <#examples-match-arbitrary-bytes>`_ section
to learn more about the arbitrary bytes mode and ascii mode
Escape sequences
@@ -337,15 +337,15 @@ This flag makes ascii mode the default.
.. code-block:: nim
:test:
- let flags = {regexArbitraryBytes}
+ const flags = {regexArbitraryBytes}
doAssert match("\xff", re2(r"\xff", flags))
doAssert match("\xf8\xa1\xa1\xa1\xa1", re2(r".+", flags))
Beware of (un)expected behaviour when mixing in UTF-8 characters.
.. code-block:: nim
:test:
- let flags = {regexArbitraryBytes}
+ const flags = {regexArbitraryBytes}
doAssert match("Ⓐ", re2(r"Ⓐ", flags))
doAssert match("ⒶⒶ", re2(r"(Ⓐ)+", flags))
doAssert not match("ⒶⒶ", re2(r"Ⓐ+", flags)) # ???
@@ -355,6 +355,42 @@ regex is parsed as a byte sequence. The ``Ⓐ`` character
is composed of multiple bytes (``\xe2\x92\xb6``),
and only the last byte is affected by the ``+`` operator.
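The byte layout described above can be checked with the standard library alone. This is a minimal sketch that does not use the regex library; it only confirms that ``Ⓐ`` is one code point stored as three bytes, which is why a trailing ``+`` in arbitrary bytes mode would attach to the final byte only.

```nim
import std/unicode

# Stdlib-only illustration; no regex library involved.
let s = "Ⓐ"

doAssert s.len == 3                # three bytes in memory
doAssert s.runeLen == 1            # but a single Unicode code point
doAssert s == "\xe2\x92\xb6"       # the UTF-8 encoding named in the docs
doAssert s[^1] == '\xb6'           # the last byte, the one a trailing + would repeat
```

Grouping the character, as in ``(Ⓐ)+``, makes the operator apply to all three bytes as a unit.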
Compile the regex expression at compile time
############################################
Passing a regex literal or assigning the Regex
object to a ``const`` will compile the regex
expression at compile time.
Most other regex libraries can only compile expressions at runtime, which
usually requires some sort of cache (a thread-local variable, for example)
to avoid compiling the same expression more than once.
.. code-block:: nim
:test:
let text = "abc"
block:
const rexp = re2".+"
doAssert match(text, rexp)
block:
doAssert match(text, re2".+")
Using a ``const`` can avoid confusion when passing flags:
.. code-block:: nim
:test:
let text = "abc"
block:
const rexp = re2(r".+", {regexDotAll})
doAssert match(text, rexp)
block:
doAssert match(text, re2(r".+", {regexDotAll}))
block:
# this compiles the expression at runtime because
# flags is a runtime variable (not a const); avoid it!
let flags = {regexDotAll}
doAssert match(text, re2(r".+", flags))
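The compile-time versus runtime distinction comes from Nim itself rather than from anything regex-specific. The sketch below uses only the standard library; ``compilePattern`` is a hypothetical stand-in for an expensive build step and is not part of the regex library's API.

```nim
import std/strutils

# Hypothetical stand-in for an expensive "compile" step.
proc compilePattern(p: string): seq[string] =
  p.split('|')

# A const initializer must be a constant expression,
# so this call is evaluated by the compiler.
const atCompileTime = compilePattern("a|b|c")

# A let binding is an ordinary runtime value, so any call
# that depends on it runs when the program runs.
let pattern = "a|b|c"
let atRunTime = compilePattern(pattern)

doAssert atCompileTime == @["a", "b", "c"]
doAssert atCompileTime == atRunTime
```

Both calls produce the same value; only the moment of evaluation differs, which is the reason the docs recommend ``const`` for flags.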
]##

import std/tables
6 changes: 6 additions & 0 deletions tests/tests2.nim
@@ -3285,6 +3285,12 @@ test "tvarflags":
check match("a\L", re2(r"a.", {regexDotAll}))
check match("a\L", re2"(?s)a.")
check(not match("a\L", re2"a."))
block: # force compile time
const rexp = re2(r"a.", {regexDotAll})
check match("a\L", rexp)
block: # force run time
var flags = {regexDotAll}
check match("a\L", re2(r"a.", flags))
block:
var m: RegexMatch2
check match("aa", re2(r"(a*)(a*)", {regexUngreedy}), m) and
