9-regular-expressions/22-regexp-sticky/article.md
+10 -8
@@ -7,17 +7,21 @@ One of common tasks for regexps is "parsing": when we get a text and analyze it
7
7
8
8
For instance, there are HTML parsers for browser pages, that turn text into a structured document. There are parsers for programming languages, like JavaScript, etc.
9
9
10
-
Writing parsers is a special area, with its own tools and algorithms, so we don't go deep in there, but there's a very common question: "What is the text at the given position?".
10
+
Writing parsers is a special area, with its own tools and algorithms, so we won't go deep into it, but there's a very common question in parsing and, more generally, in text analysis: "What kind of entity is at the given position?".
11
11
12
12
For instance, for a programming language the options can be:
13
13
- Is it a "name" `pattern:\w+`?
14
14
- Or is it a number `pattern:\d+`?
15
15
- Or an operator `pattern:[+\-/*]`?
16
16
- (a syntax error if it's not anything in the expected list)
17
17
18
-
In JavaScript, to perform a search starting from a given position, we can use `regexp.exec` with `regexp.lastIndex` property, but that's not what we need!
18
+
So, we should try to match several regular expressions and decide what's at the given position.
19
19
20
-
We'd like to check the match exactly at given position, not "starting" from it.
20
+
In JavaScript, how can we perform a search starting from a given position? Regular calls start searching from the beginning of the text.
21
+
22
+
We'd like to avoid creating substrings, as this slows down the execution considerably.
23
+
24
+
One option is to use `regexp.exec` with the `regexp.lastIndex` property, but that's not what we need, as this would search the text starting from `lastIndex`, while we only need to test the match *exactly* at the given position.
21
25
22
26
Here's a (failing) attempt to use `lastIndex`:
23
27
@@ -33,13 +37,11 @@ alert (regexp.exec(str)); // function
33
37
34
38
The match is found, because `regexp.exec` starts to search from the given position and continues through the text, successfully matching "function" later.
35
39
36
-
We could work around that by checking if "`regexp.exec(str).index` property is `5`, and if not, ignore the much. But the main problem here is performance.
37
-
38
-
The regexp engine does a lot of unnecessary work by scanning at further positions. The delays are clearly noticeable if the text is long, because there are many such searches in a parser.
40
+
We could work around that by checking whether the `regexp.exec(str).index` property is `5` and, if not, ignoring the match. But the main problem here is performance. The regexp engine does a lot of unnecessary work by scanning at further positions. The delays are clearly noticeable if the text is long, because there are many such searches in a parser.
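The workaround described above can be sketched roughly as follows. The helper name `matchAtWithGlobal` and the sample string are assumptions made for illustration and are not taken from the article's own example. Note that the engine still scans past the requested position before the result is discarded, which is the performance cost mentioned above.

```javascript
// Hypothetical helper (not from the article): emulate "match exactly at pos"
// with the g flag by discarding matches found further along in the string.
function matchAtWithGlobal(regexp, str, pos) {
  // The regexp is assumed to have the g flag; without g (or y),
  // exec ignores lastIndex and always starts from position 0.
  regexp.lastIndex = pos;
  const result = regexp.exec(str);
  // exec may have scanned ahead and matched later; accept only a match at pos.
  return result !== null && result.index === pos ? result[0] : null;
}

const str = "let varName = function"; // assumed sample text
const word = /\w+/g;

console.log(matchAtWithGlobal(word, str, 4)); // varName
console.log(matchAtWithGlobal(word, str, 3)); // null, although the engine scanned ahead and found "varName"
```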
39
41
40
42
## The "y" flag
41
43
42
-
So we've came to the problem: how to search for a match, starting exactly at the given position.
44
+
So we've come to the problem: how to search for a match exactly at the given position.
43
45
44
46
That's what the `y` flag does. It makes the regexp search only at the `lastIndex` position.
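As a minimal sketch of that behavior (the sample string and variable names are assumptions chosen for illustration), a sticky regexp either matches exactly at `lastIndex` or fails immediately, without looking further:

```javascript
const str = "let varName = function"; // assumed sample text
const regexp = /\w+/y;

// Position 3 holds a space, so there is no word exactly there.
regexp.lastIndex = 3;
const atSpace = regexp.exec(str);
// A failed sticky match resets lastIndex to 0, so we set it again below.

// Position 4 is where "varName" begins.
regexp.lastIndex = 4;
const atWord = regexp.exec(str);

console.log(atSpace);          // null
console.log(atWord[0]);        // varName
console.log(regexp.lastIndex); // 11, the position right after the match
```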
45
47
@@ -66,6 +68,6 @@ As we can see, now the regexp is only matched at the given position.
66
68
67
69
So what `y` does is truly unique, and very important for writing parsers.
68
70
69
-
The `y` flag allows to apply a regular expression (or many of them one-by-one) exactly at the given position and when we understand what's there, we can move on -- step by step examining the text.
71
+
The `y` flag allows us to test a regular expression exactly at the given position, and once we understand what's there, we can move on -- step by step examining the text.
70
72
71
73
Without the flag, the regexp engine keeps searching until it finds a match or reaches the end of the text, which takes time, especially if the text is large. So our parser would be very slow. The `y` flag is exactly the right thing here.
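To illustrate the step-by-step idea, here is a rough tokenizer sketch built on sticky regexps. The token table, the `tokenize` function and the exact patterns are assumptions for this example only. In particular, names are matched with `[A-Za-z_]\w*` rather than `\w+`, so that a number is not mistaken for a name, and whitespace is skipped.

```javascript
// Hypothetical token table: each regexp has the y flag, so it is only
// tested exactly at lastIndex and never scans ahead.
const tokenTypes = [
  ["number", /\d+/y],
  ["name", /[A-Za-z_]\w*/y],
  ["operator", /[+\-/*]/y],
  ["space", /\s+/y],
];

function tokenize(text) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let found = false;
    for (const [type, regexp] of tokenTypes) {
      regexp.lastIndex = pos;            // test exactly at the current position
      const match = regexp.exec(text);
      if (match) {
        if (type !== "space") tokens.push({ type, value: match[0], pos });
        pos = regexp.lastIndex;          // move on to the end of the match
        found = true;
        break;
      }
    }
    // Nothing from the expected list matched here: a syntax error.
    if (!found) throw new SyntaxError(`Unexpected character at position ${pos}`);
  }
  return tokens;
}

console.log(tokenize("x1 + 42"));
// [ { type: 'name', value: 'x1', pos: 0 },
//   { type: 'operator', value: '+', pos: 3 },
//   { type: 'number', value: '42', pos: 5 } ]
```

Every pattern in the table consumes at least one character, so the loop always advances. No substrings of `text` are created along the way.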