Regular expressions: using $

Hi All :slight_smile:

Hope all is well

I have a question regarding the $ character in regexes

In the following expression, why doesn’t the regex match the string below?

let myRegex = /^(a)(?!b)$/;
let myString = "an";
let result = myRegex.test(myString); // returns false

The way I understand the regex is:

  • Match first capture group if it is not followed by the pattern in negative lookahead
  • Ensure first capture group is at the beginning of the pattern
  • Whatever doesn’t match the negative lookahead must be at the end of the pattern

But what appears to be happening is the pattern inside the negative lookahead is ignored by the $ boundary assertion entirely, and it only sees the first capture group, so when there are more patterns than allowed for in the first capture group it fails to match the entire pattern.

Is this understanding correct? Do end-of-pattern assertions not work with negative lookaheads?

Thanks for taking the time to read this!

D.W.

EDIT:
On the other hand, inserting the boundary assertion inside of the negative lookahead does make this work; which is a curious result to me…

let myRegex = /^(a)(?!b$)/; // returns true when tested against "am"

Assertions do not consume what they assert, which is to say the whole parenthesized construct of an assertion is “zero-width”.

Keeping that in mind: your first regex matches “a” at the start; then it asserts that the next character is not b (it’s “n”, so the assertion passes) but does not consume any of myString; finally it attempts to match the end of the string, but the current position is still just before “n”, so $ fails.

The solution should be clear: add a dot after the lookahead, as in /^(a)(?!b).$/, so the regex consumes one more character before the end of the string.
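To make the zero-width behaviour concrete, here is a minimal sketch comparing the original regex with a dotted variant. The variable names original and withDot are just illustrative, and the extra test strings ("ab", "a") are my own additions rather than anything from the question.

```javascript
// Original pattern: "a", then a zero-width check, then end of string.
const original = /^(a)(?!b)$/;
// Variant with a dot: the dot consumes the character the lookahead only peeked at.
const withDot = /^(a)(?!b).$/;

console.log(original.test("an")); // false: after "a" the position is before "n", not at the end
console.log(original.test("a"));  // true: nothing follows "a", so (?!b) passes and $ matches
console.log(withDot.test("an"));  // true: "." consumes "n", then $ matches
console.log(withDot.test("ab"));  // false: the lookahead rejects "b"
```

Note that the original pattern can only ever match the one-character string "a", because nothing between the lookahead and $ consumes input.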

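As for the second example, it works because it matches “a” at the start, then asserts that the rest of the string does not match b$. The rest of the string (“n”, or “m” in your “am” test) satisfies that, and since nothing else is left in the pattern, the regex finishes successfully right there, having matched only “a”.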
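A short sketch of that second pattern, assuming a hypothetical variable name notEndingInB and a few extra test strings of my own, shows that the $ inside the lookahead only constrains what the lookahead rejects:

```javascript
// "a" at the start, as long as what follows is not exactly "b" then end of string.
const notEndingInB = /^(a)(?!b$)/;

console.log(notEndingInB.test("an"));  // true
console.log(notEndingInB.test("am"));  // true
console.log(notEndingInB.test("ab"));  // false: "b" followed by the end is what the lookahead forbids
console.log(notEndingInB.test("abc")); // true: "b" is not at the end, so b$ does not match

// The lookahead consumed nothing, so the overall match is just "a".
const matched = notEndingInB.exec("an")[0];
console.log(matched); // "a"
```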


Hello @chuckadams

Thank you for your quick response :slight_smile: much appreciated.

There is one part that I would like to confirm I have a clear understanding of however,

What I’ve understood from this is:

  • The regex can be seen as parsing the pattern from left to right
  • When it comes across a lookahead, it doesn’t actually progress forward to the next pattern after the instruction/assertion has been processed
  • So in a way, you can see “n” as being parsed/processed twice; once for the lookahead, and again when the regex is looking to match “$” to the end of the pattern, instead finding “n” and returning false.

Is this in the right direction?

Thanks again!


You described it pretty much exactly: a lookahead scans ahead in the text, and if the assertion fails, the match attempt at that position fails. Otherwise, the regex matcher goes back to where the assertion began and carries on matching from there.
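One way to watch the same character being examined twice is to capture it both inside a lookahead and after it. This is only an illustrative sketch: it uses a positive lookahead (?=...) rather than the negative one from the question, because a negative lookahead cannot report a capture, and the names twice and m are made up for the example.

```javascript
// Group 1 sits inside the lookahead (peek only); group 2 actually consumes a character.
const twice = /^a(?=(n))(.)/;
const m = twice.exec("an");

console.log(m[1]); // "n": seen by the lookahead without advancing
console.log(m[2]); // "n": the same character, now consumed by "."
console.log(m[0]); // "an": only "a" and the "." contributed to the match text
```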
