I was really confused about why my solution failed the following challenge’s test.
Challenge: Use lookaheads in the pwRegex to match passwords that are greater than 5 characters long, and have two consecutive digits.
Test: abc123
My solution: /(?=\w{6,})(?=\d{2})/
Suggested answer: /(?=\w{6})(?=\w*\d{2})/
Someone asked almost this same question, but I couldn’t find a clear explanation as to why the \w* was needed in the second lookahead.
So after some back and forth with ChatGPT, I got a really thorough explanation, which I hope may help someone else out. I asked it why abc123 would be matched with regex: /(?=\d{2})/ but not with: /(?=\w{6,})(?=\d{2})/
Explanation From ChatGPT
When a regex engine processes a string, it moves from left to right, attempting to find matches. At each position in the string, it evaluates the regex.
For the regex /(?=\d{2})/ and the string “abc123”:
- The regex engine starts at position 0 (before “a”). It checks if the next two characters are digits. They are not.
- It then moves to position 1 (before “b”). Again, it checks if the next two characters are digits. They are not.
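- It moves again to position 2 (before “c”). The next two characters are “c1”, so the check fails once more.
- At position 3 (before “1”), it checks and finds “12” (two consecutive digits), so it matches at this position.
- A single test stops at this first match. If the search were continued (for example with the g flag), position 4 (before “2”) would also match on “23”, and no position after that would.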
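The position-by-position walk can be reproduced in JavaScript. This is only a minimal sketch: `matchingPositions` is a hypothetical helper written for this post, not part of the challenge, and it uses the sticky flag (`y`) to force each attempt to start at one specific index, counting from 0 with “a” at index 0.

```javascript
// Hypothetical helper (not from the challenge): list every index at which
// `pattern` matches when the attempt must start exactly at that index.
function matchingPositions(pattern, text) {
  // The sticky flag ("y") anchors each attempt at lastIndex instead of
  // letting the engine slide forward on its own.
  const sticky = new RegExp(pattern.source, "y");
  const positions = [];
  for (let i = 0; i <= text.length; i++) {
    sticky.lastIndex = i; // reset before every attempt
    if (sticky.test(text)) positions.push(i);
  }
  return positions;
}

const digitPositions = matchingPositions(/(?=\d{2})/, "abc123");
console.log(digitPositions); // [ 3, 4 ]
```

Index 3 is the spot just before “1” (followed by “12”), and index 4 is the spot just before “2” (followed by “23”).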
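For the regex /(?=\w{6,})(?=\d{2})/:
- At position 0 (before “a”), the first lookahead (?=\w{6,}) checks for at least 6 consecutive word characters. It finds “abc123”, so this lookahead is satisfied.
- The engine’s position is still at 0. It then checks the second lookahead (?=\d{2}), which does NOT find two consecutive digits at the start of “abc123”. So the regex does not match at position 0.
- The engine then retries from position 1, position 2, and so on. From position 1 onward fewer than 6 characters remain, so the first lookahead fails every time. No position satisfies both lookaheads, and the overall regex does not match.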
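One way to see this is to check each lookahead separately at every index and compare the results. The sketch below is an illustration only: `positionsSatisfying` is a hypothetical helper written for this post, and the sticky flag (`y`) is used to pin each attempt to a single index.

```javascript
// Hypothetical helper (not from the challenge): indices at which `pattern`
// matches when the attempt is pinned to that exact index.
function positionsSatisfying(pattern, text) {
  const sticky = new RegExp(pattern.source, "y");
  const positions = [];
  for (let i = 0; i <= text.length; i++) {
    sticky.lastIndex = i;
    if (sticky.test(text)) positions.push(i);
  }
  return positions;
}

const password = "abc123";
const lengthOk = positionsSatisfying(/(?=\w{6,})/, password);
const digitsOk = positionsSatisfying(/(?=\d{2})/, password);
const bothOk = positionsSatisfying(/(?=\w{6,})(?=\d{2})/, password);

console.log(lengthOk); // [ 0 ]
console.log(digitsOk); // [ 3, 4 ]
console.log(bothOk);   // []
```

The two lookaheads succeed at different indices, and since both must hold at the same index, the combined pattern has nowhere to match.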
The key thing to understand is that after a lookahead (or lookbehind) is evaluated, the regex engine’s position in the string does NOT advance. If the regex has further components or further lookaheads, they’re evaluated from the same position.
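A small check of the “position does not advance” point, using only the standard `RegExp.prototype.exec`. The variable names are made up for this illustration.

```javascript
// A lookahead on its own matches a position, not characters:
// the match is found where the digits begin, but its text is empty.
const lookaheadOnly = /(?=\d{2})/.exec("abc123");
console.log(lookaheadOnly.index);     // 3
console.log(lookaheadOnly[0].length); // 0

// Whatever follows the lookahead starts from that same position,
// so \d+ begins consuming at the very digits the lookahead inspected.
const lookaheadThenConsume = /(?=\d{2})\d+/.exec("abc123");
console.log(lookaheadThenConsume.index); // 3
console.log(lookaheadThenConsume[0]);    // 123
```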
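So, the behavior is consistent between the two regex examples. The first matches because the engine can slide along the string until it reaches a position that is directly followed by two digits. The second does not match because both lookaheads must succeed at the same position: only position 0 has six word characters ahead of it, and the two characters at that position are “ab”, not digits. This is also why the suggested answer needs the \w* in the second lookahead: it lets that lookahead skip over any leading word characters and find the two digits later in the string, while still starting from the same position.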
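To tie this back to the challenge, here is a rough side-by-side of my pattern and the suggested answer. `myRegex` is just a name I am using here for my attempt, and the sample strings other than “abc123” are my own, not the challenge’s test cases.

```javascript
const myRegex = /(?=\w{6,})(?=\d{2})/;    // my attempt
const pwRegex = /(?=\w{6})(?=\w*\d{2})/;  // suggested answer

// Sample inputs chosen for illustration only.
const candidates = ["abc123", "abcdef", "ab12", "12abcd"];

const results = candidates.map((candidate) => ({
  candidate,
  mine: myRegex.test(candidate),
  suggested: pwRegex.test(candidate),
}));

for (const { candidate, mine, suggested } of results) {
  console.log(candidate, mine, suggested);
}
// abc123 false true
// abcdef false false
// ab12 false false
// 12abcd true true
```

My version only passes when the two digits sit right at the position where the length check succeeds, as in “12abcd”. The \w* in the suggested answer allows the digits to appear anywhere after that position.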