Ultimate regexp things like lookaround

Groups & Lookaround
(abc)	capture group
\1	backreference to group #1
(?:abc)	non-capturing group
(?=abc)	positive lookahead
(?!abc)	negative lookahead

When should I use those last three options? What are they for? I've never seen them in an example, I only just found out they exist. Can someone explain when to use the ones with a ? in them?

Thanks!

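So those last three are fairly specialized. There are times you want to be able to, say, get the word following another word. A lookahead looks tempting for that, like this one:

/(?=https?:\/\/)([a-z0-9.\-]*)/i

What that says is “at a position where http:// or https:// comes next, capture as many letters, numbers, periods or hyphens as you can”. The catch is that a lookahead only checks, it doesn't consume anything, so the capture group starts at the very same position as the lookahead. For the string http://foo-dawg.schmoogle.org/blitzen/index.php it captures http and stops at the colon, not foo-dawg.schmoogle.org. Lookahead fits when the condition comes after the thing you want: /[a-z]+(?=:\/\/)/i matches the http in front of the ://, and we don't capture the ://, we simply look for it (our lookahead) and match based on that.

To get the host name that follows the http://, there's lookbehind, which matches text that is preceded by some criteria: /(?<=https?:\/\/)[a-z0-9.\-]+/i matches foo-dawg.schmoogle.org. Lookbehind was added to JavaScript in ES2018 and older browsers don't support it, but it's pretty handy. Without it, match the prefix normally and keep only the group: /https?:\/\/([a-z0-9.\-]+)/i.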
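If you want to check what each flavour really matches, here is a rough sketch you can paste into a console. The variable names are made up for illustration, and the lookbehind line assumes an engine with ES2018 support.

```javascript
// Sample URL used in this thread.
const url = 'http://foo-dawg.schmoogle.org/blitzen/index.php';

// Lookahead first: the group starts at the same position as the lookahead,
// so it captures the scheme itself and stops at the colon.
const viaLookahead = url.match(/(?=https?:\/\/)([a-z0-9.\-]*)/i);

// Lookbehind: match only text that is preceded by the scheme.
const viaLookbehind = url.match(/(?<=https?:\/\/)[a-z0-9.\-]+/i);

// No lookaround: consume the scheme, keep only capture group 1.
const viaGroup = url.match(/https?:\/\/([a-z0-9.\-]+)/i);

console.log(viaLookahead[1]);  // http
console.log(viaLookbehind[0]); // foo-dawg.schmoogle.org
console.log(viaGroup[1]);      // foo-dawg.schmoogle.org
```

The plain capture group is usually the most portable choice. Lookbehind mostly pays off when you need the match itself (for example in a replace) to exclude the prefix.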


Thank you. But what about negative lookahead? Does it match in every case?

Like (?!https?:\/\/)([a-z0-9\.\-]*)? For some reason, when I type https:// it drops the first h and matches the rest.

Thanks!

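Run that through an online regex tester (something like regexr, for example) and watch out for one small typo: (?!https?:\/\/) is not the same as (?!=https?:\/\/). The first is a negative lookahead for http:// or https://. The second is a negative lookahead for a literal = followed by http:// or https://, which is almost never what you want. As for the missing h, that's expected: the negative lookahead rejects the position right before https://, so regex moves one character along, where ttps:// no longer looks like http, and matches from there.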
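Here is a small sketch of that negative lookahead in action. The sample strings and the anchored variant (notUrl) are my own additions for illustration, not something from the cheat sheet.

```javascript
// The pattern from the question, as a regex literal.
const re = /(?!https?:\/\/)([a-z0-9.\-]*)/i;

// Position 0 is rejected (https:// follows), position 1 is accepted.
console.log('https://foo.org'.match(re)[0]); // ttps

// Nothing to reject here, so the match starts at position 0.
console.log('foo.org'.match(re)[0]); // foo.org

// Anchoring with ^ stops the engine from sliding to a later position,
// so the lookahead can actually veto the whole string.
const notUrl = /^(?!https?:\/\/)[a-z0-9.\-]+/i;
console.log(notUrl.test('https://foo.org')); // false
console.log(notUrl.test('foo.org'));         // true
```

Without an anchor, a negative lookahead only rules out individual starting positions. The engine is free to try the next one, which is why the first h goes missing.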

I’ll try to break it down for more clarity. There are two essential terms you need to know about regex:

  1. Character
  2. Cursor

An example would be any text editor: you have characters and you can “traverse” them with the cursor, which never actually sits on a character itself, but either before or after it, right? Regex does its traversal the same way:

GOTO Cursor position 0
GOTO Cursor position 1 processing character in between
...etc

Now, lookahead means that at any cursor position, regex will peek at the upcoming characters without moving the cursor. IF they satisfy the condition (positive: the pattern is there, negative: the pattern is not there), matching carries on from that same cursor position with nothing consumed. IF not, the attempt at this position fails, and regex jumps to the next cursor position and tries again.

Example:

const passwd = '01strongpa33';
const re = /(?=.{6,})/;

// Cursor positions to be checked (a | marks each position where the lookahead succeeds):
// 0 1 2 3 4 5 6 7 8 9 10 11
// |0|1|s|t|r|o|n g p a  3  3

// The lookahead fails at every cursor position after 6, because fewer than 6 characters remain from there on
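To see those cursor positions for yourself, something like the following should do. It is only a sketch: hits and strong are names I made up, and the combined password check at the end is an assumed use case, not part of the original example.

```javascript
const passwd = '01strongpa33';

// With the g flag, matchAll reports every position where the (empty)
// lookahead match succeeds; m.index is the cursor position.
const hits = [...passwd.matchAll(/(?=.{6,})/g)].map((m) => m.index);
console.log(hits); // [ 0, 1, 2, 3, 4, 5, 6 ]

// Lookaheads consume nothing, so several of them can be stacked
// at the same cursor position to combine conditions.
const strong = /^(?=.{6,})(?=.*\d)/; // at least 6 chars AND a digit
console.log(strong.test(passwd));    // true
console.log(strong.test('abc12'));   // false (too short)
console.log(strong.test('abcdefg')); // false (no digit)
```

Stacking works because each lookahead leaves the cursor where it found it, so the next condition starts from the same spot.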

A non-capturing group has nothing to do with lookaheads, it’s just a group that will not be captured. The most obvious use case is when you need to nest groups:

/((?:.+)?)/
// Translates as "Capture a group of characters, that might not be there, in a greedy way"
// This construct has to be nested, because if you put the question mark right after the + as in (.+?), regex will match in a lazy way instead
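A quick sketch of the difference, with made-up variable names. The last part shows another common reason to reach for (?:...): grouping without shifting the capture group numbers.

```javascript
// Optional greedy group (needs the nested non-capturing group)
// versus a lazy quantifier.
const greedyOptional = /((?:.+)?)/;
const lazy = /(.+?)/;

console.log('abc'.match(greedyOptional)[1]); // abc
console.log('abc'.match(lazy)[1]);           // a

// The optional version still matches an empty string; the lazy one does not.
console.log(''.match(greedyOptional)[1] === ''); // true
console.log(''.match(lazy));                     // null

// (?:...) groups without capturing, so the host stays in group 1.
const m = 'https://foo.org'.match(/(?:https?):\/\/([a-z0-9.\-]+)/i);
console.log(m.length); // 2 (full match + one capture group)
console.log(m[1]);     // foo.org
```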