If you have a pattern, you can quantify it using a variety of symbols.
- * means 0 or more
- + means 1 or more
- ? means 0 or 1
These are all just shorthand, though. Here’s the most general way of quantifying a pattern:
{n,m}
What does this do? It matches between n and m times (inclusive). Let’s try to write out the previous shorthands:
? is {0,1}
* is {0,}
+ is {1,}
Wait, what’s that? I forgot to write my second number?
If you leave out m, it defaults to infinity. In other words, there is no upper limit when you leave it out.
But there’s one more trick. You can even shorten stuff like {3,3}:
{3,3} is {3}
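Here’s a quick way to check those equivalences yourself. This is just a sketch using Python’s re module; the ab…c patterns and the sample strings are made up for illustration.

```python
import re

# Each shorthand quantifier paired with its explicit {n,m} spelling.
pairs = [
    ("ab?c", "ab{0,1}c"),
    ("ab*c", "ab{0,}c"),
    ("ab+c", "ab{1,}c"),
    ("ab{3}c", "ab{3,3}c"),
]
samples = ["ac", "abc", "abbc", "abbbc", "abbbbc"]


def matches(pattern, texts):
    """Return the texts that the pattern matches in full."""
    return [t for t in texts if re.fullmatch(pattern, t)]


for short, explicit in pairs:
    # Both spellings should accept exactly the same samples.
    assert matches(short, samples) == matches(explicit, samples)
    print(short, "and", explicit, "accept", matches(short, samples))
```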
Here’s how you’d use this stuff:
All mean the same:
colou?r
colou{0,1}r
(color|colour)
Each of those matches both the British and American spellings of the word “colour”.
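If you want to see it in action, here’s a small sketch in Python (the word list is just something I made up to test with). It writes the {0,1} form with the trailing r so that it matches the whole word:

```python
import re

# Three spellings of the same idea: an optional "u" before the final "r".
patterns = [r"colou?r", r"colou{0,1}r", r"(color|colour)"]
words = ["color", "colour", "colouur", "colr"]

# For each pattern, collect the words it matches in full.
results = [[w for w in words if re.fullmatch(p, w)] for p in patterns]

for pattern, accepted in zip(patterns, results):
    print(pattern, "accepts", accepted)
```

All three patterns accept color and colour and reject the other two words.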
Now, to answer your question
Is there a RegEx expression to search for multiple repeats of the same capturing group? Or do you have to give the \1 every time you want to see ( a ) repeated?
It’s a good question! And you now know the tools to do exactly what you want. Let’s say that you want to match a \1 three times. Well, you could write that out in a few different ways:
All the same:
(\d)\1\1\1
(\d)\1{3,3}
(\d)\1{3}
The last one is the best because it’s both more readable and more concise. Let’s say, however, you want to match a space character too. And let’s say you actually want five \1s, with spaces in between. We have to special-case the last one because it shouldn’t have a space after it.
All the same:
(\d)\1\s\1\s\1\s\1\s\1
(\d)(\1\s){4,4}\1
(\d)(\1\s){4}\1
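To convince yourself that each set of three really is interchangeable, you can run something like this. It’s a sketch in Python; the accepts helper and the test strings are my own, not anything standard.

```python
import re

# One captured digit followed by three repeats of it.
four_same = [r"(\d)\1\1\1", r"(\d)\1{3,3}", r"(\d)\1{3}"]

# One captured digit, then five repeats with a space after the first four.
spaced = [r"(\d)\1\s\1\s\1\s\1\s\1", r"(\d)(\1\s){4,4}\1", r"(\d)(\1\s){4}\1"]


def accepts(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


for p in four_same:
    print(p, accepts(p, "7777"), accepts(p, "7778"))

for p in spaced:
    print(p, accepts(p, "77 7 7 7 7"), accepts(p, "77 7 7 7 8"))
```

Note that the captured digit itself counts as one occurrence, so “\1 three times” means four identical digits in a row.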
There’s actually another thing you can do. See, by grouping our pattern in parentheses, we also assigned something to \2. We don’t want to do that! We can avoid this by putting ?: at the beginning of the group. This tells the RegEx engine that this isn’t a capturing group, but that it should still group the pattern together like parentheses do.
(\d)(?:\1\s){4}\1
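You can see the difference by looking at what each version captures. Here’s a sketch in Python, where a non-capturing group is written (?:...); the test string is just an example.

```python
import re

text = "77 7 7 7 7"

# The plain parentheses create a second capturing group.
capturing = re.fullmatch(r"(\d)(\1\s){4}\1", text)

# (?:...) groups the same sub-pattern without capturing it.
non_capturing = re.fullmatch(r"(\d)(?:\1\s){4}\1", text)

print(capturing.groups())      # two captured values
print(non_capturing.groups())  # only the digit is captured
```

Both versions match the same text; the only change is that the second one leaves \2 unassigned.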
You should also be careful. Two notes:
- These quantifiers match the pattern freshly each time. This means that \d+ will match 543 (not just 111, 222, etc.).
- If you use a capturing group and quantify it, the stored value is the last one matched. That means (\d)+\1 will match 1233, but not 1231.
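Both notes are easy to check. Here’s a sketch in Python (the variable names are mine):

```python
import re

# Note 1: \d+ re-matches \d each time, so the digits may all differ.
fresh = re.fullmatch(r"\d+", "543")

# Note 2: a quantified capturing group keeps only its last repetition.
last_kept = re.fullmatch(r"(\d)+\1", "1233")

# No digit in 1231 is immediately followed by itself, so nothing matches.
no_match = re.search(r"(\d)+\1", "1231")

print(fresh is not None)   # True
print(last_kept.group(1))  # 3
print(no_match)            # None
```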
If you have any more questions, please do ask.