Hello everyone, I'm finding it a bit hard to understand the difference between + and * in regular expressions.
I would be very happy if somebody could make it a bit clearer for me.
Thanks in advance.
Do you have a specific example of an expression using + and/or * whose results you do not understand?
I don't really have an example, but I know + is for one or more and * is for zero or more. That is the part that I don't get.
Let's say we have the following string.
var str = "abc";
If I want to test if str has zero or more digits (any number 0 through 9), then I would write:
/[0-9]*/.test(str)
The above would evaluate to true: str contains no digits at all, and zero digits still satisfies "zero or more digits", so the pattern matches an empty string.
If I want to test if str has 1 or more digits, then I would write:
/[0-9]+/.test(str)
The above would evaluate to false, because there are not 1 or more digits in str.
Thanks a lot. Can you use this example to make it clearer?
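To see both quantifiers side by side, here is a small sketch. Only "abc" comes from the example above; the other sample strings are made up for illustration.

```javascript
// Sample inputs: only "abc" is from the post above, the others are illustrative.
const samples = ["abc", "abc123", ""];

const results = samples.map((s) => ({
  input: s,
  zeroOrMore: /[0-9]*/.test(s), // an empty match is allowed, so this is true for any string
  oneOrMore: /[0-9]+/.test(s),  // true only when at least one digit is present
}));

console.log(results);
```

Because * accepts an empty match, a test with /[0-9]*/ on its own cannot tell you whether a string contains digits; + is the one to use for that.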
// example crowd gathering
let crowd = 'P1P2P3P4P5P6CCCP7P8P9';
let reCriminals = /C+/; // Change this line
let matchedCriminals = crowd.match(reCriminals);
console.log(matchedCriminals);
why is matchedCriminals = ["CCC"]
when /C+/ is used
but when /C*/ is used matchedCriminals = [ "" ]
It might be easier to understand what is happening if we add the global flag "g" on the end of the regular expression. I am going to shorten the string to make it easier to display what I want to show you. When used with match, the global flag finds all matches instead of just the first one.
let crowd = 'P1P2CCCP3P4';
let reCriminals = /C+/g; // Change this line
let matchedCriminals = crowd.match(reCriminals);
console.log(matchedCriminals);
The above displays ["CCC"], because it is the only instance of one or more consecutive C characters.
If you were to use /C*/g it would return [ "", "", "", "", "CCC", "", "", "", "", "" ]
Why? Because the expression is evaluated across the entire string, it finds 10 matches of zero or more C characters. At the first position the next character is "P", so the regex matches zero C characters there and returns an empty string. The same thing happens at the next 3 positions ("1", "P", "2"). Then we reach "CCC", which is also zero or more C characters and is matched as a whole. After that come 5 more empty matches: one at each of the 4 remaining characters ("P", "3", "P", "4") and one at the very end of the string.
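If it helps, here is a rough sketch that prints where each match was found. It assumes the shortened string 'P1P2CCCP3P4' and uses matchAll instead of match so that each result keeps its index.

```javascript
const crowd = 'P1P2CCCP3P4';

// matchAll requires the g flag and yields one match object per match,
// including the index in the string where that match starts.
const matches = [...crowd.matchAll(/C*/g)].map((m) => ({
  index: m.index,
  text: m[0],
}));

console.log(matches);
```

Every entry except the one at index 4 has an empty text, and the last entry sits at index 11, which is the end of the string.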
Now, if you drop the global flag "g" like in your original example, we will get [ "", index: 0, input: "P1P2CCCP3P4" ]. The "" is the first instance of zero or more C characters, found at index 0, right before the first "P".
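As a quick comparison without the global flag, this sketch (again assuming the shortened string) shows that both patterns return only their first match, and that those first matches sit at different places.

```javascript
const crowd = 'P1P2CCCP3P4';

// Without the g flag, match returns only the first match, plus its index.
const star = crowd.match(/C*/); // first place where zero or more C's match
const plus = crowd.match(/C+/); // first place where one or more C's match

console.log(JSON.stringify(star[0]), star.index);
console.log(JSON.stringify(plus[0]), plus.index);
```

/C*/ is satisfied immediately at index 0 with an empty match, while /C+/ has to keep scanning until it reaches the first C at index 4.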
Everything is much clearer now.
Thanks a lot for your time.