Difference between + and * in regular expressions

#1

Hello everyone, I’m finding it a bit hard to understand the difference between + and * in regular expressions.
I would be very happy if somebody could make it a bit clearer for me.
Thanks in advance.


#2

Do you have a specific example of an expression using + and/or * whose results you do not understand?


#3

I don’t really have an example but I know + is for one or more and * is for zero or more. That is the part that I don’t get.


#4

Let’s say we have the following string.

var str = "abc";

If I want to test if str has zero or more digits (any number 0 through 9), then I would write:

/[0-9]*/.test(str)

The above would evaluate to true, because there are no digits at all, and zero digits still satisfies the requirement of zero or more digits.

If I want to test if str has 1 or more digits, then I would write:

/[0-9]+/.test(str)

The above would evaluate to false, because there are not 1 or more digits in str.
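
To see both quantifiers side by side, here is a small sketch that runs the same two patterns against "abc" and against a second string, "abc123". The second string and the variable names are my own additions for illustration; they are not part of the example above.

```javascript
// Same two patterns as above: * means zero or more, + means one or more.
const zeroOrMore = /[0-9]*/;
const oneOrMore = /[0-9]+/;

const results = {
  starOnLetters: zeroOrMore.test("abc"),    // true: zero digits is allowed
  plusOnLetters: oneOrMore.test("abc"),     // false: at least one digit is required
  starOnMixed: zeroOrMore.test("abc123"),   // true
  plusOnMixed: oneOrMore.test("abc123"),    // true: "123" is one or more digits
};

console.log(results);
```

Note that without anchors, a pattern made only of a starred item returns true from test for any string, since an empty match is always available.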


#5

Thanks a lot. Can you use this example to make it clearer?

// example crowd gathering
let crowd = 'P1P2P3P4P5P6CCCP7P8P9';
let reCriminals = /C+/; // Change this line
let matchedCriminals = crowd.match(reCriminals);
console.log(matchedCriminals);

Why is matchedCriminals = ["CCC"] when /C+/ is used,
but matchedCriminals = [""] when /C*/ is used?


#6

It might be easier to understand what is happening if we add the global flag 'g' to the end of the regular expression. I am going to shorten the string to make it easier to display what I want to show you. The global flag, when used with match, finds all matches instead of just the first one.

let crowd = 'P1P2CCCP3P4'; // shortened version of the original string
let reCriminals = /C+/g; // Change this line
let matchedCriminals = crowd.match(reCriminals);
console.log(matchedCriminals);

The above displays ['CCC'], because it is the only run of one or more consecutive C characters.

If you were to use /C*/g, it would return ['', '', '', '', 'CCC', '', '', '', '', ''].

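Why? Because as the expression is evaluated across the entire string, it finds 10 matches of zero or more C characters. The first match is at the letter 'P': there are zero C characters at that position, so the match is the empty string ''. The same thing happens at each of the next 3 characters ('1', 'P', '2'). Then we reach 'CCC', which also counts as zero or more C characters. After that come 5 more empty matches: one at each of the 4 remaining characters ('P', '3', 'P', '4') and one at the very end of the string, because there are no more C characters.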
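
If it helps to see where each of those matches sits, here is a rough sketch using matchAll on the shortened string 'P1P2CCCP3P4'. The variable names are only for illustration. It pairs every match with the index where it was found.

```javascript
const crowd = 'P1P2CCCP3P4';

// matchAll requires the global flag and yields one result per match,
// each carrying the index at which the match starts.
const positions = [...crowd.matchAll(/C*/g)].map((m) => [m.index, m[0]]);

console.log(positions);
// [[0,''],[1,''],[2,''],[3,''],[4,'CCC'],[7,''],[8,''],[9,''],[10,''],[11,'']]
```

Index 11 is one past the last character, which is where the final empty match at the end of the string comes from.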

Now, if you drop the global flag 'g' like in your original example, we get ['', index: 0, input: 'P1P2CCCP3P4']. The '' is the first match of zero or more C characters: an empty match at index 0, right before the first 'P'.
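
As a quick side-by-side without the global flag, the sketch below (again on the shortened string, with variable names chosen just for this illustration) shows that /C+/ has to skip ahead until it finds a C, while /C*/ is already satisfied by an empty match at the very start.

```javascript
const crowd = 'P1P2CCCP3P4';

const plusMatch = crowd.match(/C+/); // must find at least one C
const starMatch = crowd.match(/C*/); // zero C characters is enough

console.log(plusMatch[0], plusMatch.index); // CCC 4
console.log(JSON.stringify(starMatch[0]), starMatch.index); // "" 0
```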


#7

Everything is much clearer now.
Thanks a lot for your time.