SOLVED - Regex Lesson help - zero or more times and lazy matching

The following is an exchange between me and @JM-Mendez. He will post his replies here; this post exists so that other people can benefit from it.

Could you explain these two lessons to me? Please?
I had posted it on the forum but I didn’t get a proper explanation.

https://learn.freecodecamp.org/javascript-algorithms-and-data-structures/regular-expressions/match-characters-that-occur-zero-or-more-times/
What does “zero or more times” mean? Does that mean that if I use the pattern “g”, it would still return a match even if g isn’t present, because of “zero or more times”? Could you give an example? That would be helpful.

https://learn.freecodecamp.org/javascript-algorithms-and-data-structures/regular-expressions/find-characters-with-lazy-matching/
I don’t get why or how this is helpful.

#1
The asterisk * only specifies a quantity: zero or more of whatever comes right before it. On its own, it doesn’t say which character to match.
For example,

/g*/ reads as
    give me the match with zero or more 'g' characters.
It matches:
    ''
    'g'
    'gg'
    'ggg'
    ...

Notice that it matches the empty string, because an empty string contains zero ‘g’ characters.
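To make the list above concrete, here is a minimal sketch that runs /g*/ against a few strings using the standard String.prototype.match method. The variable names are made up for illustration only.

```javascript
// Hypothetical variable names; match() returns an array whose first
// element is the matched text.
const zeroOrMoreG = /g*/;

const fromEmpty = "".match(zeroOrMoreG)[0];     // zero 'g' characters
const fromOne = "g".match(zeroOrMoreG)[0];      // one 'g'
const fromThree = "ggg".match(zeroOrMoreG)[0];  // three 'g' characters
const fromNoG = "hello".match(zeroOrMoreG)[0];  // no 'g' at all, still a match

console.log(JSON.stringify([fromEmpty, fromOne, fromThree, fromNoG]));
// prints ["","g","ggg",""]
```

Note that "hello" does not fail: the pattern is satisfied by the empty string at the very start, which is exactly the “zero times” case.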

#2
Lazy matching is helpful when you want the shortest match that still satisfies the pattern.
For example,

/g.+?f/ reads as
     give me the shortest match of g
     followed by one or more of any character
     followed by f
It matches:
     "gulf" in the string "gulfwar kickoff"

The greedy matching version, /g.+f/, will match the whole string “gulfwar kickoff” because it searches for the longest match.

Obviously, if lazy matching suffices, you don’t want to use a greedy match that runs over the whole string.
i.e., if lazy matching does the job, prefer it over greedy matching.
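The difference is easy to check side by side. This is a small sketch, assuming the greedy pattern is written as /g.+f/ (the same pattern as the lazy one, minus the ?); the variable names are illustrative.

```javascript
const text = "gulfwar kickoff";

// Lazy: .+? grows one character at a time and stops at the first 'f' it can reach.
const lazyMatch = text.match(/g.+?f/)[0];

// Greedy: .+ grabs as much as possible, then backs off until an 'f' follows.
const greedyMatch = text.match(/g.+f/)[0];

console.log(lazyMatch);   // prints "gulf"
console.log(greedyMatch); // prints "gulfwar kickoff"
```

Both patterns start at the same ‘g’; they only differ in where the match ends.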


The * means that the character before it does not need to be present. But if it is, then it can repeat any number of times.

So yes, if your regex is /go*/ it will match a single ‘g’, because o* allows zero occurrences of ‘o’. If any ‘o’ characters do follow the ‘g’, the regex is greedy and will match as many of them as it can.
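A quick sketch of /go*/ in action, with made-up test strings and variable names. It shows the zero case, the greedy case, and the case where the required ‘g’ is missing.

```javascript
const goPattern = /go*/;

const justG = "g".match(goPattern)[0];          // zero 'o' characters is fine
const manyOs = "gooooal".match(goPattern)[0];   // greedy: takes every 'o' after the 'g'
const noOs = "gut".match(goPattern)[0];         // 'u' is not 'o', so only 'g' matches
const noG = "over".match(goPattern);            // the 'g' itself is required

console.log(justG);  // prints "g"
console.log(manyOs); // prints "goooo"
console.log(noOs);   // prints "g"
console.log(noG);    // prints null
```

Only the ‘o’ is optional here. The ‘g’ has no quantifier, so a string without it does not match at all.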

A real world use case would be this:

Suppose you want to match user-inputted filenames from a specific department, which have to be formatted like this:

{dept}-{filename}.txt

Since you’d want to match the dept ‘supply’ and the filename ‘testing’, your regex looks like this:

/supply-testing/g

But this only matches that exact string. What if they instead wrote ‘supp-testing’ or ‘supplies-testing’? You’d have to match zero or more letters after ‘supp’.

Now your regex looks like this:

/(supp)\w*-testing/

This lets us match supp exactly, while giving us the flexibility to match any word characters after it, but only if they’re present.
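Here is a rough sketch of that pattern checked against a few candidate names. The list of names (including ‘sales-testing’ as a non-matching case) and the variable names are assumptions added for illustration.

```javascript
const deptFile = /(supp)\w*-testing/;

// Hypothetical inputs: three spellings of the department, plus one other department.
const names = ["supp-testing", "supply-testing", "supplies-testing", "sales-testing"];

// test() returns true when the pattern is found anywhere in the string.
const results = names.map((name) => deptFile.test(name));

console.log(JSON.stringify(results));
// prints [true,true,true,false]
```

Keep in mind that \w matches digits and underscores as well as letters, and that the pattern is not anchored, so it also matches when extra text surrounds the name.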

There are a few reasons the ? is helpful, depending on the usage:

  • if the ? is preceded by a quantifier, e.g. * or +
    • this makes the quantifier lazy, so it matches as few characters as possible.
  • if the ? is not preceded by a quantifier, e.g. * or +
    • this means it’s an optional match.
    • colou?r matches color and colour
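The two roles of ? can be sketched like this. The string "titanic" and the variable names are assumptions chosen for illustration; colou?r is the pattern from the list above.

```javascript
// Role 1: ? on its own makes the preceding character optional.
const optionalU = /colou?r/;
const american = optionalU.test("color");   // the 'u' is absent
const british = optionalU.test("colour");   // the 'u' is present once
const doubled = optionalU.test("colouur");  // two 'u' characters are too many

// Role 2: ? right after a quantifier makes that quantifier lazy.
const greedy = "titanic".match(/t[a-z]*i/)[0];   // runs to the last 'i'
const lazy = "titanic".match(/t[a-z]*?i/)[0];    // stops at the first 'i'

console.log(american, british, doubled); // prints true true false
console.log(greedy);                     // prints "titani"
console.log(lazy);                       // prints "ti"
```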