Complex split using Regex

Hello,

I am working on a discrete math parser to turn equations into abstract syntax trees.

"!!(a)"

To do this, I must split the string above into the following:
["!!(", "a", ")"]

If an exclamation point is before an open parenthesis or letters, then split at that point.

I created the following regex (only a part of it, excluding operators).

let equation = "!!(a)".split(/\!*\(|\)|\!*[a-z]+/gi);

If ( starts with zero or more !
OR
is a )
OR
is a letter starting with zero or more !

The issue is that this returns empty strings in an array, as it is splitting and removing the element rather than capturing it.

["", "", "", ""] 

How can I tell split to split between these elements rather than on the elements themselves? Is there a better way to do this? If I can split like this, can it be made modular so that operators can be added later?

Hi, String.split takes an input and uses it as a separator. This means that "I am a sentence.".split(" ") gives you ["I", "am", "a", "sentence."], and the same thing happens when you pass a regex instead of a string. Wherever JavaScript finds a regex match, that's where it splits the string, and the matched text is dropped from the result.
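To make the separator behaviour concrete, here is a minimal sketch using the pattern from the question. It also shows one possible workaround that is not mentioned above: when the separator regex contains a capturing group, split() includes the captured text in its result, so the leftover empty strings can be filtered out. The variable names are only illustrative.

```javascript
// With a plain regex separator, every match is consumed and only the
// (empty) text between the matches is returned.
const withoutGroup = "!!(a)".split(/!*\(|\)|!*[a-z]+/gi);
console.log(withoutGroup); // ["", "", "", ""]

// Wrapping the whole pattern in a capturing group makes split() keep the
// matched text; the empty strings between matches are then filtered out.
const withGroup = "!!(a)"
  .split(/(!*\(|\)|!*[a-z]+)/gi)
  .filter((piece) => piece !== "");
console.log(withGroup); // ["!!(", "a", ")"]
```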

What you are looking for is String.match, which returns all regex matches in a string as long as the regex has the global flag.
Here is an example
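A minimal sketch of that approach, reusing the pattern from the question. The second half is only an assumption about how operators could be added in a modular way: the tokenParts array and the operator set [∪∩⊕-] are hypothetical and should be adjusted to the real grammar. Note that match() returns null rather than an empty array when nothing matches.

```javascript
// Same pattern as in the question, but used with match() instead of split().
const tokenPattern = /!*\(|\)|!*[a-z]+/gi;
const tokens = "!!(a)".match(tokenPattern);
console.log(tokens); // ["!!(", "a", ")"]

// Hypothetical modular version: each token kind is one entry in an array,
// so adding operators means adding one more entry.
const tokenParts = [
  "!*\\(",     // open parenthesis, optionally preceded by !
  "\\)",       // close parenthesis
  "!*[a-z]+",  // variable name, optionally preceded by !
  "[∪∩⊕-]",    // assumed operator set (not from the original regex)
];
const modularPattern = new RegExp(tokenParts.join("|"), "gi");
const tokensWithOperators = "!(a∪b)".match(modularPattern);
console.log(tokensWithOperators); // ["!(", "a", "∪", "b", ")"]
```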


What should the result be after the operation (splitting, regex, etc.)? Basically, what are you hoping for it to return?

How would I specify that every character in the string must come from a small set of allowed characters?

I tried

/(?![a-zΩ∅∪∩⊕!()\s-])/gi

The \s should match whitespace.

However, this has no effect when I call .test() on a string such as “|abc”.

I’m slightly unclear on what exactly you want here. One thing to note: if the characters within the [] are meant to replace the [a-z] of your original regex, then you shouldn’t enclose them in a negative lookahead.
One website I always recommend for figuring out regexes quickly is https://regex101.com/. The panel on the right explains what the regex you wrote will do, so you can compare that with what you think it should do. You can also write test cases and watch the regex work in real time.

Sorry, this is for a separate expression for the syntax error checker of the parser.

I want to test if the string input only contains certain characters.

I tried using a negative lookahead to check that the string does not contain anything other than the following characters, via the regex:

/(?![a-zΩ∅∪∩⊕!()\s-])/gi

However, this does not seem to be detecting the “|” inside of the equation “|abc” when using the test method.

That could be because | is not in the [] part of your regex.
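As a side note on why .test() seems to have no effect here, the following is a minimal sketch, not taken from the posts above. A negative lookahead is zero-width and this regex is not anchored, so it succeeds at any position whose next character is outside the set, including the very end of the string. That means it returns true for valid and invalid input alike. The g flag is left out to keep the example free of lastIndex state.

```javascript
// The regex from the question, without the g flag.
const lookahead = /(?![a-zΩ∅∪∩⊕!()\s-])/i;

// It matches an empty string at the first position NOT followed by an
// allowed character. For "|abc" that is position 0 (before "|").
console.log(lookahead.test("|abc"));   // true
console.log("|abc".search(lookahead)); // 0

// For "abc" every character is allowed, but the end of the string is not
// followed by anything, so the lookahead still succeeds there.
console.log(lookahead.test("abc"));    // true
console.log("abc".search(lookahead));  // 3
```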

How would I re-write this regex so it makes sure the entire string only has those characters? Would it be best just to use a loop?

If you are doing an invalid character check, you can use a simple regex like /[Ω∅∪∩⊕]/gi, where “Ω∅∪∩⊕” are the invalid characters. Then call .test on the user input. If it returns true, then you know the user input contains invalid characters.
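One caveat worth sketching when .test is used this way: a regex with the g flag remembers where its last match ended (lastIndex), so calling .test repeatedly on the same regex object can alternate between true and false. The sample input "a∪b" is only illustrative. Dropping the g flag avoids the problem for a simple yes/no check.

```javascript
// With the g flag, test() resumes from lastIndex on the next call.
const statefulCheck = /[Ω∅∪∩⊕]/gi;
const firstCall = statefulCheck.test("a∪b");  // true, lastIndex is now 2
const secondCall = statefulCheck.test("a∪b"); // false, search started at "b"
console.log(firstCall, secondCall); // true false

// Without the g flag, every call starts from the beginning of the string.
const statelessCheck = /[Ω∅∪∩⊕]/i;
console.log(statelessCheck.test("a∪b")); // true
console.log(statelessCheck.test("a∪b")); // true
```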

For a valid character check, you can do something like /^[\w]*$/gi, which makes sure the user input contains only a-zA-Z0-9_ from beginning to end.
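Applied to the character set from the question, the same idea might look like the sketch below. The names allowedOnly, firstDisallowed and findInvalidCharacter are hypothetical, and the g flag is omitted because only a yes/no answer is needed. Because of the *, an empty string also passes the check; use + instead if empty input should be rejected.

```javascript
// Anchored character class: the whole string must consist of allowed characters.
const allowedOnly = /^[a-zΩ∅∪∩⊕!()\s-]*$/i;

// Negated class: matches the first character that is NOT allowed.
const firstDisallowed = /[^a-zΩ∅∪∩⊕!()\s-]/i;

// Hypothetical helper for a syntax error message: reports the offending
// character and its position, or null when the input is clean.
function findInvalidCharacter(input) {
  const match = input.match(firstDisallowed);
  return match === null ? null : { character: match[0], index: match.index };
}

console.log(allowedOnly.test("!(a ∪ b)"));  // true
console.log(allowedOnly.test("|abc"));      // false
console.log(findInvalidCharacter("a|bc"));  // { character: "|", index: 1 }
console.log(findInvalidCharacter("!(a)"));  // null
```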