How to handle "" as output from split() for punctuation?

One of the last lessons in the Functional Programming section uses the regex /\W/ with split() to remove any non-word characters. The problem is that it leaves "" entries in the array wherever there is punctuation, and I assume wherever there is any other non-word character.

I’m trying to break up a string, sort it, and count how many times each word that appears more than once occurs. I may be posting again if I can’t figure that out, but how do I get rid of the empty strings? Would an empty string be considered null, and can I filter them out?
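For reference, the counting step described above can be sketched on its own, assuming the words have already been split and cleaned into an array. The function name `countRepeats` and the sample array are made up for illustration; this is one possible approach, not the lesson's solution.

```javascript
// Hypothetical helper: count each word, then keep only words seen more than once.
function countRepeats(words) {
  const counts = new Map();
  for (const word of words) {
    // Start at 0 for a word we have not seen yet, then add 1.
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Keep only the entries whose count is greater than 1.
  return new Map([...counts].filter(([, n]) => n > 1));
}

const repeats = countRepeats(["a", "and", "a", "comma", "and", "a"]);
console.log(repeats); // Map(2) { 'a' => 3, 'and' => 2 }
```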

I imagine the issue comes from having two \W characters next to each other: between the two there is an empty string, so you get an empty string in the resulting array.
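A small sketch of that behaviour, using a made-up sample string: the comma and the space after it are two separate matches, and split() returns the (empty) text between them.

```javascript
// A comma followed by a space is two consecutive non-word characters,
// so split() yields an empty string for the gap between them.
const pieces = "a comma, then".split(/\W/);
console.log(pieces); // [ 'a', 'comma', '', 'then' ]
```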

You can change your regex by adding a single character so that it avoids empty strings in the resulting array, or at least reduces them; there may still be situations where some remain.

And I just noticed that the Regex above removes dashes which is a problem for hyphenated words. I changed it to /[^A-Za-z0-9_-]/ adding the dash at the end. For now, that works.

Actually, I am getting the empty string in the exact position where commas and periods are.

Aren’t periods and commas next to a space? That means you have two characters matched by that regex next to each other. You can change your regex to handle consecutive matched characters.

I barely understand regex - that regex is copied from a solution to a lesson. How would I change the regex to handle that? I tried the following but it didn’t work:

let wordsToSplit = "Here is a Sentence, and a Comma, semi-colon; and a period."
let lowerCase = wordsToSplit.toLowerCase();

function splitWords(str) {
  let words = str.split(/[^A-Za-z0-9_-]/);
  for (let i = 0; i < words.length; i++) {
    if (words !== "") {
      return words.sort();
    }
  }

}
console.log(splitWords(lowerCase));
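For what it's worth, the loop above compares `words` (the whole array) to "", which is always true, so the function returns the sorted array on the first iteration with the empty strings still in it. One possible fix that keeps the same regex is to filter the array instead of looping. The name `splitWordsFiltered` is only for illustration.

```javascript
// Hypothetical variant of splitWords: same regex, but empty strings are filtered out.
function splitWordsFiltered(str) {
  return str
    .split(/[^A-Za-z0-9_-]/)        // split on anything that is not a letter, digit, _ or -
    .filter((word) => word !== "")  // drop the empty strings left by adjacent separators
    .sort();                        // default sort, by UTF-16 code unit order
}

const sentence = "Here is a Sentence, and a Comma, semi-colon; and a period.";
const sortedWords = splitWordsFiltered(sentence.toLowerCase());
console.log(sortedWords);
// [ 'a', 'a', 'a', 'and', 'and', 'comma', 'here', 'is', 'period', 'semi-colon', 'sentence' ]
```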

The regex you have matches a single character; instead you need a regex that can match multiple consecutive characters. Is there something able to do that?
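One construct that does this is the `+` quantifier, which matches one or more of the preceding item. A minimal sketch, assuming the character class from earlier in the thread; note that a separator at the very end of the string still leaves one trailing empty string, so a filter is still useful.

```javascript
const text = "here is a sentence, and a comma, semi-colon; and a period.";

// With +, a run such as ", " is treated as a single separator.
const parts = text.split(/[^A-Za-z0-9_-]+/);
console.log(parts.length);                    // 12
console.log(parts[parts.length - 1] === "");  // true: the final "." leaves a trailing ""

// filter(Boolean) drops falsy values, which includes the empty string.
const cleaned = parts.filter(Boolean);
console.log(cleaned.length);                  // 11
```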

I have no idea. I tried changing the if condition from !== "" to !== null and I got an error.
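On the null question: in JavaScript an empty string is a string of length zero, not null, so a check against null will not catch it. A quick sketch (the variable name is arbitrary):

```javascript
const emptyEntry = "";

console.log(emptyEntry === null);   // false: "" is not null
console.log(typeof emptyEntry);     // "string"
console.log(emptyEntry.length);     // 0
console.log(Boolean(emptyEntry));   // false: "" is falsy, like null
```

Because "" is falsy, either `word !== ""` or a plain truthiness check will work when filtering an array of strings.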

There’s a great little site called regexr that I suggest you check out. On the sidebar they have a cheatsheet that lists out all the most common control sequences.

Thanks, bookmarked that. I’ll definitely check it out.
