Why is my RegEx in split() not returning capitalized words?

I want to split a string on any non-word character, but keep apostrophes and dashes, since they can occur inside words. Here is what I have:

let words = str.split(/[^a-zA-Z'-]+/gi);

My output includes every word except words starting with a capital letter. I even got words where I capitalized every letter other than the first letter. What is it about the RegEx that is ruling out words that start with a capital letter? And more importantly, how do I fix that?

Please give an example input and output, and show us what you see happening.

Here is a link to my CodePen version. At the top of the JS are the words - take a look at the last words starting with “z”. Then hover over the Z button to see that ZED and Zero do not appear but “zoNE” does.

they are there
[screenshot: two console logs]

the issue is with sort, not with the regex.
first log is from

  let words = str.split(/[^a-zA-Z'-]+/gi);
  console.log(words.slice(340))

second one from

const a = splitWords(wordsToSplit)
console.log(a)
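
To make the difference between the two logs concrete, here is a minimal sketch. The sample string is made up and only stands in for the word list in the CodePen, and it assumes splitWords sorts with a plain sort() call, which the thread does not show. The split keeps the capitalized words; it is the default sort that moves them to the front of the array.

```javascript
// Hypothetical sample input; the real word list lives in the CodePen.
const sample = "zoNE Zero apple ZED don't well-known";

// Same character class as the original regex. The g and i flags are
// dropped here because they make no difference to split().
const splitSample = sample.split(/[^a-zA-Z'-]+/);

// With no callback, sort() compares UTF-16 code units, so "A"-"Z" (65-90)
// come before "a"-"z" (97-122).
const defaultSorted = [...splitSample].sort();

console.log(defaultSorted);
// → ["ZED", "Zero", "apple", "don't", "well-known", "zoNE"]
```

So ZED and Zero are still in the array, just far away from the other "z" words.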

Ok, it looks like ‘z’ and ‘Z’ have different values. Is there a way to sort that takes that into account? I could turn everything to lowercase, but how would I identify the capitalized words to convert them back? I tried using localeCompare from a Stack Overflow thread but that threw an error.

Never mind - I see that I am outputting to ul with the id of “letter” + lowercase letter. I created a ul with id=“letterZ” and that worked but that is a lot of extra html for each letter of the alphabet, plus they are displaying as block so breaking onto a new line.
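
One way to avoid a second ul per letter might be to lowercase only the first character when building the id, so capitalized words land in the existing list. This is a sketch: listIdFor is a hypothetical helper name, and the "letter" + lowercase letter id pattern is taken from the description above, not from the actual CodePen code.

```javascript
// Hypothetical helper: builds the ul id from the word's first character,
// lowercased, so "Zero" and "zoNE" both map to "letterz".
function listIdFor(word) {
  return "letter" + word.charAt(0).toLowerCase();
}

console.log(listIdFor("Zero")); // → "letterz"
console.log(listIdFor("zoNE")); // → "letterz"
```

The word itself is not changed, only the id used to look up the list, so the original capitalization still shows in the output.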

you need to use a callback in sort, and inside there use the word turned to lowercase to determine the order
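
A minimal sketch of that suggestion, with a hypothetical helper name (sortIgnoringCase). The callback compares lowercased copies, so nothing has to be converted back afterwards: the original strings are what end up in the sorted array.

```javascript
// Hypothetical helper: sorts a copy of the array, ignoring letter case.
function sortIgnoringCase(words) {
  return [...words].sort((a, b) => {
    // Compare lowercased copies only; a and b themselves are not modified.
    const x = a.toLowerCase();
    const y = b.toLowerCase();
    if (x < y) return -1;
    if (x > y) return 1;
    return 0;
  });
}

console.log(sortIgnoringCase(["zoNE", "Zero", "apple", "ZED"]));
// → ["apple", "ZED", "Zero", "zoNE"]
```

localeCompare should also work inside the callback, for example a.localeCompare(b, undefined, { sensitivity: "base" }). It has to be called on a string, so an error there may mean one of the values being compared was not a string, though that is only a guess without seeing the code.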

I was thinking of maybe using concat() in an if statement to check for charAt(0) is upper or lower case. But I still have the output ul tags to consider. I think I may create a Proper Nouns section that is separate from the alphabetical list.
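
If a separate section is the goal, a rough sketch of the charAt(0) check could look like this. splitByCapital is a hypothetical name, and "starts with a capital" is only a stand-in for "proper noun", since a word that merely began a sentence would be caught too.

```javascript
// Hypothetical helper: separates words that start with an uppercase letter
// from everything else, keeping the original order within each group.
function splitByCapital(words) {
  const capitalized = [];
  const rest = [];
  for (const word of words) {
    const first = word.charAt(0);
    // An uppercase letter differs from its lowercase form; apostrophes,
    // dashes and lowercase letters do not.
    if (first !== first.toLowerCase()) {
      capitalized.push(word);
    } else {
      rest.push(word);
    }
  }
  return { capitalized, rest };
}

console.log(splitByCapital(["Zero", "zoNE", "ZED", "apple"]));
// → { capitalized: ["Zero", "ZED"], rest: ["zoNE", "apple"] }
```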
