Need help with Regex match

justoverclockl · June 1, 2021, 8:28am

Hi all, I have some code that works fine except for two things. It replaces every word that starts with a hashtag (e.g. #example) with a link like:

https://myurl/?q=#example

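Here is my code:

export default function () {
  // Match "#" followed by one or more characters that are not whitespace or punctuation.
  const regex = /#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

  const p = this.$('.Post-body');
  const baseurl = app.forum.attribute('baseUrl');

  // Call .html() as a setter; assigning the result to p.html would overwrite the method.
  p.html(p.html().replace(regex, match => `<a href="${baseurl}/?q=${match}" class="hashlink" title="Find more post with this hashtag">${match}</a>`));
}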

Now I can’t find a way to do these two things:

ignore URLs that are already in the HTML
remove the “#” from the URL of the generated link

gaac510 · June 1, 2021, 10:12am

Your post reads as very confusing. Would you mind rephrasing or elaborating, preferably with more examples?

justoverclockl · June 1, 2021, 10:45am

Hi gaac,
sorry, but I can’t provide an example.

This code finds #words and replaces them with
<a href="http://mylink/?q=#words">#words</a>

but it also matches hashtags inside HTML links (I want to skip all HTML links), and the resulting URL must be

https://mylink/?q=words and not https://mylink/?q=#words

gaac510 · June 1, 2021, 10:58am

Can you give an example of a string returned by p.html(), the one you are chaining .replace(...) on? Can you also show the string you want after the replacements, as well as what your replace code currently gives you?

justoverclockl · June 1, 2021, 2:09pm

exactly this <a href="http://mylink/?q=#words">#words</a>

This is the final output. I want to remove the # from the URL and avoid parsing URLs in the HTML.

So the returned string must be

<a href="http://mylink/?q=words">#words</a>

codyjamesbrooks · June 1, 2021, 3:30pm

Hey there justoverclockl,

The easiest solution that I can think up is using slice(1) in your replace callback.

const regex = /#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;
"These are some sample #words".replace(regex, match => match.slice(1))
// output: "These are some sample words"

So using some combo of match.slice(1) and match should be what you need.

If I am understanding the question correctly, that should get the job done for ya. I do agree with @gaac510 that the question as posted is a little confusing.
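A minimal sketch of that combination, assuming the goal is a link whose URL has no "#" while the visible text keeps it. The helper name linkHashtags and the baseUrl parameter are hypothetical (the original code reads the base URL from app.forum), and encodeURIComponent is an extra precaution that is not in the original code:

```javascript
// Same pattern as in the thread: "#" followed by one or more allowed characters.
const hashtagRegex = /#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

// Hypothetical helper: baseUrl is passed in instead of being read from app.forum.
function linkHashtags(text, baseUrl) {
  return text.replace(hashtagRegex, (match) => {
    const tag = match.slice(1); // "#words" -> "words" (used in the URL only)
    return `<a href="${baseUrl}/?q=${encodeURIComponent(tag)}" class="hashlink">${match}</a>`;
  });
}

console.log(linkHashtags('These are some sample #words', 'http://mylink'));
// These are some sample <a href="http://mylink/?q=words" class="hashlink">#words</a>
```

match.slice(1) goes into the href and the untouched match stays as the link text.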

Sylvant · June 1, 2021, 3:34pm

Your regex has the “g” flag, which makes it remove all “#” symbols. You could just use something like:

let input='<a href="http://mylink/?q=#words">#words</a>'

console.log(input.replace('#', ''))
console.log(input.replace(/#/, ''))

// '<a href="http://mylink/?q=words">#words</a>'

That seems to achieve what you asked for in your last post.
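For comparison, a small sketch of how the three forms of replace behave on that sample string. The variable names are only for illustration. It also shows a limitation to keep in mind: without "g" only the first "#" in the whole string is removed, so a post containing two generated links would only get the first one fixed:

```javascript
const input = '<a href="http://mylink/?q=#words">#words</a>';

const firstOnly = input.replace('#', '');    // string pattern: first match only
const firstOnlyRe = input.replace(/#/, '');  // regex without "g": first match only
const every = input.replace(/#/g, '');       // regex with "g": every match

console.log(firstOnly);   // <a href="http://mylink/?q=words">#words</a>
console.log(firstOnlyRe); // <a href="http://mylink/?q=words">#words</a>
console.log(every);       // <a href="http://mylink/?q=words">words</a>

// With two links in one string, only the first href loses its "#".
const two = input + ' ' + input;
console.log(two.replace('#', ''));
// <a href="http://mylink/?q=words">#words</a> <a href="http://mylink/?q=#words">#words</a>
```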

justoverclockl · June 1, 2021, 3:43pm

Thanks, this fixes the first problem. Only one remains: skipping all HTML URLs in the regex!

Because now, if a URL contains a hashtag, it becomes:

So the only way is to have a regex that skips all HTML tags.

codyjamesbrooks · June 1, 2021, 4:16pm

Instinctually, I think you should just be able to add a ‘^’ into your regex. That will force it to match on words that start with ‘#’

const regex = /^#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

Should work. But I would test it a bunch to make sure there aren’t any side effects that you aren’t intending
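A quick check along those lines, using two made-up sample strings. Without the "m" flag, "^" anchors the match to the start of the whole string, so a hashtag in the middle of the text is no longer found:

```javascript
const anchored = /^#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

const atStart = '#words at the start'.match(anchored);
const inMiddle = 'some #words in the middle'.match(anchored);

console.log(atStart);  // [ '#words' ]
console.log(inMiddle); // null
```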

justoverclockl · June 1, 2021, 4:20pm

Unfortunately this does not work; it no longer recognizes #words.

codyjamesbrooks · June 1, 2021, 4:34pm

Hmm… I see that. Now it only matches at the beginning of the string. My bad.

What about a negative lookbehind?
Something like

const regex = /(?<!html)#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;
"some #words with a link https://docs.flarum.org/composer.html#regex-are-complicated".replace(regex, match => match.slice(1))
// output: "some words with a link https://docs.flarum.org/composer.html#regex-are-complicated"

Does that work for ya?
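A short sketch of what this lookbehind does and does not cover, using made-up inputs and a hypothetical strip helper. It only rejects a "#" that comes directly after the letters "html", so a URL whose path ends in something else is still matched:

```javascript
const notAfterHtml = /(?<!html)#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

// Hypothetical helper: drops the leading "#" from every match.
const strip = (s) => s.replace(notAfterHtml, (m) => m.slice(1));

console.log(strip('see composer.html#regex'));
// see composer.html#regex            (skipped: "#" follows "html")

console.log(strip('see https://docs.flarum.org/#goals'));
// see https://docs.flarum.org/goals  (still matched: "#" follows "/")
```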

justoverclockl · June 1, 2021, 4:59pm

this works pretty much :)…thank you

justoverclockl · June 1, 2021, 10:05pm

After a few tries, it breaks links that contain an anchor…

https://docs.flarum.org/#goals becomes:
#goals" rel=" nofollow ugc">https://docs.flarum.org/#goals
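One possible way around this kind of breakage, sketched under assumptions: since p.html() returns markup, the "#goals" inside the href attribute is matched like any other hashtag. The hypothetical helper below (linkHashtagsInHtml, with baseUrl passed in as a parameter) splits the HTML string into tags and text, leaves tags untouched, and skips text inside an existing <a> element. It assumes reasonably well-formed HTML with no ">" inside attribute values, and it is not a full HTML parser:

```javascript
const tagRegex = /#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

// Hypothetical helper: links hashtags only in text that is outside existing links.
function linkHashtagsInHtml(html, baseUrl) {
  let insideAnchor = 0;
  return html
    .split(/(<[^>]*>)/) // the capture group keeps the tags at odd indexes
    .map((part, i) => {
      if (i % 2 === 1) {
        // A tag: track <a ...> and </a>, but never modify the tag itself.
        if (/^<a[\s>]/i.test(part)) insideAnchor += 1;
        else if (/^<\/a\s*>/i.test(part)) insideAnchor = Math.max(0, insideAnchor - 1);
        return part;
      }
      if (insideAnchor > 0) return part; // text of an existing link
      return part.replace(tagRegex, (match) =>
        `<a href="${baseUrl}/?q=${match.slice(1)}" class="hashlink">${match}</a>`);
    })
    .join('');
}

const sample = '<p>See <a href="https://docs.flarum.org/#goals" rel=" nofollow ugc">https://docs.flarum.org/#goals</a> and #words</p>';
console.log(linkHashtagsInHtml(sample, 'http://mylink'));
// <p>See <a href="https://docs.flarum.org/#goals" rel=" nofollow ugc">https://docs.flarum.org/#goals</a> and <a href="http://mylink/?q=words" class="hashlink">#words</a></p>
```

A bare URL that appears as plain text outside any <a> element would still be matched by this sketch, so it may need to be combined with a URL-aware pattern.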

gaac510 · June 2, 2021, 12:33am

I’m still not 100% sure what you are trying to do. Are you saying it’s something like:

// The complete string returned by `p.html()` before replacement:
"These are some sample #words"; 

// The complete string you want to get after replacement:
`These are some sample <a href="${baseurl}/?q=words" class="hashlink" title="Find more post with this hashtag">#words</a>`

As well as ignoring strings similar to the below:

// The complete string returned by `p.html()` before replacement:
"Here's a url: https://docs.flarum.org/#goals" // Part of a url, so ignore.

If the above sounds about right, here’s a modified version of your regex you can try in combination with @codyjamesbrooks’ suggested usage of slice():

const regexModified = /(?<!https?:\/\/\S*)#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

Do note about regexModified that:

Its logic is similar to that of @codyjamesbrooks’ final suggestion; his would disregard a match if "#" is immediately preceded by "html"; mine would disregard a match if "#" is preceded by "http://" (or "https://") with any number of non-whitespace characters separating the two.
regexModified relies on JS’ implementation of regex, where non-fixed-length lookbehind assertions are allowed. regexModified may not work in other languages (e.g. Python, PHP, Java).
regexModified may not cover all situations, similar to what you have experienced with @codyjamesbrooks’ final suggestion. It’s definitely possible to make it more robust, but you need to provide us with better explained context (what your project is like and what role the replacement routine in question plays in your project) and examples of edge cases. Until we are given this information we’d just be guessing your needs and not getting anywhere.
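Putting the pieces above together, a minimal sketch that uses regexModified with slice(1) on the two example strings. The helper name linkHashtagsSkippingUrls and the baseUrl parameter are hypothetical; baseUrl stands in for app.forum.attribute('baseUrl'):

```javascript
const regexModified = /(?<!https?:\/\/\S*)#[^\s!@#$%^&*()=+.\/,\[{\]};:'"?><]+/g;

// Hypothetical helper: baseUrl stands in for app.forum.attribute('baseUrl').
function linkHashtagsSkippingUrls(text, baseUrl) {
  return text.replace(regexModified, (match) =>
    `<a href="${baseUrl}/?q=${match.slice(1)}" class="hashlink" title="Find more post with this hashtag">${match}</a>`);
}

console.log(linkHashtagsSkippingUrls('These are some sample #words', 'http://mylink'));
// These are some sample <a href="http://mylink/?q=words" class="hashlink" title="Find more post with this hashtag">#words</a>

console.log(linkHashtagsSkippingUrls("Here's a url: https://docs.flarum.org/#goals", 'http://mylink'));
// Here's a url: https://docs.flarum.org/#goals
```

The second string comes back unchanged because "#goals" is preceded by "https://" with only non-whitespace characters in between.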

justoverclockl · June 2, 2021, 8:57pm

So, this code will be used in a forum. It simply finds any word with a hashtag (#example) and turns it into a link that searches the forum:

http://mylink/?q=example

This is my project.

system · December 2, 2021, 8:58am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
