I am looking for a word tokenizer library for Node.js that supports as many languages as possible. I'd like to pass in a string and a language code, like tokenize('Hello, world!', 'en'), and have it return ['Hello', 'world']. The number of supported languages is more important than precision.
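
To make the desired behavior concrete, here is a minimal sketch of the interface I have in mind. It assumes a runtime where the built-in Intl.Segmenter is available (Node.js 16 or later, as far as I know). The function name tokenize is only my placeholder, not an existing library API, and I have not checked how well this approach segments languages other than English.

```javascript
// Sketch only: `tokenize` is a placeholder name, not an existing library API.
// Assumes Intl.Segmenter is available in the runtime.
function tokenize(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'word' });
  const tokens = [];
  for (const part of segmenter.segment(text)) {
    // isWordLike is false for punctuation and whitespace segments.
    if (part.isWordLike) {
      tokens.push(part.segment);
    }
  }
  return tokens;
}

console.log(tokenize('Hello, world!', 'en'));
```

Anything with this shape (a string and a language code in, an array of words out) would do; the code above is only there to pin down the expected input and output.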