How do I properly validate and sanitize user input?

For example, I am creating a forum. The registration page requires a name, a username, an email, a password, and a confirm password field.

For the username, I will check for length. I assume I need to trim whitespace from the beginning and end of the string. I also need to verify there is no whitespace within the string. What about HTML entities or the use of Unicode characters? Will they affect the length of the string? Would I need to decode them before verifying the length? What about stripping tags? Would I need to decode any HTML entities before I strip any tags? What else should I consider?

This is the job regular expressions were born for.
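
As a rough sketch of what that could look like for the username field (in Python, since the thread does not name a language): the 3 to 20 character limit, the allowed character set, and the names `USERNAME_RE` and `validate_username` are all assumptions made up for illustration, not requirements from the question.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical policy: 3 to 20 characters, ASCII letters, digits, underscore.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(raw: str) -> Optional[str]:
    """Return the cleaned username, or None if it breaks the policy."""
    # Normalize first so compatibility forms (e.g. fullwidth letters) are
    # folded before matching, then trim leading and trailing whitespace.
    candidate = unicodedata.normalize("NFKC", raw).strip()
    # fullmatch requires the whole string to match, so inner whitespace,
    # angle brackets, ampersands and backslashes are all rejected.
    if USERNAME_RE.fullmatch(candidate):
        return candidate
    return None
```

Because the pattern is an allowlist, input such as `<b>bob</b>` or `&amp;` fails outright, so with a policy like this there is nothing to decode or strip before checking the length.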

Yes, but what if someone enters a backslash or a left or right angle bracket (such as in an HTML tag)? I don’t want it to be interpreted as the beginning or end of an HTML tag or as an escape character.
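
One common way to address that concern is to escape the value at the point where it is written into the page, rather than trying to strip characters on input. A minimal sketch in Python, assuming the value ends up inside HTML; the function name `render_username` and the `span` markup are hypothetical:

```python
import html


def render_username(username: str) -> str:
    """Wrap a username in markup, escaping it for an HTML context."""
    # html.escape replaces &, < and > with entities; quote=True also
    # replaces double and single quotes, which matters inside attributes.
    safe = html.escape(username, quote=True)
    return '<span class="user">' + safe + "</span>"
```

A backslash passes through unchanged because it has no special meaning in HTML. It only acts as an escape character in other contexts, such as string literals or SQL, and each of those needs its own escaping or parameter binding.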