JavaScript RegExp URL slug validation
Summary
This article provides JavaScript RegExp examples to validate URI paths, or URL slugs, by matching lowercase characters, hyphens, numbers, and reserved characters. It includes patterns for various combinations, emphasizing the importance of testing and adjusting patterns.
Matching #
Regular expressions (RegEx) are patterns used to match character combinations in strings. In JavaScript, regular expressions are also objects. The RegExp object is used for matching text with a pattern.
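As a minimal sketch of what this looks like in practice, the snippet below creates the same pattern as a literal and through the RegExp constructor, then uses the standard test() and exec() methods. The variable names are only illustrative.

```javascript
// Two equivalent ways to create the same RegExp object.
const literal = /[a-z]+/;
const constructed = new RegExp("[a-z]+");

// test() returns true if the pattern matches anywhere in the string.
console.log(literal.test("foo"));        // true
console.log(constructed.test("123foo")); // true: "foo" is found inside the string

// exec() returns the first match, or null if there is none.
console.log(literal.exec("123foo")[0]);  // "foo"
console.log(literal.exec("123"));        // null
```

Note that an unanchored pattern reports a match as soon as any part of the string fits.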
Lowercase characters #
The following pattern matches a URL slug string with all lowercase (a-z) characters:
[a-z]+
This would match the following strings:
foo
bar
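For validation, one usually wants the whole string to consist of allowed characters, not just a part of it. A minimal sketch, assuming a hypothetical helper named isLowercaseSlug, is to anchor the pattern with ^ and $:

```javascript
// Hypothetical helper: true only if the whole string is lowercase a-z.
const LOWERCASE_SLUG = /^[a-z]+$/;

function isLowercaseSlug(value) {
  return LOWERCASE_SLUG.test(value);
}

console.log(isLowercaseSlug("foo"));     // true
console.log(isLowercaseSlug("Foo"));     // false: uppercase "F"
console.log(isLowercaseSlug("foo bar")); // false: contains a space
console.log(/[a-z]+/.test("foo bar"));   // true: unanchored, matches "foo"
```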
Uppercase characters #
The following pattern matches a URL slug string that also contains uppercase (A-Z) characters:
[a-zA-Z]+
This would match the following strings:
foo
Foo
bar
Bar
RFC 3986 says about scheme names: Although schemes are case-insensitive, the canonical form is lowercase and documents that specify schemes must do so with lowercase letters. An implementation should accept uppercase letters as equivalent to lowercase in scheme names (e.g., allow "HTTP" as well as "http") for the sake of robustness but should only produce lowercase scheme names for consistency.
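The same "accept both, produce lowercase" idea can be applied to slugs. The following is only a sketch; normalizeLetterSlug is a hypothetical helper name, and whether slugs should be lowercased at all is a design decision of your application.

```javascript
// Hypothetical helper: accept letters in either case and return the lowercase
// form, or null if the input contains anything other than a-z / A-Z.
function normalizeLetterSlug(value) {
  return /^[a-zA-Z]+$/.test(value) ? value.toLowerCase() : null;
}

console.log(normalizeLetterSlug("Foo"));  // "foo"
console.log(normalizeLetterSlug("bar"));  // "bar"
console.log(normalizeLetterSlug("Foo1")); // null
```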
Hyphens and underscores #
The following pattern matches a URL slug string that also contains hyphens (-) and underscores (_):
[a-zA-Z\-\_]+
This would match the following strings:
foo-Foo
bar_Bar
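Inside a character class, a hyphen placed last is treated literally, and an underscore never needs escaping. The sketch below writes the same character set without backslashes and anchors it. It also shows that the escaped underscore is rejected once the u (unicode) flag is set. The constant name is only illustrative.

```javascript
// Same character set, with the hyphen placed last so it is literal.
const LETTERS_HYPHEN_UNDERSCORE = /^[a-zA-Z_-]+$/;

console.log(LETTERS_HYPHEN_UNDERSCORE.test("foo-Foo")); // true
console.log(LETTERS_HYPHEN_UNDERSCORE.test("bar_Bar")); // true
console.log(LETTERS_HYPHEN_UNDERSCORE.test("foo.bar")); // false: "." is not in the set

// With the "u" flag, the unnecessary escape "\_" is a syntax error.
let escapedUnderscoreAllowed = true;
try {
  new RegExp("[a-zA-Z\\-\\_]+", "u");
} catch (error) {
  escapedUnderscoreAllowed = false;
}
console.log(escapedUnderscoreAllowed); // false
```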
Numbers #
The following pattern matches a URL slug string that also contains numbers (0-9):
[a-zA-Z\-\_0-9]+
This would match the following strings:
foo-Foo-1
bar_Bar_2
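A character class only restricts which characters may appear, not where they appear, so strings such as "--" or "foo--bar" also pass. If that is not wanted, one possible stricter variant (an assumption about your requirements, not a rule from any specification) only allows single separators between alphanumeric runs:

```javascript
// Hypothetical stricter pattern: alphanumeric runs joined by single "-" or "_".
const STRICT_SLUG = /^[a-zA-Z0-9]+(?:[-_][a-zA-Z0-9]+)*$/;

console.log(STRICT_SLUG.test("foo-Foo-1")); // true
console.log(STRICT_SLUG.test("bar_Bar_2")); // true
console.log(STRICT_SLUG.test("-foo"));      // false: leading separator
console.log(STRICT_SLUG.test("foo--bar"));  // false: doubled separator
console.log(STRICT_SLUG.test("foo-"));      // false: trailing separator

// For comparison, the plain character class accepts separators only.
console.log(/^[a-zA-Z\-\_0-9]+$/.test("--")); // true
```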
Reserved characters #
The following pattern matches a URL slug string that also contains reserved characters such as /&?=: and the percent sign (%), which introduces percent-encoded octets:
[a-zA-Z-_/&?=:%0-9]+
This would match the following strings:
http://foo/bar/baz_123:80?var=qux&quux
https://Foo/bar_baz-123:80?var=qux&quux%100
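Because reserved characters such as / and ? act as delimiters, an alternative to widening the character class is to let the built-in URL parser split the string first and then validate each path segment on its own. This is only a sketch of that idea: hasValidPathSegments is a hypothetical helper, example.com is a placeholder host, and the segment pattern is the one from the Numbers section with anchors added.

```javascript
// Hypothetical helper: validate each path segment of a URL separately.
const SEGMENT = /^[a-zA-Z0-9_-]+$/;

function hasValidPathSegments(urlString) {
  const { pathname } = new URL(urlString); // throws on an unparsable URL
  const segments = pathname.split("/").filter((segment) => segment !== "");
  return segments.length > 0 && segments.every((segment) => SEGMENT.test(segment));
}

console.log(hasValidPathSegments("https://example.com/foo-bar/baz_123")); // true
// The parser percent-encodes the space, and "%" is not in SEGMENT.
console.log(hasValidPathSegments("https://example.com/foo bar"));         // false
```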
More unreserved characters #
URIs can contain even more characters, which we will add now.
The following pattern matches a URL slug string that also contains characters like #.+~ (of these, RFC 3986 lists . and ~ as unreserved, while # and + are reserved):
[a-zA-Z-_#.+~/&?=:%0-9]+
This would match the following strings:
http://foo/bar/baz_123:80/index.php?var=qux&quux#item
https://Foo/bar_baz-123/default.html?var=qux&quux%100
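When a string is rejected, it is often helpful to know which characters caused the problem. One way to sketch this is to negate the character class and collect every match. The names DISALLOWED and findDisallowedCharacters are hypothetical.

```javascript
// Negated version of the character class: matches any character NOT allowed.
const DISALLOWED = /[^a-zA-Z_#.+~\/&?=:%0-9-]/g;

function findDisallowedCharacters(value) {
  // match() with the "g" flag returns all matches, or null if there are none.
  return value.match(DISALLOWED) || [];
}

console.log(findDisallowedCharacters(
  "https://Foo/bar_baz-123/default.html?var=qux&quux%100"
)); // []
console.log(findDisallowedCharacters("http://foo/bar baz<1>")); // [ ' ', '<', '>' ]
```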
RegExp Validation #
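Feel free to adjust the patterns to your needs. Note that the patterns above are unanchored, so they match any string that contains at least one allowed character. To make sure a string contains no illegal characters, anchor the pattern with ^ and $.
Always check your patterns against known valid and invalid inputs with validation and testing tools. This makes your RegExp more robust.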
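Besides online tools, a handful of expected results can be checked directly in code. The following sketch runs the anchored pattern from the Numbers section against a small list of hypothetical sample inputs and reports any case where the result differs from the expectation.

```javascript
// Pattern under test: the "Numbers" pattern, anchored to the whole string.
const pattern = /^[a-zA-Z0-9_-]+$/;

// Hypothetical sample data: strings we expect the pattern to accept or reject.
const cases = [
  { input: "foo-Foo-1", expected: true },
  { input: "bar_Bar_2", expected: true },
  { input: "", expected: false },
  { input: "foo/bar", expected: false },
  { input: "foo bar", expected: false },
];

// Collect every case where the actual result differs from the expectation.
const failures = cases.filter(({ input, expected }) => pattern.test(input) !== expected);

console.log(failures.length === 0 ? "all cases pass" : failures);
```

Extending the list whenever a pattern changes helps to catch accidental regressions.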
Further readings #
Sources and recommended further resources on the topic:
- IETF: URI Generic Syntax
- IETF: URI Design and Ownership
- MDN: JS Regular Expressions
- MDN: RegExp
- Regular expression testing tool
License
JavaScript RegExp URL slug validation by Jonas Jared Jacek is licensed under CC BY-SA 4.0.
This license requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, even for commercial purposes, as long as modified material is licensed under identical terms. To give credit, provide a link back to the original source, the author, and the license, e.g. like this:
<p xmlns:cc="http://creativecommons.org/ns#" xmlns:dct="http://purl.org/dc/terms/"><a property="dct:title" rel="cc:attributionURL" href="https://www.ditig.com/javascript-regex-url-slug-validation">JavaScript RegExp URL slug validation</a> by <a rel="cc:attributionURL dct:creator" property="cc:attributionName" href="https://www.j15k.com/">Jonas Jared Jacek</a> is licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer">CC BY-SA 4.0</a>.</p>
For more information see the Ditig legal page.