Regular expressions let us manipulate strings with ease. They are patterns that let us match text in almost any way we can imagine. Without them, searching for text that follows a complex pattern would be difficult. They’re also useful for form validation, where we check inputs against a pattern. In this article, we’ll look at more of the special characters that can appear in regular expressions, along with the JavaScript methods that work with them.
We can define JavaScript regular expressions with the literal as follows:
const re = /a/
Or we can pass a string to the RegExp constructor to create the regular expression object, as follows:
const re = new RegExp('a');
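To make the difference between the two forms concrete, here is a small sketch. The variable names and the `word` input are made up for illustration. It assumes a pattern containing a backslash, since that is where the two forms differ in how they are written.

```javascript
// Both forms build an equivalent pattern that matches the letter "a".
const literalRe = /a/;
const constructedRe = new RegExp('a');

console.log(literalRe.test('cat'));     // true
console.log(constructedRe.test('cat')); // true

// In a string passed to the constructor, backslashes must be doubled,
// because the string literal consumes one level of escaping.
const digitsLiteral = /\d+/;
const digitsConstructed = new RegExp('\\d+');
console.log(digitsLiteral.source === digitsConstructed.source); // true

// The constructor is handy when the pattern is only known at run time.
const word = 'tennis'; // hypothetical input
const dynamicRe = new RegExp(word, 'i');
console.log(dynamicRe.test('Table Tennis')); // true
```

The literal is usually easier to read for fixed patterns, while the constructor suits patterns built from variables.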
More Special Characters
There are more special characters that we can use to form regular expressions:
[^xyz]
Matches the first character that isn’t listed in the brackets. For example, given the string xylophone and the pattern [^xyz], the match is l, since it’s the first character in xylophone that isn’t x, y or z.
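The xylophone example can be checked with a short sketch like the following. The variable names are only illustrative.

```javascript
// Without the g flag, match returns the first character outside the set.
const firstNonXyz = 'xylophone'.match(/[^xyz]/);
console.log(firstNonXyz[0]);    // "l"
console.log(firstNonXyz.index); // 2

// With the g flag, we get every character that is not x, y or z.
const allNonXyz = 'xylophone'.match(/[^xyz]/g);
console.log(allNonXyz.join('')); // "lophone"
```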
[\b]
This pattern matches a backspace character.
\b
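This pattern matches a word boundary: a position where a word character (a letter, digit or underscore) sits next to a non-word character, or next to the start or end of the string. It matches a position rather than a character, so it adds nothing to the matched text.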
For example, if we have the pattern \babc and the string abc, then we get abc as the match, since there’s a word boundary at the beginning of the string.
If we have abc\b and the string abc 1, then we get abc as the match, since \b matches the boundary between abc and the space that follows it.
abc\babc won’t match anything, because the \b would have word characters both before and after it, so there’s no word boundary.
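The three cases above can be verified with a sketch like this one. The last two lines are an extra illustration, using made-up sample strings, of the common whole-word idiom.

```javascript
// \b at the start: matches because "abc" begins at the start of the string.
const startMatch = /\babc/.exec('abc');
console.log(startMatch[0]); // "abc"

// \b at the end: matches because a space follows "abc".
const endMatch = /abc\b/.exec('abc 1');
console.log(endMatch[0]); // "abc"

// \b between two word characters can never succeed.
const middleMatch = /abc\babc/.exec('abcabc');
console.log(middleMatch); // null

// Wrapping a word in \b on both sides matches it only as a whole word.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test('a cat sat'));   // true
console.log(wholeWord.test('concatenate')); // false
```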
\B
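Matches a non-word boundary: a position where the characters on both sides are of the same kind, such as between 2 word characters or between 2 non-word characters. The start and end of the string count as non-word characters, so \B also matches at the start or end of a string that begins or ends with a non-word character, and in an empty string.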
For example, if we have the pattern ab\B. and the string abc 1, then abc will be the match, since b and c are both word characters.
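A brief sketch of \B in action follows. The second and third checks use sample strings that are not from the text above and are only there to show when \B fails.

```javascript
// "b" and "c" are both word characters, so \B succeeds between them.
const nonBoundary = /ab\B./.exec('abc 1');
console.log(nonBoundary[0]); // "abc"

// Here a space follows "ab", which is a word boundary, so \B fails.
console.log(/ab\B./.test('ab 1')); // false

// \Bcat only matches "cat" when it does not start a word.
console.log(/\Bcat/.test('concatenate')); // true
console.log(/\Bcat/.test('cat'));         // false
```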
\cX
This pattern matches a control character in a string, where X is a letter from A to Z. For example, \cM matches Control-M, which is the carriage return.
\d
Matches any digit. It’s the same as [0-9]. For example, \d matches 1 in 123.
\D
This pattern matches any non-digit character. It’s the same as [^0-9]. For example, \D matches a in the string abc123.
\f
Matches the form feed character.
\n
Matches the line feed character.
\r
Matches the carriage return character.
\s
Matches a white space character.
\S
Matches any non-whitespace character.
\t
Matches the tab character.
\v
Matches the vertical tab character.
\W
Matches any non-word character. It’s the same as [^A-Za-z0-9_]. For example, if we have the string /. (a slash followed by a period), then we get / as the match.
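The shorthand classes above can be tried out with a sketch like the following. The sample strings other than 123, abc123 and /. are made up for illustration.

```javascript
// \d finds the first digit, \D the first non-digit.
const firstDigit = '123'.match(/\d/)[0];
const firstNonDigit = 'abc123'.match(/\D/)[0];
console.log(firstDigit, firstNonDigit); // "1" "a"

// \s matches spaces, tabs and line feeds alike, so it works as a delimiter.
const pieces = 'a b\tc\nd'.split(/\s/);
console.log(pieces); // ["a", "b", "c", "d"]

// \S skips leading whitespace and finds the first visible character.
const firstVisible = '  hi'.match(/\S/);
console.log(firstVisible[0], firstVisible.index); // "h" 2

// \W finds the first character that is not a letter, digit or underscore.
const firstNonWord = '/.'.match(/\W/)[0];
console.log(firstNonWord); // "/"
```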
\n
Here n stands for a positive integer rather than the letter n. It’s a backreference to the last substring matched by the nth capturing group.
For example, if we have the regular expression a(_)b\1 and the string a_b_c, then we get a_b_ as the full match and _ as the captured group, since \1 has to repeat the _ that the group captured.
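The backreference example can be sketched as follows. The repeated-word check at the end is an additional illustration with made-up sentences, not something taken from the text above.

```javascript
// Element 0 is the full match; element 1 is what the group captured.
const backref = /a(_)b\1/.exec('a_b_c');
console.log(backref[0]); // "a_b_"
console.log(backref[1]); // "_"

// A common use: \1 must repeat whatever (\w+) captured,
// so this pattern finds a word that appears twice in a row.
const repeatedWord = /\b(\w+) \1\b/;
console.log(repeatedWord.test('this is is wrong')); // true
console.log(repeatedWord.test('this is fine'));     // false
```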
\0
Matches the null character.
\xhh
Matches the character with the code hh, where hh is 2 hexadecimal digits. For example, since the hexadecimal code for © is A9, we can match it with the pattern \xA9.
\uhhhh
Matches the character with the code hhhh, where hhhh is 4 hexadecimal digits.
\u{hhhh}
Matches the character with the Unicode code point hhhh, where hhhh is 1 to 6 hexadecimal digits. This form only works when the u flag is set.
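Here is a minimal sketch of the three escape forms. The emoji code point 1F600 is just an example chosen to show a value that needs more than 4 hexadecimal digits.

```javascript
const copyright = '\u00A9'; // the © character
const grin = '\u{1F600}';   // an emoji outside the 4-digit range

// \xhh and \uhhhh both reach the same character.
console.log(/\xA9/.test(copyright));   // true
console.log(/\u00A9/.test(copyright)); // true

// The braces form needs the u flag.
console.log(/\u{A9}/u.test(copyright)); // true
console.log(/\u{1F600}/u.test(grin));   // true
```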
Regular Expression Methods
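JavaScript has several methods that let us do various things with regular expressions, like searching strings, testing whether strings match a pattern, replacing text and so on. The exec and test methods belong to the regular expression object, while match, matchAll, search, replace and split are string methods that take a regular expression as an argument.
The methods are below.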
exec
The exec method searches for a match of the regular expression in a string. It returns a results array, or null if there’s no match.
For example, if we write:
/[a-z0-9.]+@[a-z0-9.]+\.[a-z]+/.exec('abc@abc.com')
Then we get back:
["abc@abc.com", index: 0, input: "abc@abc.com", groups: undefined]
We can also use the g flag, which makes exec remember where the last match ended in the lastIndex property, so that repeated calls step through the matches in the string. For example, we can write:
/\d+/ig.exec('123')
Then we get 123 as the match.
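With the g flag, each call to exec continues from where the previous match ended, so a loop can collect every match. A minimal sketch, with an illustrative sample string and variable names:

```javascript
const digitsRe = /\d+/g;
const text = 'a 1 b 22 c 3';
const found = [];

// Each call resumes at digitsRe.lastIndex; exec returns null when
// no further match exists, which ends the loop.
let m;
while ((m = digitsRe.exec(text)) !== null) {
  found.push({ value: m[0], index: m.index });
}

console.log(found);
// [ { value: '1', index: 2 }, { value: '22', index: 6 }, { value: '3', index: 11 } ]
```

The regular expression has to be stored in a variable here. Writing the literal inside the loop condition would create a fresh object with lastIndex 0 on every pass, and the loop would never end.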
test
The test method checks whether a regular expression matches the specified string. It returns true if there’s a match and false otherwise.
For example, /foo/.test('foo') returns true, but /foo/.test('bar') returns false.
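Since test returns a boolean, it fits the form validation use mentioned at the start of the article. The sketch below assumes a hypothetical input format of exactly 5 digits. The name `fiveDigitsRe` and the format itself are made up for illustration.

```javascript
// ^ and $ anchor the pattern so the whole input must be 5 digits.
const fiveDigitsRe = /^\d{5}$/;

console.log(fiveDigitsRe.test('12345'));  // true
console.log(fiveDigitsRe.test('1234a'));  // false
console.log(fiveDigitsRe.test('123456')); // false
```

Without the anchors, the pattern would also accept longer inputs that merely contain 5 digits somewhere.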
match
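The match method matches a string against a regular expression. With the g flag, it returns an array of all matches. Without it, it returns only the first match, along with its details. It returns null if nothing matches.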
For example, if we write:
'table tennis'.match(/[abc]/g)
Then we get back:
["a", "b"]
since we have the g flag to search the whole string for matches. If we remove the g flag, then we get the first match only. For instance, if we write:
'table tennis'.match(/[abc]/)
Then we get back:
["a", index: 1, input: "table tennis", groups: undefined]
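One detail worth guarding against is that match returns null rather than an empty array when nothing matches. The sketch below shows the three outcomes. Falling back to an empty array is one common way to handle the null case, not the only one.

```javascript
const allMatches = 'table tennis'.match(/[abc]/g);
console.log(allMatches); // ["a", "b"]

const firstMatch = 'table tennis'.match(/[abc]/);
console.log(firstMatch[0], firstMatch.index); // "a" 1

// No x, y or z in the string, so the result is null.
const noMatches = 'table tennis'.match(/[xyz]/g);
console.log(noMatches); // null

// Falling back to [] keeps later array code from throwing.
const safeMatches = 'table tennis'.match(/[xyz]/g) || [];
console.log(safeMatches.length); // 0
```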
matchAll
The matchAll method returns an iterator over all matches of a string against a regular expression, which lets us get the results with the spread operator or a for...of loop.
For example, if we write:
[...'table tennis'.matchAll(/[abc]/g)]
Then we get back:
0: ["a", index: 1, input: "table tennis", groups: undefined]
1: ["b", index: 2, input: "table tennis", groups: undefined]
since we have the g flag to search the whole string for matches. Unlike match, matchAll requires the g flag. If we remove it, as in:
[...'table tennis'.matchAll(/[abc]/)]
Then we get a TypeError instead of the first match.
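Because each result from matchAll carries its capturing groups, a for...of loop over the iterator is a convenient way to pull structured data out of a string. A small sketch, using a made-up key=value string:

```javascript
const pairsText = 'x=1, y=22, z=3'; // hypothetical input
const pairRe = /(\w)=(\d+)/g;
const parsed = {};

// m[1] is the key captured by (\w), m[2] the digits captured by (\d+).
for (const m of pairsText.matchAll(pairRe)) {
  parsed[m[1]] = Number(m[2]);
}

console.log(parsed); // { x: 1, y: 22, z: 3 }
```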
search
The search method returns the index of the first match in the string. It returns -1 if no match is found. For example, we can write:
'table tennis'.search(/[abc]/)
Then we get back 1, since a is the second character of 'table tennis'.
However, if we have:
'table tennis'.search(/[xyz]/)
Then we get back -1, because none of the 3 letters appears in 'table tennis'.
replace
The replace method finds matches in a string and returns a new string with the matches replaced by the string that we specify. Without the g flag, only the first match is replaced.
For example, if we have:
'table tennis'.replace(/[abc]/, 'z')
Then we get:
"tzble tennis"
split
We can use the split method to split a string according to the pattern that we put in as the delimiter. It returns an array of strings that are split from the original string according to the matches of the regular expression.
For example, if we have:
'a 1 b 22 c 3'.split(/\d+/)
Then we get back:
["a ", " b ", " c ", ""]
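Both replace and split can do a little more than the basic examples above show. The following sketch illustrates a few variations. The sample strings other than 'table tennis' and 'a 1 b 22 c 3' are made up.

```javascript
// replace: with the g flag, every match is replaced, not just the first.
const firstOnly = 'table tennis'.replace(/[abc]/, 'z');
const everyMatch = 'table tennis'.replace(/[abc]/g, 'z');
console.log(firstOnly, '|', everyMatch); // "tzble tennis | tzzle tennis"

// $1 and $2 in the replacement string refer to the capturing groups.
const swapped = 'Doe, Jane'.replace(/(\w+), (\w+)/, '$2 $1');
console.log(swapped); // "Jane Doe"

// A function can compute each replacement from the matched text.
const doubledNums = 'a 1 b 22'.replace(/\d+/g, (d) => String(Number(d) * 2));
console.log(doubledNums); // "a 2 b 44"

// split: absorb surrounding whitespace into the delimiter, then drop
// the empty string left after the final delimiter.
const tidy = 'a 1 b 22 c 3'.split(/\s*\d+\s*/).filter(Boolean);
console.log(tidy); // ["a", "b", "c"]

// A capturing group in the delimiter keeps the delimiters in the result.
const withDelims = 'a1b22c'.split(/(\d+)/);
console.log(withDelims); // ["a", "1", "b", "22", "c"]
```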
As we can see, there are lots of characters and combinations of them that we can search for with JavaScript regular expressions. They make string validation and manipulation much easier, since we don’t have to split strings up and check the pieces ourselves. All we have to do is search for them with regular expression objects.
Likewise, splitting and replacing strings by complex patterns is made easy by regular expressions and the split and replace methods, respectively.