Character Sets

Character sets allow you to match one character from a group of characters using square brackets.

Using []

Square brackets define a set of characters, and the set matches exactly one of them. For example, the pattern [cbr]at
Matches:
  • cat
  • bat
  • rat
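
The idea above can be tried out with a short sketch. It assumes Python's built-in re module and the pattern [cbr]at, which is inferred from the three matches listed; the word list is made up for illustration.

```python
import re

# [cbr] matches exactly one of c, b, or r; "at" must follow literally.
pattern = re.compile(r"[cbr]at")

words = ["cat", "bat", "rat", "hat"]
# fullmatch requires the whole word to fit the pattern.
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)  # ['cat', 'bat', 'rat']
```

"hat" is rejected because h is not in the set.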

Ranges a to z

A hyphen between two characters defines a range, so [a-z] matches any single lowercase letter.
Adding + gives [a-z]+, which matches whole lowercase words.
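
A minimal sketch of a lowercase range, assuming Python's re module. The sample text is invented, and the + quantifier (one or more) is added so that whole runs of letters are returned rather than single characters.

```python
import re

text = "Hello world 42 times"

# [a-z]+ matches one or more consecutive lowercase ASCII letters.
lowercase_runs = re.findall(r"[a-z]+", text)
print(lowercase_runs)  # ['ello', 'world', 'times']
```

The capital H is outside the range a to z, so "Hello" only contributes "ello".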

Multiple Ranges
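
Several ranges can sit side by side in one set. The sketch below assumes Python's re module and uses [a-zA-Z0-9] as one plausible example; the sample text is made up.

```python
import re

text = "User_42 logged in at 09:30!"

# Three ranges in one set: lowercase letters, uppercase letters, digits.
tokens = re.findall(r"[a-zA-Z0-9]+", text)
print(tokens)  # ['User', '42', 'logged', 'in', 'at', '09', '30']
```

The underscore, colon, and exclamation mark are in none of the ranges, so they split the text into separate matches.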


Matching Digits
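
Digits can be matched with the range [0-9]. This is a small sketch assuming Python's re module, with an invented sample string.

```python
import re

text = "Order 66 shipped on 2024-05-17"

# [0-9] matches one digit at a time.
single_digits = re.findall(r"[0-9]", text)
# [0-9]+ groups consecutive digits into one match.
digit_runs = re.findall(r"[0-9]+", text)

print(single_digits)  # ['6', '6', '2', '0', '2', '4', '0', '5', '1', '7']
print(digit_runs)     # ['66', '2024', '05', '17']
```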


Negation

Use ^ as the first character inside the brackets to match any single character except the ones listed.
For example, [^0-9] matches every non-digit character.
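
A short sketch of negation, assuming Python's re module and the pattern [^0-9]; the sample strings are made up.

```python
import re

# [^0-9] matches any single character that is not a digit.
non_digits = re.findall(r"[^0-9]", "a1b2c3")
print(non_digits)  # ['a', 'b', 'c']

# Replacing every non-digit with nothing keeps only the digits.
digits_only = re.sub(r"[^0-9]", "", "Call 555-0199")
print(digits_only)  # 5550199
```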

Note

^ means NOT only when it is the first character inside []. Outside [] it anchors the match to the start of the string.
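
The two meanings of ^ can be compared side by side. This sketch assumes Python's re module and an invented sample string.

```python
import re

text = "42 apples, 7 pears"

# Outside brackets, ^ anchors the match to the start of the string.
at_start = re.findall(r"^[0-9]+", text)
# As the first character inside brackets, ^ negates the set.
not_digits = re.findall(r"[^0-9]+", text)

print(at_start)    # ['42']
print(not_digits)  # [' apples, ', ' pears']
```

Only the leading 42 is found by the anchored pattern; the 7 is skipped because it is not at the start.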


Pro Tip

Character sets are useful for filtering and validating data.
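
As one possible illustration of validation, the sketch below checks usernames against a character set. The rule (3 to 12 lowercase letters or digits), the helper is_valid_username, and the sample names are all hypothetical, and Python's re module is assumed.

```python
import re

# Hypothetical rule: a username is 3 to 12 lowercase letters or digits.
# {3,12} means "between 3 and 12 repetitions of the set".
USERNAME = re.compile(r"[a-z0-9]{3,12}")

def is_valid_username(name: str) -> bool:
    # fullmatch succeeds only if the entire string fits the pattern.
    return USERNAME.fullmatch(name) is not None

candidates = ["alice99", "Bob", "x", "carol_7"]
valid = [c for c in candidates if is_valid_username(c)]
print(valid)  # ['alice99']
```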


Predefined Character Classes

These are shortcuts for commonly used character sets.

\d and \D

  • \d matches any digit (the same as [0-9] for ASCII text)
  • \D matches any character that is not a digit
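
A minimal sketch of \d and \D, assuming Python's re module; the sample text is made up.

```python
import re

text = "Room 101, Floor 3"

# \d matches one digit at a time.
digits = re.findall(r"\d", text)
# \D+ matches runs of characters that are not digits.
non_digit_runs = re.findall(r"\D+", text)

print(digits)          # ['1', '0', '1', '3']
print(non_digit_runs)  # ['Room ', ', Floor ']
```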

\w and \W

  • \w matches letters, digits, and the underscore
  • \W matches any character that \w does not
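
A short sketch of \w and \W, assuming Python's re module and an invented sample string.

```python
import re

text = "snake_case, kebab-case!"

# \w+ matches runs of letters, digits, and underscores.
words = re.findall(r"\w+", text)
# \W+ matches runs of everything else (punctuation, spaces).
separators = re.findall(r"\W+", text)

print(words)       # ['snake_case', 'kebab', 'case']
print(separators)  # [', ', '-', '!']
```

The underscore counts as a word character, so "snake_case" stays whole, while the hyphen splits "kebab-case" in two.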

\s and \S

  • \s matches whitespace (spaces, tabs, newlines)
  • \S matches any non-whitespace character
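
A minimal sketch of \s and \S, assuming Python's re module; the sample text mixes spaces, a tab, and a newline for illustration.

```python
import re

text = "one  two\tthree\nfour"

# \S+ matches runs of non-whitespace characters.
tokens = re.findall(r"\S+", text)
# \s+ matches runs of whitespace; here each run becomes a single space.
normalized = re.sub(r"\s+", " ", text)

print(tokens)      # ['one', 'two', 'three', 'four']
print(normalized)  # one two three four
```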

Caution

Always write patterns that use predefined classes as raw strings, for example r"\d+", so that backslashes reach the regex engine unchanged instead of being treated as string escapes.
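
The sketch below shows what can go wrong without a raw string, assuming Python's re module. It uses \b (the regex word boundary) purely for illustration, because in a plain Python string "\b" is silently turned into a backspace character.

```python
import re

text = "cat concat"

# Raw string: the regex engine receives backslash + b (a word boundary).
raw_hits = re.findall(r"\bcat", text)
# Plain string: Python converts "\b" to a backspace character first,
# so the pattern looks for backspace followed by "cat".
plain_hits = re.findall("\bcat", text)

print(raw_hits)    # ['cat']
print(plain_hits)  # []
```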


Exercise

  • Match all digits in a string
  • Match all alphabetic characters using a range
  • Use negation to remove digits from a string
  • Use \w and \s in examples