Learn Regular Expressions in 20 Minutes

By Martin Angelov

You run into a problem and decide to use a regular expression. Now you have two problems. Or at least this is how the saying goes. Regular expressions are a powerful tool that skillful coders leave as a last resort, but when they do use it, they strike terror in the hearts of their enemies (and colleagues).

Regular expressions (or regex-es, as is the correct term for what we use in programming languages today) are specialized languages for defining pattern matching rules for text. They have their own grammar and syntax rules, which every beginner gets wrong. But you don’t have to! Here is what you need to know:

1. Matching a single character

Every programming language has a way of defining and using regular expressions. They have some differences, but the basics which are covered in this article should work anywhere. The examples here are written in JavaScript, so that you can try them out in your browser.

The most basic regexes are those that match a single character. Here are the rules:

  • The dot (.) matches any single character (except a line break, by default). If you want to match the dot as a character, escape it with a backslash: \.
  • A question mark (?) means that the preceding character is optional. If you want to match an actual question mark, escape it: \?
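
The two rules above can be tried out with a short sketch like the following. The sample strings and variable names are made up for illustration; test() is the standard method on JavaScript regex objects.

```javascript
// The dot matches any single character (but not a line break by default).
const dotPattern = /b.t/;
console.log(dotPattern.test('bat')); // true
console.log(dotPattern.test('b t')); // true, a space is a character too
console.log(dotPattern.test('bt'));  // false, the dot needs one character

// An escaped dot matches only a literal dot.
const literalDot = /3\.14/;
console.log(literalDot.test('3.14')); // true
console.log(literalDot.test('3514')); // false

// The question mark makes the preceding character optional.
const colour = /colou?r/;
console.log(colour.test('color'));  // true
console.log(colour.test('colour')); // true

// An escaped question mark matches a literal question mark.
const question = /why\?/;
console.log(question.test('why?')); // true
console.log(question.test('why'));  // false
```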

2. Matching a character of a set

Building up from the previous example, we can write regexes that match only certain characters by using sets:

  • A set is one or more characters enclosed in brackets [abc]. It matches only one of those characters – in this example only a, b or c. You can negate a set with ^. [^abc] will match any character that is not a, b or c. You can also specify a range [0-9], [a-z], which will match everything in the range.
  • There are built-in sets that make writing regexes easier (they are called shorthand). Instead of [0-9] you can write \d and for [^0-9] you can write \D. There are also sets for word characters (letters a through z in either case, digits and the underscore): \w and \W, and for whitespace (spaces, tabs and new lines): \s and \S.
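
Here is a minimal sketch of sets and shorthand in action. The patterns and sample strings are invented for this illustration.

```javascript
// A set matches exactly one of the listed characters.
const grey = /gr[ae]y/;
console.log(grey.test('grey')); // true
console.log(grey.test('gray')); // true
console.log(grey.test('groy')); // false

// A negated set matches any one character that is NOT listed.
const nonDigit = /[^0-9]/;
console.log(nonDigit.test('2024')); // false, every character is a digit
console.log(nonDigit.test('20a4')); // true

// \D is shorthand for the same negated set.
console.log(/\D/.test('2024')); // false

// Shorthand sets combined: a word character, a digit, then whitespace.
const roomCode = /\w\d\s/;
console.log(roomCode.test('B7 is free')); // true
console.log(roomCode.test('B-7'));        // false
```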

3. Matching words

Most of the time, you will want to match entire words, instead of single characters. This is done by using quantifiers, which repeat a character or a character set. These are:

  • +, which repeats the preceding character or set one or more times
  • *, which repeats the preceding character or set zero or more times
  • {x} for an exact number of repetitions, {x,y} for between x and y repetitions (where x and y are numbers)

Also, there is the special \b pattern, which matches the boundary at the start or end of a word (it matches a position, not an actual character).
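
The repetition rules and the word boundary can be sketched as follows. The sample sentence is made up, and the g flag (not covered above) is an addition here that makes match() return every match instead of only the first.

```javascript
const text = 'The year 2024 had 366 days, not 365.';

// \d+ matches one or more digits in a row.
const numbers = text.match(/\d+/g);
console.log(numbers); // [ '2024', '366', '365' ]

// \d{4} matches exactly four digits in a row.
const fourDigits = text.match(/\d{4}/g);
console.log(fourDigits); // [ '2024' ]

// \d{3,4} matches between three and four digits.
const threeOrFour = text.match(/\d{3,4}/g);
console.log(threeOrFour); // [ '2024', '366', '365' ]

// \b\w{3}\b matches whole words that are exactly three characters long.
const threeLetterWords = text.match(/\b\w{3}\b/g);
console.log(threeLetterWords); // [ 'The', 'had', '366', 'not', '365' ]

// * allows zero repetitions, so ab* matches "a" even with no "b" after it.
const starMatches = 'a ab abbb'.match(/ab*/g);
console.log(starMatches); // [ 'a', 'ab', 'abbb' ]
```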

4. Matching/validating entire lines

In JavaScript, this is the type of pattern you would use to validate user input from text fields. It is just an ordinary regex, but anchored to the start and end of the text using the ^ (start of line) and $ (end of line) anchors. This makes sure that the pattern you write spans the entire length of the text, and doesn't match only a part of it.

Also, in this case we use the test() method of the regex object, which returns true if the regex matches the string and false otherwise.
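
As a rough sketch of anchored validation, assume a hypothetical username rule of 3 to 16 word characters. The rule, the isValidUsername helper and the sample inputs are all invented for this illustration.

```javascript
// Hypothetical rule: 3 to 16 characters, only letters, digits and underscores.
const usernamePattern = /^\w{3,16}$/;

// Returns true only when the whole input fits the pattern.
function isValidUsername(input) {
  return usernamePattern.test(input);
}

console.log(isValidUsername('martin_42'));  // true
console.log(isValidUsername('ab'));         // false, too short
console.log(isValidUsername('not valid!')); // false, space and ! are not allowed

// Without the anchors, a match anywhere in the text is enough.
const unanchored = /\w{3,16}/;
console.log(unanchored.test('not valid!')); // true, it matches "not"
```

The last line shows why the anchors matter: the unanchored pattern accepts input that merely contains a valid-looking fragment.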

5. Search and replace

Another common task that often calls for the use of regular expressions is searching and replacing text. There are two basic ideas here:

  • A group is a set of patterns enclosed in parentheses (). Each group collects the text that was matched by the patterns inside it. The text matched by each group can be referenced later, for example in a replacement string, with indexes prefixed with dollar signs (starting from $1 for the first group).
  • Each group is also available in the pattern itself as a back reference: a backslash followed by the group index, starting from 1 (for example \1). This is only rarely used, so you can blissfully forget about this feature.
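
Both ideas can be sketched with the standard replace() method of JavaScript strings. The sample strings and variable names are made up for illustration.

```javascript
// Two groups capture the first and last name.
// In the replacement string, $1 and $2 refer to what each group matched.
const fullName = 'Martin Angelov';
const swapped = fullName.replace(/(\w+) (\w+)/, '$2, $1');
console.log(swapped); // Angelov, Martin

// Back reference: \1 matches the same text that group 1 just matched,
// so this pattern finds a word that is immediately repeated.
const doubledWord = /\b(\w+) \1\b/;
console.log(doubledWord.test('this is is a typo')); // true
console.log(doubledWord.test('this is a test'));    // false

// Both features together: collapse the repeated word into one copy.
const fixed = 'this is is a typo'.replace(/\b(\w+) \1\b/g, '$1');
console.log(fixed); // this is a typo
```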

Resources and further reading

And this concludes our quick overview! If you learn what was presented in this article, you will be well prepared to solve 80% of the problems that involve regexes. For the other 20%, try these tools and resources:

Source: Tutorialzine.com