The Regex tool is a super-powered search-and-match tool, like “Find” in a text editor but much more flexible. It uses a pattern-matching language that lets you describe what you’re looking for rather than searching for exact text.

For example, you could use (\d+) to extract numbers from a column. See the examples at the end of the page for more.

Regex (short for Regular Expressions) uses a special pattern language to extract specific elements from a text/string column.
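
To make the (\d+) pattern concrete, here is a minimal Python sketch of what extracting a number from each value in a column could look like. The sample values and the `first_number` helper are hypothetical, and the sketch assumes the first match per row is returned; the Regex tool's own engine may behave differently.

```python
import re

# Hypothetical column values, for illustration only.
order_notes = ["Order 66 shipped", "No number here", "Invoice 2024 and 17"]

# One capture group that matches a run of one or more digits.
pattern = re.compile(r"(\d+)")

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    match = pattern.search(text)
    return match.group(1) if match else None

extracted = [first_number(value) for value in order_notes]
print(extracted)  # ['66', None, '2024']
```

Note that the last value contains two numbers, and this sketch keeps only the first one.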

Regex Cheatsheet

Configuration

The Regex tool has three required inputs.

1

Select Column

Select the column you want to extract text from.

2

Input Column Name

The Regex tool will create a new column. Use this input to name it.

3

Input Regex

Input your regex statement. You can only extract one regex group at a time. Remember to add parentheses around the group you want to extract.
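
The three inputs above can be pictured with a small Python sketch. It is only an approximation of the idea, not the tool's actual implementation: the `add_regex_column` function, the sample rows, and the check for exactly one capture group are all assumptions made for illustration.

```python
import re

def add_regex_column(rows, column, new_column, regex):
    """Hypothetical stand-in for the three inputs:
    column (Select Column), new_column (Input Column Name), regex (Input Regex)."""
    compiled = re.compile(regex)
    # Assumption: exactly one capture group, mirroring the one-group rule.
    if compiled.groups != 1:
        raise ValueError("pattern must contain exactly one capture group")
    result = []
    for row in rows:
        match = compiled.search(str(row[column]))
        new_row = dict(row)  # copy so the input rows are left unchanged
        new_row[new_column] = match.group(1) if match else None
        result.append(new_row)
    return result

# Hypothetical data: pull the domain out of an email column.
rows = [{"email": "ana@example.com"}, {"email": "not an email"}]
out = add_regex_column(rows, "email", "domain", r"@([\w.-]+)")
print(out)
```

The parentheses in `@([\w.-]+)` mark the part to keep: the `@` must be present for a match, but only the text inside the group ends up in the new column.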

When To Use

The Regex tool is helpful when you have suboptimal data quality and need to standardize the data:

  • You have different date formats (e.g. US, European and ISO) and would like to standardize these
  • Your HR systems use different time formats (24-hour clock, AM/PM, etc.) and you need to standardize these
  • You need to extract social media handles or hashtags from customer feedback or mentions
  • You need to extract and standardize ratings from customer reviews
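
As a rough illustration of two of the cases above, the Python sketch below pulls a social media handle and a rating out of free text. The patterns, sample strings, and the `extract` helper are hypothetical, and real feedback data will likely need broader patterns.

```python
import re

def extract(regex, text):
    """Return the first capture group of regex found in text, or None."""
    match = re.search(regex, text)
    return match.group(1) if match else None

# Assumed handle format: "@" followed by letters, digits, or underscores.
handle_pattern = r"(@\w+)"
# Assumed rating formats: "4/5" or "3.5 out of 5".
rating_pattern = r"(\d(?:\.\d)?)\s*(?:/|out of)\s*5"

handle = extract(handle_pattern, "Thanks @acme_support for the help!")
rating_a = extract(rating_pattern, "4/5 stars")
rating_b = extract(rating_pattern, "Rated 3.5 out of 5")
rating_c = extract(rating_pattern, "Great product")

print(handle, rating_a, rating_b, rating_c)
```

The `(?:...)` parts are non-capturing groups, so each pattern still has a single capture group even though it matches several rating formats.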

Examples