The Filter tool is one of the most used tool in any data analytics process. The Filter tool is designed to selectively pass through data that meets particular conditions set by the user. In other words, you make a rule that each row has to fulfill and in doing so the output of your Filter tool will typically be a smaller dataset measured by the number of rows. You can use it to remove or keep null values, keep rows where an amount column is larger than a value or remove rows where a text column does not equal a certain requirement.

Configuration

The Filter tool is comprised of three required and one optional input.

1

Select Column

Select the column you want to filter on. This the column that you will use as the foundation for your condition.

2

Set Condition

Select the condition that you want to filter your column by. Contains, Starts With and Ends With are reserved for text/string columns while Larger Than (Or Equal) and Less Than (or Equal) is best suited for dates and numbers.

3

Input Value

Input the value that you want to filter with. Use double-quotes ("") around text/string columns and dates. When working with integers and decimals, you don’t need double-quotes. You can also reference existing columns in your dataset for dynamic filtering (see “Filtering a Date column” example). For instance, if you want to all pass through all the data where a transaction_year column is equal to the current year. Just start typing the name of the column and you will see the columns in your dataset.

You can input Null to catch the rows with a null value
Booleans: Input 0 for FALSE and 1 for TRUE.

4

Optional: AND/OR

You can choose to add AND or OR clauses to your filter which is a mechanism to chain conditions together. You can use AND clauses if you want two conditions to be TRUE at the same time or OR clauses to have two conditions valid at the same time. See more in the examples below.

When To Use

You can use the Filter tool for a wide number of use cases. In general, when you need to control which data points are passed through for the analysis, the Filter tool is a good choice, such as:

  • Removing rows that do not meet a number-based condition, like age or cost limits
  • Selecting rows from a dataset where a given column is not empty (null)
  • Keeping only true values from a boolean (true/false) column
  • Limiting the dataset to a certain time period using date or datetime columns
  • Removing rows where text columns contain with specific keywords, prefixes, or suffixes

Examples