
Configuration
The Group By tool consists of one required input and one optional input.1
Select Column(s) to Group By
Select the column(s) to group by. You can input as many columns as you need. Adding more columns will make your groups/boxes more unique as the “boxes” become more specific.This is technically an optional input. You don’t have to greate any groups if you just want to
2
Select Aggregations
Add your aggregation(s) pairs by selecting the column you want to aggregate and how you want to aggregate them.You can choose from the following types of aggregations:
- Sum: summarize numeric values
- Concatenate: get all your string values in a comma-separated cell
- Mean: find the average of numeric values
- Count: count the occurances
- Count Distinct: count the unique occurances
- Std. Deviation: find the standard deviation
- First: get the first value in the group
- Last: get the last value in the group
- Min: find the smallest numeric value
- Max: find the largest numeric value
When To Use
The Group By tool can be used to solve a wide range of different problems:- The Group By tool is perfect for any analysis the requires aggregating data
- Use max or min on a date column to find the latest/earliest date a customer bought something, an order was delivered, or when any other event occured the first or last time
- Aggregating different datasets into similar boxes can be very smart to transform them into similar boxes before using the Combine tool to merge the two datasets. It’s essentially a method to create identical IDs in datasets that don’t naturally match - for example by aggregating by Month.
- If you want to use an “IN” function, you can do that by using the concatenate function in the Group By tool along with the Append and Filter with a Contains functions.
- Whenever you want to count rows within a group of customers, suppliers, months, etc. the Group By tool is excellent
Examples
One Group; One Aggregation
One Group; One Aggregation
I want to find the average session length in each of my customer groups

Without Groups
Without Groups
I just want to find the earliest date that I onboarded any customer

Multiple in Groups
Multiple in Groups
I want summarize my amounts by the Month and Year
If you forget the Year column in your groups, you would end up mixing amounts from multiple years

Multiple Aggregations
Multiple Aggregations
I want to get the sum of my revenue and the average revenue in each my customer’s industry

First versus Min
First versus Min
I want the first record (the one at the top) in each of my groups

If you wanted the smallest value, you should use Min - see the second screenshot

