Group By
Separate data in groups and perform calculations within those groups
The Group By tool is one of the most used tools in Less. The Group By tool works by sorting data into boxes where each box represents a group. Once everything is in its box, you can count, total, or summarize what’s in each one. It’s a way to organize and analyze similar items together. To exemplify, “group by customer_type average session_length” could be translated to “put my data in the customer_type column groups and provide an average of the session length for those customer types”.
Configuration
The Group By tool consists of one required input and one optional input.
Select Column(s) to Group By
Select the column(s) to group by. You can input as many columns as you need. Adding more columns will make your groups/boxes more unique as the “boxes” become more specific.
This is technically an optional input. You don’t have to greate any groups if you just want to
Select Aggregations
Add your aggregation(s) pairs by selecting the column you want to aggregate and how you want to aggregate them.
You can choose from the following types of aggregations:
- Sum: summarize numeric values
- Concatenate: get all your string values in a comma-separated cell
- Mean: find the average of numeric values
- Count: count the occurances
- Count Distinct: count the unique occurances
- Std. Deviation: find the standard deviation
- First: get the first value in the group
- Last: get the last value in the group
- Min: find the smallest numeric value
- Max: find the largest numeric value
When To Use
The Group By tool can be used to solve a wide range of different problems:
- The Group By tool is perfect for any analysis the requires aggregating data
- Use max or min on a date column to find the latest/earliest date a customer bought something, an order was delivered, or when any other event occured the first or last time
- Aggregating different datasets into similar boxes can be very smart to transform them into similar boxes before using the Combine tool to merge the two datasets. It’s essentially a method to create identical IDs in datasets that don’t naturally match - for example by aggregating by Month.
- If you want to use an “IN” function, you can do that by using the concatenate function in the Group By tool along with the Append and Filter with a Contains functions.
- Whenever you want to count rows within a group of customers, suppliers, months, etc. the Group By tool is excellent