Combine
Merge datasets horizontally
The Combine tool is probably the most important and used tool in Less. It is used to merge datasets horisontally via one of more IDs.
The Combine is called a Join in SQL, Merge in Pandas and XLOOKUP in Excel.
Conceptually, it can be helpful to understand the Combine tool by imaging that you and your friend are comparing your favourite movies and would like to create a common list of your joint favourite movies.
The Inner version of the Combine tool is like creating a joint list of all the movies that you both like. You will end up with a list containing only the movies that you both have as your favourites, i.e. it’s all about what matches between your individual lists.
For the Left version of the Combine, the starting point is your list of favourite movies. Every time your friend also likes a movie that you like, you add that to the joint lists. But you also keep all your favourites movies that your friend doesn’t share on the joint list - you just note in the joint list that your friend doesn’t have that particular movie as a favourite. Less will create a NULL value for those movies that only you have as a favourite.
The Right version of the Combine is essentially the reverse of the Left version. The starting point is your friend’s list of favourite movies. All of your friend’s favourite movies will be on the joint list and your favourites will be added when your friend also has that particular movie as a favourite.
You can view an illustration of each of the versions below. In the examples, we are combining by user_id for each of the three versions.
Configuration
The Combine tool configuration requires specifying two inputs.
Note that the L and R of the Combine anchors refers to left table and right table respectively.
Select Join Type
Choose whether you want to use the Inner, Left or Right version of the Combine tool.
Add ID Pairs
You need to add the ID pairs that are used to combine your datasets. In the movie example above, the column would be called movie_title. In the visual examples above, it would be user_id.
You can add as many fairs as you need with the “Add additional columns to join on”-button. Check out the examples below.
When To Use
The use cases of the Combine tool are very diverse. At the core, whenever you need to look-up data, map data or combine datasets, the Combine tool can help.
- Merging datasets by one or multiple IDs that these datasets share
- Combine datasets via dates or aggregated dates (month, quarters, years)
- To find records that are not present in a another dataset, for example find all invoices (invoice table) that haven’t been paid (transaction table) (see examples below)