The join tool allows you to combine two data sets linked by a common field, usually by a unique identifier, such as a customer ID, email address, or phone number. ultrix also lets you join two data sets based on record position. Which means if you had 30 Records in one set, and five records in the other, the first five records in both sets will join horizontally, even though there's nothing common between them. Referring to the illustration here, the first picture shows our two data sets. The first is employee information, consisting of employee ID, first and last name, age, position, and department ID. The second data set contains department information in which includes department ID, department name and department email.
We essentially want to find the department name for each employee, as just showing the ID at the moment isn't too helpful. So the steps required here are dragging in two input data tools, a join tool, and then joining by a common field, and then outputting. The results in the third picture, to see the department ID, department name and department email joined to the employee data set. Let's start a new workflow by adding our data sets. We'll go to the Insert tab, drag in to input data tools. And the first one will be 3.1 point one employee and the second one will be department will then go to the join tab and dragon join.
And you will notice that there are two inputs to this one, a left and a right. Make sure that the employee is joined to the left and that the right employees is joined to department. Looking in the join tool, there are three types of joins, we can perform a left in a and writes join. Let's go through what each of these mean. I'll start with the inner join, because I think that's easiest to explain. Inner Join means that the joint only returns records were found matching IDs in both data sets.
Therefore, looking at the department file, we have department IDs of 100 200 300 607 hundred. If we go to our employee file, we have department IDs of 1234 and 500. Therefore, the IDs that match are only 100 203 hundred and we should only get back 17 Records. Left join means that you We'll return the left data set, which is employee when it doesn't find a match between the two data sets. Because we know that there are 17 records that match the data set from our inner join, we should see 13 records that don't match. In other words, the 405 hundred department IDs.
Right join means that it will return the right data set, which is department when it also doesn't find a match between the data set. We know that department ID 100 203 hundred exists, and 6700 don't. So we should get back these three records. If we add a browse tool to the outputs of our join tool using Ctrl Shift B and let's update our join tool to join by specific fields and the common identifier between the two data sets is department ID. So, on the left inputs, we have our employee file, and we'll select a department ID. And on the right, we have our department file, and we're going to link department ID with it.
Let's run our workflow. And starting with our inner join output in the middle, we should see 17 records that matched between the two files. So what was common between the two was department IDs 100 to 300. If we go to our left join, we should see 13 records that didn't match but it'll return the data from the employee file and 13 Records here. And 405 hundred were the IDs that didn't match the department file. And lastly, with the right join, it will return the right data set which was department information and because department I 607 hundred couldn't be found in employee.
We have our two results here. Let's start a new workflow and import spreadsheets 3.1 point two. So we'll delete these ones here, dragging two new input data tools and connect to 3.1 point two. So have 2018 sales and 2019 sales. Our first data set contains 2018 accounting data per department. So that's department ID, revenue, expenses, profit and the year of the file.
The second data set contains the same data, but for 2019 because the two data sets are identical in schema, in other words, they have the same columns have the same time departments per record, we can use the join by record position to combine the two data sets. So let's drag in the join tool connecting both. And we'll join by record position and add a browse tool to the output of our join tool and run the workflow. Both sales figures are combined together into one table. We've now joined our sales data by record position. To make it a bit easier to read, we can add a select tool to the output of our join and maybe give these field names something more meaningful for 2018 revenue 2018 expenses 2018 profit, we can probably remove year from it as well.
And the right department ID so we can only keep the first one. We'll add 2019 revenue, expenses and 2019. profit. let's rerun it. And you can see here that per department ID we have the 2018 and 2019 figures looking a lot more legible than it was before.