Hi. In this lesson, we understand how to join tables to get data from multiple tables. Typically, in relational model, the data is not present in a single table. It's generally spread across tables. We don't find all the details of employees in employees table. We find details regarding job done by employ in job stable.
We get information about department name and others in departments table. So how do we display the employee name and the job that the employee is doing not just the job ID but the job title and also the department name, not just the department number. So to get information about employ from different tables, we need To use joining join is very important operation in relational algebra. In relational algebra, we have already seen how to deal with select and project. And now let's talk about join. In order to join two or more tables, you need to have a column that is common in those tables.
Common need not be a column with the same name, but it should have the same data. So let us see how we can join tables. The first of the options is natural join. We have in fact three different ways to join tables. natural join is the simplest of all, but not always reliable. The next option is using using class by using Using class we can specify on which column we want tables to be joined.
And finally on, which is the most flexible. And it allows you to specify exactly how you want tables to be joined. And then we move on to something called OUTER JOIN. Because what we do in the beginning is also known as inner join. I'll explain the difference between INNER JOIN and outer join, when we get to outer join, and here is our first example, with natural join. natural join is where all columns with the same names in both the tables are used to join the tables.
If I take employees and jobs, the common column is job ID. So natural join is using job Have implies and joining implies with the jobs using that job ID. It means wherever you find the same job ID between employers and jobs, the roles are joined. And we get first name from employee's table, job title from job stable. Let's switch to our SQL Developer. And here is my command, select first name and the job title from employees natural join jobs.
All that I need to do is this and Oracle understands it has to take the common columns from these two tables to join the tables. So here we go. Run it, and it is going to give you the name of the person and the job title. If you want to arrange them in any specific order You can always give order by one, which is the name of the person. So here you are, you're getting the first name in the sorted order, followed by the job title. But you don't find job title in employees.
So this is coming from job stable. That's what joining is all about. While you're joining the tables, you can also use WHERE clause. So this is where I'm joining imply stable with jobs table. Again, I'm using natural join. Well, the order of these clauses is important.
You need to give where after that, but you can use whatever condition you like. So I want to pick up the title and first name of implies where the salary is greater than 10,000. So this is where you Get only those employees with salary greater than 10,000. Of course, if you want, you can include salary as well, to be sure that we are getting what we want. There you are. So we get only those employees where the salary is greater than 10,000.
So we can include where class along with join. So pick up the title from jobs first name from employees salary from employees, we don't need to mention anything about the table from where it needs to be taken. Because Oracle knows that it is present only in employees. So we need not mentioned that. But if you have the same column in multiple tables, if you want to specify from which table you want to take that you need to qualify the column with the table name, we'll see that little later. Now this is where I'm trying to join employers and jobs, but this time, I want to know what is the total salary for each job.
If you can recollect, we did this in the past, we did group the data based on the job ID, but this time we are grouping data based on job title. This is where we are displaying the job title and the total salary of all the employees doing this job. So, this is how you can combine joining with grouping. You can use natural join to join employees and jobs because there is only one common column between these two. But how about using eight in the context of departments? Let us say I want to get the department name so depart Name, sum of salary from implies natural join departments and group by, of course, whatever column we give here the same we need to get there.
So if you go ahead and run this, the output is not quite what you want. And this is not what you expect. It's because natural join is saying it is going to join tables by using all the common columns. between departments and employees. There are two common columns. The first common column is department ID.
The second common column is manager ID. But the manager ID in departments table is different from the manager ID in employees table. In employees table manager ID means the supervisor or Manager of the employ in departments table manager ID is the ID of H od the head of the department. So it is trying to join departments and employees using those two columns, which is not our intention. But that's what the problem with the natural join is. It takes all the common columns that are available in those two tables.
But what if you don't want to do that? Well, that's where exactly you need to use using class. Using is specifically mentioning which column should be used to join the tables. So we have to now use using this is how you can get the name of the employee and the department name using only department ID. So we very Specifically and explicitly mentioned, we want only department ID to be used then go ahead and run you get the name of the person and the department name, this is the data. So, we can use using class to make things very explicit.
So you can go back to this and then remove this natural join and convert that to using so what we say employs join department and then we say using a department ID. So we are talking about department ID and then grouped by department name. Well, we can just make it even more interesting by grouping followed by ordering. This is the way we want to do it. Now. This is the total and this as you can see is quite different from the output we got earlier because When we use the natural join, it was not actually joining it based on department ID alone, it was unnecessarily bringing manager ID which jeopardized the whole process.
Now, this is using and we prefer using to natural join, because using is more explicit, and you don't have to worry about other columns other than what you explicitly mentioned. But of course, both the tables should have the same name. Only when tables have the same name, it is possible otherwise it is not possible. Now the next one is on as you can see here, in this we explicitly mentioned, not just the column name, but we talk about the exact condition. So I'm saying I want to join employees with the department based on department ID in employees and department ID in departments. And we want to refer to employees with alias II and departments with alias D. And we are using a dot department ID and that means department ID should come from employees.
And here we use the alternatively you can leave these aliases outside and directly refer to table name here, but as that is generally lengthy, we prefer to use alias and why is this needed? Can we do the same with using Yes you can, but there are cases where you cannot accomplish this without using on. So let us see an example. I want to find out The department name and the manager name, but the problem with the departments and managers, the manager ID in department is nothing but the employ ID in employees table. So, this is what I need to do, I need to get the department name and then the first name of the employee and this is nothing but the manager of the department and the data should come from departments and you join that implies, but this time, you are not going to use using because the column names to be joined or not same.
This is important. The column names are not same. In departments. The column name is manager ID, button implies the column name is employee ID. So you have I have no other option but us on. So, you have to say on join the tables and also we prefer to give alias as usual.
And on what the dot manager ID, the main is that ID is what you need to take. So many is that ID from departments table is to be taken, it is to be compared with the E dot employ ID, this is what you need to do. So, when you give a condition like this manager ID from departments is to be compared with employee ID, then you get the department name and corresponding manager's name. So, that's the way on is use. There are cases where on is a must, you cannot ignore on. So in the previous case, you can use Are you can also use simple using, but in this case on a is inevitable because the column names are not same.
And even if you go to our courses table and students table you cannot join them unless use on because in students table the column name is caused, but in courses table the column name is called. So you find good number of such cases. So on is what you need to use. So, three options for you natural join using class on you can use any of them. You can combine joining with the grouping with other options like were ordered by you can put any of them together whichever way you like. And now what if you want to join not two but multiple tables When you join multiple tables, like this implies with the departments and jobs, you need to join two tables at a time using a common column.
So I joined employees with the departments using department ID, and then implies with the jobs using job ID. And this is going to be the query. So I'm saying compare department ID of employees with the department ID of departments, and then join that result with the jobs and compare a job ID of employers with job ID of jobs. And this can be done even by using using class if you like, but it is more explicit if you use on but it's your choice. So let us see. This is the way we join three tables.
So, I want to get the employee name, department name, job title, I want to get the details from employees table departments table, as well as jobs table. So let's go and run this. And you see the details from three different tables, the person name, department name, and the job title. And it's possible for you to convert this to using How do I do that? I just need to use using here I can say using and here I said department ID. And then I say join that.
Okay with the jobs using again, and this time we talk about job ID, it's okay if the same name is present in multiple tables. And this is our modified version with the same result. So you can And use sometimes using in place of on, but in some cases on is mandatory because the column names are not same. So, this is the way you can join more than two tables if and you can join for five even 10 tables if that is needed. But as we have seen, you just need to know how to join tables, what is the common column between the tables. So, all this is called as INNER JOIN or simple join.
But interestingly, sometimes we want to join a table to itself that is called as self join. It is a kind of join, where we join the table to itself. And we have a good example in our database. I want to display the name of the employ The manager's name, but name of employ is coming from employees table. What about the manager? So if employee ID is one or two, if manager is one, not one, that one not one is again present in employee table.
So we have one not one also in the same table, right? might be one or three is employ one or two is his manager. So, this one not two is present in the same table. But how do we get the name for one, not three, then the name for one or two. The trick is to treat the same table as if it were two tables. So, we treat employees as two tables.
Copy one is called e copy two is called M and then we try to join those things. Tables using manage that ID of employees with employee ID of employees, but this time we call it as m not E. So, first a copy is referred as a second copy is referred as M. And we try to join those two copies using this condition. When a table is joined to itself we call it as self join. In some cases self join plays a vital role and this is where we are talking about self join. So, here we are talking about two tables and we are going to copy this let's put this here. Now, I'm trying to display this first name as imply this first name as manager and from employees.
He First copies name, second copy is treated as M. And this is how we join that go ahead and run. And you see the employee name and manager. So this is the way you join the tables. And this is an interesting variant of join called self join, you join the table itself. So whenever you see that same table being used twice, with the different aliases, you know, it is self join. Well, all this is fine.
So let's just get rid of all that. And let us see something interesting. I want to display the department name and the hedge body name. And we know it is already possible, right? How do we join them? Here we are.
We can display the department name and the last name or first name, whatever you like of the employ from departments join employees. And we would like to join using something called order by. So let us say, we'd like to join using join, and then on, on what, let's give D here and then give either. So d dot manager ID is equal to e dot employ ID. And then we want to just sort it by the first column, which is department. Let's go and run this.
This is going to give you the department name and hedge body for that. This is the name of the manager. Let's call it as manager. So this is what I want to display. But if you see something interesting, I don't have just 11 days. departments I have more than 11 departments.
Let us find out what departments we have. Let's go and run a simple command select department name from departments, this is giving me not just 1127 departments you can see that here always you fetch it 27 rows, but when I executed this command, it's giving me only 11. Why so, well, this is called inner join. Inner Join means only the rows that are present in the parent table. In this case departments is the parent ID table. And employs is called child table.
Only when the parent ID table and child table have the common rows. They are considered in departments table, we have Have some departments that don't have any employees that means those departments have no manager and when there is no corresponding manager for the department, these departments are not considered in normal join, that's called inner join. So, INNER JOIN is saying I will pick up only those rows from parent that have corresponding child rows. So, if I have a department let us say to not to our let us say 200 and if this department has no employ, no man is nothing, then this is not considered. Can we afford to have a parent without child Ross? Of course, can we have a child row without parent?
No. So, it's very common to have some departments without implies might be Some customers without any transactions, some accounts without any transactions, there might be some courses in the college. Nobody took them so far. So it's okay to have a parent without child rows. But what if I want to take those parent rows, even though they don't have child rows? Well, that's where we get what is called as OUTER JOIN.
And here is OUTER JOIN. I say pick up all the departments along with employs even if department has no employ. Now, this is a slightly different command, but it does demonstrate what is OUTER JOIN. So get me the department get made, employ and get me the department even if he doesn't have any employees go ahead and run, you find the departments that don't have any employees also. So these are the departments that have no employees. There are departments that have employees and we get all of them here.
But when the department has no employees, still I want to consider the department. Now what is left here, it indicates on the left we have the parent table. for any reason, if you want to have this to be treated as the parent, you can change this to Right Outer Join means you are treating the right set table as parent. So in nutshell, OUTER JOIN is all about retrieving rows in the parent table, even though they don't have child rows. So you can use left outer join, Right Outer Join. And also you can use full OUTER JOIN means you can start rows that are present only in the left side table place only in the right side table, of course, place the common rows.
In that case you should go for full OUTER JOIN. So, this is OUTER JOIN. And we have an example here, where I trade departments as parent table employees as the child. And here are some more examples for that. This example is very important. I want to know how many employees are there in each department.
I want to get the number of employees in each department and if you go for normal join, and if you sought the data and then first group that data by department name and sorted by the name, you get something like this. You're getting 11 departments and the count of implies. But you see, I'm not talking about live and I'm talking about 27 departments. And why am I getting only 11? Same old reason. So let's just change it to left outer join.
And then if you run this you include all the departments, but there is a problem here, we are getting all these departments with one employ, but actually they have no employees at all. The problem is with the count of star, because count star is counting rows and there is a row in the result for this department because of Outer Join, so we need to now change this From count of star to something like count of employ ID, because count counts the number of not null values, but employ ID for those departments that don't have employees would be null. So now go ahead, run it, the problem is solved. This is the way you can include all the departments and the departments with employees and without employees. And if you sort the data by two, then you get that in the order of count. So you are getting a lot of departments with the count to zero because they all have no employees in employees table.
So that's called Outer Join a very important thing and we talked about INNER JOIN, self join OUTER JOIN. There are a few other flavors like non queue join. But as it is a course for beginners, I want you to first get comfortable with the most common things before you try to explore what else is available. So in this course, we talked about joining tables. We applied natural join using class on class. And then we understood how to join more than two tables self join.
And we also talked about how to deal with left, right, and full OUTER JOIN. So we understood something extremely important. Experiment with all the queries and try to write a lot more queries, with more about querying. We'll review in the next lesson.