Hi. In this lesson, we talk about subqueries. After understanding how to group the data, how to join tables to get data from multiple tables, now it's time to explore sub queries, which are pretty powerful. Now, we have a query, which is providing input to another query. As you can see in the diagram there, I want to execute a query First, get some data and supply this as input to another query. This is called main query.
This is called sub query, also known as inner query outer query. But why do we want to do this? Why do we want to provide some data from one query to another query? Let's understand some requirements. I want to display the details of employees who are having salary greater than average salary of employees. Normally, I can just get the average salary in this way that is something like 6461.
Now, I want to find out the employees where salary is greater than 6461. Now, this is how you can do it, it works and these are their employees, some 51 employees having salary greater than average salary, but this is a tedious process, you have to first find out the average salary, then use that in the next query in order to get implies when salary is greater than this. How about clubbing these two into one that's exactly what Sub query allows you to do, this is your main query, it remains the same. But instead of giving a value directly, we give a query, the result of the query. In this case, average salary is made available to our outer query. So this is how our inner query value is sent to outer query.
And outer query depends on the value coming from this. This is one of the use cases. But what if I want to get employee with the highest salary? Fine, it's very simple. You just need to say maximum of salary. Just make sure what you send and what you receive are compatible.
The columns are supposed to have the same kind of data type. Now here if you run you get employ with highest salary. Now in both these cases, in this and in the previous query, we were sending only one value from sub query to main query. But sometimes the sub query might be sending multiple rows multiple values. So this is where we bring a slightly different structure using in. So get me implies if employee ID is present in the list of employee IDs coming from job history.
So if the job history is having the employee ID that is there in employee's table, I'm interested in the details of the employ. IT pros this employ has done a job in the past because his role is present in job history. Just Got their run, you get employees who did a job in the past. According to our data, seven rows are selected. These are the employees who did a job in the past. You can also use not in if you want to find out employees who are not in this list.
So go ahead and run and you can see a lot more. Well, in SQL Developer, I said, I want to take only first 250 to start with. So though we have more than 50 rows, it is fetching only 50 rows to start with, well, that can be changed if you want. Now this is about in, not in, and how we can use sub query. Well, your sub query can be as complex as you want. This is where I want to find out the jobs for which we have more than 10 employees.
Find out the job ID where the count is more than 10 and get me all the details of those jobs. There we are, we are getting it. So, similarly, you can find out all the departments in which the average salary is more than 5000 are the number of employees is more than 20 whatever you like. So, fundamental concept remains the same sub query is executed First, it is sending some data back to the main query, it might be one value, or it might be a collection of values. It depends on your requirement. But based on what the subqueries sends, your main query is expected to use appropriate operators.
If you're using in it's fine to get multiple values, but if you're using equal to or greater than And you are expected to have only one value from sub coding, otherwise it will be an error. For example, if I say equal to here, it's going to throw an error. And the error clearly says sub query returns more than one row, because the context expects sub query to return only one, not multiple rows. So that's about sub query. And here are some examples. We have already seen how to use group by and having in the sub query, and we have seen how to select a single value and also multiple values.
Then, there is another variant, which is nested sub query. In this, we can have a sub query again, depending on another sub query like this, the nesting can go to many levels, but when you're nesting too deep performance will be an issue. But generally a few levels of nesting like three to five is very common. Depending on your database, the number of tables, it can be even higher. Now, what am I trying to do here? The concept is simple.
To start with, I want to pick up all the location IDs related to city, Tokyo. This will send location IDs, find out all the departments that are in those location IDs. And this will give me a set of department IDs. Now get employees who are working in those departments. So the innermost sub query is executed first. It sends result it to next to sub query.
And there it is going to sell some more values and it'll send them to the main query. So this is the order in which those three are executed. Now, let us see how it works. I just made a small change, I went with a Toronto, not Tokyo. Also be careful. Generally I write queries in uppercase to make them more visible to you.
But when it comes to the name of the city, you got to put it in the correct case, because comparison of strings is always case sensitive. If I put it in uppercase, no city will match that value. So make sure you use appropriate case when you're dealing with strings. Now here if I go ahead and run, it is saying there are two employees who are working in a department that is in Tokyo. So city Tokyo, get all the locations. Find out the department and display the details of employees in those departments.
You can also do it in the reverse way. I mean, I want to know in which city employ 115 is working. So we just need to reverse the order. So I say select city from locations where location ID is I can go for equal to because for one employee there is only one department. So here I say location ID from department where department ID in or is equal to, and now the third one is getting the department ID from employees for which employ the employ ID is given employee IDs 115 Whatever you want, this is just the same, but we are moving in a different direction. Well, so employ 115 is working in Seattle.
So that's the city of the implied, you can't get that value unless you get the department ID and find out the location ID of the department. And then you get the city based on the location ID. Well, what if you want to know what is the country of this employ in which country is he working? Well, you know, we have countries table also, which is linked to location stable, so you can go further and find out the country also. So that's about nested sub query, a query within another query, like that you can have many subqueries a slightly different scenario is a multi Typical subqueries where a main query depends on multiple sub queries, I want to find out the employs where the job of the employ is programmer. Again, please be aware you have to give it in the correct case not in uppercase always.
And then if the employ job is programmer or the department is sales department, so find out the department ID for sales. Remember, neither job title is present in employs nor department name. So we need to use appropriate tables, departments and jobs to get the job ID and the department ID so that we can use employees table and those columns to get appropriate the details. So Get all the employees who are doing the job programmer or who are working in department sales. So here is the query, we can have something like this. But again, make sure you give them in the correct case, go ahead and run.
We got a bunch of employees. In fact, 39 employees are selected because there are more employees in sales. And if I say it's not the sales department, I'm looking for something like production department. Well, I say production. And then let us see I think we have a department called production. Well, we got no department exactly with that name.
That's the reason we are getting only employees who are doing the job programmer. Well, if not production, you can go with some other department like admin or whatever you like. Well, we can And find out what are the departments we have. Let's go check that. And here is the data. Oh, it's full name administration.
And we don't have anything like production, but we do have something like purchasing department so we can try that. So here I say purchasing. And then if you go and run this query, you're going to see all the details. So it should be more than five for sure. Yeah. 11 because five implies or anyway it programmers and the rest of them are coming from the purchasing department.
So five programmers and a few of them are coming from purchasing departments are totally you have 11 Records retrieved by this. That's how our Multiple sub queries work one query depends on multiple sub queries. Then comes sub queries in the context of DML. It is possible to execute subqueries in the context of update, insert an event delete. The first example shows we are trying to change the salary of imply 1102 average salary of all employees. So whatever the sub query is returning, I want that to be assigned to salary of this employee.
So it's a sub query in update. And this is where we are inserting a new row into departments table and the new row is going to have 300 as the ID sports is the name of it, but I don't know The manager ID. So I get the manager ID by using a sub query. And you are using a sub query to find out what is the employee ID for employee legs. And the last column is location ID, I'm giving 1000. So insert can be used with sub query update can be used with sub query.
Delete can also be used with sub query. But when we use delete, we don't do any manipulations. We only delete the row. So sub query is used only in the context of where clause. So delete the jobs that are not being done by employees. So find out all the jobs that employees are dealing with.
And if a job is not in that, I want to delete that so this way Delete all the rows that are in jobs table, but do not have any corresponding rows in employees table. So that is called as subqueries in DML. Well, it's also possible to have a sub query in the context of CREATE TABLE, which is a DDL command. When you say CREATE TABLE costly employs as Select star from employees where salary greater than 15,000. Whatever rows this query is retrieving. Of course, this is what we call a sub query, whatever rows retrieved by this all that will be now the rows of this new table.
So we get a new table created, and these rows and columns will be inserted into this new table. The new table is going to have no constraints, except the not null constraints for existing columns. So, when a table is created in this way, new table will not get constraints except not null constraint. Well, if you are particular about constraints then you need to create those constraints after table is created. Now, the next and a very powerful concept is correlated sub query. This is where we are sending data from sub query to main query this we have been doing.
The new thing is we also take data from main query to sub query. We take data from main query and use it in sub query and it is two way main query to subqueries Upgrade to main query. So there is correlation between both the queries. So we call it as correlated sub query. And this is the general syntax, you refer to some column of outer query in the sub query. When a sub query refers to something related to outer query the main query, we call it as correlated sub query.
And here is an example. I want to display employees where their salary is greater than average salary of HR department in which they work. So I want to pick up employees if the salary of the employee is greater than average salary of employees in the department in which their employees work. We have seen a case in the beginning where we wanted employees Where salary is greater than average salary of all employees. But I want to pick up only employees in the same department. That's where this comes into picture implies is used in both outer query and inner query.
But we want to refer to that with alias outer. It's not a keyword you can give anything you like. And here I'm saying find out the average of salary for employees, where the department is equivalent to department coming from outer query. So, whatever is the department ID coming from the main query for that we need to find out average salary. So, if I say employ one, not one belongs to department 20, any salary is 8000. Now when this row is considered We asked subqueries to find out what is the average salary for employees in department 20.
And if the sub query is sending, we send 22 sub query, if sub query is sending 7000 as the average salary, then we select this employ because his salary is greater than the average coming from subqueries. So, when there is a correlation between the main query and sub query, we call it as a correlated sub query. And this is how we executed so if you go down and try this, this is what we need to do. Okay, outer dot department ID. So go execute that and you get to see employees whose salary is greater than average salary, but I can change things slightly I say get me the implies drawing the maximum salary in each department, then I can just change it to salary is equal to max. And then we get only 11 rows because remember we have 11 departments.
So in each department, we are selecting the employee with the highest salary. Well, if two or more employees have the same highest salary in a department, we select all of them. So that's about how we deal with the correlated sub query. And then we have two operators which are used in conjunction with the correlated sub query exists not exist. I want to find out the departments that have employees. We can do it in different ways.
And this is one of the ways so I will Like to get, let's get rid of all that stuff. Now I want to get the departments where they have some employees in employees table. So where department ID in department ID from employees. So we get leaven departments because we know only 11 departments are having employees. But then if I say find out the departments that are not in this list, let's go and run this and then we get nothing. Or is this if we get 11 departments in the previous case, I should get remaining 16 departments, but it doesn't work in that way.
Not 18 is not going to give you the results in the way you expect when there is a null value in this column. In implies there is a null value for department ID, and that's going to jeopardize your result. Well, so this is not to be used in this way when there are null values here. One way to fix that is to use NVL. If you say it's a null value, I want to treat it as zero, then it's fine, you solve the problem. But the other way to solve the same problem is to use exist and not exist.
So here we are. I say get me the departments were not exist. That means where the sub query does not retrieve any rows. If sub query does not retrieve any row not exists is true. But when the sub query doesn't retrieve any row, when the department ID is not having any value that is equivalent to this. It means when there is no employ, who belongs to department that is equivalent to this, this is a correlated sub query and exist and not exist are used in the context of correlated subqueries.
Now, here I send a department ID to sub query and if any employee is retrieved, this is false, but if no employee is retrieved, this is true, no employees retrieve it means no employees present in the department as you can see, currently, we got 16 rows. So, if you're using correlated sub query exists and not exist could be useful. And then you can use correlated sub query even in the context of update and delete here update is changing the salary of the employee to maximum salary of all the employees in the same job. So where job ID is equal to outer dot job ID, and you need to understand when you have a correlated sub query and a normal sub query, whenever the sub query refers to a value coming from the main query, it is correlated. So for imply 110, we change the salary to maximum salary of all the employees in the job employ one one Jiro is dealing with.
If he's an IT programmer, we take the maximum salary of all it programmers. That's what we set to salvage and the next one, we are deleting depart means that do not have any employees is it correlated is because I use this DPT in sub query and that is correlated. So, correlated subqueries can solve the problems which are normal subqueries cannot solve. So you have to be comfortable with both normal subqueries and correlated. This is about the subqueries. We already understood joining and grouping.
Put together these three are extremely important concepts, and these three can get you whatever data you want from database. So, in this lesson, we focused on subqueries. We understood how to deal with the nested and multiple subqueries. We understood how to use subqueries in the context of insert, update and delete. We can solve some problems correlated subqueries and exits and non exists operators are used in the context of correlated subqueries. Well, correlated subqueries are needed even when we deal with update and delete commands.
That's like DML with correlated sub query. So, in this lesson we understood a lot about querying. keep experimenting with that. See you in the next lesson.