In this video, we will try to understand how we can build SSIS cubes, when we have complex relationships between the fact and the dimension table. Till now, you know we have created cubes you know where we had one fact table connected with a dimension table. So, we did not go into complex relationships like many to many or something right. But now, let us see that you know if we have a complex database design where we have many to many relationship or where we have a reference relation relationship between the fact and the dimension at that time, what kind of functionalities are provided by SSIS to create the cube. So, let us start with situation number one. Now, here is a simple scenario we have, what we have done is we had a country table if you remember so, to that country table I have added one more column here called as the area.
So what I've done is you know, I have just divided the countries into various areas. So, we have two areas at this moment. So, you can see that I have created a new table called as the dim area table. And in this dim area table, I have two entries here one for Asia and the second for non issue and these IDs are linked to the country table over here. So you can see it over here I am saying okay, India is Asia, Nepal is Asia and us is non issue. Not now, I would like to stress a very important point over here.
There is no relationship between the area dimension and the fact table. So in other words, the fact table what we have, so let me just go to my Object Explorer here. So this fact customer table has no relationship With the area dimension so you can see over here I have the fact table, but this fact table has no connection with the area dimension. Now, let us go and import all these database designs what we have amended over here in our project so let us go back to a project and let us go and add those tables over here. So, I'm going to go and add some right click and we add the area dimension over here, right. And so you can see that I am on the SSIS DSP file that is our the if you see the first thing when we when we add a new table or when we do some database changes, right?
We need to first go and update the view. So I'm on the DSP on the data source view here. So you can see now I have this country table here and the ad A table here, but you can see that the relationships are not coming out right that is because this country dimension is a query. So, let us go and edit the query and let us select the area ID as well. And we say okay, right. And also let us go and import the relationship between the country and the area dimension.
So, I'm going to go and refresh Oh, it is not coming in. So, let me do it manually problem right. So, there it is. So, you can see now, we have imported the relationships into our cube okay. So, you can see that we have the country dimension, the country dimension is linked to the area dimension by using this area ID and the country dimension is linked to the fact table. So now we have a situation here you know, where the country dimension is connected to the fact customer table, but this area, dimension not connected to this fact table.
And now let's say I want to go and derive a report, you know, where I would like to know as per area, how much sales has happened. So how do you go and create a cube you know where, you know, you can drag and drop the area and get the seals as for it. So, the first thing what we have to do is, we have to go and first create a dimension for the area. So that is very straightforward. So let us go now to our solution explorer, and the first thing is, let us go and add a dimension. So I'm going to go and add a dimension for the area TV.
So let us do a next. And yes, I want to use an existing table to create this dimension. And the table is dim area and do a next. I also want the area name to be included next, and finish. And this dimension we have to also include in our cube so you can see that dimension area has been created. So, there it is right.
But you know this dimension also has to be included in our cube. So, let us go to our cube and let us right click on this dimension area and let us go and include this area dimension into the cube. So, till here everything is straightforward okay. But now the complication starts now, the main complication is that, how do we connect the area dimension table with the fact customer table because if you see at this moment of design we have the fact customer table the site customer table is connected with the country dimension and the country dimension is connected with the area dimension. So, the area dimension has no connection with the fact table directly. So, there is an intermediate table that is a country table and why are that the area dimension is connected.
So, how do we go and define such kind of relationship In the SSE sq. So, if you remember, you know, I have said in the previous part of the video, when you want to go and define relationship between the fact table and the dimension table, we need to use this dimension usage tab. So, you can see that I've clicked on the dimension usage tab and in this dimension usage tab, you need to go and define that how your dimension table is connected with the fact table. Till now, if you see let us say I go to this country ID the one the relationship what we have used in now is the regular relationship a regular relationship is nothing but you know, where the dimension table is connected to the fact table with a one to many relationship. In other words, one record of dimension will have a lot of records in the fact table right.
So, this regular relationship is nothing but you know, it defines a one to many relationship between the dimension and the fact. So, one record in the dimension has lot of records in the Now in this case we need to use the regular will not work right. So, we need to use the reference relationship a reference relationship as you can see that when I click on the reference relationship it is saying that Okay, so that is so you can see that this is a fact here you can see the symbol the icon, it says that okay, this is the fact and the fact has no connection with the dimension table, but he is connected via an intermediate table, right? So you can see that he has shown your intermediate dimension combo box. So we'll go here we'll say that yes, the dim area is connected to the country table, intermediate Li and the connection is using the area ID right and i will say okay, and I will save this.
So, now that we have almost done with all the things let us go and process the cube. So let us go and process and deploy the cube and let us see the output in the browser. So in the browser now, we should Be able to see the sales as per the areas. So, there the processing is happening and the process succeeded Good. Now let us go back to our browser and remember that always to reconnect you know, whenever you go and process the cube ensure that you go and you reconnect the cube you know in order to get the updated changes on the browser. So now there the browser has opened.
So now if I go and say customer amount, and if I go and put the area name, you can now see that I'm able to see the total as per the area. So even though the dimension area has no connection with the fact table, but because you know, it has an intermediate connection via the country table, you know, I'm able to get the bifurcation as per the area right. So this is one type of relationship you know that one dimension table is connected with the fact why are interested table. So, this kind of relationship is termed as the reference relationship. Now, one more option, you know, which is worth discussing is this option over here called as materialize the materialized option helps us to decide that do we need performance during the processing time or do we need performance during the query time if you see the various phases of the cube, if you analyze you know how a cube is built, it goes through a two step process.
First, the cube is processed in other words, all the table data is taken and all the calculations are stored in the analysis services database and in the second phase that is applied like the Excel or SSRS they will come and pull data from the cube. So, we have basically a build phase you can think about and then we have Have a basically a query face. If you select this materialized option, in muddlers option, you know all this processing of referencing the table intermediate table fact table all happens during the build phase itself. But if you uncheck this materialized option, then you know all these things don't happen during the build phase, but they happen during the query phase. So, if you check this option, what will happen is you know, it will take lot of time to process the cube, okay, but your query will be faster.
And if you uncheck this option, what will happen, your processing of the cube will happen fast, but the query will take time. So, this option has to be selected in a very carefully depending on situation what your project demands. So, again, I'm iterating the Martinez option helps us to decide do we need performance during the processing time or the build time of the cube or do we need performance you know, while you are querying the cube. Now, let us take the other scenario Many too many. For example, if you remember, we had this product table here. Now, let us say that every product is manufactured by multiple vendors.
So for example, if you say sport shoes, then for the sport shoes, the material is provided by one by one vendor. And you know, the, the shoe lace is made by other vendor, the shoe soul is made by other vendor, right? So, basically, we have multiple vendors for one product, and one product, you know, can be made by multiple vendors. And we would like to then create a report you know, where you'd like to say, Okay, if this is the sale, then from that sale, you know, this vendor gets this much money, this vendor gets this much money, right. So in other words, if we have a many to many relationship, how do we go and create our cube or how do we go and create the report from a cube. So what I'm going to do is first let us go and create one more dimension table here called as a vendor table.
So, quickly let me go and create a table here called as a vendor table this vendor table will have a Vendor ID and and a vendor name right. So Vendor ID and vendor name and I will say this is them vendor okay. And let us go and make a couple of entries into this vendor table here. So, let me go here and make a couple of entries into the vendor table for vendor four, so you can see I may have made entries of four vendors into the vendor table. So now this table, the product table, and the vendor table have a many to many relationship right. So in order to be Create a many to many relationship between two tables we have to create one more intermediate table in between called as the dim vendor product right.
So, let me go here again and let me create a table here called as the dim vendor product right and in this so this table will have entries from both vendor as well as the product okay. So this will have the Vendor ID and this will have the product ID or ID and it will also have one more entry or saying that okay for this vendor for this product, what is he manufacturing right so I'll say description. So basically if you say shoe, then for a shoe he's manufacturing the soul is manufacturing the lace and the other person is providing the material right. So Also, that vendor has how much amount of profit percentage from that whole sale? So over here I will say profit percentage. Okay, so if a vendor is that who's providing a shoelace or a shoe, then probably he has a 10% profit.
If he's providing the material then he has a 90% profit, right? So again, one more entry here, profit percentage is a set primary key. So this table is them vendor product. Right. And now let us go and fill this table as well. Right?
So I'm gonna go and edit the dim vendor product ID so for example, the first product here is a sport shoes, right? So and we have vendor one vendor to vendor three metaphor like this. So basically what I can do here now is I can say Okay, the first vendor Vendor ID one produces shoes that is product ID one and let us say he provides the material material for shoes, and the percentage is 80. Then the second vendor, again, he also manufactures that product, and he provides a list of the shoes so he gets 10% of it good when the three manufacturers product and he makes the soul of the shoe, so 10% again, so in this way, every product let's say is manufactured by multiple vendors. And when a sale happens, you know, you would like to distribute the profit accordingly to all the vendors.
So let me go and make entry for all the products here as well. So you can see that you know, for the rest of the products also have filled the data. So if I now go and make an inner join, just to cross check if the data filling is proper. And remember that this data filling will actually take place from SSIS. In other words, actually there will be an ETL logic and it will extract those CSV files and fill this data. But at this moment, you know, I'm not making it complicated, I'm filling it manually, but remember that at the end of the day this data comes from OLTP or probably this data comes from a CSV file right.
So over here I will say okay, this is a vendor and this is a product okay. And so, let us see if the data is seen properly. So, there it is. So, basically, vendor 123 the manufacturers for shoes and vendor one manufacturer, vendor one provides the material vendor to provide the lace and vendor three provides the soul. Again, short is manufactured by Two vendors one is vendor one and vendor four, vendor one provides material and vendor four provides a short buttons and so on right. So good.
So, now we have the product table, we have the vendor table and there is an intermediate table called as the dim vendor product, which does the many to many relationship right and also let us go to our database diagram and let us ensure that we also create the relationships right. So, let us go here and over here let us import the vendor as well as the vendor product table and let us create the appropriate relationships right. So that we have the product table but we also need the vendor and product right I'm over here. This Vendor ID is the primary key and this is the foreign key, right? And this product ID is over here. Great.
So you can see now the dim product and dim vendor is having an intermediate table. Right and so let us move other tables out we'll see them later on, but at this moment, let us concentrate on these three tables here. So, we can see there is a dim product, there is a dim vendor and there is a dim vendor product which does the many to many relationship right and Let us save this diagram over here. Now let us go to our SSIS package and let us import those tables right so Those newly created tables we need to import here. So we need to import the vendor and we need to import the vendor product right? So I'm going to go and import both of those tables into my database and there it is.
And where is the game collapse Ah right and over here this product ID is having a relationship with the productive right. So now at this moment just concentrate on these tables, the other tables leave it right so let us look at these four tables. So if you see at this moment, the connection between the fact table is why or the product table and the product table has a many to many relationship with the vendor table. Right. And the vendor table has no relationship with the fact table. So now what we want is we want to go and derive report where we would like to say that okay, for this vendor, you know, this most sales have happened, and this is their profit distribution.
Okay, so how do we go about doing that? So I'm going to go and save this first. So the first thing, so let us go to our solution explorer, view Solution Explorer, and let us go step by step. Right? The first thing is, we need to go and create three dimensions. Okay, one dimension, two dimensions, I'm sorry.
Okay, so one dimension, which is the dim vendor, and the other dimension, which is the dim vendor product. Okay, so let's go and create both of these dimensions one by one. So I'm going to go and add a new dimension. Next Just a quick note here This many to many relationship is very easy actually. But you know the steps are bit complicated. So, basically if you just also the number of steps are you know quite in large numbers right.
So, if you just remember those steps in a logical manner It is very easy but if you try to by hearted you know it is stuff so, the many to many relationship you know what happens is my I myself you know after so many years of experience, sometimes even I forget you know the logical steps So, whatever I'm doing at this moment, just follow in the same way so that you get the output properly. Okay, so I'm going to go and say next so let us first create the dim product, dim vendor I'm sorry. So let us import dim vendor Vendor ID. We will look about the we will see the vendor name later on okay and finish and also let us add a new dimension and let us also import dim vendor product right and here we also need the Vendor ID and the product ID right.
So, next we will we will see that you know later on to the pool attributes like percentage and vendor name etc right. So, we have created two dimensions one is a vendor and the other is a vendor product okay. And now, in the queue what we need to do is we need to go and add both of these dimensions. So let us go and add the dim vendor and implement vendor product. Right. And in the fact table now, okay, coming back here.
If you see in our dim vendor productive we have a profit percentage. Now this profit percentage, you know when to you use it for a calculation Right. So one more entry we need to make now in the fact, now this is a step, which is bit weird. And a lot of developers get confused over here, when it comes to many to many relationship. So basically, we have added both the dimension table in the dimension section that is easy to understand. But because you know there is a many to many relationship, we need to make entry of that dim vendor product into the factory.
So right, so in the fact, I'm going to go and make a new measure entry. Now remember that this measure is not going to be used outside this measure is for our internal purpose to establish that many to many relationship, right? So this is a weird step, but we need to perform that. So I'm going to say here, I want to go and use this dim vendor product. And we will use the count of the rows here we are not interested in doing summation, right? So I'm gonna say okay, I want to use a count of the rows.
And I will say, okay, and this dim vendor, product count, you know, whatever. I have here Right. I don't want people to see it, this is for my internal purpose, right? So I'm going to go and make the visible as false Great. So, let me repeat the steps here. So, the first thing is I imported two dimension tables, one is the vendor Master, the other one is the vendor product master the many to many relationship.
The second is I went ahead and added a measure in over here which is an invisible measure called as a dim vendor product count. Remember that this is the name is not dim vendor This is actually that we do like this let me say this is this name is very weird here. So let me say this is the many to many connector action. Okay. So the main intention here is that this will help us to connect or help us to do the many to many, connector right. So that means More sense it is not a fact actually it is just for our internal purpose to do the connection right now let's go to a dimension usage okay and here is where all the fun is okay.
So we can see that okay we have two facts now one is our fact customers right and the other one is this many to many connector just for establishing that many to many relationship inside the cube okay you can see that there is a dim when dim vendor product and dim product no etc. So you can see now there are two entries which are added here one is a dim vendor, which is there definitely and one of the dim vendor product. Now again this dim vendor product dimension we have added for our internal purpose. So what I'm going to do is I don't need that so I'm going to go and remove this from you. Okay. Right.
So, now the dim vendor, by the dimension vendor is connected with the connector by using the Vendor ID but with the fact you know the vendor is not connected in other This dim vendor is not connected to the fact customer right it is connected via that dim vendor product dimension. So, what I will do here is now this is a block which is need to be filled the other dimensions you don't have to fill right you know because there is no connection of these dimensions with the many to many connector one only the product and the vendor have the connection. So, we can see for the product already the things are established and that is good. And remember that this dim product has a direct connection with the fact sales table. So, we don't need to worry right the only big problem here is that how does this vendor connect with the fact customer right.
So over here I will go and say take this and I will say that this is a many to many relationship and the intermediate table or the intermediate connector or the intermediate measure is the many to many connector which we had created. Remember this many to many connector and I will say okay, and that's it. So, again, I repeat the steps first You need to add two dimensions you need to add one fact which is this many to many connector and then you need to go and connect that many to many relationship table by using this relationship type that is many to many right. So, there it is we have done with almost all the things now let us go and process the cube So, I will say yes and you can see that there the processing is successful, very nice. So, now let us go to our browser right and let us see that if you can see the vendor wise report right.
So, I will go here first thing So, let me go and drop okay now very interesting okay. You can see that there are a lot of duplicate dimensions which have been introduced like the product ID and vendors, okay, and the dim vendor product, okay, and then again, there is dim vendor and dim product So, these product ID and vendor and dim vendor is for internal purpose okay. So, basically you can go and hide these dimensions if you wish, the main thing is this vendor he is here and product is here. So, the other tables what you see down below these three tables are for internal purpose to establish that many to many relationship right. So now I go here and let's say I say okay I want to go and this is a customer amount. So if I take product, I can see that right?
If I take vendors now that is very interesting. So let me go and take the vendors. You can see for the vendor, we have not taken the vendor name right. I just have the Vendor ID. So I need to ensure that I also take the vendor name. Okay very quickly.
And let us go and just process the coupons once again, so that we can see with the vendor names And Ron good. So we are done with it and remember that always we need to reconnect reconnect, yes. After some time, you know, you start getting habituated with the reconnection refresh. Okay, so we're here now I can go and now drag the rest of me, right? So you can see that this vendor one from the total sales, you know, he has to pay 2012 the vendor two is making this much sales, the vendor three is making this much sales, right. But what we really need is the percentage, right because this amount here is the total sum of the amount.
Every vendor is having a certain percentage. So basically, if you remember each one of these vendors right So this render product had a profit percentage, right and a description. So again, these two fields I had not taken I'm so sorry for this, I just need to rebuild the cube again. So that I have rebuild the cube to get the percentage as well as the description right. So, reconnect. So, now, we have the customer amount we have the vendor and also from this vendor product, we can now get the percentage.
Now you can see that things are a bit weird here. We are because and I don't have to say that why it is you know, so, you can see when one has 10% profit from this customer amount, right. vendor one again has 100% profit from this customer amount right. So you can see that you know, because of that mentor mentee relationship, this thing is coming here now What I want is I want to actually go and multiply this percentage with a custom customer amount and get the real profit out for that particular vendor right. So over here actually what we need to also do is we need to also go and drag and drop the product okay. So, we need to also go and drag and drop the product name like that it is okay.
So the vendor product as well as the description okay. So you can see now the vendor one manufacturers right, let's see. So, the vendor one for a particular sale is manufacturing the lace sometimes is manufacturing the buttons right? And accordingly he has a profit, right. So now what we need to do is we need to take that particular sale and multiply with that percentage to get that vendors profit. Right.
So how do we do that? So for that, you know, if you remember in one of the previous videos, so can shad explained calculations? Okay, so this example this thing of multiplying the profit percentage with the customer amount I'm not going to do over here, I will leave it as a homework for you guys. So, but I will give you an idea of how to do it. So, what you have to do is you have to go and create a calculated member and in the calculated member, you have to take that percentage and multiply it by the customer sales amount. So, basically just quickly so, over here I need to go and create a new calculated member right and this I will say actual profit for a particular vendor, right, actual profit.
And in this what I will do is I will say that take that particular customer amount and multiply it with the percentage Provided by that vendor. So multiply this with a percentage provided by the vendor, right? So in other words, we need to go and create a calculated member. Right. So this thing I believe it has a homework for you guys right. So I hope that you enjoyed this video.
So in this video we did two kinds of relationship in SSIS. One is the reference relationship you know, where basically there is a dimension table and the dimension table is not connected with the fact table it goes via intermediate table. And the other kind of relationship which we saw in this is a many to many relationship where we have two dimension tables and they are having a many to many relationship. And one dimension table has no connection with the fact table and we still want to go and create a report out of it. So again, this is a repeated request, which keeps coming after every third video or fourth video. Basically we want to know that which lab you are viewing because I am Acacia putting a lot of hard work to create these videos.
But we also we are trainers we are humans. And at some moment of time we also need encouragement. every two hours is a big series right by itself to not only see the video but also to record the video as well. So please go ahead to facebook.com slash correspond and let us know that on which video you are on because by that we come to know that how many people are watching our series what they are thinking about our series and any kind of inputs please go ahead and write in the comments saying that this is the input and we would like to incorporate that input into our video series. So thank you very much and keep learning msbi We are going in the right direction. We are going more in depth so if you see what you were on the first video and what you see now, you know there is a huge journey which we have traveled, but we still have a huge journey to travel and it is my hundred percent guarantee.
That will make you a true msbi professional. Every checkbox every option what you see in msbi will be covered in this video series. So that is our attempt. That is our goal, where we want to take the series at the end of the day. Thank you very much and keep watching the series. Thank you again