Now that we have the data, let's try to analyze it using different types of graphs. We will only show you how to generate the different types of graphs; we will not go into interpreting them. The graphs we will see are histograms, then line charts, which show trends, then box plots, which can show the central tendencies, and lastly scatter plots. We start with histograms. Histograms can be generated using the hist command in R, and it has got a lot of options.
We will go through the different options one by one. What we do first is take the data frame, pick one particular series, the Sensex data, convert it into a data matrix, and store it in a variable called x. Then we will generate bins and see how to generate the histogram using these different options. Now we have converted the Sensex data to a matrix. We can straightaway say hist(x), and there the histogram is generated. Now we can increase or decrease the number of bins.
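As a rough sketch of the step just described, the code below converts one column of a data frame to a matrix and draws the default histogram. The data frame name prices, the column name sensex, and the simulated values are assumptions for illustration; the real data set is not shown in this session.

```r
# Placeholder data: simulated values standing in for the real Sensex series.
set.seed(1)
prices <- data.frame(sensex = rnorm(250, mean = 35000, sd = 1500))

# Take the Sensex column and convert it to a numeric matrix.
x <- data.matrix(prices["sensex"])

# Draw the histogram with the default bins; hist() also returns the bin counts.
h <- hist(x)
```

Keeping the return value in h is optional; it is handy if you want to inspect the breaks and counts that R chose.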
The number of bins is how many intervals we want to have. The bins are nothing but a sequence of numbers between the minimum and the maximum Sensex value, and we can say we want 30 different bins. So we have created the bins. Now, let's create the histogram with hist(x, breaks = bins).
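A minimal sketch of the bins step, again on simulated placeholder data rather than the real Sensex values. One detail worth noting: seq() counts break points, not intervals, so 31 break points are needed to get 30 bins. Whether the on-screen code used 30 or 31 here is not visible in this session, so treat the length.out value as an assumption.

```r
# Placeholder data standing in for the Sensex matrix x.
set.seed(1)
x <- data.matrix(data.frame(sensex = rnorm(250, mean = 35000, sd = 1500)))

# A sequence of break points from the minimum to the maximum value.
# 31 break points give 30 intervals (bins).
bins <- seq(min(x), max(x), length.out = 31)

# Pass the break points to hist() through the breaks argument.
h <- hist(x, breaks = bins)
```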
So, now let's generate the histogram. There you see the number of bins has increased. We can change the number of bins: we can say 20, regenerate the bins, and run the histogram command once again. So there you see the number of bins has changed. Next, we can color the bars in the color of our choice. We go back to the histogram command and say col = "darkgray". Now we run this and we see that the bars have been colored in dark gray. We can change the color to any color that we want; we can say blue, for example. And there you see you have blue bars.
We can also paint the borders: we say border = "white". So now you see the borders have been changed to white. Next, we can give labels to the x axis and the y axis. We use xlab for the x-axis label, so xlab = "Sensex index", and ylab for the y-axis label, so ylab = "Frequency". You'll see that the labels have changed according to what we have given here.
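Putting the options so far together, a call along these lines colors the bars, paints the borders, and labels both axes. The data is simulated placeholder data, and the choice of 20 bins simply follows the earlier step.

```r
# Placeholder data standing in for the Sensex matrix x.
set.seed(1)
x <- data.matrix(data.frame(sensex = rnorm(250, mean = 35000, sd = 1500)))

# 21 break points give 20 bins.
bins <- seq(min(x), max(x), length.out = 21)

h <- hist(x,
          breaks = bins,
          col    = "blue",          # fill color of the bars
          border = "white",         # color of the bar borders
          xlab   = "Sensex index",  # x-axis label
          ylab   = "Frequency")     # y-axis label
```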
We can also give the histogram a title of our choice. We say main = and give the title, "Histogram of Sensex". The title is also there now. Now let's change the data: we can try the gold data and run this entire code once again. So there you see the histogram for gold prices. We can change the number of bins and check whether there is a difference, change the label to price in rupees, and run this once again.
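Switching to the gold data only means changing the column and the text options. The sketch below assumes a column called gold filled with simulated prices, and the title text is my own choice for illustration.

```r
# Placeholder data: simulated gold prices, not the real series.
set.seed(2)
prices <- data.frame(gold = rnorm(250, mean = 30000, sd = 800))
x <- data.matrix(prices["gold"])

# 21 break points give 20 bins; change length.out to compare bin counts.
bins <- seq(min(x), max(x), length.out = 21)

h <- hist(x,
          breaks = bins,
          col    = "darkgray",
          border = "white",
          xlab   = "Price in rupees",
          ylab   = "Frequency",
          main   = "Histogram of gold price")  # title of the plot
```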
So, you see the changes have been made. We can also create bar plots. Bar plots are different from histograms in the sense that a bar plot shows the frequency of every individual value. qplot is a very powerful function in R; using qplot you can create various kinds of graphs. Here I've shown you how to use qplot for generating a bar graph. You need to give the data, that is x, to qplot, and then you need to give the geometry. The geometry we want is a bar graph, and then of course the colors: what is the color to fill and what is the color of the borders. Next up, we see how to draw line graphs.
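Before moving on, here is roughly what the qplot bar graph described above could look like. The vector x holds simulated values rounded to the nearest thousand so that individual values repeat; that rounding, like the colors, is an assumption for illustration. Note also that, as far as I know, recent ggplot2 releases mark qplot() as deprecated, so you may see a warning.

```r
library(ggplot2)

# Placeholder data: simulated values rounded so that values repeat.
set.seed(1)
x <- round(rnorm(200, mean = 35000, sd = 1500), -3)

# geom = "bar" counts how often each individual value occurs.
# I() marks the colors as fixed values rather than data mappings.
p <- qplot(x, geom = "bar", fill = I("darkgray"), colour = I("white"))
print(p)
```

If the deprecation warning bothers you, ggplot(data.frame(x), aes(x)) + geom_bar(fill = "darkgray", colour = "white") should give the same picture.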
We will use the package ggplot2 for this. ggplot2 is a very powerful package and it is a chapter by itself; it implements the grammar of graphics. It works in layers: you can see in the code sample I have given that there is a plus between all the different code segments, so you can add a layer with this plus at each stage. In ggplot we first have to give the data which will be used for plotting the graph. Then we have to mention the aesthetics; aes stands for aesthetics. In the aesthetics I have said: on the x axis plot the observation date and on the y axis plot the Sensex data. Then I have given geom_line, which basically says please draw a line graph, and geom_smooth, which draws the regression line across the data.
It uses a method called gam, and I also give the formula to be used with it. Then there is the regular stuff where I give the labels for the x axis and the y axis and a title. In ggplot we can also give a theme; there are different themes, and you can find more in another package called ggthemes. With all of these put together we can draw a simple line graph. Now, I cannot demonstrate it in RStudio, so I will use a software which I have developed to demonstrate the output of this particular graph. So here you see the output of the command we just formulated. Using ggplot, we can also plot multiple lines on a single graph.
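For reference, the single-line command walked through above might look like the sketch below. The data frame prices, its columns obs_date and sensex, the simulated values, the label text, and the built-in theme_minimal() are all assumptions; a theme from ggthemes could be swapped in if that package is installed. The gam method relies on the mgcv package, which normally ships with R.

```r
library(ggplot2)

# Placeholder data: a simulated daily series, not the real Sensex values.
set.seed(1)
prices <- data.frame(
  obs_date = seq(as.Date("2018-01-01"), by = "day", length.out = 200),
  sensex   = 35000 + cumsum(rnorm(200, mean = 0, sd = 150))
)

p <- ggplot(prices, aes(x = obs_date, y = sensex)) +               # data and aesthetics
  geom_line() +                                                     # the line graph
  geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs")) +      # smoothed trend line
  labs(x = "Observation date", y = "Sensex", title = "Sensex trend") +
  theme_minimal()                                                   # the theme layer
print(p)
```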
For example, here is the code for plotting the prices of potato, tomato and onion, all in a single graph. Here you see I have given a subset of the data, because I have this data only from the first of January 2018. So for the data I have given a subset of the total data that I fetched from my database. Then we have said: draw the line graph for potato, tomato and onion separately, and also draw the regression line for each of them. And I have given each of the lines a different color.
You can see the command which I have used for color. Lastly, I have given the labels and then the theme. So here you see the output of the command we just discussed. The output is from an application developed in Shiny; we will discuss Shiny apps in a different session. Here you can see that we have the three different lines, one for potato, one for tomato, one for onion, each in a different color, and each of them showing a different trend line.
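A sketch of how the three-line graph could be put together is given below. The data frame veg, its column names, the simulated prices, and the particular colors are assumptions; only the overall shape (subset from 1 January 2018, one line and one trend line per commodity, a color per line, then labels and theme) follows what was described.

```r
library(ggplot2)

# Placeholder data: simulated daily prices, not the real market data.
set.seed(1)
n <- 300
veg <- data.frame(
  obs_date = seq(as.Date("2017-09-01"), by = "day", length.out = n),
  potato   = 15 + cumsum(rnorm(n, 0, 0.3)),
  tomato   = 25 + cumsum(rnorm(n, 0, 0.5)),
  onion    = 20 + cumsum(rnorm(n, 0, 0.4))
)

# Keep only the observations from 1 January 2018 onwards.
recent <- subset(veg, obs_date >= as.Date("2018-01-01"))

gam_formula <- y ~ s(x, bs = "cs")

p <- ggplot(recent, aes(x = obs_date)) +
  geom_line(aes(y = potato, colour = "Potato")) +
  geom_smooth(aes(y = potato, colour = "Potato"),
              method = "gam", formula = gam_formula, se = FALSE) +
  geom_line(aes(y = tomato, colour = "Tomato")) +
  geom_smooth(aes(y = tomato, colour = "Tomato"),
              method = "gam", formula = gam_formula, se = FALSE) +
  geom_line(aes(y = onion, colour = "Onion")) +
  geom_smooth(aes(y = onion, colour = "Onion"),
              method = "gam", formula = gam_formula, se = FALSE) +
  scale_colour_manual(name = "Commodity",
                      values = c(Potato = "brown", Tomato = "red", Onion = "purple")) +
  labs(x = "Observation date", y = "Price", title = "Vegetable prices") +
  theme_minimal()
print(p)
```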
Next, we try to create a box plot. For a box plot we first need to arrange the data appropriately. What we have is the prices of tomato, potato and onion in separate columns; what we require is two columns, one indicating the commodity and one indicating its price. So we create a data frame containing two columns, one called commodity and one called value. In commodity we replicate the labels tomato, onion and potato for the number of rows that we have in the data. We take the data from the first of January 2018.
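The reshaping step can be sketched as follows. The wide data frame recent and its simulated prices are placeholders; the two output columns, commodity and value, are the ones named above.

```r
# Placeholder data: one price column per commodity (simulated values).
set.seed(1)
n <- 100
recent <- data.frame(
  tomato = runif(n, 10, 40),
  onion  = runif(n, 15, 50),
  potato = runif(n, 8, 25)
)

# Two-column layout: a label for the commodity and its price.
# rep(..., each = ) repeats every label once per row of the original data.
box_data <- data.frame(
  commodity = rep(c("Tomato", "Onion", "Potato"), each = nrow(recent)),
  value     = c(recent$tomato, recent$onion, recent$potato)
)
```

The order of the labels in rep() has to match the order in which the price columns are stacked in c(), otherwise prices end up against the wrong commodity.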
And against each of these commodities, we capture the prices of tomato, onion and potato as they are present in our data. Then we can use the ggplot command as shown here and create the box plot. You can see there are various options for the box plot: you can give colors, you can give shapes, you can scale the colors with scale_color_brewer, etc. Now, we see the output of the box plot. We take the code and we run it, and there you have your box plot; you can see the whiskers and you can see the outliers. Next up, we see how to generate scatter plots.
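A box plot command along the lines described above might look like this. The data is simulated, and the palette name, outlier shape, and label text are assumptions chosen for illustration.

```r
library(ggplot2)

# Placeholder data in the two-column layout: commodity and value.
set.seed(1)
n <- 100
box_data <- data.frame(
  commodity = rep(c("Tomato", "Onion", "Potato"), each = n),
  value     = c(runif(n, 10, 40), runif(n, 15, 50), runif(n, 8, 25))
)

p <- ggplot(box_data, aes(x = commodity, y = value, colour = commodity)) +
  geom_boxplot(outlier.shape = 16) +        # shape used to draw the outliers
  scale_colour_brewer(palette = "Dark2") +  # colors from a ColorBrewer palette
  labs(x = "Commodity", y = "Price", title = "Vegetable prices") +
  theme_minimal()
print(p)
```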
Scatter plots can also be generated by the ggplot command itself. Here we are drawing the scatter plot between the gold price and the Sensex, and we will color the different observations based on the month of the observation. Here is the complete command; we will see how to run it and we will see the output. So, now I have placed the command in RStudio and we run it, and there you have your scatter plot. We can also save this plot, or any of the plots, by saying Export and then Save; you can save as an image or as a PDF.
Now I select a directory, and then we specify a file name. I say scatterplot, and then we say Save. It says the file already exists, so we will change the name to scatterplot1 and save it.
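The scatter plot and the save step can also be written as code. The sketch below uses simulated gold and Sensex values with assumed column names, derives the month from the observation date for the coloring, and then saves the plot with ggsave() from ggplot2 as a scripted alternative to the Export menu. The file is written to a temporary folder here; the file name follows the one used in the demonstration.

```r
library(ggplot2)

# Placeholder data: simulated daily gold and Sensex values.
set.seed(1)
n <- 200
prices <- data.frame(
  obs_date = seq(as.Date("2018-01-01"), by = "day", length.out = n),
  gold     = 30000 + cumsum(rnorm(n, 0, 100)),
  sensex   = 35000 + cumsum(rnorm(n, 0, 150))
)

# Month of each observation, used to color the points.
month_number <- as.integer(format(prices$obs_date, "%m"))
prices$month <- factor(month.abb[month_number], levels = month.abb)

p <- ggplot(prices, aes(x = gold, y = sensex, colour = month)) +
  geom_point() +
  labs(x = "Gold price", y = "Sensex", colour = "Month",
       title = "Sensex against gold price")
print(p)

# Save as a PDF; a .png extension would save an image instead.
out_file <- file.path(tempdir(), "scatterplot1.pdf")
ggsave(filename = out_file, plot = p, width = 7, height = 5)
```

Unlike the dialog, ggsave() overwrites an existing file without asking, so pick the file name with care.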
So we have seen how to create five different types of graphs. Next, we will use the code snippets that we have generated so far to create a presentation in R; we will see how to do that. The presentations we create in R will automatically get updated whenever the data underlying them changes. This is unlike the normal presentations that you make using Microsoft PowerPoint, etc.