Okay, so this is the Weka Explorer. In the Weka Explorer we can open an ARFF file or a CSV file. Weka actually comes with some sample data in its installation directory. So I can go into the Weka installation directory and into the data folder, and in data I can find the Iris data set.
So I click on iris.arff there, and I click Open. In Weka, in this Preprocess tab, we can do some data exploration, or data understanding. I can click on each of these variables and see the descriptive statistics here, and I can see the histogram here. So for the numeric variables this is the descriptive statistics, and for this one it is a frequency table, because class is a categorical variable.
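To make the difference between the two summaries concrete, here is a minimal sketch in plain Python of what a descriptive-statistics panel and a frequency table compute. The six sample rows and the function names are assumptions made up for illustration; this is not Weka's own code and not the full Iris file.

```python
import statistics
from collections import Counter

# Hypothetical sample rows (sepal length in cm, class label),
# used only for illustration; this is not the full Iris data set.
rows = [
    (5.1, "Iris-setosa"),
    (4.9, "Iris-setosa"),
    (7.0, "Iris-versicolor"),
    (6.4, "Iris-versicolor"),
    (6.3, "Iris-virginica"),
    (5.8, "Iris-virginica"),
]


def describe_numeric(values):
    """Summary statistics for a numeric variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def frequency_table(labels):
    """Count of instances per category for a categorical variable."""
    return dict(Counter(labels))


numeric_summary = describe_numeric([r[0] for r in rows])
class_counts = frequency_table([r[1] for r in rows])

print(numeric_summary)
print(class_counts)
```

A numeric variable gets a minimum, maximum, mean and standard deviation, while a categorical variable such as class only gets counts per label.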
So for the data mining, I can do the preprocessing. Let's say I have clicked on all these variables and I understand the data. Then I can do the data preparation. So I can click on Choose.
And I can select, let's say, the AttributeSelection filter under the supervised attribute filters, then I can click on Apply. So these are the variables that Weka selected as more useful or important for predicting the class variable. So this is Preprocess. Then I can go into Classify. In Classify I can select, let's say, the algorithms.
So I can select, say, linear regression or logistic regression, and there are the lazy algorithms, meta, miscellaneous, then rules, which has the decision table, then the decision trees. There is logistic regression and there is naive Bayes. For this data set I will use logistic regression. I cannot use linear regression on this data set, because the class variable is categorical. If you want to use linear regression, you can just select linear regression here.
So I choose logistic regression. After I choose logistic regression, I can choose cross-validation. Cross-validation means, let's say for 10-fold cross-validation, that I select part of the data set as training data and the rest as testing data. I use the training data to train the model and the testing data to test the model, and then I repeat this selection 10 times. So I train and test the model 10 times, I get the accuracy of each evaluation, and then I get an average accuracy.
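The procedure just described can be sketched in a few lines of plain Python. This is only an illustration of k-fold cross-validation under stated assumptions: the function names, the toy data and the majority-class placeholder learner are all hypothetical, it does not use logistic regression, and it does not reproduce Weka's implementation (a real tool may, for example, also stratify the folds by class, which this sketch does not do).

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(data, labels, k, train_fn, predict_fn, seed=0):
    """Train and test k times, each fold serving once as the test set,
    and return the average accuracy over the k evaluations."""
    folds = k_fold_indices(len(data), k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(data)) if i not in held_out]
        model = train_fn([data[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, data[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Placeholder learner (an assumption for the sketch): always predict
# the most common class seen in the training data.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)


def predict_majority(model, x):
    return model


# Toy data: 30 instances, 20 of class "a" and 10 of class "b".
toy_data = list(range(30))
toy_labels = ["a"] * 20 + ["b"] * 10

folds = k_fold_indices(len(toy_data), 10)
mean_accuracy = cross_validate(toy_data, toy_labels, 10,
                               train_majority, predict_majority)
print(mean_accuracy)
```

Any real learner can be swapped in through `train_fn` and `predict_fn`; the fold logic stays the same.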
So this is cross-validation, or 10-fold cross-validation. In 10-fold cross-validation, when the training data and the testing data are selected, the same partition is never selected as testing data twice. So I have set up cross-validation, the class to predict is class, and then I can click Start here. The output has a confusion matrix, the F-measure, all these statistics, and the correctly classified instances, which is around 86% accuracy. So this is how I can do a logistic regression in Weka. If you want to try linear regression, you can select a data set that suits linear regression, click on linear regression, click Start, and you will have an evaluation and the accuracy here.
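As a rough illustration of how the numbers in that output relate to each other, here is a small plain-Python sketch that derives a confusion matrix, the accuracy (the share of correctly classified instances) and a per-class F-measure from actual and predicted labels. The six label pairs and the function names are made up for the example; they are not the results of the run above.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair of labels occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def f_measure(actual, predicted, label):
    """Harmonic mean of precision and recall for one class."""
    matrix = confusion_matrix(actual, predicted)
    tp = matrix[(label, label)]
    fp = sum(c for (a, p), c in matrix.items() if p == label and a != label)
    fn = sum(c for (a, p), c in matrix.items() if a == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for illustration only.
actual = ["setosa", "setosa", "versicolor",
          "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor",
             "virginica", "virginica", "virginica"]

matrix = confusion_matrix(actual, predicted)
print(matrix)
print(accuracy(actual, predicted))
print(f_measure(actual, predicted, "virginica"))
```

In this toy case one versicolor instance is predicted as virginica, so it shows up off the diagonal of the confusion matrix and lowers both the accuracy and the F-measure of the two classes involved.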
Okay, so let's see how to visualize the data at a more advanced level. Initially I used the normal visualizations here, and now I want to view the data at a more advanced level. So I can adjust some settings here, then click Update, and I can increase the size. So this is what we call a correlation matrix or a scatter plot matrix. So you can get all these advanced data understanding or advanced data visualization charts visualized here. The last type here