Okay, so now we will do the data preparation stage using the D SDK script writer. Previously we did data understanding, so now we do data preparation. In data preparation, I'm going to remove my missing values, okay, that is, remove all the rows that have missing values. Then normalize: normalize the data with the standard score for variable 2, normalize it with the standard score for variable 3, and then select the variables. Okay, so here we have the data and the variables.
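To make the first step concrete, here is a minimal sketch in plain Python of removing every row that has a missing value. It is not the script the tool generates. The column names and the sample rows are hypothetical, and treating `None` or an empty string as "missing" is an assumption.

```python
# Hypothetical sample rows; None marks a missing value.
rows = [
    {"variable2": 1.0, "variable3": 10.0},
    {"variable2": None, "variable3": 20.0},
    {"variable2": 3.0, "variable3": 30.0},
    {"variable2": 5.0, "variable3": None},
]


def remove_missing(rows):
    """Keep only the rows in which no value is missing (None or empty string)."""
    return [
        row for row in rows
        if all(value is not None and value != "" for value in row.values())
    ]


clean_rows = remove_missing(rows)
print(len(clean_rows), "rows kept out of", len(rows))
```

Note that the whole row is dropped even if only one of its values is missing, which is what "remove all the rows that have missing values" describes.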
So now I do the import. Okay, import. The variables are these, and this I can remove. Okay, so now I go to Prepare. So Prepare, Remove Missing Values, okay. Normalize with Standard Score. Okay. And one more Normalize with Standard Score. Okay. These two are for variable 2 and variable 3, okay?
So the variables are 2 and 3, and the new column names should be variable 2 normalized and variable 3 normalized. Okay, I show the data, okay, run the code. So you can see here: the first step in the data preparation is Remove Missing Values, and the next one is Prepare, Normalize variable 2 with the standard score, then normalize variable 3 with the standard score. Okay. So the new column names will be variable 2 normalized and variable 3 normalized. So we normalize this column, variable 2, and we normalize this other one, variable 3.
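As an illustration of what standard score normalization computes, here is a small sketch: each value has the column mean subtracted and is then divided by the column standard deviation, and the result goes into a new column. The column names are hypothetical, and using the population standard deviation (`pstdev`) is an assumption; the tool may use the sample standard deviation instead.

```python
from statistics import mean, pstdev


def add_standard_score(rows, column, new_column):
    """Add new_column holding (value - mean) / standard deviation of column."""
    values = [row[column] for row in rows]
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    for row in rows:
        # A constant column has sigma == 0; map it to 0.0 to avoid dividing by zero.
        row[new_column] = (row[column] - mu) / sigma if sigma else 0.0
    return rows


# Hypothetical data with no missing values left.
rows = [
    {"variable2": 1.0, "variable3": 10.0},
    {"variable2": 3.0, "variable3": 30.0},
    {"variable2": 5.0, "variable3": 50.0},
]

add_standard_score(rows, "variable2", "variable2_normalized")
add_standard_score(rows, "variable3", "variable3_normalized")

for row in rows:
    print(row)
```

The original columns are kept, so the table ends up with two additional columns, one per normalized variable.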
Okay, so now we have additional columns here: variable 2 normalized and variable 3 normalized. Okay, so you notice them, variable 2 normalized and variable 3 normalized. So now I want to select only the normalized values, the normalized variables. So I come here, okay, and view the variables. And I come to Prepare, Feature Selection, oh, no, Prepare, Select Variables. Okay, so the variables, ah, this one is a bit trickier. Show the data.
So I want only the normalized variables. The first one is variable 0, then variable 2, variable 3, variable 4. So, when you write your script, what I suggest is that you write a small portion, then run the script and see the result, and then continue on. So now that I have normalized the variables, I'd like to select these two variables, so the count should be 1, 2, 3, 4, 5. So now I select number five, okay, with no spacing in between, no spacing in the brackets. Okay. I want to show the data and the variables. Okay, run the script.
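Selecting variables by number can be sketched like this. The column layout, the names, and the zero-based numbering are all assumptions made for the example; check how the tool itself numbers the columns before relying on a particular index.

```python
# Hypothetical table: five original columns followed by the two normalized ones.
header = [
    "variable0", "variable1", "variable2", "variable3", "variable4",
    "variable2_normalized", "variable3_normalized",
]
table = [
    [7, 8, 1.0, 10.0, 0, -1.0, -1.0],
    [7, 9, 3.0, 30.0, 1, 1.0, 1.0],
]


def select_columns(header, table, indices):
    """Return a new header and table containing only the given column positions."""
    new_header = [header[i] for i in indices]
    new_table = [[row[i] for i in indices] for row in table]
    return new_header, new_table


# Assumption: positions are zero-based, so 5 and 6 are the normalized columns.
selected_header, selected_table = select_columns(header, table, [5, 6])
print(selected_header)
print(selected_table)
```

Running a small selection like this and looking at the result before going on is the same habit as writing a small portion of the script and running it.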
So in the data I have selected the two normalized variables. If you want, you can export this data as a CSV file, named maybe something like ACME, as a CSV with the separator a comma and one more option set to true. So we can export these selected normalized variables. So let's see, run. Okay. Okay, so it is exported. I open the modified ACME file, viewing it in Notepad++. Okay, so this is a plain CSV file. This CSV file you can use later on: you can take this data, do some modification, and then put it into the modeling stage or the evaluation stage, okay. You may need to do some modifications. There are two commas here, so you may want to replace the two commas with one comma, and Replace All. Okay, so you may want to do something like this: export to CSV and then modify it accordingly for your next stage. Okay, so we have completed the data preparation stage.
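The export and the comma clean-up can be sketched with Python's standard `csv` module. This is only an illustration: the column names and values are hypothetical, and the text is written to an in-memory buffer rather than to a real file.

```python
import csv
import io


def to_csv_text(header, table):
    """Write a header row and data rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(table)
    return buffer.getvalue()


def collapse_double_commas(text):
    """Replace every run of consecutive commas with a single comma."""
    while ",," in text:
        text = text.replace(",,", ",")
    return text


csv_text = to_csv_text(
    ["variable2_normalized", "variable3_normalized"],
    [[-1.0, -1.0], [1.0, 1.0]],
)
print(csv_text)

# Hypothetical exported text with doubled commas, as seen in the editor.
fixed_text = collapse_double_commas("a,,b\n1,,2\n")
print(fixed_text)
```

One caution: in a CSV file two commas in a row normally mean an empty field, so collapsing them is only safe when the doubled comma is known to be an export artifact and not a missing value.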
Previously we did the data understanding stage, and now we have done the data preparation stage. So I export the normalized data so that the normalized data can be used in the modeling stage. This is for the prediction models or the classifiers, okay, so the classifier can use the prepared data as training data and so on. Okay.