Alteryx Essentials Data Preparation

By: Edgecate

4 minutes

Transcript

The Data Cleansing tool replaces and removes inconsistent or improperly formatted data in your inputs. I know that sounds a bit abstract, so let's go through it with an example. As you can see in the illustration on the right, the first picture shows, in red boxes, all the anomalies we have in our data set. Employee ID has random white spaces between the numbers, First Name has punctuation issues, Age has a couple of random tabs, and Favorite Coffee has two null values. What we're going to do in this exercise is import our example HR sheet, drag in a Data Cleansing tool to clean it, and then run our workflow to view the results of our cleaned data set.

Let's start a new workflow by importing spreadsheet 2.1. We'll go to the Input Data tool and connect to our spreadsheet. In here we have our example HR file from chapter one, but this time it needs to be cleansed. To get a better view of our data, we'll add a Browse tool to our Input Data tool by using the keyboard shortcut Ctrl+Shift+B. We'll run our workflow with Ctrl+R, and we can see at the bottom in our preview pane that Employee ID has a couple of white spaces, First Name has punctuation issues, Age has a couple of random tabs, and under Favorite Coffee there are two null values.

A quick way to identify issues with our data set is to check the color of each column. If it's anything but green, there's potentially something wrong with it. Under Age, Alteryx tells us that 40% of our records aren't OK, and under Favorite Coffee there are 20% null values. Having these anomalies in your data might not seem like a big deal, but it can easily throw your data set off. From an ETL or data ingestion point of view, not cleaning the data before ingesting it can corrupt the data set.
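To make the column check concrete, here is a minimal plain-Python sketch of the same idea: count how many values in a column are null or carry stray whitespace, and report that as a percentage. It is only an analogy for the colored column bars, not how Alteryx computes them. The sample rows, field values, and the helper names `is_null`, `has_edge_whitespace`, and `percent_matching` are all assumptions made for illustration.

```python
# Hypothetical sample rows; not the actual HR sheet from the lesson.
ROWS = [
    {"Age": "34", "Favorite Coffee": "Latte"},
    {"Age": "\t29", "Favorite Coffee": None},
    {"Age": "41", "Favorite Coffee": "Espresso"},
    {"Age": "52\t", "Favorite Coffee": "Mocha"},
    {"Age": "38", "Favorite Coffee": None},
    {"Age": "\t45", "Favorite Coffee": "Flat White"},
    {"Age": "27", "Favorite Coffee": "Latte"},
    {"Age": "33\t", "Favorite Coffee": "Americano"},
    {"Age": "30", "Favorite Coffee": "Cappuccino"},
    {"Age": "48", "Favorite Coffee": "Espresso"},
]


def is_null(value):
    """True when nothing is stored in the field."""
    return value is None


def has_edge_whitespace(value):
    """True when a stored string starts or ends with whitespace (tabs included)."""
    return value is not None and value != value.strip()


def percent_matching(rows, field, predicate):
    """Percentage of rows whose value in `field` satisfies `predicate`."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if predicate(row[field]))
    return 100.0 * hits / len(rows)


if __name__ == "__main__":
    print("Age with stray whitespace (%):",
          percent_matching(ROWS, "Age", has_edge_whitespace))
    print("Favorite Coffee nulls (%):",
          percent_matching(ROWS, "Favorite Coffee", is_null))
```

With this made-up sample, four of the ten Age values have a leading or trailing tab and two Favorite Coffee values are null, which mirrors the 40% and 20% figures mentioned in the lesson.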

And for a user of that data, it can skew the results of your queries. So cleaning the data set is very important before using it. Let's go to the Preparation tab and drag a Data Cleansing tool into our workflow. In the configuration pane on the left, we have several options for selecting which fields we want to clean and how to clean them. The fields in scope for this exercise are Employee ID, First Name, Age, and Favorite Coffee. Under Replace Nulls, let's replace null string values with an empty string and replace null numeric fields with zero.

Just to quickly cover this off: nulls and blank values are two different things. Null means that nothing is stored in that field, whilst blank means an empty value is stored in that field. Under Remove Unwanted Characters, we can untick Leading and Trailing Whitespace, since we didn't have any. We can tick Tabs, Line Breaks, and Duplicate Whitespace to fix the values that we had in Age. Under All Whitespace, we'll tick that to fix Employee ID, which had white spaces between the numbers. We can leave Letters and Numbers alone, since we actually want those in our data set.
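The difference between null and blank, and the Replace Nulls choices described above, can be sketched in plain Python, where `None` stands in for null and `""` for a blank string. This is an illustration of the idea only, not the tool's implementation. The record, the field "Years of Service", and the names `replace_null` and `NUMERIC_FIELDS` are hypothetical.

```python
def replace_null(value, numeric):
    """Replace a null with 0 for numeric fields or "" for string fields."""
    if value is None:
        return 0 if numeric else ""
    return value


# Hypothetical record: None is a null, "" is a blank that is actually stored.
record = {
    "Years of Service": None,   # null in a numeric field (assumed field)
    "Favorite Coffee": None,    # null in a string field
    "First Name": "",           # blank, not null
}

# Assumption: we know up front which fields are numeric.
NUMERIC_FIELDS = {"Years of Service"}

cleaned = {
    field: replace_null(value, field in NUMERIC_FIELDS)
    for field, value in record.items()
}

if __name__ == "__main__":
    print(cleaned)
```

Note that the blank First Name passes through unchanged: only values that are truly null get replaced.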

And we can tick Punctuation to remove the exclamation marks in First Name. Let's run our workflow with Ctrl+R, and we can see that our data set has been cleansed. All our columns show a green color, and we can see that the white space has been removed from this record. First Name has had its punctuation removed, the tabs in Age are gone, and our null values in Favorite Coffee have been replaced.
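For readers who think in code, the character-removal options used in this exercise have rough plain-Python analogues. The sketch below is an approximation under stated assumptions, and the tool's exact behavior may differ: "all whitespace" is taken to mean deleting every whitespace character, "tabs, line breaks, and duplicate whitespace" is taken to mean collapsing each run of whitespace into a single space, and "punctuation" is taken to mean the ASCII set in Python's `string.punctuation`. The function names and sample values are hypothetical.

```python
import re
import string


def remove_all_whitespace(text):
    """Delete every whitespace character, including spaces inside the value."""
    return re.sub(r"\s+", "", text)


def collapse_whitespace(text):
    """Turn each run of tabs, line breaks, or spaces into a single space."""
    return re.sub(r"\s+", " ", text)


def remove_punctuation(text):
    """Delete ASCII punctuation such as exclamation marks."""
    return text.translate(str.maketrans("", "", string.punctuation))


if __name__ == "__main__":
    # Hypothetical values resembling the anomalies described in the lesson.
    print(remove_all_whitespace("10 0 23"))
    print(remove_punctuation("Sarah!"))
    print(collapse_whitespace("a\t\tb  c\nd"))
```

Collapsing whitespace still leaves a single space where a leading or trailing tab used to be, so in plain Python you would typically follow it with `str.strip()` if edge whitespace also needs to go.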

Alteryx Essentials

2.1 Data Cleansing
