Data cleaning is a familiar step that lets us explore data and see beyond its raw form. Many tools can handle it, but there is a recurring problem.
The problem: whenever you want to import a CSV file, you go to Google out of habit to find the two lines you always forget (in Python, for example). Then you open your text editor, create a file, and paste in what you found.
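For reference, those forgotten lines tend to look something like the sketch below. It assumes Python's standard `csv` module. The in-memory sample and its `name` and `score` columns are made up for illustration and stand in for a real file on disk.

```python
import csv
import io

# Hypothetical sample standing in for a CSV file on disk.
sample = io.StringIO("name,score\nada,90\nlinus,85\n")

# The part people tend to look up: read every row into a dictionary.
rows = list(csv.DictReader(sample))

# Quick sanity check: first value in the "name" column and the row count.
print(rows[0]["name"], len(rows))
```

With a real file, you would pass the handle returned by `open("your_file.csv", newline="")` instead of the in-memory sample.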
Why the command line?
Even the simplest data cleaning tasks can feel frustrating or like a waste of time. Maybe you use a higher-level library like pandas, but I bet you still write more code than you would in the terminal, where several lines of code can often be packed into a single one-liner.
This ebook makes working with CSV files, JSON, and text files in general much easier.
What's in it for you?
In this ebook, I try to save you time and the hassle of dealing with files at the system level. You may also enjoy the adventure of exploring command-line tools and programs you may not have heard of. I encourage you to try these tools, as I do in my daily work.
Working at the command line may sound a bit geeky, but this ebook is simple, easy to follow, and a lot of fun. It includes real examples from a scientific paper, Covid tracking project data, Reddit user data, and more, so you can practice with useful programs and tools from the comfort of your command line.
Content:
Who should get this ebook?
If you are a data scientist, data engineer, data analyst, or software developer, or you simply work with data files a lot (such as TXT, CSV, or JSON), this ebook is for you.
You should have a basic understanding of how the terminal works.