Real-Time Spark Project for Beginners: Hadoop, Spark, Docker

In many data centers, different types of servers generate large amounts of data in real time (each event, in this case, is the status of a server in the data center). This data needs to be processed in real time to generate insights for the teams that monitor the servers and the data center. They have to track server status regularly and resolve issues as they occur to keep the servers stable. Because the data is large and arrives in real time, we need to choose the right architecture, with scalable storage and computation frameworks and technologies.

Hence we want to build a real-time data pipeline using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django, and Flexmonster on Docker to generate insights from this data. The Spark project (data pipeline) is built using Apache Spark with Scala and PySpark on an Apache Hadoop cluster that runs on Docker. The data visualization is built using the Django web framework and Flexmonster.
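To make the idea of a server-status event concrete, here is a minimal sketch in plain Python of what an event simulator and a simple status count might look like. It is not the course's code: the field names (server_id, status, event_time), the status values, and the server names are all assumptions made for illustration, and the in-memory count only stands in for the aggregation that the pipeline described above would perform with Spark on events read from Kafka.

```python
import json
import random
from collections import Counter
from datetime import datetime, timezone

# Hypothetical values: the course's actual event schema is not shown
# in this description, so these names are assumptions.
SERVER_IDS = [f"server-{n:02d}" for n in range(1, 6)]
STATUSES = ["OK", "WARNING", "CRITICAL"]


def make_event(rng):
    """Build one simulated server-status event as a plain dict."""
    return {
        "server_id": rng.choice(SERVER_IDS),
        "status": rng.choice(STATUSES),
        "event_time": datetime.now(timezone.utc).isoformat(),
    }


def simulate(count, seed=None):
    """Return `count` events serialized as JSON strings,
    the form a message producer would typically send."""
    rng = random.Random(seed)
    return [json.dumps(make_event(rng)) for _ in range(count)]


def count_by_status(messages):
    """Parse JSON messages and count events per status.
    This is an in-memory stand-in for a streaming aggregation."""
    return Counter(json.loads(m)["status"] for m in messages)


if __name__ == "__main__":
    messages = simulate(10, seed=42)
    print(messages[0])                      # one raw JSON event
    print(dict(count_by_status(messages)))  # events per status
```

Serializing each event to JSON keeps the simulator independent of the transport, so the same messages could be handed to a Kafka producer without changing how they are built.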

Author

Pari Margu

Data Engineer (Big Data/Hadoop, Apache Spark, Python), freelance consultant, and YouTube creator. Has 12+ years of experience implementing solutions for enterprise clients, with strong framework skills for implementing complex business solutions. Has worked on web, Windows, mobile, Hadoop/Big Data, and Apache Spark applications. Having 6+...

School

Pari Margu's School

Requirements

You should have a basic understanding of a programming language, Apache Hadoop, and Apache Spark.

Class Contents

Introduction

Introduction to Apache Spark

Real Time Spark Project Overview: Building End to End Streaming Data Pipeline

Environment Setup

Setting up Docker Environment

Create Single Node Kafka Cluster on Docker

Create Single Node Apache Hadoop and Spark Cluster on Docker

Setting up IntelliJ IDEA Community Edition (IDE)

Setting up PyCharm Community Edition (IDE)

Setting up Django Web Framework

Development: Project Code Walk-through

Event Simulator using Python (Server Status Detail)

Building Streaming Data Pipeline using Scala | Spark Structured Streaming

Building Streaming Data Pipeline using PySpark | Spark Structured Streaming

Setting up PostgreSQL Database (Events Database)

Building Dashboard using Django Web Framework and Flexmonster | Visualization

Complete Project Demo

Real Time Spark Project Demo

Running Real Time Streaming Data Pipeline using Spark Cluster On Docker

Bonus Tutorial: Docker Tutorial for Beginners

Introduction to Docker

Install Docker on Ubuntu 18.04

Docker Commands: Commonly Used

Create First Docker Image and Container

Create MySQL Docker Container

Cassandra on Docker Container

MongoDB on Docker Container

Setting up Docker Compose

How to Create Docker Volume

One-time Fee

$19.99


What's Included

Language: English

Level: All levels

Skills: Apache Kafka, PostgreSQL, Flexmonster, Apache Hadoop, Apache Spark, Django, Docker

Age groups: 18+ years

Duration: 6 hours 34 minutes

24 Videos

13 Documents
