Big Data Engineering - TutorialScience

Course Overview

Dive into the world of big data engineering and learn to build scalable systems that handle massive datasets. Master the tools and technologies used by top tech companies to process petabytes of data efficiently.

What You'll Learn

Apache Spark for distributed computing
Hadoop ecosystem and HDFS
Cloud platforms (AWS, Azure, GCP)
Stream processing with Kafka
Data pipeline orchestration
NoSQL databases (MongoDB, Cassandra)

Curriculum

Weeks 1-2: Big Data Fundamentals

Introduction to big data concepts, distributed systems, and data engineering principles

Weeks 3-5: Hadoop Ecosystem

HDFS, MapReduce, Hive, HBase, and cluster management

Weeks 6-8: Apache Spark

Spark Core, SQL, Streaming, MLlib, and performance optimization

Weeks 9-10: Cloud & Streaming

Cloud data services, Kafka, real-time processing, and microservices

Week 11: NoSQL & Data Lakes

NoSQL databases, data lake architecture, and the modern data stack

Week 12: Capstone Project

Build and deploy a complete big data solution with real-time analytics

Course Details

Duration: 12 weeks

Level: Advanced

Students: 1,156

Rating: 4.5 (156)

Price: $379

Your Instructor

Michael Thompson

Principal Data Engineer at Netflix

MS in Computer Science from Stanford, 12+ years in big data, architect of Netflix's recommendation data pipeline serving 200M+ users.

Prerequisites

Strong programming skills (Python/Java)
Database and SQL knowledge
Basic distributed systems concepts
Linux command line experience

Hands-on Projects

Real-time Analytics Pipeline

Build a streaming data pipeline using Kafka and Spark for real-time user behavior analysis.

Kafka + Spark

Data Lake Architecture

Design and implement a cloud-based data lake using AWS services for petabyte-scale storage.

AWS + Hadoop

ML Pipeline at Scale

Create an end-to-end machine learning pipeline that processes millions of records daily.

Spark MLlib
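
As a taste of the Real-time Analytics Pipeline project above, the windowed counting at the heart of user behavior analysis can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `tumbling_window_counts`, the `(timestamp, user_id)` event shape, and the sample events are hypothetical and not taken from the course material. A real pipeline would read events from Kafka and let Spark's windowing do this work instead of an in-memory loop.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds=60):
    """Count events per user in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, user_id) tuples.
    Returns {window_start: {user_id: count}} ordered by window start.
    """
    counts = defaultdict(Counter)
    for timestamp, user_id in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start][user_id] += 1
    return {start: dict(users) for start, users in sorted(counts.items())}


# Hypothetical click events: (seconds since stream start, user id)
sample_events = [(5, "alice"), (42, "bob"), (59, "alice"), (61, "alice"), (130, "bob")]
result = tumbling_window_counts(sample_events)
print(result)
# {0: {'alice': 2, 'bob': 1}, 60: {'alice': 1}, 120: {'bob': 1}}
```

The sketch assumes events arrive with usable timestamps and fit in memory. A streaming engine additionally has to handle late and out-of-order events, which this loop ignores.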

Student Success Stories

"This course was exactly what I needed to advance from traditional databases to big data. Now I'm leading data architecture at Spotify!"

Ryan Chen

Senior Data Engineer, Spotify

"The hands-on projects were incredible. Building real-time pipelines with Michael's guidance prepared me perfectly for my role at Uber."

Priya Sharma

Data Platform Engineer, Uber