lmizner/Codecademy_Big_Data_with_PySpark

Codecademy_Big_Data_with_PySpark

See how big data is used across different industries and learn how to work with big data using PySpark!

The course covers the following topics:

  • Introduction to Big Data

    • Learn how big data is defined, how it is stored and processed, and which ethical considerations to keep in mind
  • Spark RDDs with PySpark

    • Learn one way that Spark handles big data – through Resilient Distributed Datasets (RDDs)
  • Spark DataFrames with PySpark SQL

    • Learn how PySpark lets you run SQL-like queries on large datasets
  • Putting it all Together

    • Combine everything you’ve learned so far about PySpark to work with a big data dataset!
