Course curriculum

    1. What is PySpark?

    2. PySpark Features & Advantages

    3. PySpark Architecture

    4. PySpark Ecosystem

    5. Use Cases and Applications of PySpark

    1. Install PySpark 3.5 on macOS

    2. Install PySpark 3.5 on Windows

    3. Install Anaconda, PySpark 3.5 and Jupyter Notebook

    1. What is Spark Session?

    2. Creating SparkSession

    3. SparkSession Most Used Methods

    4. What is Spark Context?

    5. What does SparkContext do?

    6. SparkContext Most Used Methods

    7. FAQs or Interview Questions

    1. RDD - Introduction

    2. RDD - Create RDD from Parallelize

    3. RDD - Collect Data from RDD

    4. RDD - Read Text and CSV File

    5. RDD - How to Parallelize RDD?

    6. RDD - Transformations

    7. RDD - Actions

    8. RDD - Word Count Example

    9. RDD - Repartition

    10. RDD - Types of RDD

    1. RDD - Cache and Persistence

    2. Spark Persistence Levels

    3. RDD - Broadcast Variables

    4. RDD - Accumulator Variables

    1. PySpark Next Steps

    2. RDD Examples to Explore

About this course

  • Free
  • 31 lessons
