Webinar

Apache Spark Basics

Contents

  • Introduction to Apache Spark with Python (PySpark)
    • Overview of big data processing challenges
    • Introduction to distributed computing and parallel processing
    • Introduction to Spark's architecture and components (driver, executor, cluster manager)
    • Comparison with traditional batch processing frameworks (Hadoop MapReduce)
    • Setting up Spark with the Python shell
  • Spark Fundamentals with PySpark
    • Understanding Resilient Distributed Datasets (RDDs)
      • RDD characteristics (immutable, partitioned, resilient)
      • RDD operations: transformations (map, filter, flatMap, etc.) and actions (count, collect, reduce, etc.)
      • Lazy evaluation and lineage in Spark
    • Hands-on exercises using PySpark
  • Spark Streaming
    • Introduction to Spark Streaming
    • Streaming data processing concepts
    • DStream (Discretized Stream) operations in Spark Streaming
      • Windowed operations
      • Stateful processing using updateStateByKey()
    • Handling data sources (Flume, Kafka) and sinks (HDFS, Cassandra) in Spark Streaming
    • Hands-on exercises with Spark Streaming
  • Integration with Flume, Kafka, and Cassandra
    • Introduction to Apache Flume and its integration with Spark
      • Overview of Flume's event-based data ingestion
      • Setting up Flume agents and Spark integration
    • Integration of Apache Kafka with Spark Streaming
      • Overview of Kafka's distributed publish-subscribe messaging system
      • Configuring Kafka and Spark integration for real-time data processing
    • Introduction to Apache Cassandra and its integration with Spark
      • Overview of Cassandra's distributed NoSQL database
      • Connecting Spark to Cassandra for data storage and retrieval
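
As a taste of the RDD concepts listed above (immutable datasets, transformations versus actions, lazy evaluation, lineage), here is a minimal plain-Python sketch. It does not use Spark: `LazyDataset` and its methods are hypothetical names made up for illustration, and real RDDs are additionally partitioned across executors, which this toy ignores.

```python
from functools import reduce as _reduce


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, source, lineage=()):
        self._source = list(source)
        self._lineage = tuple(lineage)  # recorded (name, function) steps, not yet run

    # Transformations: return a NEW dataset and only record the step.
    def map(self, fn):
        return LazyDataset(self._source, self._lineage + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._lineage + (("filter", fn),))

    def flat_map(self, fn):
        return LazyDataset(self._source, self._lineage + (("flatMap", fn),))

    def lineage(self):
        return [name for name, _ in self._lineage]

    # Actions: replay the recorded lineage over the source and return a result.
    def collect(self):
        data = iter(self._source)
        for name, fn in self._lineage:
            if name == "map":
                data = map(fn, data)
            elif name == "filter":
                data = filter(fn, data)
            else:  # flatMap: apply fn, then flatten one level
                data = (item for chunk in map(fn, data) for item in chunk)
        return list(data)

    def count(self):
        return len(self.collect())

    def reduce(self, fn):
        return _reduce(fn, self.collect())


calls = []  # records when the tokenizer actually runs


def tokenize(line):
    calls.append(line)
    return line.split()


lines = LazyDataset(["spark makes big data simple", "big data needs spark"])
words = lines.flat_map(tokenize).filter(lambda w: len(w) > 3).map(str.upper)

ran_before_action = len(calls)  # 0: transformations alone did no work
result = words.collect()        # the action triggers the whole pipeline
```

Building `words` only records three steps; `tokenize` first runs when `collect()` is called, and `lines` itself is never modified. In Spark the recorded lineage serves the same purpose and also lets lost partitions be recomputed.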

Learning objectives

The goal of the Apache Spark Basics course is to give participants a solid understanding of Apache Spark and its fundamental concepts. By the end of the course, participants should understand the challenges of big data processing and the advantages of Spark, as well as Spark's architecture and its components, such as the driver, executor, and cluster manager. They will learn how to work with Resilient Distributed Datasets (RDDs) and apply transformations and actions to them. They will also learn to use Spark Streaming for real-time data processing and to integrate Spark with other technologies such as Flume, Kafka, and Cassandra. Through hands-on exercises using PySpark, participants will develop practical skills for using Apache Spark in big data processing and analytics tasks.

Target audiences

  • Data Engineers: Data engineers responsible for processing and analyzing large datasets can benefit from learning Apache Spark to leverage its distributed computing capabilities.
  • Data Scientists: Data scientists looking to work with big data and perform advanced analytics can enhance their skills by gaining knowledge of Apache Spark and its machine learning library, MLlib.
  • Software Developers: Software developers interested in distributed computing and working with big data can expand their skill set by learning Apache Spark and PySpark.
  • Data Analysts: Data analysts who want to analyze and process large datasets efficiently can learn Apache Spark to improve their data processing workflows.
  • IT Professionals: IT professionals involved in managing big data infrastructure and processing can benefit from understanding Apache Spark's architecture and its integration with other technologies.

Dates and locations

Date                      Duration  Price
Webinar
23.07.2026 - 24.07.2026   14 h      see details
10.12.2026 - 11.12.2026   14 h      see details
11.03.2027 - 12.03.2027   14 h      see details
10.06.2027 - 11.06.2027   14 h      see details
09.09.2027 - 10.09.2027   14 h      see details
09.12.2027 - 10.12.2027   14 h      see details

SG seminar no.: 9261306

Provider seminar no.: 2973

Seminars with a listed date have places available. Invoicing is handled by the organizer. For VAT information, see the details of the individual date.

The provider is responsible for the content.

Event information

  • Webinar
  • German
    • None
  • 14 h
  • Provider rating (258)
