Apache Spark — Multi-part Series: Spark Architecture

By Medium - 2021-03-15

Description

Spark architecture was one of the toughest things for me to grasp when I first started learning Spark. I think one of the main reasons is that there is a vast amount of information out there, but nothing…

Summary

  • Figure: Apache Spark — Multi-part Series: Driver Node and Worker Node Architecture (created by Luke Thorp). Worker nodes can communicate and pass data between each other, but the driver node alone is responsible for giving the workers jobs to complete.
  • Figure: Cluster Task Assignment (created by Luke Thorp). Once this is done and resources have been allocated, tasks are distributed to the worker nodes (executors) that have free capacity, and the driver program monitors their progress.
  • For example, there are tabs for Jobs and Stages.
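
The task assignment described in the summary can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's actual scheduler: the function name `assign_tasks`, the worker names, and the "most free slots first" rule are all assumptions made for illustration.

```python
from collections import deque


def assign_tasks(tasks, executor_slots):
    """Toy driver loop: give each pending task to an executor with a free slot.

    tasks          -- list of task names (hypothetical)
    executor_slots -- dict mapping executor name -> number of free slots
    Returns (assignment, waiting): the tasks placed on each executor, and the
    tasks that could not be placed because no executor had capacity left.
    """
    pending = deque(tasks)
    free = dict(executor_slots)  # copy so the caller's dict is not mutated
    assignment = {name: [] for name in executor_slots}

    while pending:
        # Assumed policy: pick the executor with the most free slots.
        name = max(free, key=free.get)
        if free[name] == 0:
            break  # every executor is busy; remaining tasks must wait
        assignment[name].append(pending.popleft())
        free[name] -= 1

    return assignment, list(pending)


if __name__ == "__main__":
    placed, waiting = assign_tasks(
        ["t1", "t2", "t3", "t4"], {"worker-a": 2, "worker-b": 1}
    )
    print(placed)
    print(waiting)
```

In this model only the driver-side function decides where work goes, which mirrors the point above that workers may exchange data but do not hand out tasks themselves.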

Topics

  1. Backend (0.25)
  2. Machine_Learning (0.07)
  3. UX (0.07)
