
talbi28/data-algorithms-with-spark

 
 


Goal of this book: to enable readers to write efficient, simpler code for data algorithms using Spark.




Software:

  • Spark: Apache Spark 3.2.0
  • Python: Python 3.7.2
  • Scala: Scala 2.13
  • Java: Java 8

Table of Contents

  • Bonus Chapters: TF-IDF, Correlation, K-mers, anagrams, ...
  • Chapter 1: Introduction to Data Algorithms
  • Chapter 2: Transformations in Action
  • Chapter 3: Mapper Transformations
  • Chapter 4: Reductions in Spark
  • Chapter 5: Partitioning Data
  • Chapter 6: Graph Algorithms
  • Chapter 7: Interacting with External Data Sources
  • Chapter 8: Ranking Algorithms
  • Chapter 9: Fundamental Data Design Patterns
  • Chapter 10: Common Data Design Patterns
  • Chapter 11: Join Design Patterns
  • Chapter 12: Feature Engineering in PySpark

Data Algorithms with Spark

About

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian


Languages

  • Python 50.7%
  • Scala 40.1%
  • Shell 9.2%