gm-spacagna/sparkz

sparkz


A proof-of-concept extension to the amazing Spark framework for better functional programming. The project aims to extend, and in a few cases re-implement, some of the functionalities and classes in the Apache Spark framework.

The main motivation is to make the APIs of some Machine Learning components statically typed, to provide the functional structures missing from some classes (Broadcast variables, data validation pipelines, utility classes...) and to work around the unnecessary limitations imposed by private fields and methods. The project also introduces a set of utility functions, implicits and tutorials that show the power, conciseness and elegance of the Spark framework when combined with a fully functional design.

Sonatype dependency

Maven:

<dependency>
  <groupId>com.github.gm-spacagna</groupId>
  <artifactId>sparkz_2.10</artifactId>
  <version>0.1.0</version>
</dependency>

sbt:

"com.github.gm-spacagna" % "sparkz_2.10" % "0.1.0"

Current features

WIP

  • Functor for Spark Broadcast
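
To illustrate what a functor for Broadcast variables could look like, here is a minimal sketch. It is not taken from the sparkz sources: the names `BroadcastFunctorSketch`, `BroadcastOps` and `example` are hypothetical, and the choice to apply the function eagerly on the driver and re-broadcast the result is an assumption. Only `SparkContext.broadcast` and `Broadcast.value` come from the standard Spark API.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast
import scala.reflect.ClassTag

object BroadcastFunctorSketch {

  // Hypothetical enrichment: adds a functor-style `map` to Broadcast.
  // This is an illustration only, not necessarily the API sparkz exposes.
  implicit class BroadcastOps[A](val bc: Broadcast[A]) extends AnyVal {

    // Assumption: evaluate `f` eagerly on the driver, then broadcast the
    // transformed value as a new, independent Broadcast variable.
    def map[B: ClassTag](f: A => B)(implicit sc: SparkContext): Broadcast[B] =
      sc.broadcast(f(bc.value))
  }

  // Usage: derive a broadcast of word lengths from a broadcast of words.
  def example(implicit sc: SparkContext): Broadcast[Set[Int]] = {
    val words: Broadcast[Set[String]] = sc.broadcast(Set("spark", "scala"))
    words.map(_.map(_.length))
  }
}
```

With `map` available, a broadcast value can be transformed without manually unwrapping it and calling `sc.broadcast` again at every call site. An eager implementation like this one ships a second broadcast to the executors; a lazy variant that defers the function until the value is read would trade that cost for recomputation, so which one fits depends on the workload.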

Limitations

The original Spark implementations are intentionally not fully functional: they rely on mutable data structures for efficiency and to avoid overloading the garbage collector. This project is only a proof of concept, intended to inspire developers, data scientists and engineers to think about their designs in purely functional terms; it does not guarantee better performance. You are strongly encouraged to tailor and tune each component to your specific needs.

Related projects
