Some Examples of Dask

Dask can be used to implement data processing functions in a data platform. For example, you can:

extract and transform data that can be represented as DataFrames
define custom task graphs for data processing

The examples here illustrate how Dask supports the embarrassingly parallel model and task graphs for data processing.
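
To make the two ideas concrete, here is a minimal sketch that uses dask.delayed. The functions `clean` and `summarize` are hypothetical placeholders, not code from this repository: the per-record calls are independent of each other (the embarrassingly parallel part), and the final call depends on all of them (the task graph part).

```python
from dask import delayed


def clean(record: int) -> int:
    """Hypothetical per-record step; each call is independent of the others."""
    return record * 2


def summarize(values: list) -> int:
    """Hypothetical reduction step that needs every cleaned value."""
    return sum(values)


# Embarrassingly parallel part: one independent task per input record.
cleaned = [delayed(clean)(record) for record in range(5)]

# Task graph part: a final task that depends on all the tasks above.
total = delayed(summarize)(cleaned)

# Nothing has been executed so far; compute() runs the whole graph.
result = total.compute()
print(result)  # 20
```

Without a distributed client, `compute()` uses Dask's default local scheduler, so this sketch should run without any cluster.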

Setup

Set up Dask

Set up Dask by following the Dask documentation.

Dask and the libraries required for the code examples can be installed with rye:

$rye sync

This installs Dask and the other libraries used by the examples.

Create a simple distributed cluster on a single machine

Follow the Dask documentation to create a distributed cluster on a single machine. In short, the following steps create the cluster:

Run a scheduler in a terminal:

$dask scheduler

Open new terminals and start workers (as many as you need):

$dask worker localhost:8786 --nworkers 2 --nthreads 4

Here, "localhost" and 8786 are the host and port of the scheduler. Choose the number of workers and the number of threads per worker according to your needs.

Then open the dashboard (http://localhost:8787/status) to check that everything works.
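
Once the scheduler and workers are running, a Python program can attach to the cluster through `dask.distributed.Client`. The sketch below assumes the scheduler address used above (localhost, port 8786); `square` and `sum_of_squares` are illustrative names, not part of this repository.

```python
from dask.distributed import Client

# Assumption: the scheduler started above listens on its default port.
SCHEDULER_ADDRESS = "tcp://localhost:8786"


def square(x: int) -> int:
    """Illustrative task that runs on a worker."""
    return x * x


def sum_of_squares(address: str = SCHEDULER_ADDRESS, n: int = 10) -> int:
    """Connect to a running scheduler and sum the squares of 0..n-1."""
    with Client(address) as client:
        # One task per input; the scheduler spreads them over the workers.
        futures = client.map(square, range(n))
        # A final task that depends on all the futures above.
        total = client.submit(sum, futures)
        return total.result()
```

Calling `sum_of_squares()` only works while the cluster is up. The submitted tasks should then also appear on the dashboard.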

Running example code

Run the code examples with python after activating the environment:

$python src/...

Alternatively, run them through rye without activating the environment:

$rye run python src/...

Examples

We use taxi data and bts data as examples.

DataFrame partitions: shows how data can be partitioned in Dask.
Basic calculation with Dask DataFrame: illustrates a simple Dask program, very similar to pandas code, running on a local machine.
Distributed Dask calculation with Dask DataFrame: illustrates using distributed resources for data processing.
Using delayed and future features to define a task graph: illustrates delayed and future tasks for customized graphs.
