
Conversation

lvermue commented Apr 5, 2019

Full parallelization was added to the package using the joblib library.
Now NxM matrices, i.e. N time series with M time points each, can be calculated in parallel.
To handle series of different lengths, the missing time points can be padded with np.nan values.
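For illustration, a minimal sketch of how two series of different lengths could be padded to a common length with np.nan before being stacked into the NxM input matrix (the example data is arbitrary):

import numpy as np

# Two time series of different lengths
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 4.0, 6.0])

# Pad the shorter series with np.nan so both have the same number of time points
max_len = max(len(a), len(b))
X = np.full((2, max_len), np.nan)
X[0, :len(a)] = a
X[1, :len(b)] = b
# X can now be passed to the parallel function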

The changes were tested on a machine with 20 cores, leading to the following results:
Single core

import numpy as np
from scipy.spatial.distance import euclidean
from time import time
from fastdtw import fastdtw

X = np.random.randint(1, 40, size=(100, 100))
dist_mat = np.zeros((X.shape[0], X.shape[0]))
indices = np.vstack(np.triu_indices(X.shape[0], 1)).T

start = time()

for row, column in indices:
    distance, path = fastdtw(X[row], X[column], dist=euclidean)
    dist_mat[row, column] = distance

print('It took {:.0f} seconds'.format(time()-start))

# It took 175 seconds

Parallel

import numpy as np
from scipy.spatial.distance import euclidean
from time import time
from fastdtw import fastdtw_parallel, get_path

# Using the X matrix from the single-core example above
X = X

start = time()

# Same machine with 20 cores
distance_matrix, path_list = fastdtw_parallel(X, dist=euclidean, n_jobs=-1)

print('It took {:.0f} seconds'.format(time()-start))

# It took 11 seconds

Examples on how to use the new functions were added to the README.rst file and the docstring of the respective functions.

lvermue added 2 commits April 5, 2019 18:24
Full parallelization was added using the joblib library as well as a helper
function to handle the nested path lists.
	modified:   README.rst
	modified:   fastdtw/__init__.py
	modified:   fastdtw/_fastdtw.pyx
	modified:   fastdtw/fastdtw.py
The files have been updated to incorporate the respective changes within
the _fastdtw.pyx and fastdtw.py files.
	modified:   fastdtw/__init__.py
	modified:   fastdtw/_fastdtw.cpp
slaypni (Owner) commented Apr 6, 2019

@lvermue Thank you for the PR!
The execution time improvement looks significant.

Could you write tests for the new functions?
(Struck through because of the following question)

slaypni closed this Apr 6, 2019
slaypni reopened this Apr 6, 2019
slaypni (Owner) commented Apr 6, 2019

@lvermue Is just writing something like the following code insufficient?

import itertools

from fastdtw import fastdtw
from joblib import Parallel, delayed
import numpy as np

X = np.random.randint(1, 40, size=(100, 100))
results = Parallel(n_jobs=-1)(delayed(fastdtw)(X[i], X[j]) for i, j in itertools.product(range(100), repeat=2))
distance_mat = np.array([r[0] for r in results]).reshape(100, 100)

lvermue (Author) commented Apr 8, 2019

@slaypni There are two main aspects to this:

  1. As it is written now, it includes some optimization considerations, namely:
  • It does not calculate the diagonal of the distance matrix

  • Because of the inner workings of Python parallelization, the data transferred to a worker has to be serialized, which causes overhead. The example given would require this serialization and deserialization step for each pairwise comparison, causing a lot of overhead. The way I wrote the module, it checks how many workers are available and cuts all tasks into even chunks, which are then transferred to the workers only once, so each worker receives its data once, calculates all of its tasks and returns the result (see the chunking sketch after this list). The effect of this can be shown by testing both approaches and profiling them:

    • Simple script

       import itertools
       from time import time
       from fastdtw import fastdtw
       from joblib import Parallel, delayed
       import numpy as np
      
       n = 1000
       X = np.random.randint(1, 40, size=(n, 40))
      
       start = time()
       results = Parallel(n_jobs=-1)(delayed(fastdtw)(X[i], X[j]) for i, j in itertools.product(range(n), repeat=2))
       distance_mat = np.array([r[0] for r in results]).reshape(n, n)
      
       print('It took {:.0f} seconds'.format(time()-start))
       # It took 82 seconds
      

      The profiling

      name                                          Call Count   Time (ms)   Own Time (ms)
      test_parallel_script.py                       1            82177       3
      call                                          1            81111       0
      retrieve                                      1            81028       3027
      wrap_future_result                            5966         77949       8
      result                                        5966         77941       48
      wait                                          5113         77874       29
      method 'acquire' of '_thread.lock' objects    10268        77825       77825
    • Method as proposed

      import numpy as np
      from scipy.spatial.distance import euclidean
      from time import time
      from fastdtw import fastdtw_parallel, get_path
      
      # Using the X-matrix from above
      X = X
      
      start = time()
      # Same machine with 20 cores
      distance_matrix, path_list = fastdtw_parallel(X, n_jobs=-1)
      print('It took {:.0f} seconds'.format(time()-start))
      
      # It took 24 seconds
      

      The profiling

      name                                          Call Count   Time (ms)   Own Time (ms)
      test_parallel.py                              1            24683       0
      fastdtw._fastdtw.fastdtw_parallel             1            24485       91
      call                                          1            24161       0
      method 'acquire' of '_thread.lock' objects    34           24019       24019
      retrieve                                      1            24019       0
      wait                                          6            24019       0
      result                                        20           24018       0
      wrap_future_result                            20           24018       0

      As can be seen in the two profiles, the proposed module calls each worker only once, making it about 3.5 times faster (82 s vs. 24 s) than the simple script.

  2. It would not be as user-friendly, especially for users of this package who are less versed in Python.
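For reference, a minimal sketch of the chunking idea from the first point above, written against the public fastdtw API with joblib; the helper name _compute_chunk and the way the pairs are split are purely illustrative and not the PR's internal implementation:

import itertools
import numpy as np
from joblib import Parallel, delayed, cpu_count
from fastdtw import fastdtw

def _compute_chunk(X, pairs):
    # Each worker receives its whole chunk of index pairs once
    # and returns all of its distances in a single result.
    return [(i, j, fastdtw(X[i], X[j])[0]) for i, j in pairs]

n = 100
X = np.random.randint(1, 40, size=(n, 40))
pairs = list(itertools.combinations(range(n), 2))

n_jobs = cpu_count()
chunks = [pairs[k::n_jobs] for k in range(n_jobs)]  # one even chunk per worker

results = Parallel(n_jobs=n_jobs)(delayed(_compute_chunk)(X, chunk) for chunk in chunks)

dist_mat = np.zeros((n, n))
for chunk_result in results:
    for i, j, d in chunk_result:
        dist_mat[i, j] = d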

lvermue added 2 commits April 8, 2019 13:15
Previously the method assumed symmetric behaviour of the DTW method and
created symmetric distance matrices by copying the upper-triangle
distances to the lower triangle. Now the method fills the lower triangle
correctly by explicitly calculating those inverse relations.
	modified:   fastdtw/_fastdtw.cpp
	modified:   fastdtw/_fastdtw.pyx
	modified:   fastdtw/fastdtw.py
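A quick way to check whether the approximation is symmetric for a given pair of series (purely illustrative; the variable names are not from the PR):

import numpy as np
from fastdtw import fastdtw

a = np.random.randint(1, 40, size=40)
b = np.random.randint(1, 40, size=40)

d_ab, _ = fastdtw(a, b)
d_ba, _ = fastdtw(b, a)
# If the two directions differ, copying the upper triangle would be wrong,
# which is why the lower triangle is now computed explicitly.
print(d_ab, d_ba)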
slaypni (Owner) commented Apr 13, 2019

@lvermue As you mentioned, the simple script could cut its execution time roughly in half by replacing itertools.product with itertools.combinations to skip unnecessary pairs. Even in that case, the simple version takes 39 seconds, which is still about 60% longer than the proposed version.
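For reference, a minimal sketch of that combinations-based variant of the simple script; it mirrors the benchmark above and fills both triangles from each pair, which treats the distance as symmetric:

import itertools
from time import time

import numpy as np
from joblib import Parallel, delayed
from fastdtw import fastdtw

n = 1000
X = np.random.randint(1, 40, size=(n, 40))
pairs = list(itertools.combinations(range(n), 2))

start = time()
results = Parallel(n_jobs=-1)(delayed(fastdtw)(X[i], X[j]) for i, j in pairs)

dist_mat = np.zeros((n, n))
for (i, j), (distance, path) in zip(pairs, results):
    dist_mat[i, j] = dist_mat[j, i] = distance

print('It took {:.0f} seconds'.format(time() - start))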

So I think the proposed version is good for computing distance matrices, but I would also prefer some changes in terms of its code structure.

Glancing over the diff, I noticed there are repeated code patterns that seem redundant, so it would be nicer to consolidate them.

Also, computing a distance matrix is a bit outside the scope of this package, but it would still be nice to have a convenient function for it. So, I would like to have the function under fastdtw.util, where such utilities can be placed, rather than directly under fastdtw.

Taking those points into account, I would prefer something like the following distmat to be implemented instead of dtw_parallel and fastdtw_parallel.

from functools import partial

import numpy as np

from fastdtw import fastdtw
from fastdtw.util import distmat

X = np.random.randint(1, 40, size=(100, 100))  # example input, as in the benchmarks above
dists, paths = distmat(partial(fastdtw, radius=3), X)
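A minimal sketch of what such a distmat helper under fastdtw.util could look like, assuming it takes any pairwise DTW callable and an NxM matrix and returns the distance matrix together with the paths; the loop is kept serial here for brevity, while the chunked parallelism from this PR would live inside the real function:

import numpy as np

def distmat(dtw_func, X):
    # Apply a pairwise DTW callable (e.g. partial(fastdtw, radius=3))
    # to every ordered row pair of X.
    n = X.shape[0]
    dists = np.zeros((n, n))
    paths = {}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            distance, path = dtw_func(X[i], X[j])
            dists[i, j] = distance
            paths[(i, j)] = path
    return dists, paths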
