Maintaining an order by nagpall · Pull Request #21695 · apache/spark

nagpall · 2018-07-02T09:06:19Z

What is the problem?

In both IndexedRowMatrix.computeSVD and IndexedRowMatrix.multiply, the indices are dropped before the corresponding RowMatrix methods are called.
For IndexedRowMatrix.multiply, I have observed that ordering within partitions is preserved, but the indices appear to get mixed up between partitions. For example, for this input:

part1Index1 part1Vector1
part1Index2 part1Vector2
part2Index1 part2Vector1
part2Index2 part2Vector2

I got:

part2Index1 part1Vector1
part2Index2 part1Vector2
part1Index1 part2Vector1
part1Index2 part2Vector2

You can find more details here:
https://issues.apache.org/jira/browse/SPARK-8614

What changes were proposed in this pull request?

Instead of converting the IndexedRowMatrix to a RowMatrix and losing the indices, this change keeps it as an IndexedRowMatrix: for each indexed row, it takes the index and the row vector, multiplies the row by the matrix, and places the result at the same index.
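
The idea can be illustrated with a minimal sketch in plain Scala. This is not the code in this patch and it does not use Spark: `IndexedVec`, `rowTimesMatrix`, and `multiply` are hypothetical names, and ordinary collections stand in for the distributed rows. It only shows that when the index travels with its row through the multiplication, there is no later step that has to re-pair indices with results.

```scala
// Hypothetical stand-in for an indexed row: a row index plus its values.
final case class IndexedVec(index: Long, values: Vector[Double])

object IndexPreservingMultiply {
  // Multiply one row vector of length n by an n x k matrix stored as n rows.
  def rowTimesMatrix(row: Vector[Double], b: Vector[Vector[Double]]): Vector[Double] = {
    require(b.nonEmpty && row.length == b.length, "dimension mismatch")
    val k = b.head.length
    // Entry j of the result is the dot product of the row with column j of b.
    Vector.tabulate(k)(j => row.indices.map(i => row(i) * b(i)(j)).sum)
  }

  // The index stays attached to its row, so the output order does not matter.
  def multiply(rows: Seq[IndexedVec], b: Vector[Vector[Double]]): Seq[IndexedVec] =
    rows.map(r => IndexedVec(r.index, rowTimesMatrix(r.values, b)))

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      IndexedVec(7L, Vector(1.0, 2.0)),
      IndexedVec(3L, Vector(0.0, 1.0))
    )
    val b = Vector(Vector(1.0, 0.0), Vector(0.0, 2.0))
    multiply(rows, b).foreach(println)
  }
}
```

Because each result carries its own index, shuffling the input rows changes only the order of the output, never which index is attached to which product.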

How was this patch tested?

With these changes, all unit tests pass for the mllib module.

Please review http://spark.apache.org/contributing.html before opening a pull request.

AmplabJenkins · 2018-07-02T09:08:16Z

Can one of the admins verify this patch?

srowen · 2018-07-02T14:06:53Z

As noted above, please see http://spark.apache.org/contributing.html

tygert · 2018-07-02T18:35:23Z

As requested, I can comment here: the issue cited is indeed a problem, and is fixed in hl475/svd#1

Maintining an order

d833d1e

nagpall closed this Jul 2, 2018

nagpall wants to merge 1 commit into apache:branch-2.3 from nagpall:patch-spark-8614
