Collaborator

@pkufool pkufool commented Aug 2, 2021

We hit several crashes when decoding a single wav, as posted in k2-fsa/snowfall#239. I think these crashes are all caused by empty inputs.

@luomingshuang's issue is located in the k2.index_select function and occurs when the input index tensor is empty; see the comments in the source code for details.

@alucassch's issue, which occurs in nbest_decoding, is a bug in k2.ragged.argmax_per_sublist. The function returns [-1] for the empty input, which causes strange behavior in the following code.

Update: I made a mistake. The return value of argmax_per_sublist is as expected; this was also a bug in k2.index(src, index). If src is an empty tensor, k2.index should return an empty tensor, but in the current implementation k2.index([], [-1]) returns [0].

argmax_indexes = k2.ragged.argmax_per_sublist(ragged_tot_scores)

# Since we invoked `k2.ragged.unique_sequences`, which reorders
# the index from `paths`, we use `new2old`
# here to convert argmax_indexes to the indexes into `paths`.
#
# Use k2.index here since argmax_indexes' dtype is torch.int32
best_path_indexes = k2.index(new2old, argmax_indexes)

paths_2axes = k2.ragged.remove_axis(paths, 0)

# best_paths is a k2.RaggedInt with 2 axes [path][arc_pos]
best_paths = k2.index(paths_2axes, best_path_indexes)
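
To make the failure mode concrete, here is a minimal pure-Python sketch of the two steps involved. It does not call k2; `argmax_per_sublist` and `index_1d` are hypothetical stand-ins that only model the behavior described in this comment. Two assumptions are made: the argmax is reported as a position in the flat `values` list, and an index of -1 into a non-empty `src` yields 0.

```python
from typing import List


def argmax_per_sublist(row_splits: List[int], values: List[float]) -> List[int]:
    """Hypothetical model: per sublist, the position in `values` of the
    largest element, or -1 when the sublist is empty."""
    result = []
    for begin, end in zip(row_splits[:-1], row_splits[1:]):
        if begin == end:
            # Empty sublist: there is no element to point at.
            result.append(-1)
        else:
            result.append(max(range(begin, end), key=lambda i: values[i]))
    return result


def index_1d(src: List[int], index: List[int]) -> List[int]:
    """Hypothetical model of the behavior proposed above: an empty `src`
    gives an empty result instead of [0]."""
    if not src:
        return []
    # Assumption: -1 means "no entry" and maps to 0 for a non-empty src.
    return [src[i] if i >= 0 else 0 for i in index]


# One empty sublist, i.e. [ [] ]: the argmax is [-1].
argmax_indexes = argmax_per_sublist([0, 0], [])
# new2old is empty too, so the lookup should stay empty.
best_path_indexes = index_1d([], argmax_indexes)
print(argmax_indexes, best_path_indexes)
```

With the old behavior, the lookup would produce [0], and the next `k2.index` call would then try to select path 0 from a ragged tensor that has no paths.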

Collaborator Author

pkufool commented Aug 2, 2021

I fixed an issue in TransposeRagged before (#782); it was also about empty input.

After reading lots of code, I am still confused about empty ragged tensors. I think we have two kinds of empty ragged tensor: with num_axes equal to two, row_splits(1) could be [0] or [0, 0]. Both are valid ragged tensors, for example an Fsa with no states and an Fsa with one state but no arcs.

Anyway, there are certainly other bugs related to empty ragged tensors in our code; I will find time to track them down.

@pkufool pkufool changed the title Fix some bugs of empty inputs [WIP] Fix some bugs of empty inputs Aug 2, 2021
@pkufool pkufool marked this pull request as draft August 3, 2021 00:51
Collaborator

danpovey commented Aug 3, 2021

Yes, in a 2-d ragged tensor, [ ] and [ [] ] are not the same but they both have zero elements. Note: the text form does not tell you the number of axes, for example [ ] is a representation of a valid ragged tensor with any number of axes.
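
The distinction between the two empty cases can be sketched in plain Python. This is only an illustration, not k2 code: `to_sublists` and `describe` are hypothetical helpers, and the sketch assumes the usual row_splits convention in which the array has one more entry than there are rows and its last entry equals the number of elements.

```python
from typing import List, Tuple


def to_sublists(row_splits: List[int], values: List[int]) -> List[List[int]]:
    """Hypothetical helper: expand a 2-axis ragged layout into nested lists."""
    return [values[b:e] for b, e in zip(row_splits[:-1], row_splits[1:])]


def describe(row_splits: List[int]) -> Tuple[int, int]:
    """Return (number of rows, number of elements) for a row_splits array."""
    return len(row_splits) - 1, row_splits[-1]


no_rows = [0]           # "[ ]": e.g. an Fsa with no states
one_empty_row = [0, 0]  # "[ [] ]": e.g. an Fsa with one state but no arcs

print(to_sublists(no_rows, []), describe(no_rows))
print(to_sublists(one_empty_row, []), describe(one_empty_row))
```

Both layouts hold zero elements, so code that only checks the element count cannot tell them apart; the number of rows has to be checked as well.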

@pkufool pkufool marked this pull request as ready for review August 3, 2021 03:53
@pkufool pkufool changed the title [WIP] Fix some bugs of empty inputs Fix some bugs of empty inputs Aug 3, 2021
Collaborator

danpovey commented Aug 3, 2021

It looks fine to me. If you find any case where you are really not sure what should happen, because in your opinion the comments on the interface do not properly define the behavior for some input, then let me know and we can discuss it.

@pkufool pkufool merged commit 81cec9e into k2-fsa:master Aug 7, 2021
@pkufool pkufool deleted the empty_input branch April 25, 2022 22:41