Previous experiments by @vysarge (June 2022) found that embedding lookup is slower with the tf.RaggedTensor representation than with fixed-length dense tf.Tensor inputs, as shown in this spreadsheet.
This task is about benchmarking the difference in embedding lookup performance between dense and ragged multi-hot columns, since MM makes extensive use of tf.RaggedTensor for multi-hot features and for sequential / session-based recommendation.
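A minimal sketch of what such a benchmark could look like is given below. It is not the setup used in the earlier experiments: the sizes (`VOCAB_SIZE`, `EMB_DIM`, `BATCH_SIZE`, `MAX_LEN`), the sum pooling, the use of id 0 as padding, and the helper names `ragged_lookup`, `dense_lookup` and `benchmark` are all assumptions made for illustration. Both paths compute the same pooled embedding, so the timings compare only the input representation.

```python
import timeit

import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE, EMB_DIM, BATCH_SIZE, MAX_LEN = 10_000, 64, 1024, 20

rng = np.random.default_rng(0)
row_lengths = rng.integers(1, MAX_LEN + 1, size=BATCH_SIZE)
# Ids start at 1 so that 0 can be used as the padding id (an assumption).
values = rng.integers(1, VOCAB_SIZE, size=int(row_lengths.sum()))

ragged_ids = tf.RaggedTensor.from_row_lengths(values, row_lengths)
dense_ids = ragged_ids.to_tensor(default_value=0, shape=[BATCH_SIZE, MAX_LEN])

table = tf.Variable(tf.random.normal([VOCAB_SIZE, EMB_DIM]))


@tf.function
def ragged_lookup(ids):
    # Look up the flat values, keep the ragged row structure, then sum-pool per row.
    emb = tf.ragged.map_flat_values(tf.nn.embedding_lookup, table, ids)
    return tf.reduce_sum(emb, axis=1)


@tf.function
def dense_lookup(ids):
    # Look up every position, then mask out padding before sum-pooling.
    emb = tf.nn.embedding_lookup(table, ids)
    mask = tf.cast(tf.not_equal(ids, 0), emb.dtype)[..., tf.newaxis]
    return tf.reduce_sum(emb * mask, axis=1)


def benchmark(fn, ids, repeats=20):
    fn(ids)  # warm-up call so tracing is excluded from the timing
    total = timeit.timeit(lambda: fn(ids).numpy(), number=repeats)
    return total / repeats


ragged_out = ragged_lookup(ragged_ids)
dense_out = dense_lookup(dense_ids)

print(f"ragged: {benchmark(ragged_lookup, ragged_ids) * 1e3:.3f} ms/batch")
print(f"dense:  {benchmark(dense_lookup, dense_ids) * 1e3:.3f} ms/batch")
```

Timings from a sketch like this depend heavily on the device, the batch size, and the distribution of row lengths, so they should be collected on the target hardware with realistic data.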
Notes
- The Merlin dataloader outputs ragged tensors (in the __values, __offsets format) if value_count.max is None in the column schema, and outputs a tf.Tensor if value_count.max == value_count.min.
- You can find more information in this related PR on padding ragged tensors.
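To illustrate the two representations mentioned in the notes, the sketch below rebuilds a tf.RaggedTensor from a values/offsets pair and pads it to a fixed-length dense tensor. It assumes the offsets are row-split style (length batch size + 1, starting at 0), which may not match every dataloader version. The column name `item_ids`, the helper `to_ragged`, the padding value 0 and the maximum length 4 are hypothetical.

```python
import tensorflow as tf

# Hypothetical batch in the __values / __offsets layout for a column "item_ids".
batch = {
    "item_ids__values": tf.constant([10, 11, 12, 13, 14, 15], dtype=tf.int64),
    # Assumed row-split style offsets: row i spans values[offsets[i]:offsets[i + 1]].
    "item_ids__offsets": tf.constant([0, 2, 3, 6], dtype=tf.int64),
}


def to_ragged(batch, name):
    """Build a tf.RaggedTensor from the values and offsets of one column."""
    return tf.RaggedTensor.from_row_splits(
        batch[f"{name}__values"], batch[f"{name}__offsets"]
    )


ragged = to_ragged(batch, "item_ids")
# Pad each row with 0 up to a fixed length of 4 to get a dense tf.Tensor.
dense = ragged.to_tensor(default_value=0, shape=[None, 4])

print(ragged.to_list())  # [[10, 11], [12], [13, 14, 15]]
print(dense.numpy().tolist())  # [[10, 11, 0, 0], [12, 0, 0, 0], [13, 14, 15, 0]]
```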