Improve dialog2flow clustering handling with big datasets #72

@sergioburdisso

Dialog2Flow-based components rely on building the graph. However, when the number of dialogues is larger than 2000, the clustering takes too long, and this must be improved. Some ideas:

  • Set a maximum number k as part of the class construction; if len(dataset) > k, sample k dialogues.
  • Use a more efficient clustering algorithm.
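
The first idea could look roughly like the sketch below. It is only an illustration: `cap_dialogues`, its `k` and `seed` parameters, and the default of 2000 are hypothetical names and values, not part of the current code base. The assumption is that the dataset is an indexable sequence of dialogues and that a uniform random sample without replacement is acceptable.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def cap_dialogues(
    dataset: Sequence[T],
    k: Optional[int] = 2000,   # hypothetical default, taken from the threshold mentioned above
    seed: Optional[int] = 0,   # hypothetical parameter, makes the sample reproducible
) -> List[T]:
    """Return at most k dialogues, sampled uniformly without replacement."""
    # No cap requested, or the dataset is already small enough: keep everything.
    if k is None or len(dataset) <= k:
        return list(dataset)
    # Sample k distinct indices, then sort them so the original order is preserved.
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(dataset)), k))
    return [dataset[i] for i in indices]
```

With this in place, a class could call something like `cap_dialogues(dialogues, k=self.k)` before clustering. For the second idea, one possible candidate (again, just a suggestion) would be a mini-batch variant of k-means such as scikit-learn's `MiniBatchKMeans`, if k-means-style clusters fit the use case.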

Labels: enhancement (New feature or request)
