Embeddings Analysis #40

@daniel-z-kaplan

PCA is a common way to inspect embedding spaces, and it is potentially relevant for our visualization here.

For this task, we would like to take data from existing benchmarks (such as BACH) and embed it using a current checkpoint from the Midnight replication (this will be provided). These embeddings are also available natively after the evaluation process.

We will try at least two different methods:

  • PCA using the top 4 components (matching the number of classes)
  • Clustering + coloration
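
As a starting point, here is a minimal sketch of both methods, assuming the embeddings are already available as an `(n_samples, dim)` NumPy array. The synthetic data, the helper names `pca_project` and `kmeans`, and the choice of k-means as the clustering method are assumptions made for illustration; the issue itself does not specify them, and real embeddings from the checkpoint would replace the random array.

```python
import numpy as np


def pca_project(embeddings, n_components=4):
    """Project embeddings onto their top principal components via SVD."""
    X = np.asarray(embeddings, dtype=np.float64)
    X_centered = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    projected = X_centered @ Vt[:n_components].T
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return projected, explained_ratio[:n_components]


def kmeans(points, k=4, n_iter=50, seed=0):
    """Plain Lloyd's k-means (hypothetical helper); returns a cluster id per point."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=np.float64)
    # Initialize centers from k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for real embeddings: 4 classes, 50 samples each, 64 dims.
rng = np.random.default_rng(0)
n_per_class, dim, n_classes = 50, 64, 4
class_means = rng.normal(scale=5.0, size=(n_classes, dim))
embeddings = np.concatenate([
    mean + rng.normal(size=(n_per_class, dim)) for mean in class_means
])
true_labels = np.repeat(np.arange(n_classes), n_per_class)

projected, explained = pca_project(embeddings, n_components=4)
cluster_ids = kmeans(projected, k=4)
```

For the coloration step, `projected[:, 0]` and `projected[:, 1]` can be passed to any scatter plot, colored once by `true_labels` and once by `cluster_ids`, to compare how well the unsupervised clusters line up with the benchmark classes.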
