Open
Description
Analyses thus far have used Euclidean distance, which has worked well enough for initial eyeballing. However, it barely distinguishes a small value from zero, and that distinction matters given the PUF's sparsity. One proposed rule of thumb is that Euclidean distance isn't useful when fewer than 3/4 of attributes are non-zero, which is certainly the case in the PUF.
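To make the concern concrete, here is a minimal sketch using made-up records (not PUF data; the attribute values and variable names are assumptions for illustration). It shows that a zero-versus-small difference contributes almost nothing to Euclidean distance when another attribute is on a much larger scale, and it computes the share of non-zero cells that the rule of thumb refers to.

```python
import math

# Hypothetical toy records with three dollar-valued attributes.
# These numbers are invented for illustration only.
a = [50_000.0, 0.0, 0.0]   # zero in the second attribute
b = [50_000.0, 10.0, 0.0]  # small non-zero value in the second attribute
c = [51_000.0, 0.0, 0.0]   # same zero pattern as a, larger first attribute

# Euclidean distance between pairs of records.
d_ab = math.dist(a, b)  # a and b differ only in zero vs. small
d_ac = math.dist(a, c)  # a and c share the same zero pattern

# Share of cells that are non-zero across the toy records.
cells = a + b + c
nonzero_share = sum(1 for v in cells if v != 0) / len(cells)

print(f"d(a, b) = {d_ab:.1f}, d(a, c) = {d_ac:.1f}")
print(f"non-zero share = {nonzero_share:.2f}")
```

In this toy setup, `a` is far closer to `b` than to `c`, even though `a` and `c` have the same pattern of zeros and `b` does not.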
That same thread suggested that cosine similarity can work better in these cases, though a comment here suggests it's best suited to categorical data. If cosine similarity is used, the inputs should be normalized. Other measures, such as Gower and Mahalanobis distances, can also be investigated here.
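As a starting point for that investigation, the sketch below compares cosine similarity with a numeric-only Gower distance on made-up records (not PUF data). The helper names `cosine_similarity` and `gower_distance` are hypothetical, the Gower variant shown is the simple mean of range-scaled absolute differences, and returning 0 for all-zero records is an assumed convention. Mahalanobis distance is left out because it needs a covariance estimate from real data.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two records."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumed convention for all-zero records
    return dot / (norm_x * norm_y)

def gower_distance(x, y, ranges):
    """Numeric-only Gower distance: mean range-scaled absolute difference."""
    parts = [abs(p - q) / r for p, q, r in zip(x, y, ranges) if r > 0]
    return sum(parts) / len(parts) if parts else 0.0

# Hypothetical toy records, invented for illustration only.
records = [
    [50_000.0, 0.0, 0.0],   # a
    [50_000.0, 10.0, 0.0],  # b: small non-zero second attribute
    [51_000.0, 0.0, 0.0],   # c: same zero pattern as a
    [0.0, 0.0, 2_000.0],    # d: value only in the third attribute
]
a, b, c, d = records

# Per-attribute ranges observed in the toy sample, used to scale Gower terms.
ranges = [max(col) - min(col) for col in zip(*records)]

sim_ab = cosine_similarity(a, b)
sim_ad = cosine_similarity(a, d)
gower_ab = gower_distance(a, b, ranges)
gower_ac = gower_distance(a, c, ranges)

print(f"cosine(a, b) = {sim_ab:.6f}, cosine(a, d) = {sim_ad:.6f}")
print(f"gower(a, b) = {gower_ab:.4f}, gower(a, c) = {gower_ac:.4f}")
```

In this toy sample, Gower ranks `a` as further from `b` than from `c` because each attribute is scaled by its own range, while cosine similarity on the raw values still treats `a` and `b` as nearly identical. The Gower result depends heavily on the observed range of each attribute, so it should not be read as a general property.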