Diversity Acquisition Functions by jaiswalsuraj487 · Pull Request #12 · sustainability-lab/ASTRA

jaiswalsuraj487 · 2023-11-01T00:40:48Z

Implemented Furthest Acquisition and Centroid Acquisition on commit a6e59ff.

Files Added:

astra/torch/al/acquisitions/furthest.py: contains the implementation of the Furthest acquisition function.
astra/torch/al/acquisitions/centroid.py: contains the implementation of the Centroid acquisition function.
astra/torch/al/strategies/diversity.py: modified to match the acquisition function signature (see the explanation of diversity.py below).
tests/torch/acquisitions/test_furthest.py: contains the test for the Furthest acquisition function.
tests/torch/acquisitions/test_centroid.py: contains the test for the Centroid acquisition function.

Passes all test cases, including the pre-existing ones (commit: a6e59ff).

Explanation:

furthest.py: The furthest acquisition function uses the furthest_first method of the class distil.active_learning_strategies.core_set.CoreSet. We pass a dummy strategy object as an argument, along with labeled_embeddings, unlabeled_embeddings, and n. It returns a list of the indices of the n pool points that are furthest from the points already covered (the labeled set plus earlier picks).
centroid.py: The centroid acquisition function takes labeled_embeddings, unlabeled_embeddings, and n as input.

The lines below initialize min_dist as a tensor of shape [n_pool] filled with infinity when there is no training data (n_train is 0).

    if labeled_embeddings.shape[0] == 0:
        min_dist = torch.full((unlabeled_embeddings.shape[0],), float("inf"))

Otherwise, we compute the centroid of the training data and then the distance between that centroid and every pool point.

    else:
        centroid_embedding = torch.mean(labeled_embeddings, dim=0).unsqueeze(0)
        dist_ctr = torch.cdist(unlabeled_embeddings, centroid_embedding, p=2)
        min_dist = torch.min(dist_ctr, dim=1)[0]

We then greedily pick n pool indices: at each step we take the point with the largest min_dist and update min_dist with the distances to the newly selected point.

    idxs = []
    for i in range(n):
        idx = torch.argmax(min_dist)
        idxs.append(idx.item())
        dist_new_ctr = torch.cdist(unlabeled_embeddings, unlabeled_embeddings[[idx], :])
        min_dist = torch.minimum(min_dist, dist_new_ctr[:, 0])
    return idxs
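
The three fragments above can be assembled into one function. The following is a minimal sketch, not the file from the PR: the function name centroid_acquire and the toy data are assumptions made for illustration, and only the body mirrors the snippets shown.

```python
import torch


def centroid_acquire(labeled_embeddings, unlabeled_embeddings, n):
    """Greedy pick of n pool positions, seeded by distance to the labeled centroid.

    Hypothetical name; the body follows the fragments quoted in the description.
    """
    if labeled_embeddings.shape[0] == 0:
        # No labeled data yet: every pool point starts infinitely far away.
        min_dist = torch.full((unlabeled_embeddings.shape[0],), float("inf"))
    else:
        centroid = labeled_embeddings.mean(dim=0, keepdim=True)       # (1, d)
        min_dist = torch.cdist(unlabeled_embeddings, centroid)[:, 0]  # (n_pool,)

    idxs = []
    for _ in range(n):
        idx = int(torch.argmax(min_dist))
        idxs.append(idx)
        # Distance from every pool point to the point just selected.
        picked = unlabeled_embeddings[idx : idx + 1]                  # (1, d)
        dist_new = torch.cdist(unlabeled_embeddings, picked)[:, 0]
        # A selected point gets distance 0, so it is not picked again.
        min_dist = torch.minimum(min_dist, dist_new)
    return idxs


# Toy data (assumed): the labeled centroid sits at (1, 0).
labeled = torch.tensor([[0.0, 0.0], [2.0, 0.0]])
pool = torch.tensor([[1.0, 0.0], [1.0, 5.0], [1.0, -3.0], [1.2, 0.1]])
selected = centroid_acquire(labeled, pool, n=2)
print(selected)  # [1, 2]
```

The first pick is the pool point farthest from the centroid; the second is the remaining point that is farthest from both the centroid and the first pick, which is why the two points near the centroid are skipped.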

diversity.py: The linked reference implementation of the acquisition function takes (unlabeled_embeddings, labeled_embeddings, n) as parameters, so I used the same signature and modified diversity.py accordingly, instead of using the (features, pool_indices, context_indices) signature suggested in diversity.py of sustainability-lab/ASTRA.
test_furthest.py and test_centroid.py: The tests use CIFAR10. Here we want to pass the model's feature extractor rather than its forward pass, so I implemented a feature extractor as below:

# Define the model
net = CNN(32, 3, 3, [4, 8], [2, 3], 10).to(device)

def extract_features(net):
    def feature_extractor(input_tensor):
        # Initialize features with the input tensor
        features = input_tensor

        # Apply each layer, activation, and max-pooling
        for layer in net.feature_extractor:
            features = layer(features)
            features = net.activation(features)
            features = net.max_pool(features)
        features = net.flatten(features)
        return features

    return feature_extractor

# Create a feature extractor callable from the network
feature_extractor = extract_features(net)

This feature_extractor returns the features, i.e. the embedding of the input.

# example: this snippet is not included in code
# input shape: (data_dim, height, width, channels)
input = input.permute(0, 3, 1, 2) #input shape: (data_dim, channels, height, width)
features = feature_extractor(input) # shape (data_dim, feature_dim)
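
Because CNN comes from ASTRA and its definition is not shown here, the closure pattern can be tried in isolation with a stand-in model. In the sketch below, TinyCNN is hypothetical: it only assumes that the network exposes feature_extractor (an iterable of conv layers), activation, max_pool, and flatten, as the code above implies.

```python
import torch
from torch import nn


class TinyCNN(nn.Module):
    """Hypothetical stand-in exposing the attributes the extractor relies on."""

    def __init__(self):
        super().__init__()
        self.feature_extractor = nn.ModuleList(
            [
                nn.Conv2d(3, 4, kernel_size=3, padding=1),
                nn.Conv2d(4, 8, kernel_size=3, padding=1),
            ]
        )
        self.activation = nn.ReLU()
        self.max_pool = nn.MaxPool2d(2)
        self.flatten = nn.Flatten()


def extract_features(net):
    def feature_extractor(x):
        # conv -> activation -> max-pool for each block, then flatten
        for layer in net.feature_extractor:
            x = net.max_pool(net.activation(layer(x)))
        return net.flatten(x)

    return feature_extractor


net = TinyCNN().eval()
feature_extractor = extract_features(net)

images = torch.rand(5, 32, 32, 3)  # (batch, height, width, channels)
with torch.no_grad():
    features = feature_extractor(images.permute(0, 3, 1, 2))  # channels first
print(features.shape)  # torch.Size([5, 512])
```

Each 2x2 max-pool halves the spatial size (32 to 16 to 8), so the final embedding has 8 channels * 8 * 8 = 512 values per image.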

We then pass this feature_extractor to strategy.query(), which returns best_indices according to the chosen acquisition function (furthest or centroid).

# Query the strategy
best_indices = strategy.query(
    feature_extractor, pool_indices, train_indices, n_query_samples=n_query_samples
)
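
Conceptually, a query of this kind embeds the pool and training points, runs the acquisition on the embeddings, and maps the selected pool positions back to dataset indices. The sketch below illustrates that flow under stated assumptions: query_sketch, far_from_centroid, and the toy tensors are all hypothetical and are not ASTRA's actual implementation of strategy.query().

```python
import torch


def query_sketch(feature_extractor, inputs, pool_indices, train_indices,
                 acquire, n_query_samples):
    """Hypothetical query flow; not ASTRA's actual strategy.query()."""
    with torch.no_grad():
        pool_emb = feature_extractor(inputs[pool_indices])
        train_emb = feature_extractor(inputs[train_indices])
    # The acquisition returns positions within the pool ...
    local = acquire(train_emb, pool_emb, n_query_samples)
    # ... which are mapped back to indices into the full dataset.
    return pool_indices[torch.as_tensor(local)]


def far_from_centroid(labeled, unlabeled, n):
    # Toy acquisition: the n pool points farthest from the labeled centroid.
    centroid = labeled.mean(dim=0, keepdim=True)
    dist = torch.cdist(unlabeled, centroid)[:, 0]
    return torch.topk(dist, n).indices.tolist()


inputs = torch.tensor([[0.0], [1.0], [9.0], [2.0], [-7.0], [0.5]])
train_indices = torch.tensor([0, 1])       # labeled centroid at 0.5
pool_indices = torch.tensor([2, 3, 4, 5])
best_indices = query_sketch(
    lambda x: x,                           # identity "feature extractor"
    inputs, pool_indices, train_indices, far_from_centroid, 2,
)
print(best_indices)  # tensor([2, 4])
```

The mapping step matters because the acquisition only sees the pool embeddings: position 0 in the pool corresponds to dataset index 2 here, not to dataset index 0.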

patel-zeel · 2023-11-03T04:28:40Z

@jaiswalsuraj487 Now that our plan has broadened, let's not use the distil library. Use your own implementation. Can you show visually whether your acquisition is picking the correct points?

jaiswalsuraj487 · 2023-11-04T17:19:13Z

@patel-zeel I have made the required changes to match the current version of sustainability-lab:main and added sandbox/diveristy_acquisition_demo.ipynb, which visualizes the data points each acquisition function selects on dummy data.
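
For reference, a furthest-first selection that does not depend on distil can be written in a few lines of PyTorch. This is a sketch of the general greedy technique, assuming the same (labeled_embeddings, unlabeled_embeddings, n) inputs; the function name and toy data are illustrative and are not taken from the PR's code.

```python
import torch


def furthest_first(labeled_embeddings, unlabeled_embeddings, n):
    """Greedy furthest-first pick of n pool positions (illustrative sketch)."""
    if labeled_embeddings.shape[0] == 0:
        min_dist = torch.full((unlabeled_embeddings.shape[0],), float("inf"))
    else:
        # Distance from each pool point to its nearest labeled point.
        dist = torch.cdist(unlabeled_embeddings, labeled_embeddings)
        min_dist = dist.min(dim=1).values

    idxs = []
    for _ in range(n):
        idx = int(torch.argmax(min_dist))
        idxs.append(idx)
        picked = unlabeled_embeddings[idx : idx + 1]
        dist_new = torch.cdist(unlabeled_embeddings, picked)[:, 0]
        min_dist = torch.minimum(min_dist, dist_new)
    return idxs


# Toy 2D data (assumed): two pool points sit close together near (10, 0).
labeled = torch.tensor([[0.0, 0.0]])
pool = torch.tensor([[0.1, 0.0], [10.0, 0.0], [10.0, 1.0], [0.0, 8.0]])
picks = furthest_first(labeled, pool, n=2)
print(picks)  # [2, 3]
```

On this toy data the second pick skips the point at (10, 0), because it lies next to the first pick at (10, 1); a quick numeric check like this can complement the visual demo.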

jaiswalsuraj487 · 2023-11-05T01:05:19Z

@patel-zeel Added an AL notebook for the diversity acquisitions: notebooks/al/diversity_acq_AL.ipynb

jaiswalsuraj487 added 7 commits October 31, 2023 18:00

added furthest acquisition and test

6d5b611

added furthest and test

3e989c0

small changes

a942516

Merge branch 'diversity' into feature_diversity

50edd8d

test_furthest passed

94b1073

added centroid and passed all test

2e18384

deleted random.py

055bd70

jaiswalsuraj487 added 7 commits November 4, 2023 10:38

added implementation without distil lib

d7f29b2

added diversity acq demo

452f930

clean up

f6e18bc

Merge branch 'main' into feature_diversity_al

892aacc

updated according to main branch latest commit

01c59b9

Merge branch 'main' into feature_diversity

2a1b646

updated featurizer

fc04b4d

jaiswalsuraj487 added 3 commits November 4, 2023 22:54

clean up

99afe57

Merge branch 'feature_diversity' into feature_diversity_al

d2e6bfa

AL experiments on diversity acq

d07aff9

jaiswalsuraj487 added 3 commits November 5, 2023 06:39

AL experiments on diversity acq

735754c

updated results

cf225c8

added random exp

5aae7ac

jaiswalsuraj487 wants to merge 20 commits into sustainability-lab:main from jaiswalsuraj487:feature_diversity
