-
Notifications
You must be signed in to change notification settings - Fork 7
Aioli Algorithm for Dynamic Mixtures #158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Result of Pytest Coverage---------- coverage: platform linux, python 3.10.0-final-0 -----------
|
MaxiBoether
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Beste. I checked the code on the level of understanding of the algorithm that I have. You are the expert on the actual logic. I just gave some small nitpicky remarks :) After fixing them and fixing CI, we can merge this and you can build on top of it. I think it would be good to test this with an actual training, to see how it behaves
|
|
||
| Args: | ||
| losses: A numpy array of losses per domain. | ||
| counts: A numpy array of counts per domain. | ||
| """ | ||
| num_incoming_domains = len(losses) | ||
| num_internal_domains = len(self.losses) | ||
| num_domains = max(num_incoming_domains, num_internal_domains) | ||
|
|
||
| if num_internal_domains < num_domains: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this doing anything more than the parent class? Similar to ADO, we definitely want a super() call here and only do additional work if necessary
mixtera/core/algo/aioli/aioli.py
Outdated
| self.aioli_diagonal = aioli_diagonal | ||
| self.domain_count = len(self.losses) | ||
| self.graph = np.zeros((self.domain_count, self.domain_count)) | ||
| self.weights: np.ndarray = None # The latest weights after update steps have been completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if something can be none, the type is typically | None
mixtera/core/algo/aioli/aioli.py
Outdated
| perturbed_domain = self.last_received_mixture % (self.domain_count + 1) | ||
| if perturbed_domain != self.domain_count: | ||
| self.graph[:, perturbed_domain] += self.losses - losses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want some warmup logic here? Do you immediately start perturbing? Because imagine the very first client feedback, that will have update_at_client = True. Not sure if some checking here is needed.
mixtera/core/algo/aioli/aioli.py
Outdated
| if self.weights is None: | ||
| self.weights = np.multiply(weights_init, np.exp(self.eta * self.graph.sum(axis=0))) | ||
| else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weights = weights_init if self.weights is None else self.weights
self.weights = np.multiply(weights, np.exp(...))
mixtera/core/algo/aioli/aioli.py
Outdated
| self.weights = np.multiply(self.weights, np.exp(self.eta * self.graph.sum(axis=0))) | ||
|
|
||
| logger.info(f"The new mixture proportions={self.weights/sum(self.weights)}. ") | ||
| self.weights = self.weights / sum(self.weights) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably want np.sum for speed
mixtera/core/algo/aioli/aioli.py
Outdated
| for i in range(self.domain_count): | ||
| weight_row = np.ones(self.domain_count) * (1 - self.one_hot_factor) / (self.domain_count - 1) | ||
| weight_row[i] = self.one_hot_factor | ||
| weight_matrix[i] = weight_row |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can vectorize this for the entire matrix, that would of course be even better, but I don't know if this currently would be a major bottleneck
In this pr, the initial Aioli algorithm is created for dynamic mixtures.