feat(costs): Add dunder methods to costs #243
nicola-bastianello merged 4 commits into team-decent:main
Conversation
Add scalar multiplication/division, negation, subtraction, and reverse addition for costs. Introduce ScaledCost for weighted objectives and cover the behavior with operator tests. closes team-decent#178
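For context, here is a minimal sketch of how the operators described above could compose. Only ScaledCost and the operator set are named in this PR; the Cost base class, its function method, SumCost, and the Quadratic example are assumptions made for illustration, not the library's actual internals.

```python
class Cost:
    """Minimal stand-in; the real library's Cost has far more machinery."""

    def __add__(self, other):
        # summing costs is assumed to predate this PR via SumCost
        return SumCost([self, other])

    def __radd__(self, other):
        # reverse addition lets sum([c1, c2]) work: the initial 0 + c1 lands here
        if other == 0:
            return self
        return self.__add__(other)

    def __mul__(self, scalar):
        # cost * scalar -> weighted objective
        return ScaledCost(self, scalar)

    __rmul__ = __mul__  # scalar * cost

    def __truediv__(self, scalar):
        # cost / scalar == cost * (1 / scalar)
        return ScaledCost(self, 1.0 / scalar)

    def __neg__(self):
        # -cost == (-1) * cost
        return ScaledCost(self, -1.0)

    def __sub__(self, other):
        # cost - other == cost + (-other)
        return self + (-other)

    def function(self, x):
        raise NotImplementedError


class SumCost(Cost):
    def __init__(self, costs):
        self.costs = list(costs)

    def function(self, x):
        return sum(c.function(x) for c in self.costs)


class ScaledCost(Cost):
    def __init__(self, cost, scalar):
        if isinstance(cost, ScaledCost):
            # collapse nested scalings: a * (b * c) == (a * b) * c
            self.cost = cost.cost
            self.scalar = scalar * cost.scalar
        else:
            self.cost = cost
            self.scalar = scalar

    def function(self, x):
        return self.scalar * self.cost.function(x)


class Quadratic(Cost):
    def function(self, x):
        return x ** 2


c = 0.5 * Quadratic() - Quadratic() / 4
print(c.function(2.0))  # 0.5*4 - 0.25*4 = 1.0
```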
```python
    self.cost = cost.cost
    self.scalar = scalar * cost.scalar
else:
    self.cost = cost
```
The way we assign self.cost here is as a reference to cost, since that's Python's default behavior. I'm thinking we should make a deepcopy to avoid unexpected behavior.
@Simpag would there be any problem with making deep copies of pytorch costs?
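To illustrate the concern, a small sketch of reference vs. deep-copy semantics (QuadraticCost here is a hypothetical stand-in, not a class from this library):

```python
import copy

# hypothetical stand-in cost, just to show aliasing
class QuadraticCost:
    def __init__(self, weight):
        self.weight = weight

    def function(self, x):
        return self.weight * x ** 2

base = QuadraticCost(weight=2.0)
by_reference = base              # what the assignment above does today
decoupled = copy.deepcopy(base)  # the proposed alternative

base.weight = 5.0
print(by_reference.function(1.0))  # 5.0 -- follows the mutation
print(decoupled.function(1.0))     # 2.0 -- unaffected
```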
That might be an issue; we'd need to investigate it. Currently SumCost does not copy the cost functions either. Deep copying should be fine, but it could cause massive memory usage when big models or datasets are involved. For empirical costs, the issue might be that the batch_used property gets updated when you use SumCost/ScaledCost.
But I am not sure; this should be tested.
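To make the memory concern concrete, a small torch demonstration (the layer size here is just illustrative; real models would be much larger):

```python
import copy
import torch

# a 1M-parameter layer; scale this up and the cost of copying grows linearly
model = torch.nn.Linear(1_000, 1_000)
copied = copy.deepcopy(model)  # duplicates every parameter tensor

# the copies share no storage, so memory roughly doubles per copy
assert copied.weight.data_ptr() != model.weight.data_ptr()
```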
I see, I hadn't realized self._n_function_calls was also doing this (there's always something that goes unnoticed).
Then I'm inclined to merge the PR with this implementation; after we test both SumCost and ScaledCost we can come back to this discussion. I opened #244
a couple of other things:
another thing: I think we should not translate scalar / cost as cost / scalar. It could lead to very unexpected behavior.
It should be possible to support scalar / cost: we could add an init parameter to ScaledCost that inverts the output of the cost function (1 / method_value in all calls), or create something like an InverseCost that is just 1 / cost. However, I'm not sure how useful this would be or whether it would ever be used, and it would cause issues with division by zero. I would imagine that could pop up fairly frequently, especially for gradients. Maybe we just shouldn't support it.
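For the record, a sketch of the InverseCost idea floated here (hypothetical, and per the discussion below never adopted), showing how quickly the division-by-zero hazard appears:

```python
# hypothetical InverseCost wrapper; not part of the library
class InverseCost:
    def __init__(self, cost):
        self.cost = cost

    def function(self, x):
        value = self.cost.function(x)
        # raises ZeroDivisionError whenever the wrapped cost hits zero;
        # the gradient -f'(x) / f(x)**2 is even more fragile near zero
        return 1.0 / value
```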
Yes, honestly I cannot think of any situation where this would be useful. I would avoid supporting it altogether.
Yes, I also can't think of a practical use for scalar / cost.
Reject scalar / cost, validate proximal parameters for scaled costs, and document supported cost operations in the user guide. Update tests and formatting accordingly.
Summary
Closes #178
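A minimal sketch of the scalar / cost rejection described above, assuming it is implemented through __rtruediv__ (the exact hook and error message are guesses, not confirmed by the PR):

```python
class Cost:
    def __rtruediv__(self, scalar):
        # scalar / cost: fail loudly instead of silently reinterpreting it
        raise TypeError(
            "scalar / cost is not supported; "
            "use cost / scalar or (1 / scalar) * cost"
        )

try:
    2.0 / Cost()
except TypeError as exc:
    print(exc)  # scalar / cost is not supported; ...
```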