
First issues#1

Merged
DimitriTimoz merged 13 commits into DimitriTimoz:master from Cobord:master
Jan 8, 2026

Conversation

@Cobord (Contributor) commented Nov 20, 2025

  • run many steps of SimpleOptimizer so the optimization_demo example actually gets closer to the target (this can stay test-only rather than pub, since it is just looping steps)
  • fix the multi_constraint example, which gave false positives for is_in_manifold
  • clippy suppressions
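The first bullet can be sketched as follows. This is a plain-Rust illustration with hypothetical names, not the crate's actual SimpleOptimizer API: repeatedly applying a single optimizer step is what drives the iterate toward the target, which is why the demo needed a loop rather than one step.

```rust
// Hypothetical sketch: one gradient-descent step on f(x) = (x - target)^2.
fn step(x: f64, target: f64, lr: f64) -> f64 {
    let grad = 2.0 * (x - target);
    x - lr * grad
}

// Looping the step, as the optimization_demo fix does.
fn run(mut x: f64, target: f64, lr: f64, n_steps: usize) -> f64 {
    for _ in 0..n_steps {
        x = step(x, target, lr);
    }
    x
}

fn main() {
    let (x0, target, lr) = (10.0, 3.0, 0.1);
    let one_step = run(x0, target, lr, 1);
    let many_steps = run(x0, target, lr, 100);
    // many steps land much closer to the target than a single step
    assert!((many_steps - target).abs() < (one_step - target).abs());
    assert!((many_steps - target).abs() < 1e-6);
}
```

The error shrinks geometrically (by a factor of 1 − 2·lr per step here), so a single step cannot visibly approach the target but a hundred steps do.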

… actually closer to the target, multi_constraint example gave false positives for is_in_manifold
…grad_remove as well when only using that gradient information once
…not strictly tied of other content of the files they were in, expand tests and examples
…t fail with runtime errors if they are not precisely 1 or 2 depending on the example, an optimizer that is allowed to use hessian but just ignores it
@DimitriTimoz (Owner) commented:
I'm not sure if it's finished, but if it is, could you please undraft the PR? That way I'll know.

@Cobord (Contributor, Author) commented Dec 17, 2025

I am building on this in a way that isn't suited for an open PR. The initial work so far is fine: these earlier commits fix issues that are general enough. But the more specific material does not really fit here. So take what is already here as needed, and I can continue the other work on my own fork, which doesn't need to merge in since it suits my purposes rather than general use.

@Cobord Cobord marked this pull request as ready for review December 17, 2025 18:54
Comment on lines +83 to +90
const DECAY_STATE_TO_DEVICE: bool = false;
if DECAY_STATE_TO_DEVICE {
ManifoldRGDState {
lr_decay: state.lr_decay.to_device(device),
}
} else {
state
}
@DimitriTimoz (Owner):
Since this constant cannot be defined by the user, what is its purpose?

Comment on lines +34 to +50
fn step<const D: usize, const H: usize>(
&self,
lr: LearningRate,
tensor: Tensor<B, D>,
grad: Tensor<B, D>,
hessian: Option<Tensor<B, H>>,
state: Option<Self::StateWithHessian<D, H>>,
) -> (Tensor<B, D>, Option<Self::StateWithHessian<D, H>>) {
if let Some(hessian) = hessian {
self.step_with_hessian(lr, tensor, grad, hessian, state)
} else {
let (new_pt, new_state) =
SimpleOptimizer::step(self, lr, tensor, grad, state.map(Into::into));
(new_pt, new_state.map(Into::into))
}
}

@DimitriTimoz (Owner):
Why is the Hessian matrix optional? It's a Hessian optimizer; I don't think it should be able to optimize differently.

Comment on lines 92 to 103
/// Given a tensor of shape `a_1 ... a_k x N x N`
/// create a tensor of shape `a_1 ... a_k x 1 x 1`
/// whose `i_1...i_k,0,0` entry is the trace
/// of the `N x N` matrix with those previous `k`
/// fixed but the last two remaining free.
#[allow(dead_code)]
pub(crate) fn trace<B: Backend, const D: usize>(
t: Tensor<B, D>,
) -> Tensor<B, D> {
ein_sum(t, D-2, D-1)
}

@DimitriTimoz (Owner):
Why don't you use the Burn implementation of trace()?
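For reference, here is a plain-Rust sketch (not Burn code) of what the batched trace above computes: for a batch of N x N matrices stored row-major in one flat buffer, reducing the last two axes via `ein_sum(t, D-2, D-1)` amounts to summing the diagonal of each matrix.

```rust
// Conceptual illustration only: batched trace over a flat row-major buffer.
// `batch` is the product of the leading dims a_1 ... a_k from the doc comment.
fn batched_trace(data: &[f32], batch: usize, n: usize) -> Vec<f32> {
    assert_eq!(data.len(), batch * n * n);
    (0..batch)
        .map(|b| (0..n).map(|i| data[b * n * n + i * n + i]).sum())
        .collect()
}

fn main() {
    // two 2x2 matrices: [[1,2],[3,4]] and [[5,6],[7,8]]
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    // traces are 1+4 = 5 and 5+8 = 13
    assert_eq!(batched_trace(&data, 2, 2), vec![5.0, 13.0]);
}
```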

fn lie_mul<const D: usize>(points0: Tensor<B,D>, points1: Tensor<B,D>) -> Tensor<B,D> {
points0.matmul(points1)
}
}
@DimitriTimoz (Owner):
Suggested change
}
}

);
}

pub(crate) fn create_test_matrix<TestBackend: Backend>(
@DimitriTimoz (Owner):
You can use Tensor::from_data instead.

Comment on lines 32 to 36
/// whose `i_1...i_k,m,n` entry is `f(m)` if `m==n` and `0` otherwise
#[allow(dead_code)]
pub(crate) fn diag_i<B: Backend, const D: usize>(
example: &Tensor<B, D>,
diag_fun: impl Fn(usize) -> f32
@DimitriTimoz (Owner):
This works, but the loop with slice_assign generates multiple tensor writes and keeps part of the computation on the CPU, which doesn’t scale well on GPU backends. Using a broadcasted diagonal mask and vectorized multiplication would achieve the same result more efficiently in Burn.
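The mask-based approach suggested here can be sketched in plain Rust (names hypothetical; Burn would express the mask and the multiplication as broadcasted tensor ops in a single kernel rather than explicit loops over a buffer):

```rust
// Hypothetical sketch of diag_i via a diagonal mask instead of per-entry
// slice_assign writes: build one N x N mask and one value per row index,
// then fill the whole batch with a single elementwise pass.
fn diag_i_masked(batch: usize, n: usize, f: impl Fn(usize) -> f32) -> Vec<f32> {
    // mask is 1.0 on the diagonal, 0.0 elsewhere (row-major N x N)
    let mask: Vec<f32> = (0..n * n)
        .map(|idx| if idx / n == idx % n { 1.0 } else { 0.0 })
        .collect();
    // one multiply per element; a GPU backend can do this as one kernel
    (0..batch * n * n)
        .map(|idx| {
            let within = idx % (n * n); // position inside the N x N block
            mask[within] * f(within / n)
        })
        .collect()
}

fn main() {
    // batch of 2, 2x2 matrices, f(m) = m + 1 => diag(1, 2) in each matrix
    let out = diag_i_masked(2, 2, |m| (m + 1) as f32);
    assert_eq!(out, vec![1.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0, 2.0]);
}
```

The key difference from the loop in the snippet above is that no intermediate tensor is rewritten per diagonal entry; the mask is built once and everything else is elementwise.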

Comment on lines 6 to 29
pub(crate) fn identity_in_last_two<B: Backend, const D: usize>(
example: &Tensor<B, D>,
) -> Tensor<B, D> {
let shape: [usize; D] = example.shape().dims();
debug_assert!(D >= 2);
debug_assert_eq!(shape[D - 1], shape[D - 2]);
let n = shape[D - 1];
let mut other = example.zeros_like();
let mut ones_shape = [1usize; D];
ones_shape[..(D - 2)].copy_from_slice(&shape[..(D - 2)]);
let ones_patch = Tensor::<B, D>::ones(ones_shape, &example.device());
for diag in 0..n {
let ranges: [_; D] = std::array::from_fn(|dim| {
if dim < D - 2 {
0..shape[dim]
} else {
diag..diag + 1
}
});
other = other.slice_assign(ranges, ones_patch.clone());
}
other
}

@DimitriTimoz (Owner):
You can use eye to generate an identity matrix and expand it to make better use of the GPU.
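A plain-Rust sketch of the eye-plus-expand idea (not Burn code; it assumes Burn's eye builds the N x N identity and expand broadcasts it across the leading dims without copying):

```rust
// Conceptual illustration: build the identity once, then replicate it over
// the batch dims, instead of N separate slice_assign writes per diagonal.
fn eye(n: usize) -> Vec<f32> {
    (0..n * n)
        .map(|i| if i / n == i % n { 1.0 } else { 0.0 })
        .collect()
}

fn identity_in_last_two(batch: usize, n: usize) -> Vec<f32> {
    let id = eye(n);
    // materialized repeat here; a broadcasting expand would avoid the copy
    (0..batch).flat_map(|_| id.iter().copied()).collect()
}

fn main() {
    assert_eq!(
        identity_in_last_two(2, 2),
        vec![1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
    );
}
```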

@DimitriTimoz (Owner) commented:

> I am building off in a way that isn't suited for open PR. The initial work so far is fine. These earlier commits fix issues that are general enough. But the more specific stuff does not really fit here. So to take what is already here as needed, and I can continue the other stuff on own fork that doesn't need to merge in if it is suited to my purposes and not for general use.

Thank you for your contribution! It helps establish solid foundations for manops in Rust. We can discuss the more specific details in issues if you like.

@DimitriTimoz DimitriTimoz merged commit 16c1b5f into DimitriTimoz:master Jan 8, 2026

3 participants