Research large-scale global optimization (LSGO) techniques, since they target functions with a large number of parameters, which is exactly what we need for deep learning. This might allow our methods to be applied to models larger than small MLPs, such as CNNs and RNNs (even Transformers, but that might be a stretch...).