This list collects papers that take the Neural Tangent Kernel (NTK) as a central theme or core idea.
NOTE: If there are any papers I've missed, please feel free to raise an issue.
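
For readers skimming the titles below: the kernel all of these papers refer to is, informally, the inner product of network gradients. For a network $f(x;\theta)$ with parameters $\theta$, the (empirical) NTK is

$$
\Theta(x, x') \;=\; \bigl\langle \nabla_\theta f(x;\theta),\, \nabla_\theta f(x';\theta) \bigr\rangle .
$$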

## 2024

| Title | Venue | CODE | |
|---|---|---|---|
| Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes | NeurIPS | - | |
| On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory | NeurIPS | - | |
| Error Correction Output Codes for Robust Neural Networks against Weight-errors: A Neural Tangent Kernel Point of View | NeurIPS | - | |
| Continual learning with the neural tangent ensemble | NeurIPS | - | |
| A generalized neural tangent kernel for surrogate gradient learning | NeurIPS | - | |
| Temporal Graph Neural Tangent Kernel with Graphon-Guaranteed | NeurIPS | CODE | |
| How Does Gradient Descent Learn Features -- A Local Analysis for Regularized Two-Layer Neural Networks | NeurIPS | - | |
| When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective | ICASSP | - | |
| Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models | ICLR | CODE | |
| PINNACLE: PINN Adaptive ColLocation and Experimental points selection | ICLR | - | |
| On the Foundations of Shortcut Learning | ICLR | - | |
| Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation | ICLR | - | |
| Sample Relationship from Learning Dynamics Matters for Generalisation | ICLR | - | |
| Robust NAS benchmark under adversarial training: assessment, theory, and beyond | ICLR | - | |
| Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach | ICLR | CODE | |
| Heterogeneous Personalized Federated Learning by Local-Global Updates Mixing via Convergence Rate | ICLR | - | |
| Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization | ICLR | - | |
| Grokking as the Transition from Lazy to Rich Training Dynamics | ICLR | - | |
| Generalization of Deep ResNets in the Mean-Field Regime | ICLR | - | |
| The conjugate kernel for efficient training of physics-informed deep operator networks | ICLR-W | - | |
| RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks | ICLR-W | CODE | |
| Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization | AISTATS | CODE | |
| Grounding and Enhancing Grid-based Models for Neural Fields | CVPR | - | |
| FINER: Flexible spectral-bias tuning in Implicit NEural Representation by Variable-periodic Activation Functions | CVPR | - | |
| Improved Implicit Neural Representation with Fourier Reparameterized Training | CVPR | CODE | |
| Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning | CVPR | CODE | |
| Batch Normalization Alleviates the Spectral Bias in Coordinate Networks | CVPR | CODE | |
| Neural Lineage | CVPR | - | |
| Fast-NTK: Parameter-Efficient Unlearning for Large-Scale Models | CVPR-W | - | |
| Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation | ECCV | CODE | |
| Neural Tangent Kernels for Axis-Aligned Tree Ensembles | ICML | - | |
| Neural Tangent Kernels Motivate Cross-Covariance Graphs in Neural Networks | ICML | CODE | |
| An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network | ICML | - | |
| Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning | ICML | - | |
| Reward-Free Kernel-Based Reinforcement Learning | ICML | - | |
| Non-Parametric Representation Learning with Kernels | AAAI | - | |
| G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection | AAAI | CODE | |
| Meta Clustering of Neural Bandits | KDD | - | |
| Fast Graph Condensation with Structure-based Neural Tangent Kernel | WWW | - | |
| Differentially Private Kernel Inducing Points using features from ScatterNets (DP-KIP-ScatterNet) for Privacy Preserving Data Distillation | TMLR | CODE | |
| Overparametrized Multi-layer Neural Networks: Uniform Concentration of Neural Tangent Kernel and Convergence of Stochastic Gradient Descent | JMLR | - | |
| Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space | JMLR | - | |
| Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks | JMLR | - | |
| On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains | JMLR | - | |
| Differentially Private Neural Tangent Kernels (DP-NTK) for Privacy-Preserving Data Generation | JAIR | - |

## 2023

| Title | Venue | CODE | |
|---|---|---|---|
| Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models | NeurIPS | CODE | |
| Deep Learning with Kernels through RKHM and the Perron–Frobenius Operator | NeurIPS | - | |
| A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression | NeurIPS | - | |
| Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs | NeurIPS | - | |
| Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time | NeurIPS | - | |
| Feature-Learning Networks Are Consistent Across Widths At Realistic Scales | NeurIPS | - | |
| Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks | NeurIPS | CODE | |
| Spectral Evolution and Invariance in Linear-width Neural Networks | NeurIPS | - | |
| Analyzing Generalization of Neural Networks through Loss Path Kernels | NeurIPS | - | |
| Neural (Tangent Kernel) Collapse | NeurIPS | - | |
| Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension | NeurIPS | CODE | |
| On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay | NeurIPS | - | |
| On the Convergence of Encoder-only Shallow Transformers | NeurIPS | - | |
| The Geometry of Neural Nets' Parameter Spaces Under Reparametrization | NeurIPS | - | |
| Leveraging the two timescale regime to demonstrate convergence of neural networks | NeurIPS | - | |
| Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks | NeurIPS | - | |
| Sample based Explanations via Generalized Representers | NeurIPS | - | |
| On skip connections and normalisation layers in deep optimisation | NeurIPS | - | |
| Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks | NeurIPS | - | |
| A PAC-Bayesian Perspective on the Interpolating Information Criterion | NeurIPS-W | - | |
| A Kernel Perspective of Skip Connections in Convolutional Networks | ICLR | - | |
| Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel | ICLR | - | |
| Symmetric Pruning in Quantum Neural Networks | ICLR | - | |
| The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks | ICLR | - | |
| Few-shot Backdoor Attacks via Neural Tangent Kernels | ICLR | - | |
| Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel | ICLR | - | |
| Supervision Complexity and its Role in Knowledge Distillation | ICLR | - | |
| NTK-SAP: Improving Neural Network Pruning By Aligning Training Dynamics | ICLR | CODE | |
| Tuning Frequency Bias in Neural Network Training with Nonuniform Data | ICLR | - | |
| Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth | ICLR | - | |
| Characterizing the spectrum of the NTK via a power series expansion | ICLR | CODE | |
| Adaptive Optimization in the ∞-Width Limit | ICLR | - | |
| Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization | ICLR | - | |
| The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes | ICLR | - | |
| Restricted Strong Convexity of Deep Learning Models with Smooth Activations | ICLR | - | |
| Feature selection and low test error in shallow low-rotation ReLU networks | ICLR | - | |
| Exploring Active 3D Object Detection from a Generalization Perspective | ICLR | CODE | |
| On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks | AISTATS | - | |
| Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks | AISTATS | - | |
| Regularize Implicit Neural Representation by Itself | CVPR | - | |
| WIRE: Wavelet Implicit Neural Representations | CVPR | CODE | |
| Regularizing Second-Order Influences for Continual Learning | CVPR | CODE | |
| Multiplicative Fourier Level of Detail | CVPR | - | |
| KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection | ICCV | CODE | |
| TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning | ICCV-W | - | |
| A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel | ICML | - | |
| Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels | ICML | CODE | |
| Graph Neural Tangent Kernel: Convergence on Large Graphs | ICML | - | |
| Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels | ICML | CODE | |
| Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels | ICML | - | |
| Benign Overfitting in Deep Neural Networks under Lazy Training | ICML | - | |
| Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space | ICML | - | |
| A Kernel-Based View of Language Model Fine-Tuning | ICML | - | |
| Combinatorial Neural Bandits | ICML | - | |
| What Can Be Learnt With Wide Convolutional Neural Networks? | ICML | CODE | |
| Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits | AAAI | - | |
| Neural tangent kernel at initialization: linear width suffices | UAI | - | |
| Kernel Ridge Regression-Based Graph Dataset Distillation | KDD | CODE | |
| Can Infinitely Wide Deep Nets Help Small-data Multi-label Learning? | ACML | - | |
| Analyzing Deep PAC-Bayesian Learning with Neural Tangent Kernel: Convergence, Analytic Generalization Bound, and Efficient Hyperparameter Selection | TMLR | - | |
| The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks | TMLR | CODE | |
| Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | TMLR | - | |
| Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel | TMLR | - | |
| A Framework and Benchmark for Deep Batch Active Learning for Regression | JMLR | CODE | |
| A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime | IEEE | - | |
| The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning | QE | - | |
| NeuralBO: A Black-box Optimization Algorithm using Deep Neural Networks | NC | - | |
| Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel | NN | CODE | |
| Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations | CMAME | - | |
| A non-gradient method for solving elliptic partial differential equations with deep neural networks | JoCP | - | |
| Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism | JoCP | - | |
| Towards a phenomenological understanding of neural networks: data | MLST | - | |
| Weighted Neural Tangent Kernel: A Generalized and Improved Network-Induced Kernel | ML | CODE | |
| Tensor Programs IVb: Adaptive Optimization in the ∞-Width Limit | arXiv | - |

## 2022

| Title | Venue | CODE | |
|---|---|---|---|
| Generalization Properties of NAS under Activation and Skip Connection Search | NeurIPS | - | |
| Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study | NeurIPS | CODE | |
| Graph Neural Network Bandits | NeurIPS | - | |
| “Lossless” Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | NeurIPS | - | |
| GraphQNTK: Quantum Neural Tangent Kernel for Graph Data | NeurIPS | CODE | |
| Evolution of Neural Tangent Kernels under Benign and Adversarial Training | NeurIPS | CODE | |
| TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels | NeurIPS | CODE | |
| Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels | NeurIPS | CODE | |
| Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel | NeurIPS | CODE | |
| On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model | NeurIPS | - | |
| What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? | NeurIPS | - | |
| On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels | NeurIPS | - | |
| Fast Neural Kernel Embeddings for General Activations | NeurIPS | CODE | |
| Bidirectional Learning for Offline Infinite-width Model-based Optimization | NeurIPS | - | |
| Infinite Recommendation Networks: A Data-Centric Approach | NeurIPS | CODE1 CODE2 | |
| Distribution-Informed Neural Networks for Domain Adaptation Regression | NeurIPS | - | |
| Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks | NeurIPS | - | |
| Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime | NeurIPS | CODE | |
| Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) | NeurIPS | - | |
| Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture | NeurIPS | - | |
| A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity | NeurIPS | - | |
| NFT-K: Non-Fungible Tangent Kernels | ICASSP | CODE | |
| Label Propagation Across Graphs: Node Classification Using Graph Neural Tangent Kernels | ICASSP | - | |
| A Neural Tangent Kernel Perspective of Infinite Tree Ensembles | ICLR | - | |
| Neural Networks as Kernel Learners: The Silent Alignment Effect | ICLR | - | |
| Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective | ICLR | - | |
| Overcoming The Spectral Bias of Neural Value Approximation | ICLR | CODE | |
| Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features | ICLR | CODE | |
| Learning Neural Contextual Bandits Through Perturbed Rewards | ICLR | - | |
| Learning Curves for Continual Learning in Neural Networks: Self-knowledge Transfer and Forgetting | ICLR | - | |
| The Spectral Bias of Polynomial Neural Networks | ICLR | - | |
| On Feature Learning in Neural Networks with Global Convergence Guarantees | ICLR | - | |
| Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks | ICLR | - | |
| Eigenspace Restructuring: A Principle of Space and Frequency in Neural Networks | COLT | - | |
| Neural Networks can Learn Representations with Gradient Descent | COLT | - | |
| Neural Contextual Bandits without Regret | AISTATS | - | |
| Finding Dynamics Preserving Adversarial Winning Tickets | AISTATS | - | |
| Embedded Ensembles: Infinite Width Limit and Operating Regimes | AISTATS | - | |
| Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning | CVPR | CODE | |
| Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training? | CVPR | CODE | |
| A Structured Dictionary Perspective on Implicit Neural Representations | CVPR | CODE | |
| NL-FFC: Non-Local Fast Fourier Convolution for Image Super Resolution | CVPR-W | CODE | |
| Intrinsic Neural Fields: Learning Functions on Manifolds | ECCV | - | |
| Random Gegenbauer Features for Scalable Kernel Methods | ICML | - | |
| Fast Finite Width Neural Tangent Kernel | ICML | CODE | |
| A Neural Tangent Kernel Perspective of GANs | ICML | CODE | |
| Neural Tangent Kernel Empowered Federated Learning | ICML | - | |
| Reverse Engineering the Neural Tangent Kernel | ICML | CODE | |
| How to Train Your Wide Neural Network Without Backprop: An Input-Weight Alignment Perspective | ICML | CODE | |
| Bounding the Width of Neural Networks via Coupled Initialization – A Worst Case Analysis – | ICML | - | |
| Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time | ICML | - | |
| Lazy Estimation of Variable Importance for Large Neural Networks | ICML | - | |
| DAVINZ: Data Valuation using Deep Neural Networks at Initialization | ICML | - | |
| Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization | ICML | CODE | |
| NeuralEF: Deconstructing Kernels by Deep Neural Networks | ICML | CODE | |
| Feature Learning and Signal Propagation in Deep Neural Networks | ICML | - | |
| More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize | ICML | CODE | |
| Fast Graph Neural Tangent Kernel via Kronecker Sketching | AAAI | - | |
| Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime | AAAI | - | |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | UAI | - | |
| Feature Learning and Random Features in Standard Finite-Width Convolutional Neural Networks: An Empirical Study | UAI | - | |
| Out of Distribution Detection via Neural Network Anchoring | ACML | CODE | |
| Learning Neural Ranking Models Online from Implicit User Feedback | WWW | - | |
| Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes | CoRL | - | |
| When and why PINNs fail to train: A neural tangent kernel perspective | JoCP | CODE | |
| How Neural Architectures Affect Deep Learning for Communication Networks? | ICC | - | |
| Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | ACHA | - | |
| Feature Purification: How Adversarial Training Performs Robust Deep Learning | FOCS | - | |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | MSML | - | |
| The Training Response Law Explains How Deep Neural Networks Learn | IoP | - | |
| Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks | PNAS | CODE | |
| Representation Learning via Quantum Neural Tangent Kernels | PRX Quantum | - | |
| TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models | arXiv | CODE | |
| Neural Tangent Kernel Analysis of Shallow α-Stable ReLU Neural Networks | arXiv | - | |
| Neural Tangent Kernel: A Survey | arXiv | - |
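
Several entries above ("Fast Finite Width Neural Tangent Kernel", "TorchNTK", "A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel") are about computing the finite-width, empirical NTK. As a minimal self-contained illustration of the object itself — plain NumPy with a hand-written two-layer ReLU network, not the API of any listed library — a sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, n = 3, 512, 5                       # input dim, hidden width, #samples
W = rng.normal(size=(width, d)) / np.sqrt(d)  # first-layer weights
a = rng.normal(size=width) / np.sqrt(width)   # second-layer weights

def grad_features(x):
    """Gradient of f(x) = a . relu(W x) w.r.t. all parameters (W, a), flattened."""
    pre = W @ x                      # pre-activations, shape (width,)
    act = np.maximum(pre, 0.0)       # relu(W x) = df/da
    gate = (pre > 0).astype(float)   # relu'(W x)
    dW = np.outer(a * gate, x)       # df/dW, shape (width, d)
    return np.concatenate([dW.ravel(), act])

X = rng.normal(size=(n, d))
J = np.stack([grad_features(x) for x in X])  # Jacobian, shape (n, n_params)
ntk = J @ J.T                                # empirical NTK Gram matrix, (n, n)
print(np.round(ntk, 3))
```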

## 2021

| Title | Venue | CODE | |
|---|---|---|---|
| Neural Tangent Kernel Maximum Mean Discrepancy | NeurIPS | - | |
| DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel | NeurIPS | - | |
| Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel | NeurIPS | - | |
| Scaling Neural Tangent Kernels via Sketching and Random Features | NeurIPS | - | |
| Dataset Distillation with Infinitely Wide Convolutional Networks | NeurIPS | - | |
| On the Equivalence between Neural Network and Support Vector Machine | NeurIPS | CODE | |
| Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels | NeurIPS | CODE | |
| Explicit Loss Asymptotics in the Gradient Descent Training of Neural Networks | NeurIPS | - | |
| Kernelized Heterogeneous Risk Minimization | NeurIPS | CODE | |
| An Empirical Study of Neural Kernel Bandits | NeurIPS-W | - | |
| The Curse of Depth in Kernel Regime | NeurIPS-W | - | |
| Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels | ICASSP | CODE | |
| The Dynamics of Gradient Descent for Overparametrized Neural Networks | L4DC | - | |
| The Recurrent Neural Tangent Kernel | ICLR | - | |
| Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | ICLR | - | |
| Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime | ICLR | - | |
| Meta-Learning with Neural Tangent Kernels | ICLR | - | |
| How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks | ICLR | - | |
| Deep Networks and the Multiple Manifold Problem | ICLR | - | |
| Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective | ICLR | CODE | |
| Neural Thompson Sampling | ICLR | - | |
| Deep Equals Shallow for ReLU Networks in Kernel Regimes | ICLR | - | |
| A Recipe for Global Convergence Guarantee in Deep Neural Networks | AAAI | - | |
| A Deep Conditioning Treatment of Neural Networks | ALT | - | |
| Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping | COLT | - | |
| Learning with invariances in random features and kernel models | COLT | - | |
| Implicit Regularization via Neural Feature Alignment | AISTATS | CODE | |
| Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network | AISTATS | - | |
| One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks | AISTATS | - | |
| Fast Adaptation with Linearized Neural Networks | AISTATS | CODE | |
| Fast Learning in Reproducing Kernel Kreĭn Spaces via Signed Measures | AISTATS | - | |
| Stable ResNet | AISTATS | - | |
| A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | AISTATS | - | |
| Can We Characterize Tasks Without Labels or Features? | CVPR | CODE | |
| The Neural Tangent Link Between CNN Denoisers and Non-Local Filters | CVPR | CODE | |
| Nerfies: Deformable Neural Radiance Fields | ICCV | CODE | |
| Kernel Methods in Hyperbolic Spaces | ICCV | - | |
| Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks | ICML | - | |
| On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models | ICML | - | |
| Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics | ICML | - | |
| Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks | ICML | CODE | |
| FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis | ICML | - | |
| On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent | ICML | - | |
| Feature Learning in Infinite-Width Neural Networks | ICML | CODE | |
| On Monotonic Linear Interpolation of Neural Network Parameters | ICML | - | |
| Uniform Convergence, Adversarial Spheres and a Simple Remedy | ICML | - | |
| Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels | ICML | - | |
| Efficient Statistical Tests: A Neural Tangent Kernel Approach | ICML | - | |
| Neural Tangent Generalization Attacks | ICML | CODE | |
| On the Random Conjugate Kernel and Neural Tangent Kernel | ICML | - | |
| Generalization Guarantees for Neural Architecture Search with Train-Validation Split | ICML | - | |
| Tilting the playing field: Dynamical loss functions for machine learning | ICML | CODE | |
| PHEW : Constructing Sparse Networks that Learn Fast and Generalize Well Without Training Data | ICML | - | |
| On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization | IJCAI | CODE | |
| Towards Understanding the Spectral Bias of Deep Learning | IJCAI | - | |
| On Random Kernels of Residual Architectures | UAI | - | |
| How Shrinking Gradient Noise Helps the Performance of Neural Networks | ICBD | - | |
| Unsupervised Shape Completion via Deep Prior in the Neural Tangent Kernel Perspective | ACM TOG | - | |
| Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis | TIT | - | |
| Reinforcement Learning via Gaussian Processes with Neural Network Dual Kernels | CoG | - | |
| Kernel-Based Smoothness Analysis of Residual Networks | MSML | - | |
| Mathematical Models of Overparameterized Neural Networks | IEEE | - | |
| A Feature Fusion Based Indicator for Training-Free Neural Architecture Search | IEEE | - | |
| Pathological spectra of the Fisher information metric and its variants in deep neural networks | NC | - | |
| Linearized two-layers neural networks in high dimension | Ann. Statist. | - | |
| Geometric compression of invariant manifolds in neural nets | J. Stat. Mech. | CODE | |
| A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks | arXiv | - | |
| Learning with Neural Tangent Kernels in Near Input Sparsity Time | arXiv | - | |
| Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks | arXiv | - | |
| Properties of the After Kernel | arXiv | CODE |

## 2020

| Title | Venue | CODE | |
|---|---|---|---|
| Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations | ECCV | - | |
| Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? — A Neural Tangent Kernel Perspective | NeurIPS | - | |
| Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity | NeurIPS | CODE | |
| Finite Versus Infinite Neural Networks: an Empirical Study | NeurIPS | - | |
| On the linearity of large non-linear models: when and why the tangent kernel is constant | NeurIPS | - | |
| On the Similarity between the Laplace and Neural Tangent Kernels | NeurIPS | - | |
| A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks | NeurIPS | - | |
| Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics | NeurIPS | - | |
| Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains | NeurIPS | CODE | |
| Network size and weights size for memorization with two-layers neural networks | NeurIPS | - | |
| Neural Networks Learning and Memorization with (almost) no Over-Parameterization | NeurIPS | - | |
| Towards Understanding Hierarchical Learning: Benefits of Neural Representations | NeurIPS | - | |
| Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher | NeurIPS | - | |
| On Infinite-Width Hypernetworks | NeurIPS | - | |
| Predicting Training Time Without Training | NeurIPS | - | |
| Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel | NeurIPS | - | |
| Spectra of the Conjugate Kernel and Neural Tangent Kernel for Linear-Width Neural Networks | NeurIPS | - | |
| Kernel and Rich Regimes in Overparametrized Models | COLT | - | |
| Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK | COLT | - | |
| Finite Depth and Width Corrections to the Neural Tangent Kernel | ICLR | - | |
| Neural tangent kernels, transportation mappings, and universal approximation | ICLR | - | |
| Neural Tangents: Fast and Easy Infinite Neural Networks in Python | ICLR | CODE | |
| Picking Winning Tickets Before Training by Preserving Gradient Flow | ICLR | CODE | |
| Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory | ICLR | - | |
| Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee | ICLR | - | |
| The asymptotic spectrum of the Hessian of DNN throughout training | ICLR | - | |
| Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks | ICLR | CODE | |
| Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks | ICLR | - | |
| Asymptotics of Wide Networks from Feynman Diagrams | ICLR | - | |
| The equivalence between Stein variational gradient descent and black-box variational inference | ICLR-W | - | |
| Neural Kernels Without Tangents | ICML | CODE | |
| The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization | ICML | - | |
| Dynamics of Deep Neural Networks and Neural Tangent Hierarchy | ICML | - | |
| Disentangling Trainability and Generalization in Deep Neural Networks | ICML | - | |
| Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | ICML | CODE | |
| Finding trainable sparse networks through Neural Tangent Transfer | ICML | CODE | |
| Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | ICML | - | |
| Neural Contextual Bandits with UCB-based Exploration | ICML | - | |
| Optimization Theory for ReLU Neural Networks Trained with Normalization Layers | ICML | - | |
| Towards a General Theory of Infinite-Width Limits of Neural Classifiers | ICML | - | |
| Generalisation guarantees for continual learning with orthogonal gradient descent | ICML-W | CODE | |
| Neural Spectrum Alignment: Empirical Study | ICANN | - | |
| A type of generalization error induced by initialization in deep neural networks | MSML | - | |
| Disentangling feature and lazy training in deep neural networks | J. Stat. Mech. | CODE | |
| Scaling description of generalization with number of parameters in deep learning | J. Stat. Mech. | CODE | |
| Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective | NC | - | |
| Kolmogorov Width Decay and Poor Approximation in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels | RMS | - | |
| On the infinite width limit of neural networks with a standard parameterization | arXiv | CODE | |
| On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures | arXiv | - | |
| Infinite-Width Neural Networks for Any Architecture: Reference Implementations | arXiv | CODE | |
| Every Model Learned by Gradient Descent Is Approximately a Kernel Machine | arXiv | - | |
| Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory? | arXiv | - | |
| Scalable Neural Tangent Kernel of Recurrent Architectures | arXiv | CODE | |
| Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning | arXiv | - |
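
One entry above, "Neural Tangents: Fast and Easy Infinite Neural Networks in Python" (ICLR), comes with a JAX library for exactly the infinite-width kernels many of these papers study. A minimal sketch following its documented usage (the architecture and random data here are illustrative choices):

```python
from jax import random
from neural_tangents import stax

# Infinite-width fully-connected network: the returned kernel_fn computes
# the analytic NTK of this architecture in closed form.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(1)
)

key = random.PRNGKey(0)
x = random.normal(key, (8, 16))  # 8 inputs of dimension 16

ntk = kernel_fn(x, x, 'ntk')     # (8, 8) infinite-width NTK Gram matrix
print(ntk.shape)
```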

## 2019

| Title | Venue | CODE | |
|---|---|---|---|
| Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel | NeurIPS | - | |
| Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent | NeurIPS | CODE | |
| On Exact Computation with an Infinitely Wide Neural Net | NeurIPS | CODE | |
| Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels | NeurIPS | CODE | |
| On the Inductive Bias of Neural Tangent Kernels | NeurIPS | CODE | |
| Convergence of Adversarial Training in Overparametrized Neural Networks | NeurIPS | - | |
| Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks | NeurIPS | - | |
| Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers | NeurIPS | - | |
| Limitations of Lazy Training of Two-layers Neural Networks | NeurIPS | - | |
| The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies | NeurIPS | CODE | |
| On Lazy Training in Differentiable Programming | NeurIPS | - | |
| Information in Infinite Ensembles of Infinitely-Wide Neural Networks | AABI | - | |
| Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation | arXiv | - | |
| Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | arXiv | - | |
| Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems | arXiv | - | |
| Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks | arXiv | - | |
| Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts | arXiv | - | |
| A Fine-Grained Spectral Perspective on Neural Networks | arXiv | CODE | |
| Enhanced Convolutional Neural Tangent Kernels | arXiv | - |
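
The "linear models" result above ("Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent") says that, at large width, training stays close to the first-order Taylor expansion of the network around its initialization:

$$
f(x;\theta_t) \;\approx\; f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^\top (\theta_t - \theta_0),
$$

so the network behaves as a linear model in its parameters, with features given by the gradients at initialization.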

## 2018

| Title | Venue | CODE | |
|---|---|---|---|
| Neural Tangent Kernel: Convergence and Generalization in Neural Networks | NeurIPS | - |
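
The foundational result of this paper: under gradient flow on the squared loss over training data $(x_i, y_i)_{i=1}^n$, the network's outputs evolve by kernel gradient descent with the NTK,

$$
\frac{\mathrm{d} f_t(x)}{\mathrm{d} t} \;=\; -\sum_{i=1}^{n} \Theta(x, x_i)\,\bigl(f_t(x_i) - y_i\bigr),
$$

and in the infinite-width limit $\Theta$ stays constant throughout training, reducing wide-network training to kernel regression with $\Theta$.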