ICML 2020 Paper List
Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event.
Transformer Hawkes Process.
A general recurrent state space framework for modeling neural dynamics during decision-making.
Influenza Forecasting Framework based on Gaussian Processes.
Laplacian Regularized Few-Shot Learning.
Learning Optimal Tree Models under Beam Search.
Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE.
When Demands Evolve Larger and Noisier: Learning and Earning in a Growing Environment.
Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming.
Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization.
Thompson Sampling Algorithms for Mean-Variance Bandits.
Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health.
Variance Reduction and Quasi-Newton for Particle-Based Variational Inference.
Robust Outlier Arm Identification.
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization.
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks.
Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support.
Time-Consistent Self-Supervision for Semi-Supervised Learning.
Nonparametric Score Estimators.
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time.
Neural Contextual Bandits with UCB-based Exploration.
Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting.
Bisection-Based Pricing for Repeated Contextual Auctions against Strategic Buyer.
Robust Graph Representation Learning via Neural Sparsification.
Error-Bounded Correction of Noisy Labels.
What Can Learned Intrinsic Rewards Capture?
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion.
Learning to Learn Kernels with Variational Random Features.
Smaller, more accurate regression forests using tree alternating optimization.
Individual Calibration with Randomized Forecasting.
Feature Quantization Improves GAN Training.
Do RNN and LSTM have Long Memory?
On Learning Language-Invariant Representations for Universal Machine Translation.
On Leveraging Pretrained GANs for Generation with Limited Data.
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.
Learning with Feature and Distribution Evolvable Streams.
Variance Reduction in Stochastic Particle-Optimization Sampling.
Perceptual Generative Autoencoders.
A Flexible Latent Space Model for Multilayer Networks.
Attacks Which Do Not Kill Training Make Adversarial Learning Stronger.
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case.
Sparsified Linear Programming for Zero-Sum Equilibrium Finding.
Convex Calibrated Surrogates for the Multi-Label F-Measure.
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods.
Adaptive Reward-Poisoning Attacks against Reinforcement Learning.
Invariant Causal Prediction for Block MDPs.
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation.
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values.
Self-Attentive Hawkes Process.
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions.
Dual-Path Distillation: A Unified Framework to Improve Black-Box Attacks.
Optimal Estimator for Unlabeled Linear Regression.
Learning Structured Latent Factors from Dependent Data: A Generative Model Framework from Information-Theoretic Perspective.
Privately Learning Markov Random Fields.
Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning.
Spread Divergence.
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization.
Approximation Capabilities of Neural ODEs and Invertible Residual Networks.
A Tree-Structured Decoder for Image-to-Markup Generation.
Learning the Valuations of a k-demand Agent.
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings.
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate.
Converging to Team-Maxmin Equilibria in Zero-Sum Multiplayer Games.
Robustness to Programmable String Transformations via Augmented Abstract Training.
Designing Optimal Dynamic Treatment Regimes: A Causal Reinforcement Learning Approach.
Learning Calibratable Policies using Programmatic Style-Consistency.
Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing.
Learning Near Optimal Policies with Low Inherent Bellman Error.
Graph Random Neural Features for Distance-Preserving Graph Representations.
Training Deep Energy-Based Models with f-Divergence Minimization.
Federated Learning with Only Positive Labels.
Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters.
Intrinsic Reward Driven Imitation Learning via Generative Model.
Label-Noise Robust Domain Adaptation.
Graphical Models Meet Bandits: A Variational Thompson Sampling Approach.
Simultaneous Inference for Massive Data: Distributed Bootstrap.
Graph Structure of Neural Networks.
When Does Self-Supervision Help Graph Convolutional Networks?
Robustifying Sequential Neural Processes.
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning.
Data Valuation using Reinforcement Learning.
It's Not What Machines Can Learn, It's What We Cannot Teach.
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection.
Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification.
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback.
Searching to Exploit Memorization Effect in Learning with Noisy Labels.
Unsupervised Transfer Learning for Spatiotemporal Predictive Networks.
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks.
Multi-Agent Determinantal Q-Learning.
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound.
Interpolation between Residual and Non-Residual Networks.
On the consistency of top-k surrogate losses.
Improving Molecular Design by Stochastic Iterative Target Augmentation.
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning.
Randomized Smoothing of All Shapes and Sizes.
Energy-Based Processes for Exchangeable Data.
Variational Bayesian Quantization.
Stochastic Optimization for Non-convex Inf-Projection Problems.
Feature Selection using Stochastic Gates.
Amortized Finite Element Analysis for Fast PDE-Constrained Optimization.
Video Prediction via Example Guidance.
MetaFun: Meta-Learning with Iterative Functional Updates.
Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control.
Variational Label Enhancement.
Learning Factorized Weight Matrix for Joint Filtering.
Learning Autoencoders with Relational Regularization.
Understanding and Stabilizing GANs' Training Dynamics Using Control Theory.
A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation.
Class-Weighted Classification: Trade-offs and Robust Approaches.
On Variational Learning of Controllable Representations for Text without Supervision.
On Layer Normalization in the Transformer Architecture.
On the Number of Linear Regions of Convolutional Neural Networks.
Lower Complexity Bounds for Finite-Sum Convex-Concave Minimax Optimization Problems.
Zeno++: Robust Fully Asynchronous SGD.
Maximum-and-Concatenation Networks.
Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing.
Disentangling Trainability and Generalization in Deep Neural Networks.
Generative Flows with Matrix Exponential.
A Flexible Framework for Nonparametric Graphical Modeling that Accommodates Machine Learning.
Continuous Graph Neural Networks.
Amortized Population Gibbs Samplers with Neural Sufficient Statistics.
On the Generalization Effects of Linear Transformations in Data Augmentation.
Adversarial Robustness via Runtime Masking and Cleansing.
Sequence Generation with Mixed Representations.
Stronger and Faster Wasserstein Adversarial Attacks.
On the Noisy Gradient Descent that Generalizes as SGD.
DeltaGrad: Rapid retraining of machine learning models.
Obtaining Adjustable Regularization for Free via Iterate Averaging.
Is Local SGD Better than Minibatch SGD?
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling.
Causal Inference using Gaussian Processes with Structured Latent Confounders.
Learning to Rank Learning Curves.
Efficiently sampling functions from Gaussian process posteriors.
Efficient nonparametric statistical inference on population feature importance using Shapley values.
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes.
Predictive Sampling with Forecasting Autoregressive Models.
How Good is the Bayes Posterior in Deep Neural Networks Really?
Amortised Learning by Wake-Sleep.
Towards Understanding the Regularization of Adversarial Robustness on Neural Networks.
Domain Aggregation Networks for Multi-Source Domain Adaptation.
Batch Stationary Distribution Estimation.
Online Control of the False Coverage Rate and False Sign Rate.
The Implicit and Explicit Regularization Effects of Dropout.
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems.
PoKED: A Semi-Supervised System for Word Sense Disambiguation.
Learning Representations that Support Extrapolation.
A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model.
Thompson Sampling via Local Uncertainty.
On Lp-norm Robustness of Ensemble Decision Stumps and Trees.
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning.
On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data.
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling.
Cost-effectively Identifying Causal Effects When Only Response Variable is Observable.
Neural Network Control Policy Verification With Persistent Adversarial Perturbation.
Sequential Cooperative Bayesian Inference.
Loss Function Search for Face Recognition.
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables.
When deep denoising meets iterative phase retrieval.
Bandits for BMO Functions.
Optimizing Data Usage via Differentiable Rewards.
BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates.
Deep Streaming Label Learning.
Haar Graph Pooling.
Enhanced POET: Open-ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions.
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere.
Frustratingly Simple Few-Shot Object Detection.
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach.
Continuously Indexed Domain Adaptation.
Non-separable Non-stationary random fields.
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles.
Upper bounds for Model-Free Row-Sparse Principal Component Analysis.
Self-Modulating Nonparametric Event-Tensor Factorization.
Towards Accurate Post-training Network Quantization via Bit-Split and Stitching.
On the Global Optimality of Model-Agnostic Meta-Learning.
Logistic Regression for Massive Data with Rare Events.
Projection-free Distributed Online Convex Optimization with $O(\sqrt{T})$ Communication Complexity.
Orthogonalized SGD and Nested Architectures for Anytime Neural Networks.
Safe Reinforcement Learning in Constrained Markov Decision Processes.
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space.
Conditional gradient methods for stochastically constrained convex minimization.
New Oracle-Efficient Algorithms for Private Synthetic Data Release.
Private Reinforcement Learning with PAC and Regret Guarantees.
Born-Again Tree Ensembles.
OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning.
Non-Stationary Delayed Bandits with Intermediate Observations.
Linear bandits with Stochastic Delayed Feedback.
Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks.
Uncertainty Estimation Using a Single Deep Deterministic Neural Network.
Undirected Graphical Models as Approximate Posteriors.
StochasticRank: Global Optimization of Scale-Free Discrete Functions.
Minimax Weight and Q-Function Learning for Off-Policy Evaluation.
Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network.
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks Using PAC-Bayesian Analysis.
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks.
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources.
GraphOpt: Learning Optimization Models of Graph Formation.
Single Point Transductive Prediction.
Bayesian Differential Privacy for Machine Learning.
Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization.
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations.
Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances.
Alleviating Privacy Attacks via Causal Learning.
TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics.
Choice Set Optimization Under Discrete Choice Models of Group Decisions.
Multi-step Greedy Reinforcement Learning Algorithms.
Convolutional dictionary learning based auto-encoders for natural exponential-family distributions.
Sequential Transfer in Reinforcement Learning with a Generative Model.
Student Specialization in Deep Rectified Networks With Finite Width and Input Dimension.
Few-shot Domain Adaptation by Causal Mechanism Transfer.
Inductive Relation Prediction by Subgraph Reasoning.
Sparse Sinkhorn Attention.
No-Regret Exploration in Goal-Oriented Reinforcement Learning.
Learning disconnected manifolds: a no GAN's land.
Variational Imitation Learning with Diverse-quality Demonstrations.
Taylor Expansion Policy Optimization.
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies.
The Buckley-Osthus model and the block preferential attachment model: statistical analysis and application.
Reinforcement Learning for Integer Programming: Learning to Cut.
DropNet: Reducing Neural Network Complexity via Iterative Pruning.
Fiedler Regularization: Learning Neural Networks with Graph Sparsity.
Multi-fidelity Bayesian Optimization with Max-value Entropy Search and its Parallelization.
Quantized Decentralized Stochastic Learning over Directed Graphs.
Distinguishing Cause from Effect Using Quantiles: Bivariate Quantile Causal Discovery.
Multi-Agent Routing Value Iteration Network.
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks.
Multi-objective Bayesian Optimization using Pareto-frontier Entropy.
The Many Shapley Values for Model Explanation.
The Shapley Taylor Interaction Index.
An EM Approach to Non-autoregressive Conditional Sequence Generation.
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts.
Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking.
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data.
Adaptive Estimator Selection for Off-Policy Evaluation.
ConQUR: Mitigating Delusional Bias in Deep Q-Learning.
Task Understanding from Confusing Multi-task Data.
Doubly robust off-policy evaluation with shrinkage.
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks.
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information.
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods.
Which Tasks Should Be Learned Together in Multi-task Learning?
Robustness to Spurious Correlations via Human Annotations.
Hypernetwork approach to generating point clouds.
Provably Efficient Model-based Policy Adaptation.
Bridging the Gap Between f-GANs and Wasserstein GANs.
Multiclass Neural Network Minimization via Tropical Newton Polytope Approximation.
On the Generalization Benefit of Noise in Stochastic Gradient Descent.
When Explanations Lie: Why Many Modified BP Attributions Fail.
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning.
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis.
Interpretable, Multidimensional, Multimodal Anomaly Detection with Negative Sampling for Detection of Device Failure.
Small-GAN: Speeding up GAN Training using Core-Sets.
FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis.
Second-Order Provable Defenses against Adversarial Attacks.
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise.
Reinforcement Learning for Molecular Design Guided by Quantum Mechanics.
A Generative Model for Molecular Distance Geometry.
Naive Exploration is Optimal for Online LQR.
Collaborative Machine Learning with Incentive-Aware Model Rewards.
Deep Gaussian Markov Random Fields.
Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards.
Piecewise Linear Regression via a Difference of Convex Functions.
Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits.
A Markov Decision Process Model for Socio-Economic Systems Impacted by Climate Change.
Predictive Coding for Locally-Linear Control.
On Conditional Versus Marginal Bias in Multi-Armed Bandits.
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation.
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective.
A Graph to Graphs Framework for Retrosynthesis Prediction.
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making.
Message Passing Least Squares Framework and its Application to Rotation Synchronization.
Incremental Sampling Without Replacement for Sequence Models.
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks.
One-shot Distributed Ridge Regression in High Dimensions.
Extreme Multi-label Classification from Aggregated Labels.
PowerNorm: Rethinking Batch Normalization in Transformers.
Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints.
Educating Text Autoencoders: Latent Representation Guidance via Denoising.
Deep Reinforcement Learning with Robust and Smooth Policy.
PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions.
Adaptive Sampling for Estimating Probability Distributions.
Causal Strategic Linear Regression.
Lookahead-Bounded Q-learning.
ControlVAE: Controllable Variational Autoencoder.
Channel Equilibrium Networks for Learning Deep Representation.
Evaluating Machine Accuracy on ImageNet.
Learning Robot Skills with Temporal Variational Inference.
Neural Kernels Without Tangents.
Optimistic Policy Optimization with Bandit Feedback.
An Explicitly Relational Neural Network Architecture.
Planning to Explore via Self-Supervised World Models.
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures.
Universal Asymptotic Optimality of Polyak Momentum.
Discriminative Adversarial Search for Abstractive Summarization.
Off-Policy Actor-Critic with Shared Experience Replay.
Implicit competitive regularization in GANs.
Harmonic Decompositions of Convolutional Networks.
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning.
Constrained Markov Decision Processes via Backward Value Functions.
Detecting Out-of-Distribution Examples with Gram Matrices.
Explicit Gradient Learning for Black-Box Optimization.
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent.
Learning to Simulate Complex Physics with Graph Networks.
Spectral Subsampling MCMC for Stationary Time Series.
A Quantile-based Approach for Hyperparameter Transfer Learning.
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization.
The Performance Analysis of Generalized Margin Maximizers on Separable Data.
Inferring DQN structure for high-dimensional continuous control.
Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models.
From Sets to Multisets: Provable Variational Inference for Probabilistic Integer Submodular Models.
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics.
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model.
Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards.
An Investigation of Why Overparameterization Exacerbates Spurious Correlations.
Causal Structure Discovery from Distributions Arising from Mixtures of DAGs.
Radioactive data: tracing through training.
Bounding the fairness and accuracy of classifiers from population statistics.
Adversarial Attacks on Copyright Detection Systems.
Bio-Inspired Hashing for Unsupervised Similarity Search.
Inter-domain Deep Gaussian Processes.
Bayesian Optimisation over Multiple Continuous and Categorical Inputs.
Simple and sharp analysis of k-means||.
FetchSGD: Communication-Efficient Federated Learning with Sketching.
Revisiting Training Strategies and Generalization Performance in Deep Metric Learning.
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing.
Predicting Choice with Set-Dependent Aggregation.
Near-optimal Regret Bounds for Stochastic Shortest Path.
Finite-Time Convergence in Continuous-Time Optimization.
Attentive Group Equivariant Convolutional Networks.
Reverse-engineering deep ReLU networks.
Double-Loop Unadjusted Langevin Algorithm.
Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning.
FR-Train: A Mutual Information-Based Approach to Fair and Robust Training.
On Semi-parametric Inference for BART.
Strength from Weakness: Fast Learning Using Weak Supervision.
Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge.
Decentralised Learning with Random Features and Distributed Gradient Descent.
Overfitting in adversarially robust deep learning.
Normalizing Flows on Tori and Spheres.
NetGAN without GAN: From Random Walks to Low-Rank Approximations.
The Sample Complexity of Best-k Items Selection from Pairwise Comparisons.
Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation.
Optimistic Bounds for Multi-output Learning.
Learning Human Objectives by Evaluating Hypothetical Behavior.
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch.
Universal Equivariant Multilayer Perceptrons.
Implicit Generative Modeling for Efficient Exploration.
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning.
Closing the convergence gap of SGD without replacement.
A Game Theoretic Framework for Model Based Reinforcement Learning.
Multi-Precision Policy Enforced Training (MuPPET): A Precision-Switching Strategy for Quantised Fixed-Point Training of CNNs.
Improving Robustness of Deep-Learning-Based Image Reconstruction.
Fast Adaptation to New Environments via Policy-Dynamics Value Functions.
Understanding and Mitigating the Tradeoff between Robustness and Accuracy.
Transparency Promotion with Model-Agnostic Linear Competitors.
Fast and Private Submodular and k-Submodular Functions Maximization with Matroid Constraints.
DeepCoDA: personalized interpretability for compositional health data.
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs.
Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis.
Scalable Differentiable Physics for Learning and Control.
Unsupervised Speech Decomposition via Triple Information Bottleneck.
Deep Isometric Learning for Visual Recognition.
Adversarial Risk via Optimal Transport and Optimal Couplings.
Graph-based Nearest Neighbor Search: From Practice to Theory.
SoftSort: A Continuous Relaxation for the argsort Operator.
Skew-Fit: State-Covering Self-Supervised Reinforcement Learning.
On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness.
Explaining Groups of Points in Low-Dimensional Representations.
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning.
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation.
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition.
Randomization matters. How to defend against strong adversarial attacks.
WaveFlow: A Compact Flow-based Model for Raw Audio.
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks.
Scalable Differential Privacy with Certified Robustness in Adversarial Learning.
On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm.
IPBoost - Non-Convex Boosting via Integer Programming.
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning.
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent.
Low Bias Low Variance Gradient Estimates for Boolean Stochastic Networks.
Budgeted Online Influence Maximization.
Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks.
Performative Prediction.
Non-Autoregressive Neural Text-to-Speech.
Learning Selection Strategies in Buchberger's Algorithm.
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits.
Acceleration through spectral density estimation.
Reducing Sampling Error in Batch Temporal Difference Learning.
Regularized Optimal Transport is Ground Cost Adversarial.
Structured Policy Iteration for Linear Quadratic Regulator.
Meta Variance Transfer: Learning to Augment from the Others.
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis.
Stabilizing Transformers for Reinforcement Learning.
Adversarial Mutual Information for Text Generation.
Recovery of Sparse Signals from a Mixture of Linear Samples.
Neural Clustering Processes.
Learning to Score Behaviors for Guided Policy Optimization.
Interferometric Graph Transform: a Deep Unsupervised Graph Representation.
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
On the (In)tractability of Computing Normalizing Constants for the Product of Determinantal Point Processes.
Eliminating the Invariance on the Loss Landscape of Linear Autoencoders.
T-Basis: a Compact Representation for Neural Networks.
Consistent Structured Prediction with Max-Min Margin Markov Networks.
Supervised learning: no loss no cry.
Semi-Supervised StyleGAN for Disentanglement Learning.
LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction.
Streaming k-Submodular Maximization under Noise subject to Size Constraint.
Robust Bayesian Classification Using An Optimistic Score Ratio.
Knowing The What But Not The Where in Bayesian Optimization.
Graph Homomorphism Convolution.
LEEP: A New Measure to Evaluate Transferability of Learned Representations.
Aggregation of Multiple Knockoffs.
Involutive MCMC: a Unifying Framework.
In Defense of Uniform Convergence: Generalization via Derandomization with an Application to Interpolating Predictors.
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization.
Oracle Efficient Private Non-Convex Optimization.
Bayesian Sparsification of Deep C-valued Networks.
PolyGen: An Autoregressive Generative Model of 3D Meshes.
Goal-Aware Prediction: Learning to Model What Matters.
Up or Down? Adaptive Rounding for Post-Training Quantization.
From Chaos to Order: Symmetry and Conservation Laws in Game Dynamics.
Reliable Fidelity and Diversity Metrics for Generative Models.
Voice Separation with an Unknown Number of Multiple Speakers.
Full Law Identification in Graphical Models of Missing Data: Completeness Results.
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees.
Missing Data Imputation using Optimal Transport.
Fast computation of Nash Equilibria in Imperfect Information Games.
Unique Properties of Flat Minima in Deep Networks.
Two Simple Ways to Learn Individual Fairness Metrics from Data.
Continuous-time Lower Bounds for Gradient-based Algorithms.
Consistent Estimators for Learning to Defer to an Expert.
Fair Learning with Private Demographic Data.
Explainable k-Means and k-Medians Clustering.
Topological Autoencoders.
Confidence-Aware Learning for Deep Neural Networks.
An end-to-end approach for the verification problem: learning the right distance.
Efficiently Learning Adversarially Robust Halfspaces with Noise.
Transformation of ReLU-based recurrent neural networks from discrete-time to continuous-time.
Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach.
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules.
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning.
Coresets for Data-efficient Training of Machine Learning Models.
Learning Reasoning Strategies in End-to-End Differentiable Proving.
Automatic Shortcut Removal for Self-Supervised Representation Learning.
Strategic Classification is Causal Modeling in Disguise.
The Effect of Natural Distribution Shift on Question Answering Models.
VideoOneNet: Bidirectional Convolutional Recurrent OneNet with Trainable Data Steps for Video Processing.
Projective Preferential Bayesian Optimization.
The Role of Regularization in Classification of High-dimensional Noisy Gaussian Mixture.
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning.
Training Binary Neural Networks using the Bayesian Learning Rule.
Randomized Block-Diagonal Preconditioning for Parallel Learning.
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM.
On the Global Convergence Rates of Softmax Policy Gradient Methods.
Neural Datalog Through Time: Informed Temporal Modeling via Logical Specification.
On Approximate Thompson Sampling with Langevin Algorithms.
Fast and Consistent Learning of Hidden Markov Models by Incorporating Non-Consecutive Correlations.
Adding seemingly uninformative labels helps in low data regimes.
Predictive Multiplicity in Classification.
Minimax Pareto Fairness: A Multi Objective Perspective.
Stochastically Dominant Distributional Reinforcement Learning.
On Learning Sets of Symmetric Elements.
Adaptive Adversarial Multi-task Representation Learning.
Emergence of Separable Manifolds in Deep Language Representations.
Adaptive Gradient Descent without Descent.
From Local SGD to Local Fixed-Point Methods for Federated Learning.
Proving the Lottery Ticket Hypothesis: Pruning is All You Need.
Optimal transport mapping via input convex neural networks.
Estimation of Bounds on Potential Outcomes For Decision Making.
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination.
Adversarial Robustness Against the Union of Multiple Perturbation Models.
Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization.
Anderson Acceleration of Proximal Gradient Methods.
How recurrent networks implement contextual processing in sentiment analysis.
Multi-Task Learning with User Preferences: Gradient Descent with Controlled Ascent in Pareto Optimization.
Individual Fairness for k-Clustering.
Adversarial Neural Pruning with Latent Vulnerability Suppression.
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle.
Quadratically Regularized Subgradient Methods for Weakly Convex Optimization with Weakly Convex Constraints.
Normalized Loss Functions for Deep Learning with Noisy Labels.
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space.
Efficient Continuous Pareto Exploration in Multi-Task Learning.
Bandits with Adversarial Scaling.
Progressive Identification of True Labels for Partial-Label Learning.
Learning Algebraic Multigrid Using Graph Neural Networks.
Adversarial Nonnegative Matrix Factorization.
Progressive Graph Learning for Open-Set Domain Adaptation.
Improved Communication Cost in Distributed PageRank Computation - A Theoretical Study.
Does label smoothing mitigate label noise?
Countering Language Drift with Seeded Iterated Learning.
A Mean Field Analysis Of Deep ResNet And Beyond: Towards Provably Optimization Via Overparameterization From Depth.
Moniqua: Modulo Quantized Communication in Decentralized SGD.
Working Memory Graphs.
Differentiating through the Fréchet Mean.
Error Estimation for Sketched SVD via the Bootstrap.
Stochastic Hamiltonian Gradient Methods for Smooth Games.
Too Relaxed to Be Fair.
Weakly-Supervised Disentanglement Without Compromises.
Finding trainable sparse networks through Neural Tangent Transfer.
Learning to Encode Position for Transformer with Continuous Dynamical Model.
Learning Deep Kernels for Non-Parametric Two-Sample Tests.
A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton.
Median Matrix Completion: from Embarrassment to Optimality.
Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks.
A Chance-Constrained Generative Framework for Sequence Optimization.
Hallucinative Topological Memory for Zero-Shot Visual Planning.
Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits.
An Imitation Learning Approach for Cache Replacement.
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates.
Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors.
Boosting Deep Neural Network Efficiency with Dual-Module Inference.
Sparse Shrunk Additive Models.
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling.
Time-aware Large Kernel Convolutions.
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games.
Generalized and Scalable Optimal Sparse Decision Trees.
Improving Generative Imagination in Object-Centric World Models.
InfoGAN-CR and ModelCentrality: Self-supervised Model Training and Selection for Disentangling GANs.
Handling the Positive-Definite Constraint in the Bayesian Learning Rule.
On the Theoretical Properties of the Network Jackknife.
Extrapolation for Large-batch Training in Deep Learning.
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems.
Hierarchical Verification for Adversarial Robustness.
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation.
Adaptive Droplet Routing in Digital Microfluidic Biochips Using Deep Reinforcement Learning.
Variable Skipping for Autoregressive Range Density Estimation.
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation.
On a projective ensemble approach to two sample test for equality of distributions.
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr.
Input-Sparsity Low Rank Approximation in Schatten Norm.
Temporal Logic Point Processes.
Nearly Linear Row Sampling Algorithm for Quantile Regression.
Almost Tune-Free Variance Reduction.
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Evolutionary Topology Search for Tensor Network Decomposition.
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective.
Visual Grounding of Learned Physical Models.
Latent Space Factorisation and Manipulation via Matrix Subspace Projection.
On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation.
Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization.
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning.
Implicit Euler Skip Connections: Enhancing Adversarial Robustness via Numerical Stability.
PENNI: Pruned Kernel Sharing for Efficient CNN Inference.
Neural Architecture Search in A Proxy Validation Loss Landscape.
Manifold Identification for Ultimately Communication-Efficient Distributed Optimization.
ACFlow: Flow Models for Arbitrary Conditional Likelihoods.
Learning Quadratic Games on Networks.
Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent.
SGD Learns One-Layer Networks in WGANs.
Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks.
Tensor denoising and completion based on ordinal observations.
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression.
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning.
Learning Compound Tasks without Task-specific Knowledge via Imitation and Self-supervised Learning.
Accelerated Message Passing for Entropy-Regularized MAP Inference.
Batch Reinforcement Learning with Hyperparameter Gradients.
Self-supervised Label Augmentation via Input Transformations.
Estimating Model Uncertainty of Neural Networks in Sparse Information Form.
Causal Effect Identifiability under Partial-Observability.
Self-Attentive Associative Memory.
Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization.
Learning with Good Feature Representations in Bandits and in RL with a Generative Model.
Efficient Proximal Mapping of the 1-path-norm of Shallow Networks.
CURL: Contrastive Unsupervised Representations for Reinforcement Learning.
Robust and Stable Black Box Explanations.
Bidirectional Model-based Policy Optimization.
Recht-Ré Noncommutative Arithmetic-Geometric Mean Conjecture is False.
Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses.
Optimal Randomized First-Order Methods for Least-Squares Problems.
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions.
Principled learning method for Wasserstein distributionally robust optimization with local perturbations.
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics.
Soft Threshold Weight Reparameterization for Learnable Sparsity.
Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks.
Online Dense Subgraph Discovery via Blurred-Graph Feedback.
Two Routes to Scalable Credit Assignment without Weight Symmetry.
Efficient Identification in Linear Structural Causal Models with Auxiliary Cutsets.
Problems with Shapley-value-based explanations as feature importance measures.
On Implicit Regularization in β-VAEs.
Understanding Self-Training for Gradual Domain Adaptation.
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness.
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition.
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks.
Asynchronous Coagent Networks.
On the Sample Complexity of Adversarial Multi-Source PAC Learning.
SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates.
Meta-learning for Mixed Linear Regression.
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates.
Online Learning for Active Cache Synchronization.
Equivariant Flows: Exact Likelihood Generative Learning for Symmetric Densities.
Learning Similarity Metrics for Numerical Simulations.
Concept Bottleneck Models.
Optimal Continual Learning has Perfect Memory and is NP-hard.
Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation.
Active World Model Learning with Progress Curiosity.
Variational Inference for Sequential Data with Future Likelihood Estimates.
Domain Adaptive Imitation Learning.
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup.
FACT: A Diagnostic for Group Fairness Trade-offs.
Uniform Convergence of Rank-weighted Learning.
What can I do here? A Theory of Affordances in Reinforcement Learning.
Private Outsourced Bayesian Optimization.
Entropy Minimization In Emergent Languages.
Feature Noise Induces Loss Discrepancy Across Groups.
Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems.
Quantum Expectation-Maximization for Gaussian mixture models.
Efficient Non-conjugate Gaussian Process Factor Models for Spike Count Data using Polynomial Approximations.
Rate-distortion optimization guided autoencoder for isometric embedding in Euclidean latent space.
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention.
Non-autoregressive Machine Translation with Disentangled Context Transformer.
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
Operation-Aware Soft Channel Pruning using Differentiable Masks.
Learning and Evaluating Contextual Embedding of Source Code.
On the Power of Compressed Sensing with Generative Models.
Statistically Efficient Off-Policy Policy Gradients.
Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation.
DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training.
Variational Autoencoders with Riemannian Brownian Motion Priors.
Strategyproof Mean Estimation from Multiple-Choice Questions.
Partial Trace Regression and Low-Rank Kraus Decomposition.
Sub-Goal Trees - a Framework for Goal-Based Reinforcement Learning.
Distribution Augmentation for Generative Modeling.
Sets Clustering.
A simpler approach to accelerated optimization: iterative averaging meets optimism.
Stochastic Differential Equations with Variational Wishart Diffusions.
Evaluating the Performance of Reinforcement Learning Algorithms.
Being Bayesian about Categorical Probability.
Fair k-Centers via Maximum Matching.
On Relativistic f-Divergences.
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization.
AdaScale SGD: A User-Friendly Algorithm for Distributed Training.
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model.
Efficiently Solving MDPs with Stochastic Mirror Descent.
What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?
Reward-Free Exploration for Reinforcement Learning.
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition.
Multi-Objective Molecule Generation using Interpretable Substructures.
Hierarchical Generation of Molecular Graphs using Structural Motifs.
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders.
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation.
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels.
BINOCULARS for efficient, nonmyopic sequential experimental design.
Optimizing Black-box Metrics with Adaptive Surrogates.
Information-Theoretic Local Minima Characterization and Regularization.
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms.
T-GD: Transferable GAN-generated Images Detection Framework.
Extra-gradient with player sampling for faster convergence in n-player games.
Source Separation with Deep Generative Priors.
Inverse Active Sensing: Modeling and Understanding Timely Decision-Making.
Parametric Gaussian Process Regressors.
Debiased Sinkhorn barycenters.
Learning Portable Representations for High-Level Planning.
Tails of Lipschitz Triangular Flows.
Generalization to New Actions in Reinforcement Learning.
Optimal Robust Learning of Discrete Distributions from Batches.
Correlation Clustering with Asymmetric Classification Errors.
Implicit Regularization of Random Feature Models.
Semi-Supervised Learning with Normalizing Flows.
Do We Need Zero Training Loss After Achieving Zero Training Error?
Fast Deterministic CUR Matrix Decomposition with Accuracy Assurance.
Linear Lower Bounds and Conditioning of Differentiable Games.
Meta-Learning with Shared Amortized Variational Inference.
Multigrid Neural Memory.
Curvature-corrected learning dynamics in deep neural networks.
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy.
Deep Graph Random Process for Relational-Thinking-Based Speech Recognition.
Accelerated Stochastic Gradient-free and Projection-free Methods.
InstaHide: Instance-hiding Schemes for Private Distributed Learning.
Generating Programmatic Referring Expressions via Program Synthesis.
More Information Supervised Probabilistic Deep Face Embedding Learning.
Improving Transformer Optimization Through Better Initialization.
Communication-Efficient Distributed PCA by Riemannian Optimization.
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control.
Evaluating Lossy Compression Rates of Deep Generative Models.
From Importance Sampling to Doubly Robust Policy Gradient.
Momentum-Based Policy Gradient Methods.
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation.
"Other-Play" for Zero-Shot Coordination.
The Non-IID Data Quagmire of Decentralized Machine Learning.
Infinite attention: NNGP and NTK for deep attention networks.
Lifted Disjoint Paths with Application in Multiple Object Tracking.
Set Functions for Time Series.
Learning Mixtures of Graphs from Epidemic Cascades.
Black-Box Variational Inference as a Parametric Approximation to Langevin Dynamics.
Graph Filtration Learning.
Topologically Densified Distributions.
Parameterized Rate-Distortion Stochastic Encoder.
Learning Task-Agnostic Embedding of Multiple Black-Box Experts for Multi-Task Model Fusion.
Optimizing Dynamic Structures with Bayesian Generative Search.
Optimization and Analysis of the pAp@k Metric for Recommender Systems.
Towards Non-Parametric Drift Detection via Dynamic Adapting Window Independence Drift Detection (DAWIDD).
Likelihood-free MCMC with Amortized Approximate Ratio Estimators.
Cost-Effective Interactive Attention Learning with Neural Attention Processes.
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization.
Minimax Rate for Learning From Pairwise Comparisons in the BTL Model.
Data-Efficient Image Recognition with Contrastive Predictive Coding.
Gradient-free Online Learning in Continuous Games with Delayed Rewards.
Hierarchically Decoupled Imitation For Morphological Transfer.
Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation.
The Tree Ensemble Layer: Differentiability meets Conditional Computation.
Nested Subspace Arrangement for Representation of Relational Data.
Contrastive Multi-View Representation Learning on Graphs.
CoMic: Complementary Task Learning & Mimicry for Reusable Skills.
Bayesian Graph Neural Networks with Adaptive Connection Sampling.
A Natural Lottery Ticket Winner: Reinforcement Learning with Ordinary Neural Circuits.
Improving generalization by controlling label-noise information in neural network weights.
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising.
Data Amplification: Instance-Optimal Property Estimation.
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems.
Stochastic Subspace Cubic Newton Method.
Training Binary Neural Networks through Learning with Noisy Supervision.
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust.
DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images.
Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix.
FedBoost: A Communication-Efficient Algorithm for Federated Learning.
Optimal approximation for unconstrained non-submodular minimization.
Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets.
Streaming Submodular Maximization under a k-Set System Constraint.
Retrieval Augmented Language Model Pre-Training.
Multidimensional Shape Constraints.
Neural Topic Modeling with Continual Lifelong Learning.
Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data.
Accelerating Large-Scale Inference with Anisotropic Vector Quantization.
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning.
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks.
Learning to Branch for Multi-Task Learning.
LTF: A Label Transformation Framework for Correcting Label Shift.
Certified Data Removal from Machine Learning Models.
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search.
Recurrent Hierarchical Topic-Guided RNN for Language Generation.
Improving the Gating Mechanism of Recurrent Neural Networks.
Implicit Geometric Regularization for Learning Shapes.
Near-Tight Margin-Based Generalization Bounds for Support Vector Machines.
Monte-Carlo Tree Search as Regularized Policy Optimization.
Robust Learning with the Hilbert-Schmidt Independence Criterion.
On the Iteration Complexity of Hypergradient Computation.
Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling.
Scalable Gaussian Process Separation for Kernels with a Non-Stationary Phase.
DROCC: Deep Robust One-Class Classification.
PackIt: A Virtual Environment for Geometric Planning.
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination.
Ordinal Non-negative Matrix Factorization for Recommendation.
Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning.
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions.
Automatic Reparameterisation of Probabilistic Programs.
The continuous categorical: a novel simplex-valued exponential family.
Differentially Private Set Union.
Towards a General Theory of Infinite-Width Limits of Neural Classifiers.
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks.
SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification.
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent.
One Size Fits All: Can We Train One Denoiser for All Noise Levels?
Adaptive Sketching for Fast and Convergent Canonical Polyadic Decomposition.
Representations for Stable Off-Policy Reinforcement Learning.
Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos.
A Distributional Framework For Data Valuation.
Gradient Temporal-Difference Learning with Regularized Corrections.
Aligned Cross Entropy for Non-Autoregressive Machine Translation.
Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead.
Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs.
Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics.
Online Multi-Kernel Learning with Graph-Structured Feedback.
Black-Box Methods for Restoring Monotonicity.
Generalisation error in learning with random features and the hidden manifold model.
Multilinear Latent Conditioning for Generating Unseen Attribute Combinations.
Deep PQR: Solving Inverse Reinforcement Learning using Anchor Actions.
Generalization and Representational Limits of Graph Neural Networks.
Predicting deliberative outcomes.
Symbolic Network: Generalized Neural Policies for Relational MDPs.
Online Convex Optimization in the Random Order Model.
Can Stochastic Zeroth-Order Frank-Wolfe Method Converge Faster for Non-Convex Problems?
A Free-Energy Principle for Representation Learning.
Abstraction Mechanisms Predict Generalization in Deep Neural Networks.
Stochastic bandits with arm-dependent delays.
Accelerating the diffusion-based ensemble sampling by non-reversible dynamics.
Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions.
DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths.
Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript.
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks.
Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods.
No-Regret and Incentive-Compatible Online Learning.
Linear Mode Connectivity and the Lottery Ticket Hypothesis.
Leveraging Frequency Analysis for Deep Fake Image Recognition.
Stochastic Latent Residual Video Prediction.
p-Norm Flow Diffusion for Local Graph Clustering.
Logarithmic Regret for Adversarial Online Control.
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles.
Topic Modeling via Full Dependence Mixtures.
Information Particle Filter Tree: An Online Algorithm for POMDPs with Belief-Based Rewards on Continuous Domains.
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data.
How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization.
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study.
Why Are Learned Indexes So Effective?
Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data.
Accountable Off-Policy Evaluation With Kernel Bellman Statistics.
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation.
Global Concavity and Optimization in a Class of Dynamic Discrete Choice Models.
Learning with Multiple Complementary Labels.
Revisiting Fundamentals of Experience Replay.
Improved Optimistic Algorithms for Logistic Bandits.
Growing Action Spaces.
Do GANs always have Nash equilibria?
Stochastic Regret Minimization in Extensive-Form Games.
Online mirror descent and dual averaging: keeping pace in the dynamic case.
On hyperparameter tuning in general clustering problems.
Spectral Graph Matching and Regularized Quadratic Relaxations: Algorithm and Theory.
Optimal Sequential Maximization: One Interview is Enough!
Latent Bernoulli Autoencoder.
Faster Graph Embeddings via Coarsening.
Rigging the Lottery: Making All Tickets Winners.
Distributed Online Optimization over a Heterogeneous Network with Any-Batch Mirror Descent.
Identifying Statistical Bias in Dataset Replication.
Continuous Time Bayesian Networks with Clocks.
Parallel Algorithm for Non-Monotone DR-Submodular Maximization.
Generalization Error of Generalized Linear Models in High Dimensions.
Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks.
Revisiting Spatial Invariance with Low-Rank Local Connectivity.
Decision Trees for Decision-Making under the Predict-then-Optimize Framework.
Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location.
Training Linear Neural Networks: Non-Local Convergence and Complexity Results.
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients.
Self-Concordant Analysis of Frank-Wolfe Algorithms.
Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing.
Sparse Gaussian Processes with Spherical Harmonic Features.
Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors.
On Contrastive Learning for Likelihood-free Inference.
Equivariant Neural Rendering.
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers.
Kernel Methods for Cooperative Multi-Agent Contextual Bandits.
Cooperative Multi-Agent Bandits with Heavy Tails.
Familywise Error Rate Control by Interactive Unmasking.
Online Bayesian Moment Matching based SAT Solver Heuristics.
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.
NGBoost: Natural Gradient Boosting for Probabilistic Prediction.
Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders.
Optimal Non-parametric Learning in Repeated Contextual Auctions with Strategic Buyer.
The Complexity of Finding Stationary Points with Stochastic Gradient Descent.
Expert Learning through Generalized Inverse Multiobjective Optimization: Models, Insights, and Algorithms.
Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems.
On the Expressivity of Neural Networks for Deep Reinforcement Learning.
Towards Adaptive Residual Network Training: A Neural-ODE Perspective.
Multinomial Logit Bandit with Low Switching Cost.
Optimal Differential Privacy Composition for Exponential Mechanisms.
Provable Smoothness Guarantees for Black-Box Variational Inference.
Inexact Tensor Methods with Dynamic Accuracies.
Growing Adaptive Multi-hyperplane Machines.
Layered Sampling for Robust Optimization Problems.
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features.
Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence.
Enhancing Simple Models by Exploiting What They Already Know.
Margin-aware Adversarial Domain Adaptation with Optimal Transport.
A Swiss Army Knife for Minimax Optimal Transport.
Robust Pricing in Dynamic Mechanism Design.
Towards Understanding the Dynamics of the First-Order Adversaries.
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC.
Interpreting Robust Optimization via Adversarial Influence Functions.
Randomly Projected Additive Gaussian Processes for Regression.
Structure Adaptive Algorithms for Stochastic Bandits.
Gamification of Pure Exploration for Linear Bandits.
An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm.
Representing Unordered Data Using Complex-Weighted Multiset Automata.
Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid Flow Prediction.
Low-Variance and Zero-Variance Baselines for Extensive-Form Games.
Probing Emergent Semantics in Predictive Agents via Question Answering.
Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors.
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models.
Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification.
Goodness-of-Fit Tests for Inhomogeneous Random Graphs.
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting.
The Usual Suspects? Reassessing Blame for VAE Posterior Collapse.
Scalable Deep Generative Modeling for Sparse Graphs.
R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games.
Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime.
Supervised Quantile Normalization for Low Rank Matrix Factorization.
Momentum Improves Normalized SGD.
Parameter-free, Dynamic, and Strongly-Adaptive Online Learning.
Flexible and Efficient Long-Range Planning Through Curious Exploration.
Privately detecting changes in unknown distributions.
Real-Time Optimisation for Online Learning in Auctions.
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks.
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack.
Causal Modeling for Fairness In Dynamical Systems.
DINO: Distributed Newton-Type Optimization Method.
Learnable Group Transform For Time-Series.
Online Learning with Dependent Stochastic Feedback Graphs.
Adaptive Region-Based Active Learning.
Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows.
Learning Opinions in Social Networks.
Boosting Frank-Wolfe by Chasing Gradients.
Word-Level Speech Recognition With a Letter to Word Encoder.
Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data.
On Efficient Low Distortion Ultrametric Embedding.
Healing Products of Gaussian Process Experts.
Composable Sketches for Functions of Frequencies: Beyond the Worst Case.
Leveraging Procedural Generation to Benchmark Reinforcement Learning.
Model Fusion with Kullback-Leibler Divergence.
Deep Divergence Learning.
Teaching with Limited Information on the Learner's Behaviour.
Feature-map-level Online Adversarial Knowledge Distillation.
Scalable and Efficient Comparison-based Search without Features.
Estimating Generalization under Distribution Shifts via Domain-Invariant Representations.
Semismooth Newton Algorithm for Efficient Projections onto ℓ1,∞-norm Ball.
Distance Metric Learning with Joint Representation Diversification.
Online Continual Learning from Imbalanced Data.
Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models.
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels.
Stochastic Flows and Geometric Optimization on the Orthogonal Group.
k-means++: few more steps yield constant approximation.
Encoding Musical Style with Transformer Autoencoders.
Fair Generative Modeling via Weak Supervision.
How to Solve Fair k-Center in Massive Data Models.
On Coresets for Regularized Regression.
Streaming Coresets for Symmetric Tensor Factorization.
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism.
Convergence Rates of Variational Inference in Sparse Deep Learning.
Representation Learning via Adversarially-Contrastive Optimal Transport.
Stochastic Gradient and Langevin Processes.
Mutual Transfer Learning for Massive Data.
Learning with Bounded Instance and Label-dependent Label Noise.
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information.
High-dimensional Robust Mean Estimation via Gradient Descent.
(Locally) Differentially Private Combinatorial Semi-Bandits.
Automated Synthetic-to-Real Generalization.
On Breaking Deep Generative Model-based Defenses and Beyond.
Simple and Deep Graph Convolutional Networks.
Optimization from Structured Samples for Coverage Functions.
Negative Sampling in Semi-Supervised learning.
Generative Pretraining From Pixels.
An Accelerated DFO Algorithm for Finite-sum Convex Functions.
More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models.
VFlow: More Expressive Generative Flows with Variational Data Augmentation.
Estimating the Error of Randomized Newton Methods: A Bootstrap Approach.
Angular Visual Hardness.
On Efficient Constructions of Checkpoints.
Differentiable Product Quantization for End-to-End Embedding Compression.
Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search.
A Simple Framework for Contrastive Learning of Visual Representations.
Learning Flat Latent Manifolds with VAEs.
Convolutional Kernel Networks for Graph-Structured Data.
Mapping natural-language problems to formal-language solutions using structured neural representations.
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization.
Graph Optimal Transport for Cross-Domain Alignment.
Combinatorial Pure Exploration for Dueling Bandit.
Learning To Stop While Learning To Predict.
Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training.
Deep Reasoning Networks for Unsupervised Pattern De-mixing with Constraint Reasoning.
Uncertainty-Aware Lookahead Factor Models for Quantitative Investing.
Explainable and Discourse Topic-aware Neural Language Understanding.
Better depth-width trade-offs for neural networks through the lens of dynamical systems.
Circuit-Based Intrinsic Methods to Detect Overfitting.
Invariant Rationalization.
Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions.
Learning to Simulate and Design for Structural Engineering.
Optimizing for the Future in Non-Stationary MDPs.
Imputer: Sequence Modelling via Imputation and Dynamic Programming.
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift.
Concise Explanations of Neural Networks using Adversarial Training.
Description Based Text Classification with Reinforcement Learning.
Meta-learning with Stochastic Linear Bandits.
Data preprocessing to mitigate bias: A maximum entropy based approach.
Fully Parallel Hyperparameter Search: Reshaped Space-Filling.
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently.
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills.
Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates.
Near-linear time Gaussian process optimization with adaptive batching and resparsification.
Provably Efficient Exploration in Policy Optimization.
Uncertainty quantification for nonconvex tensor completion: Confidence intervals, heteroscedasticity and optimality.
On Validation and Planning of An Optimal Decision Rule with Application in Healthcare Studies.
Boosted Histogram Transform for Regression.
Online Learned Continual Compression with Adaptive Quantization Modules.
Structured Prediction with Partial Labelling through the Infimum Loss.
DeBayes: a Bayesian Method for Debiasing Network Embeddings.
Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models.
Online Pricing with Offline Data: Phase Transition and Inverse Square Law.
Scalable Exact Inference in Multi-Output Gaussian Processes.
A Pairwise Fair and Community-preserving Approach to k-Center Clustering.
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences.
TaskNorm: Rethinking Batch Normalization for Meta-Learning.
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation.
The FAST Algorithm for Submodular Maximization.
Estimating the Number and Effect Sizes of Non-null Hypotheses.
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference.
Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension.
Calibration, Entropy Rates, and Memory in Language Models.
Adversarial Filters of Dataset Biases.
Preference Modeling with Context-Dependent Salient Features.
Tightening Exploration in Upper Confidence Reinforcement Learning.
Latent Variable Modelling with Hyperbolic Normalizing Flows.
Small Data, Big Decisions: Model Selection in the Small-Data Regime.
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks.
Proper Network Interpretability Helps Adversarial Robustness in Classification.
Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More.
Lorentz Group Equivariant Neural Network for Particle Physics.
Deep Coordination Graphs.
Modulating Surrogates for Bayesian Optimization.
Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Fast Differentiable Sorting and Ranking.
Provable guarantees for decision tree induction: the agnostic setting.
My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits.
Tight Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance.
The Boomerang Sampler.
Adversarial Robustness for Code.
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders.
Spectral Clustering with Graph Neural Networks for Graph Pooling.
Low-Rank Bottleneck in Multi-head Attention Models.
Near-optimal sample complexity bounds for learning Latent k-polytopes and applications to Ad-Mixtures.
Learning and Sampling of Atomic Interventions from Observations.
When are Non-Parametric Methods Robust?
Online Learning with Imperfect Hints.
Implicit differentiation of Lasso-type models for hyperparameter optimization.
Training Neural Networks for and by Interpolation.
Efficient Policy Learning from Surrogate-Loss Classification Reductions.
Preselection Bandits.
Interference and Generalization in Temporal Difference Learning.
Defense Through Diverse Directions.
The Cost-free Nature of Optimally Tuning Tikhonov Regularizers and Other Ordered Smoothers.
Decoupled Greedy Learning of CNNs.
Kernel interpolation with continuous volume sampling.
On Second-Order Group Influence Functions for Black-Box Predictions.
ECLIPSE: An Extreme-Scale Linear Program Solver for Web-Applications.
Private Query Release Assisted by Public Data.
Frequency Bias in Neural Networks for Input of Non-Uniform Density.
Learning the piece-wise constant graph structure of a varying Ising model.
Option Discovery in the Absence of Rewards with Manifold Analysis.
Fast OSCAR and OWL Regression via Safe Screening Rules.
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training.
Inductive-bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters.
Dual Mirror Descent for Online Allocation Problems.
Stochastic Optimization for Regularized Wasserstein Estimators.
Ready Policy One: World Building Through Active Learning.
Refined bounds for algorithm configuration: The knife-edge of dual class approximability.
Coresets for Clustering in Graphs of Bounded Treewidth.
Sparse Subspace Clustering with Entropy-Norm.
Provable Self-Play Algorithms for Competitive Reinforcement Learning.
Deep k-NN for Noisy Labels.
Learning De-biased Representations with Biased Representations.
Fiduciary Bandits.
Agent57: Outperforming the Atari Human Benchmark.
Scalable Nearest Neighbor Search for Optimal Transport.
Constant Curvature Graph Convolutional Networks.
Forecasting Sequential Data Using Consistent Koopman Autoencoders.
Model-Based Reinforcement Learning with Value-Targeted Regression.
Sparse Convex Optimization via Adaptively Regularized Hard Thresholding.
Sample Amplification: Increasing Dataset Size even when Learning is Impossible.
Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks.
Safe screening rules for L0-regression from Perspective Relaxations.
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings.
Invertible generative models for inverse problems: mitigating representation error and dataset bias.
Black-box Certification and Learning under Adversarial Perturbations.
Quantum Boosting.
Provable Representation Learning for Imitation Learning via Bi-level Optimization.
NADS: Neural Architecture Distribution Search for Uncertainty Awareness.
Online metric algorithms with untrusted predictions.
Low-loss connection of weight vectors: distribution-based approaches.
Population-Based Black-Box Optimization for Biological Sequence Design.
Fairwashing explanations with off-manifold detergent.
Customizing ML Predictions for Online Algorithms.
The Differentiable Cross-Entropy Method.
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning".
Discount Factor as a Regularizer in Reinforcement Learning.
LowFER: Low-rank Bilinear Pooling for Link Prediction.
Structural Language Models of Code.
The Implicit Regularization of Stochastic Gradient Flow for Least Squares.
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation.
Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay.
A new regret analysis for Adam-type algorithms.
Random extrapolation for primal-dual coordinate descent.
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions.
Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions.
Why bigger is not always better: on finite and infinite neural networks.
Invariant Risk Minimization Games.
Learning What to Defer for Maximum Independent Sets.
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments.
Optimal Bounds between f-Divergences and Integral Probability Metrics.
An Optimistic Perspective on Offline Reinforcement Learning.
Boosting for Control of Dynamical Systems.
Rank Aggregation from Pairwise Comparisons in the Presence of Adversarial Corruptions.
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization.
Efficient Intervention Design for Causal Discovery with Latents.
Context Aware Local Differential Privacy.
A Geometric Approach to Archetypal Analysis via Sparse Projections.
Super-efficiency of automatic differentiation for functions defined as a minimum.
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation.
A distributional view on multi-objective policy optimization.
Selective Dyna-Style Planning Under Limited Model Capacity.