Intel Labs will present 17 novel research projects at the Neural Information Processing Systems (NeurIPS) 2020 conference through oral presentations, spotlights, workshops, and accepted papers. Held virtually on December 6–12, NeurIPS 2020 fosters the exchange of research on neural information processing systems in their biological, technological, mathematical, and theoretical aspects.
From an oral presentation on multiscale deep equilibrium models to two spotlights on heavy-tail processes in neural networks and language-conditioned imitation learning, Intel Labs continues to contribute novel research in deep learning, deep reinforcement learning, neural network modeling and optimization, and meta-learning.
The following abstracts outline Intel Labs research presented at the conference:
Oral Presentation: Multiscale Deep Equilibrium Models
Multiscale Deep Equilibrium Models
Shaojie Bai (Carnegie Mellon University), Vladlen Koltun (Intel Labs), and J. Zico Kolter (Carnegie Mellon University / Bosch Center for AI)
Researchers propose a new class of implicit networks, the multiscale deep equilibrium model (MDEQ), suited to large-scale and highly hierarchical pattern recognition domains. An MDEQ directly solves for and backpropagates through the equilibrium points of multiple feature resolutions simultaneously, using implicit differentiation to avoid storing intermediate states (and thus achieving O(1) memory consumption). These simultaneously learned multi-resolution features allow a single model to be trained on a diverse set of tasks and loss functions, such as using a single MDEQ to perform both image classification and semantic segmentation. The effectiveness of this approach is shown on two large-scale vision tasks: ImageNet classification and semantic segmentation on high-resolution images from the Cityscapes dataset. In both settings, MDEQs match or exceed the performance of recent competitive computer vision models: the first time such performance and scale have been achieved by an implicit deep learning approach.
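To make the core mechanism concrete, here is a minimal, hypothetical sketch of the deep-equilibrium idea in one dimension (not the paper's multiscale architecture): solve for a fixed point z* = f(z*, x) by iteration, then differentiate through the equilibrium with the implicit function theorem instead of storing the iterates. The function f(z) = tanh(w·z + x) and all names below are illustrative assumptions.

```python
import math

def fixed_point(f, z0=0.0, tol=1e-12, max_iter=10_000):
    """Iterate z <- f(z) until convergence (the layer's 'equilibrium')."""
    z = z0
    for _ in range(max_iter):
        z_new = f(z)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def deq_forward(w, x):
    """Equilibrium of a toy one-unit layer f(z) = tanh(w*z + x)."""
    return fixed_point(lambda z: math.tanh(w * z + x))

def deq_grad_w(w, x):
    """Implicit gradient dz*/dw via the implicit function theorem:
    dz*/dw = (df/dw) / (1 - df/dz), evaluated at the equilibrium z*.
    No intermediate iterates are stored, hence O(1) memory."""
    z = deq_forward(w, x)
    s = 1.0 - z * z          # d/du tanh(u) = 1 - tanh(u)^2
    return (z * s) / (1.0 - w * s)

w, x = 0.5, 0.3
grad = deq_grad_w(w, x)
# Sanity check against a finite difference; the two should agree closely.
eps = 1e-6
fd = (deq_forward(w + eps, x) - deq_forward(w - eps, x)) / (2 * eps)
print(grad, fd)
```

The point of the construction is visible even in this scalar case: the backward pass needs only the equilibrium value z*, not the trajectory that produced it.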
Spotlights: Heavy-Tail Processes in Neural Networks and Language-Conditioned Imitation Learning
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli (Institut Polytechnique de Paris / University of Oxford), Ozan Sener (Intel Labs), George Deligiannidis (Oxford), and Murat Erdogdu (University of Toronto)
Researchers prove generalization bounds for stochastic gradient descent (SGD) under the assumption that its trajectories can be well approximated by a Feller process, a rich class of Markov processes that includes several recent stochastic differential equation (SDE) representations (both Brownian and heavy-tailed) as special cases. They show that the generalization error can be controlled by the Hausdorff dimension of the trajectories, which is intimately linked to the tail behavior of the driving process. The results imply that heavier-tailed processes should achieve better generalization, so the tail index of the process can serve as a capacity metric. The theory is supported with experiments on deep neural networks showing that the proposed capacity metric accurately estimates the generalization error and, unlike existing capacity metrics in the literature, does not necessarily grow with the number of parameters.
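As an illustrative aside, the tail index the paper treats as a capacity metric can be estimated from samples with the standard Hill estimator; the sketch below (a generic estimator, not the paper's experimental pipeline) recovers the known tail index of synthetic Pareto-distributed data. Sample sizes and the choice of k are illustrative assumptions.

```python
import math
import random

def hill_estimator(samples, k):
    """Hill estimator of the tail index alpha from the k largest magnitudes.
    Smaller alpha means heavier tails; the paper links heavier tails
    (via Hausdorff dimension) to better generalization."""
    xs = sorted((abs(s) for s in samples), reverse=True)
    return k / sum(math.log(xs[i] / xs[k]) for i in range(k))

random.seed(0)
alpha_true = 1.5
# Pareto(alpha) samples on [1, inf) via inverse-CDF sampling.
samples = [(1.0 - random.random()) ** (-1.0 / alpha_true)
           for _ in range(100_000)]
print(hill_estimator(samples, k=1000))  # close to the true alpha = 1.5
```

In practice one would apply such an estimator to quantities derived from SGD iterates (e.g., gradient noise), which is where the paper's theory gives the estimate its meaning as a capacity measure.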
Language-Conditioned Imitation Learning for Robot Manipulation Tasks
Simon Stepputtis (Arizona State University), Joseph Campbell (Arizona State University), Mariano Phielipp (Intel AI Labs), Stefan Lee (Oregon State University), Chitta Baral (Arizona State University), and Heni Ben Amor (Arizona State University)
Motivated by insights into the human teaching process, researchers introduce a method for incorporating unstructured natural language into imitation learning. At training time, the expert can provide demonstrations along with verbal descriptions in order to describe the underlying intent (for example, “go to the large green bowl”). The training process then interrelates these two modalities to encode the correlations between language, perception, and motion. The resulting language-conditioned visuomotor policies can be conditioned at runtime on new human commands and instructions, which allows for more fine-grained control over the trained policies while also reducing situational ambiguity. Researchers demonstrate in a set of simulation experiments how the approach can learn language-conditioned manipulation policies for a seven-degree-of-freedom robot arm and compare the results to a variety of alternative methods.
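The structural idea, a policy conditioned jointly on perception and an embedded instruction, can be sketched in a toy form. Everything below is a hypothetical stand-in: a hash-based bag-of-words embedding replaces the paper's sentence encoder, a flat feature vector replaces camera input, and the weights are random rather than learned from demonstrations.

```python
import random

def embed_command(command, dim=8):
    """Toy bag-of-words language embedding: each word deterministically
    seeds a small random vector (a stand-in for a learned encoder)."""
    vec = [0.0] * dim
    for word in command.lower().split():
        random.seed(word)  # deterministic per-word contribution
        for i in range(dim):
            vec[i] += random.uniform(-1, 1)
    return vec

def policy(state, command, weights):
    """Language-conditioned policy: the action depends jointly on the
    perceived state and the embedded instruction."""
    features = state + embed_command(command)
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

# Hypothetical 2-D state and 3-D action (the paper controls a 7-DoF arm);
# in the paper these weights are learned from demonstrations paired with
# verbal descriptions, not sampled at random.
random.seed(42)
weights = [[random.uniform(-1, 1) for _ in range(10)] for _ in range(3)]
state = [0.2, -0.5]
action = policy(state, "go to the large green bowl", weights)
```

Changing the command changes the action without changing the policy's weights, which is the property that lets one trained policy be redirected at runtime by new instructions.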
The following papers will also be presented at NeurIPS 2020: