PonderNet: Learning to Ponder by DeepMind
Machine Learning Whiteboard (MLW) Open-source Series
For our new visitors: we started our machine learning whiteboard (MLW) series earlier this year as an open-invite space to brainstorm ideas and discuss the latest papers, techniques, and workflows in the AI space. We emphasize an informal environment that is open to everyone interested in machine learning, so if that sounds like you, we encourage you to join us at our next ML whiteboard.
In this episode, Curtis Giddings, a machine learning engineer with our information extraction team, focuses on “PonderNet: Learning to Ponder,” by Andrea Banino, Jan Balaguer, and Charles Blundell, one of the most recent DeepMind papers presented at ICML 2021. As you may know, DeepMind usually comes up with many exciting ideas and new state-of-the-art research, and PonderNet is no exception.
This episode is part of the #MLwhiteboard video series hosted by Snorkel AI.
Some of the most exciting things about PonderNet are:
- It is a general technique that can be applied to a wide variety of methods and network architectures.
- It makes intuitive sense, so it is easier to understand than many black-box methods.
- It can potentially save on computational costs.
- It dramatically improves on the results of previous SotA adaptive computation methods.
In standard neural networks, the amount of computation grows with the size of the inputs rather than with the complexity of the problem being learned. To overcome this limitation, the paper introduces PonderNet, a new algorithm that learns to adapt the amount of computation to the complexity of the problem at hand. PonderNet requires minimal changes to the network architecture and learns the number of computational steps end-to-end, striking an effective compromise between training prediction accuracy, computational cost, and generalization. On a complex synthetic problem, PonderNet dramatically improves performance over previous state-of-the-art adaptive computation methods and also succeeds at extrapolation tests where traditional neural networks fail. On a real-world question-answering dataset, it matched the current state-of-the-art results using less compute. Finally, PonderNet reached state-of-the-art results on a complex task designed to test the reasoning capabilities of neural networks.
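To make "learning the number of computational steps" a bit more concrete, here is a minimal sketch of the halting idea as we understand it from the paper. It is not the authors' implementation. At each step the network emits a halting probability, those probabilities define a distribution over the step at which to stop, and the loss trades off per-step prediction error against a KL penalty toward a geometric prior. The function names, the values of `lambda_p` and `beta`, and the way leftover probability mass is folded into the last step are all illustrative assumptions.

```python
import math


def halting_distribution(lambdas):
    """Turn per-step halting probabilities into a distribution over steps.

    p_n = lambda_n * prod_{j<n} (1 - lambda_j). Any leftover mass is folded
    into the final step so the result sums to 1 (a simplifying assumption).
    """
    probs = []
    still_running = 1.0  # probability of not having halted before this step
    for lam in lambdas:
        probs.append(still_running * lam)
        still_running *= 1.0 - lam
    probs[-1] += still_running
    return probs


def geometric_prior(lambda_p, n_steps):
    """Geometric prior truncated to n_steps and renormalized."""
    raw = [lambda_p * (1.0 - lambda_p) ** n for n in range(n_steps)]
    total = sum(raw)
    return [r / total for r in raw]


def expected_steps(lambdas):
    """Expected number of pondering steps under the halting distribution."""
    probs = halting_distribution(lambdas)
    return sum((n + 1) * p_n for n, p_n in enumerate(probs))


def ponder_loss(step_losses, lambdas, lambda_p=0.2, beta=0.01):
    """Expected per-step loss plus beta * KL(halting distribution || prior).

    lambda_p and beta are hypothetical hyperparameter values chosen only
    for illustration.
    """
    probs = halting_distribution(lambdas)
    prior = geometric_prior(lambda_p, len(lambdas))
    reconstruction = sum(p_n * l_n for p_n, l_n in zip(probs, step_losses))
    kl = sum(
        p_n * math.log(p_n / q_n)
        for p_n, q_n in zip(probs, prior)
        if p_n > 0.0  # terms with zero probability contribute nothing
    )
    return reconstruction + beta * kl


if __name__ == "__main__":
    lambdas = [0.5, 0.5, 0.5]       # halting probability emitted at each step
    step_losses = [1.0, 0.5, 0.1]   # prediction loss if we stopped at each step
    print(halting_distribution(lambdas))
    print(expected_steps(lambdas))
    print(ponder_loss(step_losses, lambdas))
```

In a real model the halting probabilities and per-step losses would come from the network itself, and the whole expression would be differentiated end-to-end. The sketch only shows how the pieces combine: a smaller `lambda_p` encourages more pondering steps, and `beta` controls how strongly the halting behavior is pulled toward the prior.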
If you are interested in learning with us, consider joining us at our biweekly ML whiteboard.