Foundation models in AI give rise to new capabilities for performing a wide range of tasks efficiently
If you’ve seen photos of a teapot shaped like an avocado or read a well-written article that veers off on slightly weird tangents, you may have been exposed to a new trend in AI. Machine learning systems called DALL-E, GPT, and PaLM are making a splash with their incredible ability to generate creative work. These systems are known as “foundation models,” and they are not all hype and party tricks.
Foundation models are models trained on broad data (generally using self-supervision at scale) that can be adapted to a wide range of downstream tasks. These models, which are based on standard ideas in transfer learning and recent advances in deep learning and computer systems applied at a very large scale, demonstrate surprising emergent capabilities and substantially improve performance on a wide range of downstream tasks. Given this potential, foundation models are seen as the subject of a growing paradigm shift, where many AI systems across domains will directly build upon or heavily integrate foundation models.
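The adaptation step described above, reusing one pretrained model across many downstream tasks, can be sketched in miniature as a “linear probe”: the pretrained encoder is frozen and only a small task-specific head is trained. The code below is a toy numpy sketch under loud assumptions: the “foundation model” is just a fixed random projection (not a real pretrained network), and the downstream labels are synthetic and constructed to be decodable from the frozen features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained "foundation" encoder: a fixed random
# projection. (Hypothetical toy; real foundation models are large neural
# networks trained on broad data.)
W_frozen = rng.normal(size=(16, 8))

def encode(x):
    """Frozen feature extractor -- its weights are never updated."""
    return np.tanh(x @ W_frozen)

# Synthetic downstream task, constructed so that the labels are linearly
# decodable from the frozen features (otherwise a linear probe cannot succeed).
X = rng.normal(size=(200, 16))
features = encode(X)
y = (features[:, 0] > 0).astype(float)

# Adaptation = training only a small linear head (logistic regression)
# on top of the frozen features, via gradient descent.
w = np.zeros(8)
b = 0.0
for _ in range(500):
    logits = features @ w + b
    p = 1.0 / (1.0 + np.exp(-logits))   # predicted probability of class 1
    grad = p - y                        # derivative of the logistic loss
    w -= 0.1 * features.T @ grad / len(y)
    b -= 0.1 * grad.mean()

accuracy = ((features @ w + b > 0) == (y == 1)).mean()
```

The design point this illustrates is the asymmetry of effort: the (expensive) encoder is trained once and shared, while each downstream application only fits a tiny head on top of it.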
Foundation models incentivize homogenization: the same few models are repeatedly reused as the basis for many applications. Such consolidation is a double-edged sword: centralization allows researchers to concentrate and amortize their efforts (e.g., to improve robustness, to reduce bias) on a small collection of models that can be repeatedly applied across applications to reap these benefits (akin to societal infrastructure), but centralization also pinpoints these models as singular points of failure that can radiate harms (e.g., security risks, inequities) to countless downstream applications.
In short, foundation models are systems trained on broad datasets to perform a variety of functions, and they build on conventional deep learning and transfer learning techniques. As AI continues to evolve rapidly, these models give rise to new capabilities for performing tasks efficiently.
How do Foundation Models in AI work?
Foundation models use deep neural networks, which are loosely inspired by how the brain works. This involves complex mathematics and enormous computing power, but at bottom it amounts to a sophisticated form of pattern matching.
For instance, a deep neural network that examines millions of images can learn to associate the word “cat” with pixel patterns that frequently appear in images of cats. More examples and data enable the system to sharpen its ability to recognize, visualize, and infer components of the images, and the model’s scope grows as it analyzes more complex patterns and correlations. Foundation models thus extend the deep learning paradigm that has shaped AI research, and they demonstrate emergent capabilities that were not explicitly programmed.
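The “associate a label with frequently occurring pixel patterns” idea can be illustrated with a deliberately tiny sketch. This is not a deep neural network: it is a nearest-centroid classifier on synthetic 4x4 “images” (the “cat”/“dog” data and the bright-half pattern are invented for illustration), but it shows the same statistical principle of matching an input to the pixel pattern most common under each label.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for a large image dataset: 4x4 grayscale arrays where "cat"
# images are bright in the top half and "dog" images in the bottom half.
# (Entirely synthetic -- just to illustrate statistical pattern matching.)
def make_image(label):
    img = rng.random((4, 4)) * 0.2          # low-level background noise
    if label == "cat":
        img[:2, :] += 0.8                   # bright top half
    else:
        img[2:, :] += 0.8                   # bright bottom half
    return img

train = [(make_image(lbl), lbl) for lbl in ["cat", "dog"] * 100]

# "Learning" = averaging pixel values per class: the model associates each
# label with the pixel pattern that frequently appears under that label.
centroids = {}
for lbl in ("cat", "dog"):
    imgs = [img for img, l in train if l == lbl]
    centroids[lbl] = np.mean(imgs, axis=0)

def classify(img):
    # Predict the label whose average pixel pattern is closest to this image.
    return min(centroids, key=lambda lbl: np.linalg.norm(img - centroids[lbl]))

print(classify(make_image("cat")))  # → cat
```

With more examples per class, the centroids become better estimates of each label’s typical pattern; deep networks replace the fixed averaging step with learned, hierarchical features, which is where the additional recognition power comes from.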
To help developers make better-informed decisions with input from the broader community, we at the Center for Research on Foundation Models at the Stanford Institute for Human-Centered AI have proposed creating a foundation models review board. The board’s role would be to facilitate the release of foundation models by developers to external researchers. This approach would expand the group of researchers who can study and improve foundation models while helping to manage the risks of release.