BLOOM: The world’s largest open multilingual language model, widely available for research
BLOOM is the brainchild of BigScience, an international, community-powered project to make large natural language models widely available for research. It aims to democratize large language models (LLMs), which can translate, summarize, and write text with something approaching human-like nuance. The project has produced an open-source language model that its creators claim is as powerful as OpenAI’s GPT-3, but free and open for anyone to use.
The model, which is bigger than GPT-3, has arrived with a bold ambition: freeing AI from Big Tech’s clutches. It is the result of the largest collaboration of AI researchers ever involved in a single research project, and it has made a significant impact on AI research. Unlike Google’s LaMDA and OpenAI’s GPT-3, it is multilingual, an unusual feature in an English-dominated field. It can generate text in 46 natural languages and dialects and 13 programming languages. It was also trained in complete transparency, in an effort to change a status quo in which leading models are developed behind closed doors. It promises performance similar to that of Silicon Valley’s leading systems, but with a radically different approach to access.
The world’s largest open multilingual language models:
BLOOM is designed for a range of research applications, such as extracting information from historical texts. It was created by BigScience, a research project that launched in early 2021 and is bootstrapped and led by the AI startup Hugging Face. BigScience’s backers also hope that BLOOM will spur new investigations into ways to combat the problems that plague all LLMs. Large ML models have changed the world of AI research over the last two years, but the huge compute cost of training them has left very few teams able to train and study them.
Large language models are proving proficient at a growing range of tasks, including writing essays, generating code, and translating languages. While tech giants tend to keep their vaunted LLMs hidden from the public, BLOOM is available to anyone for free. This openness could democratize access to a technology that is set to make a deep impact on society. It is also affordable to run: BigScience says researchers can use BLOOM for less than $40/hr on a cloud provider.
BLOOM gives researchers a unique chance to explore its risks and benefits. Its capabilities are expected to improve as the BigScience workshop continues to experiment and tinker with the model. BigScience describes it as the seed of a living family of models that it intends to grow, not a one-and-done model, and says it is ready to support community efforts to expand it. The model is not likely to compete with those built by Big Tech, but it at least provides a way to scrutinize such systems.