Facebook’s parent company is inviting researchers to probe its new GPT-3-sized language model and find its flaws.
Meta’s AI lab has created a massive new language model that shares both the remarkable abilities and the harmful flaws of OpenAI’s pioneering GPT-3 neural network. In an unprecedented move for Big Tech, it is giving the model away to researchers, along with details about how it was built and trained.
Meta says it firmly believes that the ability of others to scrutinize your work is an important part of research. Its release marks the first time a fully trained large language model has been made available to any researcher who wants to study it. The news was welcomed by many who are concerned that this powerful technology has so far been built by small teams behind closed doors.
Large language models, powerful programs that can generate paragraphs of text and mimic human conversation, have become one of the hottest trends in AI in recent years. But they are deeply flawed, parroting misinformation, prejudice, and toxic language.
In theory, putting the technology into more hands should help fix these problems. Yet language models have traditionally been the preserve of rich tech companies, because training them requires enormous amounts of data and computing power. The wider research community, including ethicists and social scientists worried about misuse, has had to watch from the sidelines.
Meta is making its model, called Open Pretrained Transformer (OPT), available for non-commercial use. It is also releasing its code and a logbook that documents the training process. The logbook contains daily updates from team members about the training data: how it was added to the model and when, and what worked and what didn’t. In more than 100 pages of notes, the researchers logged every error, crash, and restart during a three-month training run that proceeded nonstop from October 2021 to January 2022.
With 175 billion parameters (the values in a neural network that get adjusted during training), OPT is the same size as GPT-3. The team built OPT to match GPT-3 in both its accuracy on language tasks and its toxicity. OpenAI has made GPT-3 available as a paid service but has not shared the model itself or its code. The idea was to give researchers a similar language model to study.
OpenAI declined to comment on Meta’s announcement.
Google, which is exploring the use of large language models in its search products, has also been criticized for a lack of transparency. The company sparked controversy in 2020 when it forced out a senior member of its AI ethics team after she produced research highlighting problems with the technology.
Why is Meta doing this? After all, Meta is a company that has said little about how the algorithms behind Facebook and Instagram work, and it has a reputation for burying unfavorable findings from its internal research teams. A big reason for the different approach is Meta AI itself, which has advocated for greater transparency in AI for several years.
Weighing the risks
An AI ethics researcher who was forced out of Google in 2020 and now works at Hugging Face sees the release of OPT as a positive move. But she thinks there are limits to transparency. Has the language model been tested rigorously enough? Do the foreseeable benefits outweigh the foreseeable harms, such as misinformation and racist or misogynistic language?
Releasing a large language model into a world where a broad audience is likely to use it, or be affected by its output, carries responsibilities. The model can generate harmful content not only by itself but also through the downstream applications that researchers build on top of it. Meta AI audited OPT to remove some harmful behaviors, but the point is to release a model, warts and all, that researchers can learn from.
There was a lot of discussion about how to do this in a way that lets you sleep at night, knowing that there is a non-zero risk in terms of reputation and a non-zero risk in terms of harm. But she rejected the idea that a model shouldn’t be released because it is too dangerous, which is the reason OpenAI gave for not releasing GPT-2, GPT-3’s predecessor.
One of the most important ways to mitigate the risks of any machine-learning technology is to tie assessment and research to specific use cases: What will the system be used for? Who will use it, and how will its results be presented to them?
Some researchers question why large language models are being built at all, given their potential for harm. For Meta AI, such questions belong in the conversation, but you shouldn’t expect everyone in the field to agree that the models shouldn’t exist. That discussion needs many voices.