Artificial Intelligence can generate images from text and may soon be capable of producing movies
There have been numerous advancements in the field of artificial intelligence (AI) in recent months that have sparked interest in the technology world, and some of them are already capturing the public’s attention. The first is the arrival of text-to-image AI systems, recently announced by OpenAI (DALL-E 2) and Google (Imagen). These systems can generate an entirely new image from a single line of text, and the images that result are nothing short of surreal, if not mind-boggling. Write a sentence like, “A small cactus in the Sahara Desert wearing a straw hat and neon sunglasses,” and you’ll get exactly that image.
Another significant development revolves around Gato, a general AI model from Alphabet-owned AI firm DeepMind. Most AI systems are trained to perform a single task, which is known as narrow AI. Gato, on the other hand, can perform over 600 tasks, such as playing video games, moving robotic arms, and captioning images. Then there’s the recent development involving Google’s LaMDA chatbot. It is so good at free-flowing conversations on a seemingly endless range of topics that a Google engineer called it ‘sentient’: aware of its existence and capable of perception. It was persuasive enough that someone is now going to court to fight for the rights of that AI system.
Many technologists believe it is premature to declare that Gato has general AI capabilities, and most are skeptical that LaMDA’s achievements represent a sentient state. However, almost everyone believes that these developments represent a significant advancement for AI systems and open up vast new possibilities. According to India’s first unicorn AI company, the building blocks for AI to begin making videos and movies already exist. With advances in natural language generation (NLG), and with the LaMDA chatbot as an example, there are now bots that can write scripts. Combine that with text-to-image capabilities, and we could see an AI movie in 2-5 years.
Text-to-image AI may also enable the self-generation of content that can be used to improve AI systems. AI engines can now generate their own simulated data for scenarios in which real-world data is difficult to obtain. “For example, consider training an AI engine to detect violence in real-time camera streams. To do so, we may need data from actual violent incidents, which could be difficult or even impossible to obtain. With the new image and video generation, human intervention is no longer required to obtain data for AI training. The AI will generate its own training data and will be able to train another AI model for a variety of purposes.”
Amonkar is conflicted about Gato’s accomplishment. He says that while there has been significant progress, it is still not general AI. Even simple animals can learn for themselves, regardless of domain. What is currently happening in AI is still in its early stages; a general AI should be able to figure things out on its own. According to some experts, Gato was trained to perform each of its 600 tasks, and if faced with a new challenge, it would be unable to use its knowledge from those 600 tasks to logically analyze and solve the new problem.
However, not everything is perfect. For a species as prone to immoral behavior as ours, technology has always been a double-edged sword. According to Mayank Jain, CEO and co-founder of AI-powered content creation firm Scalenut, the photorealistic quality of text-to-image AI systems can have a profound impact on our society and will be one of the most difficult problems the AI community will face. “The world is still coming to terms with deepfakes and fake news, and with this technology becoming public, the problem could grow exponentially. As Uncle Ben said in Spider-Man, with great power comes great responsibility,” he says.