Text-to-image generators are a fun way to create uncanny images using a short description and an AI. For example, using the AI Art Maker tool from Hotpot.ai, all you have to do is type in a few words or a description of something you want to see (e.g. a dog wearing a hat), and its model produces a picture with hilarious—and frequently creepy-looking—results based on your words (e.g. the dog looks like it’s melting).
The images produced by these AI tools aren’t perfect. After all, even machine-learning neural networks have their limits. However, that all might change with yesterday’s unveiling of Google’s new AI image generator, Imagen. The powerful text-to-image model can create incredibly photorealistic images based on a sentence-long description. The results—which can be found in a paper released by the Google Brain Team yesterday—are astounding.
Below are just a few images created by the AI—each captioned with the sentence given to the model to create it. Keep in mind, these are cherry-picked examples from the team. Still, it’s pretty impressive:
You can see more on Imagen’s website.
The model isn’t available to the public. However, the team at Google claims that its AI is more powerful than other, similar text-to-image generators such as VQ-GAN+CLIP, Latent Diffusion Models, and DALL-E 2. To compare the quality of Imagen against those models, Google created DrawBench, a “comprehensive and challenging benchmark for text-to-image models.” For this, human volunteers evaluated and rated the images created by the different AI generators using a list of roughly 200 text prompts.
Fun as these tools are, there’s a dark side to this type of AI image generation. After all, many of these models—including Google’s—are trained using data scraped from the internet, which we all know is filled with a whole lot of racist, sexist, and generally problematic crap (and that’s putting it lightly). As such, these algorithms often come with their own set of biases that can have harmful results. It’s not hard to imagine bad actors weaponizing them to gin up fake images that harm someone’s reputation or sow discord in the news cycle.
Even the team behind Google’s new generator acknowledges this, writing on the Imagen website: “there is a risk that Imagen has encoded harmful stereotypes and representations, which guides our decision to not release Imagen for public use without further safeguards in place.”
So it’s probably for the best that the model has a long way to go before it sees the light of day. Hopefully, when it does, it’ll be used to create more pictures of dogs wearing hats and less fake news.
Got a tip? Send it to The Daily Beast here