OpenAI releases ChatGPT's hyperrealistic voice to some paying users

Maxwell Zeff

Updated July 31, 2024 at 2:39 AM·3 min read

OpenAI began rolling out ChatGPT's Advanced Voice Mode on Tuesday, giving users their first access to GPT-4o's hyperrealistic audio responses. The alpha version will be available to a small group of ChatGPT Plus users today, and OpenAI says the feature will gradually roll out to all Plus users in the fall of 2024.

When OpenAI first showcased GPT-4o's voice in May, the feature shocked audiences with quick responses and an uncanny resemblance to a real human's voice – one in particular. The voice, Sky, resembled that of Scarlett Johansson, the actress behind the artificial assistant in the movie "Her." Soon after OpenAI's demo, Johansson said she refused multiple inquiries from CEO Sam Altman to use her voice, and after seeing GPT-4o's demo, hired legal counsel to defend her likeness. OpenAI denied using Johansson's voice, but later removed the voice shown in its demo. In June, OpenAI said it would delay the release of Advanced Voice Mode to improve its safety measures.

One month later, and the wait is over (sort of). OpenAI says the video and screensharing capabilities showcased during its Spring Update will not be part of this alpha, launching at a "later date." For now, the GPT-4o demo that blew everyone away is still just a demo, but some premium users will now have access to ChatGPT's voice feature shown there.

ChatGPT can now talk and listen

You may have already tried out the Voice Mode currently available in ChatGPT, but OpenAI says Advanced Voice Mode is different. ChatGPT's old solution to audio used three separate models: one to convert your voice to text, GPT-4 to process your prompt, and then a third to convert ChatGPT's text into voice. But GPT-4o is multimodal, capable of processing these tasks without the help of auxiliary models, creating significantly lower latency conversations. OpenAI also claims GPT-4o can sense emotional intonations in your voice, including sadness, excitement or singing.

In this pilot, ChatGPT Plus users will get to see first hand how hyperrealistic OpenAI's Advanced Voice Mode really is. TechCrunch was unable to test the feature before publishing this article, but we will review it when we get access.

OpenAI says it's releasing ChatGPT's new voice gradually to closely monitor its usage. People in the alpha group will get an alert in the ChatGPT app, followed by an email with instructions on how to use it.

In the months since OpenAI's demo, the company says it tested GPT-4o's voice capabilities with more than 100 external red teamers who speak 45 different languages. OpenAI says a report on these safety efforts is coming in early August.

The company says Advanced Voice Mode will be limited to ChatGPT's four preset voices – Juniper, Breeze, Cove and Ember – made in collaboration with paid voice actors. The Sky voice shown in OpenAI's May demo is no longer available in ChatGPT. OpenAI spokesperson Lindsay McCallum says "ChatGPT cannot impersonate other people's voices, both individuals and public figures, and will block outputs that differ from one of these preset voices."

OpenAI is trying to avoid deepfake controversies. In January, AI startup ElevenLabs's voice cloning technology was used to impersonate President Biden, deceiving primary voters in New Hampshire.

OpenAI also says it introduced new filters to block certain requests to generate music or other copyrighted audio. In the last year, AI companies have landed themselves in legal trouble for copyright infringement, and audio models like GPT-4o unleash a whole new category of companies that can file a complaint. Particularly, record labels, who have a history for being litigious, and have already sued AI song-generators Suno and Udio.

Engadget
OpenAI rolls out advanced Voice Mode and no, it won't sound like ScarJo
OpenAI is bringing the alpha version of its advanced Voice Mode to a limited number of ChatGPT Plus users today.
Engadget
OpenAI's new, lightweight GPT-4o mini model promises an improved ChatGPT experience
GPT-4o mini can currently take in and generate text and images, but the model will eventually be able to process other types of content like audio and video.
TechCrunch
Meta releases its biggest 'open' AI model yet
Meta's latest open source AI model is its biggest yet. Today, Meta said it is releasing Llama 3.1 405B, a model containing 405 billion parameters. Trained using 16,000 Nvidia H100 GPUs, it also benefits from newer training and development techniques that Meta claims makes it competitive with leading proprietary models like OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet (with a few caveats).
TechCrunch
TTT models might be the next frontier in generative AI
After years of dominance by the form of AI known as the transformer, the hunt is on for new architectures. Transformers underpin OpenAI’s video-generating model Sora, and they're at the heart of text-generating models like Anthropic’s Claude, Google’s Gemini and GPT-4o. A promising architecture proposed this month is test-time training (TTT), which was developed over the course of a year and a half by researchers at Stanford, UC San Diego, UC Berkeley and Meta.
TechCrunch
Tokens are a big reason today's generative AI falls short
Generative AI models don't process text the same way humans do. Most models, from small on-device ones like Gemma to OpenAI's industry-leading GPT-4o, are built on an architecture known as the transformer. Due to the way transformers conjure up associations between text and other types of data, they can't take in or output raw text — at least not without a massive amount of compute.
TechCrunch
Quora's Poe now lets users create and share web apps
Poe, Quora's subscription-based, cross-platform aggregator for AI-powered chatbots like Anthropic's Claude and OpenAI's GPT-4o, has launched a feature called Previews that lets people create interactive apps directly in chats with chatbots. Previews allows Poe users to build data visualizations, games and even drum machines by typing things like "Analyze the information in this report and turn it into a digestible and interactive presentation to help me understand it." Previews are a lot like Anthropic's recently introduced Artifacts: dedicated workspaces where users can edit and add to AI-generated content like code and documents.
Yahoo Finance
Small caps have nearly caught up to the S&P 500: Morning Brief
July's market turmoil has hit some stocks harder than others, notably the big ones. And during that time the Russell 2000 small-cap index has roared back.
Yahoo Finance
Investors look to the Fed for clues about a September rate cut
The Federal Reserve is widely expected to hold interest rates steady Wednesday while opening the door to a September cut if inflation continues to show progress.
Yahoo Sports
2024 Paris Olympics Soccer: How to watch the USWNT vs. Australia today
The U.S. Women's National Soccer Team plays Australia in their final group game of the tournament.
Yahoo Sports
2024 Paris Olympics: How to watch Katie Ledecky compete in 1500 meter freestyle final today
Here's how to watch Katie Ledecky make a splash at the 2024 Summer Olympics.
Engadget
Nothing just announced the Phone 2a Plus, a minor refresh of a pre-existing model
Nothing just announced a hardware refresh for the Phone 2a. There’s a faster chip and quicker wired charging.
Yahoo Sports
Taking a look at how every MLB team did at the trade deadline | Baseball Bar-B-Cast Podcast
Jake Mintz & Jordan Shusterman break down all the moves made by every MLB team before the trade deadline and give their grades for how each of them did.
TechCrunch
Tiger Global partner Alex Cook to leave firm, sources say
Alex Cook, a partner at Tiger Global who oversaw some of its largest fintech investments and India deals, is departing the firm after a tenure of nearly seven years, three people familiar with the matter told TechCrunch. A Tiger Global spokesperson didn't immediately respond to a request for comment. Scott Shleifer, who previously led the firm's private equity business, stepped down last year.
Yahoo Sports
Fantasy Baseball 2024: MLB trade deadline winners and losers
Whose fantasy value is on the rise after the wave of deals? Whose value took a hit? Fred Zinkie breaks down the biggest changes.
Yahoo Sports
Paris Olympics: Simone Biles celebrates USA's gymnastics gold medal with a shot at ex-teammate MyKayla Skinner
Simone Biles didn't say MyKayla Skinner's name. She didn't have to.
Yahoo Sports
USA Rugby announces $4M donation from Washington Spirit owner Michele Kang to invest in women's rugby
Hours after winning a historic bronze medal, USA Rugby announced the donation from Kang.
Yahoo Sports
Bears and receiver DJ Moore agree to 4-year, $110 million extension
The Chicago Bears and receiver D.J. Moore agreed to a four-year, $110 million contract extension that is the largest deal in franchise history.
TechCrunch
Lineaje raises $20M to help organizations combat software supply chain threats
A 2024 report by the Ponemon Institute found that over half of organizations have experienced a software supply chain attack, with 54% having experienced one within the past year. Supply chain attacks typically target services from third-party vendors or open source software that make up a company's tech stack, and they can financially devastate an organization. According to a Juniper Research study, supply chain cyberattacks could cost the global economy almost $81 billion in lost revenue and damages by 2026.
Yahoo Sports
2024 Paris Olympics results: Team USA wins big in gymnastics, rugby, swimming on Day 4
Simone Biles led the U.S. to gold and Coco Gauff was eliminated as the U.S. hit 3,000 Olympic medals on another busy day at the Olympics.
Yahoo Sports
MLB trade deadline: Dodgers acquire starting pitcher Jack Flaherty from Tigers in last-minute deal
The Dodgers also traded with the Blue Jays for center fielder Kevin Kiermaier as Los Angeles added some needed outfield depth.

News

Life

Entertainment

Finance

Sports

New on Yahoo

OpenAI releases ChatGPT's hyperrealistic voice to some paying users

ChatGPT can now talk and listen

Recommended Stories

OpenAI rolls out advanced Voice Mode and no, it won't sound like ScarJo

OpenAI's new, lightweight GPT-4o mini model promises an improved ChatGPT experience

Meta releases its biggest 'open' AI model yet

TTT models might be the next frontier in generative AI

Tokens are a big reason today's generative AI falls short

Quora's Poe now lets users create and share web apps

Small caps have nearly caught up to the S&P 500: Morning Brief

Investors look to the Fed for clues about a September rate cut

2024 Paris Olympics Soccer: How to watch the USWNT vs. Australia today

2024 Paris Olympics: How to watch Katie Ledecky compete in 1500 meter freestyle final today

Nothing just announced the Phone 2a Plus, a minor refresh of a pre-existing model

Taking a look at how every MLB team did at the trade deadline | Baseball Bar-B-Cast Podcast

Tiger Global partner Alex Cook to leave firm, sources say

Fantasy Baseball 2024: MLB trade deadline winners and losers

Paris Olympics: Simone Biles celebrates USA's gymnastics gold medal with a shot at ex-teammate MyKayla Skinner

USA Rugby announces $4M donation from Washington Spirit owner Michele Kang to invest in women's rugby

Bears and receiver DJ Moore agree to 4-year, $110 million extension

Lineaje raises $20M to help organizations combat software supply chain threats

2024 Paris Olympics results: Team USA wins big in gymnastics, rugby, swimming on Day 4

MLB trade deadline: Dodgers acquire starting pitcher Jack Flaherty from Tigers in last-minute deal