Google soups up Gmail and other apps with Bard AI updates

The Scene

Users of Google products like Gmail, Docs, and Sheets will have access to new integrations with Bard, the company’s main competitor to OpenAI’s ChatGPT.

Known as Extensions, the feature was one of several announcements Google made Tuesday about updates to its AI offerings as competition heats up with Microsoft, which has a partnership with OpenAI. The AI model that powers Bard, known as PaLM 2, is also getting a makeover that the company says makes the model more effective.

Bard is also adding a “Google it” feature that double-checks the chatbot’s answers, letting users see which parts of a response are corroborated by sources on the internet and which are not. The update is meant to combat the “hallucination” problem, where large language models spit out inaccurate information.

Semafor spoke with Sissie Hsiao, vice president at Google and general manager for Google Assistant and Bard, about the new features and her thoughts on the future of AI tools.

The View From Sissie Hsiao

Q: What’s an example of what you can do with the new Bard Extensions?

A: I have two kids and at the beginning of school, it’s always emails galore from both schools. There’s back to school night. You need to buy these things. You need to pick them up at this time. You need to go meet the teacher. And so I asked Bard to summarize all the emails from my kids’ schools and tell me what I need to know. Bard retrieved all the emails, summarized them and gave me, point by point — here’s the things you need to know about the third grade classes starting and the seventh grade classes starting. In minutes, I was able to do what would have been hours reading through newsletters from all of these different schools.

Q: How does double-check work?

A: It evaluates each sentence that Bard responds with. And it basically checks if there’s corroboration for that sentence on the internet. You see green and orange sentences. Green means I found a good quality site corroborating this information and you can click on the link to go see the site. Orange sentences mean Google did not find any corroborating evidence. So this means this sentence may or may not be true and you might want to go evaluate it.
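To make the flow she describes concrete, here is a minimal sketch of a per-sentence corroboration check in Python. It is purely illustrative and not Google’s implementation; search_web and corroborates are hypothetical placeholders for whatever retrieval and matching steps Bard actually performs.

```python
from dataclasses import dataclass

@dataclass
class SearchResult:
    url: str
    snippet: str

def search_web(query: str) -> list[SearchResult]:
    # Hypothetical placeholder for a web retrieval call.
    return []

def corroborates(sentence: str, result: SearchResult) -> bool:
    # Hypothetical placeholder check that a result supports the sentence.
    return sentence.lower() in result.snippet.lower()

def double_check(sentences: list[str]) -> list[tuple[str, str, str | None]]:
    """Label each sentence 'green' (a supporting source was found, with its
    link) or 'orange' (no corroborating evidence found)."""
    labeled = []
    for sentence in sentences:
        supporting = [r for r in search_web(sentence) if corroborates(sentence, r)]
        if supporting:
            labeled.append(("green", sentence, supporting[0].url))
        else:
            labeled.append(("orange", sentence, None))
    return labeled
```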

Q: You announced this at Google I/O and have been testing it for months. What did you learn from watching how people use these tools?

A: They really love double-check. They’re like ‘wow, I can now double-check what Bard is telling me.’ People also find it absolutely difficult to find things in documents and emails.

I had this tester who said, ‘I’ve had my Gmail for 15-20 years and I’ve been looking for this one email, and I just can’t find it.’ But she found it through Bard. She was able to just describe it with words.

Q: It’s ironic because Google is a search company. What does it mean for the future of search that these large language models are better at searching for things than traditional Google search?

A: Bard and search are complementary. So in the double-check feature, you search to corroborate and understand. In the Extensions case, this is searching your private corpus. It’s a way to express looking for something that is pretty unique. It’s new and I think people are finding power in that.

Q: How many people are using Bard in general? Is it a popular tool?

A: I can’t talk to specific numbers, but I can talk to its tremendous momentum and also the engagement that we have on Bard. We see people using all of our new features and finding new use cases with them.

Q: How do you think about Bard? Is Bard kind of a demonstration and these tools that you’re talking about today are the real product?

A: Bard is an experiment in generative AI and we’re constantly pushing the envelope on what generative AI can do. This latest launch with Extensions is a really powerful thing because not only is it able to do all the things that a large language model can do, but you’re also able to do things with other tools, which is a really powerful concept. Internally, we call it “agentive capabilities.”

If you take that to the limit of time, you can summarize those kids’ emails for school, and then put the order into Amazon for all the things you need. Or you can find your kids’ entire summer camp schedule, because it can help reason and sort through all your inbox and see all the camps that you’re interested in and actually come up with a schedule. How agentive can it be when it actually can take those higher order tasks and solve them for you through natural language with the human in control?
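The “agentive” pattern she is gesturing at is commonly built as a tool-use loop like the sketch below. This is a hypothetical illustration of that general pattern, not Bard’s code; ask_model, the tools mapping, and the "gmail_search" name are assumptions standing in for the model and the Extensions it can call.

```python
from typing import Callable

def run_agent(ask_model: Callable[[str], dict],
              tools: dict[str, Callable[[str], str]],
              task: str, max_steps: int = 5) -> str:
    """Minimal tool-use loop: at each step the model either picks a tool to
    call or returns a final answer, with the user's task driving it."""
    transcript = task
    for _ in range(max_steps):
        # e.g. {"tool": "gmail_search", "input": "..."} or {"tool": None, "answer": "..."}
        step = ask_model(transcript)
        if step.get("tool") is None:
            return step["answer"]  # model is done; hand the summary back to the user
        tool_output = tools[step["tool"]](step["input"])
        transcript += f"\n[{step['tool']}] {tool_output}"
    return transcript
```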

Q: When will we get to the point where it can actually book the summer camps? Some people call that being an agent. Is that on the horizon?

A: I definitely think that it is on the horizon. But I think people can save a lot of time just with the actual planning and the iteration work. And then the transacting part, there’s some time savings to enter your credit card and whatnot. But really, that’s like a final mile and not the most accretive. I think it’s really all this, smushing all the information together and making sense of it and organizing it. That’s the work that people do a lot in email and text and all these different information corpuses, and this is where we’re going to add the most value.

Q: Can you use Bard to search on other email and document services? Like, say Microsoft Office?

A: A person is not service-centric. A person is multi-service. You have relationships with multiple tools from multiple companies. You have your information in multiple places. My vision is that Bard can become an agent that brings all of that together for a personalized AI that helps you solve your problems.

Q: You’ve also updated PaLM 2. Can you say anything about the numbers? How much bigger is it and how much better?

A: I can’t speak about model sizes. But we’re constantly building. We’re constantly experimenting with different model architectures, different data, and training mixtures. This launch is significantly upgraded in quality.

Q: Can you talk about the kind of fine-tuning and reinforcement learning with human feedback (RLHF) you’re doing to improve the model?

A: It’s fascinating that they call fine-tuning and RLHF “recipes.” You add a little of this, you subtract a little of that and it tastes better, or maybe it tastes worse. It’s actually remarkably true. And I think with our fine-tuning approach, we’re constantly looking for what kind of data will make the model hallucinate less, be more creative when it needs to be creative, be less creative when it should be more factual. Every two weeks, we’re actually constantly revving our fine-tuning and our RLHF.

Q: Now that the model is bigger and better, is it more expensive to run as far as server costs go?

A: I can’t speak to the exact costs. What I can say is we are always balancing costs and latency because bigger models can also be slower. These model sizes change over time and optimizations change over time. So there’s not a fixed point perspective on our particular inference cost.

Q: A lot of people are really anticipating the launch of Gemini. PaLM 2 is getting better. Does it morph into Gemini or is Gemini a totally separate thing and what can we expect from it?

A: It’s a really exciting model that we’re going to be putting into Bard. I can’t say when. It will be a completely new model architecture, a completely new approach to the training. It’s going to be quite a large model, of course, which is really exciting. And with these things, as a product builder, you’re always thinking about what are the applications of this great technology. Again, it’s a recipe. The base model is definitely a component of that recipe and many other components on top. Features, RLHF. Stay tuned for more capabilities, but we think there will be more skills and more capabilities emerging from these newer generation models.

Correction

Google’s new integration of AI tools into its products is available to all users. An earlier version of this article misstated that it was only available to users of the paid version of Google products.