What Gemini and Google AI features we expect


Over the past year, Google has introduced a number of Gemini-branded features and other AI capabilities in its consumer-facing apps. Here’s everything that was announced and when they might be available.

Pixel

At the end of Made by Google 2023, a zoom improvement that “intelligently fills gaps between pixels and predicts fine details” was teased for the Pixel 8 Pro. By leveraging a “custom generative AI image model” on the device, Google touted this as being useful when you forget to zoom.

This is an impressive application of generative AI that opens up a whole range of possibilities for framing and editing your images: the kind of zoom enhancement you’d see in science fiction, running right on the phone in your hand.

In October, Google said it was “coming later.” After three Pixel Feature Drops, it hasn’t arrived yet. It’s unclear whether the model Google is referring to is Gemini Nano with multimodality. At this point, it could just as easily debut with the Pixel 9 Pro as that phone’s flagship camera feature.

Google Home

In the Google Home app, generative AI will be used to summarize events into a “simplified view of what happened recently.” This “quick and easy summary” will use bullet points, while you’ll also be able to “ask about your home” via conversation to find historical video clips and get automations. The “experimental features” will be available to Nest Aware subscribers in 2024.

Fitbit

Fitbit Labs will allow Fitbit Premium users to test and provide feedback on experimental AI capabilities.

One such feature is a chatbot that lets you ask questions about your Fitbit data in a natural, conversational way. This “personalized coaching” that factors in fitness goals aims to generate “actionable messages and advice,” with responses that can include personalized charts.

  • “For example, you can dig deeper into how many active zone minutes (AZM) you have and how that correlates to how restorative your sleep is.”
  • “…this model may be able to analyze variations in your sleep patterns and sleep quality, and then suggest recommendations for how you might modify the intensity of your training based on that information.”

Behind the scenes, this is powered by a new personal health LLM from Fitbit and Google Research that is built on Gemini. As of March, it is set to arrive “later this year” for a “limited number of Android users enrolled in the Fitbit Labs program in the Fitbit mobile app.”

Google Photos

Ask Photos lets you ask questions about the images and videos in your library. In addition to searching for images, the app can pull out information and give you a text response. With Gemini, suggested queries include “Show me the best photo of every national park I’ve visited” and “What themes did we choose for Lena’s birthday party?” It can also be used to “suggest the best photos” and write captions for them. Ask Photos is an “experimental feature” that will be rolling out soon, and Google has already said more capabilities are coming in the future.

Gmail + Google Workspace

In Gmail for Android and iOS, you’ll find a Gemini button in the top-right corner that opens the mobile equivalent of the side panel for entering full prompts. Gmail is also getting contextual Smart Replies that offer more personalized, detailed, and nuanced suggestions. This will roll out to Workspace Labs in July.

At the Cloud Next 2024 conference in April, Google also previewed a voice prompt feature for Help me write in Gmail mobile. Alongside this, an “instant polish” feature will “convert raw notes into a full email with a single click.”

On the desktop web, the side panel is available in Gmail, Google Drive, and Docs/Sheets/Slides. Gemini will then come to Google Chat to summarize conversations and answer questions.

Google Maps

In February, Google announced that Maps would use LLMs to power an “Ask about” chatbot. You can use it to find places that match your query, with support for follow-up questions. It draws on information about 250 million places, as well as user-submitted photos, videos, and reviews.

Chrome

Gemini Nano is coming to Chrome for desktop, bringing browser features like “Help Me Write.” It should be available on most modern laptops and desktops.

Google Search

In addition to launching AI Overviews, Google previewed a number of upcoming features that will be coming to Search Labs first:

  • You will be able to take the original AI Overview and make it “simpler” (just a few sentences) or “break it down” (a longer answer).
  • Multi-step reasoning skills will allow you to ask a complex question in one go rather than breaking it into multiple queries.
  • Meal and travel planning
  • AI-curated search results page
  • Video Searches: Record a video and ask a question about it

Android

Gemini Nano with Multimodality will launch on Pixel “later this year” and will offer features like on-device/offline TalkBack descriptions and real-time scam alerts that listen to a call for telltale patterns. Google will share more details later this year.

At I/O 2024, Google also showcased how Gemini on Android will soon appear as an overlay panel instead of opening a full-screen UI to view results. In addition to preserving context, this will allow you to drag and drop a generated image into a conversation. For Gemini Advanced subscribers, “Ask this video” and “Ask this PDF” buttons will see Gemini answer questions about YouTube videos and summarize documents, respectively. This will roll out “over the coming months.” Additionally, dynamic suggestions will use Gemini Nano with multimodality to understand what’s on your screen:

For example, if you activate Gemini in a conversation about pickleball, suggestions might include “Find pickleball clubs near me” and “Pickleball rules for beginners.”

Another addition that will be particularly useful on mobile is the Gemini extension for Google Calendar, Tasks, and Keep. It will let you take a photo of something like a flyer listing several upcoming dates and have Gemini turn them into calendar events. In the coming months, a “Utilities” extension will let the Gemini mobile app access Android’s Clock app.
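
Google hasn’t said how the Calendar extension pulls dates out of a photo, but the underlying pattern of multimodal extraction is easy to sketch with the public Gemini API. The snippet below is an illustrative approximation only, not Google’s implementation; the google-generativeai Python SDK usage is real, while the model name, file name, and prompt are placeholders chosen for the example:

```python
import google.generativeai as genai
import PIL.Image

# Hypothetical API key and model name; any multimodal Gemini model works.
genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# Hypothetical photo of a flyer listing several upcoming event dates.
flyer = PIL.Image.open("school_flyer.jpg")

# Ask the model to extract structured events from the image.
response = model.generate_content([
    "List every event on this flyer as 'YYYY-MM-DD — title', one per line.",
    flyer,
])
print(response.text)
```

In a real integration, the extracted lines would then be handed to a calendar API; the point here is simply that one multimodal prompt can turn a photo into structured, schedulable data.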

We also expect the Gemini mobile app to arrive on the Pixel Tablet this summer.

Gemini

Gemini Live will allow you to have a two-way conversation with Gemini. To make the experience more natural, Gemini will return concise responses that you can interrupt to add new information or ask for clarification. You can choose from 10 different voices, with Google imagining Gemini Live as being useful for preparing for an interview or rehearsing a speech. It will be available in the “coming months” for Gemini Advanced subscribers.

“Later this year,” Gemini Live will let you launch a live camera mode. Simply point to something in the real world and ask a question about it. This is powered by Project Astra.

Gems are customized versions of Gemini that let you have a “gym buddy, sous chef, coding partner, or creative writing guide.” Gemini Advanced subscribers will be able to create custom ones, while all users will have access to pre-built Gems, like Learning Coach.

Simply describe what you want your Gem to do and how you want it to respond, for example: “You are my running coach, give me a daily running plan and be positive, upbeat and motivating.” Gemini will take those instructions and, with one click, enhance them to create a Gem that meets your specific needs.
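
Google hasn’t published how Gems work under the hood, but conceptually a Gem behaves like a reusable system instruction layered on top of the model. As a rough sketch of that idea only (again, not Google’s implementation), here is how a developer might approximate the running-coach example with the public Gemini API; the model name and prompt wording are assumptions for illustration:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# The "Gem" here is just a persistent system instruction attached to the model.
coach = genai.GenerativeModel(
    model_name="gemini-1.5-flash",
    system_instruction=(
        "You are my running coach. Give me a daily running plan and be "
        "positive, upbeat and motivating."
    ),
)

# Every turn in the chat inherits the coach persona without re-stating it.
chat = coach.start_chat()
reply = chat.send_message("I have 30 minutes today and my legs are sore from yesterday.")
print(reply.text)
```

The appeal of Gems for consumers is the same as the appeal of system instructions for developers: you define the persona once, and every subsequent conversation starts from it.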

Gemini Advanced users will also get an “immersive planner” that goes beyond simply suggesting activities by taking into account travel times and stops, as well as user interests, to create a detailed itinerary. Gemini will draw on flight and trip details in Gmail, Google Maps recommendations for restaurants and museums near your hotel, and searches for other activities.
