As a seasoned crypto investor with a keen eye for technological advancements, I must admit that the ongoing battle between OpenAI and Google has me intrigued. The recent launch of Gemini Live by Google is undoubtedly an exciting development, especially considering my personal preference for seamless and natural interactions with AI assistants.
At the 2024 Made by Google event, Google unveiled a voice chat feature called Gemini Live for their AI assistant, Gemini. This new addition aims to compete with OpenAI’s latest Advanced Voice Mode for ChatGPT. Accessible only to premium users, Gemini Live is designed to facilitate conversations in a more natural and engaging manner.
OpenAI Vs Google: Gemini Live to Rival ChatGPT Voice Mode
In a post on X, Google unveiled its new product, Gemini Live, which is aimed squarely at OpenAI’s Advanced Voice Mode for ChatGPT.
At the event, the feature was unveiled for Gemini Advanced subscribers. It is designed to make interactions with the AI more seamless and less structured, letting users pause, change topics, or pick the discussion back up whenever they wish, much like in a telephone conversation.
Meet Gemini Live: a new way to have more natural conversations with Gemini.
Brainstorm ideas
Interrupt to ask questions
Pause a chat and come back to it
Now rolling out in English to Gemini Advanced subscribers on @Android phones → …
— Google DeepMind (@GoogleDeepMind) August 13, 2024
A standout characteristic of Google’s latest speech engine is its ability to generate fluid, emotionally nuanced, lifelike dialogue across several turns. Ten natural-sounding voices are available, and users can interrupt the AI mid-response in real time. The hands-free design keeps the conversation going even when the phone is in the background or locked, so users can multitask without disrupting their chat.
Move to Enhance AI Interaction
Under the hood, the assistant runs on the Gemini 1.5 Pro and Gemini 1.5 Flash models, whose larger context window lets it handle longer and more intricate discussions than many other generative AI models. This is what allows Gemini Live to sustain extended conversations and keep track of information more effectively.
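For developers, that long-context behavior is also reachable through the public Gemini API. Below is a minimal sketch of a multi-turn chat, assuming the google-generativeai Python SDK and an API key stored in a GEMINI_API_KEY environment variable (both setup details are assumptions, not something stated in Google’s announcement):

```python
# Minimal sketch: a multi-turn chat with Gemini 1.5 Flash, which keeps
# earlier turns inside the model's large context window.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed env var

model = genai.GenerativeModel("gemini-1.5-flash")
chat = model.start_chat(history=[])

# Each call appends to the same conversation, so later questions can
# refer back to earlier turns without resending them manually.
print(chat.send_message("Summarize the key points of this project brief: ...").text)
print(chat.send_message("Now turn that summary into a five-item task list.").text)
```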
Beyond voice control, Google has confirmed that the multimodal input capability first shown at Google I/O 2024 will be incorporated into Gemini Live by year-end, letting the AI understand and respond to visual cues such as images and video. For now, the feature is available only in English on Android devices, with additional languages and iOS support to follow.
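Similar multimodal handling already exists in the Gemini API for developers: image and text parts can be mixed in a single prompt. A hedged sketch follows, again assuming the google-generativeai SDK; the photo.jpg filename is purely illustrative:

```python
# Sketch: sending an image plus a text question in one prompt.
import os
import PIL.Image
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed env var

model = genai.GenerativeModel("gemini-1.5-flash")
image = PIL.Image.open("photo.jpg")  # hypothetical local file

response = model.generate_content(
    [image, "Describe what is in this photo and suggest a caption."]
)
print(response.text)
```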
Alongside the launch, Google plans to roll out deeper integrations with its own services. In the coming weeks, Gemini will gain expanded functionality for Google apps such as Calendar, Keep, Tasks, and YouTube Music, letting users create playlists, set reminders, and organize their schedules more easily by voice.
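Google has not said how these app integrations are wired up internally. Conceptually, though, they resemble the tool/function-calling support the Gemini API already exposes, so the sketch below uses a made-up create_reminder helper to illustrate the idea; it is not Google’s actual Calendar, Keep, or Tasks integration:

```python
# Illustrative only: a hypothetical "create_reminder" tool showing how a
# natural-language request could be routed to an app action via
# Gemini's function-calling support.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed env var

def create_reminder(title: str, due_date: str) -> dict:
    """Pretend to create a reminder; a real app would call its task service here."""
    print(f"Reminder created: {title} (due {due_date})")
    return {"status": "ok", "title": title, "due_date": due_date}

# The SDK infers the tool schema from the function signature and docstring.
model = genai.GenerativeModel("gemini-1.5-pro", tools=[create_reminder])
chat = model.start_chat(enable_automatic_function_calling=True)

reply = chat.send_message("Remind me to review the Q3 budget on Friday.")
print(reply.text)
```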
In the coming days, Android users can anticipate activating Gemini not just within the app itself, but also via the power button or voice commands. This upgrade will enable seamless interaction between users and Gemini within other apps, where they can pose questions or request content such as images that blend effortlessly into their work.
OpenAI’s Challenges with Advanced Voice Mode
On the other side of the rivalry, OpenAI’s Advanced Voice Mode for ChatGPT ran into issues during its initial, limited testing phase. The feature, designed to make chats feel more lifelike, has drawn criticism because its realistic voice interactions may unintentionally make users overly reliant on the AI.
OpenAI itself has flagged a related risk: users could form social bonds with the AI, which could in turn have negative effects on their human relationships.
In collaboration with the original authors, we’re releasing an updated version of SWE-bench, designed to more reliably evaluate AI models’ ability to solve real-world software issues.
— OpenAI (@OpenAI) August 13, 2024
Beyond voice, the company has been working to improve its AI systems’ software-development abilities. To that end, it has released a carefully vetted subset of the SWE-bench benchmark that more accurately measures a model’s ability to solve real-world software issues, part of a broader effort to keep AI progress both safe and practical for everyday use.
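For anyone who wants to experiment with it, the vetted subset is distributed as a standard dataset; the sketch below assumes the Hugging Face datasets library and the princeton-nlp/SWE-bench_Verified dataset ID (the ID and field names are assumptions based on how SWE-bench is typically published):

```python
# Sketch: loading the human-verified SWE-bench subset for evaluation.
# The dataset ID below is an assumption about where the subset is hosted.
from datasets import load_dataset

verified = load_dataset("princeton-nlp/SWE-bench_Verified", split="test")
print(len(verified), "verified issues")

# Each record pairs a real GitHub issue with the repository state and
# the tests a proposed fix must pass.
example = verified[0]
print(example["repo"], example["instance_id"])
print(example["problem_statement"][:300])
```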