Date: 2023-09-25

OpenAI is once again pushing the boundaries of AI technology with the introduction of new voice and image capabilities in ChatGPT. These features are set to revolutionise how users interact with the AI model, offering a more intuitive and immersive experience.

Voice Conversations with ChatGPT

One of the standout features of this update is the ability to engage in voice conversations with ChatGPT. Users can now have real-time, back-and-forth dialogues with their AI assistant, opening up a world of possibilities. Whether you're on the go, looking for a bedtime story for your family, or settling a dinner table debate, ChatGPT's voice capabilities are ready to assist.

To get started with voice, go to the Settings menu in the mobile app, select "New Features," and opt into voice conversations. Once enabled, tap the headphone icon in the top-right corner of the home screen and choose your preferred voice from five options. OpenAI created each voice in collaboration with professional voice actors, and a new text-to-speech model generates the human-like audio from text and a few seconds of sample speech. In the other direction, Whisper, OpenAI's open-source speech recognition system, transcribes the user's spoken words into text.

Image Interaction with ChatGPT

Another game-changing feature is the ability to share images with ChatGPT. Users can now show ChatGPT one or more images to troubleshoot problems, explore content, or analyse complex data. Whether you're trying to figure out why your grill won't start, plan a meal based on the contents of your fridge, or decipher a data graph for work, ChatGPT can assist you.

To use this feature, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also discuss multiple images in one conversation or use the drawing tool to point your assistant to a specific part of an image. Image understanding is powered by multimodal GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of visual content, such as photos, screenshots, and documents containing both text and images.

Gradual Deployment for Safety and Responsiveness

Voice and image capabilities are being rolled out gradually to Plus and Enterprise users over the next two weeks. Voice is available on iOS and Android, where users opt in through Settings, while images will be available on all platforms.

OpenAI acknowledges the potential risks associated with these advanced capabilities. Because technology that can synthesise realistic voices could be misused for impersonation or fraud, OpenAI is limiting it to a specific use case, voice chat, and created the voices with actors it worked with directly. Notably, Spotify is also using this technology for its Voice Translation feature, which helps podcasters expand their reach by translating podcasts into additional languages in the podcasters' own voices.

Regarding image input, OpenAI has taken measures to limit ChatGPT's ability to analyse and make direct statements about people to respect individuals' privacy. Real-world usage and user feedback will play a crucial role in further enhancing these safeguards while ensuring the usefulness of the tool.

