ChatGPT's New Voice and Image Features Explained

New Features of ChatGPT: Voice and Image Capabilities

OpenAI, the company behind the popular chatbot ChatGPT, has recently announced new features for its generative AI-based chatbot. These new capabilities include voice and image functionalities, making the interaction with the chatbot more intuitive and accessible. In this article, we will guide you on how to use these new features and discuss their benefits.

The new voice capabilities of ChatGPT allow users to have voice conversations with the chatbot, with five different voices to choose from. OpenAI has enlisted the help of professional voice actors and utilized their proprietary Whisper speech recognition system to transcribe spoken words into text. This feature opens the door to many creative and accessibility-focused applications.

To use the voice feature, launch the ChatGPT app and find the newly added headphone icon situated to the right of the chat box. Click on the icon and follow the instructions on the screen to finalize setting up the voice chat. Begin voice interactions by pressing the headphone icon once more to initiate a voice conversation with ChatGPT voice. Speak to the AI, and it will respond to your vocal queries.

To modify the voice, access the top left-side menu and select your account at the bottom. In this section, choose the 'Voice' option under the Speech category. You can now pick a voice that suits your preference for ChatGPT.

The image understanding feature of ChatGPT enables users to upload one or more images to ask the chatbot questions like 'Explore the contents of my fridge to plan a meal' or 'Analyze a complex graph for work-related data'. This feature is powered by the multimodal abilities of GPT-3.5 and GPT-4.

To use the vision feature, open your ChatGPT interface and look for the 'Vision' icon (usually represented by an eye or camera symbol). Upload an image, and ChatGPT will analyze, describe, or take action based on what it sees. 

These new features of ChatGPT offer numerous benefits, including enhanced accessibility, better communication, and a more intuitive user experience. OpenAI is also working with other companies, such as Spotify, to harness the power of this new technology.

"ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms)."

The new features will be available to Plus and Enterprise users in the next two weeks, followed by developers 'soon after'. Make sure to update your ChatGPT app and explore these exciting new features to enhance your user experience.


Citations

1. Younker, Scott. OpenAI GPT-4o is coming — top 5 new features you need to know (14 (5 (2024))). Tom's Guide. https://www.tomsguide.com/ai/chatgpt/chatgpt-is-getting-better-5-new-features-to-keep-your-eyes-on. Accessed 12.6.2024.

2. Singh, Himanshi. How to Use the ChatGPT Voice Chat Feature? (6 (12 (2023))). Analytics Vidhya. https://www.analyticsvidhya.com/blog/2023/12/how-to-use-the-chatgpt-voice-chat-feature/. Accessed 12.6.2024.

3. Hines, Kristi. ChatGPT iOS App Update (7 (8 (2023))). Search Engine Journal. https://www.searchenginejournal.com/custom-instructions-included-recent-chatgpt-ios-app-update/493445/. Accessed 12.6.2024.

4. Garst, Kim. Chatgpt Vision Feature Tutorial: Step by Step Guide - Kim Garst (24 (10 (2023))). Kim Garst. https://kimgarst.com/chatgpt-vision-feature-tutorial/. Accessed 12.6.2024.

5. Brownell, Briana. How to use ChatGPT Image Input for Image Analysis, Math & More. Descript. https://www.descript.com/blog/article/chatgpt-image-input-how-to. Accessed 12.6.2024.