OpenAI unveils new AI that can see, hear and speak

Technology

ChatGPT creator’s latest flagship product can respond as quickly as a human

Updated On: Tue, 14 May 2024 03:23:45 PKT

(Web Desk) - OpenAI, the creator of viral chatbot ChatGPT, has unveiled a new AI model that can interact with the world via audio, vision and text in real time.

GPT-4o is the latest flagship product for the Microsoft-backed company, aiming to offer users a “more natural human-computer interaction”.

In a presentation on Monday, OpenAI said its latest AI could respond to queries in less than a third of a second – similar to human response time in conversation.

Using a smartphone’s camera and microphone, GPT-4o is capable of understanding audio and visual inputs, while using the speakers to respond in a personalised and natural voice.

OpenAI CEO Sam Altman said the new technology “feels like magic”, writing in a blog post that it was “the best computer interface” he had ever used.

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” he wrote.

“The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful.”

Unlike its other advanced AI models, OpenAI said it would offer GPT-4o for free, making it available within the next few weeks.

In an effort to prevent misuse or potential harm, OpenAI said it carried out extensive testing that covered everything from cyber security to psychology.

“We tested both pre-safety-mitigation and post-safety-mitigation versions of the model, using custom fine-tuning and prompts, to better elicit model capabilities,” the company explained in a blog post introducing the product.

“GPT-4o has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities... We will continue to mitigate new risks as they’re discovered.”

OpenAI acknowledged that its latest AI model has several limitations that it hopes to overcome with future versions.

Videos of the AI making mistakes showed GPT-4o switching between languages without being prompted, making errors with language translation, and mispronouncing someone’s name as “Nacho”.

The announcement comes just a day ahead of Google I/O, the tech giant’s biggest event of the year that is expected to have a heavy focus on artificial intelligence.

“All eyes will be on how AI becomes more integrated into connected devices, particularly smartphones, given the sheer scale of the opportunity,” Leo Gebbie, a principal analyst at CSS Insight, told The Independent ahead of the event.

“Google needs to clearly articulate the benefits of AI to avoid consumers succumbing to AI fatigue.”

OpenAI unveils new AI that can see, hear and speak

Related News

Us-Iran War Update Pro-Iran Rallies Erupt In Iraq Situation Turns Tense

Iran Another Terrible Attack Trump Shocked Israel On Chaos

Attack On Israel Iranian Public Vows Revenge for the Supreme Leaders Martyrdom

Iranian Military in Action Attack On Israel Chaos Erupt in Israel & Devastation Reported

Who Will Rule Iran After Khamenei? Iran in Crisis Short Glimpse on Life of Supreme Leader

Pak Army Gains More Ground Major Shift on the Battlefield Pak Afghan War Conflict

High Alert on Borders Pak - Afghan War Operation Ghazab lil Haq Latest Situation

Chaos Erupts in Afghanistan Pakistan Army Strikes Operation Ghazab lil Haq

Pakistan Army Launches Another Strike, Afghan Post Destroyed in Qila Saifullah Sector Breaking News

War Update Chaos Erupt in Afghanistan Pakistan Army Strikes Back

English

Urdu

Shows

videos

Video Headlines

Coronavirus

PSL 7

Newspaper

Follow Us

Links

Blogs