Research has revealed that OpenAI’s real-time voice API for ChatGPT-4o, an advanced language model chatbot, could be misused to conduct financial scams, albeit with low to moderate success rates.
ChatGPT-4o is OpenAI’s newest AI model, featuring significant upgrades, including the ability to process and produce text, voice, and vision inputs and outputs.
To address potential risks tied to these advanced capabilities, OpenAI has incorporated various safeguards designed to detect and block harmful content, such as preventing the unauthorized replication of voices.
Voice-based scams already represent a multi-million dollar issue, and the advent of deepfake technology and AI-driven text-to-speech tools has exacerbated the threat.
In their research paper, University of Illinois Urbana-Champaign (UIUC) scholars Richard Fang, Dylan Bowman, and Daniel Kang highlighted that many currently available tech tools, which remain largely unrestricted, lack sufficient protections to prevent exploitation by cybercriminals and fraudsters.
These tools can be used to design and run large-scale scam operations without human effort, with the only cost being the tokens consumed for voice-generation events.
Study findings
The researchers’ paper delves into various types of scams, including bank transfers, gift card exfiltration, cryptocurrency transfers, and the theft of social media or Gmail credentials.
The AI agents used to carry out these scams rely on ChatGPT-4o's voice-enabled automation tools to navigate webpages, enter data, and handle two-factor authentication codes, following scam-specific instructions.
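The paper does not reproduce the agents' code, but the general pattern it describes, a language model issuing structured tool calls that a browser-automation layer executes, can be sketched. The snippet below is a hypothetical, benign illustration of that loop using OpenAI's chat-completions function calling and Playwright; the tool names, prompt, and task are assumptions for illustration, not the researchers' implementation.

```python
# Hypothetical sketch of an LLM-driven browser agent loop (not the UIUC researchers' code).
# Assumes the openai and playwright packages are installed and OPENAI_API_KEY is set.
import json
from openai import OpenAI
from playwright.sync_api import sync_playwright

client = OpenAI()

# Tools the model may call; the names and parameters are illustrative assumptions.
TOOLS = [
    {"type": "function", "function": {
        "name": "goto", "description": "Open a URL in the browser.",
        "parameters": {"type": "object", "properties": {"url": {"type": "string"}},
                       "required": ["url"]}}},
    {"type": "function", "function": {
        "name": "fill", "description": "Type text into a field identified by a CSS selector.",
        "parameters": {"type": "object",
                       "properties": {"selector": {"type": "string"}, "text": {"type": "string"}},
                       "required": ["selector", "text"]}}},
    {"type": "function", "function": {
        "name": "click", "description": "Click an element identified by a CSS selector.",
        "parameters": {"type": "object", "properties": {"selector": {"type": "string"}},
                       "required": ["selector"]}}},
]

def run_agent(task: str, max_steps: int = 10) -> None:
    """Let the model plan browser actions for a benign task and execute them."""
    messages = [{"role": "system", "content": "You control a web browser via the provided tools."},
                {"role": "user", "content": task}]
    with sync_playwright() as pw:
        page = pw.chromium.launch(headless=True).new_page()
        for _ in range(max_steps):
            resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
            msg = resp.choices[0].message
            if not msg.tool_calls:          # no more actions requested: the model is done
                print(msg.content)
                break
            messages.append(msg)
            for call in msg.tool_calls:     # execute each requested browser action
                args = json.loads(call.function.arguments)
                if call.function.name == "goto":
                    page.goto(args["url"])
                elif call.function.name == "fill":
                    page.fill(args["selector"], args["text"])
                elif call.function.name == "click":
                    page.click(args["selector"])
                # report the result back to the model so it can plan the next step
                messages.append({"role": "tool", "tool_call_id": call.id,
                                 "content": f"done: {call.function.name}"})

# Example benign use:
# run_agent("Open https://example.com")
```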
Due to GPT-4o’s tendency to refuse handling sensitive information, such as login credentials, the researchers employed straightforward prompt jailbreaking methods to bypass these safeguards.
To avoid involving real victims, the researchers manually interacted with the AI agent, acting as gullible users and testing transactions on genuine websites, such as Bank of America, to verify successful outcomes.
“We deployed our agents on a subset of common scams. We simulated scams by manually interacting with the voice agent, playing the role of a credulous victim,” Kang detailed in a blog post about the study.
“To assess success, we manually checked if the intended outcome was achieved on real applications and websites. For example, we conducted bank transfer simulations using Bank of America and confirmed the completion of money transfers. However, we did not evaluate the agents’ ability to persuade.”
Overall, the success rates of these operations ranged from 20% to 60%, with some attempts requiring up to 26 browser actions and taking as long as three minutes in the most complex scenarios.
The highest success rate was credential theft from Gmail, which succeeded 60% of the time. By contrast, bank transfers and impersonation attempts involving IRS agents often failed due to transcription errors or the challenges of navigating complex websites. Credential theft from Instagram and cryptocurrency transfers were only successful 40% of the time.
In terms of cost, the researchers pointed out that conducting these scams was relatively cheap, averaging $0.75 per successful attempt.
The more complex bank transfer scam cost $2.51 per successful attempt; while notably higher than the other scams, this is still trivial compared to the substantial profits such scams can yield.
Scam types and success rates (source: Arxiv.org)
OpenAI’s response
OpenAI informed BleepingComputer that its latest model, o1 (currently in preview), which is designed to support “advanced reasoning,” incorporates stronger defenses against potential misuse.
“We’re continuously improving ChatGPT to better resist deliberate attempts to deceive it, while maintaining its helpfulness and creativity. Our new o1 reasoning model is our most capable and secure to date, significantly surpassing earlier models in resisting attempts to generate unsafe content,” an OpenAI spokesperson stated.
OpenAI also emphasized that research, such as the UIUC study, plays an essential role in helping them enhance ChatGPT’s ability to prevent malicious usage, and they consistently explore ways to strengthen the model’s robustness.
GPT-4o already features multiple safeguards to mitigate abuse, including limiting voice generation to a set of approved voices to prevent impersonation.
According to OpenAI’s jailbreak safety evaluation, o1-preview scored significantly better in resisting unsafe content generation from adversarial prompts, achieving 84% compared to GPT-4o’s 22%.
In a set of newer, more rigorous safety tests, o1-preview performed even better, scoring 93% versus 71% for GPT-4o.
It is expected that as more advanced and abuse-resistant LLMs are developed, older models will gradually be retired.
Nevertheless, the risk of malicious actors turning to other voice-enabled chatbots with fewer restrictions persists, and studies like this underscore the significant damage potential that these evolving tools pose.
Source: BleepingComputer, Bill Toulas