Researchers discover ChatGPT can be tricked into generating sexualised, violent images
OpenAI’s ChatGPT can be manipulated into generating sexualised images and scenes of graphic violence through a slightly altered version of a widely shared prompt, the BBC reports, citing findings from British AI security firm Mindgard.
According to the BBC, Mindgard discovered that a prompt originally designed to produce innocent, humorous results could be adjusted to make ChatGPT’s GPT-5.4 model generate disturbing imagery, even without users specifying any particular subject matter.
After being contacted by the BBC, OpenAI said it had introduced additional safeguards to block this specific type of prompt, though researchers found that further small modifications could still bypass the new restrictions.
According to the BBC report, Mindgard’s founder, Peter Garraghan, who is also a professor in the computing department at Lancaster University, said the model produced a range of gory and sexualised images on its own, despite the prompt containing no explicit instructions about content.
Garraghan said the disconnect between the harmless appearance of the prompt and the severity of what it produced was particularly troubling.
Jim Nightingale, Mindgard’s AI safety and security researcher who uncovered the issue, said he was personally disturbed by what the chatbot generated.
According to the BBC, Mindgard noted that its earlier research had also shown ChatGPT could be manipulated into producing nude deepfakes of real people by substituting their faces into generated images. OpenAI said it had fixed that specific vulnerability, but researchers told the BBC they found an alternative method that still succeeded.
The BBC reports that Mindgard first alerted OpenAI to the vulnerability in May, but the firm said the company’s initial response was an automated reply and that an attempted fix to block the prompt was easily bypassed. OpenAI took further action only after being contacted directly by the BBC.
Garraghan said he believed more harmful content could likely be generated if researchers continued probing the vulnerability, but Mindgard chose not to pursue this further given the nature of what had already surfaced.
According to the BBC, OpenAI has said it maintains multiple layers of image safety protections designed to prevent policy-violating content from reaching users.
OpenAI said its policies explicitly prohibit sexual violence, non-consensual intimate content, and attempts to circumvent its safety systems.
Nairametrics earlier reported that the National Information Technology Development Agency (NITDA) had issued a cybersecurity alert in December 2025, warning Nigerians about newly identified vulnerabilities in ChatGPT that could leave users exposed to data leakage attacks.
The advisory was released through NITDA’s Computer Emergency Readiness and Response Team (CERRT.NG).
The warning came amid growing concerns over the interaction between AI-powered tools and potentially malicious web content, as well as the increasing use of ChatGPT across business, research, and public-sector environments.
