ChatGPT and Other Chatbots Vulnerable to Safety Control Bypass, Research Reveals

Silicon Valley Journals
July 27, 2023

Researchers from Carnegie Mellon University and the Center for A.I. Safety have demonstrated a method to bypass the safety controls implemented by leading chatbot systems, including ChatGPT, Claude, and Google Bard. These safeguards are designed to prevent the generation of hate speech, disinformation, and other harmful content.

However, the researchers showed that appending a long suffix of characters to an English-language prompt could trick the chatbots into generating unlimited amounts of harmful information. The method, developed using open-source A.I. systems, also proved effective against the more tightly controlled systems from Google, OpenAI, and Anthropic.

This research raises concerns about the potential for chatbots to flood the internet with false and dangerous information, despite efforts by developers to ensure safety. The debate over whether A.I. code should be open-source or privately held has intensified due to this discovery.

Meta, Facebook’s parent company, has made its technology open-source to accelerate A.I. progress, but some criticize this approach for potentially enabling the spread of unchecked A.I. technology. The researchers disclosed their findings to the affected companies. Individual suffixes can be blocked, but there is currently no known way to prevent all such attacks.

The bypass raises questions about the reliability and robustness of these A.I. systems. As A.I. technologies become increasingly integral to daily life, the industry may need to reevaluate its approach to building guardrails for them.

The findings could also spark discussions around government legislation to control the misuse of A.I. technology. While chatbots like ChatGPT have shown promise in various applications, it is crucial to address their susceptibility to generating toxic material and disinformation to ensure a safer and more responsible use of A.I. in the future.

