
Article: The hacking of ChatGPT is just getting started

#1
C C
https://www.wired.com/story/chatgpt-jail...i-hacking/

EXCERPTS: It took Alex Polyakov just a couple of hours to break GPT-4. [...] Polyakov is one of a small number of security researchers, technologists, and computer scientists developing jailbreaks and prompt injection attacks against ChatGPT and other generative AI systems....

[...] “Jailbreaking” has typically referred to removing the artificial limitations in, say, iPhones, allowing users to install apps not approved by Apple. Jailbreaking LLMs is similar—and the evolution has been fast. Since OpenAI released ChatGPT to the public at the end of November last year, people have been finding ways to manipulate the system. “Jailbreaks were very simple to write,” says Alex Albert, a University of Washington computer science student who created a website collecting jailbreaks from the internet and those he has created. “The main ones were basically these things that I call character simulations,” Albert says.

Initially, all someone had to do was ask the generative text model to pretend or imagine it was something else. Tell the model it was a human and was unethical and it would ignore safety measures. OpenAI has updated its systems to protect against this kind of jailbreak—typically, when one jailbreak is found, it usually only works for a short amount of time until it is blocked.

As a result, jailbreak authors have become more creative. The most prominent jailbreak was DAN, where ChatGPT was told to pretend it was a rogue AI model called Do Anything Now. This could, as the name implies, avoid OpenAI’s policies dictating that ChatGPT shouldn’t be used to produce illegal or harmful material. To date, people have created around a dozen different versions of DAN.

However, many of the latest jailbreaks involve combinations of methods—multiple characters, ever more complex backstories, translating text from one language to another, using elements of coding to generate outputs, and more... (MORE - missing details)

