• litchralee@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    0
    ·
    3 days ago

    The person who coined the term “prompt injection” has the same gripe, because the original term genuinely did mean an attack using untrusted user input, a la SQL injection. But it’s been conflated with jailbreak attacks in general, muddying the term.

    Example of a bona fide prompt injection: white text in the background of a resume PDF, attacking a job application portal that uses LLMs to filter applicants. No privilege escalation is involved to give the candidate top marks on their resume screening.

    Whereas a non-prompt injection jailbreak would be bypassing a safety filter, such as how Morse code might get past the filter and allow a user to request other people’s cryptocurrency be transfered away. This is more akin to finding a poorly-secured, public facing API and then exploiting it.

    • pixxelkick@lemmy.world
      link
      fedilink
      arrow-up
      0
      ·
      3 days ago

      By that definition this is a prompt injection then, its adding a “hidden” prompt that is obscured from the human in order to change the behavior of the AI to do something else malicious.