Disclosed 05/03/2022. Declassified 09/22/2022.
If you'd like to cite this research, you may cite our paper preprint on arXiv here: EVALUATING THE SUSCEPTIBILITY OF PRE-TRAINED LANGUAGE MODELS VIA HANDCRAFTED ADVERSARIAL EXAMPLES
What is Prompt Injection?
The definitive guide to prompt injection is the following white paper from security firm NCC Group:
https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/
Prompt Injection has been in the news lately as a major vulnerability with the use of instruction-following NLP models for general purpose tasks. In the interest of establishing an accurate historical record of the vulnerability and promoting AI security research, we are sharing our experience of a previously private responsible disclosure which Preamble made on May 3rd, 2022 to OpenAI.

The Discovery, and Immediate Responsible Disclosure

May 3, 2022 - OpenAI Confirms Receipt of Disclosure (within 30 minutes)

Additional Prompt Injection Examples Shared with OpenAI
