🤖 AI researchers warn that the "black box" nature of large language models (LLMs) like OpenAI's ChatGPT and Google's Bard makes it tough to delete sensitive data from them. Here's why:

- LLMs are pre-trained on massive text datasets and then fine-tuned to produce coherent outputs.

- Deleting specific files from the training data doesn't remove what the model has already learned from them.

- Guardrails like hard-coded prompts and reinforcement learning from human feedback (RLHF) help, but they suppress information rather than fully delete it.

- Even state-of-the-art model-editing methods like Rank-One Model Editing (ROME) leave facts extractable 29-38% of the time.

- Researchers developed new defense methods, but admit they may always be playing catch-up to attack methods. 🕵️‍♂️