🤖 AI researchers show that the "black box" nature of large language models (LLMs) like OpenAI's ChatGPT and Google's Bard makes it tough to truly delete sensitive data. Here's why:
- LLMs are pre-trained on massive text datasets, then fine-tuned to produce coherent outputs.
- Deleting specific files from the training data doesn't remove what the model has already learned from them.
- Guardrails like hard-coded prompts and reinforcement learning from human feedback (RLHF) can suppress outputs, but don't actually delete the underlying info.
- State-of-the-art methods like Rank-One Model Editing (ROME) still leave facts extractable 29-38% of the time.
- Researchers developed new defense methods, but admit they may always be playing catch-up to attack methods. 🕵️‍♂️
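To see why guardrails suppress rather than delete, here's a minimal toy sketch (all names and data are hypothetical, and the "model" is a stand-in dictionary, not a real LLM): a filter blocks sensitive strings at the output layer, but the knowledge itself remains intact underneath.

```python
# Hypothetical sketch: a hard-coded guardrail censors outputs, but the
# underlying model still "knows" the fact -- nothing is deleted.

BLOCKED_TERMS = {"jane doe's ssn"}  # hypothetical sensitive phrase

def model_generate(prompt: str) -> str:
    # Stand-in for an LLM whose weights have memorized sensitive data.
    memorized = {"what is jane doe's ssn?": "jane doe's ssn is 123-45-6789"}
    return memorized.get(prompt.lower(), "i don't know")

def guarded_generate(prompt: str) -> str:
    output = model_generate(prompt)
    # The guardrail only censors the surface string...
    if any(term in output.lower() for term in BLOCKED_TERMS):
        return "[redacted]"
    return output

# The guarded path blocks the direct query:
print(guarded_generate("What is Jane Doe's SSN?"))  # prints "[redacted]"
# ...but anything that bypasses or jailbreaks the filter reaches
# the memorized fact, because it was never removed from the model:
print(model_generate("what is jane doe's ssn?"))  # prints the SSN
```

This is why attacks that rephrase queries or probe a model's internal states can still extract "deleted" facts: the guardrail sits in front of the knowledge instead of erasing it.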