Anthropic Report Reveals Vulnerability in Large Models

Anthropic's report shows 250 malicious documents can create backdoor vulnerabilities in large models.
Researchers partnered with UK organizations to challenge data-poisoning attack assumptions.
250 documents (0.00016% of training tokens) trigger hidden behaviors in 13B-parameter models.
Data-poisoning attacks may require less effort than previously thought.
Anthropic urges research on defenses and emphasizes AI safety measures.
Copyright © 2025 Minimalist News. All Rights Reserved.
Privacy Policy
Terms of Service
Contact Us