AI models can obtain backdoors from surprisingly few malicious documents
Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples was held constant. For GPT-3.5 turbo,…