Researchers find instances of systems double-crossing opponents, bluffing, pretending to be human and modifying behaviour in tests
They can outwit humans at board games, decode the structure of proteins and hold a passable conversation, but as AI systems have grown in sophistication, so has their capacity for deception, scientists warn.
The analysis, by Massachusetts Institute of Technology (MIT) researchers, identifies wide-ranging instances of AI systems double-crossing opponents, bluffing and pretending to be human. One system even altered its behaviour during mock safety tests, raising the prospect of auditors being lured into a false sense of security.