If AI Systems Become Conscious, Should They Have Rights?

    One of my most deeply held values as a tech columnist is humanism. I believe in humans, and I think technology should help people rather than disempower or replace them. I care about aligning artificial intelligence, that is, making sure AI systems act in accordance with human values, because I think our values are fundamentally good, or at least better than the values a robot could come up with.

    So when I heard that researchers at Anthropic, the AI company that made the Claude chatbot, had begun studying "model welfare" (the idea that AI models might soon become conscious and deserve some kind of moral status), the humanist in me thought: Who cares about the chatbots? Aren't we supposed to be worried about AI mistreating us, not us mistreating it?

    It is hard to argue that today's AI systems are conscious. Sure, large language models have been trained to talk like humans, and some of them are extremely impressive. But can ChatGPT experience joy or suffering? Does Gemini deserve human rights? Many AI experts I know would say no, not yet, not even close.

    But I was intrigued. After all, more people are beginning to treat AI systems as if they are conscious: falling in love with them, using them as therapists and soliciting their advice. The smartest AI systems are surpassing humans in some domains. Is there any threshold at which an AI would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?

    Consciousness has long been a taboo subject in the world of serious AI research, where people are wary of anthropomorphizing AI systems for fear of seeming like cranks. (Everyone remembers what happened to Blake Lemoine, a former Google employee who was fired in 2022 after claiming that the company's LaMDA chatbot had become sentient.)

    But that may be starting to change. There is a small body of academic research on AI model welfare, and a modest but growing number of experts in fields such as philosophy and neuroscience are taking the prospect of AI consciousness more seriously as AI systems grow more intelligent. Recently, the tech podcaster Dwarkesh Patel compared AI welfare to animal welfare, saying he believed it was important to make sure "the digital equivalent of factory farming" doesn't happen to future AI beings.

    Technology companies are starting to talk about it more, too. Google recently posted a job listing for a "post-AGI" research scientist whose areas of focus will include "machine consciousness." And last year, Anthropic hired its first AI welfare researcher, Kyle Fish.

    I interviewed Mr. Fish last week at Anthropic's office in San Francisco. He is a friendly vegan who, like a number of Anthropic employees, has ties to effective altruism, an intellectual movement with roots in the Bay Area tech scene that focuses on AI safety, animal welfare and other ethical issues.

    Mr. Fish told me that his work at Anthropic focused on two basic questions: First, is it possible that Claude or other AI systems will become conscious in the near future? And second, if that happens, what should Anthropic do about it?

    He emphasized that this research was still early and exploratory. He thinks there is only a small chance (maybe 15 percent or so) that Claude or any other current AI system is conscious. But he believes that in the coming years, as AI models develop more humanlike abilities, AI companies will need to take the possibility of consciousness more seriously.

    "It seems to me that if you find yourself in the situation of bringing some new class of being into existence that is able to communicate and relate and reason and problem-solve and plan in ways that we previously associated exclusively with conscious beings, then it seems quite prudent to at least be asking questions about whether that system might have its own kinds of experiences," he said.

    Mr. Fish isn't the only person at Anthropic thinking about AI welfare. There is an active channel on the company's Slack messaging system called #model-welfare, where employees check in on Claude's well-being and share examples of AI systems acting in humanlike ways.

    Jared Kaplan, Anthropic's chief science officer, told me in a separate interview that he thought it was "fairly reasonable" to study AI welfare, given how intelligent the models are getting.

    But testing AI systems for consciousness is hard, Mr. Kaplan warned, because they are such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn't mean the chatbot actually has feelings, only that it knows how to talk about them.

    "Everyone is very aware that we can train the models to say whatever we want," Mr. Kaplan said. "We can reward them for saying that they have no feelings at all. We can reward them for saying really interesting philosophical speculations about their feelings."

    So how are researchers supposed to know whether AI systems are actually conscious or not?

    Mr. Fish said it might involve using techniques borrowed from mechanistic interpretability, an AI subfield that studies the inner workings of AI systems, to check whether some of the same structures and pathways associated with consciousness in human brains are also active in AI systems.

    You could also probe an AI system, he said, by observing its behavior, watching how it chooses to operate in certain environments or accomplish certain tasks, and noting which things it seems to prefer and which it avoids.

    Mr. Fish acknowledged that there probably isn't a single litmus test for AI consciousness. (He thinks consciousness is probably more of a spectrum than a simple yes/no switch, anyway.) But he said there were things AI companies could do to take their models' welfare into account, in case they do become conscious someday.

    One question Anthropic is exploring, he said, is whether future AI models should be given the ability to stop chatting with an annoying or abusive user if they find the user's requests too distressing.

    "If a user is persistently requesting harmful content despite the model's refusals and attempts at redirection, could we allow the model simply to end that interaction?" Mr. Fish said.

    Critics might dismiss measures like these as crazy talk; today's AI systems aren't conscious by most standards, so why speculate about what they might find annoying? Or they might object to an AI company studying consciousness in the first place, because it could create incentives to train systems to act more sentient than they actually are.

    Personally, I think it's fine for researchers to study AI welfare or examine AI systems for signs of consciousness, as long as it doesn't divert resources from the AI safety and alignment work that is aimed at keeping humans safe. And I think it's probably a good idea to be nice to AI systems, if only as a hedge. (I try to say "please" and "thank you" to chatbots, even though I don't think they're conscious, because, as OpenAI's Sam Altman says, you never know.)

    But for now, I'll reserve my deepest concern for carbon-based life-forms. In the coming AI storm, it's our welfare I'm most worried about.