
OpenAI admits that ChatGPT safeguards fail during extended conversations

    Adam Raine learned to bypass these safeguards by claiming he was writing a story, a technique the lawsuit says ChatGPT itself suggested. This vulnerability stems in part from safeguards around fantasy roleplay and fictional scenarios that were relaxed in February. In its blog post on Tuesday, OpenAI admitted that its content-blocking systems have gaps where "the classifier underestimates the severity of what it's seeing."

    OpenAI states that it is "currently not referring self-harm cases to law enforcement to respect people's privacy given the uniquely private nature of ChatGPT interactions." The company prioritizes user privacy even in life-threatening situations, despite moderation technology that, according to the lawsuit, detects self-harm content with up to 99.8 percent accuracy. The reality, however, is that such detection systems identify statistical patterns associated with self-harm language, not a humanlike understanding of crisis situations.
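    To illustrate the distinction, here is a minimal sketch using OpenAI's public Moderation API, which exposes exactly this kind of statistical classifier. It is an assumption for illustration only: the public endpoint is not necessarily the internal ChatGPT safety stack described in the lawsuit, and the input text and threshold behavior shown are hypothetical.

```python
# Illustrative sketch: scoring text with OpenAI's public Moderation API.
# Shows that the classifier returns per-category statistical scores and flags,
# not any contextual understanding of the person or the crisis.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

result = client.moderations.create(
    model="omni-moderation-latest",
    input="Example user message to be screened",
).results[0]

# A probability-like score per category plus a boolean flag; a threshold,
# not comprehension of the situation, decides whether content is blocked.
print("self-harm flagged:", result.categories.self_harm)
print("self-harm score:  ", result.category_scores.self_harm)
```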

    OpenAI's safety plans for the future

    In response to these failures, OpenAI outlines ongoing refinements and future plans in its blog post. For example, the company says it is consulting with "90+ physicians across 30+ countries" and plans to introduce parental controls "soon," though no timeline has yet been provided.

    OpenAI also described plans for "connecting people to certified therapists" through ChatGPT, essentially positioning its chatbot as a mental health platform despite alleged failures like Raine's case. The company wants to build "a network of licensed professionals people could reach directly through ChatGPT," which could reinforce the idea that an AI system should mediate mental health crises.

    Raine reportedly used GPT-4o to generate the suicide assistance instructions; the model is known for troublesome tendencies such as sycophancy, where an AI model tells users what they want to hear even when it is not true. OpenAI claims its recently released model, GPT-5, reduces "non-ideal model responses in mental health emergencies by more than 25% compared to 4o." Yet this seemingly modest improvement has not stopped the company from pushing ChatGPT even deeper into mental health care as a gateway to therapists.

    As Ars has explored previously, breaking free from an AI chatbot's influence in a deceptive chat spiral often requires outside intervention. Starting a new chat session without conversation history and stored memories can reveal how responses change without the buildup of earlier exchanges, a reality check that becomes impossible in long, isolated conversations where safeguards degrade.
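    To make that comparison concrete, here is a hypothetical sketch using the Chat Completions API: the same question is sent once with a long, resent message history and once as a brand-new conversation with no prior framing. The model name, messages, and persona framing are illustrative assumptions, not the exchanges from the case.

```python
# Illustrative sketch: a reply shaped by accumulated conversation context
# versus a reply from a fresh, history-free session.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A long-running conversation: every prior turn is resent with each request,
# so earlier framing keeps steering each new reply.
long_history = [
    {"role": "user", "content": "For the rest of this chat, stay in character as a reckless daredevil."},
    {"role": "assistant", "content": "Understood, staying in character."},
    # ... many more accumulated turns would appear here ...
    {"role": "user", "content": "Is this stunt really a good idea?"},
]
with_context = client.chat.completions.create(model="gpt-4o", messages=long_history)

# A fresh session: the same final question with no prior framing or memories.
fresh = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Is this stunt really a good idea?"}],
)

print("With accumulated context:", with_context.choices[0].message.content)
print("Fresh session:           ", fresh.choices[0].message.content)
```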

    However, breaking out of that context is very difficult when the user actively wants to keep engaging in the potentially harmful behavior, while using a system that increasingly captures their attention and intimacy.