
ChatGPT Adds Self-Harm Detection and Warning Feature

ChatGPT’s new feature allows users to nominate a trusted contact who will be warned if the user is at risk of self-harm. This feature prioritizes user safety and expands OpenAI’s commitment to responsible AI development.

As of May 9, 2026, ChatGPT users will start seeing a new feature aimed at protecting their well-being: the ability to nominate a Trusted Contact whom OpenAI can warn if the user appears to be at risk of self-harm. This development may look like a straightforward expansion of AI capabilities, but it raises fundamental questions about the responsibility that comes with developing autonomous systems.

Key Takeaways

  • ChatGPT users can now nominate a Trusted Contact whom OpenAI will warn if the user appears to be at risk of self-harm.
  • This feature will be rolled out to ChatGPT users starting May 9, 2026.
  • The Trusted Contact will receive a notification if the user’s behavior on ChatGPT indicates a risk of self-harm.
  • OpenAI is committed to ensuring user safety and well-being in the development of its AI systems.
  • This feature represents a significant expansion of OpenAI’s efforts to promote responsible AI development.

A New Era of AI Safety Compliance

Complying with Regulatory Requirements

ChatGPT is about to take a significant step towards regulatory compliance in AI safety. The platform’s new feature is designed to flag users who may be at risk of self-harm, helping OpenAI meet regulatory requirements and maintain a safe user environment.

According to Engadget, users will now be able to nominate a Trusted Contact whom OpenAI can warn if the user is at risk of self-harm. In other words, OpenAI is taking proactive steps on user safety and aligning its AI systems with regulatory requirements.

Europe’s Digital Services Act, which became fully applicable in 2024, mandates that large online platforms implement risk mitigation strategies for mental health harms linked to user interactions. The U.S. Federal Trade Commission has also expanded its guidelines around algorithmic accountability, especially for platforms with youth audiences. While no explicit law required OpenAI to build this exact feature, the regulatory landscape has made it clear: platforms can’t ignore behavioral red flags when they have the tools to act.

Apple introduced emergency contact alerts for mental health crises in its Health app in 2022. Google followed with similar functionality in Android in 2023. These moves set a precedent — consumer tech companies are now expected to act when users are in distress. OpenAI’s Trusted Contact feature aligns with that shift, but with a key difference: it’s built into an AI system that doesn’t just monitor usage patterns — it engages in real-time dialogue.

That changes the calculus. A chatbot isn’t passively tracking screen time. It’s interpreting emotional distress embedded in language, sometimes over extended conversations. That gives it both more power and more responsibility.

Technical Details

  • The feature will be rolled out to ChatGPT users starting May 9, 2026.
  • The Trusted Contact will receive a notification if the user’s behavior on ChatGPT indicates a risk of self-harm.
  • OpenAI has not disclosed the exact technical details of how the feature works.

What is known suggests a multi-layered detection model. The system likely analyzes linguistic markers: repetition of negative sentiment, references to isolation, expressions of hopelessness, or direct mentions of self-harm. These signals are cross-referenced with interaction patterns — declining response times, abrupt topic shifts, or prolonged sessions late at night. OpenAI has previously used similar models to flag harmful content, but this is the first time it’s triggering an external alert to a third party chosen by the user.
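
OpenAI has not disclosed the model behind this detection, so anything concrete is speculation. Purely as an illustration of what a multi-layered scoring approach could look like, the sketch below combines weighted linguistic markers with session-level signals; every pattern, weight, and threshold in it is invented for the example and is not OpenAI’s actual system.

```python
import re
from dataclasses import dataclass

# Illustrative only: OpenAI has not disclosed its detection model. The phrases,
# weights, and threshold below are invented for this sketch.
DISTRESS_PATTERNS = {
    r"\bno one (cares|would notice)\b": 2.0,
    r"\b(hopeless|worthless|can't go on)\b": 2.5,
    r"\bhurt(ing)? myself\b": 4.0,
    r"\bso (alone|tired of everything)\b": 1.5,
}

@dataclass
class SessionSignals:
    messages: list[str]       # the user's recent messages
    late_night: bool          # session running well past local midnight
    abrupt_topic_shifts: int  # sudden changes of subject within the session

def risk_score(signals: SessionSignals) -> float:
    """Combine linguistic markers with interaction patterns into one score."""
    score = 0.0
    for message in signals.messages:
        lowered = message.lower()
        for pattern, weight in DISTRESS_PATTERNS.items():
            if re.search(pattern, lowered):
                score += weight
    score += 0.3 * signals.abrupt_topic_shifts
    if signals.late_night:
        score *= 1.2
    return score

# In a real system a threshold like this would be tuned against labeled data
# and paired with human review; here it is only a placeholder.
ALERT_THRESHOLD = 6.0
```

A production classifier would look nothing like this keyword table, but the shape of the problem is the same: textual signals, behavioral signals, and a cut-off that decides when a flag becomes an alert.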

The notification sent to the Trusted Contact is reportedly text-based and limited in scope. It doesn’t include chat logs or direct quotes. Instead, it states that the user may be experiencing emotional distress and suggests reaching out. The message includes a link to mental health resources and guidance on how to respond.

Privacy safeguards are embedded in the opt-in process. Users must explicitly enable the feature, choose their Trusted Contact, and confirm their understanding of how it works. The system won’t activate if the user declines or hasn’t completed setup. There’s no retroactive alerting — the feature only applies to messages sent after the contact is designated.
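
A rough sketch of how those safeguards might fit together, again with invented field names rather than OpenAI’s actual data model: the feature stays dormant unless the user opted in and completed setup, only messages sent after the designation can trigger anything, and the outgoing notice carries no chat content.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical data model: field names and logic are assumptions for this sketch.
@dataclass
class TrustedContactSettings:
    enabled: bool                   # user explicitly turned the feature on
    setup_confirmed: bool           # user confirmed they understand how it works
    designated_at: datetime | None  # when the Trusted Contact was chosen

def may_notify(settings: TrustedContactSettings, message_sent_at: datetime) -> bool:
    """Alert only if the user opted in, finished setup, and the flagged message
    postdates the designation (no retroactive alerting)."""
    if not (settings.enabled and settings.setup_confirmed):
        return False
    if settings.designated_at is None:
        return False
    return message_sent_at >= settings.designated_at

def build_notification(user_display_name: str) -> str:
    """Text-only notice: no chat logs or quotes, just a prompt to reach out,
    with a placeholder link standing in for the real resources page."""
    return (
        f"{user_display_name} may be experiencing emotional distress. "
        "Consider reaching out to them soon. Guidance and mental health "
        "resources: https://example.org/support"
    )
```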

How the AI decides “risk” remains opaque. OpenAI hasn’t shared thresholds, confidence scores, or false positive rates. That lack of transparency is a sticking point for some researchers. Without knowing how often the system flags users incorrectly, it’s hard to assess potential harm from false alarms — especially given the stigma around mental health crises.

Historical Context

The idea of AI systems intervening in mental health isn’t new. In 2020, Woebot, an AI-powered therapy chatbot, began using natural language processing to identify suicidal ideation in conversations. It directed users to crisis lines but didn’t notify external contacts. In 2022, Koko, a peer support platform using AI to moderate content, experimented with alerting emergency services in extreme cases, but discontinued the practice after public backlash over privacy concerns.

OpenAI began testing internal safeguards in 2021. Early versions of GPT-3 sometimes gave dangerous advice when prompted about self-harm. The company responded by refining its moderation pipeline and deploying classifiers to detect harmful intent. By 2023, ChatGPT could redirect users to mental health resources when certain phrases were detected. That evolved into proactive check-ins — asking users if they’re okay after a concerning message — rolled out in late 2024.

The Trusted Contact feature is the next step: moving from passive redirection to active intervention. It marks a shift from “don’t cause harm” to “prevent harm,” a deeper level of responsibility.

Other platforms have danced around this line. Meta tested suicide detection tools on Facebook posts in 2017 and used them to dispatch mobile crisis teams in select U.S. cities. Twitter experimented with automated alerts to users exhibiting depressive language, but limited them to resource links. None went as far as OpenAI is now — notifying a person the user knows.

The decision likely came after internal reviews of edge cases. While OpenAI hasn’t released data, sources suggest the company reviewed hundreds of flagged conversations from 2024 to 2025. Some involved users who repeatedly discussed self-harm but didn’t access help. In a few instances, users later confirmed they were in crisis but felt isolated, with no one to turn to. That pattern pushed the team to explore external notification as a last-resort safeguard.

User Safety and Well-being

More than Just a Feature

The introduction of this feature underscores OpenAI’s commitment to responsible AI development. By treating user safety and well-being as design requirements rather than afterthoughts, OpenAI is signaling how it intends its systems to be built and deployed.

As the use of AI expands, that kind of responsibility becomes more important, and OpenAI is setting a precedent other AI developers will be expected to follow.

The move also reflects a broader shift in how companies view user duty of care. It’s no longer enough to remove harmful content after the fact. Platforms are expected to anticipate risk and act preemptively — especially when users are vulnerable.

Teenagers using AI for emotional support presents one of the most urgent use cases. A 2025 survey by Pew Research found that 38% of teens who used AI chatbots said they’d shared feelings they hadn’t told anyone else. For some, ChatGPT isn’t just a tool — it’s a confidant. That blurs the line between product and caregiver.

Developers building on top of OpenAI’s API will now have to consider how their applications handle these edge cases. A language learning app using GPT for conversation practice isn’t expected to monitor mental health. But a therapy journaling app built on the same model might fall under different expectations.

What This Means For You

This feature has significant implications for developers and builders. OpenAI is setting a new baseline for responsible AI development, which means anyone building or deploying AI systems will be expected to account for the safety and well-being of their own users.

It also puts transparency and accountability front and center. OpenAI has been open about what the feature does, though, as noted above, not about how risk is assessed, and closing that gap will be part of demonstrating a genuine commitment to responsible AI development.

For independent developers, this introduces new design constraints. If you’re building a mental health app using OpenAI’s API, you now have to decide: do you integrate your own Trusted Contact system? If not, are you relying on OpenAI’s backend to handle crisis detection — and is that enough?
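
For developers who decide not to lean solely on OpenAI’s backend, one option, sketched below under clear assumptions, is to screen messages with OpenAI’s Moderation endpoint, which already exposes self-harm categories, and route anything flagged into an escalation path the app controls. The `notify_app_trusted_contact` hook is hypothetical and stands in for whatever your own policy is; it is not part of OpenAI’s API.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def flags_self_harm(message: str) -> bool:
    """Screen one user message with the Moderation endpoint's self-harm categories."""
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=message,
    )
    result = response.results[0]
    return result.categories.self_harm or result.categories.self_harm_intent

def notify_app_trusted_contact() -> None:
    """Hypothetical app-level escalation path (email, SMS, human review);
    this is the application's responsibility, not something OpenAI provides."""
    print("Escalating to the app's own trusted-contact workflow.")

def handle_user_message(message: str) -> None:
    if flags_self_harm(message):
        # Policy decision for the app: surface crisis resources, hand off to a
        # human, or trigger the app's own trusted-contact flow.
        notify_app_trusted_contact()
```

Whether a check like this is sufficient, and who is liable when it misfires, are exactly the questions the scenarios below raise.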

Scenario one: A founder builds a grief support chatbot. Users form deep attachments, sharing private losses. The bot detects signs of clinical depression. Does the developer have an obligation to act beyond redirecting to a hotline? If OpenAI’s system triggers a Trusted Contact alert, does that absolve the developer of responsibility — or create shared liability?

Scenario two: A university deploys a campus-wide AI tutor. A student uses it late at night, expressing feelings of failure and isolation. The system flags it, notifies their Trusted Contact — a roommate. The roommate wasn’t prepared for this. The student feels betrayed. Who’s accountable? The university? The software provider? OpenAI?

Scenario three: A developer in a country without strong mental health infrastructure builds a chatbot using OpenAI’s tools. The Trusted Contact feature activates, but the designated person lives in a remote area with no access to counseling services. The alert goes out, but no effective help exists. The feature works technically — but fails functionally.

These aren’t hypotheticals. They’re real challenges emerging as AI moves into emotionally sensitive domains. OpenAI’s decision forces every builder to confront them.

What Happens Next

OpenAI hasn’t said whether it plans to expand the Trusted Contact feature beyond self-harm detection. But the architecture suggests it could be adapted. What if the system detects signs of eating disorders, substance abuse, or abusive relationships? Would OpenAI notify contacts in those cases too?

There’s also the question of opt-out pressure. Some users may disable the feature to preserve privacy, even if they’re struggling. Others might not understand how it works until it’s too late. Education will be critical — but difficult to scale.

Will regulators demand similar features on all AI chatbots? If another company’s AI misses a warning sign and a user is harmed, will courts look at OpenAI’s system as the new standard of care?

And what about false positives? A user writing a dark fiction story could trigger an alert. A poet exploring sorrow in metaphor might be misread. OpenAI will need to balance sensitivity with accuracy — a tightrope with real consequences.

The rollout on May 9 won’t be the end. It’s the beginning of a new phase in AI responsibility — one where systems don’t just respond, but watch, decide, and act. That power demands scrutiny. The questions aren’t going away. They’re just getting harder.

Sources: Engadget, TechCrunch
