Social chat moderation program
social-chat-moderation
Domain: social-safety
Type: mixed
Description
Platforms that host user-to-user chat or private messaging (DMs) increasingly carry duty-of-care obligations distinct from public-content moderation: the audience is smaller, the harms (grooming, sextortion, harassment) are higher-velocity, and regulators have narrowed the privacy framing of a 'private message' where the platform is in a position to detect and act. The control has three pieces:
- A chat-moderation policy that defines what scanning, classifier matching, or human review applies, with explicit carve-outs for end-to-end encrypted threads.
- An enforcement workflow (sketched below) that triggers on signal (slurs, CSAM hash match, grooming pattern, threat keywords), with action options ranging from warning to message block to account suspension to NCMEC referral.
- A transparency mechanism (Statement of Reasons under DSA Art. 17, takedown ledger for the OSA) that makes the enforcement auditable.

For platforms with under-13 or under-18 user segments, chat moderation upgrades from 'recommended' to 'expected' under the California AADC, UK OSA Part 3, and the EU DSA's enhanced obligations on platforms reaching minors.
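A minimal sketch of the signal-to-action ladder such an enforcement workflow implements, in TypeScript. All type and function names here are hypothetical, and the signal-to-action mapping is illustrative; a real deployment would tune it to its own policy and route ambiguous signals to human review.

```typescript
// Hypothetical signal-to-action ladder for the enforcement workflow.
// Names and mappings are illustrative, not a vendor API.

type Signal = "slur" | "threat_keyword" | "grooming_pattern" | "csam_hash_match";
type Action = "warn" | "message_block" | "account_suspension" | "ncmec_referral";

interface ChatMessage {
  threadId: string;
  senderId: string;
  isE2EE: boolean; // end-to-end encrypted threads are carved out of scanning by policy
  text: string;
}

// Map detected signals onto the action ladder: warning -> message block ->
// account suspension -> NCMEC referral.
function decideActions(msg: ChatMessage, signals: Signal[]): Action[] {
  if (msg.isE2EE) return []; // policy carve-out: no content scanning on E2EE threads

  const actions = new Set<Action>();
  for (const signal of signals) {
    switch (signal) {
      case "slur":
        actions.add("warn"); // lowest rung: user-facing warning
        break;
      case "threat_keyword":
        actions.add("message_block"); // stop delivery, queue for human review
        break;
      case "grooming_pattern":
        // Higher-velocity harm: block the message and suspend the account.
        actions.add("message_block");
        actions.add("account_suspension");
        break;
      case "csam_hash_match":
        // A hash match against known CSAM triggers the full ladder,
        // including the NCMEC (CyberTipline) referral.
        actions.add("message_block");
        actions.add("account_suspension");
        actions.add("ncmec_referral");
        break;
    }
  }
  return Array.from(actions);
}
```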
Required by (3 regulations)
- DSA
Art. 16-17 (notice-and-action plus statement of reasons), Art. 28 (minor-protection systems on platforms reaching minors).
- UK OSA
Part 3 illegal-content and child-safety duties: chat services with UK users must implement proportionate moderation systems. (OSA 2023 Part 3)
- KOSA
(If enacted) §3 platform duty of care to mitigate harms to minors, including bullying, sexual exploitation, and mental-health-harming content patterns in chat surfaces.
Fulfilled by (4)
- community-sift-two-hat · full · medium effort · $$$
Microsoft / Two Hat Community Sift: real-time chat classification with platform-level risk tagging.
- activefence · full · medium effort · $$$
ActiveFence: trust-and-safety platform with chat-pattern detection, grooming-signal classifiers, and threat-intel feeds.
- hive · partial · low effort · $$
Hive's text / image moderation APIs cover chat triage; pair with a workflow tool for enforcement actions.
- In-house build · high effort
Build a chat-moderation pipeline: classifier (managed or custom), enforcement workflow, NCMEC integration, SoR export. Typically 3-6 months of engineering effort; a wiring sketch follows below.
ClearLaunch does not accept payment from vendors.
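For the in-house option, here is one way the four pieces might wire together, reusing the Signal, Action, ChatMessage, and decideActions definitions from the sketch above. Every interface below (Classifier, EnforcementLedger, NcmecQueue) is a hypothetical stand-in; substitute the managed classifier SDK, case-management store, and NCMEC reporting integration you actually use.

```typescript
// Illustrative wiring of the four in-house pieces; reuses Signal, Action,
// ChatMessage, and decideActions from the sketch above. Every interface
// below is an assumption, not a real library API.

interface Classifier {
  classify(text: string): Promise<Signal[]>; // managed vendor API or custom model
}

interface EnforcementLedger {
  // One append-only row per enforcement decision: the single source of truth
  // from which SoR exports and OSA takedown audits are generated.
  record(entry: {
    messageId: string;
    signals: Signal[];
    actions: Action[];
    at: Date;
  }): Promise<void>;
}

interface NcmecQueue {
  // Feeds the documented NCMEC referral procedure (CyberTipline filing).
  enqueueReferral(messageId: string, senderId: string): Promise<void>;
}

async function moderate(
  msg: ChatMessage & { messageId: string },
  deps: { classifier: Classifier; ledger: EnforcementLedger; ncmec: NcmecQueue },
): Promise<Action[]> {
  // E2EE threads skip classification entirely, per the policy carve-out.
  const signals = msg.isE2EE ? [] : await deps.classifier.classify(msg.text);
  const actions = decideActions(msg, signals);

  if (actions.length > 0) {
    await deps.ledger.record({ messageId: msg.messageId, signals, actions, at: new Date() });
  }
  if (actions.includes("ncmec_referral")) {
    await deps.ncmec.enqueueReferral(msg.messageId, msg.senderId);
  }
  return actions;
}
```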
Evidence formats
- chat moderation policy doc
- classifier / vendor pipeline diagram
- enforcement action ledger
- Statement of Reasons (SoR) export (field sketch below)
- NCMEC referral procedure
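A hedged sketch of the fields one record in the SoR export might carry, tracking the minimum content DSA Art. 17(3) requires. Field names are illustrative assumptions; the EU's DSA Transparency Database defines its own submission schema for these statements.

```typescript
// Hypothetical shape for one record in a Statement of Reasons export,
// mapped to the minimum content DSA Art. 17(3) requires.

interface StatementOfReasons {
  decisionId: string;
  decisionDate: string; // ISO 8601
  restriction:
    | "content_removal"
    | "visibility_restriction"
    | "account_suspension"
    | "account_termination";         // Art. 17(1): the restriction imposed
  factsAndCircumstances: string;     // Art. 17(3)(b): what was observed in the thread
  detectedByAutomatedMeans: boolean; // Art. 17(3)(c): was detection automated?
  decidedByAutomatedMeans: boolean;  // Art. 17(3)(c): was the decision itself automated?
  legalGround?: string;              // Art. 17(3)(d): for allegedly illegal content
  contractualGround?: string;        // Art. 17(3)(e): terms-of-service clause relied on
  redressOptions: string[];          // Art. 17(3)(f): internal complaint, out-of-court
                                     // dispute settlement, judicial redress
}
```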