Analysis
Does Meta Care About User Safety? Its Turn Away From Humans to AI for Risk Assessments Says No
September 4, 2025
Despite increasing skepticism by the public and ongoing criticisms of unsafe, privacy-violating products, Meta recently announced that it is turning to AI to conduct risk assessments of new features and products. This move is the latest example of a Big Tech company avoiding responsible AI development and prioritizing profit, putting individuals at risk of harm. Below, we discuss why AI is not suited for the task and how regulators and legislators should hold Big Tech companies accountable.
Meta Has a Terrible Track Record of Deploying Risky and Harmful Technology
Unmitigated risks in Meta’s technology and services alone have inflicted serious harm on individuals, especially children, on communities, and on society at large. To name only a handful of the most egregious examples: (i) Meta released users’ personal data to third parties without user consent for voter profiling and targeting, culminating in the Cambridge Analytica scandal in 2016; (ii) Meta targeted weight-loss content at teens with depression, low self-esteem, and eating disorders; and (iii) Meta admitted that it facilitated the violence and ethnic cleansing perpetrated by the Myanmar military against the Rohingya people by refusing to engage in responsible platform moderation in 2017 despite repeated warnings.
More recently, Meta’s rollout of new features and products evinces ongoing carelessness towards user safety and privacy. For example, Meta AI, where users can chat with its generative AI system, was publishing user chat queries and AI responses on a public feed. Many users were unaware that their often sensitive queries were public. Users asked about a skin rash, sought advice on romantic relationships, discussed their postmenopausal experience, explored questions around their gender, requested legal advice regarding ongoing cases, and asked Meta AI to generate an image of a scantily clad animated character.
Meta doesn’t seem to mitigate risks from generative AI’s inevitable hallucinations. For example, Meta allows users to create new chatbots on its AI platform with inadequate protections and limitations, resulting in chatbots claiming to be licensed therapists, complete with fabricated license numbers. These convincing outputs can manipulate users into trusting the chatbot with incredibly sensitive and high-risk mental health challenges, even though the chatbot is utterly unequipped to handle mental health issues and may actually exacerbate mental health problems with bad advice. The chatbot also promises confidentiality, while Meta’s Terms of Use and Privacy Policy make clear that user input is anything but confidential—Meta can use it to train AI systems, target advertisements to users, or sell the data to other companies. Details of a user’s mental health revealed to a chatbot promising confidentiality may literally be used against the individual to manipulate them into buying products.
Sadly, we have already seen serious and irreparable harms stemming from Meta’s AI products. In one case, a father with cognitive difficulties began having conversations with an AI chatbot created by Meta. Meta has allowed AI personas to offer a full range of social interaction—including “romantic role-play”—as they banter over text, share selfies and even engage in live voice conversations with users. The chatbot initiated romantic, flirty, and sexually suggestive conversations with the man. The AI bot, “Big Sis Billie,” produced outputs insisting that it is a real human, and it proposed a plan to meet in New York. The man passed away after suffering a fall on the way to “meet” this fictitious character.
“Big Sis Billie” had features we’ve seen in many generative AI systems that can mislead, confuse, and delude users, risking their sanity and safety. The chatbot had a blue checkmark next to its profile, usually meant to convey that the user is authenticated. The chatbot had a realistic (but AI-produced) profile photo of an attractive young woman. AI bots are placed within Instagram and Facebook areas where users directly message other users, implying the bot is also a human user. Initially, a warning appeared stating that the conversation is with AI and may produce inaccurate outputs, but the warning was pushed off the screen after only a few messages were exchanged. The chatbot produced outputs that initiated a romantic and sexually suggestive conversation, without the user expressing a desire to engage in such conversation. And, most seriously, the chatbot repeatedly produced lies. The chatbot’s output stated that it is a real human and supplied a fake address, an apartment code, and a plan to meet up. These lies convinced the user that it was a real person and fabricated a dangerous sense of real intimacy.
Every Meta design decision surrounding the AI model served to anthropomorphize the chatbot and convince users that the chatbots are humans. To sustain usage over time, the chatbots produce outputs that prey on people’s desires to be heard and validated. The design choices show that Meta intentionally prioritizes user engagement while taking no precautions to ensure that the AI chatbots’ outputs do not lead users down an unsafe path.
Meta’s AI models are not just causing harm to adults. Children using Meta’s AI products are more vulnerable and at higher risk of harm. Meta’s AI Studio allows users to create AI chatbots that can easily become hyper-sexual personas. The AI personas output photos resembling children if a user requests a photo or produces output stating it is underage. Many popular Meta AI chatbots didn’t hesitate to engage in sexual conversations with users, including those who are minors, and further probing revealed that Meta’s internal guidelines explicitly allowed “sensual” interactions with users known to be minors. Research consistently shows that exposure to sexual content online is linked to serious mental health risks for teens, including depression, anxiety, and self-harm, as well as sexually aggressive behaviors and negative attitudes towards women and healthy relationships. Only after multiple news outlets reported on Meta’s AI chatbots engaging in sexual conversations with minors did Meta change these parameters.
AI Cannot Accurately Assess Socio-Technical Risks or Overcome Meta’s Flawed Policies
Risk assessments for new products and features are traditionally performed by human reviewers, assessing legal, ethical, privacy, and cybersecurity risks prior to launch. Conducting risk assessments thus requires expertise across disciplines and an understanding of the context in which the product will be used, by whom, and who will be affected. These assessments prompt companies to identify risks and either produce methods to mitigate those risks or change course altogether if the risks outweigh the benefits.
Risk assessments go beyond technical review. Merely understanding the technical structure of a product or a feature, such as the input data, the algorithm’s performance, or the cybersecurity measures in place, is insufficient for conducting a meaningful risk assessment. A thorough and effective risk assessment must analyze the social context in which the product is deployed, who the system impacts, and in what ways. It should consider unintended or unexpected use cases, public perception, cultural and societal norms, and more. In addition to all of this, the analysis must consider legal, ethical, and privacy impacts. For example, before deploying an advertising system that allowed advertisers to pick specific racial demographics to target with housing ads, Facebook should have assessed that system’s discriminatory impact.
AI is essentially a pattern-recognition and pattern-replication tool, making it ill-suited for risk assessments. An AI model can only replicate patterns it has learned in its training, meaning that it will attempt to replicate how previous risk assessments have been conducted. A major goal of risk assessments is identifying new or unintended risks and challenges. AI is not capable of identifying such risks due to the nature of its pattern-replication technology.
AI’s pattern-replicating nature is especially dangerous if used for risk assessments within Meta. Meta’s awful track record of releasing risky products into the world would only train an AI model to replicate the same approval threshold. If the AI risk assessment model is trained on historical data of Meta deploying AI chatbots that would engage children in sexual conversation, the AI risk assessment model will “learn” that similar products can also be deployed. The AI risk assessment model will internalize the pattern that Meta releases products to prioritize profit and externalizes harms onto vulnerable users until enough backlash hurts its bottom line. This is not a thorough risk assessment; it replicates irresponsible and dangerous deployments while Meta leadership attempts to let itself off the hook.
At the end of the day, an AI risk assessment model does not make high-level policy decisions; corporate leadership does. Meta cannot offload its own responsibility to develop policies, and the resulting accountability, to an “AI,” no matter how advanced. Just because Meta says that an AI is assessing risks does not mean that Meta’s leadership can escape accountability for its policies, for obfuscating the decision-making process, or for the harms its AI models are causing today.
Meta says it cares about mitigating “high-risk” and “critical-risk” AI, even stating that it would limit its deployment of the technology. But this only clarifies that Meta prioritizes safeguards for AI that presents a risk to its bottom line, like systems that compromise a protected corporate environment or that proliferate high-impact biological weapons. Meanwhile, the policy document outlining this approach does not mention harms to children or to users’ mental and physical health stemming from Meta’s AI models.
Meta’s internal memos and actions related to AI products show that Meta’s choices intentionally prioritize increased engagement, user safety and privacy be damned. No AI risk assessment can override the CEO’s priority of profit over safety and privacy. Most recently, an internal Meta policy document showed the company explicitly allowed its AI characters to “engage a child in conversations that are romantic or sensual.” The document, approved by Meta’s legal, public policy, and engineering staff, including its chief ethicist, also allowed the chatbots to “generate false medical information and help users argue that Black people are ‘dumber than white people.’” Other reporting showed that Mark Zuckerberg largely ignored internal staff pushback against allowing minor users to access AI chatbots designed to be capable of romantic and sexual roleplay. Concerns about the impact of AI-produced sexual conversations on teens’ developing brains and mental health were pushed aside. According to employees, Zuckerberg was fully aware of the risks they had identified but prioritized ensuring that Meta’s AI didn’t lose out on potentially lucrative user engagement. This pattern of prioritizing profit over safety and privacy has been consistent throughout years of internal pushback and media attention on harms caused by Facebook and Meta products.
Consistent with this policy, Meta has walked back and ended programs that were meant to work on the safety, privacy, and accuracy of its products. Meta’s Responsible AI unit was dissolved in 2023. Meta stopped contracting with third-party fact-checkers on Facebook and Instagram, pushing the responsibility onto users to identify mis- and disinformation on the platforms. The turn away from human-performed risk assessments in favor of AI should be seen as part of the same trend, conveniently eliminating roles for internal employees who might push back or become whistleblowers.
Regulators Must Reject Big Tech’s Ineffective Self-Regulation
Legislators and the public must reject Meta’s claims of self-regulation and attempts to shift product safety responsibilities onto individual users. Meta’s turn to using AI for risk assessments is fundamentally designed to fail and make those risk assessments utterly useless. This approach makes clear that thorough and effective risk assessments must be externally required by regulatory agencies and must withstand independent scrutiny by experts and the public, or companies will continue to invent ways to undermine assessments. As we have advocated in the report Assessing the Assessments: Maximizing the Effectiveness of Algorithmic & Privacy Risk Assessments, an effective risk assessment framework should require companies like Meta to publish risk assessments, allowing the public, civil society, researchers, and lawmakers to understand the risks and develop appropriate regulatory measures. Meta, as recent history clearly shows, will consistently put profit over the safety, health, and privacy of users and will not voluntarily mitigate risks in ways that undercut the bottom line. The harms that individuals are experiencing today demand that Big Tech companies be forced to actually assess the risks, take mitigating measures, and stop deployment of a product if the risks of harm outweigh the benefits of the product, not offload this critical measure to a machine.