Google hired Timnit Gebru to be an outspoken critic of unethical AI. Then she was fired for it.

Gebru is one of the most high-profile Black women in her field and a powerful voice in the new field of ethical AI, which seeks to identify issues around bias, fairness, and responsibility.

Two months ago, Google promoted Timnit Gebru, co-lead of a group focused on ethical artificial intelligence, after she earned a high score on her annual employee appraisal.

In his peer review of Gebru, Jeff Dean, the head of Google Artificial Intelligence, left only one comment when asked what she could do to have a greater impact, according to documents viewed by The Washington Post: Ensure that her team helps make a promising new software tool for processing human language “consistent with our AI Principles.”

In an email thanking Dean for his review, Gebru let him know that her team was already working on a paper about the ethical risks around the same language models, which are essential to understanding the complexity of language in search queries. On Oct. 20, Dean wrote that he wanted to see a draft, adding, “definitely not my area of expertise, but would definitely learn from reading it.”

Six weeks later, Google fired Gebru while she was on vacation.

“I can’t imagine anybody else who would be safer than me,” Gebru, 37, said. “I was super visible. I’m well known in the research community, but also the regulatory space. I have a lot of grass-roots support — and this is what happened.”

In an internal memo that he later posted online explaining Gebru’s departure, Dean told employees that the paper “didn’t meet our bar for publication” and “ignored too much relevant research” on recent positive improvements to the technology. Gebru’s superiors had insisted that she and the other Google co-authors either retract the paper or remove their names. Employees in Google Research, the department that houses the ethical AI team, say authors who make claims about the benefits of large language models have not received the same scrutiny during the approval process as those who highlight the shortcomings.

Her abrupt firing shows that Google is pushing back on the kind of scrutiny that it claims to welcome, according to interviews with Gebru, current Google employees, and emails and documents viewed by The Post.

It raises doubts about Silicon Valley’s ability to self-police, especially when it comes to advanced technology that is largely unregulated and being deployed in the real world despite demonstrable bias toward marginalized groups. Already, AI systems shape decision-making in law enforcement, employment opportunity and access to health care worldwide.

That made Gebru’s perspective essential in a field that is predominantly White, Asian and male. Women made up only 15 percent of the AI research staff at Facebook and 10 percent at Google, according to a 2018 report in Wired magazine. At Google, Black women make up 1.6 percent of the workforce.

Although Google publicly celebrated Gebru’s work identifying problems with AI, it marginalized the work internally by keeping it hierarchically distinct from other AI initiatives, not heeding the group’s advice, and not creating an incentive structure to put the ethical findings into practice, Gebru and other employees said.

Google declined to comment, but noted that in addition to the dozen or so staff members on Gebru’s team, 200 employees are focused on responsible AI.

Google has said that it did not fire Gebru, but accepted her “resignation,” citing her request to learn who at Google demanded that the paper be retracted, according to Dean’s memo. The company also said an email Gebru wrote to an employee resource group for women and allies at Google working in AI was inappropriate for a manager. The message warned the group that pushing for diversity was no use until Google leadership took accountability.

Rumman Chowdhury, a former global lead for responsible AI at Accenture and chief executive of Parity, a start-up that helps companies figure out how to audit algorithms, said there is a fundamental lack of respect within the industry for work on AI ethics compared with equivalent roles in other industries, such as model risk managers in quantitative hedge funds or threat analysts in cybersecurity.

“It’s being framed as the AI optimists and the people really building the stuff [versus] the rest of us negative Nellies, raining on their parade,” Chowdhury said. “You can’t help but notice, it’s like the boys will make the toys and then the girls will have to clean up.”

Google, which for decades evangelized an office culture that embraced employee dissent, has fired outspoken workers in recent years and shut down forums for exchange and questioning.

Nearly 3,000 Google employees and more than 4,000 academics, engineers and industry colleagues have signed a petition calling Gebru’s termination an act of retaliation by Google. Last week, nine Democratic lawmakers, including Sens. Elizabeth Warren (Mass.) and Cory Booker (N.J.) and Rep. Yvette D. Clarke (N.Y.), sponsor of the Algorithmic Accountability Act, a bill that would require companies to audit and correct race and gender bias in their algorithms, sent a letter to Google chief executive Sundar Pichai asking the company to affirm its commitment to research freedom and diversity.

Like any good researcher, Gebru is comfortable in the gray areas. And she has been using her ouster as an opportunity to shed light on the black box of algorithmic accountability inside Google — annotating the company’s claims with contradictory data, drawing connections to larger systemic issues, and illuminating the way internal AI ethics efforts can break down without oversight or a change in incentives to corporate practices and power structures.

Big Tech dominates AI research around advancements in machine learning, image recognition, language translation — poaching talent from top universities, sponsoring conferences and publishing influential papers. In response to concerns about the way those technologies could be abused or compound bias, the industry ramped up funding and promotion of AI ethics initiatives, beginning around 2016.

Tech giants have made similar investments in shaping policy debate around antitrust and online privacy, as a way to ward off lawmakers. Pichai invoked Google’s AI principles in an interview in 2018 with The Post, arguing for self-regulation around AI.

Google created its Ethical AI group in 2018 as an outgrowth of an employee-led push to prioritize fairness in the company’s machine learning applications. Margaret Mitchell, Gebru’s co-lead, pitched the idea of a team of researchers investigating the long-term effects of AI and translating those findings into action to mitigate harm and risk.

The same year, Pichai released a broadly worded set of principles governing Google’s AI work after thousands of employees protested the company’s contract with the Pentagon to analyze surveillance imagery from drones. But Google, which requires privacy and security tests before any product launch, has not mandated an equivalent process for vetting AI ethics, employees say.

Gebru, whose family’s ethnic origins are in Eritrea, was born and raised in Ethiopia and came to Massachusetts as a 16-year-old after receiving political asylum from the war between the two African countries. She began her career as an electrical engineer at Apple and received her PhD from the Stanford Artificial Intelligence Lab, studying computer vision under renowned computer scientist Fei-Fei Li, a former Google executive and now co-director of Stanford’s Human-Centered AI Institute, which receives funding from Google.

Gebru did her postdoctoral research at Microsoft Research as part of a group focused on accountability and ethics in AI. There, she and Joy Buolamwini, then a master’s student at MIT Media Lab, co-wrote a groundbreaking 2018 study that found that commercial facial recognition tools sold by companies such as IBM and Microsoft were 99 percent accurate at identifying White males, but only 35 percent effective with Black women.

In June, IBM, Microsoft and Amazon announced that they would stop selling the software to law enforcement, which Dean credited to Gebru’s work. She also co-founded Black in AI, a nonprofit organization that increased the number of Black attendees at the largest annual AI conference.

Gebru said that in 2018, Google recruited her with the promise of total academic freedom. She was unconvinced, but the company was opening its first artificial intelligence lab on the African continent in Accra, the capital of Ghana, and she wanted to be involved. When she joined Google, Gebru said, she was the first Black female researcher in the company. (When she left, there were still only a handful of Black women working in research, out of hundreds.)

Gebru said she was also drawn to working with Mitchell. Both women prioritized foresight and building practical solutions to prevent AI risk, whereas the operating mind-set in tech is biased toward benefits and “rapid hindsight” in response to harm, Mitchell said.

Gebru’s approach to ethical AI was shaped by her experiences. Hardware, for instance, came with datasheets that documented whether components were safe to use in certain situations. “When you look at this field as a whole, that doesn’t exist,” said Gebru, an electrical engineer. “It’s just super behind in terms of documentation and standards of safety.”

She also leaned on her industry experience when collaborating with other teams. Engineers live on a product cycle, consumed with putting out fires and fixing bugs. A vague requirement to “make things fair” would only cause more work and frustration, she thought. So she tried to build institutional structures and documentation tools for “when people want to do the right thing.”

Despite their expertise, the Ethical AI group fought to be taken seriously and included in Google’s other AI efforts, employees said.

Within the company, Gebru and her former colleagues said, there is little transparency or accountability regarding how decisions around AI ethics or diversity initiatives get made. Work on AI principles, for instance, falls under Kent Walker, the senior vice president of global affairs, whose vast purview includes lobbying, public policy and legal work. Walker also runs an internal ethics board of top executives, including Dean, called the Advanced Technology Review Council, which is responsible for yes or no decisions when issues escalate, Gebru said. The Ethical AI team had to fight to be consulted on Walker’s initiatives, she said.

“Here’s the guy tasked with covering Google’s a–, lobbying and also … working on AI principles,” Gebru said. “Shouldn’t you have a different entity that pushes back a little bit internally — some sort of push and pull?” What’s more, members of Walker’s council are predominantly vice presidents or higher, constricting diversity, Gebru said.

In her conversations with product teams, such as a group working on fairness in Machine Learning Infrastructure, Gebru said she kept getting questions about what tools and features they could build to protect against the ethical risks involved with large language models. Google had credited the technology with the biggest breakthrough in improving search results in the past five years. The models can process words in relation to the other words that come before and after them, which is useful for understanding the intent behind conversational search queries.

But despite the increasing use of these models, there was limited research investigating groups that might be negatively impacted. Gebru says she wanted to help develop those safeguards, one of the reasons she agreed to collaborate on the research paper proposed by Emily M. Bender, a linguist at the University of Washington.

Mitchell, who developed the idea of model cards, which function like nutrition labels for machine learning models, described the paper as “due diligence.” Her model card idea is being adopted more widely across the industry, and engineers needed to know how to fill out the section on harm.

Gebru said her biggest contribution to both her team and the paper has been identifying researchers who study directly affected communities.

That diversity was reflected in the authors of the paper, including Mark Diaz, a Black and Latino Google researcher whose previous work looked at how platforms leave out the elderly, who talk about ageism in blog posts, but don’t share as much on sites such as Twitter. For the paper, he identified the possibility that large data sets from the Internet, particularly if they are from a single moment in time, will not reflect cultural shifts from social movements, such as the #MeToo movement or Black Lives Matter, which seek to shift power through changes in language.

The paper identified four overarching categories of harm, according to a recent draft viewed by The Post. It delved into the environmental effect of the computing power, the inscrutability of massive data sets used to train these models, the opportunity costs of the “hype” around claims that these models can understand language, as opposed to identifying patterns, and the danger that the real-sounding text generated by such models could be used to spread misinformation.

Because Google depends on large language models, Gebru and Mitchell expected that the company might push back against certain sections or attempt to water down their findings. So they looped in PR & Policy representatives in mid-September, with plenty of time before the deadline for changes at the end of January 2021.

Before making a pre-publication draft available online, Gebru first wanted to vet the paper with a variety of experts, including those who have built large language models. She asked for feedback from two top people at OpenAI, an AI research lab co-founded by Elon Musk, in addition to her manager at Google, and about 30 others. They suggested additions or revisions, Gebru said. “I really wanted to send it to people who would disagree with our view and be defensive,” Gebru said.

Given all their upfront effort, Gebru was baffled when she received a notification for a meeting with Google Research Vice President Megan Kacholia at 4:30 p.m. on the Thursday before Thanksgiving.

At the meeting, Kacholia informed Gebru and her co-authors that Google wanted the paper retracted.

On Thanksgiving, a week after the meeting, Gebru composed a six-page email to Kacholia and Dean outlining how disrespectful and oddly secretive the process had been.

“Specific individuals should not be empowered to unilaterally shut down work in such a disrespectful manner,” she wrote, adding that researchers from underrepresented groups were mistreated.

Mitchell, who is White, said she shared Gebru’s concerns but did not receive the same treatment from the company. “Google is very hierarchical, and it’s been a battle to have any sort of recognition,” she said. “We tried to explain that Timnit, and to a lesser extent me, are respected voices publicly, but we could not communicate upwards.”

“How can you still ask why there aren’t Black women in this industry?” Gebru said.

Gebru said she found out this week that the paper was accepted to the Conference on Fairness, Accountability and Transparency, as part of its anonymous review process. “It’s sad, the scientific community respects us a lot more than anybody inside Google,” she said.

Timnit Gebru’s Exit From Google Exposes a Crisis in AI

This year has held many things, among them bold claims of artificial intelligence breakthroughs. Industry commentators speculated that the language-generation model GPT-3 may have achieved “artificial general intelligence,” while others lauded Alphabet subsidiary DeepMind’s protein-folding algorithm—AlphaFold—and its capacity to “transform biology.” While the basis of such claims is thinner than the effusive headlines, this hasn’t done much to dampen enthusiasm across the industry, whose profits and prestige are dependent on AI’s proliferation.

It was against this backdrop that Google fired Timnit Gebru, our dear friend and colleague, and a leader in the field of artificial intelligence. She is also one of the few Black women in AI research and an unflinching advocate for bringing more BIPOC, women, and non-Western people into the field. By any measure, she excelled at the job Google hired her to perform, including demonstrating racial and gender disparities in facial-analysis technologies and developing reporting guidelines for data sets and AI models. Ironically, this and her vocal advocacy for those underrepresented in AI research are also the reasons, she says, the company fired her. According to Gebru, after demanding that she and her colleagues withdraw a research paper critical of (profitable) large-scale AI systems, Google Research told her team that it had accepted her resignation, despite the fact that she hadn’t resigned. (Google declined to comment for this story.)

Google’s appalling treatment of Gebru exposes a dual crisis in AI research. The field is dominated by an elite, primarily white male workforce, and it is controlled and funded primarily by large industry players—Microsoft, Facebook, Amazon, IBM, and yes, Google. With Gebru’s firing, the civility politics that yoked the young effort to construct the necessary guardrails around AI have been torn apart, bringing questions about the racial homogeneity of the AI workforce and the inefficacy of corporate diversity programs to the center of the discourse. But this situation has also made clear that—however sincere a company like Google’s promises may seem—corporate-funded research can never be divorced from the realities of power, and the flows of revenue and capital.

This should concern us all. With the proliferation of AI into domains such as health care, criminal justice, and education, researchers and advocates are raising urgent concerns. These systems make determinations that directly shape lives, at the same time that they are embedded in organizations structured to reinforce histories of racial discrimination. AI systems also concentrate power in the hands of those designing and using them, while obscuring responsibility (and liability) behind the veneer of complex computation. The risks are profound, and the incentives are decidedly perverse.

The current crisis exposes the structural barriers limiting our ability to build effective protections around AI systems. This is especially important because the populations subject to harm and bias from AI’s predictions and determinations are primarily BIPOC people, women, religious and gender minorities, and the poor—those who’ve borne the brunt of structural discrimination. Here we have a clear racialized divide between those benefiting—the corporations and the primarily white male researchers and developers—and those most likely to be harmed.

Take facial-recognition technologies, for instance, which have been shown to “recognize” darker-skinned people less frequently than those with lighter skin. This alone is alarming. But these racialized “errors” aren’t the only problems with facial recognition. Tawana Petty, director of organizing at Data for Black Lives, points out that these systems are disproportionately deployed in predominantly Black neighborhoods and cities, while cities that have had success in banning and pushing back against facial recognition’s use are predominately white.

Without independent, critical research that centers the perspectives and experiences of those who bear the harms of these technologies, our ability to understand and contest the overhyped claims made by industry is significantly hampered. Google’s treatment of Gebru makes increasingly clear where the company’s priorities seem to lie when critical work pushes back on its business incentives. This makes it almost impossible to ensure that AI systems are accountable to the people most vulnerable to their damage.

Checks on the industry are further compromised by the close ties between tech companies and ostensibly independent academic institutions. Researchers from corporations and academia publish papers together and rub elbows at the same conferences, with some researchers even holding concurrent positions at tech companies and universities. This blurs the boundary between academic and corporate research and obscures the incentives underwriting such work. It also means that the two groups look awfully similar—AI research in academia suffers from the same pernicious racial and gender homogeneity issues as its corporate counterparts. Moreover, the top computer science departments accept copious amounts of Big Tech research funding. We have only to look to Big Tobacco and Big Oil for troubling templates that expose just how much influence over the public understanding of complex scientific issues large companies can exert when knowledge creation is left in their hands.

Gebru’s firing suggests this dynamic is at work once again. Powerful companies like Google have the ability to co-opt, minimize, or silence criticisms of their own large-scale AI systems—systems that are at the core of their profit motives. Indeed, according to a recent Reuters report, Google leadership went as far as to instruct researchers to “strike a positive tone” in work that examined technologies and issues sensitive to Google’s bottom line. Gebru’s firing also highlights the danger the rest of the public faces if we allow an elite, homogenous research cohort, made up of people who are unlikely to experience the negative effects of AI, to drive and shape the research on it from within corporate environments. The handful of people who are benefiting from AI’s proliferation are shaping the academic and public understanding of these systems, while those most likely to be harmed are shut out of knowledge creation and influence. This inequity follows predictable racial, gender, and class lines.

As the dust begins to settle in the wake of Gebru’s firing, one question resounds: What do we do to contest these incentives, and to continue critical work on AI in solidarity with the people most at risk of harm? To that question, we have a few, preliminary answers.

First and foremost, tech workers need a union. Organized workers are a key lever for change and accountability, and one of the few forces that has been shown capable of pushing back against large firms. This is especially true in tech, given that many workers have sought-after expertise and are not easily replaceable, giving them significant labor power. Such organizations can act as a check on retaliation and discrimination, and can be a force pushing back against morally reprehensible uses of tech. Just look at Amazon workers’ fight against climate change or Google employees’ resistance to military uses of AI, which changed company policies and demonstrated the power of self-organized tech workers. To be effective here, such an organization must be grounded in anti-racism and cross-class solidarity, taking a broad view of who counts as a tech worker, and working to prioritize the protection and elevation of BIPOC tech workers across the board. It should also use its collective muscle to push back on tech that hurts historically marginalized people beyond Big Tech’s boundaries, and to align with external advocates and organizers to ensure this.

We also need protections and funding for critical research outside of the corporate environment that’s free of corporate influence. Not every company has a Timnit Gebru prepared to push back against reported research censorship. Researchers outside of corporate environments must be guaranteed greater access to technologies currently hidden behind claims of corporate secrecy, such as access to training data sets, and policies and procedures related to data annotation and content moderation. Such spaces for protected, critical research should also prioritize supporting BIPOC, women, and other historically excluded researchers and perspectives, recognizing that racial and gender homogeneity in the field contribute to AI’s harms. This endeavor would need significant funding, which could be achieved through a tax levied on these companies.

Finally, the AI field desperately needs regulation. Local, state, and federal governments must step in and pass legislation that protects privacy and ensures meaningful consent around data collection and the use of AI; increases protections for workers, including whistle-blower protections and measures to better protect BIPOC workers and others subject to discrimination; and ensures that those most vulnerable to the risks of AI systems can contest—and refuse—their use.

This crisis makes clear that the current AI research ecosystem—constrained as it is by corporate influence and dominated by a privileged set of researchers—is not capable of asking and answering the questions most important to those who bear the harms of AI systems. Public-minded research and knowledge creation isn’t just important for its own sake, it provides essential information for those developing robust strategies for the democratic oversight and governance of AI, and for social movements that can push back on harmful tech and those who wield it. Supporting and protecting organized tech workers, expanding the field that examines AI, and nurturing well-resourced and inclusive research environments outside the shadow of corporate influence are essential steps in providing the space to address these urgent concerns.