
This article argues that Adivasi women's data labour is not only economically precarious but also epistemically exploited. While more women participate in the tech economy than ever before, they do so at the low end of the skill spectrum, reinforcing rather than reducing gendered divisions of labour. They produce the building blocks of AI systems that they neither understand nor benefit from. The AI systems they support are rooted in Western epistemologies, privileging certain kinds of knowledge, logics, and classifications while erasing or ignoring their own. This digital labour represents a new form of coloniality, in which extractive resource empires have given way to extractive data empires while the geographies of exploitation remain eerily similar.
In a small village in Jharkhand, Sunita* sits cross-legged on the floor, her smartphone balanced on her lap. As the rest of her family prepares for the day, Sunita logs onto a global data annotation platform. Her task: to draw digital boxes around objects in photos—traffic lights, bicycles, unfamiliar city streets. She has never seen a Tesla or a Manhattan avenue in real life, but her careful clicks will teach an algorithm to recognise them. Sunita's world is far from the glass towers of Silicon Valley, yet her invisible labour is a crucial building block of the global AI supply chain.
The story of Adivasi women’s labour is not new. During the colonial era, British resource empires extracted minerals, timber, and tea from the same regions that now supply the world’s data. Adivasi communities were displaced, their lands seized, and their bodies conscripted as cheap, expendable labour. Women, in particular, bore the brunt: underpaid, overworked, and rendered invisible in both records and rewards.
Today, the nature of extraction has shifted from the physical to the digital. Instead of tea leaves or iron ore, the resource is data—images, texts, and sounds that human hands must painstakingly label before machines can learn from them. The geographies of exploitation remain eerily similar. The digital economy's hunger for cheap, flexible labour has reached deep into Adivasi territories, where economic opportunities are scarce and digital work is marketed as a path to empowerment.
But as Sunita and her peers will tell you, the reality is more complicated.
As artificial intelligence (AI) systems become increasingly embedded in our lives, a vast and largely invisible workforce powers their development. Data annotation has become a major industry in India, often centred in smaller cities and rural areas. According to industry reports, India's data-labelling market was already worth about $250 million (with 60% of revenue from U.S. firms) and employed roughly 70,000 people by 2022. Crucially, a large majority of these annotators are first-generation workers from rural or small-town backgrounds (NASSCOM found that 80% come from rural India, and 90% from Tier-II/III cities). Many such centres employ predominantly women: a lab in rural Telangana, for example, was staffed entirely by local women, most holding college degrees, who had been housewives seeking paid work. Data-annotation firms like iMerit and Niki.ai have even opened offices in Jharkhand, Chhattisgarh and Odisha to tap indigenous labour. In short, tribal and Dalit women are increasingly joining AI's supply chain as "human-in-the-loop" workers. Though framed as flexible gig work, this labour occupies a foundational role: the labels these women produce train AI models for tasks ranging from object recognition to medical diagnostics. Yet the workers themselves often never learn how their work connects to the final product.
Precarity and Paradox: The Economic Realities of Data Annotation
Despite the vital role they play in enabling AI systems, data annotators in India operate under precarious conditions. Most are paid low piece rates or modest fixed salaries, with limited benefits and strict performance surveillance. Errors, pauses, or slow completion can lead to outright task rejection and non-payment. Tasks are governed by non-disclosure agreements, preventing annotators from speaking out even as their labour powers billion-dollar global platforms. Many are young women juggling domestic responsibilities: one tribal annotator in Jharkhand worked from 6 AM to 2 PM before returning to childcare and evening classes. Like many others, she experienced the work as a series of isolated microtasks, with little sense of how her labels trained AI models.
Annotation jobs are often framed as “flexible work” or even as empowerment. Companies like iMerit offer salaried positions (₹10,000–15,000 per month, with the best cases reaching ₹25,000), but even at the higher end, pay remains far below India’s tech-sector average. Employers frame this as a "good village salary," yet workers themselves lament stagnant wages and inflationary pressures. Surveyed annotators rarely engage with global concerns like algorithmic colonialism—they simply ask, "When will my salary go up?"
And yet, for many women, the economic impact is undeniable. Annotation work has offered some a newfound autonomy: incomes that cover rent, hospital bills, or freedom from abusive homes. One annotator described it as "rain in the desert." But such gains coexist with deep structural inequalities. Annotation firms operate under lax labour regimes: workers have no unions, legal protections, or career mobility. They remain "invisible" workers—outsourced, feminised labour from the Global South performing a crucial yet unrecognised role in the AI economy. Whatever ascent they achieve is as a digital underclass, still saddled with domestic work and socio-cultural marginalisation.
Epistemic Exploitation and Colonial Knowledge
This labour also raises epistemic justice concerns. The images and texts labelled by Adivasi women carry the biases of their labellers – yet Western engineers set the broader framework of meaning. Scholars warn that "AI algorithms, shaped by data and perspectives that largely emanate from Western contexts, risk perpetuating a form of epistemic dominance" and call this a kind of "cognitive imperialism" – AI built on Western norms will erase or misinterpret indigenous categories, thus perpetuating colonial-era epistemic hierarchies. This epistemic marginalisation is not new—it echoes long-standing state practices of dismissing tribal knowledge systems as inferior or irrelevant.
Gender, Caste and Tribal Marginality
India's social hierarchies further complicate the picture. Adivasi women face a double bind: they are marginalised by both gender and tribal identity, and often also by caste. In tech fields (and society at large), caste bias persists in subtle and overt ways. Though data annotation centres may promise work for "educated rural women," these women often come from lower castes or tribes and lack the social capital to demand fair terms.
These women may also face prejudices at work: for instance, a Dalit female tech worker reported sexual harassment she believed was tied to her caste status. Even when not overt, caste bias shapes opportunities: many tech companies have hidden barriers and stereotypes that limit advancement for SC/ST employees. Thus, the Adivasi data-annotators are caught in multiple structures of disadvantage – they are women (in a male-dominated society), from tribal groups (long subject to exclusion and viewed as unskilled), and in low-wage gig jobs outside the protective ambit of regular employment. This intersectionality must be recognised: calls for "more women in tech" will miss the mark if they ignore that tribal and Dalit women face unique hurdles beyond what upper-caste women do.
A Way Ahead?
The AI industry and policymakers must move beyond tokenistic invocations of "AI ethics" and reckon with the deeply unequal foundations of the digital economy. Ethical frameworks that focus only on algorithmic bias or fairness in model outputs risk overlooking the exploitative labour conditions at the base of the AI supply chain. Data doesn't label itself; it is produced through hours of invisible, undervalued human work. If India's digital future is to be inclusive, the rights and realities of these foundational workers—many of them Adivasi and Dalit women—must be placed at the centre of policy conversations.
Realising a truly inclusive digital future requires a fundamental rethinking of both labour and knowledge in the AI economy. Data annotation must be formally recognised as skilled labour, with fair wages, social protections, and collective bargaining rights—especially for tribal and Dalit women whose invisible work sustains the global AI supply chain. But inclusion must go beyond employment quotas; these women must have meaningful pathways into decision-making roles across the AI pipeline, allowing their epistemologies to inform and shape the systems they help build. This is not just a moral imperative but a strategic one: AI systems built on epistemic erasure and economic exploitation are inherently brittle and biased. As India expands its digital ambitions, it must confront the full chain of extraction—from data to design—by asking whose labour is used, whose knowledge counts, and who is left behind. Empowering the women who power AI is not a footnote to progress; it is its foundation.
References
1. NASSCOM Insights. (2021, February 11). Data annotation – Billion-dollar potential driving the AI revolution. NASSCOM.
2. Dixit, K. (2022, July 23). Human Touch. FiftyTwo.
3. Chandran, R., Smith, A., & Ramos, M. (2023, March 14). The AI boom is both a dream and a nightmare for workers in the Global South. Context News, Thomson Reuters Foundation.
4. Ofosu-Asare, Y. (2025). Cognitive imperialism in artificial intelligence: Counteracting bias with indigenous epistemologies. AI & Society, 40, 3045–3061. https://doi.org/10.1007/s00146-024-02065-0
5. Dutt, M. (2022, May 25). India’s tech industry is still divided by caste. Rest of World.
