CuCo Lab @ the 'Technolinguistics' conference in Siegen

28 May 2023 · Christoph Purschke · 5 minute read · #talks

Introducing the projects “ConMan” and “AI ideologies” at the occasion of the ‘Technolinguistics’ conference

From May 24–27, 2023, the CuCo Lab was a guest at the conference “Technolinguistics: Socially Situating Language in AI Systems” in Siegen, Germany, which focused on questioning current developments in AI and language technology from a critical perspective. See the conference call here. Amid many exciting presentations from AI research, linguistics, and anthropology, the Lab’s team presented two of its projects for the first time.

ConMan: stories from a cooperative anthro-computational approach to the study of conspiracy theories

Alistair Plum and Catherine Tebaldi, with Christoph Purschke

Linguistic methods are always also ideologies: ways of understanding language and its relationship to the social world. Following the conference theme of cooperation, members of the Culture and Computation Lab – a linguistic anthropologist and two computational linguists – offer an account of the theoretical and ideological questions that emerged in the lab’s project ConMan, an NLP resource aimed at documenting and ultimately enabling the detection of conspiracy theories, informed by critical digital and linguistic anthropological research.

Moving beyond the understanding of language as naturally occurring and “context free” that animates much of NLP, our lab draws on critical sociolinguistic and anthropological understandings of language as a non-neutral medium that cannot be separated from the social world. Media anthropology (Gershon 2017) shows us that digital spaces are not neutral, rule-governed sites of information processing; platform affordances also shape social, affective, and ideological spaces. Indeed, this very idea of abstract, rule-governed processes can create, hide, and legitimate relations of oppression and inequality (Noble 2018).
This is especially important in the world of digital political discourse, where ConMan aims to intervene in the proliferation of fascist, far-right ideology online. ConMan combines an anthropological framework and machine learning to analyze conspiracy theories. The project builds on linguistic anthropological frameworks to examine the production, circulation, and consumption of conspiracy narratives and their social and ideological effects, and to design an NLP tool to identify them. Initial conversations moved from assessing the “truth” or “correctness” of these narratives to looking at their uptake and consequences. This move, we hope, also takes us beyond “fact-checking” approaches, which legitimate powerful and often exclusionary institutional discourses, and beyond cultural studies approaches, which characterize conspiracy as “popular knowledge” and elide its damaging political and social effects.

The specific case of ConMan also raises broader theoretical and methodological questions: What would it mean to have a culturally grounded, socially anchored vision of NLP? What are the interactions, cooperations, and contradictions between the social and the computational? Is it possible to have a critical computational sociolinguistics? Can this collaboration create new forms of NLP, critical machine learning? Or is a critical sociolinguistics limited to critique – to understanding, and countering the ways AI creates, erases, and legitimates inequalities?

Bibliography

  • Gershon, Ilana. 2017. “Language and the Newness of Media.” Annual Review of Anthropology 46 (1), 15–31. Link
  • Noble, Safiya Umoja. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press. Link

From “natural” to “culturally grounded” and “socially anchored”: Examining the notion of language in NLP

Christoph Purschke, with Alistair Plum and Catherine Tebaldi

The humanities conceptualize language as an essentially social construction, a symbolic form that is deeply rooted in the history of human culture. This double determination of language (as socially anchored and culturally grounded) informs scholarly approaches to understanding language and its fundamental importance for human activity in the world. Contrary to this, the technical discussion of language in NLP and Machine Learning is often characterized by treating linguistic information as “naturally” given “data”. NLP terminology is full of culturally loaded terms, be it the “knowledge” of language models, their “understanding” of language, or the notion of “intelligence” in technical applications (Bender & Koller 2020). In addition, fundamental concepts like the term “model” itself build on a notion of language as “naturally occurring”. Among other things, this results in the description of NLP resources as models “of language” – instead of models “for language”. Considering the current hype – both in NLP and its public uptake – around Large Language Models (e.g., GPT-4, LaMDA, LLaMA) and their applications as sources of “information” (e.g., ChatGPT, Bard, Bing) in artificial communication (Esposito 2022), we will discuss the consequences of a culturally grounded understanding of language for NLP and Machine Learning. Starting from the idea that language is a social, embodied, and highly contextually determined practice, we examine current NLP terminology to highlight some of its consequences for working with language:

A) By ascribing intrinsically cultural values to technical derivations of actual language practice, NLP neglects the manifold socio-cultural prerequisites of automatic language processing, in the sense of a naturalization of its object (e.g., the relationship between human perspective and “bias” in NLP).

B) By using metaphors coined for socio-cognitive processes to describe the performance of technical artifacts, NLP blurs the crucial difference between language as human practice and its technical reproduction to simulate practice (e.g., in the context of Artificial Intelligence or “sentient” language models; Tiku 2022).

Bibliography

  • Bender, Emily M., and Alexander Koller. 2020. “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data.” In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–98. Online: Association for Computational Linguistics. Link
  • Esposito, Elena. 2022. Artificial Communication: How Algorithms Produce Social Intelligence. Cambridge: The MIT Press.
  • Tiku, Nitasha. 2022. “The Google engineer who thinks the company’s AI has come to life.” In: The Washington Post, 11 June 2022. Online. Link