Feminism, Jaat aur Code-Mixing

The objective of this project is to analyze the relative presence of code-mixing in the lingo that has developed on social media, specifically Twitter, around two major sociocultural discourses in India: that on caste-related issues, and that on gender concerns.
The presence of code-mixing in a particular instance of speech or text in a bilingual society is contingent on several factors: for example, the proficiency of the particular speaker, the social situation s/he is in, and a tendency to accommodate the interlocutor (Bali et al. 1617). Suppose that we have accounted for variations in these, and observed the following two utterances:
- Vah ladki chair par khadi ho gai
- Yeh mera personal maamla hai
Both instantiate code-mixing on a lexical level. There is a difference, however, in the fact that one might reasonably hear
- Vah kursi par khadi ho gai
whereas, as someone recently remarked to me, it is much more anomalous for a modern Hindi speaker to say
- Yeh mera vyaktigat maamla hai
Putting it differently: were we to ask 1000 speakers independently to express each of these two thoughts, we might expect to find that the percentage of people who preferred ‘personal’ to ‘vyaktigat’ would be higher than that of those who preferred ‘chair’ to ‘kursi’. Why might this be? One explanation is that the discourse of privacy and the personal has been culturally borrowed from a Western ideology. It is not, of course, an imported technology like a ‘phone’ or a ‘fridge’; Hindi might well contain the vocabulary to support it (‘niji’ maamla, ‘apna’ maamla), but using the original English code ‘personal’ might remain the more popular way of expressing a concept that was introduced from outside the cultural currency of Hindi. This is to say that the topical features, or the local semantic properties, of a discussion influence the lingo in which it occurs, owing to the sociocultural history of the discourse itself.
Hypothesis: The relative presence of Hindi in the topical category of gender issues, including but not limited to activism and protest, will be less than that in the topical category of caste-based issues, including but not limited to reservations, Dalit campaigns, and activism. The idea of feminism as we comprehend it today emerged in India post-independence. It is still regarded as an upper-class, upper-caste movement, ideologically populated, perhaps, largely by the English-educated stratum of India. On the other hand, there is reason to think that an engagement with caste politics is more deeply entrenched in Indian society and discussion, since it often forms a nerve center for nationwide political discussion, much of it conducted in regional languages. In both movements, speaking English is itself a political move, since English is the language of the ‘liberated’ and upwardly mobile in India.

Approach: This project, drawing upon the annotation scheme suggested by Bali et al., will fix three dimensions. The modality, i.e. the domain, will be Twitter; the discursive dimension, i.e. the nature of the speaker-audience relation, will be multiparty or public; and the social hierarchical dimension will be informal, as is the nature of the Twitter platform. We will then identify tweets in an annotated corpus that belong to either of the discourses under discussion, with the help of a predetermined list of associated words and their translations. Given these two buckets, we will run counting analyses on each. These will evaluate, for example, the percentage of topical terminology that appears in Hindi (i.e. that belongs to the Hindi list), the general Hindi word count, and the run lengths of contiguous Hindi phrases, as an indicator of phrasal code-mixing. Finally, we will normalize for text length and interpret the results.
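The counting analyses described in the approach could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: it assumes each tweet arrives as a list of (word, language tag) pairs with the tags "hi" and "en", and the seed word lists and all function names are hypothetical placeholders for the predetermined lists mentioned above.

```python
from itertools import groupby

# Hypothetical seed lists; the project's actual word lists are not specified.
GENDER_TERMS = {"feminism", "feminist", "patriarchy", "naarivaad", "mahila"}
CASTE_TERMS = {"caste", "jaat", "jaati", "dalit", "reservation", "aarakshan"}


def bucket(tokens):
    """Assign a tweet to 'gender', 'caste', 'both', or None by keyword match."""
    words = {word.lower() for word, _ in tokens}
    in_gender = bool(words & GENDER_TERMS)
    in_caste = bool(words & CASTE_TERMS)
    if in_gender and in_caste:
        return "both"
    if in_gender:
        return "gender"
    if in_caste:
        return "caste"
    return None


def hindi_fraction(tokens):
    """Share of Hindi tokens among Hindi and English tokens.

    Dividing by the token count normalizes for text length.
    """
    langs = [lang for _, lang in tokens if lang in ("hi", "en")]
    return langs.count("hi") / len(langs) if langs else 0.0


def hindi_term_share(tokens, terms):
    """Share of matched topical terms that are tagged as Hindi."""
    matched = [lang for word, lang in tokens if word.lower() in terms]
    return matched.count("hi") / len(matched) if matched else 0.0


def run_lengths(tokens, lang="hi"):
    """Lengths of maximal consecutive runs of tokens tagged `lang`."""
    tags = (tag for _, tag in tokens)
    return [len(list(group)) for key, group in groupby(tags) if key == lang]


# Assumed input format: one tweet as (word, language tag) pairs.
tweet = [("Dalit", "en"), ("aarakshan", "hi"), ("zaroori", "hi"), ("hai", "hi")]
print(bucket(tweet), hindi_fraction(tweet), run_lengths(tweet))
```

Averaging `hindi_fraction` and `hindi_term_share` over each bucket would give the per-discourse figures that the hypothesis compares. Tweets that match both lists are kept apart here, as one possible choice, so that they do not blur the comparison.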
Works Cited

Bali, Kalika, Monojit Choudhury, and Silvana Hartmann (2018). An Integrated Representation of Linguistic and Social Functions of Code-Switching. In LREC, 2018, pp. 1615-1622.

Bali, Kalika, Raafiya Begum, and Monojit Choudhury (2016). Functions of Code-Switching in Tweets: An Annotation Scheme and Some Initial Experiments. In LREC, 2016.