This repo contains the source code and data for the paper Neutral Bots Probe Political Bias on Social Media by Chen et al. (DOI, preprint).
Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots (we call them drifters) on Twitter to probe biases that may emerge from interactions between users, platform mechanisms, and manipulation by inauthentic actors.
- `bot` contains the source code of the drifters. Use `cd bot; python drifter_main.py <your drifter screen_name>` to activate the drifters.
- `data` contains the code and intermediate data files for the analyses.
  - `data/hashtag_political_alignment` has the implementation of hashtag embedding.
  - `data/GenerateDataFiles.ipynb` generates data files for our analyses.
- `database` contains the script to create a PostgreSQL database for the analyses.
- `exps` contains scripts and notebooks to generate the plots and tables for the paper.
  - `exps/FinalPaperPlots.ipynb` reads the output files generated by `data/GenerateDataFiles.ipynb` to produce the final figures.
  - `exps/algorithm_bias_estimation.ipynb` reads the output files generated by `data/GenerateDataFiles.ipynb` and runs statistical tests to estimate the algorithmic bias.
  - `exps/news_seed_popularity` contains the scripts and data for the news seed popularity analysis.
- `metric` contains the scripts to preprocess the collected data for further analyses. Run `metric_job.py` as an example.
  - `analysis.py` and `metrics.py` compute the hashtag-based and URL-based political valence score for each tweet.
  - `time_series_scores.py` computes the political valence changes across time for all drifters. The notebook `data/GenerateDataFiles.ipynb` calls this script.
  - `generate_networks_for_each_bot.py` builds the ego networks for drifters and computes metrics for the echo chamber analyses. The notebook `data/GenerateDataFiles.ipynb` depends on files generated by this script.
- `others` contains the code to initialize and clean up the drifters.
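
To make the idea of a hashtag-based valence score concrete, here is a minimal sketch. It assumes that each hashtag has a precomputed alignment score in [-1, 1] and that a tweet's score is the mean over its scored hashtags. This is an illustrative assumption, not necessarily the computation in `analysis.py` or `metrics.py`; the function name `tweet_valence` and the example hashtags are hypothetical.

```python
from statistics import mean
from typing import Mapping, Optional, Sequence


def tweet_valence(
    hashtags: Sequence[str], alignment: Mapping[str, float]
) -> Optional[float]:
    """Return the mean alignment of the hashtags that have a known score.

    Hashtags are matched case-insensitively. Returns None when no hashtag
    in the tweet has a score, so unscored tweets can be skipped downstream.
    """
    scores = [alignment[h.lower()] for h in hashtags if h.lower() in alignment]
    return mean(scores) if scores else None


# Hypothetical alignment table: -1 is one end of the spectrum, +1 the other.
alignment = {
    "#exampleleft": -1.0,
    "#exampleright": 1.0,
    "#exampleneutral": 0.0,
}

score = tweet_valence(["#ExampleLeft", "#exampleneutral", "#unscored"], alignment)
print(score)
```

Returning `None` rather than `0.0` for tweets without scored hashtags keeps "no signal" distinct from "neutral" when scores are later aggregated over time.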
The software in this repository has been tested on a Linux machine with Python 3 installed. Installing Python and the dependencies below might require up to one hour. Data collection for an experiment similar to the one described in the paper would require several months. Data processing would typically take a few days.
- Python 3
- Jupyter Notebook is used in some cases to process and visualize the data.
- twurl, modified as shown here, is used to manage the drifters. First you need to create a Twitter app. Each drifter account must authorize the app. The keys of the app can then be used with `twurl` to control the drifters.
- chatterbot is used when drifters reply to tweets that mention them.
- tweepy is used in analysis code.
- psycopg2 is the database driver.
- botometer client library is used in conjunction with the Botometer Pro API to get data from the Twitter API and then calculate bot scores for friends and followers of the drifters.
- gensim provides an implementation of the word2vec algorithm for calculating the political alignment of the hashtags.
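
One common way to turn word embeddings into a one-dimensional alignment score is to project each vector onto the axis between two anchor items. The sketch below shows that projection in plain Python on toy 2-D vectors. It is an assumption for illustration, not necessarily the procedure in `data/hashtag_political_alignment`; in practice the vectors would come from a trained word2vec model, and the anchor hashtags and the function name `alignment_scores` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def _sub(a: Vector, b: Vector) -> Vector:
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def _dot(a: Vector, b: Vector) -> float:
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def alignment_scores(
    vectors: Dict[str, Vector], left_anchor: str, right_anchor: str
) -> Dict[str, float]:
    """Project every vector onto the axis running from left_anchor to right_anchor.

    The projection is rescaled so the left anchor maps to -1 and the right
    anchor maps to +1; other items can fall inside or outside that range.
    """
    left, right = vectors[left_anchor], vectors[right_anchor]
    axis = _sub(right, left)
    norm_sq = _dot(axis, axis)
    if norm_sq == 0:
        raise ValueError("anchor vectors must differ")
    midpoint = [(l + r) / 2 for l, r in zip(left, right)]
    return {
        tag: 2 * _dot(_sub(vec, midpoint), axis) / norm_sq
        for tag, vec in vectors.items()
    }


# Toy embeddings standing in for word2vec output (hypothetical values).
vectors = {
    "#anchor_left": [0.0, 0.0],
    "#anchor_right": [2.0, 0.0],
    "#other": [1.5, 3.0],
}

scores = alignment_scores(vectors, "#anchor_left", "#anchor_right")
print(scores)
```

Only the component along the anchor axis contributes to the score, so `#other` is scored by its first coordinate alone and its offset in the second dimension is ignored.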
You may cite our paper as:
@article{chen2021neutral,
title={Neutral bots probe political bias on social media},
author={Chen, Wen and Pacheco, Diogo and Yang, Kai-Cheng and Menczer, Filippo},
journal={Nature Communications},
volume={12},
number={1},
pages={1--10},
year={2021},
publisher={Nature Publishing Group}
}