A shared data repository for MSBA group project datasets.
File: edges_3genres_balanced_500books.csv
Size: 75,000 rows × 4 columns
A balanced dataset of book-user interaction edges across three genres (including romance and fiction). Each row is one edge linking a user to a book.
- Columns: genre, book_id, book_title, user_id
- Use case: Network analysis, graph-based projects, genre classification
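As a starting point for network analysis, the edge list can be summarized per genre and per book. The sketch below uses only the Python standard library and the four column names listed above. The sample rows are made up for illustration and are not taken from the dataset, and the helper names (`load_edges`, `genre_counts`, `book_degrees`) are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def load_edges(handle):
    """Read edge rows as dicts keyed by genre, book_id, book_title, user_id."""
    return list(csv.DictReader(handle))


def genre_counts(edges):
    """Count edges per genre, e.g. to check that the genres are balanced."""
    return Counter(row["genre"] for row in edges)


def book_degrees(edges):
    """Map each book_id to the number of distinct users connected to it."""
    users_by_book = defaultdict(set)
    for row in edges:
        users_by_book[row["book_id"]].add(row["user_id"])
    return {book: len(users) for book, users in users_by_book.items()}


# Made-up sample rows (assumption: same header as the real CSV).
SAMPLE = """genre,book_id,book_title,user_id
romance,b1,Title One,u1
romance,b1,Title One,u2
fiction,b2,Title Two,u1
"""

edges = load_edges(io.StringIO(SAMPLE))
print(genre_counts(edges))
print(book_degrees(edges))

# To run on the real file instead of the sample:
# with open("edges_3genres_balanced_500books.csv", newline="") as f:
#     edges = load_edges(f)
```

The per-book degree is a simple first measure for a bipartite book-user graph; a graph library could be swapped in for anything beyond counting.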
File: train.csv
Size: 5,500 rows × 23 columns
A training dataset with comparative metrics between two Twitter users (A and B). Each row contains user engagement statistics and network features.
- Key columns: Choice (target variable), follower counts, engagement metrics, network features for users A and B
- Use case: Twitter influence analysis, user comparison models, machine learning classification
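Before fitting a classifier, it helps to separate the `Choice` target from the feature columns and check the class balance. The sketch below assumes that `Choice` is integer-coded and that every other column is numeric. The feature names `A_follower_count` and `B_follower_count` and the sample values are hypothetical stand-ins, since the full column list is not given here.

```python
import csv
import io
from collections import Counter

TARGET = "Choice"  # target column named in the description above


def split_features_target(rows, target=TARGET):
    """Split rows into numeric feature dicts and integer labels.

    Assumption: every non-target column can be parsed as a float.
    """
    features, labels = [], []
    for row in rows:
        labels.append(int(row[target]))
        features.append({k: float(v) for k, v in row.items() if k != target})
    return features, labels


# Made-up sample rows; the feature column names are hypothetical.
SAMPLE = """Choice,A_follower_count,B_follower_count
1,1200,300
0,50,9000
1,700,650
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
X, y = split_features_target(rows)
print(Counter(y))  # class balance of the target

# To run on the real file instead of the sample:
# with open("train.csv", newline="") as f:
#     rows = list(csv.DictReader(f))
```

If the real file contains non-numeric columns, the float conversion will raise a `ValueError`, which is a quick way to find columns that need separate handling.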
File: tweets_sample.csv
Size: 3,897 rows × 10 columns
A sample of tweets with user and engagement information.
- Columns: ids, screen_name, followers, retweet, inreplyto, favorite, friends, listed, location, text
- Use case: Twitter sentiment analysis, text mining, social media research
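For text mining, a common first step is to tokenize the `text` column and count term frequencies. The sketch below is a minimal standard-library version, not a full preprocessing pipeline. The sample tweets are made up, and the helper names (`tokenize`, `top_terms`) are hypothetical.

```python
import csv
import io
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text, drop URLs, and split into alphanumeric tokens."""
    return TOKEN_RE.findall(URL_RE.sub(" ", text.lower()))


def top_terms(rows, n=10):
    """Return the n most frequent tokens across the 'text' column."""
    counts = Counter()
    for row in rows:
        counts.update(tokenize(row["text"]))
    return counts.most_common(n)


# Made-up sample rows using two of the listed columns.
SAMPLE = """screen_name,text
user_a,Loving the new data set! https://example.com/x
user_b,"New data, new models"
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(top_terms(rows, n=2))

# To run on the real file instead of the sample
# (the encoding is an assumption; adjust if the file differs):
# with open("tweets_sample.csv", newline="", encoding="utf-8") as f:
#     rows = list(csv.DictReader(f))
```

This tokenizer strips `@` and `#` but keeps the word that follows them, so mentions and hashtags are counted as ordinary terms. Stop-word removal is left out on purpose.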
various-public-project-datasets/
├── README.md # This file
├── edges_3genres_balanced_500books.csv # Book-user network data
├── train.csv # User comparison training data
├── tweets_sample.csv # Tweet samples with metadata
└── keep_txt.txt # (Placeholder file)
These datasets are available for use in MSBA group projects. Feel free to download and use any dataset as needed for your analysis and modeling.
Last updated: February 2026