A shared data repository for MSBA group project datasets.
Size: 75,000 rows × 4 columns
A balanced dataset of book-user interaction edges across three genres (including romance and fiction).
- Columns: genre, book_id, book_title, user_id
- Use case: Network analysis, graph-based projects, genre classification
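
As a starting point for network analysis, the sketch below reads the edge list into `(user_id, book_id)` pairs and counts books per genre and readers per book. It uses only the standard library and the four column names listed above. The function name `load_edges` is hypothetical, and UTF-8 encoding is an assumption about the file.

```python
import csv
from collections import Counter, defaultdict


def load_edges(path):
    """Return (edges, books_per_genre, readers_per_book) from the edge CSV."""
    edges = []                         # one (user_id, book_id) pair per row
    books_by_genre = defaultdict(set)  # genre -> distinct book_ids
    readers_per_book = Counter()       # book_id -> number of edge rows
    # Assumption: the file is UTF-8 and has the header
    # genre, book_id, book_title, user_id.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            edges.append((row["user_id"], row["book_id"]))
            books_by_genre[row["genre"]].add(row["book_id"])
            readers_per_book[row["book_id"]] += 1
    books_per_genre = {g: len(ids) for g, ids in books_by_genre.items()}
    return edges, books_per_genre, readers_per_book
```

Calling `load_edges("edges_3genres_balanced_500books.csv")` from the repository root should give an edge list that can be passed to a graph library of your choice.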
Size: 5,500 rows × 23 columns
A training dataset with comparative metrics between two Twitter users (A and B). Each row contains user engagement statistics and network features.
- Key columns: Choice (target variable), follower counts, engagement metrics, network features for users A and B
- Use case: Twitter influence analysis, user comparison models, machine learning classification
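
For classification work, a first step is usually to separate the `Choice` target from the feature columns. The sketch below does that with the standard library. It assumes every non-target column is numeric and that `Choice` is coded as 0/1, neither of which is confirmed here. The function name `split_features_target` is hypothetical.

```python
import csv


def split_features_target(path, target="Choice"):
    """Return (feature_names, X, y) from a CSV with one target column."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Every column other than the target is treated as a feature.
        feature_names = [c for c in reader.fieldnames if c != target]
        X, y = [], []
        for row in reader:
            # Assumption: all feature values parse as floats.
            X.append([float(row[c]) for c in feature_names])
            # Assumption: the target is a 0/1 label, possibly stored as "1.0".
            y.append(int(float(row[target])))
    return feature_names, X, y
```

The returned `X` and `y` are plain Python lists, so they can be handed to most modeling libraries after conversion to arrays.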
Size: 3,897 rows × 10 columns
A sample of tweets with user and engagement information.
- Columns: ids, screen_name, followers, retweet, inreplyto, favorite, friends, listed, location, text
- Use case: Twitter sentiment analysis, text mining, social media research
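
For a quick look at the tweet text before any sentiment or text-mining work, the sketch below counts the most frequent terms in the `text` column. The tokenization (lowercasing, dropping URLs, keeping alphabetic tokens of at least three characters) is a simplifying assumption, not a recommended preprocessing pipeline. The function name `top_terms` is hypothetical.

```python
import csv
import re
from collections import Counter

URL = re.compile(r"https?://\S+")
TOKEN = re.compile(r"[a-z']+")


def top_terms(path, n=10, min_len=3):
    """Return the n most common lowercase terms in the `text` column."""
    counts = Counter()
    # Assumption: the file is UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Drop URLs first so their fragments are not counted as words.
            text = URL.sub(" ", row["text"].lower())
            counts.update(t for t in TOKEN.findall(text) if len(t) >= min_len)
    return counts.most_common(n)
```

Calling `top_terms("tweets_sample.csv", n=20)` returns a list of `(term, count)` pairs; adding a stop-word filter would be a natural next step.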
various-public-project-datasets/
├── README.md # This file
├── edges_3genres_balanced_500books.csv # Book-user network data
├── train.csv # User comparison training data
├── tweets_sample.csv # Tweet samples with metadata
└── keep_txt.txt # (Placeholder file)
These datasets are available for use in MSBA group projects. Feel free to download and use any dataset as needed for your analysis and modeling.
Last updated: February 2026