Various Public Project Datasets

A shared data repository for MSBA group project datasets.

Datasets

edges_3genres_balanced_500books.csv

Size: 75,000 rows × 4 columns

A genre-balanced dataset of book-user interaction edges spanning three genres (romance and fiction among them). Each row links one user to one book.

Columns: genre, book_id, book_title, user_id
Use case: Network analysis, graph-based projects, genre classification
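
For the network-analysis use case, a minimal sketch of turning the edge list into a bipartite user-book structure, using only the Python standard library. The column names come from the list above; the helper names (`read_edges`, `build_bipartite`, `edges_per_genre`) are hypothetical, and IDs are kept as strings because their format is not documented here.

```python
import csv
from collections import Counter, defaultdict


def read_edges(file_obj):
    """Read edge rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(file_obj))


def build_bipartite(rows):
    """Return (books_by_user, users_by_book) adjacency maps keyed by ID strings."""
    books_by_user = defaultdict(set)
    users_by_book = defaultdict(set)
    for row in rows:
        books_by_user[row["user_id"]].add(row["book_id"])
        users_by_book[row["book_id"]].add(row["user_id"])
    return books_by_user, users_by_book


def edges_per_genre(rows):
    """Count edges per genre, a quick check on how the data is balanced."""
    return Counter(row["genre"] for row in rows)
```

To run this against the real file, open it with `open("edges_3genres_balanced_500books.csv", newline="", encoding="utf-8")` (UTF-8 is an assumption) and pass the handle to `read_edges`.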

train.csv

Size: 5,500 rows × 23 columns

A training dataset that compares two Twitter users (A and B). Each row holds engagement statistics and network features for both users.

Key columns: Choice (target variable), follower counts, engagement metrics, network features for users A and B
Use case: Twitter influence analysis, user comparison models, machine learning classification
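
For the classification use case, a minimal sketch of separating the `Choice` target from the feature columns with the standard library. It assumes every non-target column is numeric and that `Choice` holds integer-like labels; neither is confirmed above, and `split_features_target` is a hypothetical helper name.

```python
import csv


def split_features_target(file_obj, target="Choice"):
    """Split an open CSV file into (feature_names, X, y).

    Assumes all non-target columns parse as floats and the target
    parses as an integer-like label.
    """
    reader = csv.DictReader(file_obj)
    feature_names = [name for name in reader.fieldnames if name != target]
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in feature_names])
        y.append(int(float(row[target])))
    return feature_names, X, y
```

The resulting `X` and `y` lists can be handed to whichever modeling library the project uses. To run it against the real file, pass in `open("train.csv", newline="", encoding="utf-8")` (UTF-8 is an assumption).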

tweets_sample.csv

Size: 3,897 rows × 10 columns

A sample of tweets with user and engagement information.

Columns: ids, screen_name, followers, retweet, inreplyto, favorite, friends, listed, location, text
Use case: Twitter sentiment analysis, text mining, social media research
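
For the text-mining use case, a minimal sketch that counts the most frequent terms in the `text` column. The tokenization rule (lowercase, drop URLs and @mentions, keep runs of letters and apostrophes) is an illustrative assumption, not part of the dataset, and `top_terms` is a hypothetical helper name.

```python
import csv
import re
from collections import Counter

# Strip URLs and @mentions before tokenizing (an illustrative choice).
NOISE = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z']+")


def top_terms(file_obj, n=10):
    """Return the n most common tokens in the 'text' column of an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(file_obj):
        cleaned = NOISE.sub(" ", row["text"].lower())
        counts.update(TOKEN.findall(cleaned))
    return counts.most_common(n)
```

To run it against the real file, pass in `open("tweets_sample.csv", newline="", encoding="utf-8")` (UTF-8 is an assumption). A stop-word filter would be a natural next step before any sentiment work.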

Repository Structure

various-public-project-datasets/
├── README.md                           # This file
├── edges_3genres_balanced_500books.csv # Book-user network data
├── train.csv                           # User comparison training data
├── tweets_sample.csv                   # Tweet samples with metadata
└── keep_txt.txt                        # (Placeholder file)

Usage

These datasets are available for use in MSBA group projects. Feel free to download and use any dataset as needed for your analysis and modeling.

Last updated: February 2026
