This folder contains the data processing scripts for the Anime Data Visualization. Each script is a Python Jupyter notebook that outputs a JSON file when run; the output can then be manually tweaked for formatting. The final JSON files used by the data visualization app are placed in the absolute folder /app/public/data.
To run the scripts you will need:
- Python 3 (link): scripting and programming runtime and environment
- Jupyter Notebook (link): web-based development application for interactive Python scripts
- Pandas (link): data analysis library for Python
- JikanPy (link): Python wrapper library around the Jikan API
- The original dataset CSVs, which can be found here, placed in the relative folder ./raw (you may need to create it)
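Before opening the notebooks, it can help to confirm that the ./raw folder exists and holds the input CSVs. The following is a minimal sketch, not part of the repository: the helper name missing_inputs is hypothetical, and the two file names are the ones the notebook descriptions in this README refer to.

```python
from pathlib import Path

# File names taken from the notebook descriptions in this README.
EXPECTED_CSVS = ["anime_cleaned.csv", "AnimeList.csv"]


def missing_inputs(raw_dir="raw", expected=EXPECTED_CSVS):
    """Create raw_dir if needed and return the expected CSV names not found in it."""
    raw = Path(raw_dir)
    raw.mkdir(parents=True, exist_ok=True)  # no error if the folder already exists
    return [name for name in expected if not (raw / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing from ./raw:", ", ".join(missing))
    else:
        print("All input CSVs found.")
```

Run it from this folder so that the relative path raw resolves to ./raw.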
Each .ipynb can be opened in your local Jupyter Notebook instance as a standalone script.
- Genres.ipynb: takes as input raw/anime_cleaned.csv and produces genre_data.json and genre_top_animes_data.json, containing the data for the genres bubble diagram.
- History.ipynb: takes as input raw/AnimeList.csv and produces history.json, containing the data for the histogram.
- Actors.ipynb: takes as input raw/anime_cleaned.csv and fetches data from the Jikan API. It outputs vA_datasets.json and vA_infos.json (the latter is to be placed in /app/src/pages/chord for bundling), which contain the data for the actors chord diagram.
- Studios.ipynb: takes as input raw/anime_cleaned.csv and produces sankey_dataset.json, which contains the data for the studios sankey diagram.
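All four notebooks follow the same overall shape: read a CSV from ./raw and write JSON. The sketch below illustrates only that shape with Pandas; it is not the code from any notebook. The function name csv_to_json is hypothetical, and the real notebooks perform their own aggregation before exporting.

```python
import pandas as pd


def csv_to_json(csv_path, json_path, columns=None):
    """Load a CSV and write its rows to json_path as a JSON array of objects.

    columns optionally restricts which CSV columns are read (None reads all).
    Returns the number of rows written.
    """
    df = pd.read_csv(csv_path, usecols=columns)
    # orient="records" yields [{"col": value, ...}, ...]; missing values become null.
    df.to_json(json_path, orient="records")
    return len(df)
```

For instance, csv_to_json("raw/anime_cleaned.csv", "example.json") would dump every row of the cleaned dataset; the output name example.json is a placeholder, not one of the files the app consumes.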