Your buddy for analytics.
DataScout is an interactive web application built with Streamlit that lets users clean, analyze, and chat with their data. It uses AI to generate custom cleaning plans and provides interactive visualizations that help surface insights.
- 🤖 AI-Powered Cleaning: Uses an LLM to generate a data cleaning plan tailored to your dataset.
- 📊 Interactive EDA: Explore your data with interactive charts for categorical distributions, numerical histograms, and correlation analysis.
- 💬 Q&A with Your Data: Ask questions in plain English and get AI-generated answers based on your data.
- 📥 Downloadable Assets: Download the cleaned dataset as a CSV and save plots as high-quality PNG images.
- 📁 Multi-File Support: Works seamlessly with CSV and Excel files.
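The multi-file support listed above can be pictured as a small dispatch on the file extension. The following is a minimal sketch, not DataScout's actual code: the names `detect_file_kind` and `load_table` are hypothetical, and it assumes pandas is installed (plus an Excel engine such as openpyxl for `.xlsx` files).

```python
from pathlib import Path
from typing import IO, Union

# Hypothetical mapping from file extension to a loader kind.
_KINDS = {".csv": "csv", ".xlsx": "excel", ".xls": "excel"}


def detect_file_kind(filename: str) -> str:
    """Return "csv" or "excel" based on the extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    try:
        return _KINDS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or filename!r}") from None


def load_table(filename: str, source: Union[str, IO[bytes]]):
    """Load a CSV or Excel file into a pandas DataFrame."""
    # Imported here so the extension check works even without pandas installed.
    import pandas as pd

    if detect_file_kind(filename) == "csv":
        return pd.read_csv(source)
    return pd.read_excel(source)
```

In a Streamlit app, the object returned by `st.file_uploader` is file-like and carries a `.name` attribute, so a call such as `load_table(uploaded.name, uploaded)` would fit this shape.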
Follow these instructions to set up and run a local copy of DataScout.
- Python 3.8+
- Git
- Clone the repository:
  git clone https://github.com/your-username/DataScout.git
- Navigate to the project directory:
  cd DataScout
- Install the required dependencies:
  pip install -r requirements.txt
- Set up your API key:
  - Create a folder named .streamlit in the project's root directory.
  - Inside the .streamlit folder, create a file named secrets.toml.
  - Add your Together AI API key to this file:
    TOGETHER_API_KEY = "your_api_key_here"
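Streamlit exposes the values in `.streamlit/secrets.toml` through `st.secrets`, which behaves like a read-only mapping. As a minimal sketch of how an app might read the key, the helper below takes any mapping so it can be tried without Streamlit. The name `resolve_api_key` and the environment-variable fallback are assumptions for illustration, not part of DataScout.

```python
import os
from typing import Mapping, Optional

KEY_NAME = "TOGETHER_API_KEY"  # the key name used in secrets.toml


def resolve_api_key(
    secrets: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from `secrets`, falling back to the environment."""
    env = os.environ if env is None else env
    key = secrets.get(KEY_NAME) or env.get(KEY_NAME)
    # Treat the placeholder from the setup instructions as "not configured".
    if not key or key == "your_api_key_here":
        raise RuntimeError(f"{KEY_NAME} is not set; add it to .streamlit/secrets.toml")
    return key
```

Inside the app this would likely be called as `resolve_api_key(st.secrets)` after `import streamlit as st`. Streamlit typically raises its own error when no secrets file exists at all, so the fallback mainly helps when the file is present but the key is missing.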
Launch the Streamlit application by running the following command in your terminal:
streamlit run main.py
- Launch the app and wait for it to load in your browser.
- Upload your CSV or Excel file using the file uploader.
- Review the AI-generated cleaning plan and the cleaned data preview.
- Explore your data using the interactive dropdowns in the "Visual Analysis" section.
- Ask questions about your data in the "Ask AI about your data" section to get quick insights.
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.