Data Chat is a simple yet powerful web application built with Flask, Pandas, and PandasAI. It allows users to upload various types of data files (CSV, Excel, JSON, Text, PDF) and interact with them using natural language. The chatbot can answer questions, analyze data, and even generate plots directly in the web interface.
- 📁 Upload multiple data formats: .csv, .xls, .xlsx, .json, .txt, .pdf
- 💬 Chat with your data using natural language queries
- 📊 Generate and display plots based on data analysis
- 📄 Basic text extraction from PDFs
- ⏳ Loading indicators for file uploads and chat responses
- 👨‍💼 Built-in data analyst persona for contextual and insightful answers
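The supported formats listed above map naturally onto different loaders. The following is a minimal sketch of one way to dispatch on file extension, assuming pandas readers for tabular files and pdfminer.six for PDFs. The names `LOADER_BY_EXTENSION`, `pick_loader`, and `load_file` are hypothetical and are not taken from chatbot.py.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a loader kind.
LOADER_BY_EXTENSION = {
    ".csv": "csv",
    ".xls": "excel",
    ".xlsx": "excel",
    ".json": "json",
    ".txt": "text",
    ".pdf": "pdf",
}


def pick_loader(filename):
    """Return the loader kind for a filename, or None if unsupported."""
    return LOADER_BY_EXTENSION.get(Path(filename).suffix.lower())


def load_file(path):
    """Load a supported file: tabular formats become DataFrames, others plain text."""
    kind = pick_loader(path)
    if kind is None:
        raise ValueError(f"Unsupported file type: {path}")
    if kind == "text":
        return Path(path).read_text(encoding="utf-8", errors="replace")
    if kind == "pdf":
        from pdfminer.high_level import extract_text  # from pdfminer.six
        return extract_text(path)
    import pandas as pd  # imported lazily so pick_loader works without pandas
    readers = {"csv": pd.read_csv, "excel": pd.read_excel, "json": pd.read_json}
    return readers[kind](path)
```

Note that reading legacy .xls files with pandas typically requires the xlrd package in addition to openpyxl, which handles .xlsx.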
Before you begin, make sure you have:
- Python 3.7+
- Git
- An OpenAI API key
- Clone the repository:
    git clone https://github.com/poojithinavolu/
    cd AI_Dataanalyzer
- Create a virtual environment (recommended):
    python -m venv venv
- Activate the environment:
  - On Windows:
      .\venv\Scripts\activate
  - On macOS/Linux:
      source venv/bin/activate
- Install dependencies:
    pip install Flask pandas pandasai numpy matplotlib openpyxl python-dotenv flask-cors pdfminer.six
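To confirm that the install step succeeded, a short check like the one below can help. It is a sketch rather than part of the project: the helper name `missing_packages` is hypothetical, and the mapping pairs each package from the pip command with the module name used to import it.

```python
import importlib.util

# Package name on PyPI -> module name used in `import`.
REQUIRED_MODULES = {
    "Flask": "flask",
    "pandas": "pandas",
    "pandasai": "pandasai",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "openpyxl": "openpyxl",
    "python-dotenv": "dotenv",
    "flask-cors": "flask_cors",
    "pdfminer.six": "pdfminer",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the PyPI names whose modules cannot be found in this environment."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

Run it inside the activated virtual environment so that it inspects the same interpreter the app will use.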
Create a .env file in the root directory and add your OpenAI API key:
    OPENAI_API_KEY='YOUR_API_KEY'
💡 You may hardcode the key in chatbot.py, but using .env is more secure and recommended.
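As an illustration of reading the key from .env instead of hardcoding it, here is a minimal sketch using python-dotenv. The function name `get_openai_key` is hypothetical, and how chatbot.py actually reads the key may differ.

```python
import os

try:
    # load_dotenv comes from the python-dotenv package.
    from dotenv import load_dotenv
except ImportError:  # fall back to variables already set in the shell
    load_dotenv = None


def get_openai_key(environ=None):
    """Return the OpenAI API key, raising a clear error if it is missing."""
    if environ is None:
        if load_dotenv is not None:
            load_dotenv()  # looks for a .env file and loads it into os.environ
        environ = os.environ
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env")
    return key
```

Failing early with an explicit message tends to be easier to debug than an authentication error surfacing later during a chat request.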
- Ensure your virtual environment is activated.
- Run the Flask server:
    python chatbot.py
- Open your browser and go to: http://127.0.0.1:5000/
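If the page does not load, a quick check from a second terminal can tell you whether the server is answering at all. This is a sketch using only the standard library; the helper name `server_is_up` is hypothetical, and the default URL is the one given above.

```python
import urllib.request


def server_is_up(url="http://127.0.0.1:5000/", timeout=2.0):
    """Return True if an HTTP server answers at `url` with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


if __name__ == "__main__":
    print("Server reachable:", server_is_up())
```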
- Upload a supported file (.csv, .xls, .xlsx, .json, .txt, or .pdf).
- Once uploaded, you’ll see a confirmation message in the chat box.
- Ask questions or request plots about the data in plain English.
- Get responses, insights, and visualizations from the AI-powered data analyst.
your-project-folder/
├── chatbot.py          # Main Flask app
├── .env                # Environment file with OpenAI API key
├── uploads/            # Uploaded files (auto-created)
├── static/
│   └── plots/          # Optional: plot images if saved as files
└── templates/
    └── index.html      # Frontend template
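The layout above notes that uploads/ is created automatically. One way this kind of startup step could be written is sketched below; the helper name `ensure_runtime_dirs` is hypothetical, and the actual logic in chatbot.py may differ.

```python
from pathlib import Path

# Folder names taken from the project layout.
RUNTIME_DIRS = ("uploads", "static/plots")


def ensure_runtime_dirs(base_dir="."):
    """Create the upload and plot folders under base_dir if they are missing."""
    created = []
    for relative in RUNTIME_DIRS:
        path = Path(base_dir) / relative
        # exist_ok=True makes the call safe to repeat on every startup.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```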
- PDF Support: Only plain text is extracted. Tabular data may require advanced PDF parsing.
- Security: Avoid hardcoding API keys in production. Use .env or environment variables.
- Scalability: Designed for small/medium datasets and single users. Consider scaling for larger or concurrent use.
Contributions are welcome! Feel free to fork the repo and submit pull requests with improvements, features, or fixes.