A modern web application that leverages web scraping and natural language processing to automatically gather and process course information. Built with the MERN stack, it provides an intelligent course planning solution.

**Automated Data Collection**
- Web scraping using Selenium and Beautiful Soup
- Real-time course information updates
- Automated data cleaning and processing
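
As a sketch of how these pieces fit together, the snippet below drives a browser with Selenium and hands the rendered HTML to Beautiful Soup. The catalog URL and CSS selectors are hypothetical placeholders, not the values the project's scrapers actually use:

```python
# Sketch only: CATALOG_URL and the CSS selectors are hypothetical,
# not the values the project's scrapers actually use.
from bs4 import BeautifulSoup
from selenium import webdriver

CATALOG_URL = "https://example.edu/catalog/courses"  # placeholder

driver = webdriver.Chrome()  # assumes chromedriver is on your PATH
try:
    driver.get(CATALOG_URL)  # let the page (and any JS) render
    soup = BeautifulSoup(driver.page_source, "html.parser")
    courses = [
        {
            "code": row.select_one(".course-code").get_text(strip=True),
            "title": row.select_one(".course-title").get_text(strip=True),
        }
        for row in soup.select(".course-row")  # hypothetical markup
    ]
finally:
    driver.quit()
```

Selenium earns its keep only when the catalog renders content with JavaScript; for purely static pages, fetching the HTML and parsing it with Beautiful Soup alone would suffice.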

**Intelligent Processing**
- Natural Language Processing for course description analysis
- Course prerequisite mapping
- Automated course categorization
- Keyword extraction and topic modeling
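
For example, keyword extraction with NLTK can be as simple as counting non-stopword tokens; the actual pipeline in `processors/` is presumably more involved, but this is the shape of it:

```python
from collections import Counter

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("stopwords", quiet=True)
nltk.download("punkt", quiet=True)  # newer NLTK releases use "punkt_tab"

def extract_keywords(description: str, top_n: int = 5) -> list[str]:
    """Return the most frequent non-stopword tokens in a description."""
    stop = set(stopwords.words("english"))
    tokens = [
        t.lower()
        for t in word_tokenize(description)
        if t.isalpha() and t.lower() not in stop
    ]
    return [word for word, _ in Counter(tokens).most_common(top_n)]

print(extract_keywords(
    "Introduces data structures, algorithms, and complexity analysis."
))
# e.g. ['introduces', 'data', 'structures', 'algorithms', 'complexity']
```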

**Smart Scheduling**
- Real-time schedule generation
- Conflict detection and resolution
- Prerequisite validation
- Schedule optimization
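
Of these, conflict detection is the most mechanical: two course meetings clash when they share a day and their time ranges overlap. A minimal illustration (written in Python for brevity; the project's scheduling logic actually lives in the Node backend):

```python
from dataclasses import dataclass

@dataclass
class Meeting:
    day: str    # e.g. "Mon"
    start: int  # minutes since midnight, e.g. 9:00 -> 540
    end: int

def conflicts(a: Meeting, b: Meeting) -> bool:
    """Two meetings conflict if they share a day and their times overlap."""
    return a.day == b.day and a.start < b.end and b.start < a.end

cs101 = Meeting("Mon", 540, 590)  # 9:00-9:50
math2 = Meeting("Mon", 570, 620)  # 9:30-10:20
assert conflicts(cs101, math2)    # overlapping Monday slots
```

Prerequisite validation works similarly: walk the prerequisite mapping and flag any course whose prerequisites are missing from the candidate schedule.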

**Modern Web Interface**
- Responsive React frontend
- Interactive schedule visualization
- Real-time updates
- User-friendly interface

**Tech Stack**

Backend:
- Node.js - Runtime environment
- Express.js - Web framework
- MongoDB - Database

Data collection & NLP:
- Python - Data processing and NLP
- NLTK - Natural Language Processing
- Selenium - Web scraping
- Beautiful Soup - HTML parsing

Frontend:
- React - UI framework
- Material-UI - Component library
- Redux - State management
- Axios - API client

**Getting Started**

- Clone the repository

  ```bash
  git clone https://github.com/yourusername/course-scheduler.git
  cd course-scheduler
  ```

- Install Python dependencies

  ```bash
  cd scraper
  pip install -r requirements.txt
  ```

- Install Node.js dependencies

  ```bash
  # Backend dependencies
  cd ../server
  npm install

  # Frontend dependencies
  cd ../client
  npm install
  ```

- Set up environment variables

  In the server directory, create `.env`:

  ```
  PORT=5000
  MONGODB_URI=mongodb://localhost:27017/course-scheduler
  NODE_ENV=development
  ```

  In the scraper directory, create `.env`:

  ```
  CHROME_DRIVER_PATH=/path/to/chromedriver
  ```

- Start the development servers

  ```bash
  # Start backend server
  cd server
  npm run dev

  # Start frontend server in a new terminal
  cd client
  npm start

  # Run scraper (when needed)
  cd scraper
  python main.py
  ```

The application will be available at http://localhost:3000.

**Project Structure**

```
course-scheduler/
│
├── client/                # React frontend
│   ├── src/
│   │   ├── components/    # UI components
│   │   ├── pages/         # Page components
│   │   ├── redux/         # State management
│   │   └── utils/         # Utility functions
│   │
│   └── public/            # Static files
│
├── server/                # Node.js backend
│   ├── controllers/       # Request handlers
│   ├── models/            # Database models
│   ├── routes/            # API routes
│   └── utils/             # Utility functions
│
└── scraper/               # Python scraper
    ├── processors/        # Data processors
    ├── scrapers/          # Web scrapers
    └── utils/             # Utility functions
```

**API Endpoints**

- `GET /api/courses` - Get all courses
- `GET /api/courses/:id` - Get course by ID
- `POST /api/schedules` - Generate new schedule
- `GET /api/schedules/:id` - Get schedule by ID
Detailed API documentation is available in the API.md file.
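
With the backend running on the port from its `.env` (5000 in the setup above), the endpoints can be exercised from any HTTP client. A quick illustration in Python using `requests` (not a project dependency, and the request/response shapes shown here are assumptions):

```python
import requests

BASE = "http://localhost:5000"  # PORT from the server's .env

# Fetch the full course list
courses = requests.get(f"{BASE}/api/courses", timeout=10).json()

# Ask the backend to generate a schedule
# (the request body shape and "_id" field are assumed, not documented here)
resp = requests.post(
    f"{BASE}/api/schedules",
    json={"courseIds": [c["_id"] for c in courses[:3]]},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```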

**Data Pipeline**

Data flows through three stages:

**Web Scraping**
- Course information collection
- Schedule data gathering
- Real-time updates

**Data Processing**
- Text cleaning and normalization
- NLP processing
- Keyword extraction
- Topic modeling
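
As an illustration of the cleaning step, a normalizer might lowercase, strip punctuation, and lemmatize with NLTK; whether the actual processors lemmatize is an assumption here:

```python
import re

import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)  # lemmatizer data

lemmatizer = WordNetLemmatizer()

def clean_text(raw: str) -> str:
    """Lowercase, drop non-letters, collapse whitespace, lemmatize."""
    text = re.sub(r"[^a-z\s]", " ", raw.lower())
    return " ".join(lemmatizer.lemmatize(w) for w in text.split())

print(clean_text("Covers stacks, queues, and trees."))
# -> "cover stack queue and tree"
```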

**Storage**
- MongoDB database storage
- Data validation
- Indexing for quick retrieval
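
On the Python side, storage could be a straightforward upsert plus an index, e.g. with `pymongo` (an assumption; this README doesn't name the scraper's Mongo client), pointed at the same `MONGODB_URI` as the server:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["course-scheduler"]

course = {
    "code": "CS101",  # hypothetical document shape
    "title": "Intro to Computer Science",
    "keywords": ["data", "structures", "algorithms"],
}

# Upsert keyed on course code so re-running the scraper updates in place
db.courses.update_one({"code": course["code"]}, {"$set": course}, upsert=True)

# Index (and de-duplicate on) the field used for lookups
db.courses.create_index("code", unique=True)
```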

**Contributing**

- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request

**Testing**

```bash
# Backend tests
cd server
npm test

# Frontend tests
cd client
npm test

# Scraper tests
cd scraper
python -m pytest
```

**License**

This project is licensed under the MIT License - see the LICENSE file for details.

**Acknowledgments**

- NLTK for natural language processing
- Selenium for web automation
- Beautiful Soup for HTML parsing
- MongoDB for the database
- Express.js for the backend framework
- React for the frontend framework