Enhance core logic with ML, add web scraping for data, and improve JSON data handling #1

@yrdaman

Current problems

The current system relies heavily on if/else logic for comparisons and decisions. This creates:

  • Long, repetitive code
  • Difficulty scaling for new rules
  • Limited flexibility for dynamic data changes

There is also no automated data gathering: data is added manually or read from static files.

Proposed improvements

✅ Replace core logic with ML

  • Use Machine Learning models for smarter comparisons and decisions
  • Reduce the complexity of manual if/else blocks
  • Enable the app to adapt to new patterns without manual rule updates

Possible ML approaches:

  • Decision trees
  • Random forests
  • Logistic regression
  • Other classification models, or clustering if no labeled data is available
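
As a rough illustration of the idea, here is a minimal sketch of logistic regression learning a decision that would otherwise be a hand-written if/else rule. Everything in it is an assumption: the feature names (`score_gap`, `price_gap`), the training data, and the "prefer A" label are invented, since the real features depend on what the app compares. It uses only the standard library so it runs as-is; in practice a library such as scikit-learn would probably be the better choice.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # error = predicted probability minus true label
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x) -> int:
    """Return 1 if the model's probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical training data: [score_gap, price_gap] -> 1 means "prefer A".
X = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train(X, y)
decision = predict(weights, bias, [0.85, 0.15])
```

The point of the sketch is that adding a new pattern means adding labeled rows to the training data rather than another branch in the code. Whether that trade is worth it depends on how much labeled data the project can actually collect.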

✅ Add web scraping for data gathering

  • Implement Python web scraping to fetch updated data from reliable websites
  • Parse and store scraped data into JSON files for the app to use
  • Schedule scraping periodically (e.g. with cron jobs or GitHub Actions)

Possible libraries:

  • BeautifulSoup
  • requests
  • Selenium (if JavaScript rendering required)
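
A minimal sketch of the parse-and-store step, under several assumptions: the target page exposes its data as a two-column HTML table, and the output fields are called `name` and `value` (both invented here, since no target site has been selected yet). To keep the example runnable offline it parses an inline HTML string and swaps in the standard library's `html.parser` where the real script would likely use requests to fetch the page and BeautifulSoup to parse it.

```python
import json
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            # Header rows contain only <th> cells, so they stay empty and are skipped.
            if self._row:
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def parse_items(html: str) -> list[dict]:
    """Turn two-column table rows into {"name", "value"} records."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [
        {"name": row[0], "value": float(row[1])}
        for row in parser.rows
        if len(row) == 2
    ]


# Stand-in for a fetched page; the real HTML structure is unknown.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>alpha</td><td>1.5</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""

items = parse_items(SAMPLE_HTML)
payload = json.dumps(items, indent=2)  # text that would be written to the JSON file
```

Keeping the parsing in a function that takes a string makes it easy to test against saved copies of a page, independent of the network fetch and of whatever scheduler runs the scraper.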

✅ Improve JSON file handling

  • Redesign JSON data structure for better readability and maintainability
  • Add validation logic to check JSON data consistency
  • Optimize JSON read/write performance if files are large
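
The validation idea could look roughly like the sketch below. The schema (`name` as a string, `value` as a number) and the function names `validate_records` and `save_records` are hypothetical placeholders until the new data format is drafted. The save step writes to a temporary file and then renames it, so a failed write should not leave a half-written JSON file behind.

```python
import json
import os
import tempfile

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"name": str, "value": (int, float)}


def validate_records(records):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(records, list):
        return ["top-level value must be a list"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: expected an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in rec:
                problems.append(f"record {i}: missing '{field}'")
            # bool is a subclass of int in Python, so reject it explicitly.
            elif isinstance(rec[field], bool) or not isinstance(rec[field], expected):
                problems.append(f"record {i}: '{field}' has wrong type")
    return problems


def save_records(path, records):
    """Validate, then write atomically via a temp file in the same directory."""
    problems = validate_records(records)
    if problems:
        raise ValueError("; ".join(problems))
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(records, handle, indent=2)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling `save_records("data.json", records)` (the file name is only an example) either replaces the file with valid data or raises `ValueError` and leaves the existing file untouched. If the project later adopts a formal JSON Schema, this hand-rolled check could be replaced by a schema validator.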

Benefits

  • More intelligent, flexible decision-making
  • Automatically updated data for improved accuracy
  • Cleaner, more maintainable codebase
  • Easier scaling for future features

Next steps

  • Identify which parts of code can transition to ML
  • Define data requirements for ML model training
  • Select target websites for scraping
  • Draft JSON schema for the new data format
  • Plan gradual implementation to avoid breaking changes

Files possibly affected:

  • core logic files (e.g. core_logic.py, if it exists)
  • data handling scripts
  • JSON data files
