Conversation

@Chaos02 Chaos02 commented Mar 4, 2022

Implemented a dynamic SeasonData importer using https://apexlegends.fandom.com/wiki/Season
Also provides has_duos and has_arenas booleans for game modes.
TODO: maybe save memory and delete the DataFrame?
TODO: maybe implement caching for website data
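The has_duos / has_arenas idea could be represented with a small dataclass along these lines (a sketch only; everything beyond the two booleans mentioned above, including the field names and dates, is illustrative):

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Season:
    """One row scraped from the wiki's season table (sketch only)."""
    index: int                # season number
    start: date               # season start date
    end: date                 # season end date
    has_duos: bool = False    # whether the Duos mode was available
    has_arenas: bool = False  # whether the Arenas mode was available
```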

Chaos02 added 3 commits March 4, 2022 22:52
`TODO: maybe save memory and delete DataFrame?`
didnt think far enough...
Contributor

synap5e commented Mar 4, 2022

Thanks for this, it's a good start :)

Currently this would lead to scraping the wiki for every game processed, and filesystem caching isn't a practical way to mitigate that since the processing is serverless.

I do like using an automated source for the data rather than a manual process, but to make it work I think we would want to do a scheduled scrape to overtrack's own infrastructure. This would also remove pandas as a dependency for running overtrack, since pandas could just be used as part of the scrape.
The automation could be a GitHub Action or a Lambda application; each has its pros and cons.
I think the next step would be to split this script into a downloader/parser and a loader.

Downloader/Parser:
Download https://apexlegends.fandom.com/wiki/Season and extract the data to a .json file, e.g. published at something like https://api2.overtrack.gg/data/overwatch/seasons
The parser could have pandas or other parsing-specific dependencies.
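The parser's output step might look roughly like this (a sketch under assumptions: the row/field names are hypothetical, and the rows would come from the wiki tables, e.g. via pandas' `DataFrame.to_dict("records")`):

```python
import json

def seasons_to_json(rows):
    """Serialise parsed wiki rows to the JSON the loader will consume.

    `rows` is assumed to be a list of dicts extracted from the wiki's
    season tables; only the fields the loader needs are kept.
    """
    payload = [
        {
            "index": int(row["index"]),
            "has_duos": bool(row.get("has_duos", False)),
            "has_arenas": bool(row.get("has_arenas", False)),
        }
        for row in rows
    ]
    return json.dumps(payload, indent=2)
```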

Loader:
_seasons.py would load the JSON file into the expected data structures. The loader would not need to add any dependencies since it's just loading JSON.
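A minimal loader sketch along those lines, taking the JSON text directly (in _seasons.py that text would be fetched from the published URL; the field names match the hypothetical parser output and are assumptions):

```python
import json

def load_seasons(raw_json):
    """Turn the scraped seasons JSON into plain Python structures.

    Stdlib-only on purpose, so the loader adds no dependencies.
    """
    return [
        {
            "index": int(entry["index"]),
            "has_duos": bool(entry.get("has_duos", False)),
            "has_arenas": bool(entry.get("has_arenas", False)),
        }
        for entry in json.loads(raw_json)
    ]
```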

The parser would then run on a schedule and upload to S3.
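The upload step of that scheduled job might be sketched like so (bucket and key names are placeholders, and the client is injected so the sketch doesn't assume AWS credentials; a boto3 S3 client would satisfy the interface):

```python
import json

def upload_seasons(s3_client, seasons, bucket="overtrack-data",
                   key="apex/seasons.json"):
    """Upload the parsed seasons as JSON to S3.

    `s3_client` is expected to provide boto3's put_object interface;
    injecting it keeps the function testable without AWS access.
    """
    body = json.dumps(seasons).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body,
                         ContentType="application/json")
    return body
```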
