Hey man, if I try to run this locally I get the following:
collecting data for https://www.psacard.com/pop/tcg-cards/1999/pokemon-game/57801
Error pulling data for https://www.psacard.com/pop/tcg-cards/1999/pokemon-game/57801, with error: 403 Client Error: Forbidden for url: https://www.psacard.com/Pop/GetSetItems
Traceback (most recent call last):
  File "/home/USERNAME/psa-scrape/pop_report/original_to_github.py", line 112, in <module>
    ppr.scrape()
  File "/home/USERNAME/psa-scrape/pop_report/original_to_github.py", line 43, in scrape
    cards = json_data["data"]
UnboundLocalError: local variable 'json_data' referenced before assignment
So when I try the URL (I changed it to a Pokemon one), I first get a 403 Forbidden response, and then there's an UnboundLocalError on json_data.
Is there any way to resolve this? I have been trying to fix it locally but got nowhere.
I am running it using Python3
I am running it on Ubuntu 22.04.3 LTS
I can resolve the json_data error by doing the following:
Fix 1 of 2
Changing this:
try:
    json_data = self.post_to_url(sess, form_data)
except Exception as err:
    print("Error pulling data for {}, with error: {}".format(self.set_name, err))
cards = json_data["data"]  # This line causes UnboundLocalError if the try block fails
To this:
try:
    json_data = self.post_to_url(sess, form_data)
except Exception as err:
    print("Error pulling data for {}, with error: {}".format(self.set_name, err))
    return  # Early exit if an error occurs
# Ensure json_data is valid before proceeding
if not json_data or "data" not in json_data:
    print("No valid data found for set: {}".format(self.set_name))
    return  # Exit if there's no data
cards = json_data["data"]  # Now safe to access since we checked for validity
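To show the idea outside the scraper, here is a minimal standalone sketch of the same early-return pattern. The names `post_to_url_stub` and `scrape_first_page` are made up for illustration and are not part of the repo; the stub just simulates the 403 so no network access is needed.

```python
def post_to_url_stub(fail):
    """Hypothetical stand-in for self.post_to_url; raises when fail is True."""
    if fail:
        raise RuntimeError("403 Client Error: Forbidden")
    return {"data": [{"name": "card-1"}]}


def scrape_first_page(fail):
    """Return the list of cards, or None if the request failed or had no data."""
    try:
        json_data = post_to_url_stub(fail)
    except Exception as err:
        print("Error pulling data, with error: {}".format(err))
        # json_data was never bound, so stop here instead of falling through
        return None
    # Guard against an empty or unexpected payload before indexing into it
    if not json_data or "data" not in json_data:
        print("No valid data found")
        return None
    return json_data["data"]


print(scrape_first_page(fail=True))
print(scrape_first_page(fail=False))
```

Without the `return None` in the `except` branch, the failing call would reach `json_data["data"]` with `json_data` unassigned, which is exactly the UnboundLocalError in the traceback.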
Fix 2 of 2
Changing this:
json_data = self.post_to_url(sess, form_data)
cards += json_data["data"]  # Assumes json_data is valid
To this:
try:
    json_data = self.post_to_url(sess, form_data)
    if not json_data or "data" not in json_data:
        print("No valid data found for additional page: {}".format(curr_page))
        break  # Exit loop if there's no more data
    cards += json_data["data"]
except Exception as err:
    print("Error pulling additional data for set {}, page {}: {}".format(self.set_name, curr_page, err))
    break  # Exit loop on error
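As a rough sketch of how that loop behaves, assuming the extra pages are fetched one at a time by page number (I have not confirmed how the script actually builds the loop), something like the following keeps whatever was collected before the first failure. `collect_pages` and `fake_fetch` are hypothetical names used only for this example.

```python
def collect_pages(fetch_page, total_pages):
    """Gather "data" from each page, stopping at the first error or empty page."""
    cards = []
    for curr_page in range(1, total_pages + 1):
        try:
            json_data = fetch_page(curr_page)
        except Exception as err:
            print("Error pulling page {}: {}".format(curr_page, err))
            break  # keep what was collected so far
        if not json_data or "data" not in json_data:
            print("No valid data found for page: {}".format(curr_page))
            break
        cards += json_data["data"]
    return cards


def fake_fetch(page):
    """Hypothetical fetcher: pages 1 and 2 succeed, page 3 simulates the 403."""
    if page == 3:
        raise RuntimeError("403 Client Error: Forbidden")
    return {"data": ["card-{}".format(page)]}


print(collect_pages(fake_fetch, 5))
```

The design choice here is to return partial results rather than raise, so a block midway through a set does not throw away the pages that already came back.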
If I put those changes in, I am still left with the 403 Error:
Error pulling data for https://www.psacard.com/pop/tcg-cards/1999/pokemon-game/57801, with error: 403 Client Error: Forbidden for url: https://www.psacard.com/Pop/GetSetItems
Sorry for the long message but I wanted to give as much context as possible!
Hopefully you can help resolve this!
I think the 403 might come from Cloudflare blocking the request.
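One way to check that guess is to look at the headers on the 403 response. This is only a sketch: responses that pass through Cloudflare normally carry a `Server: cloudflare` header and a `CF-RAY` header, so seeing them on the 403 suggests Cloudflare is in the path, but it does not prove Cloudflare (rather than the site behind it) issued the block. The function name `forbidden_via_cloudflare` is my own, not something from the repo.

```python
def forbidden_via_cloudflare(status_code, headers):
    """Return True if a 403 response carries Cloudflare-style headers.

    headers is any mapping of header name to value; names are compared
    case-insensitively. This is a heuristic, not a definitive check.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    server = lowered.get("server", "").lower()
    has_cf_markers = "cloudflare" in server or "cf-ray" in lowered
    return status_code == 403 and has_cf_markers


# With the requests library this could be called on a response object as:
#     forbidden_via_cloudflare(resp.status_code, resp.headers)
print(forbidden_via_cloudflare(403, {"Server": "cloudflare", "CF-RAY": "example"}))
print(forbidden_via_cloudflare(403, {"Server": "nginx"}))
```

If it returns True for the GetSetItems request, that would at least be consistent with the request being filtered before it reaches the site.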