I don't fully understand the intended download process, and my company does not allow direct downloads from cloud storage. However, the dc_slug in the TSV file gives me a general idea of how to find the original PDF URL.
For example, for a dc_slug like 456300-sept-17-23-2012-11953-13474707086771-_-pdf, I can use the split function to split at the first hyphen and then construct the URL as follows:
doc_id, name = dc_slug.split('-', 1)  # '456300', 'sept-17-23-2012-11953-13474707086771-_-pdf'
url = f'https://s3.amazonaws.com/s3.documentcloud.org/documents/{doc_id}/{name}.pdf'
This way, I can directly access the PDF!
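In case it helps anyone else, here is a minimal sketch of the same idea as a reusable function. The name pdf_url_from_slug is my own, and the URL pattern is only what I inferred from the example above, so treat it as an assumption rather than a documented DocumentCloud scheme:

```python
# Assumed URL pattern, inferred from a single example slug; not an official API.
BASE_URL = "https://s3.amazonaws.com/s3.documentcloud.org/documents"


def pdf_url_from_slug(dc_slug: str) -> str:
    """Build the guessed PDF URL from a dc_slug of the form '<id>-<name>'."""
    # Split only at the first hyphen; the rest of the slug keeps its hyphens.
    parts = dc_slug.split("-", 1)
    if len(parts) != 2 or not parts[0].isdigit() or not parts[1]:
        raise ValueError(f"unexpected dc_slug format: {dc_slug!r}")
    doc_id, name = parts
    return f"{BASE_URL}/{doc_id}/{name}.pdf"


if __name__ == "__main__":
    print(pdf_url_from_slug("456300-sept-17-23-2012-11953-13474707086771-_-pdf"))
```

Passing 1 as the second argument to split matters here, because the part after the numeric ID contains hyphens of its own.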