diff --git a/.ipynb_checkpoints/README-checkpoint.md b/.ipynb_checkpoints/README-checkpoint.md new file mode 100644 index 0000000..e966ac4 --- /dev/null +++ b/.ipynb_checkpoints/README-checkpoint.md @@ -0,0 +1,224 @@ +![Ironhack logo](https://i.imgur.com/1QgrNNw.png) + +# Lab | API Scavenger Game + +## Introduction + +In the lesson, you learned how to make Python requests to APIs and parse the JSON responses to extract the information you need. In this lab, you will practice these skills by playing an API scavenger hunt game. In case you never played a scavenger hunt as a kid: players need to collect a list of items, and they receive clues to help them in their mission. In this lab, you will be seeking secrets hidden inside the massive data from the API. Your data analytics skills will make you a cool API detective. + +## Getting Started + +To get started, we'd like you to create an access token in your [Github account](https://github.com/settings/tokens). + +1. Click `Generate new token` on the page. +1. Enter a token description. +1. Select the scopes that you allow the token to access. Check at least all the `repo` checkboxes as shown in the screenshot below. +1. Click `Generate token`. Github will create a personal access token for you. + +![Github create personal token](../../images/github-create-token.png) + +A personal access token is a secret password that allows you or your app to make remote requests to the Github API. It is the same [OAuth](https://oauth.net/) technology as the Twitter developer access token discussed in the lesson, but in Github you don't need to wait for approval and your token will be available immediately. + +:exclamation: Make sure you save the token on your computer because this is the only time you will see the token string. If for any reason you lose your token, simply come back to Github and generate a new one. 
+ +:warning: **Do not share your Github personal access token with anyone else! Hackers can use your token to cause damage that may result in the suspension of your account.** + +After generating the token, you can test it with `curl` in the Terminal. Assuming your Github username is `johndoe` and your token is `d10ev1shpm10x5qox9ckw1k9b792p9rq0ogplpn5cyo55`, you can make the curl command in the following way: + +```bash +$ curl -u johndoe:d10ev1shpm10x5qox9ckw1k9b792p9rq0ogplpn5cyo55 https://api.github.com/user +``` + +If your token is valid, you will see a JSON response that looks like: + +``` +{ + "login": "johndoe", + "id": 1234567, + "node_id": "MDQ6VXNlcjE2NTk3OTg=", + "avatar_url": "https://avatars3.githubusercontent.com/u/1659798?v=4", + "gravatar_id": "", + "url": "https://api.github.com/users/johndoe", + "html_url": "https://github.com/johndoe", + "followers_url": "https://api.github.com/users/johndoe/followers", + ... +} +``` + +Because it is inconvenient to read long API responses in the Terminal, you can save the response to a file with the following command: + +```bash +$ curl -u johndoe:d10ev1shpm10x5qox9ckw1k9b792p9rq0ogplpn5cyo55 https://api.github.com/user > output.json +``` + +Then you can open `output.json` with your favorite text editor to take a closer look. + +:information_source: An access token is one of the ways to authenticate requests to the Github API. Alternatively, you can also use your Github username and password. However, you'll need to manually enter your password every time you make an API request. In contrast, an access token allows you to make requests without entering a password manually. For more information about Github API authentication, refer to [this](https://developer.github.com/v3/auth/) and [this](https://developer.github.com/v3/oauth_authorizations/) documentation. + +:information_source: From now on, we will not give you step-by-step instructions in the labs. 
You already have the foundation in Python and data analytics that allows you to research solutions on your own. We will, however, provide general guidance on how to complete your lab assignments. In case you find it difficult to tackle your assignments with the general guidance, please don't hesitate to ask the instructional team. We are here to help! :v: + +## Goals + +### Challenge 1: Fork Languages + +You will find out which programming languages are used among all the forks created from the main lab repo of your bootcamp. Assuming the main lab repo is `ironhack-datalabs/madrid-oct-2018`, you will: + +1. Obtain the full list of forks created from the main lab repo via the Github API. + +1. Loop through the JSON response to find the `language` attribute of each fork. Use an array to store the `language` attributes. + * *Hint: Each language should appear only once in your array.* + +1. Print the language array. It should be something like: + + ```["Python", "Jupyter Notebook", "HTML"]``` + +Again, the documentation of the Github API is [here](https://developer.github.com/v3/). + +### Challenge 2: Count Commits + +Count how many commits were made in the past week. + +1. Obtain all the commits made in the past week via the API. The response is a JSON array that contains multiple commit objects. + +1. Count how many commit objects are contained in the array. + +### Challenge 3: Hidden Cold Joke + +Using Python, call the Github API to find the cold joke contained in the 24 secret files in the following repo: + +https://github.com/ironhack-datalabs/scavenger + +The filenames of the secret files contain `.scavengerhunt`, and the files are scattered randomly across different directories of this repo. The secret files are named from `.0001.scavengerhunt` to `.0024.scavengerhunt`. You need to **search for these files by calling the Github API**, not by searching the local files on your computer. 
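Once an API call has given you a listing of the repo's files, picking out the secret files is plain list processing. The sketch below is only an illustration under stated assumptions: it assumes you already hold a list of dictionaries with `path` and `type` keys (for instance, the `tree` list returned by a recursive Git Trees request), and the helper name `find_secret_paths` and the sample directory names are hypothetical. Which endpoint to call, and how to call it, is still for you to work out from the documentation.

```python
import posixpath


def find_secret_paths(tree_entries, marker=".scavengerhunt"):
    """Return the paths of files whose name contains `marker`, sorted by filename."""
    matches = [
        entry["path"]
        for entry in tree_entries
        # Keep files only ("blob"); directories are listed with type "tree".
        if entry.get("type") == "blob"
        and marker in posixpath.basename(entry["path"])
    ]
    # Sort on the filename rather than the full path, because the secret
    # files sit in different directories.
    return sorted(matches, key=posixpath.basename)


# Hypothetical sample data; the directory names are made up for illustration.
sample_tree = [
    {"path": "15024/.0006.scavengerhunt", "type": "blob"},
    {"path": "15024", "type": "tree"},
    {"path": "15024/40", "type": "blob"},
    {"path": "15534/.0001.scavengerhunt", "type": "blob"},
    {"path": "98750/.0003.scavengerhunt", "type": "blob"},
]

print(find_secret_paths(sample_tree))
```

Sorting on the filename instead of the whole path matters here: a plain `sorted()` on the paths would order the files by directory name, not by their sequence number.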
+ +Notes: + +* Github API documentation can be found [here](https://developer.github.com/v3/). + +* You will need to study the Github API documentation to decide which API endpoint to call and what parameters to use in order to obtain the information you need. Unless you are already super familiar with the Github API or super lucky, you will probably go through some trial and error. Therefore, be prepared to go back and forth between studying the API documentation, testing, and revising until you obtain what you need. + +* After receiving the JSON data object, you need to inspect its structure and decide how to parse the data. + +* When you test your requests with the Github API, sometimes you may be blocked by Github with an error message that reads: + + > You have triggered an abuse detection mechanism and have been temporarily blocked from content creation. Please retry your request again later. + + Don't worry. Check the parameters in your request and wait for a minute or two before you make additional requests. + +**After you find the secret files:** + +1. Sort the filenames in ascending order. + +1. Read the content of each secret file into an array of strings. + +1. Concatenate the strings in the array, separating them with a single space. + +1. Print out the joke. + +## Deliverables + +* `challenge-1.py` or `challenge-1.ipynb` that contains your solution to Challenge 1. + +* `challenge-2.py` or `challenge-2.ipynb` that contains your solution to Challenge 2. + +* `challenge-3.py` or `challenge-3.ipynb` that contains your solution to Challenge 3. + +## Submission + +Upon completion, add your deliverables to git. Then commit and push your code to the remote. 
+ +## Resources + +[Github RESTful API Documentation](https://developer.github.com/v3/) + +[OAuth](https://oauth.net/) + +[Github OAuth Authorizations API](https://developer.github.com/v3/oauth_authorizations/) + +[Github Other Authorizations API](https://developer.github.com/v3/auth/) + +## Additional Challenge for the Nerds + +So far we have practiced a lot with the `GET` method but not `PUT`, `POST`, `PATCH`, or `DELETE`. If you wonder what the differences are, refer to the following: + +https://spring.io/understanding/REST + +Simply put, the `GET` method only allows you to obtain data from the API, while the other methods allow you to modify the data stored in the database behind the API. The API must be programmed to support each of these methods, though. + +The additional challenge for the nerds is for you to use the `PUT` method to create a file in your own repo. You need to grant the correct permissions to your access token in order to make `PUT` requests to your repo. + +### Note: + +You don't have to use Python in this complex challenge. Simply find out how to do that with `curl` as a proof of concept. That's adequate for the purpose of practicing `PUT` requests to an API. + +### Steps: + +1. Create a new repo (don't use your forked repo for the lab because you don't want to ruin your lab code). Assume your repo is called `johndoe/test-repo`. + +1. Call the following API endpoint to create a new file called `test.txt`: + + ```https://api.github.com/repos/johndoe/test-repo/contents/test.txt``` + + Notes: + + * You'll need to supply a JSON object as the body of the `PUT` request that contains at least `message` and `content`. + + * The `content` string must be encoded with [Base64](https://en.wikipedia.org/wiki/Base64). Here is a website for you to [encode a regular string to Base64](https://www.base64encode.org/). + + For detailed documentation, see: https://developer.github.com/v3/repos/contents/#create-a-file + +1. 
If successful, you should see the following example response from the API: + +``` +{ + "content": { + "name": "test.txt", + "path": "test.txt", + "sha": "0d5a690c8fad5e605a6e8766295d9d459d65de42", + "size": 20, + "url": "https://api.github.com/repos/johndoe/test-repo/contents/test.txt?ref=master", + "html_url": "https://github.com/johndoe/test-repo/blob/master/test.txt", + "git_url": "https://api.github.com/repos/johndoe/test-repo/git/blobs/0d5a690c8fad5e605a6e8766295d9d459d65de42", + "download_url": "https://raw.githubusercontent.com/johndoe/test-repo/master/test.txt", + "type": "file", + "_links": { + "self": "https://api.github.com/repos/johndoe/test-repo/contents/test.txt?ref=master", + "git": "https://api.github.com/repos/johndoe/test-repo/git/blobs/0d5a690c8fad5e605a6e8766295d9d459d65de42", + "html": "https://github.com/johndoe/test-repo/blob/master/test.txt" + } + }, + "commit": { + "sha": "16f2907406174e8068ecf976fb6abc24f004a62b", + "node_id": "MDY6Q29tbWl0MTQ3NjgxMjMyOjE2ZjI5MDc0MDYxNzRlODA2OGVjZjk3NmZiNmFiYzI0ZjAwNGE2MmI=", + "url": "https://api.github.com/repos/johndoe/test-repo/git/commits/16f2907406174e8068ecf976fb6abc24f004a62b", + "html_url": "https://github.com/johndoe/test-repo/commit/16f2907406174e8068ecf976fb6abc24f004a62b", + "author": { + "name": "John Doe", + "email": "john.doe@gmail.com", + "date": "2018-10-30T04:37:34Z" + }, + "committer": { + "name": "John Doe", + "email": "john.doe@gmail.com", + "date": "2018-10-30T04:37:34Z" + }, + "tree": { + "sha": "116ad37d3680a79ef1cf9f555abb0579e293f5b4", + "url": "https://api.github.com/repos/johndoe/test-repo/git/trees/116ad37d3680a79ef1cf9f555abb0579e293f5b4" + }, + "message": "test", + "parents": [ + { + "sha": "1ab3d7a806e0a44f39ffbb63618fb26938f968ac", + "url": "https://api.github.com/repos/johndoe/test-repo/git/commits/1ab3d7a806e0a44f39ffbb63618fb26938f968ac", + "html_url": "https://github.com/johndoe/test-repo/commit/1ab3d7a806e0a44f39ffbb63618fb26938f968ac" + } + ], + "verification": { + 
"verified": false, + "reason": "unsigned", + "signature": null, + "payload": null + } + } +} +``` diff --git a/.ipynb_checkpoints/challenge-1-checkpoint.py b/.ipynb_checkpoints/challenge-1-checkpoint.py new file mode 100644 index 0000000..df4b3cf --- /dev/null +++ b/.ipynb_checkpoints/challenge-1-checkpoint.py @@ -0,0 +1 @@ +# enter your code below \ No newline at end of file diff --git a/your-code/.ipynb_checkpoints/Learning Advanced APIs-checkpoint.ipynb b/your-code/.ipynb_checkpoints/Learning Advanced APIs-checkpoint.ipynb new file mode 100644 index 0000000..4472bfb --- /dev/null +++ b/your-code/.ipynb_checkpoints/Learning Advanced APIs-checkpoint.ipynb @@ -0,0 +1,189 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Advanced APIs\n", + "\n", + "\n", + "\n", + "\n", + "Lesson Goals\n", + "\n", + " Understand HTTP Communication Protocol\n", + " Learn about the Request-Response Cycle\n", + " Differentiate between GET and POST requests\n", + " Familiarize with http headers\n", + " Learn about oAuth authentication\n", + " Familiarize with a few tools to work with APIs\n", + "\n", + "# Introduction - HTTP Protocol\n", + "\n", + "HTTP v1 protocol was introduced in 1991, it's this protocol that has driven the advance we've seen in the past two decades in information sharing as it's the basis for websites and we can argue that nowadays is the basis for most of the comunications that people needs in a daily basis. When you pay with your VISA online, the transaction travels via secure HTTP comunication known as HTTPS, HTTP is also the protocol choosen in quite different applications that range from synchronize traffic lights in a city to receive a video when you visit Youtube, quite powerful. I hasn't seen many updates over this years, nowadays the most recent version is HTTP2 as defined in RFC7540 that is the document that defines how the standard works. 
But let's dig a bit deeper into how HTTP works in the following sections.\n", + "\n", + "\n", + "\n", + "# Request-Response Cycle\n", + "\n", + "In overview, the HTTP protocol consists of one participant computer asking questions to a source of information that is on another computer (although it can be the same one). The one that asks the question is known as the CLIENT and the one responding to those questions is named the SERVER. For each question there must always be an answer, even if the answer is \"I don't have an answer for what you are asking for\". This is what we call the request-response cycle. If both parties, CLIENT and SERVER, can communicate, they should be able to complete this cycle. Of course there are corner cases, for example when the CLIENT cannot send a question to the SERVER because the SERVER is offline.\n", + "\n", + "In HTTP jargon, when a CLIENT asks a SERVER a question we say it is making a REQUEST, and for each request the server sends a RESPONSE.\n", + "HTTP URL\n", + "\n", + "The URL is an important part of the protocol: it contains information about which resource we are asking the SERVER for. It should follow this format:\n", + "\n", + "Format:\n", + "scheme:[//authority]path[?query][#fragment]\n", + "\n", + "Example:\n", + "http://www.ironhack.com/learning/1234?user=pepe\n", + "\n", + "This is read as follows:\n", + "\n", + " scheme: In this case it's http, but it can also be https if we are connecting securely.\n", + " authority: In this case www.ironhack.com. This is the server we connect to in order to send the REQUEST.\n", + " path: Also known as the RESOURCE we are asking for, in this case /learning/1234. 
Normally a server holds multiple pieces of information and serves REQUESTS primarily based on the path of the URL.\n", + "\n", + "Side note: In the format above, [something] means that something is optional and may or may not be present.\n", + "Request\n", + "\n", + "As we've said, the request is a question we ask a SERVER in order to receive a piece of information. The HTTP standard does not define what information the SERVER should answer with. It can be anything: an image, a video, an HTML page, a PDF file, etc.\n", + "\n", + "The REQUEST contains two main parts: the VERB of the request and the HEADERS. The verb indicates which type of request we are making, and the headers contain metadata for the request, for example who is making the request, at what time, whether it is a known user, cookies, etc.\n", + "\n", + "\n", + "\n", + "# HTTP Verbs\n", + "\n", + "There are quite a few VERBS, but the main ones are GET and POST:\n", + "\n", + " GET: Just expresses that we want some resource from a SERVER. For example, let's assume you are asking for the Twitter timeline. As you are just referring to the timeline, Twitter will send you the global timeline for the whole world.\n", + "\n", + "This is how a raw GET HTTP REQUEST is sent over the wire:\n", + "\n", + "GET /timeline HTTP/1.1\n", + "Host: www.twitter.com\n", + "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) Chrome/71.0.3578.98 Safari/537.36\n", + "Accept: text/html\n", + "Accept-Language: en,es\n", + "\n", + "Note that we are asking for the /timeline resource. But what if we want just the timeline for the users we are following? 
Then we have to make a POST request.\n", + "\n", + "POST requests are different from GET requests in that they carry information that the SERVER uses to decide how to respond.\n", + "\n", + "POST /timeline HTTP/1.1\n", + "Host: www.twitter.com\n", + "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) Chrome/71.0.3578.98 Safari/537.36\n", + "Accept: text/html\n", + "Accept-Language: en,es\n", + "\n", + "username=pepe\n", + "\n", + "Here you can see that we are sending a parameter, username, to say that we want the timeline for the user pepe. The resource is the same (/timeline) but the server will send us a different RESPONSE according to the parameter we are sending. These parameters are called the POST BODY, and which parameters to send depends on the server we are asking; this is not standardized.\n", + "\n", + "\n", + "# Request Headers\n", + "\n", + "A request also contains headers. Let's analyze some of them:\n", + "\n", + "GET /timeline HTTP/1.1\n", + "Host: www.twitter.com\n", + "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) Chrome/71.0.3578.98 Safari/537.36\n", + "Accept: text/html\n", + "Accept-Language: en,es\n", + "\n", + " User-Agent: The browser that is making the REQUEST.\n", + " Accept-Language: The language in which we've configured our browser.\n", + "\n", + "The headers can vary between requests, but there are some standard ones. WEB SERVERS that follow the HTTP standard agree on how to proceed depending on which headers are present in a REQUEST, but you can add custom ones if you want.\n", + "\n", + "\n", + "# Response\n", + "\n", + "All responses follow the same structure. They have a HEADER section that contains metadata about the behaviour of the server and a BODY that contains the desired piece of information.\n", + "Status Codes\n", + "\n", + "One important part of the RESPONSE HEADER is the status code. This is a numerical code indicating the outcome of the request on the server. 
There are different status codes depending on whether the server succeeded in performing the REQUEST or didn't manage to do anything at all. These are some groups of status codes:\n", + "\n", + " 2xx Success: The most important status code here is 200, which means everything worked as expected and the resource was found on the SERVER.\n", + " 4xx Client errors: These are codes for errors on the client side, meaning the client asked for a wrong resource. The one that everybody knows is 404, which means the resource can't be found on the SERVER.\n", + " 5xx Server errors: These mean the SERVER itself crashed or failed when processing your REQUEST, maybe due to a bug in the server's code or something related.\n", + "\n", + "For a complete list see Wikipedia\n", + "\n", + "\n", + "\n", + "# OAuth Authentication\n", + "\n", + "OAuth is simply a secure authorization protocol that works on top of HTTP. When you consume an external service via HTTP calls, the provider wants to keep control over its service and may limit the number of requests you can make, or ask you to pay for it. Or maybe they don't want you to see data from other users in the system. For these purposes, you have to authenticate yourself.\n", + "\n", + "OAuth is commonly used among lots of API providers as it is a well-known public standard. This lesson is not meant to cover how OAuth works, but just to introduce you to one important aspect of it. When working with OAuth, the service you are sending a REQUEST to must provide you with some credentials that normally have names like clientId or apiKey and apiSecret or apiToken or simply token. These will be used to authenticate instead of username+password. The reason? 
Transferring your password back and forth is not as secure as using credentials; also, credentials can be revoked if lost or stolen.\n", + "\n", + "The credentials should be kept secure and must not be shared with anyone else. So do not add them to a git repo, as there are hackers online who scan for them and make fraudulent use of them, possibly spending lots of dollars in your name!\n", + "Tools to work with APIs\n", + "\n", + "\n", + "\n", + "# CURL - Making HTTP requests with your terminal\n", + "\n", + "Curl is a command-line tool that allows you to make HTTP requests as easily as this:\n", + "\n", + "$ curl www.ironhack.com\n", + "\n", + "Calling curl like this will make a GET request to www.ironhack.com, and you will see the RESPONSE BODY as text in your terminal. Try it!\n", + "\n", + "If you want to save the response to a text file, just redirect the output to a file like this:\n", + "\n", + "$ curl www.ironhack.com > index.html\n", + "\n", + "This redirection operation can be done with any command output, not just curl.\n", + "\n", + "Good news! Curl is most likely already installed on your system.\n", + "\n", + "\n", + "# Debug API calls using Charles Proxy\n", + "\n", + "Charles is an HTTP proxy / HTTP monitor / Reverse Proxy that enables a developer to view all of the HTTP and SSL / HTTPS traffic between their machine and the Internet. This includes requests, responses and the HTTP headers (which contain the cookies and caching information).\n", + "\n", + "[Download it here](https://www.charlesproxy.com/)\n", + "\n", + "\n", + "\n", + "# Test API calls using Postman\n", + "\n", + "Postman is a great tool when trying to dissect APIs made by others or test ones you have made yourself. 
It offers a sleek user interface with which to make HTTP requests, without the hassle of writing a bunch of code just to test an API's functionality.\n", + "\n", + "[Download it here](https://www.getpostman.com/)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.8" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/your-code/.ipynb_checkpoints/Learning Working with APIs-checkpoint.ipynb b/your-code/.ipynb_checkpoints/Learning Working with APIs-checkpoint.ipynb new file mode 100644 index 0000000..7cc71de --- /dev/null +++ b/your-code/.ipynb_checkpoints/Learning Working with APIs-checkpoint.ipynb @@ -0,0 +1,876 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Working with APIs\n", + "\n", + "\n", + "\n", + "Lesson Goals\n", + "\n", + " Understand what an API is and what it does.\n", + " Learn how to make simple calls to an API and retrieve JSON data.\n", + " Learn how to handle nested JSON API results.\n", + "\n", + "Introduction\n", + "\n", + "Thus far in the program, we have learned how to obtain data from files and from relational databases. However, sometimes the data we need is not readily available via one of these two data sources. In some cases, the data we need may be contained within an application. Application owners will often create APIs (Application Programming Interfaces) so that their applications can talk to other applications. 
An API is a set of programmatic instructions for accessing software applications, and the data that comes from APIs typically contains some sort of structure (such as JSON). This structure makes working with API data preferable to crawling websites and scraping content off of web pages.\n", + "\n", + "In this lesson, we are going to learn how to make API calls to an application, retrieve data in JSON format, learn about API authentication, and use Python libraries to obtain data from APIs.\n", + "Simple API Example with Requests\n", + "\n", + "There are a few libraries that can be used for working with APIs in Python, but the Requests library is one of the most intuitive. It has a get method that allows you to send an HTTP request to an application and receive a response. Let's take a look at a basic API call using the requests library. " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import json\n", + "import requests\n", + "\n", + "response = requests.get('https://jsonplaceholder.typicode.com/todos')\n", + "results = response.json()\n", + "results[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, we used the get method to send a request to the JSONPlaceholder API, and we received back a response in the form of JSON structured data. If we wanted to analyze this data, we could easily use Pandas to convert the results into a data frame to which we can then apply various analytical methods. " + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table>\n", + " <thead>\n",
 + " <tr><th></th><th>completed</th><th>id</th><th>title</th><th>userId</th></tr>\n", + " </thead>\n", + " <tbody>\n",
 + " <tr><th>0</th><td>False</td><td>1</td><td>delectus aut autem</td><td>1</td></tr>\n",
 + " <tr><th>1</th><td>False</td><td>2</td><td>quis ut nam facilis et officia qui</td><td>1</td></tr>\n",
 + " <tr><th>2</th><td>False</td><td>3</td><td>fugiat veniam minus</td><td>1</td></tr>\n",
 + " <tr><th>3</th><td>True</td><td>4</td><td>et porro tempora</td><td>1</td></tr>\n",
 + " <tr><th>4</th><td>False</td><td>5</td><td>laboriosam mollitia et enim quasi adipisci qui...</td><td>1</td></tr>\n",
 + " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " completed id title userId\n", + "0 False 1 delectus aut autem 1\n", + "1 False 2 quis ut nam facilis et officia qui 1\n", + "2 False 3 fugiat veniam minus 1\n", + "3 True 4 et porro tempora 1\n", + "4 False 5 laboriosam mollitia et enim quasi adipisci qui... 1" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import pandas as pd\n", + "\n", + "data = pd.DataFrame(results)\n", + "data.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# More Complex Requests API Example\n", + "\n", + "In the previous section, the data we received from the API was not very complex. It was all at a single level and fit neatly into a data frame. However, sometimes API responses contain data that is nested, and we must find a way to flatten the JSON data so that it fits nicely into a data frame. Let's make an API call to the Github public API, create a Pandas data frame from the results, and examine the structure of the data.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table>\n", + " <thead>\n",
 + " <tr><th></th><th>actor</th><th>created_at</th><th>id</th><th>org</th><th>payload</th><th>public</th><th>repo</th><th>type</th></tr>\n", + " </thead>\n", + " <tbody>\n",
 + " <tr><th>0</th><td>{'id': 50721655, 'login': 'Jrose3797', 'displa...</td><td>2019-07-08T14:22:26Z</td><td>9967820122</td><td>NaN</td><td>{'push_id': 3793486088, 'size': 1, 'distinct_s...</td><td>True</td><td>{'id': 195824522, 'name': 'Jrose3797/dsc-intro...</td><td>PushEvent</td></tr>\n",
 + " <tr><th>1</th><td>{'id': 3761375, 'login': 'cdcabrera', 'display...</td><td>2019-07-08T14:22:26Z</td><td>9967820118</td><td>NaN</td><td>{'push_id': 3793486084, 'size': 2, 'distinct_s...</td><td>True</td><td>{'id': 190663766, 'name': 'cdcabrera/curiosity...</td><td>PushEvent</td></tr>\n",
 + " <tr><th>2</th><td>{'id': 26219511, 'login': 'heaptracetechnology...</td><td>2019-07-08T14:22:26Z</td><td>9967820120</td><td>NaN</td><td>{'ref': 'Standard-OMG-mongodb', 'ref_type': 'b...</td><td>True</td><td>{'id': 195819196, 'name': 'heaptracetechnology...</td><td>CreateEvent</td></tr>\n",
 + " <tr><th>3</th><td>{'id': 9443847, 'login': 'hendrikebbers', 'dis...</td><td>2019-07-08T14:22:26Z</td><td>9967820117</td><td>{'id': 1673867, 'login': 'AdoptOpenJDK', 'grav...</td><td>{'action': 'created', 'issue': {'url': 'https:...</td><td>True</td><td>{'id': 176502087, 'name': 'AdoptOpenJDK/IcedTe...</td><td>IssueCommentEvent</td></tr>\n",
 + " <tr><th>4</th><td>{'id': 6710696, 'login': 'nbuonin', 'display_l...</td><td>2019-07-08T14:22:26Z</td><td>9967820111</td><td>{'id': 52456, 'login': 'ccnmtl', 'gravatar_id'...</td><td>{'push_id': 3793486074, 'size': 1, 'distinct_s...</td><td>True</td><td>{'id': 183269109, 'name': 'ccnmtl/ohcoe-hugo',...</td><td>PushEvent</td></tr>\n",
 + " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " actor created_at \\\n", + "0 {'id': 50721655, 'login': 'Jrose3797', 'displa... 2019-07-08T14:22:26Z \n", + "1 {'id': 3761375, 'login': 'cdcabrera', 'display... 2019-07-08T14:22:26Z \n", + "2 {'id': 26219511, 'login': 'heaptracetechnology... 2019-07-08T14:22:26Z \n", + "3 {'id': 9443847, 'login': 'hendrikebbers', 'dis... 2019-07-08T14:22:26Z \n", + "4 {'id': 6710696, 'login': 'nbuonin', 'display_l... 2019-07-08T14:22:26Z \n", + "\n", + " id org \\\n", + "0 9967820122 NaN \n", + "1 9967820118 NaN \n", + "2 9967820120 NaN \n", + "3 9967820117 {'id': 1673867, 'login': 'AdoptOpenJDK', 'grav... \n", + "4 9967820111 {'id': 52456, 'login': 'ccnmtl', 'gravatar_id'... \n", + "\n", + " payload public \\\n", + "0 {'push_id': 3793486088, 'size': 1, 'distinct_s... True \n", + "1 {'push_id': 3793486084, 'size': 2, 'distinct_s... True \n", + "2 {'ref': 'Standard-OMG-mongodb', 'ref_type': 'b... True \n", + "3 {'action': 'created', 'issue': {'url': 'https:... True \n", + "4 {'push_id': 3793486074, 'size': 1, 'distinct_s... True \n", + "\n", + " repo type \n", + "0 {'id': 195824522, 'name': 'Jrose3797/dsc-intro... PushEvent \n", + "1 {'id': 190663766, 'name': 'cdcabrera/curiosity... PushEvent \n", + "2 {'id': 195819196, 'name': 'heaptracetechnology... CreateEvent \n", + "3 {'id': 176502087, 'name': 'AdoptOpenJDK/IcedTe... IssueCommentEvent \n", + "4 {'id': 183269109, 'name': 'ccnmtl/ohcoe-hugo',... PushEvent " + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "response = requests.get('https://api.github.com/events')\n", + "\n", + "data = pd.DataFrame(response.json())\n", + "data.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When we look at the data frame, we can see that there are dictionaries nested in several fields. We need to extract the information that is in these fields and add them to the data frame as columns. 
To do this, we are going to create our own flatten function that accepts a data frame and a list of columns that contain nested dictionaries in them. Our function is going to iterate through the columns and, for each column, it is going to:\n", + "\n", + " Turn the nested dictionaries into a data frame with a column for each key\n", + " Assign column names to each column in this new data frame\n", + " Add these new columns to the original data frame\n", + " Drop the column with the nested dictionaries\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "def flatten(data, col_list):\n", + " for column in col_list:\n", + " flattened = pd.DataFrame(dict(data[column])).transpose()\n", + " columns = [str(col) for col in flattened.columns]\n", + " flattened.columns = [column + '_' + colname for colname in columns]\n", + " data = pd.concat([data, flattened], axis=1)\n", + " data = data.drop(column, axis=1)\n", + " return data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have our function, let's apply it to the columns that have nested dictionaries and get back a revised data frame." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " created_at id public type \\\n", + "0 2019-07-08T14:22:26Z 9967820122 True PushEvent \n", + "1 2019-07-08T14:22:26Z 9967820118 True PushEvent \n", + "2 2019-07-08T14:22:26Z 9967820120 True CreateEvent \n", + "3 2019-07-08T14:22:26Z 9967820117 True IssueCommentEvent \n", + "4 2019-07-08T14:22:26Z 9967820111 True PushEvent \n", + "\n", + " actor_avatar_url actor_display_login \\\n", + "0 https://avatars.githubusercontent.com/u/50721655? Jrose3797 \n", + "1 https://avatars.githubusercontent.com/u/3761375? cdcabrera \n", + "2 https://avatars.githubusercontent.com/u/26219511? heaptracetechnology \n", + "3 https://avatars.githubusercontent.com/u/9443847? hendrikebbers \n", + "4 https://avatars.githubusercontent.com/u/6710696? nbuonin \n", + "\n", + " actor_gravatar_id actor_id actor_login \\\n", + "0 50721655 Jrose3797 \n", + "1 3761375 cdcabrera \n", + "2 26219511 heaptracetechnology \n", + "3 9443847 hendrikebbers \n", + "4 6710696 nbuonin \n", + "\n", + " actor_url ... payload_number \\\n", + "0 https://api.github.com/users/Jrose3797 ... NaN \n", + "1 https://api.github.com/users/cdcabrera ... NaN \n", + "2 https://api.github.com/users/heaptracetechnology ... NaN \n", + "3 https://api.github.com/users/hendrikebbers ... NaN \n", + "4 https://api.github.com/users/nbuonin ... 
NaN \n", + "\n", + " payload_pull_request payload_push_id payload_pusher_type \\\n", + "0 NaN 3793486088 NaN \n", + "1 NaN 3793486084 NaN \n", + "2 NaN NaN user \n", + "3 NaN NaN NaN \n", + "4 NaN 3793486074 NaN \n", + "\n", + " payload_ref payload_ref_type payload_size \\\n", + "0 refs/heads/wip NaN 1 \n", + "1 refs/heads/master NaN 2 \n", + "2 Standard-OMG-mongodb branch NaN \n", + "3 NaN NaN NaN \n", + "4 refs/heads/domain-rev-progress-bars NaN 1 \n", + "\n", + " repo_id repo_name \\\n", + "0 195824522 Jrose3797/dsc-intro-to-sets-lab-houston-ds-060319 \n", + "1 190663766 cdcabrera/curiosity-frontend \n", + "2 195819196 heaptracetechnology/mongodb \n", + "3 176502087 AdoptOpenJDK/IcedTea-Web \n", + "4 183269109 ccnmtl/ohcoe-hugo \n", + "\n", + " repo_url \n", + "0 https://api.github.com/repos/Jrose3797/dsc-int... \n", + "1 https://api.github.com/repos/cdcabrera/curiosi... \n", + "2 https://api.github.com/repos/heaptracetechnolo... \n", + "3 https://api.github.com/repos/AdoptOpenJDK/Iced... \n", + "4 https://api.github.com/repos/ccnmtl/ohcoe-hugo \n", + "\n", + "[5 rows x 35 columns]" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_columns = ['actor', 'org', 'payload', 'repo']\n", + "\n", + "flat = flatten(data, nested_columns)\n", + "flat.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Alternatively, we can flatten nested data using the function json_normalize. This function is part of the Pandas library. The function will flatten and rename each flattened column to the name of the original column and the name of the nested column separated by a period. For example actor.avatar_url.\n", + "\n", + "Here is an example of how to use this function. Note that you have to import it separately in order to avoid using the full path when calling the function." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " actor.avatar_url actor.display_login \\\n", + "0 https://avatars.githubusercontent.com/u/50721655? Jrose3797 \n", + "1 https://avatars.githubusercontent.com/u/3761375? cdcabrera \n", + "2 https://avatars.githubusercontent.com/u/26219511? heaptracetechnology \n", + "3 https://avatars.githubusercontent.com/u/9443847? hendrikebbers \n", + "4 https://avatars.githubusercontent.com/u/6710696? nbuonin \n", + "\n", + " actor.gravatar_id actor.id actor.login \\\n", + "0 50721655 Jrose3797 \n", + "1 3761375 cdcabrera \n", + "2 26219511 heaptracetechnology \n", + "3 9443847 hendrikebbers \n", + "4 6710696 nbuonin \n", + "\n", + " actor.url created_at \\\n", + "0 https://api.github.com/users/Jrose3797 2019-07-08T14:22:26Z \n", + "1 https://api.github.com/users/cdcabrera 2019-07-08T14:22:26Z \n", + "2 https://api.github.com/users/heaptracetechnology 2019-07-08T14:22:26Z \n", + "3 https://api.github.com/users/hendrikebbers 2019-07-08T14:22:26Z \n", + "4 https://api.github.com/users/nbuonin 2019-07-08T14:22:26Z \n", + "\n", + " id org.avatar_url \\\n", + "0 9967820122 NaN \n", + "1 9967820118 NaN \n", + "2 9967820120 NaN \n", + "3 9967820117 https://avatars.githubusercontent.com/u/1673867? \n", + "4 9967820111 https://avatars.githubusercontent.com/u/52456? \n", + "\n", + " org.gravatar_id ... payload.push_id payload.pusher_type \\\n", + "0 NaN ... 3.793486e+09 NaN \n", + "1 NaN ... 3.793486e+09 NaN \n", + "2 NaN ... NaN user \n", + "3 ... NaN NaN \n", + "4 ... 
3.793486e+09 NaN \n", + "\n", + " payload.ref payload.ref_type payload.size public \\\n", + "0 refs/heads/wip NaN 1.0 True \n", + "1 refs/heads/master NaN 2.0 True \n", + "2 Standard-OMG-mongodb branch NaN True \n", + "3 NaN NaN NaN True \n", + "4 refs/heads/domain-rev-progress-bars NaN 1.0 True \n", + "\n", + " repo.id repo.name \\\n", + "0 195824522 Jrose3797/dsc-intro-to-sets-lab-houston-ds-060319 \n", + "1 190663766 cdcabrera/curiosity-frontend \n", + "2 195819196 heaptracetechnology/mongodb \n", + "3 176502087 AdoptOpenJDK/IcedTea-Web \n", + "4 183269109 ccnmtl/ohcoe-hugo \n", + "\n", + " repo.url type \n", + "0 https://api.github.com/repos/Jrose3797/dsc-int... PushEvent \n", + "1 https://api.github.com/repos/cdcabrera/curiosi... PushEvent \n", + "2 https://api.github.com/repos/heaptracetechnolo... CreateEvent \n", + "3 https://api.github.com/repos/AdoptOpenJDK/Iced... IssueCommentEvent \n", + "4 https://api.github.com/repos/ccnmtl/ohcoe-hugo PushEvent \n", + "\n", + "[5 rows x 506 columns]" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from pandas import json_normalize\n", + "\n", + "results = response.json()\n", + "flattened_data = json_normalize(results)\n", + "\n", + "flattened_data.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Looks much cleaner, and now we have access to the information that was enclosed within those dictionaries. 
Sometimes multiple rounds of flattening will be required if the JSON data returned from the API you are working with has hierarchically nested data.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.8" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/your-code/.ipynb_checkpoints/lab-api-scavenger-checkpoint.ipynb b/your-code/.ipynb_checkpoints/lab-api-scavenger-checkpoint.ipynb new file mode 100644 index 0000000..734942e --- /dev/null +++ b/your-code/.ipynb_checkpoints/lab-api-scavenger-checkpoint.ipynb @@ -0,0 +1,1079 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "09425fc7-4f68-4b79-ae26-c279f2b8558c", + "metadata": {}, + "source": [ + "## GitHub scavenger lab and the cold joke" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "915dd6a9-c550-4474-9b86-e820276e085f", + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "from bs4 import BeautifulSoup\n", + "import pandas as pd\n", + "# from pprint import pprint\n", + "from lxml import html\n", + "from lxml.html import fromstring\n", + "import urllib.request\n", + "from urllib.request import urlopen\n", + "# import random\n", + "import re\n", + "# import scrapy\n", + "from urllib.request import Request, urlopen\n", + "import numpy as np\n", + "import base64\n", + "import os" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "4d96dca7-f871-45b1-aeb4-36b3086674e4", + "metadata": {}, + "outputs": [], + "source": [ + "import configparser\n", + "import requests\n", + "import os" + ] + }, + { + "cell_type": 
"code", + "execution_count": 287, + "id": "53dc2238-294a-4475-a8d2-55d800b41179", + "metadata": {}, + "outputs": [], + "source": [ + "config['database'] = {}\n", + "database = config['database']\n", + "database['host'] = '127.0.0.1'\n", + "database['user'] = 'jf8aconstantino'\n", + "database['pass'] = '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'\n", + "database['keep-alive'] = 'no'\n", + "database['database'] = 'database'\n", + "\n", + "with open('config.ini', 'w') as configfile:\n", + " config.write(configfile)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 288, + "id": "03c486e3-de66-4707-99d6-baef8a88140b", + "metadata": {}, + "outputs": [], + "source": [ + "config = configparser.ConfigParser()" + ] + }, + { + "cell_type": "code", + "execution_count": 289, + "id": "f59a4564-e111-410e-920a-97eeffd014dc", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['config.ini']" + ] + }, + "execution_count": 289, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "config.read('config.ini')" + ] + }, + { + "cell_type": "code", + "execution_count": 290, + "id": "d2a2d964-e9a2-47be-955d-2172304b1495", + "metadata": {}, + "outputs": [], + "source": [ + "base_contenido_repos = 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/'" + ] + }, + { + "cell_type": "code", + "execution_count": 320, + "id": "f7ace711-bbf2-4f81-92e9-d4d3c6666ac0", + "metadata": {}, + "outputs": [], + "source": [ + "username = 'jf8aconstantino'\n", + "token = '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'" + ] + }, + { + "cell_type": "code", + "execution_count": 321, + "id": "2abd3c33-c259-4bcd-9ad8-f86ef417fc85", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta = requests.get(base_contenido_repos,\n", + " auth = (username,token))" + ] + }, + { + "cell_type": "code", + "execution_count": 353, + "id": "e2f54b80-561e-494c-9a5f-8302f82549c9", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "rate_limit_status= 
' https://api.github.com/rate_limit'" + ] + }, + { + "cell_type": "code", + "execution_count": 354, + "id": "555f6cd4-4fc4-457e-8a24-b4711c193c1c", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_rate_limit = requests.get(rate_limit_status)" + ] + }, + { + "cell_type": "code", + "execution_count": 355, + "id": "0f505d00-2485-409a-a0b0-152281309e28", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 355, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_rate_limit" + ] + }, + { + "cell_type": "code", + "execution_count": 322, + "id": "c5062c5b-d2d3-4e47-a04d-864073ed0764", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "200" + ] + }, + "execution_count": 322, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.status_code" + ] + }, + { + "cell_type": "code", + "execution_count": 295, + "id": "f3b6cb3b-650d-42d3-b3fe-b7c8feb4fa93", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'Server': 'GitHub.com', 'Date': 'Thu, 26 May 2022 22:52:08 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Cache-Control': 'private, max-age=60, s-maxage=60', 'Vary': 'Accept, Authorization, Cookie, X-GitHub-OTP, Accept-Encoding, Accept, X-Requested-With', 'ETag': 'W/\"0d82bf748275dbc1239a11e9157b198d88da7383\"', 'Last-Modified': 'Wed, 19 Dec 2018 03:05:09 GMT', 'X-OAuth-Scopes': 'admin:enterprise, admin:gpg_key, admin:org, admin:org_hook, admin:public_key, admin:repo_hook, delete:packages, delete_repo, gist, notifications, repo, user, workflow, write:discussion, write:packages', 'X-Accepted-OAuth-Scopes': '', 'github-authentication-token-expiration': '2022-06-24 22:51:26 UTC', 'X-GitHub-Media-Type': 'github.v3; format=json', 'X-RateLimit-Limit': '5000', 'X-RateLimit-Remaining': '4995', 'X-RateLimit-Reset': '1653608260', 'X-RateLimit-Used': '5', 
'X-RateLimit-Resource': 'core', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '0', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': \"default-src 'none'\", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': 'FF83:7343:563495:C1497F:62900498'}" + ] + }, + "execution_count": 295, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.headers" + ] + }, + { + "cell_type": "code", + "execution_count": 296, + "id": "26349ad2-2171-4ec0-b965-7fb18235f377", + "metadata": { + "collapsed": true, + "jupyter": { + "outputs_hidden": true + }, + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'name': '.gitignore',\n", + " 'path': '.gitignore',\n", + " 'sha': 'e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'size': 10,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/.gitignore',\n", + " 'type': 'file',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',\n", + " 'git': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore'}},\n", + " {'name': '15024',\n", + " 'path': '15024',\n", + " 'sha': '2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15024',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15024'}},\n", + " {'name': '15534',\n", + " 'path': '15534',\n", + " 'sha': '5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15534',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15534'}},\n", + " {'name': '17020',\n", + " 'path': '17020',\n", + " 'sha': '9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'size': 0,\n", + " 'url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/17020?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/17020',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/17020?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/17020'}},\n", + " {'name': '30351',\n", + " 'path': '30351',\n", + " 'sha': 'c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/30351?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/30351',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/30351?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/30351'}},\n", + " {'name': '40303',\n", + " 'path': '40303',\n", + " 'sha': '30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/40303?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/40303',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'download_url': None,\n", + " 
'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/40303?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/40303'}},\n", + " {'name': '44639',\n", + " 'path': '44639',\n", + " 'sha': '22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/44639?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/44639',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/44639?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/44639'}},\n", + " {'name': '45525',\n", + " 'path': '45525',\n", + " 'sha': '6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/45525?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/45525',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/45525?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/45525'}},\n", + " 
{'name': '47222',\n", + " 'path': '47222',\n", + " 'sha': 'c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47222',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47222'}},\n", + " {'name': '47830',\n", + " 'path': '47830',\n", + " 'sha': 'f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47830?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47830',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47830?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47830'}},\n", + " {'name': '49418',\n", + " 'path': '49418',\n", + " 'sha': '46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/49418?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/49418',\n", + " 'git_url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/49418?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/49418'}},\n", + " {'name': '50896',\n", + " 'path': '50896',\n", + " 'sha': 'e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/50896?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/50896',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/50896?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/50896'}},\n", + " {'name': '55417',\n", + " 'path': '55417',\n", + " 'sha': '636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55417?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55417',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55417?ref=master',\n", + " 'git': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55417'}},\n", + " {'name': '55685',\n", + " 'path': '55685',\n", + " 'sha': 'a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55685?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55685',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55685?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55685'}},\n", + " {'name': '60224',\n", + " 'path': '60224',\n", + " 'sha': '28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/60224?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/60224',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/60224?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/60224'}},\n", + " {'name': '64880',\n", + " 'path': '64880',\n", + " 'sha': '88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'size': 0,\n", + " 'url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/64880?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/64880',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/64880?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/64880'}},\n", + " {'name': '66032',\n", + " 'path': '66032',\n", + " 'sha': '0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/66032?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/66032',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/66032?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/66032'}},\n", + " {'name': '68848',\n", + " 'path': '68848',\n", + " 'sha': 'ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/68848?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/68848',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'download_url': None,\n", + " 
'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/68848?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/68848'}},\n", + " {'name': '70751',\n", + " 'path': '70751',\n", + " 'sha': 'a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70751?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70751',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70751?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70751'}},\n", + " {'name': '70985',\n", + " 'path': '70985',\n", + " 'sha': 'd1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70985?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70985',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/d1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70985?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/d1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70985'}},\n", + " 
{'name': '88596',\n", + " 'path': '88596',\n", + " 'sha': 'f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/88596?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/88596',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/88596?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/88596'}},\n", + " {'name': '89046',\n", + " 'path': '89046',\n", + " 'sha': '5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89046?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89046',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89046?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89046'}},\n", + " {'name': '89338',\n", + " 'path': '89338',\n", + " 'sha': '79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89338?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89338',\n", + " 'git_url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89338?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89338'}},\n", + " {'name': '91701',\n", + " 'path': '91701',\n", + " 'sha': '0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/91701?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/91701',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/91701?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/91701'}},\n", + " {'name': '97881',\n", + " 'path': '97881',\n", + " 'sha': 'c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/97881?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/97881',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/97881?ref=master',\n", + " 'git': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/97881'}},\n", + " {'name': '98750',\n", + " 'path': '98750',\n", + " 'sha': 'cdc23915e0a5179127458431986ba3750840a924',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/98750',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/cdc23915e0a5179127458431986ba3750840a924',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/cdc23915e0a5179127458431986ba3750840a924',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/98750'}}]" + ] + }, + "execution_count": 296, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.json()" + ] + }, + { + "cell_type": "code", + "execution_count": 297, + "id": "69001492-83ce-46b7-9c4e-466718f5a01d", + "metadata": {}, + "outputs": [], + "source": [ + "paths_buenos = [file['path'] for file in respuesta.json() if file['path'] != '.gitignore']" + ] + }, + { + "cell_type": "code", + "execution_count": 328, + "id": "e96e8f05-1680-4e00-8342-8b0ccad4cde0", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'15024'" + ] + }, + "execution_count": 328, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "paths_buenos[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 329, + "id": "b6c17827-bfa6-4e6c-bd63-eb559b0c2115", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/'" + ] + 
}, + "execution_count": 329, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "base_contenido_repos" + ] + }, + { + "cell_type": "code", + "execution_count": 330, + "id": "3d97356d-b694-4c9c-84f7-dba6ceee9003", + "metadata": {}, + "outputs": [], + "source": [ + "for carpeta in paths_buenos:\n", + " respuesta_carpeta = requests.get(base_contenido_repos+carpeta,\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 331, + "id": "9809c3e4-3e8a-46d2-97cf-3fc5a00002e1", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_carpeta = requests.get(base_contenido_repos+paths_buenos[0],\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 332, + "id": "b529a71f-4ffd-4aaa-939f-34c16f63b4ef", + "metadata": {}, + "outputs": [], + "source": [ + "archivos_carpeta = respuesta_carpeta.json()" + ] + }, + { + "cell_type": "code", + "execution_count": 333, + "id": "80018918-8bad-4535-9847-6df118e88557", + "metadata": {}, + "outputs": [], + "source": [ + "archivos_buenos_carpeta = [archivo['path'] for archivo in archivos_carpeta\n", + " if archivo['name'].endswith('scavengerhunt')]" + ] + }, + { + "cell_type": "code", + "execution_count": 334, + "id": "c465847c-2126-41b5-a6c9-9a7782789dbd", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['15024/.0006.scavengerhunt']" + ] + }, + "execution_count": 334, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "archivos_buenos_carpeta" + ] + }, + { + "cell_type": "code", + "execution_count": 335, + "id": "72b626d6-9b08-4025-871e-d408049fa144", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos = []" + ] + }, + { + "cell_type": "code", + "execution_count": 336, + "id": "950cd8a0-d8dc-41d8-b6e4-3c2d2a91e2a3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 336, + "metadata": {}, + "output_type": "execute_result" + } 
+ ], + "source": [ + "lista_archivos" + ] + }, + { + "cell_type": "code", + "execution_count": 337, + "id": "d1b5bb58-acd9-4ac7-8721-90cc20e7eede", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos.extend(archivos_buenos_carpeta)" + ] + }, + { + "cell_type": "code", + "execution_count": 338, + "id": "c330b3fe-1fcd-4e9a-bb2e-2478eac6c3b7", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos += archivos_buenos_carpeta" + ] + }, + { + "cell_type": "code", + "execution_count": 339, + "id": "3bddf65f-b91e-4bd5-85b2-72bb21335ff8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Folder 25 of 25\r" + ] + } + ], + "source": [ + "rutas_archivos_scavenger = []\n", + "for i, carpeta in enumerate(paths_buenos):\n", + " print(f'Folder {i+1} of {len(paths_buenos)}', end = '\\r')\n", + " respuesta_carpeta = requests.get(base_contenido_repos+carpeta,\n", + " auth = (username, token))\n", + " archivos_carpeta = respuesta_carpeta.json()\n", + " archivos_utiles = [archivo['path'] for archivo in archivos_carpeta \n", + " if archivo['name'].endswith('scavengerhunt')]\n", + " rutas_archivos_scavenger += archivos_utiles" + ] + }, + { + "cell_type": "code", + "execution_count": 340, + "id": "9eb71f74-e496-42b4-8348-40053ae53b3d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['15024/.0006.scavengerhunt',\n", + " '15534/.0008.scavengerhunt',\n", + " '15534/.0012.scavengerhunt',\n", + " '17020/.0007.scavengerhunt',\n", + " '30351/.0021.scavengerhunt']" + ] + }, + "execution_count": 340, + "metadata": {}, + "output_type": "execute_result" + 
} + ], + "source": [ + "rutas_archivos_scavenger[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 342, + "id": "441388fd-779c-43b2-a23d-0f3b3840b5ec", + "metadata": {}, + "outputs": [], + "source": [ + "username = 'jf8aconstantino'\n", + "token = 'YOUR_GITHUB_TOKEN'" + ] + }, + { + "cell_type": "code", + "execution_count": 343, + "id": "38a14a1c-af0e-49b4-ad1e-773fa8b05df7", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_archivo = requests.get(base_contenido_repos+rutas_archivos_scavenger[0],\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 344, + "id": "273a1b8d-cbc3-48dc-bd42-c08dd16782e7", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "200" + ] + }, + "execution_count": 344, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.status_code" + ] + }, + { + "cell_type": "code", + "execution_count": 345, + "id": "f3c6913a-a899-4fef-a2df-6bd14d49d561", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "dict_keys(['name', 'path', 'sha', 'size', 'url', 'html_url', 'git_url', 'download_url', 'type', 'content', 'encoding', '_links'])" + ] + }, + "execution_count": 345, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json().keys()" + ] + }, + { + "cell_type": "code", + "execution_count": 346, + "id": "23ca3d9d-072a-4d94-98dd-b29e3eda92ba", + "metadata": {}, + "outputs": [], + "source": [ + "import base64" + ] + }, + { + "cell_type": "code", + "execution_count": 347, + "id": "b6bbf0ac-683e-43a6-9b5c-6f5d6e75a69b", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'name': '.0006.scavengerhunt',\n", + " 'path': '15024/.0006.scavengerhunt',\n", + " 'sha': '1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'size': 3,\n", + " 'url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024/.0006.scavengerhunt?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/master/15024/.0006.scavengerhunt',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/15024/.0006.scavengerhunt',\n", + " 'type': 'file',\n", + " 'content': 'b2YK\\n',\n", + " 'encoding': 'base64',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024/.0006.scavengerhunt?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/blob/master/15024/.0006.scavengerhunt'}}" + ] + }, + "execution_count": 347, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json()" + ] + }, + { + "cell_type": "code", + "execution_count": 348, + "id": "c893b84c-7166-44a3-b7c8-44db25cec8fb", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'b2YK\\n'" + ] + }, + "execution_count": 348, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json()['content']" + ] + }, + { + "cell_type": "code", + "execution_count": 349, + "id": "3935ebf2-9c95-4ac4-b3a8-ebbf58867b75", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'of\\n'" + ] + }, + "execution_count": 349, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "base64.b64decode(respuesta_archivo.json()['content']).decode()" + ] + }, + { + "cell_type": "code", + "execution_count": 350, + "id": "f6677da2-f526-46ba-b760-4a2301182bfe", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['98750/.0001.scavengerhunt',\n", + " 
'88596/.0002.scavengerhunt',\n", + " '60224/.0003.scavengerhunt',\n", + " '68848/.0004.scavengerhunt',\n", + " '44639/.0005.scavengerhunt']" + ] + }, + "execution_count": 350, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rutas_archivos_ordenados = sorted(rutas_archivos_scavenger, key = lambda x: x.split('/')[-1])\n", + "rutas_archivos_ordenados[:5]" + ] + }, + { + "cell_type": "code", + "execution_count": 351, + "id": "789fc2f7-d5bb-4c04-a7b8-ae11bc9987fe", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "File 24 of 24\r" + ] + } + ], + "source": [ + "mensaje = ''\n", + "for i, archivo in enumerate(rutas_archivos_ordenados):\n", + " print(f'File {i+1} of {len(rutas_archivos_ordenados)}', end = '\\r')\n", + " respuesta_archivo = requests.get(base_contenido_repos+archivo,\n", + " auth = (username, token))\n", + " contenido = respuesta_archivo.json()['content']\n", + " contenido_legible = base64.b64decode(contenido).decode()\n", + " mensaje += contenido_legible" + ] + }, + { + "cell_type": "code", + "execution_count": 352, + "id": "94d414a6-af39-4d34-a0e6-67783ab3c0b0", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "In\n", + "data\n", + "science,\n", + "80\n", + "percent\n", + "of\n", + "time\n", + "spent\n", + "is\n", + "preparing\n", + "data,\n", + "20\n", + "percent\n", + "of\n", + "time\n", + "is\n", + "spent\n", + "complaining\n", + "about\n", + "the\n", + "need\n", + "to\n", + "prepare\n", + "data.\n", + "\n" + ] + } + ], + "source": [ + "print(mensaje)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0a45de58-f4da-4bfe-8fc1-86fe3ea144da", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "015713c1-6f0f-4c9d-ac1a-f134d28bbed7", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + 
"display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/your-code/Learning Advanced APIs.ipynb b/your-code/Learning Advanced APIs.ipynb index 9e7a18c..4472bfb 100644 --- a/your-code/Learning Advanced APIs.ipynb +++ b/your-code/Learning Advanced APIs.ipynb @@ -181,9 +181,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.2" + "version": "3.8.8" } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/your-code/Learning Working with APIs.ipynb b/your-code/Learning Working with APIs.ipynb index 4e369ae..7cc71de 100644 --- a/your-code/Learning Working with APIs.ipynb +++ b/your-code/Learning Working with APIs.ipynb @@ -868,9 +868,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.2" + "version": "3.8.8" } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/your-code/config.ini b/your-code/config.ini new file mode 100644 index 0000000..a766b10 --- /dev/null +++ b/your-code/config.ini @@ -0,0 +1,12 @@ +[DEFAULT] +title = Hello world +compression = yes +compression_level = 9 + +[database] +host = 127.0.0.1 +user = jf8aconstantino +pass = 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 +keep-alive = no +database = database + diff --git a/your-code/lab-api-scavenger.ipynb b/your-code/lab-api-scavenger.ipynb new file mode 100644 index 0000000..734942e --- /dev/null +++ b/your-code/lab-api-scavenger.ipynb @@ -0,0 +1,1079 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "09425fc7-4f68-4b79-ae26-c279f2b8558c", + "metadata": {}, + "source": [ + "## Lab de scavenger de github y el cold joke##" + ] + 
}, + { + "cell_type": "code", + "execution_count": 1, + "id": "915dd6a9-c550-4474-9b86-e820276e085f", + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "from bs4 import BeautifulSoup\n", + "import pandas as pd\n", + "# from pprint import pprint\n", + "from lxml import html\n", + "from lxml.html import fromstring\n", + "import urllib.request\n", + "from urllib.request import urlopen\n", + "# import random\n", + "import re\n", + "# import scrapy\n", + "from urllib.request import Request, urlopen\n", + "import numpy as np\n", + "import base64\n", + "import os" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "4d96dca7-f871-45b1-aeb4-36b3086674e4", + "metadata": {}, + "outputs": [], + "source": [ + "import configparser\n", + "import requests\n", + "import os" + ] + }, + { + "cell_type": "code", + "execution_count": 287, + "id": "53dc2238-294a-4475-a8d2-55d800b41179", + "metadata": {}, + "outputs": [], + "source": [ + "config['database'] = {}\n", + "database = config['database']\n", + "database['host'] = '127.0.0.1'\n", + "database['user'] = 'jf8aconstantino'\n", + "database['pass'] = 'YOUR_GITHUB_TOKEN'\n", + "database['keep-alive'] = 'no'\n", + "database['database'] = 'database'\n", + "\n", + "with open('config.ini', 'w') as configfile:\n", + " config.write(configfile)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 288, + "id": "03c486e3-de66-4707-99d6-baef8a88140b", + "metadata": {}, + "outputs": [], + "source": [ + "config = configparser.ConfigParser()" + ] + }, + { + "cell_type": "code", + "execution_count": 289, + "id": "f59a4564-e111-410e-920a-97eeffd014dc", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['config.ini']" + ] + }, + "execution_count": 289, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "config.read('config.ini')" + ] + }, + { + "cell_type": "code", + "execution_count": 290, + "id": 
"d2a2d964-e9a2-47be-955d-2172304b1495", + "metadata": {}, + "outputs": [], + "source": [ + "base_contenido_repos = 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/'" + ] + }, + { + "cell_type": "code", + "execution_count": 320, + "id": "f7ace711-bbf2-4f81-92e9-d4d3c6666ac0", + "metadata": {}, + "outputs": [], + "source": [ + "username = 'jf8aconstantino'\n", + "token = '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'" + ] + }, + { + "cell_type": "code", + "execution_count": 321, + "id": "2abd3c33-c259-4bcd-9ad8-f86ef417fc85", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta = requests.get(base_contenido_repos,\n", + " auth = (username,token))" + ] + }, + { + "cell_type": "code", + "execution_count": 353, + "id": "e2f54b80-561e-494c-9a5f-8302f82549c9", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "rate_limit_status= ' https://api.github.com/rate_limit'" + ] + }, + { + "cell_type": "code", + "execution_count": 354, + "id": "555f6cd4-4fc4-457e-8a24-b4711c193c1c", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_rate_limit = requests.get(rate_limit_status)" + ] + }, + { + "cell_type": "code", + "execution_count": 355, + "id": "0f505d00-2485-409a-a0b0-152281309e28", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 355, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_rate_limit" + ] + }, + { + "cell_type": "code", + "execution_count": 322, + "id": "c5062c5b-d2d3-4e47-a04d-864073ed0764", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "200" + ] + }, + "execution_count": 322, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.status_code" + ] + }, + { + "cell_type": "code", + "execution_count": 295, + "id": "f3b6cb3b-650d-42d3-b3fe-b7c8feb4fa93", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'Server': 'GitHub.com', 'Date': 
'Thu, 26 May 2022 22:52:08 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Cache-Control': 'private, max-age=60, s-maxage=60', 'Vary': 'Accept, Authorization, Cookie, X-GitHub-OTP, Accept-Encoding, Accept, X-Requested-With', 'ETag': 'W/\"0d82bf748275dbc1239a11e9157b198d88da7383\"', 'Last-Modified': 'Wed, 19 Dec 2018 03:05:09 GMT', 'X-OAuth-Scopes': 'admin:enterprise, admin:gpg_key, admin:org, admin:org_hook, admin:public_key, admin:repo_hook, delete:packages, delete_repo, gist, notifications, repo, user, workflow, write:discussion, write:packages', 'X-Accepted-OAuth-Scopes': '', 'github-authentication-token-expiration': '2022-06-24 22:51:26 UTC', 'X-GitHub-Media-Type': 'github.v3; format=json', 'X-RateLimit-Limit': '5000', 'X-RateLimit-Remaining': '4995', 'X-RateLimit-Reset': '1653608260', 'X-RateLimit-Used': '5', 'X-RateLimit-Resource': 'core', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '0', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': \"default-src 'none'\", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': 'FF83:7343:563495:C1497F:62900498'}" + ] + }, + "execution_count": 295, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.headers" + ] + }, + { + "cell_type": "code", + "execution_count": 296, + "id": "26349ad2-2171-4ec0-b965-7fb18235f377", + "metadata": { + "collapsed": true, + "jupyter": { + "outputs_hidden": true + }, + "tags": [] + }, + "outputs": [ + 
{ + "data": { + "text/plain": [ + "[{'name': '.gitignore',\n", + " 'path': '.gitignore',\n", + " 'sha': 'e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'size': 10,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/.gitignore',\n", + " 'type': 'file',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/.gitignore?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/e43b0f988953ae3a84b00331d0ccf5f7d51cb3cf',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/blob/master/.gitignore'}},\n", + " {'name': '15024',\n", + " 'path': '15024',\n", + " 'sha': '2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15024',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/2945e51c87ad5da893c954afcf092f06343bbb7d',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15024'}},\n", + " {'name': '15534',\n", + " 'path': '15534',\n", + " 'sha': '5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'size': 0,\n", + " 'url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15534',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15534?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5af6f2a7287e4191f39e55693fc1e9c8918d1d3a',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/15534'}},\n", + " {'name': '17020',\n", + " 'path': '17020',\n", + " 'sha': '9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/17020?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/17020',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/17020?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/9c49f920aa4d9433fa99a5824128f0e6b90ec5f2',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/17020'}},\n", + " {'name': '30351',\n", + " 'path': '30351',\n", + " 'sha': 'c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/30351?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/30351',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'download_url': None,\n", + " 
'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/30351?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c488d7f64088c852e22067d48fdc64ee3670f3ba',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/30351'}},\n", + " {'name': '40303',\n", + " 'path': '40303',\n", + " 'sha': '30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/40303?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/40303',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/40303?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/30193d9cf62b07bcbb6366513ff03596861f2d29',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/40303'}},\n", + " {'name': '44639',\n", + " 'path': '44639',\n", + " 'sha': '22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/44639?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/44639',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/44639?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/22fc3d5c2db80822c351edb2248f3491c8ebda86',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/44639'}},\n", + " 
{'name': '45525',\n", + " 'path': '45525',\n", + " 'sha': '6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/45525?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/45525',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/45525?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/6a4a88cd9084110c8646c3cfd84dfe96b300a4a7',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/45525'}},\n", + " {'name': '47222',\n", + " 'path': '47222',\n", + " 'sha': 'c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47222',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47222?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c7001604cdadc2fe7b82e0f6996690718cac6941',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47222'}},\n", + " {'name': '47830',\n", + " 'path': '47830',\n", + " 'sha': 'f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47830?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47830',\n", + " 'git_url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/47830?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f84882ad7560fd2b8c6a0867bc707ce9009ef288',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/47830'}},\n", + " {'name': '49418',\n", + " 'path': '49418',\n", + " 'sha': '46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/49418?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/49418',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/49418?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/46bc658c09589d9023246b00e848ce97d30d4989',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/49418'}},\n", + " {'name': '50896',\n", + " 'path': '50896',\n", + " 'sha': 'e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/50896?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/50896',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/50896?ref=master',\n", + " 'git': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/e47a7a35a19f80694587330c57d94e28d3b4c054',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/50896'}},\n", + " {'name': '55417',\n", + " 'path': '55417',\n", + " 'sha': '636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55417?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55417',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55417?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/636fa555a2ee752759144a268fd860feb2b6fd2d',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55417'}},\n", + " {'name': '55685',\n", + " 'path': '55685',\n", + " 'sha': 'a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55685?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55685',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/55685?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a00a8148a88287508a867616d7063786d3d5d4ff',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/55685'}},\n", + " {'name': '60224',\n", + " 'path': '60224',\n", + " 'sha': '28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'size': 0,\n", + " 'url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/60224?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/60224',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/60224?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/28d70fba98bfacfaa5e5544b2eff6b61c9e8f57b',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/60224'}},\n", + " {'name': '64880',\n", + " 'path': '64880',\n", + " 'sha': '88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/64880?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/64880',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/64880?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/88b159d6f73378e6968bb35ccfd8e3ad0cc462d2',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/64880'}},\n", + " {'name': '66032',\n", + " 'path': '66032',\n", + " 'sha': '0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/66032?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/66032',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'download_url': None,\n", + " 
'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/66032?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0230fa6fa1ccf49ab976fbbfc9eb838094779785',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/66032'}},\n", + " {'name': '68848',\n", + " 'path': '68848',\n", + " 'sha': 'ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/68848?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/68848',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/68848?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/ed2f90be6835e7e74c283aedba1942b788754d32',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/68848'}},\n", + " {'name': '70751',\n", + " 'path': '70751',\n", + " 'sha': 'a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70751?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70751',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70751?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/a5d9391003b67cecf3c336398ec38cfa75a689b7',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70751'}},\n", + " 
{'name': '70985',\n", + " 'path': '70985',\n", + " 'sha': 'd1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70985?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70985',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/d1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/70985?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/d1a654c5811f52ec8a101652b0a04367644eab99',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/70985'}},\n", + " {'name': '88596',\n", + " 'path': '88596',\n", + " 'sha': 'f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/88596?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/88596',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/88596?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/f294d2a0e55a4bab12625a7f709b44450a5e4648',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/88596'}},\n", + " {'name': '89046',\n", + " 'path': '89046',\n", + " 'sha': '5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89046?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89046',\n", + " 'git_url': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89046?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/5f3ef5f14cf72bbe03a24b69777ba02f19a3adb5',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89046'}},\n", + " {'name': '89338',\n", + " 'path': '89338',\n", + " 'sha': '79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89338?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89338',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/89338?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/79c94a4032a927b2af52cc6da4ce27eb2abbf55e',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/89338'}},\n", + " {'name': '91701',\n", + " 'path': '91701',\n", + " 'sha': '0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/91701?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/91701',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/91701?ref=master',\n", + " 'git': 
'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/0ad19115f0b56c3cd10cb7e077140c201b527301',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/91701'}},\n", + " {'name': '97881',\n", + " 'path': '97881',\n", + " 'sha': 'c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/97881?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/97881',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/97881?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/c369c43c17ec44cc3e66dd27f8e557f9d15d40f4',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/97881'}},\n", + " {'name': '98750',\n", + " 'path': '98750',\n", + " 'sha': 'cdc23915e0a5179127458431986ba3750840a924',\n", + " 'size': 0,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/tree/master/98750',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/cdc23915e0a5179127458431986ba3750840a924',\n", + " 'download_url': None,\n", + " 'type': 'dir',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/98750?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/trees/cdc23915e0a5179127458431986ba3750840a924',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/tree/master/98750'}}]" + ] + }, + "execution_count": 296, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta.json()" + ] + }, + { + 
"cell_type": "code", + "execution_count": 297, + "id": "69001492-83ce-46b7-9c4e-466718f5a01d", + "metadata": {}, + "outputs": [], + "source": [ + "paths_buenos = [file['path'] for file in respuesta.json() if file['path'] != '.gitignore']" + ] + }, + { + "cell_type": "code", + "execution_count": 328, + "id": "e96e8f05-1680-4e00-8342-8b0ccad4cde0", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'15024'" + ] + }, + "execution_count": 328, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "paths_buenos[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 329, + "id": "b6c17827-bfa6-4e6c-bd63-eb559b0c2115", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/'" + ] + }, + "execution_count": 329, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "base_contenido_repos" + ] + }, + { + "cell_type": "code", + "execution_count": 330, + "id": "3d97356d-b694-4c9c-84f7-dba6ceee9003", + "metadata": {}, + "outputs": [], + "source": [ + "for carpeta in paths_buenos:\n", + " respuesta_carpeta = requests.get(base_contenido_repos+carpeta,\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 331, + "id": "9809c3e4-3e8a-46d2-97cf-3fc5a00002e1", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_carpeta = requests.get(base_contenido_repos+paths_buenos[0],\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 332, + "id": "b529a71f-4ffd-4aaa-939f-34c16f63b4ef", + "metadata": {}, + "outputs": [], + "source": [ + "archivos_carpeta = respuesta_carpeta.json()" + ] + }, + { + "cell_type": "code", + "execution_count": 333, + "id": "80018918-8bad-4535-9847-6df118e88557", + "metadata": {}, + "outputs": [], + "source": [ + "archivos_buenos_carpeta = [archivo['path'] for archivo in archivos_carpeta\n", + " if 
archivo['name'].endswith('scavengerhunt')]" + ] + }, + { + "cell_type": "code", + "execution_count": 334, + "id": "c465847c-2126-41b5-a6c9-9a7782789dbd", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['15024/.0006.scavengerhunt']" + ] + }, + "execution_count": 334, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "archivos_buenos_carpeta" + ] + }, + { + "cell_type": "code", + "execution_count": 335, + "id": "72b626d6-9b08-4025-871e-d408049fa144", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos = []" + ] + }, + { + "cell_type": "code", + "execution_count": 336, + "id": "950cd8a0-d8dc-41d8-b6e4-3c2d2a91e2a3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 336, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "lista_archivos" + ] + }, + { + "cell_type": "code", + "execution_count": 337, + "id": "d1b5bb58-acd9-4ac7-8721-90cc20e7eede", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos.extend(archivos_buenos_carpeta)" + ] + }, + { + "cell_type": "code", + "execution_count": 338, + "id": "c330b3fe-1fcd-4e9a-bb2e-2478eac6c3b7", + "metadata": {}, + "outputs": [], + "source": [ + "lista_archivos += archivos_buenos_carpeta" + ] + }, + { + "cell_type": "code", + "execution_count": 339, + "id": "3bddf65f-b91e-4bd5-85b2-72bb21335ff8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Carpeta 25 de 25\r" + ] + } + ], + "source": [ + "rutas_archivos_scavenger = []\n", + "for i, carpeta in enumerate(paths_buenos):\n", + " print(f'Carpeta {i+1} de {len(paths_buenos)}', end = '\\r')\n", + " respuesta_carpeta = requests.get(base_contenido_repos+carpeta,\n", + " auth = (username, token))\n", + " archivos_carpeta = respuesta_carpeta.json()\n", + " archivos_utiles = [archivo['path'] for archivo in archivos_carpeta \n", + " if 
archivo['name'].endswith('scavengerhunt')]\n", + " rutas_archivos_scavenger += archivos_utiles" + ] + }, + { + "cell_type": "code", + "execution_count": 340, + "id": "9eb71f74-e496-42b4-8348-40053ae53b3d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['15024/.0006.scavengerhunt',\n", + " '15534/.0008.scavengerhunt',\n", + " '15534/.0012.scavengerhunt',\n", + " '17020/.0007.scavengerhunt',\n", + " '30351/.0021.scavengerhunt']" + ] + }, + "execution_count": 340, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rutas_archivos_scavenger[:5]" + ] + }, + { + "cell_type": "code", + "execution_count": 341, + "id": "6928e3ed-97eb-4c4c-8c05-ec0f9b514864", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'15024/.0006.scavengerhunt'" + ] + }, + "execution_count": 341, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rutas_archivos_scavenger[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 342, + "id": "441388fd-779c-43b2-a23d-0f3b3840b5ec", + "metadata": {}, + "outputs": [], + "source": [ + "username = 'jf8aconstantino'\n", + "token = '<YOUR_GITHUB_TOKEN>'" + ] + }, + { + "cell_type": "code", + "execution_count": 343, + "id": "38a14a1c-af0e-49b4-ad1e-773fa8b05df7", + "metadata": {}, + "outputs": [], + "source": [ + "respuesta_archivo = requests.get(base_contenido_repos+rutas_archivos_scavenger[0],\n", + " auth = (username, token))" + ] + }, + { + "cell_type": "code", + "execution_count": 344, + "id": "273a1b8d-cbc3-48dc-bd42-c08dd16782e7", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "200" + ] + }, + "execution_count": 344, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.status_code" + ] + }, + { + "cell_type": "code", + "execution_count": 345, + "id": "f3c6913a-a899-4fef-a2df-6bd14d49d561", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + 
"dict_keys(['name', 'path', 'sha', 'size', 'url', 'html_url', 'git_url', 'download_url', 'type', 'content', 'encoding', '_links'])" + ] + }, + "execution_count": 345, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json().keys()" + ] + }, + { + "cell_type": "code", + "execution_count": 346, + "id": "23ca3d9d-072a-4d94-98dd-b29e3eda92ba", + "metadata": {}, + "outputs": [], + "source": [ + "import base64" + ] + }, + { + "cell_type": "code", + "execution_count": 347, + "id": "b6bbf0ac-683e-43a6-9b5c-6f5d6e75a69b", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'name': '.0006.scavengerhunt',\n", + " 'path': '15024/.0006.scavengerhunt',\n", + " 'sha': '1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'size': 3,\n", + " 'url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024/.0006.scavengerhunt?ref=master',\n", + " 'html_url': 'https://github.com/ironhack-datalabs/scavenger/blob/master/15024/.0006.scavengerhunt',\n", + " 'git_url': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'download_url': 'https://raw.githubusercontent.com/ironhack-datalabs/scavenger/master/15024/.0006.scavengerhunt',\n", + " 'type': 'file',\n", + " 'content': 'b2YK\\n',\n", + " 'encoding': 'base64',\n", + " '_links': {'self': 'https://api.github.com/repos/ironhack-datalabs/scavenger/contents/15024/.0006.scavengerhunt?ref=master',\n", + " 'git': 'https://api.github.com/repos/ironhack-datalabs/scavenger/git/blobs/1c9064284a24b3486015eafdb391b141c27ada2b',\n", + " 'html': 'https://github.com/ironhack-datalabs/scavenger/blob/master/15024/.0006.scavengerhunt'}}" + ] + }, + "execution_count": 347, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json()" + ] + }, + { + "cell_type": "code", + "execution_count": 348, + "id": "c893b84c-7166-44a3-b7c8-44db25cec8fb", + "metadata": 
{}, + "outputs": [ + { + "data": { + "text/plain": [ + "'b2YK\\n'" + ] + }, + "execution_count": 348, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "respuesta_archivo.json()['content']" + ] + }, + { + "cell_type": "code", + "execution_count": 349, + "id": "3935ebf2-9c95-4ac4-b3a8-ebbf58867b75", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'of\\n'" + ] + }, + "execution_count": 349, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "base64.b64decode(respuesta_archivo.json()['content']).decode()" + ] + }, + { + "cell_type": "code", + "execution_count": 350, + "id": "f6677da2-f526-46ba-b760-4a2301182bfe", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['98750/.0001.scavengerhunt',\n", + " '88596/.0002.scavengerhunt',\n", + " '60224/.0003.scavengerhunt',\n", + " '68848/.0004.scavengerhunt',\n", + " '44639/.0005.scavengerhunt']" + ] + }, + "execution_count": 350, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rutas_archivos_ordenados = sorted(rutas_archivos_scavenger, key = lambda x: x.split('/')[-1])\n", + "rutas_archivos_ordenados[:5]" + ] + }, + { + "cell_type": "code", + "execution_count": 351, + "id": "789fc2f7-d5bb-4c04-a7b8-ae11bc9987fe", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Archivo 24 de 24\r" + ] + } + ], + "source": [ + "mensaje = ''\n", + "for i, archivo in enumerate(rutas_archivos_ordenados):\n", + " print(f'Archivo {i+1} de {len(rutas_archivos_ordenados)}', end = '\\r')\n", + " respuesta_archivo = requests.get(base_contenido_repos+archivo,\n", + " auth = (username, token))\n", + " contenido = respuesta_archivo.json()['content']\n", + " contenido_legible = base64.b64decode(contenido).decode()\n", + " mensaje += contenido_legible" + ] + }, + { + "cell_type": "code", + "execution_count": 352, + "id": "94d414a6-af39-4d34-a0e6-67783ab3c0b0", + "metadata": {}, + 
"outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "In\n", + "data\n", + "science,\n", + "80\n", + "percent\n", + "of\n", + "time\n", + "spent\n", + "is\n", + "preparing\n", + "data,\n", + "20\n", + "percent\n", + "of\n", + "time\n", + "is\n", + "spent\n", + "complaining\n", + "about\n", + "the\n", + "need\n", + "to\n", + "prepare\n", + "data.\n", + "\n" + ] + } + ], + "source": [ + "print(mensaje)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0a45de58-f4da-4bfe-8fc1-86fe3ea144da", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "015713c1-6f0f-4c9d-ac1a-f134d28bbed7", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.8" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}