This project is divided into two parts. In part 1, the database is created using the pymed module to extract information from PubMed. In part 2, the database is loaded from a pickle file into a pandas DataFrame, cleaned, and analyzed. Many plots have been kept purely for the sake of data exploration.
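As a rough illustration of the part 2 workflow (load from pickle, then clean), here is a minimal sketch. The column names (`pubmed_id`, `abstract`, `publication_date`), the helper `load_and_clean`, and the specific cleaning steps are assumptions made for this example and may not match the project's actual schema or code. A small toy DataFrame stands in for the real PubMed data so the snippet runs on its own.

```python
import os
import tempfile

import pandas as pd


def load_and_clean(pickle_path):
    """Load a pickled DataFrame and apply a few basic cleaning steps.

    Hypothetical helper: the column names below are assumed for illustration.
    """
    df = pd.read_pickle(pickle_path)
    # Keep one row per article (assumed key column: pubmed_id).
    df = df.drop_duplicates(subset="pubmed_id")
    # Drop articles that have no abstract.
    df = df.dropna(subset=["abstract"])
    # Parse date strings; unparseable values become NaT instead of raising.
    df = df.assign(
        publication_date=pd.to_datetime(df["publication_date"], errors="coerce")
    )
    return df.reset_index(drop=True)


# Toy stand-in for the database built in part 1 (made-up rows, not real data).
raw = pd.DataFrame(
    {
        "pubmed_id": ["1", "2", "2", "3"],
        "abstract": ["first abstract", "second abstract", "second abstract", None],
        "publication_date": ["2020-01-15", "2019-06-01", "2019-06-01", "2021-03-10"],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "articles.pkl")
    raw.to_pickle(path)
    cleaned = load_and_clean(path)

print(cleaned)
```

In this toy run, the duplicated article and the article without an abstract are dropped, leaving two rows with parsed dates.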