- I used an adjacency list to implement the graph as instructed, since this is efficient for sparse graphs. To build the adjacency list I used a C++ map that stores the URLs as keys, which allows fast lookup (there would be a lot of lookups) and meant I did not need to encode URLs into some index representation.
- The value for each key is a Page object holding the information about that page that the algorithm needs, which allows easy access to a page and its data. The Page object also stores a list of outgoing edges, which serves as the edge list for that node in the adjacency list.
What is the computational complexity of each method in your implementation in the worst case in terms of Big O notation? [5 points]
- insertEdge method: Worst case O(n). C++'s std::map is implemented as a balanced tree behind the scenes, so inserting a new key-value pair is O(log(n)). After insertion we access the entry again (an extra O(log(n)) lookup, kept for readability) and push the new edge onto the edge list, which is O(n) worst case for a vector because of resizing (amortized it is only O(1)). 2 * O(log(n)) + O(n) = O(n).
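As a minimal sketch of this approach (the `Page` type and the field names here are my assumptions for illustration, not the exact submission):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical Page record: a rank plus the outgoing-edge list described above.
struct Page {
    double rank = 0.0;
    std::vector<std::string> out;  // outgoing edges (destination URLs)
};

// Two O(log n) map operations plus one vector push_back
// (amortized O(1), worst case O(n) when the vector resizes).
void insertEdge(std::map<std::string, Page>& graph,
                const std::string& from, const std::string& to) {
    graph[to];                      // O(log n): ensure the target node exists
    graph[from].out.push_back(to);  // O(log n) lookup + append the edge
}
```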
- initializeGraph method: Worst case O(n). We iterate through the entire graph and initialize each Page object with a rank of 1/n, where n is the number of pages. This is O(n) time.
- powerIterate method: Worst case O(n^2). We iterate through the entire graph and compute the new rank for each page, which is O(n). Inside that loop we iterate through the page's outgoing edges, which can number up to n, so the inner loop is O(n) in the worst case. Together this gives O(n^2).
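A sketch of the initialization and one power iteration described above (this is a simplified rank update with no damping factor; the `Page` type and names are assumptions for illustration):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical Page record, as in the description above.
struct Page {
    double rank = 0.0;
    std::vector<std::string> out;  // outgoing edges (destination URLs)
};

// O(n): give every page an initial rank of 1/n.
void initializeGraph(std::map<std::string, Page>& graph) {
    double init = 1.0 / graph.size();
    for (auto& [url, page] : graph) page.rank = init;
}

// One power iteration: the outer loop visits n pages, and the inner loop can
// touch up to n outgoing edges per page, hence O(n^2) in the worst case.
void powerIterate(std::map<std::string, Page>& graph) {
    std::map<std::string, double> next;
    for (const auto& [url, page] : graph) {
        if (page.out.empty()) continue;  // nothing to distribute
        double share = page.rank / page.out.size();
        for (const auto& to : page.out)
            next[to] += share;  // each in-link contributes its share
    }
    for (auto& [url, page] : graph)
        page.rank = next.count(url) ? next[url] : 0.0;
}
```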
- rankPages method: Worst case O(k*n^2). initializeGraph is O(n) and powerIterate is O(n^2); we call powerIterate k times, which results in a time complexity of O(k*n^2).
- getRanksAsString method: Worst case O(n^2). We iterate through each page in the graph and get its rank, which is O(n); however, the string concatenation inside each iteration is not constant and is O(n) in the worst case.
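A sketch of that concatenation pattern (again with an assumed `Page` type; the output format here is illustrative):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical Page record, as in the description above.
struct Page {
    double rank = 0.0;
    std::vector<std::string> out;
};

// O(n) iterations; each += may copy the whole string built so far,
// which is O(n) in the worst case, giving O(n^2) overall.
std::string getRanksAsString(const std::map<std::string, Page>& graph) {
    std::string result;
    for (const auto& [url, page] : graph)
        result += url + " " + std::to_string(page.rank) + "\n";
    return result;
}
```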
- parseInput method: Worst case O(k*n^2). We iterate n times, where n is the number of lines in the input file (i.e., pages), which is O(n). Inside that loop we call insertEdge, which is O(n), giving O(n^2) for parsing. We then call rankPages, which has a worst-case time complexity of O(k*n^2), so that term dominates.
What is the computational complexity of your main method in your implementation in the worst case in terms of Big O notation? [5 points]
- The computational complexity of the main method in my implementation is O(k*n^2), where n is the number of pages and k is the number of power iterations.
- This is because we iterate through the input file, which is a factor of n, and insert each edge, also a factor of n, which together is n^2.
- Then we rank the pages, which is O(k*n^2) where k is the number of power iterations. This results in an overall time complexity of O(k*n^2).
What did you learn from this assignment, and what would you do differently if you had to start over? [2 points]
- I learned how to implement a graph using an adjacency list. Because I have studied neural networks and simple backprop/forwardprop algorithms, the matrix implementation made a lot of sense to me, but until this assignment I struggled to wrap my head around implementing adjacency lists.
- If I could start over, I would immediately draw some matrix and adjacency-list representations of PageRank graphs to help me understand how the algorithm works in each implementation and how each method handles it. This is what I ended up doing anyway, and it genuinely helped.