Added the other types methods to get player stats. Added an annual st… #4
matthewstirling wants to merge 5 commits into mdgoldberg:master
…ats merge method.
…olidated for ease.
I added the collegeid method from the other branch. I'll close that pull request.
mdgoldberg left a comment
There was a problem hiding this comment.
See comments. Thanks again for contributing!! Love reading your PRs. Great to look over the nfl code, which I haven't touched for a while, and to see it being expanded.
        return college

    @sportsref.decorators.memoize
    def collegeid(self):
There was a problem hiding this comment.
isn't there already a "college" function right above this one?
There was a problem hiding this comment.
the other college function returns the college. this returns the player's college id so you could cross-reference their college page.
There was a problem hiding this comment.
I guess maybe the name isn't clear, but the other function returns the college ID too, I'm pretty sure
There was a problem hiding this comment.
I don't think I've been clear. For Brady:
p.college() yields u'michigan'
p.collegeid() yields u'tom-brady-1'
The first one is Tom Brady's college's ID for the cfb website. The one I added yields Tom Brady's ID for the cfb website. My method is useful for looking up college info for an existing NFL player.
There was a problem hiding this comment.
Ohh I see. I'm sorry I should have figured that out, I didn't look closely enough. Sounds good, want to change it to college_player_id() just to be super clear? Usually in comments and stuff when I say "{X} ID" I mean an ID of an X, where X is "boxscore", "player", etc. so I think just to be clear it should be college_player_id() or ncaaf_player_id() maybe?
There was a problem hiding this comment.
changed to ncaaf_player_id()
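To make the distinction in this thread concrete, here is a minimal sketch of pulling both kinds of ID out of college-football links. The link shapes and the helper names `school_id_from_href` and `player_id_from_href` are assumptions for illustration; they are not part of sportsref and may not match how the PR extracts the IDs.

```python
import re

# Assumed link shapes on the cfb site (illustrative only):
#   /cfb/schools/michigan/         -> school ID "michigan"
#   /cfb/players/tom-brady-1.html  -> player ID "tom-brady-1"
_SCHOOL_RE = re.compile(r'/cfb/schools/([^/]+)/?')
_PLAYER_RE = re.compile(r'/cfb/players/([^/.]+)\.html')


def school_id_from_href(href):
    """Return the college's ID from a school link, or None if it doesn't match."""
    match = _SCHOOL_RE.search(href)
    return match.group(1) if match else None


def player_id_from_href(href):
    """Return the player's college ID from a player link, or None if it doesn't match."""
    match = _PLAYER_RE.search(href)
    return match.group(1) if match else None
```

With these hypothetical helpers, a school link gives the value `college()` returns for Brady, and a player link gives the value `ncaaf_player_id()` returns.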
        :returns: Pandas DataFrame with defense/fumble stats.
        """
        doc = self.get_doc()
        table = (doc('#all_defense') if kind == 'R'
There was a problem hiding this comment.
I think it's a bit cleaner (and I just made this change myself to the pre-existing functions, will be included in the next release) to change these table assignments to doc('table#defense') for example, instead of doc('#all_defense'). That's more explicit, and the new utils.get_html should allow you to access table#defense. I realize utils.parse_table works on the div#all_defense div as well, but since this could change, I think it makes the most sense to use table#defense. It'd be great if you could make this change here and on the other functions you added (basically just adding table before the # and getting rid of all_).
There was a problem hiding this comment.
Thanks for the suggestion. I changed all of the methods I wrote to incorporate this approach.
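The rename requested above is mechanical: put `table` in front of the `#` and drop the `all_` prefix. A small sketch of that rule, using a hypothetical helper `table_selector` that is not part of sportsref:

```python
def table_selector(wrapper_selector):
    """Turn a wrapper-div selector such as '#all_defense' into 'table#defense'.

    Hypothetical helper for illustration; it only rewrites the selector string
    and does not fetch or parse any HTML.
    """
    prefix = '#all_'
    if not wrapper_selector.startswith(prefix):
        raise ValueError('expected a selector starting with {!r}'.format(prefix))
    return 'table#' + wrapper_selector[len(prefix):]
```

For example, `table_selector('#all_defense')` gives `'table#defense'`, which is the form the review asks for in each `doc(...)` call.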

    @sportsref.decorators.memoize
    @sportsref.decorators.kind_rpb(include_type=True)
    def all_annual_stats(self, kind='R'):
There was a problem hiding this comment.
I haven't actually tested this function, but it looks good to me.
There was a problem hiding this comment.
I'll use this function a lot, so I'll thoroughly test it on pretty much every possible player once I get rolling.
There was a problem hiding this comment.
Sounds good. On my todo list for the overall project is to write an actual test suite but I probably won't until I finish my thesis. But until then I'm fine with you just opening new pull requests or whatever if there are bugs after it's merged in.
        """
        # set link and table_name and then get the pyquery table
        link = sportsref.nfl.BASE_URL + \
            '/teams/{}/{}_injuries.htm'.format(self.teamID, str(year))
There was a problem hiding this comment.
This should be done using self.get_year_doc. See the roster function for reference.
There was a problem hiding this comment.
get_year_doc returns the html in pyquery form for the team and year combo. For example the link:
"http://www.pro-football-reference.com/teams/rav/2012.htm"
The injury and snap count info are not on that page as far as I can tell. They are on related, but different links:
"http://www.pro-football-reference.com/teams/rav/2012_injuries.htm"
and
"http://www.pro-football-reference.com/teams/rav/2012-snap-counts.htm"
These are entirely different HTML pages with different content. Am I missing something? Is the data on the base page but hidden and still parsable?
There was a problem hiding this comment.
You'd get those as self.get_year_doc('2012_injuries') and self.get_year_doc('2012-snap-counts'). See the way they're used in the roster function. It just wraps those first two lines in a function, since repeating them gets tedious when every method does the same thing.
There was a problem hiding this comment.
Didn't see that. Very nice stuff there. Made the changes.
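For readers following along, the URL pattern behind this exchange can be sketched as below. This is an assumption-laden illustration, not the library's code: `year_page_url` is a hypothetical standalone function, and the real `get_year_doc` is a method that also fetches and parses the page.

```python
# Base URL taken from the links quoted earlier in this thread.
BASE_URL = 'http://www.pro-football-reference.com'


def year_page_url(team_id, year_page):
    """Build the URL for a team's year page or one of its sub-pages.

    `year_page` is the file stem, e.g. '2012', '2012_injuries' or
    '2012-snap-counts'. Hypothetical helper; it only formats a string.
    """
    return '{}/teams/{}/{}.htm'.format(BASE_URL, team_id, year_page)
```

Passing `'2012_injuries'` or `'2012-snap-counts'` as the stem reproduces the two links from the question, which is why one shared helper can replace the hand-built `link` in each method.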
        """
        # set link and table_name and then get the pyquery table
        link = sportsref.nfl.BASE_URL + \
            '/teams/{}/{}-snap-counts.htm'.format(self.teamID, str(year))
There was a problem hiding this comment.
Use self.get_year_doc here too
There was a problem hiding this comment.
please see comment above.
There was a problem hiding this comment.
See above. Also, in the line below this, change '#all_snap_counts' to 'table#snap_counts' once you merge in the new master branch.
I made all the table method changes. Please see the comments for the other three items.
I think I've caught up with all your requested changes on this pull request.
I added methods to get defensive stats, return stats, kicking stats, and the other types of statistics tables found on PFR pages. I also created an all_annual_stats method in player that combines all the different locations for annual player stats into one method call that returns a DataFrame.
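The idea of combining several per-year tables into one can be sketched without any dependencies. This is not the PR's implementation (which returns a pandas DataFrame); `merge_annual_rows`, the `year` key, and the column names and values below are made up for illustration.

```python
from collections import defaultdict


def merge_annual_rows(*tables, key='year'):
    """Merge several lists of per-year row dicts into one row per year.

    Each table is a list of dicts that all contain `key`. Columns from later
    tables are added to the row for the same year; years missing from a table
    simply lack that table's columns.
    """
    merged = defaultdict(dict)
    for rows in tables:
        for row in rows:
            merged[row[key]].update(row)
    # Return rows ordered by year for stable output.
    return [merged[year] for year in sorted(merged)]


# Illustrative, made-up rows standing in for two stat tables.
passing = [{'year': 2011, 'pass_att': 10}]
rushing = [{'year': 2011, 'rush_att': 2}, {'year': 2012, 'rush_att': 3}]
combined = merge_annual_rows(passing, rushing)
```

Here `combined` holds one row for 2011 with both passing and rushing columns and one row for 2012 with rushing columns only.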