HW3 Review by Matt Lichti

Good work. 

I liked your code for converting the timedelta64 values into a number of days:
data['age_days'] = data['age'].astype("timedelta64[D]")
My code looked a bit odd but gave the same results:
student_logins['account_age'] /= np.timedelta64(1, 'D')
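
For comparison, here is a minimal sketch of the conversion on made-up data (the three sample values are hypothetical, not from the homework set). It shows the division approach next to the `.dt.days` accessor. My understanding is that recent pandas versions no longer accept the astype("timedelta64[D]") form, so the division may be the safer habit, but treat that as an assumption to check against your installed version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: an 'age' column holding timedelta64 values.
data = pd.DataFrame(
    {"age": pd.to_timedelta(["1 days", "2 days 12:00:00", "10 days"])}
)

# Dividing by a one-day timedelta gives fractional days as floats.
data["age_days"] = data["age"] / np.timedelta64(1, "D")

# The .dt.days accessor gives whole days as integers (partial days dropped).
data["age_whole_days"] = data["age"].dt.days

print(data)
```

Note that the division keeps the half day in the second row, while `.dt.days` drops it, so the two only agree when every value is a whole number of days.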

Your code for adding the class dummy variables was really good, since it works with any number of classes:
for j in data['class_id'].unique():
    label = "class_" + str(j)
    data[label] = data['class_id'] == j

My code was a little different and not as reusable:
class_dummies = pd.get_dummies(logins['class_id'])  # public API; I originally called pd.core.reshape.get_dummies
logins[['class_a', 'class_c', 'class_e', 'class_g', 'class_m']] = class_dummies

Excluding one of the class IDs makes a lot of sense, since every entry is in exactly one class and the full set of dummies always sums to 1. I didn't even think about that when I ran my model and got the collinearity warning.
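
A sketch of how I could have made my version reusable and dropped one class in the same step. The class IDs are taken from my column names above; the rows themselves are made up. `prefix` builds the column names for whatever classes appear, and `drop_first=True` leaves out the first category (here class a) as the baseline.

```python
import pandas as pd

# Hypothetical rows; only the class IDs (a, c, e, g, m) come from my data.
logins = pd.DataFrame({"class_id": ["a", "c", "e", "a", "g", "m"]})

# One dummy column per class, named class_<id>, minus the first category.
class_dummies = pd.get_dummies(
    logins["class_id"], prefix="class", drop_first=True
)

# Attach the dummies without hard-coding the column names.
logins = pd.concat([logins, class_dummies], axis=1)

print(list(class_dummies.columns))
```

With one class dropped, a row of all zeros means "class a", so the dummies no longer add up to the intercept column.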

For the OLS regression I set X and y like you did, except without the '.values' at the end:
y = logins['duration'] instead of y = logins['duration'].values. That meant my results summary showed the names of the X variables, which I think makes the results easier to understand.

Our final models were very similar. We both had an R-squared of 0.486, meaning our models explained 48.6% of the variation in y. Calculating MSE was good. Minimizing MSE is effectively the same as maximizing R-squared, since R-squared = 1 - (MSE / variance(y)) and variance(y) is fixed for a given data set.
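
A quick numeric check of that identity on synthetic data (not our homework data). It assumes MSE is the mean of the squared residuals on the training data and variance(y) is the population variance, which is NumPy's default.

```python
import numpy as np

# Synthetic data: a noisy straight line.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 1.5 * x + rng.normal(scale=3.0, size=200)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Mean squared error of the fit.
mse = np.mean(residuals ** 2)

# R-squared from its usual definition: 1 - SS_res / SS_tot.
r_squared = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

# R-squared rebuilt from MSE and the population variance (ddof=0).
r_squared_from_mse = 1 - mse / np.var(y)

print(r_squared, r_squared_from_mse)
```

The two numbers match because dividing both sums of squares by n does not change their ratio.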

@ghego, @craigsakuma, @kebaler

