Pandas (Day 18) #15

@xinyushi

Hello Professor,

Here are my solutions to the exercise:

Exercise:
Given the dataframe df below, find the following

  • Last two rows of columns A and D
    df.iloc[[6,7], [0,3]]

  • Last three rows that satisfy column A > column B

last3rows = df.iloc[5:8]  # last three rows
print(last3rows['A'] > last3rows['B'])  # prints True for each row where column A > column B
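The two selections above can also be written without hardcoding row positions. The sketch below is only an illustration: the exercise's df is not shown in this issue, so it builds a hypothetical 8-row frame with columns A to D, and the variable names are made up for the example. It also shows both readings of the second bullet, since "last three rows that satisfy A > B" could mean filtering within the last three rows or taking the last three of all matching rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the exercise's df: 8 rows, columns A-D.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), columns=list("ABCD"))

# Last two rows of columns A and D, selected by label and tail().
last_two_AD = df[["A", "D"]].tail(2)

# Reading 1: restrict to the last three rows, then keep those with A > B.
last3 = df.tail(3)
last3_filtered = last3[last3["A"] > last3["B"]]

# Reading 2: keep every row with A > B, then take the last three of those.
filtered_last3 = df[df["A"] > df["B"]].tail(3)

print(last_two_AD)
print(last3_filtered)
print(filtered_last3)
```

Using a boolean mask inside the brackets returns the matching rows themselves rather than a True/False series, which may be closer to what the exercise asks for.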

For the Titanic data, I wrote a possible exercise.

Background: In data analysis, knowing how to properly fill in missing data is very important. Sometimes we don't want to simply ignore missing values, especially when the number of observations is small. There are various ways to fill them in, such as using the column mean or K-Nearest Neighbors (KNN) methods.

Question: Here, as an exercise, replace the missing values in the age column with the column mean.

Solution

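ages = df["age"]
ageMean = ages.mean()  # mean() skips missing values by default
df["age"] = ages.fillna(ageMean)  # replace each missing age with the mean
df["age"]  # display the filled column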
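To check that the solution behaves as intended, here is a minimal sketch on a tiny hypothetical frame (the five ages below are invented for illustration and are not taken from the Titanic data). It counts the missing values first, then confirms that none remain after filling.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the Titanic "age" column, with two missing values.
df = pd.DataFrame({"age": [22.0, np.nan, 30.0, np.nan, 38.0]})

n_missing = df["age"].isna().sum()      # number of missing ages before filling
age_mean = df["age"].mean()             # NaN is skipped: (22 + 30 + 38) / 3
df["age"] = df["age"].fillna(age_mean)  # fill each missing age with the mean

print(n_missing, age_mean)
print(df["age"].isna().sum())           # no missing values remain
```

Assigning the result back to df["age"], as the solution does, avoids relying on in-place modification of a column selection.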

Here is a typo I found:
“depending on they type of data held in the series”
“they” should be “the”
