63 changes: 63 additions & 0 deletions .ipynb_checkpoints/README (cleaning process)-checkpoint.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d741c99f",
"metadata": {},
"source": [
"## Technique 1\n",
"\n",
"First, we drop all columns with mostly null values, which in this case are the two last ones. Both are calles \"Unnamed\".\n",
"\n",
"## Technique 2\n",
"\n",
"Then, we sort rows by the \"Original Order\" column, and then drop this column. We do this to reduce the number of columns, since once sorted, the index will replace this original order column.\n",
"\n",
"## Technique 3\n",
"\n",
"Replace spaces and dots with underscores and remove capital letters from column names, as well as strip leading and trailing white spaces\n",
"\n",
"## Technique 4\n",
"\n",
"Drop rows with null values in columns with mostly non-null values (\"Country\", Area\", \"Location\", \"Activity\", \"Name\", \"Sex\", \"Injury\", \"Fatal\", \"Investigator or Source\", \"href formula\", \"h ref\")\n",
"\n",
"## Technique 5\n",
"\n",
"Correct or drop rows with wrong values in sex column (which can only be \"M\" or \"F\")\n",
"\n",
"## Technique 6\n",
"\n",
"DROP rows with nonnumeric value in \"age\" columns and changing column type to float\n",
"\n",
"## Technique 7\n",
"\n",
"Strip spaces from values in \"fatal_(y/n)\" column and drop values which are not \"Y\" or \"N\"\n",
"\n",
"## Technique 8\n",
"\n",
"After having fixed all fixable values in columns, drop columns with null values because we consider they are not important for our analysis (\"species\" and \"time\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
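The notebook cell above describes the cleaning rules only in prose. As a minimal sketch of Techniques 3, 5 and 6, the per-value logic might look like the following. This is not the author's implementation: the helper names `normalize_column_name`, `clean_sex` and `parse_age` are hypothetical, the order of operations in Technique 3 is an assumption, and reading "correct" in Technique 5 as trimming and upper-casing is also an assumption. Only the standard library is used here, so the helpers would still need to be applied to the actual data set.

```python
import math
import re


def normalize_column_name(name: str) -> str:
    """Technique 3 (assumed order): strip, lowercase, then replace spaces and dots."""
    stripped = name.strip()
    lowered = stripped.lower()
    # Each space or dot becomes one underscore; other characters are kept.
    return re.sub(r"[ .]", "_", lowered)


def clean_sex(value):
    """Technique 5: return "M" or "F", or None when the value cannot be corrected.

    Assumption: "correcting" means trimming whitespace and upper-casing.
    """
    if not isinstance(value, str):
        return None
    candidate = value.strip().upper()
    return candidate if candidate in {"M", "F"} else None


def parse_age(value):
    """Technique 6: return the age as a float, or None when it is not numeric."""
    try:
        age = float(str(value).strip())
    except ValueError:
        return None
    # Treat "nan" and "inf" strings as nonnumeric for the purposes of this sketch.
    return age if math.isfinite(age) else None


if __name__ == "__main__":
    print(normalize_column_name(" Fatal (Y/N) "))
    print(clean_sex(" m "), clean_sex("lli"))
    print(parse_age("23"), parse_age("Teen"))
```

Rows for which `clean_sex` or `parse_age` returns `None` are the ones the cell describes dropping. Note that the normalized name of a `Fatal (Y/N)` header under these rules matches the `fatal_(y/n)` name used in Technique 7.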