
Issues loading CSV into SQL Server database #2

@wangw23

Description


Loading icd9.txt into a Microsoft SQL Server database fails for three reasons. First, quotation marks are not used consistently across columns and rows. Second, each row ends with an extra tab, which causes issues because the file is tab-delimited. Third, byte 0xE4 appears in icd9.txt and causes errors during the load.
The following line of bash addresses all three issues by removing the quotation marks, the tab at the end of each row, and every byte in the range 128-255 (which includes 0xE4). Note that the \t and \dNNN escapes require GNU sed.
cat icd9.txt | tr -d '"' | sed 's/\t$//g' | LANG=C sed 's/[\d128-\d255]//g' > icd9.csv
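
For systems without GNU sed, the same three cleanup steps can be sketched with plain POSIX tools. This is only an illustration, not the command used above: the function name clean_icd9 and the sample rows are made up, and it assumes that deleting every byte from 0x80 to 0xFF is acceptable, as in the original one-liner.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original issue): reads tab-delimited
# rows on stdin and writes the cleaned rows to stdout.
clean_icd9() {
    # A literal tab character, because \t inside a sed pattern is a GNU extension.
    tab=$(printf '\t')
    # Step 1 and 3: in the C locale tr works byte by byte, so one call can
    # delete double quotes and every byte in 0x80-0xFF (octal 200-377).
    # Step 2: sed then removes a single trailing tab from each row.
    LC_ALL=C tr -d '"\200-\377' | sed "s/${tab}\$//"
}

# Made-up sample rows: inconsistent quoting, a trailing tab on each row,
# and one 0xE4 byte (octal 344) in the second row.
printf '"001"\t"Cholera"\t\n002\tSj\344gren\t\n' | clean_icd9
```

With the real file, the call would be clean_icd9 < icd9.txt > icd9.csv. Running the byte deletion before sed keeps sed from ever seeing the non-ASCII bytes, which some sed implementations may reject in a UTF-8 locale.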
