etl homework #35

Open
leomeyer1908 wants to merge 1 commit into HedgeApple:master from leomeyer1908:master

@leomeyer1908

My code opens "homework.csv" and passes it to csv.reader from Python's csv module, which returns a reader object for the file. I then extract the header row and all subsequent data rows from the reader and pass them to the transform_data function, along with output_headers: the list of desired headers, taken from "example.csv", that each entry from the input file should conform to.
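As an illustration of this reading step (a minimal sketch, not the code in this PR), the header and data rows can be pulled from a reader as shown here. The helper name `read_csv` and the sample column names are hypothetical, and an in-memory string stands in for "homework.csv" so the snippet runs on its own.

```python
import csv
import io


def read_csv(handle):
    """Return (header, rows) from an open CSV file-like object."""
    reader = csv.reader(handle)
    header = next(reader)   # first row holds the column names
    rows = list(reader)     # every remaining row is data
    return header, rows


# Hypothetical sample data; the real column names are not shown in this PR text.
sample = io.StringIO('item code,upc,wholesale ($)\nA1,012345678905,"$1,200.00"\n')
header, rows = read_csv(sample)
```

With a real file, the handle would typically come from `open("homework.csv", newline="")`, which is the form the csv module documentation recommends.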

Within transform_data, I first build a list of dictionaries, one per data row in "homework.csv". Each dictionary maps every input header to that row's value for the header. This structure allows O(1) lookup of a value by header name in the next step, where we iterate through each row, without having to search for the index of the appropriate header.
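A small sketch of that structure, assuming `header` is a list of column names and `rows` is a list of lists as returned by csv.reader. The helper name `rows_to_dicts` and the sample values are hypothetical.

```python
def rows_to_dicts(header, rows):
    """Pair each row's values with the header names, one dict per row."""
    return [dict(zip(header, row)) for row in rows]


# Hypothetical sample input
header = ["item code", "upc"]
rows = [["A1", "012345678905"], ["B2", "036000291452"]]

input_items = rows_to_dicts(header, rows)
upc_of_first = input_items[0]["upc"]  # dictionary lookup by header name
```

The standard library's csv.DictReader produces the same kind of per-row mapping directly, which would be an alternative to building the dictionaries by hand.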

Next, I create output_items, a similar list of dictionaries for the output. I iterate through each row, map each output header to the corresponding input header, and apply the transformations specified in the instructions. Only two were required: converting UPC to EAN13 and transforming the currency. Converting UPC to EAN13 involved prepending a '0' to the value and adding dashes at the proper locations. Transforming the currency involved representing empty entries as "0.00", removing the dollar sign if present, removing commas, and rounding to 2 decimal places.
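The two transformations could look roughly like the sketch below. It is not the code in this PR: the function names are hypothetical, the dash positions (the `groups` default) are an assumption because the description does not state where the dashes go, and rounding half up via `decimal` is one possible choice of rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP


def normalize_currency(raw):
    """Strip '$' and commas, treat empty input as 0.00, round to 2 places."""
    cleaned = raw.strip().replace("$", "").replace(",", "")
    if not cleaned:
        return "0.00"
    amount = Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return format(amount, "f")


def upc_to_ean13(upc, groups=(3, 5, 4, 1)):
    """Prepend '0' to a 12-digit UPC and join digit groups with dashes.

    The group sizes are an assumed layout, not taken from the instructions.
    """
    digits = "0" + upc.strip()
    parts = []
    start = 0
    for size in groups:
        parts.append(digits[start:start + size])
        start += size
    return "-".join(parts)


price = normalize_currency("$1,234.5")   # "1234.50"
ean = upc_to_ean13("012345678905")       # "001-23456-7890-5" with the assumed grouping
```

Using `Decimal` instead of `float` avoids binary floating-point surprises when rounding values such as 2.345.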

After all transformations are complete, output_items contains a list of dictionaries corresponding to the input rows, but with the output headers and the necessary transformations applied. I then use output_headers to build a row for each item and return the result as output_data. Finally, I use a CSV writer object to write the output header and then each row of output_data to "formatted.csv".
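A minimal sketch of this last step, assuming each entry of `output_items` is a dictionary keyed by output header. The helper names `items_to_rows` and `write_csv` and the sample headers are hypothetical, and an in-memory buffer stands in for "formatted.csv".

```python
import csv
import io


def items_to_rows(output_headers, output_items):
    """Order each item's values to match output_headers; missing keys become ''."""
    return [[item.get(h, "") for h in output_headers] for item in output_items]


def write_csv(handle, output_headers, output_data):
    """Write the header row, then every data row."""
    writer = csv.writer(handle)
    writer.writerow(output_headers)
    writer.writerows(output_data)


# Hypothetical sample data
output_headers = ["ean13", "price"]
output_items = [{"ean13": "001-23456-7890-5", "price": "1234.50"}]

output_data = items_to_rows(output_headers, output_items)
buffer = io.StringIO()
write_csv(buffer, output_headers, output_data)
```

For a real file, opening it with `open("formatted.csv", "w", newline="")` lets the csv module control line endings itself.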
