This page has college readiness data that should be imported, but it uses a different structure than the rest of IDOE's spreadsheets and doesn't identify schools/corporations by their IDOE codes.
Identifying schools
Since IDOE codes aren't used, each school's or corporation's name will need to be matched to an existing record. If no exact match is found, an array of all names could be loaded and the top three candidates displayed, drawn from the names that start with the same letter and ranked by lowest levenshtein() value. The user could then choose the correct name.
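The matching idea could be sketched roughly as follows. This is a minimal Python illustration, not the project's code: suggest_matches() and its parameters are hypothetical names, the edit distance is written out by hand as a stand-in for levenshtein(), and comparing names case-insensitively is an added assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def suggest_matches(name, known_names, limit=3):
    """Return [name] on an exact match, else up to `limit` close candidates.

    Hypothetical helper: candidates must start with the same letter as
    `name` and are ordered by ascending edit distance.
    """
    if name in known_names:
        return [name]
    target = name.strip().lower()  # assumption: ignore case and outer spaces
    if not target:
        return []
    candidates = [
        known for known in known_names
        if known.strip().lower().startswith(target[0])
    ]
    candidates.sort(key=lambda known: levenshtein(target, known.strip().lower()))
    return candidates[:limit]
```

An empty result would mean no known name shares the first letter, in which case the user would presumably have to pick from the full list.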
Reconciling with existing import script
It may take less work to create a script that converts a college readiness spreadsheet into a new spreadsheet formatted like all of the other IDOE sheets, i.e. one in which
- Each school appears once per worksheet
- The first two columns are IDOE code and name
- The first row has column headers
- Every cell to the right of codes and names and below the headers contains statistical data (or nulls)
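The target layout described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the "IDOE code" and "Name" header labels, the shape of the input records, and the use of CSV as a stand-in for a real worksheet. How the source college readiness spreadsheet would be parsed into such records is not covered.

```python
import csv
import io


def build_idoe_style_rows(records, codes_by_name, stat_columns):
    """Build a header row plus one row per school in the common layout.

    records:        iterable of dicts with a "name" key and one key per statistic
    codes_by_name:  dict mapping an already-matched name to its IDOE code
    stat_columns:   ordered list of statistic column headers
    (All three parameters are hypothetical.)
    """
    rows = [["IDOE code", "Name", *stat_columns]]
    seen = set()
    for record in records:
        name = record["name"]
        if name in seen:
            # Each school may appear only once per worksheet.
            raise ValueError(f"{name!r} appears more than once")
        seen.add(name)
        # Missing statistics become None, i.e. empty (null) cells.
        stats = [record.get(column) for column in stat_columns]
        rows.append([codes_by_name[name], name, *stats])
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text; None values are written as empty cells."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()
```

Raising on a duplicate name is one possible way to enforce the once-per-worksheet rule; merging duplicate rows would be another.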
Rationale
My presumption at this point is that...
- It would take a very cumbersome overhaul of the import-stats command to be able to feed these college readiness spreadsheets into it. The command would have to recognize two different spreadsheet formats, and working out how the code knows which format a file uses, and how each method in ImportStatsCommand and ImportFile would need to be adjusted for each format, sounds like a headache; specifically, one that would balloon the complexity of these classes and hurt their maintainability.
- It would also take a tremendous amount of work to manually reformat these spreadsheets to match the expected format.
- A command that reformats these spreadsheets would be contained and likely not very large.
- The precedent of putting spreadsheets into a common format and running them through the same import-stats command seems like a better long-term plan than adding code to import-stats that accounts for every format variation that's been encountered.