QC fixes for residential data transformation scripts by simularis · Pull Request #163 · sound-data/DEER-Prototypes-EnergyPlus

simularis · 2026-04-01T06:52:31Z

Pull Request (PR) Description

Fix some issues identified in data transformation and post-processing of SWHC049-08. See discussion in #118.

Make hourly data extraction more robust by selecting the column by name instead of by position
Fix issue where the SFm New vintage label was replaced with 1975/1985
Refine the README instructions on post-processing steps for residential measures

PR Author

Make sure the PR branch is up to date with main branch at the time of the PR submission
Craft a succinct title that effectively encapsulates the essence of the pull request, providing a general overview of the proposed changes.
Label the PR with at least one of the following: New Measure, Bug, or Feature. Type: bug fix.
Provide a concise description of the measure, bug, or feature. Submit one PR per measure.
N/A For a new measure, attach a workbook named DEER_EnergyPlus_Modelkit_Measure_list_working.xlsx, containing only rows used for post-processing the measure.
Add comments in the code when necessary to facilitate the review process.
Add a comment before the added code, including the author's full name, company, and specifying if it's a bug fix, new measure, or feature.
For a new feature or bug, demonstrate the impact on energy consumption for selected cases with justification using plots and descriptions.
N/A For a new measure, add a summary table showing total energy consumption per simulated case.

PR Reviewer

Conduct a thorough code review.
If the branch is behind the main, merge the branch locally to check for potential conflicts.
If a bug, locally reproduce it and compare energy consumptions before and after.
Explore creative ways to stress-test the code.
Locally check the error file and other outputs.

simularis · 2026-04-01T06:57:57Z

scripts/data transformation/DMo.py

+    #remove trailing spaces on col headers
+    df.columns = df.columns.str.rstrip()
+
    #extract the last column (the total elec hrly profile)
    #if for enduse hourly, then extract the relevant end use column
-    extracted_df = pd.DataFrame(df.iloc[:,-1])
+    extracted_df = pd.DataFrame(df['Electricity:Facility [J](Hourly)'])


Here, we make hourly data extraction more robust by selecting the column by name instead of by position (1/3)
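A minimal sketch of why the name-based lookup is safer than the positional one. The CSV content below is invented for illustration (the extra `Gas:Facility` column and all values are assumptions); only the `Electricity:Facility [J](Hourly)` header is taken from the diff.

```python
import io
import pandas as pd

# Hypothetical hourly output: headers carry trailing spaces, and the
# electricity column is NOT the last column.
csv_text = (
    "Date/Time,Electricity:Facility [J](Hourly) ,Gas:Facility [J](Hourly) \n"
    "01/01 01:00:00,100.0,5.0\n"
    "01/01 02:00:00,110.0,6.0\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Positional selection silently returns whatever column happens to be last.
by_position = pd.DataFrame(df.iloc[:, -1])

# Strip trailing whitespace so the exact header name can be matched,
# then select the column by name.
df.columns = df.columns.str.rstrip()
by_name = pd.DataFrame(df["Electricity:Facility [J](Hourly)"])
```

In this made-up file, `by_position` holds the gas column while `by_name` holds the electricity column. Without the `rstrip()` step, the name lookup would raise a `KeyError` because of the trailing space in the header.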

simularis · 2026-04-01T06:58:10Z

scripts/data transformation/MFm.py

+    #remove trailing spaces on col headers
+    df.columns = df.columns.str.rstrip()
+
    #extract the last column (the total elec hrly profile)
    #if for enduse hourly, then extract the relevant end use column
-    extracted_df = pd.DataFrame(df.iloc[:,-1])
+    extracted_df = pd.DataFrame(df['Electricity:Facility [J](Hourly)'])


Here, we make hourly data extraction more robust by selecting the column by name instead of by position (2/3)

simularis · 2026-04-01T06:58:23Z

scripts/data transformation/SFm.py

+        #remove trailing spaces on col headers
+        df.columns = df.columns.str.rstrip()
+
        #extract the last column (the total elec hrly profile)
        #if for enduse hourly, then extract the relevant end use column
-        extracted_df = pd.DataFrame(df.iloc[:,-1])
+        extracted_df = pd.DataFrame(df['Electricity:Facility [J](Hourly)'])


Here, we make hourly data extraction more robust by selecting the column by name instead of by position (3/3)

simularis · 2026-04-01T06:59:33Z

scripts/data transformation/SFm.py

 #input the two subdirectory of SFm, one being 1975, the other 1985. If New vintage, input path at path_new and leave other blank.
-path_1975 = 'residential measures/SWHC049-03 SEER Rated AC HP_SFm_1975'
-path_1985 = 'residential measures/SWHC049-03 SEER Rated AC HP_SFm_1985'
-path_new = ''
-
-paths = [path_1975, path_1985]
-
-if path_new != '' :
+path_1975 = 'residential measures/SWHC049-08 SEER Rated AC HP/SWHC049-08 SEER Rated AC HP_SFm_1975'
+path_1985 = 'residential measures/SWHC049-08 SEER Rated AC HP/SWHC049-08 SEER Rated AC HP_SFm_1985'
+path_new = 'residential measures/SWHC049-08 SEER Rated AC HP/SWHC049-08 SEER Rated AC HP_SFm_New'
+
+# Select whether to process New or Existing vintage models.
+# The script is not compatible with processing both New and Existing in a single batch.
+MODE_NEW_VINTAGE = False
+if MODE_NEW_VINTAGE:
    paths = [path_new]
+else:
+    paths = [path_1975, path_1985]
+


Here we introduce a new flag, MODE_NEW_VINTAGE, that selects between the New and Existing vintage logic later on in SFm.py. The two vintages must be processed in separate batches.
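One possible way to make the mode selection fail early when a required path is blank is sketched below. The helper `select_paths` and the placeholder path strings are hypothetical and are not part of SFm.py; the branching mirrors the `MODE_NEW_VINTAGE` logic in the diff.

```python
def select_paths(mode_new_vintage, path_1975, path_1985, path_new):
    """Return the simulation directories for one batch (hypothetical helper)."""
    # New vintage uses a single directory; Existing uses the 1975/1985 pair.
    paths = [path_new] if mode_new_vintage else [path_1975, path_1985]
    # Refuse to continue if any directory needed for this mode is blank.
    if any(not p for p in paths):
        raise ValueError("A required path is blank for the selected mode.")
    return paths


# Placeholder directory names, for illustration only.
existing = select_paths(False, "measure_SFm_1975", "measure_SFm_1985", "")
new = select_paths(True, "", "", "measure_SFm_New")
```

With this shape, an Existing batch can leave `path_new` blank and a New batch can leave the 1975/1985 paths blank, but a blank path in the active mode raises immediately rather than failing later during file loading.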

simularis · 2026-04-01T07:01:21Z

scripts/data transformation/SFm.py

 #%%
-##BldgVint label correction for NumStor weights
-sim_annual_f['BldgVint'] = sim_annual_f['BldgLoc'].map(cz_vint_dict)
-sim_hourly_final['BldgVint'] = sim_hourly_final['BldgLoc'].map(cz_vint_dict)
+if MODE_NEW_VINTAGE:
+    pass
+else:
+    ##BldgVint label correction for NumStor weights
+    # Intended to be used only for Existing vintage models.
+    # This overwrites the BldgVint attribute from the model, regardless of New or Existing.    
+    sim_annual_f['BldgVint'] = sim_annual_f['BldgLoc'].map(cz_vint_dict)
+    sim_hourly_final['BldgVint'] = sim_hourly_final['BldgLoc'].map(cz_vint_dict)


Here we prevent the vintage label replacement logic, which is intended only for Existing vintage, from being triggered when processing a batch of New vintage models.
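The effect of the guard can be illustrated with a small sketch. The lookup values, the `BldgLoc` codes, and the helper `relabel_vintage` are assumptions made for this example; the real `cz_vint_dict` is defined in SFm.py and may differ.

```python
import pandas as pd

# Hypothetical climate-zone-to-vintage lookup (stand-in for cz_vint_dict).
cz_vint_dict = {"CZ01": "1975", "CZ02": "1985"}


def relabel_vintage(df, mode_new_vintage):
    """Overwrite BldgVint from BldgLoc only for Existing vintage batches."""
    out = df.copy()
    if not mode_new_vintage:
        out["BldgVint"] = out["BldgLoc"].map(cz_vint_dict)
    return out


# A made-up batch of New vintage results.
new_batch = pd.DataFrame({"BldgLoc": ["CZ01", "CZ02"], "BldgVint": ["New", "New"]})

unguarded = relabel_vintage(new_batch, mode_new_vintage=False)  # old behavior
guarded = relabel_vintage(new_batch, mode_new_vintage=True)     # with the fix
```

In the unguarded call, the `New` labels are overwritten by the mapped vintages, which is the issue this change addresses; in the guarded call they are left untouched.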

simularis · 2026-04-01T07:02:23Z

scripts/Readme.md

 ### Post-processing steps for residential measures:
+If preparing a new measure or update to an existing measure, then create a measure list workbook to define permutations to be calculated in post-processing.
+1. Make a copy of the measure list workbook template, `DEER_EnergyPlus_Modelkit_Measure_list_working.xlsx`.
+2. Permute rows for all combinations of building type, vintage, HVAC type, and pairings of 'PreTechID', 'StdTechID', 'MeasTechID'. For each unique pairing of 'PreTechID', 'StdTechID', 'MeasTechID', enter one unique MeasureID name; the outputs will be labeled by MeasureID rather than by TechID. All rows for a given MeasureID should share the same Normunit option.
+3. Save your measure list workbook as part of the measure setup documentation.


Here we clarify some of the requirements for entering measure definitions into the measure list workbook.
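The permutation requirement can be sketched as follows. All values, and the column names `BldgType`, `BldgVint`, and `BldgHVAC`, are placeholders chosen for illustration; the actual headers and options should be taken from the workbook template.

```python
from itertools import product

# Placeholder options for the dimensions to permute.
bldg_types = ["SFm"]
vintages = ["1975", "1985"]
hvac_types = ["HVAC_1", "HVAC_2"]

# One unique MeasureID per (PreTechID, StdTechID, MeasTechID) pairing.
tech_pairings = {
    "Measure_A": ("PreTech_1", "StdTech_1", "MeasTech_1"),
    "Measure_B": ("PreTech_1", "StdTech_1", "MeasTech_2"),
}
# Every row of a given MeasureID shares the same Normunit option.
normunit = {"Measure_A": "Unit_X", "Measure_B": "Unit_X"}

rows = [
    {
        "MeasureID": measure_id,
        "BldgType": bldg,
        "BldgVint": vint,
        "BldgHVAC": hvac,
        "PreTechID": pre,
        "StdTechID": std,
        "MeasTechID": meas,
        "Normunit": normunit[measure_id],
    }
    for (measure_id, (pre, std, meas)), bldg, vint, hvac in product(
        tech_pairings.items(), bldg_types, vintages, hvac_types
    )
]

# Sanity check: each MeasureID maps to exactly one Normunit.
pairs = {(r["MeasureID"], r["Normunit"]) for r in rows}
assert len(pairs) == len(tech_pairings)
```

Here two tech pairings, one building type, two vintages, and two HVAC types yield eight rows, which could then be pasted into a copy of the workbook.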

simularis · 2026-04-01T07:03:22Z

scripts/Readme.md

+Apply the following data transformation steps for each subfolder under your measure (e.g. DMo, MFm-Ex, MFm-New, SFm-Ex, SFm-New):
 1. Open one of the provided .py scripts in the **data transformation** directory (either DMo.py, MFm.py, or SFm.py). The corresponding building type script should be used.
 2. Open up the accompanying excel spreadsheet ***DEER_EnergyPlus_Modelkit_Measure_list_working.xlsx***, identify the corresponding measure name in column A of the sheet `Measure_list`.
 3. In line 23 of the python script (line 26 for the Com script), specify the measure name identified in step 2. For example: `measure_name = 'SWSV001-05 Duct Seal_DMo'`
 4. In line 33 of the python script (line 34 and 35 in the Single Family script, line 40 in Com script but that one should be automatic, double check to make sure), specify the path to the simulation directory starting with the folder Analysis. For example: `path = 'residential measures/SWSV001-05 Duct Seal_DMo'` (For Single Family, if existing vintage, assign both the 1975 and 1985 directory to path_1975 and path_1985 respectively, and leave path_new blank; if new vintage, assign the New directory to path_new ) 
 5. run the python script. The script should produce 3 files, ***current_msr_mat.csv***, ***sim_annual.csv***, ***sim_hourly_wb.csv*** (***sfm_annual.csv*** and ***sfm_hourly_csv*** instead for Single Family). These files should appear in the same directory as the python script. For better organization, save these files somewhere else trackable. Note that these files are part of **gitignore**, but the user can produce them in their local repo and move them to a desirable location after the process is finished.


Here we clarify that these steps should be repeated for each cohort subfolder.
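As a rough sketch, the repetition over cohort subfolders could be planned like this. The mapping and the helper `script_for_cohort` are hypothetical; the subfolder names are the examples from the README text, and each script still needs its `measure_name` and path variables edited by hand before it is run.

```python
# Building-type prefix of each cohort subfolder -> transformation script.
COHORT_SCRIPTS = {"DMo": "DMo.py", "MFm": "MFm.py", "SFm": "SFm.py"}


def script_for_cohort(cohort):
    """Pick the script from the building-type prefix (hypothetical helper)."""
    prefix = cohort.split("-")[0]
    return COHORT_SCRIPTS[prefix]


cohorts = ["DMo", "MFm-Ex", "MFm-New", "SFm-Ex", "SFm-New"]
plan = [(cohort, script_for_cohort(cohort)) for cohort in cohorts]

for cohort, script in plan:
    print(f"{cohort}: run {script} and save its outputs before the next cohort")
```

Moving the output files aside after each run matters because the scripts write to the same file names in the script directory each time.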

nfette added 4 commits March 31, 2026 15:26

Remove extra slash in file path

08ff125

Make residential hourly data step 2 more robust

e7e4afb

Fix issue where SFm New vintage label replaced with 1975/1985

93cbb6d

Refine post-processing steps for residential measures

d433f00


Merge branch 'main' into dev-res-post-process-qc

60d3047

simularis wants to merge 5 commits into sound-data:main from simularis:dev-res-post-process-qc


2 participants
