Description
STAREPandas DFs restricted to the same timestamp (i.e., contemporaneous via temporal intersection).
From DFs with all timestamps:
imerg_ts = np.sort(imerg_sdf['timestamp'].unique())
mcms_ts = np.sort(mcms_sdf['timestamp'].unique())
# Merged, unique datetimes
merged_ts = np.union1d(imerg_ts, mcms_ts)
##
# Sort by TimeStamp
imerg_sdf_by_ts = imerg_sdf.sort_values(by=["timestamp"])
mcms_sdf_by_ts = mcms_sdf.sort_values(by=["timestamp"])
for aidx, a_time in enumerate(merged_ts):
    ##
    # MCMS subset with just this DTime.
    mcms_sdf_now = mcms_sdf_by_ts[mcms_sdf_by_ts.timestamp == a_time]
    mcms_sdf_now = mcms_sdf_now.reset_index(drop=True)
    ##
    # IMERG subset with just this DTime.
    imerg_sdf_now = imerg_sdf_by_ts[imerg_sdf_by_ts.timestamp == a_time]
    imerg_sdf_now = imerg_sdf_now.reset_index(drop=True)
This gives something like this for each a_time:
imerg_sdf_now
label timestamp itivs x y cell_areas tot_area precips tot_precip sids cover trixels
0 87 2021-01-10 2275465702582262897 ... ...
1 91 2021-01-10 2275465702582262897 ... ...
mcms_sdf_now
usi uci timestamp tivs30 lon lat cslp ctype cinten tinten depth sarea sa_fill vert_poly_geo verts sids cover trixels
0 20210109150539835085 20210110000540035625 2021-01-10 2275465702582262897 ... ...
1 20210109030280028312 20210110000255028312 2021-01-10 2275465702582262897 ... ...
2 20210109150515029437 20210110000500029687 2021-01-10 2275465702582262897 ... ...
3 20210109150530500800 20210110000525001062 2021-01-10 2275465702582262897 ... ...
4 20210108030470001937 20210110000500002687 2021-01-10 2275465702582262897 ... ...
5 20210107180425030375 20210110000370031000 2021-01-10 2275465702582262897 ... ...
6 20210106150557234913 20210110000495000125 2021-01-10 2275465702582262897 ... ...
7 20210109210595025312 20210110000605025437 2021-01-10 2275465702582262897 ... ...
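The description mentions temporal intersection, whereas np.union1d keeps timestamps found in either DF, so some iterations may yield an empty subset. A minimal sketch, assuming plain pandas frames with made-up placeholder values (not real IMERG/MCMS data), of restricting both DFs to their shared timestamps with np.intersect1d and splitting them with groupby instead of repeated boolean masks:

```python
import numpy as np
import pandas as pd

# Placeholder frames standing in for imerg_sdf and mcms_sdf (hypothetical values).
imerg_sdf = pd.DataFrame({
    "label": [87, 91, 12],
    "timestamp": pd.to_datetime(["2021-01-10", "2021-01-10", "2021-01-11"]),
})
mcms_sdf = pd.DataFrame({
    "uci": ["uci-a", "uci-b", "uci-c"],
    "timestamp": pd.to_datetime(["2021-01-10", "2021-01-10", "2021-01-12"]),
})

# Sorted, unique timestamps present in BOTH frames (temporal intersection).
common_ts = np.intersect1d(imerg_sdf["timestamp"].to_numpy(),
                           mcms_sdf["timestamp"].to_numpy())

# Drop rows whose timestamp has no counterpart in the other frame.
imerg_common = imerg_sdf[imerg_sdf["timestamp"].isin(common_ts)]
mcms_common = mcms_sdf[mcms_sdf["timestamp"].isin(common_ts)]

# One sub-frame per timestamp, each with a fresh 0..n-1 index.
mcms_by_ts = {ts: grp.reset_index(drop=True)
              for ts, grp in mcms_common.groupby("timestamp")}

pairs = {}
for ts, imerg_grp in imerg_common.groupby("timestamp"):
    imerg_sdf_now = imerg_grp.reset_index(drop=True)
    mcms_sdf_now = mcms_by_ts[ts]
    pairs[ts] = (mcms_sdf_now, imerg_sdf_now)
```

Here pairs maps each shared timestamp to its (mcms_sdf_now, imerg_sdf_now) pair; whether this behaves identically on STAREPandas DFs is an assumption that would need checking.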
The problem
Property differences:
- ETC centers are spatially contiguous but possibly nested (center-B may be wholly enclosed within center-A).
- IMERG features are not always spatially contiguous (i.e., they may be disjoint), but are never nested or overlapping.
The spatial relationship between contemporaneous ETC centers and IMERG features is thus Many-to-Many:
- A relationship between sets (dataframes) with two properties:
  - Members of one set (dataframe row) can potentially link to any member (row) of the other set.
    a. Each ETC center needs to be checked against each IMERG feature.
  - A member of one set (row) can potentially link to no, one or multiple members (rows) of the other set.
    a. An ETC center may intersect with no, one or many IMERG features, and likewise, an IMERG feature may intersect with no, one or many ETC centers.
Example solution using placeholder data.
import pandas

mcms_data = {'uci': ["uci-a", "uci-b"], 'vert_poly_geo': ["poly-a", "poly-b"], 'sids': ["sids-a", "sids-b"], 'cover': ["cover-a", "cover-b"], 'trixels': ["trixels-a", "trixels-b"]}
mcms_sdf_now = pandas.DataFrame.from_dict(mcms_data)
imerg_data = {"label": [87, 91], "sids": ["sids-87", "sids-91"], "cover": ["cover-87", "cover-91"], "trixels": ["trixels-87", "trixels-91"]}
imerg_sdf_now = pandas.DataFrame.from_dict(imerg_data)
# Merge so info about each IMERG feature is available for each ETC center (uci)
combined = mcms_sdf_now.merge(imerg_sdf_now, how='cross', suffixes=('_mcms', '_imerg'))
mcms_sdf_now
     uci vert_poly_geo    sids    cover    trixels
0  uci-a        poly-a  sids-a  cover-a  trixels-a
1  uci-b        poly-b  sids-b  cover-b  trixels-b
imerg_sdf_now
   label     sids     cover     trixels
0     87  sids-87  cover-87  trixels-87
1     91  sids-91  cover-91  trixels-91
combined
     uci vert_poly_geo sids_mcms cover_mcms trixels_mcms  label sids_imerg cover_imerg trixels_imerg
0  uci-a        poly-a    sids-a    cover-a    trixels-a     87    sids-87    cover-87    trixels-87
1  uci-a        poly-a    sids-a    cover-a    trixels-a     91    sids-91    cover-91    trixels-91
2  uci-b        poly-b    sids-b    cover-b    trixels-b     87    sids-87    cover-87    trixels-87
3  uci-b        poly-b    sids-b    cover-b    trixels-b     91    sids-91    cover-91    trixels-91
Now I can check the combined DF for spatial intersection between the columns "sids_mcms" and "sids_imerg" for each row, storing the intersecting SIDs (if any) in a new column "sids_st". I guess I could also make a "cover_st" and "trixels_st" column based on "sids_st" as well.
Then I can plot the ETC trixels, the full IMERG trixels or the space-time intersecting (st) IMERG trixels as required.
I know how to brute force this using loops and starepandas [stare_intersection(), to_trixels() and stare_dissolve()], but is there a simple set of DF operations to do this last part?
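For illustration, the row-wise step on the combined DF might be sketched as below. This is only a sketch under stated assumptions: the SID values are made-up integers, and intersect_sids is a hypothetical helper that uses np.intersect1d (exact-match set intersection) as a stand-in. Real STARE SIDs encode a resolution level, so an exact match is not a true spatial intersection; the helper body would need to be swapped for a STARE-aware intersection such as the stare_intersection() mentioned above.

```python
import numpy as np
import pandas as pd

# Placeholder data: integer arrays stand in for real STARE SIDs (hypothetical values).
mcms_sdf_now = pd.DataFrame({
    "uci": ["uci-a", "uci-b"],
    "sids": [np.array([1, 2, 3]), np.array([7, 8])],
})
imerg_sdf_now = pd.DataFrame({
    "label": [87, 91],
    "sids": [np.array([2, 3, 4]), np.array([9])],
})

# Every ETC center paired with every IMERG feature.
combined = mcms_sdf_now.merge(imerg_sdf_now, how="cross",
                              suffixes=("_mcms", "_imerg"))

def intersect_sids(sids_a, sids_b):
    # ASSUMPTION: exact-match stand-in only. Replace with a STARE-aware
    # intersection that handles SIDs at differing resolution levels.
    return np.intersect1d(sids_a, sids_b)

# One intersection per row, stored as an object column of arrays.
combined["sids_st"] = [
    intersect_sids(a, b)
    for a, b in zip(combined["sids_mcms"], combined["sids_imerg"])
]

# Keep only the (ETC center, IMERG feature) pairs that actually intersect.
hits = combined[combined["sids_st"].map(len) > 0].reset_index(drop=True)
```

The loop does not disappear, it just moves into a list comprehension over two columns; the benefit is that each result stays attached to its (uci, label) pair, so "cover_st" and "trixels_st" columns could be derived from "sids_st" in the same way.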