Description
When downloading data, pysep writes a station_file.txt listing the inventory retrieved according to the station metadata gathering and station removal parameters, as well as a weights.dat file for use with the mtuq package. In many cases weights.dat lists fewer stations than station_file.txt. The reason is logged in pysep.log for some of the missing stations but not for others. The unaccounted-for stations are probably those for which no data was retrieved at all, even though they were in the retrieved inventory. It would be helpful to log these stations as well, at least for completeness.
In fact, I think it would be extremely useful to have a separate stations file listing ALL the stations in the retrieved metadata that did not make the cut for any reason whatsoever, for example:
- data could not be retrieved from the servers
- removed based on user-provided curtailing parameters (e.g. remove_clipped=True)
- did not satisfy pysep checks (e.g. rotation metadata quality)
Note: This file should not contain stations that were removed by geographical subsetting or because their network was not requested. It should include only the stations that did not make the cut after the inventory was finalized, along with the reason for each.
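To make the request concrete, here is a rough sketch of what such a file could contain. It does not use any pysep code: the `RejectedStation` record, the `format_rejections` helper, the station codes, and the reason strings are all hypothetical and only illustrate the "one station, one reason" layout described above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RejectedStation:
    """One station dropped after the inventory was finalized (hypothetical record)."""
    network: str
    station: str
    reason: str


def format_rejections(rejections: Iterable[RejectedStation]) -> List[str]:
    """Return one 'NET.STA  reason' line per rejected station, sorted by station ID."""
    ordered = sorted(rejections, key=lambda r: (r.network, r.station))
    return [f"{r.network}.{r.station}  {r.reason}" for r in ordered]


# Illustrative entries only; the station codes and reason strings are made up.
rejections = [
    RejectedStation("XX", "STA2", "removed: remove_clipped=True"),
    RejectedStation("XX", "STA1", "no data returned by server"),
    RejectedStation("YY", "STA3", "failed check: rotation metadata quality"),
]

lines = format_rejections(rejections)
print("\n".join(lines))
```

Stations dropped by geographical subsetting or network selection would simply never be added to the list, so they would not appear in the output.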
An additional point: the utility of the parameter remove_insufficient_length isn't clear, and it seems redundant with the parameters fill_data_gaps and gap_fraction. With this parameter turned on, many stations do not make the cut because their time length differs from the modal trace length of the set of streams by less than one sample.
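The effect can be illustrated with a small standalone sketch. It is not pysep's implementation: it assumes the check compares each trace's sample count against the modal sample count, and `split_by_length` and `tolerance_samples` are hypothetical names introduced here, not existing pysep parameters. Here a trace that is one sample short stands in for the near-modal traces described above.

```python
from collections import Counter
from typing import Dict, List, Tuple


def split_by_length(
    npts_by_station: Dict[str, int], tolerance_samples: int = 0
) -> Tuple[List[str], List[str]]:
    """Split stations into (kept, removed) by distance from the modal sample count."""
    # The modal length is the sample count shared by the most traces.
    modal_npts = Counter(npts_by_station.values()).most_common(1)[0][0]
    kept, removed = [], []
    for sta, npts in npts_by_station.items():
        if abs(npts - modal_npts) <= tolerance_samples:
            kept.append(sta)
        else:
            removed.append(sta)
    return kept, removed


# Made-up sample counts: STA3 is one sample short, STA4 is genuinely truncated.
npts = {"XX.STA1": 6000, "XX.STA2": 6000, "XX.STA3": 5999, "YY.STA4": 3000}

strict = split_by_length(npts, tolerance_samples=0)
lenient = split_by_length(npts, tolerance_samples=1)
print("strict :", strict)
print("lenient:", lenient)
```

With an exact match, XX.STA3 is discarded alongside the truncated YY.STA4; allowing a one-sample tolerance keeps it while still removing the truncated trace.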