Skip to content

Latest commit

 

History

History
147 lines (117 loc) · 6.13 KB

File metadata and controls

147 lines (117 loc) · 6.13 KB

sdig

Helper Routines for reading ERDDAP DSG metadata.

Installation (for the time being)

pip install git+https://github.com/noaaroland/sdig.git

Details

All ERDDAP DSG data sets have a handy metadata construct located at server.gov/erddap/info/dataset_id/index.csv. If you read that metadata into a Pandas dataframe as in info_df = pd.read_csv(info_url), then you can use the methods here to extract lots of interesting things as detailed below.

NAME

sdig.erddap.info

FUNCTIONS

    get_depths(depth_name, url)
        Returns a list of depths read directly from an ERDDAP data source. Use dsg_info to get the name of the depth
        variable.

            Parameters:
                    :param: depth_name: A string which the ERDDAP/netCDF short variable name of the variable
                    that contains the DSG ID
                    :type: str
                    :param: url: data url of the tabledap data set from which to read the data values
                    :type: str

            Returns:
                    :returns: depths: a sorted list of distinct depths.
                    :rtype: list

    get_dsg_info(dsg_type, info_df)
        Returns the name of the Z variable as a string and the DSG ID of the data set as a dict
        with keys timeseries, profile, trajectory as appropriate.
        A timeseriesprofile data set has both a timeseries and a profile key.


        Parameters:
            :param: dsg_type: the dsg_type of the data set. Use get_dsg_type
            :type: str
            :param: info_df: A dataframe from the pd.read_csv(erddap.server.gov/erddap/info/DS_ID/index.csv)
            :type: pd.DataFrame
        Returns:
            :returns: depth_name: the value of the title global attribute
            :rtype: str
            :returns: dsg_id: the variable names of the id variables as values with [dsg_type]_id as keys
            :rtype: str

    get_dsg_type(info_df)
        Returns the dsg_type of the data set. One of timeseries, profile, trajectory, timeseriesprofile.

        Parameters:
            :param: info_df: A dataframe from the pd.read_csv(erddap.server.gov/erddap/info/DS_ID/index.csv)
            :type: pd.DataFrame
        Returns:
            :returns: dsg_type: the dsg type as a lower case string
            :rtype: str

    get_info_url(data_url)
        Returns the info CSV URL from the data URL of a DSG ERDDAP dataset. Changes:
         https://data.pmel.noaa.gov/pmel/erddap/tabledap/CGBN_Canada.html or
         https://data.pmel.noaa.gov/pmel/erddap/tabledap/CGBN_Canada
         to
         https://data.pmel.noaa.gov/pmel/erddap/info/CGBN_Canada/index.csv

        :param data_url:
        :type: str
        :return: The info CSV url for use with the
        :rtype: str

    get_time_marks(start, end)
        Make some time marks for the time slider, only one mark at the beginning and one mark at the end

        Parameters:
            :param: start: start time as Unix epoch seconds
            :type: float
            :param: end: end time as Unix epoch seconds
            :type: float

        Returns:
            :returns: marks: the marks as suitable for input into the slider DASH widget
            :rtype: dict

    get_times(info_df)
        Returns a tuple of start time and end times both as YYYY-MM-dd string as Unix epoch seconds read from the metadata
        not the actual data values so it should be fast

        Parameters:
                :param: info_df: A dataframe from the pd.read_csv(erddap.server.gov/erddap/info/DS_ID/index.csv)
                :type: pd.DataFrame

        Returns:
                :returns: start_date: start date formatted YYYY-MM-dd
                :rtype: str
                :returns: end_date: end date formtted YYYY-MM-dd
                :rtype: str
                :returns: start_date_second: the start date as seconds from the Unix epoch
                :rtype: float
                :returns: end_date_seconds: the end date as seconds from the Unix epoch
                :rtype: float

    get_title(info_df)
        Returns the value of the title global attribute of the data set.

        Parameters:
            :param: info_df: A dataframe from the pd.read_csv(erddap.server.gov/erddap/info/DS_ID/index.csv)
            :type: pd.DataFrame

        Returns:
            :returns: title: the value of the title global attribute
            :rtype: str

    get_variables(info_df)
        Returns a tuple list and dictionaries describing the data variables in an ERDDAP data set. Check if the key
        is in the dict before use because not all variables have long_name, units and standard_name attributes.

        Parameters:
                :param: info_df: A dataframe from the pd.read_csv(erddap.server.gov/erddap/info/DS_ID/index.csv)
                :type: pd.DataFrame

        Returns:
                :returns: variables: the short variable names as a string
                :rtype: list
                :returns: long_names: a dict of variable long names values with the short names as the keys
                :rtype: dict
                :returns: units: a dict of variable units values with the short names as the keys
                :rtype: dict
                :returns: standard_names: a dict of variable standard_name values with the short names as the keys
                :rtype: dict

    make_platform_constraint(dsg_id_var, in_platforms)
        Returns a ERDDAP formatted 'OR' constraint that will allow selection of rows which contain any of the
        ID's in the input.

            Parameters:
                    :param: dsg_id_var: A string which the ERDDAP/netCDF short variable name of the variable that contains
                    :type: str
                    :param: in_platforms: DASH returns either a str or a list of str which you can pass into this method
                    directly to get the constraint string
                    :type: list or str
            Returns:
                    :returns: return_stuff: a dictionary 'con' has the constraint, and 'platforms' is the list of platforms
                    as a list even if the input was a str variable. Use dsg_info to get the dsg_id_var name.
                    :rtype: dict