cdscan issues with poorly formed netcdf files #430

@durack1

This redirects the issue described in pochedls/xagg#33

cdscan has problems with poorly formed netCDF files. These files contain valid data but are poorly defined: for example, a time-invariant (fixed) field with no time axis on the variable itself, in a file that nonetheless declares a time dimension holding no values. For the example below, cdms2 can read the areacello variable from the file, but cdscan throws an error. For comparison, an ncdump of a validly formed file is included at the bottom of this issue.

$ ncdump -ct ~/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/piControl/r1i1p1f2/Ofx/areacello/gn/v20180814/areacello_Ofx_CNRM-CM6-1_piControl_r1i1p1f2_gn.nc
netcdf areacello_Ofx_CNRM-CM6-1_piControl_r1i1p1f2_gn {
dimensions:
	axis_nbounds = 2 ;
	x = 362 ;
	y = 294 ;
	nvertex = 4 ;
	time = UNLIMITED ; // (0 currently)
variables:
	double lat(y, x) ;
		lat:standard_name = "latitude" ;
		lat:long_name = "Latitude" ;
...
	double lon(y, x) ;
		lon:standard_name = "longitude" ;
		lon:long_name = "Longitude" ;
...
	double bounds_lon(y, x, nvertex) ;
	double bounds_lat(y, x, nvertex) ;
	float areacello(y, x) ;
		areacello:standard_name = "cell_area" ;
		areacello:long_name = "Grid-Cell Area" ;
		areacello:units = "m2" ;
...
		areacello:history = "none" ;

// global attributes:
...
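
The telltale line in the dump above is `time = UNLIMITED ; // (0 currently)`. As a rough way to flag such files before handing them to cdscan, the sketch below scans ncdump header text for unlimited dimensions that hold zero records. It is only an illustration: the function name `empty_unlimited_dims` is hypothetical, it relies on the dimension-line layout shown in the dump, and it is not part of cdscan or cdms2.

```python
import re

# Matches an ncdump dimension line such as:
#   time = UNLIMITED ; // (0 currently)
_UNLIMITED = re.compile(
    r"^\s*(\w+)\s*=\s*UNLIMITED\s*;\s*//\s*\((\d+)\s+currently\)"
)


def empty_unlimited_dims(ncdump_header):
    """Return names of UNLIMITED dimensions that currently hold 0 records.

    Hypothetical helper: it parses ncdump text, not the netCDF file itself.
    """
    empty = []
    for line in ncdump_header.splitlines():
        match = _UNLIMITED.match(line)
        if match and int(match.group(2)) == 0:
            empty.append(match.group(1))
    return empty


# Dimension block copied from the problem file's dump above.
header = """dimensions:
\taxis_nbounds = 2 ;
\tx = 362 ;
\ty = 294 ;
\tnvertex = 4 ;
\ttime = UNLIMITED ; // (0 currently)
"""

print(empty_unlimited_dims(header))
```

For the dimension block above this reports `time` as the only empty unlimited dimension; a file with no unlimited dimension yields an empty list.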

To Reproduce
Steps to reproduce the behavior:

  1. Install CDAT 8.2.1 nompi
  2. Attempt to run cdscan on the file listed above
(cdat821nompi) bash-4.2$ cdscan -x tmp.xml ~/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/piControl/r1i1p1f2/Ofx/areacello/gn/v20180814/areacello_Ofx_CNRM-CM6-1_piControl_r1i1p1f2_gn.nc
Finding common directory ...
Common directory: ~/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/piControl/r1i1p1f2/Ofx/areacello/gn/v20180814/
Scanning files ...
~/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/piControl/r1i1p1f2/Ofx/areacello/gn/v20180814/areacello_Ofx_CNRM-CM6-1_piControl_r1i1p1f2_gn.nc
Setting reference time units to 
Traceback (most recent call last):
  File "~/anaconda3/envs/cdat821nompi/bin/cdscan", line 1842, in <module>
    main(sys.argv)
  File "~/anaconda3/envs/cdat821nompi/bin/cdscan", line 1284, in main
    timeIsLinear = (referenceTime[0].lower().split() in
IndexError: string index out of range
  3. See cdscan error above
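
The log prints "Setting reference time units to" followed by nothing, which suggests `referenceTime` is an empty string when line 1284 indexes it, and indexing an empty string raises exactly this `IndexError`. The sketch below shows one possible guard. The helper name `leading_unit` is hypothetical, and this is an assumption about how the failure could be avoided, not the actual cdscan fix.

```python
def leading_unit(reference_time):
    """Return the first whitespace-separated token, lower-cased.

    Returns None when the string is empty or only whitespace, instead of
    raising IndexError. Hypothetical helper, not cdscan code.
    """
    tokens = reference_time.split()
    return tokens[0].lower() if tokens else None


# Indexing an empty string reproduces the error message in the traceback.
try:
    ""[0]
except IndexError as err:
    print(err)  # string index out of range

# With the guard, an empty reference time is reported as "no unit".
print(leading_unit(""))                       # None
print(leading_unit("Days since 1850-01-01"))  # days
```

A caller could then treat `None` as "no time axis to aggregate over" and skip the time-linearity check for fixed fields such as areacello.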

And here is an ncdump of a validly formed file (note that no time dimension is defined):

(cdat821nompi) bash-4.2$ ncdump -ct ~/esgf_publish/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/1pctCO2/r1i1p1f1/Ofx/areacello/gn/v20191109/areacello_Ofx_ACCESS-CM2_1pctCO2_r1i1p1f1_gn.nc 
netcdf areacello_Ofx_ACCESS-CM2_1pctCO2_r1i1p1f1_gn {
dimensions:
	j = 300 ;
	i = 360 ;
	bnds = 2 ;
	vertices = 4 ;
variables:
	int j(j) ;
		j:units = "1" ;
		j:long_name = "cell index along second dimension" ;
	int i(i) ;
		i:units = "1" ;
		i:long_name = "cell index along first dimension" ;
	double latitude(j, i) ;
		latitude:standard_name = "latitude" ;
		latitude:long_name = "latitude" ;
...
		latitude:bounds = "vertices_latitude" ;
	double longitude(j, i) ;
		longitude:standard_name = "longitude" ;
		longitude:long_name = "longitude" ;
...
		longitude:bounds = "vertices_longitude" ;
	double vertices_latitude(j, i, vertices) ;
		vertices_latitude:units = "degrees_north" ;
...
	double vertices_longitude(j, i, vertices) ;
		vertices_longitude:units = "degrees_east" ;
...
	float areacello(j, i) ;
		areacello:standard_name = "cell_area" ;
		areacello:long_name = "Grid-Cell Area for Ocean Variables" ;
		areacello:comment = "Horizontal area of ocean grid cells" ;
		areacello:units = "m2" ;
...

// global attributes:
