Collocating MODIS L2 AOD and Unified Model output

RachelH
Collocating MODIS L2 AOD and Unified Model output

Hi,

I've just started working with CIS and it seems like it could be really helpful for me, but so far I can't figure out how to collocate my model data with the MODIS Aqua AOD data.

My model data has a gridded domain from longitude 22 to -77 and latitude -2.5 to 32. The AOD at 550nm is in one netcdf file ('pseudo_2_AOD_model.nc') with one variable called 'oddust_prog' (AOD). This variable has 192 hourly timesteps with 3 dimensions: time (192 values), 'grid_latitude' (500 values), 'grid_longitude' (1300 values).
>>> cis info pseudo_2_AOD_model.nc
Out: oddust_prog

>>> cis info oddust_prog:pseudo_2_AOD_model.nc
Out: atmosphere_optical_thickness_due_to_dust_ambient_aerosol / (1) (time: 192; grid_latitude: 500; grid_longitude: 1300)
Dimension coordinates:
time x - -
grid_latitude - x -
grid_longitude - - x
Auxiliary coordinates:
forecast_period x - -
Scalar coordinates:
forecast_reference_time: 2015-08-17 00:00:00
pseudo_level: 2
Attributes:
Conventions: CF-1.5
STASH: m01s02i422
_NCProperties: version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.16
history: 2017-07-21T10:19:09Z Subsetted using limits: pseudo_level: [2, 2]
source: Data from Met Office Unified Model
um_version: 10.3

I then have MODIS Level 2 aerosol data in separate files for the same days with approximately 30 files per day. These files are of the format:
MYD04_L2.A2015229.1005.006.2015230155543.hdf
MYD04_L2.A2015229.1010.006.2015230163110.hdf etc. with 229 being the day of the year.
These hdfs contain 39 variables such as Latitude, Longitude, AOD_550_Dark_Target_Deep_Blue_Combined, Optical_Depth_Land_And_Ocean etc. The one I am interested in is 'Optical_Depth_Land_And_Ocean'.
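
For illustration, the acquisition date and time encoded in these file names can be recovered with a few lines of Python. This is a minimal sketch that assumes the dot-separated layout seen in the two example names above (product, `A` + year + day of year, `HHMM`, ...); the function name `modis_granule_start` is made up for this example and is not part of CIS.

```python
from datetime import datetime


def modis_granule_start(filename):
    """Return the nominal granule start time encoded in a MYD04_L2 file name."""
    # Assumed layout: product . A<year><day-of-year> . <HHMM> . collection . ...
    parts = filename.split(".")
    date_field, time_field = parts[1], parts[2]
    # Drop the leading "A"; %j parses the three-digit day of the year
    return datetime.strptime(date_field[1:] + time_field, "%Y%j%H%M")


start = modis_granule_start("MYD04_L2.A2015229.1005.006.2015230155543.hdf")
print(start)
```

Day 229 of 2015 is 17 August, which agrees with the time range reported by 'cis info' for this granule.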

>>> cis info Optical_Depth_Land_And_Ocean:MYD04_L2.A2015229.1005.006.2015230155543.hdf
Out: Ungridded data: AOT at 0.55 micron for both ocean (Average) (Quality flag=1,2,3) and land (corrected) (Quality flag=3) / (None)
Shape = (204, 135)

Total number of points = 27540
Number of non-masked points = 8252
Long name = AOT at 0.55 micron for both ocean (Average) (Quality flag=1,2,3) and land (corrected) (Quality flag=3)
Standard name = None
Units = None
Missing value = -9999
Range = (0.0, 0.51500002446118742)
History =
Misc attributes:
Cell_Along_Swath_Sampling = [1, 2031, 10]
Geolocation_Pointer = Internal geolocation arrays
Cell_Across_Swath_Sampling = [1, 1354, 10]
Parameter_Type = Output
Valid_Range = [-100, 5000]
Coordinates:
latitude
Long name = Geodetic Latitude
Standard name = latitude
Units = Degrees_north
Missing value = -999.0
Range = (-9.9063931, 10.862552)
History =
Misc attributes:
Cell_Along_Swath_Sampling = [1, 2031, 10]
Geolocation_Pointer = Geolocation data not applicable
Cell_Across_Swath_Sampling = [1, 1345, 10]
Parameter_Type = MODIS Input
Valid_Range = [-90.0, 90.0]
longitude
Long name = Geodetic Longitude
Standard name = longitude
Units = Degrees_east
Missing value = -999.0
Range = (40.012112, 64.153374)
History =
Misc attributes:
Cell_Along_Swath_Sampling = [1, 2031, 10]
Geolocation_Pointer = Geolocation data not applicable
Cell_Across_Swath_Sampling = [1, 1345, 10]
Parameter_Type = MODIS Input
Valid_Range = [-180.0, 180.0]
time
Long name = TAI Time at Start of Scan replicated across the swath
Standard name = time
Units = days since 1600-01-01 00:00:00
Missing value = -999.0
Range = (2015-08-17 10:05:10, 2015-08-17 10:10:10)
History =
Misc attributes:
Cell_Along_Swath_Sampling = [1, 2031, 10]
Geolocation_Pointer = Internal geolocation arrays
Cell_Across_Swath_Sampling = [1, 1345, 10]
Parameter_Type = MODIS Input
Valid_Range = [0.0, 3155800000.0]

The model resolution is 8 km and the MODIS data resolution is 10 km.

What I would like to achieve is a time-matched comparison for each day between model and MODIS data. So essentially I could plot side-by-side my domain with all MODIS AOD data from that day (as the latitudes are quite low, each location should only have one overpass per day) on one side and model data from the overpass times for each location on the other. So I need to extract from the hourly model timesteps the AOD data from each location that was overpassed by MODIS closest to that hour. This means my model image for each day will be a jigsaw of AOD data from different timesteps to match the overpass times. My MODIS AOD data should be a jigsaw of all the swaths taken on that day.
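
The time-matching idea described above can be sketched in plain Python as picking, for each overpass, the index of the closest hourly model timestep. This is only an illustration of the selection logic, not what CIS does internally. The first model time used here is an assumption (taken to coincide with the forecast reference time); the real value should be read from the model's time coordinate.

```python
from datetime import datetime, timedelta


def nearest_timestep(overpass, first_step, n_steps, step=timedelta(hours=1)):
    """Index of the model timestep closest in time to a satellite overpass."""
    # Fractional number of model steps between the first model time and the overpass
    offset = (overpass - first_step) / step
    index = int(round(offset))
    # Clamp so overpasses outside the model run map to the first or last step
    return max(0, min(n_steps - 1, index))


# Assumed first model time; check the actual time coordinate before relying on this
first = datetime(2015, 8, 17, 0, 0)
idx = nearest_timestep(datetime(2015, 8, 17, 10, 5, 10), first, 192)
print(idx)
```

Applying this per satellite point, rather than per file, is what produces the "jigsaw" of model timesteps.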

When I try to collocate the model data to the satellite data for one file I get the error below.

>>> cis col oddust_prog:pseudo_2_AOD_model.nc MYD04_L2.A2015229.1005.006.2015230155543.hdf -o collocate_one_file.nc
Out: Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively
2017-07-27 16:03:21,963 - ERROR - Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively - check cis.log for details

I've also tried this with a subset file of the original with just the variable I am interested in but get the same error:

>>> cis subset Optical_Depth_Land_And_Ocean:MYD04_L2.A2015229.1005.006.2015230155543.hdf x=[-180,180],y=[0,90],t=[2015-08-17] -o trying.nc
>>> cis col oddust_prog:pseudo_2_AOD_model.nc trying.nc -o collocate_one_file2.nc
Out: Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively
2017-07-27 16:06:21,273 - ERROR - Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively - check cis.log for details
(The cis.log for this error is at the bottom of this essay!)

I can think of a few issues that could be factors, and maybe you can comment on whether any of them is likely:
- The UM model data is on a rotated grid, so the latitude and longitude values differ from those of the satellite data. This is easily changed, but I would have thought the collocation would still give some result, just in an inaccurate location?
- From the 'cis info' output, the model and MODIS data have different time formats, but I thought this was something CIS could deal with.
- The order of the dimensions is different in the two datasets (MODIS is latitude, longitude, time, and the model data is time, latitude, longitude).
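
To illustrate the first point, rotated-pole coordinates can be converted to true longitude and latitude with a standard spherical rotation. The sketch below is a minimal example under stated assumptions: it follows the CF rotated-pole convention with a north_pole_grid_longitude of 0, and the pole position used in the usage lines (177.5, 37.5) is purely hypothetical, since the thread does not give the model's actual pole. The function name is made up for this example and is not a CIS call.

```python
import math


def unrotate_pole(rlon, rlat, pole_lon, pole_lat):
    """Convert rotated-pole coordinates (degrees) to true (longitude, latitude) in degrees."""
    lam, phi = math.radians(rlon), math.radians(rlat)
    s1, c1 = math.sin(math.radians(pole_lat)), math.cos(math.radians(pole_lat))
    s2, c2 = math.sin(math.radians(pole_lon)), math.cos(math.radians(pole_lon))
    # Cartesian components of the point in the rotated frame
    x = math.cos(lam) * math.cos(phi)
    y = math.sin(lam) * math.cos(phi)
    z = math.sin(phi)
    # Undo the tilt given by the pole latitude
    sin_lat = s1 * z + c1 * x
    a = -s1 * x + c1 * z
    # Undo the turn given by the pole longitude
    lon = math.atan2(s2 * a - c2 * y, c2 * a + s2 * y)
    # Clamp against rounding before taking the arcsine
    lat = math.asin(max(-1.0, min(1.0, sin_lat)))
    return math.degrees(lon), math.degrees(lat)


# Hypothetical pole position, for illustration only
lon, lat = unrotate_pole(0.0, 0.0, 177.5, 37.5)
print(lon, lat)
```

With a pole at latitude 90 and longitude 180 the function returns its input unchanged, which is a quick sanity check before applying it to real grid coordinates.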

Do you think that any of these could be the problem(s)? Or is there another issue with my approach?

Thanks for any help,

Rachel.

__________________________
cis.log from
>>> 'cis col oddust_prog:pseudo_2_AOD_model.nc trying.nc -o collocate_one_file2.nc'

2017-07-27 16:05:38,771 - INFO - parse : 290 - Identified input file list: ['MYD04_L2.A2015229.1005.006.2015230155543.hdf']
2017-07-27 16:06:00,134 - INFO - parse : 290 - Identified input file list: ['MYD04_L2.A2015229.1005.006.2015230155543.hdf']
2017-07-27 16:06:01,702 - DEBUG - cis_main : 252 - CIS started at: 2017-07-27 16:06
2017-07-27 16:06:01,702 - DEBUG - cis_main : 253 - Running command: subset
2017-07-27 16:06:01,702 - DEBUG - cis_main : 254 - With the following arguments: Namespace(command='subset', datagroups=[{'variables': ['Optical_Depth_Land_And_Ocean'], 'filenames': ['MYD04_L2.A2015229.1005.006.2015230155543.hdf']}], force_overwrite=False, limits={'y': [0.0, 90.0], 'x': [-180.0, 180.0], 't': <cis.time_util.PartialDateTime object at 0x4773e90>}, output='trying.nc', output_var=None, quiet=False, subsetranges='x=[-180,180],y=[0,90],t=[2015-08-17]', verbose=None)
2017-07-27 16:06:01,703 - DEBUG - plugin : 81 - AProduct subclasses are: [<class 'cis.data_io.products.products.ASCII_Hyperpoints'>, <class 'cis.data_io.products.products.Aeronet'>, <class 'cis.data_io.products.CCI.Aerosol_CCI'>, <class 'cis.data_io.products.caliop.Caliop_L1'>, <class 'cis.data_io.products.caliop.Caliop_L2'>, <class 'cis.data_io.products.cloudsat.CloudSat'>, <class 'cis.data_io.products.CCI.Cloud_CCI'>, <class 'cis.data_io.products.HadGEM.HadGEM_CONVSH'>, <class 'cis.data_io.products.HadGEM.HadGEM_PP'>, <class 'cis.data_io.products.MODIS.MODIS_L2'>, <class 'cis.data_io.products.MODIS.MODIS_L3'>, <class 'cis.data_io.products.NCAR_NetCDF_RAF.NCAR_NetCDF_RAF'>, <class 'cis.data_io.products.gridded_NetCDF.NetCDF_Gridded'>, <class 'cis.data_io.products.products.cis'>]
2017-07-27 16:06:01,706 - DEBUG - AProduct : 169 - Found product class MODIS_L2 matching regex pattern .*MYD04_L2.*\.hdf
2017-07-27 16:06:01,707 - INFO - AProduct : 198 - Retrieving data using product MODIS_L2...
2017-07-27 16:06:01,707 - DEBUG - MODIS : 295 - Creating data object for variable Optical_Depth_Land_And_Ocean
2017-07-27 16:06:01,707 - INFO - MODIS : 258 - Listing coordinates: ['Latitude', 'Longitude', 'Scan_Start_Time']
2017-07-27 16:06:01,707 - DEBUG - hdf : 96 - reading file: MYD04_L2.A2015229.1005.006.2015230155543.hdf
2017-07-27 16:06:01,799 - DEBUG - MODIS : 29 - Masking all values -90.0 > v > 90.0.
2017-07-27 16:06:01,799 - DEBUG - MODIS : 58 - Applying 'science_data = (packed_data - 0.0) * 1.0' transformation to data.
2017-07-27 16:06:01,846 - DEBUG - MODIS : 29 - Masking all values -180.0 > v > 180.0.
2017-07-27 16:06:01,847 - DEBUG - MODIS : 58 - Applying 'science_data = (packed_data - 0.0) * 1.0' transformation to data.
2017-07-27 16:06:01,941 - DEBUG - MODIS : 29 - Masking all values 0.0 > v > 3155800000.0.
2017-07-27 16:06:01,943 - DEBUG - MODIS : 58 - Applying 'science_data = (packed_data - 0.0) * 1.0' transformation to data.
2017-07-27 16:06:01,952 - DEBUG - hdf : 96 - reading file: MYD04_L2.A2015229.1005.006.2015230155543.hdf
2017-07-27 16:06:02,009 - INFO - ungridded_data : 145 - Unable to parse cf-units: None. Some operations may not be available.
2017-07-27 16:06:02,040 - DEBUG - MODIS : 29 - Masking all values -100 > v > 5000.
2017-07-27 16:06:02,042 - DEBUG - MODIS : 58 - Applying 'science_data = (packed_data - 0.0) * 0.0010000000475' transformation to data.
2017-07-27 16:06:02,174 - DEBUG - subset : 100 - Created SubsetConstraint of type UngriddedSubsetConstraint
2017-07-27 16:06:02,291 - INFO - ungridded_data : 1053 - Saving data to trying.nc
2017-07-27 16:06:02,311 - INFO - write_netcdf : 106 - Creating variable: latitude(obs) f4
2017-07-27 16:06:02,343 - INFO - write_netcdf : 106 - Creating variable: longitude(obs) f4
2017-07-27 16:06:02,345 - INFO - write_netcdf : 106 - Creating variable: time(obs) f8
2017-07-27 16:06:02,356 - INFO - write_netcdf : 106 - Creating variable: Optical_Depth_Land_And_Ocean(obs) f8
2017-07-27 16:06:21,124 - INFO - parse : 290 - Identified input file list: ['trying.nc']
2017-07-27 16:06:21,126 - INFO - parse : 290 - Identified input file list: ['pseudo_2_AOD_model.nc']
2017-07-27 16:06:21,126 - DEBUG - cis_main : 252 - CIS started at: 2017-07-27 16:06
2017-07-27 16:06:21,127 - DEBUG - cis_main : 253 - Running command: collocate
2017-07-27 16:06:21,127 - DEBUG - cis_main : 254 - With the following arguments: Namespace(command='collocate', datagroups=[{'variables': ['oddust_prog'], 'filenames': ['pseudo_2_AOD_model.nc']}], force_overwrite=False, output='collocate_one_file2.nc', output_var=None, quiet=False, samplefiles=['trying.nc'], samplegroup={'filenames': ['trying.nc']}, sampleproduct=None, samplevariable=None, verbose=None)
2017-07-27 16:06:21,128 - DEBUG - plugin : 81 - AProduct subclasses are: [<class 'cis.data_io.products.products.ASCII_Hyperpoints'>, <class 'cis.data_io.products.products.Aeronet'>, <class 'cis.data_io.products.CCI.Aerosol_CCI'>, <class 'cis.data_io.products.caliop.Caliop_L1'>, <class 'cis.data_io.products.caliop.Caliop_L2'>, <class 'cis.data_io.products.cloudsat.CloudSat'>, <class 'cis.data_io.products.CCI.Cloud_CCI'>, <class 'cis.data_io.products.HadGEM.HadGEM_CONVSH'>, <class 'cis.data_io.products.HadGEM.HadGEM_PP'>, <class 'cis.data_io.products.MODIS.MODIS_L2'>, <class 'cis.data_io.products.MODIS.MODIS_L3'>, <class 'cis.data_io.products.NCAR_NetCDF_RAF.NCAR_NetCDF_RAF'>, <class 'cis.data_io.products.gridded_NetCDF.NetCDF_Gridded'>, <class 'cis.data_io.products.products.cis'>]
2017-07-27 16:06:21,128 - DEBUG - AProduct : 169 - Found product class cis matching regex pattern .*\.nc
2017-07-27 16:06:21,133 - INFO - AProduct : 220 - Retrieving coordinates using product cis
2017-07-27 16:06:21,134 - INFO - products : 23 - Listing coordinates: [('longitude', 'x'), ('latitude', 'y'), ('altitude', 'z'), ('time', 't'), ('air_pressure', 'p')]
2017-07-27 16:06:21,145 - DEBUG - netcdf : 288 - Masking all values -180.0 > v > 180.0.
2017-07-27 16:06:21,147 - DEBUG - netcdf : 288 - Masking all values -90.0 > v > 90.0.
2017-07-27 16:06:21,148 - DEBUG - netcdf : 288 - Masking all values 0.0 > v > 3155800000.0.
2017-07-27 16:06:21,149 - DEBUG - plugin : 81 - AProduct subclasses are: [<class 'cis.data_io.products.products.ASCII_Hyperpoints'>, <class 'cis.data_io.products.products.Aeronet'>, <class 'cis.data_io.products.CCI.Aerosol_CCI'>, <class 'cis.data_io.products.caliop.Caliop_L1'>, <class 'cis.data_io.products.caliop.Caliop_L2'>, <class 'cis.data_io.products.cloudsat.CloudSat'>, <class 'cis.data_io.products.CCI.Cloud_CCI'>, <class 'cis.data_io.products.HadGEM.HadGEM_CONVSH'>, <class 'cis.data_io.products.HadGEM.HadGEM_PP'>, <class 'cis.data_io.products.MODIS.MODIS_L2'>, <class 'cis.data_io.products.MODIS.MODIS_L3'>, <class 'cis.data_io.products.NCAR_NetCDF_RAF.NCAR_NetCDF_RAF'>, <class 'cis.data_io.products.gridded_NetCDF.NetCDF_Gridded'>, <class 'cis.data_io.products.products.cis'>]
2017-07-27 16:06:21,149 - DEBUG - AProduct : 169 - Found product class cis matching regex pattern .*\.nc
2017-07-27 16:06:21,151 - INFO - AProduct : 174 - Product class cis is not right because ['Source (Data from Met Office Unified Model) does not match CIS in pseudo_2_AOD_model.nc']
2017-07-27 16:06:21,152 - DEBUG - AProduct : 169 - Found product class NCAR_NetCDF_RAF matching regex pattern .*\.nc$
2017-07-27 16:06:21,153 - INFO - AProduct : 174 - Product class NCAR_NetCDF_RAF is not right because ["NCAR-RAF convention unknown, expecting 'NCAR-RAF/nimbus' was 'CF-1.5'"]
2017-07-27 16:06:21,171 - DEBUG - AProduct : 169 - Found product class NetCDF_Gridded matching regex pattern .*\.nc
2017-07-27 16:06:21,171 - INFO - AProduct : 198 - Retrieving data using product NetCDF_Gridded...
2017-07-27 16:06:21,250 - INFO - col : 25 - Collocator: <cis.collocation.col_implementations.GriddedUngriddedCollocator object at 0x2c195d0>
2017-07-27 16:06:21,251 - INFO - col : 26 - Kernel: lin
2017-07-27 16:06:21,251 - INFO - col : 28 - Collocating, this could take a while...
2017-07-27 16:06:21,267 - DEBUG - utils : 674 - App Memory MB (GriddedUngriddedCollocator Initial): 66.8828125
2017-07-27 16:06:21,267 - DEBUG - utils : 674 - App Memory MB (GriddedUngriddedCollocator Initial): 66.8828125
2017-07-27 16:06:21,269 - DEBUG - utils : 674 - App Memory MB (GriddedUngriddedCollocator after data retrieval): 67.4375
2017-07-27 16:06:21,269 - INFO - col_implementations : 180 - --> Collocating...
2017-07-27 16:06:21,269 - INFO - col_implementations : 181 - 14497 sample points
2017-07-27 16:06:21,271 - DEBUG - cis_main : 22 - Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively

2017-07-27 16:06:21,273 - DEBUG - cis_main : 23 - Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cis/cis_main.py", line 280, in main
parse_and_run_arguments()
File "/usr/lib/python2.7/site-packages/cis/cis_main.py", line 258, in parse_and_run_arguments
cmd(arguments)
File "/usr/lib/python2.7/site-packages/cis/cis_main.py", line 98, in col_cmd
missing_data_for_missing_sample=missing_data_for_missing_sample, **col_options)
File "/usr/lib/python2.7/site-packages/cis/data_io/common_data.py", line 415, in collocated_onto
var_units=var_units, **kwargs)
File "/usr/lib/python2.7/site-packages/cis/data_io/ungridded_data.py", line 1023, in sampled_from
var_units=var_units, **kwargs)
File "/usr/lib/python2.7/site-packages/cis/data_io/ungridded_data.py", line 1250, in _ungridded_sampled_from
return collocate(data, sample, col, con, kernel)
File "/usr/lib/python2.7/site-packages/cis/collocation/col.py", line 31, in collocate
new_data = collocator.collocate(sample, data, constraint, kernel)
File "/usr/lib/python2.7/site-packages/cis/collocation/col_implementations.py", line 166, in collocate
output.extend(self.collocate(points, var, constraint, kernel))
File "/usr/lib/python2.7/site-packages/cis/collocation/col_implementations.py", line 185, in collocate
self.interpolator = GriddedUngriddedInterpolator(data, points, kernel, self.missing_data_for_missing_sample)
File "/usr/lib/python2.7/site-packages/cis/collocation/gridded_interpolation.py", line 122, in __init__
"dimenions: {} and {} respectively".format(len(sample_points), len(data.shape)))
ValueError: Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively

2017-07-27 16:06:21,273 - ERROR - cis_main : 24 - Sample points do not uniquely define gridded data source points, invalid dimenions: 1 and 3 respectively - check cis.log for details

duncanwp
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Rachel,

Yes, that sounds like just the kind of thing CIS should be able to help with. Thanks for the detailed log, but I think you've identified the problem yourself. The error message isn't very helpful in this case, but the problem is that CIS won't match the grid_longitude and grid_latitude coordinates with the MODIS longitude and latitude coordinates respectively.

If you're not worried about the differences between the coordinate systems you can just rename your model dimensions to longitude and latitude and it should just work.

The alternative is to re-project the model data to the MODIS coordinate system and then CIS would find the nearest points using kd-trees, but you might find that quite slow as there are some optimisations in this case which CIS doesn't yet take advantage of (the main one being that the model grid is the same at each time step, unlike when collocating with other satellite data, for example).

I'm hoping that in version 2.0 we will be able to take care of differences in coordinate reference systems, although this might just be as a separate 're-project' command, as it's not always clear what the best option would be. I'd be happy to hear your thoughts on the process.

Many thanks,

Duncan

RachelH
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Duncan,

Thank you for getting back to me. That seems to have worked so that's great.

One thing I haven't managed to do is exclude from the model data the points that are missing in the satellite files. When I do 'cis plot', the collocated model data has 3 small swaths of blank data or missing values, whereas the satellite data has much larger regions of missing data. For a fairer comparison, I'd like to be able to exclude from the model data all the locations where the satellite data is missing. From the 'cis info' output for the satellite files, I think the missing AOD points have a value of -9999. It's probably very simple, but how do I use the 'missing_data_for_missing_sample' option to exclude all the missing data from the model data?

So far I have tried the following, all of which get the same error:
>>> cis col oddust_prog:Model_AOD.nc Satellite_Optical_Depth_Land_And_Ocean.nc:collocator=lin[missing_data_for_missing_sample] -o Model_missing_vals_test_AOD.nc
Out: 'bool' object has no attribute 'lower'
2017-08-01 10:53:56,375 - ERROR - 'bool' object has no attribute 'lower' - check cis.log for details

>>> cis col oddust_prog:Model_AOD.nc Satellite_Optical_Depth_Land_And_Ocean.nc:collocator=lin[Optical_Depth_Land_And_Ocean,missing_data_for_missing_sample] -o Model_missing_vals_test_AOD.nc
Out: 'bool' object has no attribute 'lower'
2017-08-01 10:55:47,514 - ERROR - 'bool' object has no attribute 'lower' - check cis.log for details

>>> cis col oddust_prog:Model_AOD.nc Satellite_Optical_Depth_Land_And_Ocean.nc:collocator=lin[variable=Optical_Depth_Land_And_Ocean,missing_data_for_missing_sample] -o Model_missing_vals_test_AOD.nc
Out: 'bool' object has no attribute 'lower'
2017-08-01 10:55:57,756 - ERROR - 'bool' object has no attribute 'lower' - check cis.log for details

>>> cis col oddust_prog:Model_AOD.nc Satellite_Optical_Depth_Land_And_Ocean.nc:collocator=box[missing_data_for_missing_sample] -o Model_missing_vals_test_AOD.nc
Out: 'bool' object has no attribute 'lower'
2017-08-01 10:56:59,511 - ERROR - 'bool' object has no attribute 'lower' - check cis.log for details

The cis.log for one of these is:
2017-08-01 10:55:47,485 - INFO - parse : 290 - Identified input file list: ['Satellite_Optical_Depth_Land_And_Ocean.nc']
2017-08-01 10:55:47,486 - INFO - parse : 290 - Identified input file list: ['Model_AOD.nc']
2017-08-01 10:55:47,486 - DEBUG - cis_main : 252 - CIS started at: 2017-08-01 10:55
2017-08-01 10:55:47,486 - DEBUG - cis_main : 253 - Running command: collocate
2017-08-01 10:55:47,487 - DEBUG - cis_main : 254 - With the following arguments: Namespace(command='collocate', datagroups=[{'variables': ['oddust_prog'], 'filenames': ['Model_AOD.nc']}], force_overwrite=False, output='Model_missing_vals_test_AOD.nc', output_var=None, quiet=False, samplefiles=['Satellite_Optical_Depth_Land_And_Ocean.nc'], samplegroup={'collocator': ('lin', {'missing_data_for_missing_sample': True, 'Optical_Depth_Land_And_Ocean': True}), 'filenames': ['Satellite_Optical_Depth_Land_And_Ocean.nc']}, sampleproduct=None, samplevariable=None, verbose=None)
2017-08-01 10:55:47,487 - DEBUG - plugin : 81 - AProduct subclasses are: [<class 'cis.data_io.products.products.ASCII_Hyperpoints'>, <class 'cis.data_io.products.products.Aeronet'>, <class 'cis.data_io.products.CCI.Aerosol_CCI'>, <class 'cis.data_io.products.caliop.Caliop_L1'>, <class 'cis.data_io.products.caliop.Caliop_L2'>, <class 'cis.data_io.products.cloudsat.CloudSat'>, <class 'cis.data_io.products.CCI.Cloud_CCI'>, <class 'cis.data_io.products.HadGEM.HadGEM_CONVSH'>, <class 'cis.data_io.products.HadGEM.HadGEM_PP'>, <class 'cis.data_io.products.MODIS.MODIS_L2'>, <class 'cis.data_io.products.MODIS.MODIS_L3'>, <class 'cis.data_io.products.NCAR_NetCDF_RAF.NCAR_NetCDF_RAF'>, <class 'cis.data_io.products.gridded_NetCDF.NetCDF_Gridded'>, <class 'cis.data_io.products.products.cis'>]
2017-08-01 10:55:47,488 - DEBUG - AProduct : 169 - Found product class cis matching regex pattern .*\.nc
2017-08-01 10:55:47,491 - INFO - AProduct : 220 - Retrieving coordinates using product cis
2017-08-01 10:55:47,491 - INFO - products : 23 - Listing coordinates: [('longitude', 'x'), ('latitude', 'y'), ('altitude', 'z'), ('time', 't'), ('air_pressure', 'p')]
2017-08-01 10:55:47,502 - DEBUG - netcdf : 288 - Masking all values -180.0 > v > 180.0.
2017-08-01 10:55:47,506 - DEBUG - netcdf : 288 - Masking all values -90.0 > v > 90.0.
2017-08-01 10:55:47,512 - DEBUG - netcdf : 288 - Masking all values 0.0 > v > 3155800000.0.
2017-08-01 10:55:47,514 - DEBUG - cis_main : 22 - 'bool' object has no attribute 'lower'

duncanwp
Re: Collocating MODIS L2 AOD and Unified Model output

You're nearly there!

I think the following should work:

cis col oddust_prog:Model_AOD.nc Satellite_Optical_Depth_Land_And_Ocean.nc:variable=Optical_Depth_Land_And_Ocean,collocator=lin -o Model_missing_vals_test_AOD.nc

When you specify the variable that you want to use to do the collocation then CIS assumes you want to skip any masked points, so you don't need to use the missing_data_for_missing_sample flag.
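
The idea of skipping masked sample points can be illustrated in plain Python. This is a sketch of the concept only, not CIS's implementation. The fill value and valid range are those reported by 'cis info' earlier in the thread, the scale factor is rounded from the value shown in cis.log, and the packed and model values are made up for the example.

```python
FILL_VALUE = -9999          # "Missing value" reported by cis info
VALID_RANGE = (-100, 5000)  # "Valid_Range" reported by cis info
SCALE = 0.001               # rounded from the factor shown in cis.log
OFFSET = 0.0


def unpack_aod(packed):
    """Return the AOD for one packed MODIS value, or None if missing or out of range."""
    if packed == FILL_VALUE or not (VALID_RANGE[0] <= packed <= VALID_RANGE[1]):
        return None
    return (packed - OFFSET) * SCALE


def keep_where_sample_valid(model_values, sample_values):
    """Blank out model values wherever the matching satellite sample is missing."""
    return [m if s is not None else None for m, s in zip(model_values, sample_values)]


packed = [515, -9999, 0, 6000]    # made-up packed values
sample = [unpack_aod(p) for p in packed]
model = [0.42, 0.37, 0.11, 0.25]  # made-up collocated model AOD
masked_model = keep_where_sample_valid(model, sample)
print(masked_model)
```

After this step the model and satellite fields have gaps in the same places, which is what makes the side-by-side comparison fair.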

RachelH
Collocating MODIS L2 AOD and Unified Model output

Hi Duncan,

Thank you, that worked!

I had to use the 'box[hsep]' collocator instead of 'lin'. I think this was because the data had already been collocated once so it wasn't gridded. I just have a couple of questions to check the way I did it was sound and to understand some things better.

What I did was:
1. Make sure the model lat and lon data was unrotated and the lat, lon and time coordinates had the same name and were in the same order as the satellite data.
2. Subset the satellite data so I had one file for each day for the same area as the region of the model. There was only one overpass for each day for everywhere in the domain so there was no need to subset by time.
3. Collocate the model data to each daily satellite file so there was a corresponding daily model file. I didn't use any specific collocation method for this step so all the missing values weren't excluded. The reason I didn't specify was that I was worried doing that would mean the model collocation was averaged over time, whereas I wanted it to collocate based on all three dimensions including time. I'm not sure I needed to do this?
4. Collocate the new daily model files to the satellite daily files again using 'box[hsep=11km]'. This got rid of the missing values. I don't completely understand the logic I should use to decide the separation or what the hsep option actually represents. It worked for anything more than the grid spacing of the files (model 8km, satellite 10km). Can you explain what it is and how to decide what value to use?
5. Aggregated both the model and satellite files to 1 degree resolution. This is because I thought the file format might be easier to use if they were gridded rather than points and if they were on the same resolution. Again, I'm not sure this was totally necessary? Or if it would be better to do this before instead of after collocation?

Sorry for all the questions!

Best,

Rachel.

duncanwp
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Rachel,

No problem at all, I'm glad to hear it worked. The method you've described looks broadly sensible, but I've responded to each of your questions in-line below.

It's a bit of a rabbit hole once you really start thinking about the best way of comparing these kinds of data. CIS was designed to give you lots of power and flexibility and we do work through some of these issues in our workshops but it might be good to try and have a quick-start guide to collocation on the website. Hopefully this post will help too though!

Duncan

>> 1. Make sure the model lat and lon data was unrotated and the lat, lon and time coordinates had the same name and were in the same order as the satellite data.
Re-projecting the lat/lon is necessary, but the names and orders of the coordinates shouldn't matter (as long as they have valid standard_name attributes in the NetCDF).

>> 2. Subset the satellite data so I had one file for each day for the same area as the region of the model. There was only one overpass for each day for everywhere in the domain so there was no need to subset by time.
Great, makes sense.

>> 3. Collocate the model data to each daily satellite file so there was a corresponding daily model file. I didn't use any specific collocation method for this step so all the missing values weren't excluded. The reason I didn't specify was that I was worried doing that would mean the model collocation was averaged over time, whereas I wanted it to collocate based on all three dimensions including time. I'm not sure I needed to do this?
If I've understood your input files correctly you should have been able to use the 'lin' method and specify the MODIS AOD variable name so that only valid AOD retrievals are compared with the model. This would be quicker and would save the next step completely. As long as the subset satellite data still has a time coordinate (which it should) then it would find the right model timestep for the overpass - no averaging would happen.

>> 4. Collocate the new daily model files to the satellite daily files again using 'box[hsep=11km]'. This got rid of the missing values. I don't completely understand the logic I should use to decide the separation or what the hsep option actually represents. It worked for anything more than the grid spacing of the files (model 8km, satellite 10km). Can you explain what it is and how to decide what value to use?
What's happening here is that each of your collocated model points is being matched up with any satellite points within 11km in the horizontal (hsep is short for horizontal separation). Now, because they already match up after the step above you should only need to specify an arbitrarily small separation (say 0.1km), and the mean would just be the value of the one matching point (the corrected sample standard deviation should also be output but this will be NaN since N=1). This type of collocation also has the option of just choosing the nearest point in the horizontal (kernel=nn_horizontal) which might be more appropriate, but I would just do the slightly modified collocation in step 3 and skip this step altogether.
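
The effect of a horizontal separation can be sketched as follows: for one sample point, collect every data point within hsep kilometres (great-circle distance) and report the mean, the corrected sample standard deviation and the count. This is a simplified illustration rather than the CIS box collocator itself; the Earth radius is an approximation and all coordinates and values are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def box_mean(sample_point, data_points, hsep_km):
    """Mean, corrected std dev and count of data values within hsep_km of sample_point."""
    lat0, lon0 = sample_point
    values = [v for lat, lon, v in data_points
              if haversine_km(lat0, lon0, lat, lon) <= hsep_km]
    n = len(values)
    if n == 0:
        return math.nan, math.nan, 0
    mean = sum(values) / n
    if n == 1:
        # The corrected (N-1) standard deviation is undefined for a single point
        return mean, math.nan, 1
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance), n


# Made-up points: one coincident, one about 5.6 km away, one about 111 km away
data = [(10.0, 50.0, 0.3), (10.05, 50.0, 0.5), (11.0, 50.0, 0.9)]
print(box_mean((10.0, 50.0), data, 11.0))
print(box_mean((10.0, 50.0), data, 0.1))
```

With the larger separation two points are averaged; with the small one only the coincident point is used, so the mean is just that point's value and the standard deviation is NaN.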

>> 5. Aggregated both the model and satellite files to 1 degree resolution. This is because I thought the file format might be easier to use if they were gridded rather than points and if they were on the same resolution. Again, I'm not sure this was totally necessary? Or if it would be better to do this before instead of after collocation?
This is a really good question. It's not necessary, but it does make the data a bit easier to work with, and probably more importantly allows you to make temporal averages which will remove some of the sampling error (see Nick's recent paper on this exact topic here: https://www.atmos-chem-phys.net/16/6335/2016/). I think Nick prefers to aggregate the data in space (keeping the high temporal resolution) then collocate, but I prefer to do the collocation then aggregate. It will affect the variability you see, and may affect the sampling errors, but I've not seen anything to convince me that one way is better than the other.
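
As a rough picture of what aggregating points to a 1 degree grid involves, the sketch below bins (latitude, longitude, value) points into cells and averages each cell, skipping missing values. It is a simplified stand-in, not the CIS aggregate command, and the points are made up.

```python
import math
from collections import defaultdict


def aggregate_to_grid(points, cell_deg=1.0):
    """Average (lat, lon, value) points into cell_deg x cell_deg cells, skipping None."""
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in points:
        if value is None:
            continue
        # Integer cell indices; floor keeps negative coordinates in the right cell
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[key][0] += value
        sums[key][1] += 1
    # Key each cell by its lower-left corner in degrees
    return {(i * cell_deg, j * cell_deg): total / count
            for (i, j), (total, count) in sums.items()}


# Made-up collocated points, including one missing value
points = [(10.2, 50.1, 0.2), (10.8, 50.9, 0.4), (11.1, 50.5, 1.0), (10.5, 50.5, None)]
grid = aggregate_to_grid(points)
print(grid)
```

Running the same binning on the model and satellite points gives two fields on a common grid, whichever order of collocation and aggregation is chosen.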

RachelH
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Duncan,

There's another small issue I hadn't noticed before with the collocation that hopefully you can help with.

The coordinates of the model AOD seem to be shifted downwards from where they are meant to be. I think this is because the model is a regional one and thus has a different projection to a global grid. The latitude and longitude coordinates I read in were 1D arrays and thus at the edges, the coordinates aren't correct when collocated onto an orthogonal grid. If I could read in 2D arrays for latitude and longitude, it probably wouldn't be an issue. Do you know if this is possible? Or if not, do you think reprojecting the model data onto an orthogonal grid before I do the collocation would solve it?

Best wishes,

Rachel.

duncanwp
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Rachel,

I alluded to this in my initial reply (#2) but I may not have been very clear. Both options are valid and would work, but depending on how many time steps of model data you have compared to how much satellite data you have, they could take quite different amounts of compute time. Unfortunately there's no way of knowing beforehand which would be best...

Duncan

RachelH
Re: Collocating MODIS L2 AOD and Unified Model output

Hi Duncan,

Thanks for your reply and sorry for the delay in responding. I have got it all working now based on the above. Great that it doesn't need both collocation steps and getting rid of the second command made it a lot quicker to run.

Thanks for the link to the paper, I will have a look.

Best wishes,

Rachel.
