collocating 2D data with AERONET data

Masaru Yoshioka
collocating 2D data with AERONET data

As I said the other day, my current goal is to compare simulated aerosol optical depths (AODs) with AERONET measurements. I had a lot of trouble handling the simulated AODs stored in UM outputs (.pp files) with CIS. On top of that, the AODs are output separately for each aerosol mode, so I have to add them up to get the total AOD. I'm not completely sure, but I'm guessing CIS does not have the functionality to do simple arithmetic. So I'm trying to use pre-calculated AODs and collocate them onto the AERONET measurements.

The total AOD has been calculated using NCO (the netCDF Operators). Now I use this data;

fnmn=AOD550_TOTAL_teafwa_pm2008jul.nc
vnmn=AOD550_TOTAL

I'm using this AERONET data;

dira=/home/users/myoshioka/crescendo/Data/AERONET/AOT/LEV20/ALL_POINTS
station=Izana
fnma=920801_171209_$station.lev20

fnmc=${fnmn//.nc/_$station.nc} # output file name
echo $fnmc # AOD550_TOTAL_teafwa_pm2008jul_Izana.nc

Then I did;

cis col $vnmn:$fnmn $dira/$fnma -o $fnmc

but this doesn't go well. I get an error message like this;

ERROR - Sample points do not uniquely define gridded data source points, invalid dimenions: 2 and 3 respectively

I checked the full output using option "-vv" but I don't see any other info that is very helpful.
The input data has a structure like this;

netcdf AOD550_TOTAL_teafwa_pm2008jul {
dimensions:
t = UNLIMITED ; // (1 currently)
latitude = 145 ;
longitude = 192 ;
variables:
float AOD550_TOTAL(t, latitude, longitude) ;

longitude = 0, 1.875, 3.75, 5.625, 7.5, 9.375, 11.25, 13.125, 15, 16.875, ...

latitude = -90, -88.75, -87.5, -86.25, -85, -83.75, -82.5, -81.25, -80, ...

t = 228.5 ;

Do you see any problem? Should I remove dimension t to make it a strictly 2-dimensional variable? But other cis operations work with t. Could you please give me some advice?

Thanks,
Masaru

duncanwp
Re: collocating 2D data with AERONET data

Hi Masaru,

Indeed CIS can do basic mathematical operations, see the documentation here: http://cis.readthedocs.io/en/stable/evaluation.html

In any case, the nco command you used should be fine. I think the error you are getting from the collocation is because CIS by default will linearly interpolate the model grid values onto the Aeronet sample points. The trouble is that you're only passing it one model time point, and so CIS cannot interpolate in the time dimension. If you pass all the model months then CIS will be able to do the interpolation.

Cheers,

Duncan

Masaru Yoshioka

Hi Duncan,

Thanks for this. I didn't know that cis col does not only spatial but also temporal collocation at the same time. Is it possible to turn off temporal collocation and do spatial collocation only? Or to interpolate linearly in space but take the closest point in time?

Well, I checked the documentation and found the latter is probably possible. Probably I should use collocator=lin,kernel=nn_t like this?

cis col -v $vnmn:$fnmn $dira/$fnma:collocator=lin,kernel=nn_t -o $fnmc

but this failed with the same error;

ERROR - Sample points do not uniquely define gridded data source points, invalid dimenions: 2 and 3 respectively

Are multiple points in time necessary even with nn_t? Next I tried this;

fnmn=AOD550_TOTAL_teafwa_pm2008???.nc

cis col $vnmn:$fnmn $dira/$fnma -o $fnmc

this expands like this;

cis col AOD550_TOTAL:AOD550_TOTAL_teafwa_pm2008???.nc /home/users/myoshioka/crescendo/Data/AERONET/AOT/LEV20/ALL_POINTS/920801_171209_Izana.lev20 -o AOD550_TOTAL_teafwa_pm2008jul_Izana.nc

this failed with error messages like these;

ERROR - Unable to concatenate cubes on load:
failed to concatenate into a single cube.
Cube metadata differs for phenomenon: DOUBLE CALL AEROSOL OPTICAL DEPTHS SUMMED FOR ALL MODES
An error occurred retrieving data using the product NetCDF_Gridded. Check that this is the correct product plugin for your chosen data. Exception was InvalidVariableError: Unable to create a single cube from arguments given: (['AOD550_TOTAL_teafwa_pm2008apr.nc', 'AOD550_TOTAL_teafwa_pm2008aug.nc', 'AOD550_TOTAL_teafwa_pm2008dec.nc', 'AOD550_TOTAL_teafwa_pm2008feb.nc', 'AOD550_TOTAL_teafwa_pm2008jan.nc', 'AOD550_TOTAL_teafwa_pm2008jul.nc', 'AOD550_TOTAL_teafwa_pm2008jun.nc', 'AOD550_TOTAL_teafwa_pm2008mar.nc', 'AOD550_TOTAL_teafwa_pm2008may.nc', 'AOD550_TOTAL_teafwa_pm2008nov.nc', 'AOD550_TOTAL_teafwa_pm2008oct.nc', 'AOD550_TOTAL_teafwa_pm2008sep.nc'], Constraint(cube_func=<function <lambda> at 0x47dc668>)).
2017-12-18 11:58:29,444 - ERROR - An error occurred retrieving data using the product NetCDF_Gridded. Check that this is the correct product plugin for your chosen data. Exception was InvalidVariableError: Unable to create a single cube from arguments given: (['AOD550_TOTAL_teafwa_pm2008apr.nc', 'AOD550_TOTAL_teafwa_pm2008aug.nc', 'AOD550_TOTAL_teafwa_pm2008dec.nc', 'AOD550_TOTAL_teafwa_pm2008feb.nc', 'AOD550_TOTAL_teafwa_pm2008jan.nc', 'AOD550_TOTAL_teafwa_pm2008jul.nc', 'AOD550_TOTAL_teafwa_pm2008jun.nc', 'AOD550_TOTAL_teafwa_pm2008mar.nc', 'AOD550_TOTAL_teafwa_pm2008may.nc', 'AOD550_TOTAL_teafwa_pm2008nov.nc', 'AOD550_TOTAL_teafwa_pm2008oct.nc', 'AOD550_TOTAL_teafwa_pm2008sep.nc'], Constraint(cube_func=<function <lambda> at 0x47dc668>)). - check cis.log for details

I'm not sure what's wrong. All data contains variable AOD550_TOTAL. Its attributes are like these;

float AOD550_TOTAL(t, latitude, longitude) ;
AOD550_TOTAL:_FillValue = -1.e+20f ;
AOD550_TOTAL:long_name = "DOUBLE CALL AEROSOL OPTICAL DEPTHS SUMMED FOR ALL MODES" ;
AOD550_TOTAL:missing_value = -1.e+20f ;
AOD550_TOTAL:name = "AEROSOL OPTICAL DEPTHS" ;
AOD550_TOTAL:source = "Unified Model Output (Vn 8.4):" ;
AOD550_TOTAL:title = "AEROSOL OPTICAL DEPTHS" ;
AOD550_TOTAL:units = " " ;
AOD550_TOTAL:valid_min = 0.f ;
AOD550_TOTAL:description1 = "fields 2500+2501+2502+2503+2504+2505" ;
AOD550_TOTAL:cell_methods = "job: mean pseudo: mean" ;

I think these are identical for all 12 files.

t has attributes like this;

float t(t) ;
t:long_name = "t" ;
t:units = "days since 2007-12-01 00:00:00" ;
t:time_origin = "01-DEC-2007:00:00:00" ;

values of it are like these;

$ for file in $fnmn; do ncdump $file| grep 't ='; done
t = 137 ;
t = 259.5 ;
t = 381.5 ;
t = 76.5 ;
t = 46.5 ;
t = 228.5 ;
t = 198 ;
t = 106.5 ;
t = 167.5 ;
t = 351 ;
t = 320.5 ;
t = 290 ;

All files include a line " t = UNLIMITED ; // (1 currently)" as well but I omitted this from above.

These are in alphabetical order of file name, just as in the error message above. The values make sense to me, although they are not compatible with the AERONET time stamps. Do you see any problem here? Can you think of anything I should check? Do you think I need to write my own plugin, even though these files were created by extracting variables from UM (vn8.4) outputs using xconv and doing some arithmetic with nco?

Thanks,
Masaru

Masaru Yoshioka

Well, I thought maybe I should arrange the files in chronological order. I also noticed that the documentation says the file names have to be given as a comma-delimited list rather than a space-delimited one. So I did this;

fnmn="AOD550_TOTAL_teafwa_pm2008jan.nc,AOD550_TOTAL_teafwa_pm2008feb.nc,AOD550_TOTAL_teafwa_pm2008mar.nc,AOD550_TOTAL_teafwa_pm2008apr.nc,AOD550_TOTAL_teafwa_pm2008may.nc,AOD550_TOTAL_teafwa_pm2008jun.nc,AOD550_TOTAL_teafwa_pm2008jul.nc,AOD550_TOTAL_teafwa_pm2008aug.nc,AOD550_TOTAL_teafwa_pm2008sep.nc,AOD550_TOTAL_teafwa_pm2008oct.nc,AOD550_TOTAL_teafwa_pm2008nov.nc,AOD550_TOTAL_teafwa_pm2008dec.nc"
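The same comma-delimited list can be built in a loop, which keeps the months in chronological order without typing each name. This is only a minimal sketch, assuming the files follow the naming pattern used in this thread; `prefix` is a helper variable introduced here for illustration.

```shell
#!/bin/sh
# Build a comma-separated, chronologically ordered list of the monthly files.
# Assumes the naming pattern AOD550_TOTAL_teafwa_pm2008<mon>.nc used in this thread.
prefix=AOD550_TOTAL_teafwa_pm2008   # hypothetical helper variable
fnmn=""
for mon in jan feb mar apr may jun jul aug sep oct nov dec; do
    # ${fnmn:+$fnmn,} expands to "<list so far>," only once the list is non-empty
    fnmn="${fnmn:+$fnmn,}${prefix}${mon}.nc"
done
echo "$fnmn"
```

The resulting `$fnmn` can be passed to cis col in the same way as the hand-written list.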

Also instead of the point data whose time dimension has a size of thousands, I used the monthly mean I created;

dira=/home/users/myoshioka/crescendo/Data/AERONET/AOT/LEV20/monave
fnma=AOT_500_Izana_monthly_2008-07.nc

cis col $vnmn:$fnmn $dira/$fnma -o $fnmc

This gave me the same error as before. Then I used a single concatenated file;

fnmn=AOD550_TOTAL_teafwa_pm2008_12mo.nc

cis col $vnmn:$fnmn $dira/$fnma -o $dira/$fnmc

WOW! this was finally successful!

$ ll $dira/$fnmc
-rw-r--r-- 1 myoshioka users 43951 Dec 19 11:06 /home/users/myoshioka/crescendo/Data/AERONET/AOT/LEV20/monave/AOD550_TOTAL_teafwa_pm2008jul_Izana.nc

ncdump $dira/$fnmc gives me this;

AOD550_TOTAL =
0.2121584, 0.2231532, 0.2253381, 0.3172731, 0.1427087, 0.1464469,
0.2167568, 0.2769181, 0.1928095, 0.2244199, 0.1814614, 0.1897959 ;

longitude = -16.499 ;

latitude = 28.309 ;

t = 46.5, 76.5, 106.5, 137, 167.5, 198, 228.5, 259.5, 290, 320.5, 351, 381.5 ;

so this actually did not collocate the data temporally. Now I tried this;

fnmn=AOD550_TOTAL_teafwa_pm2008jul.nc

cis col $vnmn:$fnmn $dira/$fnma -o $dira/$fnmc

ncdump gives me this;

AOD550_TOTAL =
0.2167568 ;

longitude = -16.499 ;

latitude = 28.309 ;

t = 228.5 ;

so I went back to the expression using wildcards;

fnmn=AOD550_TOTAL_teafwa_pm2008???.nc

cis col $vnmn:$fnmn $dira/$fnma -o $dira/$fnmc

This gave me the same error as before.

So I can conclude there were two reasons why cis col failed. Firstly, I cannot use a comma-delimited list of file names, or a wildcard, which creates a space-delimited list. This means I still don't know how to specify multiple files as input. Secondly, I cannot use the original AERONET *.lev20 file as a reference. I have to use the monthly mean file I created instead. Another conclusion is that cis col does not do temporal collocation by default. It is not clear how to turn this on.

However, I was using monthly mean files only for testing and practicing purposes. What I really want to do is to collocate 3 hourly model outputs onto AERONET measurements and take an average for a month. To do this it would be desirable to be able to temporally collocate model outputs onto measurements as well. I'll create a new thread for this.

Thank you Duncan for your help anyway.

Masaru

duncanwp

Hi Masaru,

> So I can conclude there were two reasons why cis col failed. Firstly, I cannot use a comma-delimited list of file names, or a wildcard, which creates a space-delimited list. This means I still don't know how to specify multiple files as input.

There seems to be a problem merging the files which you've created. This is usually because each file has a (possibly global) attribute which differs, or the time dimension isn't UNLIMITED. If you check that the history and other attributes are the same CIS should be able to read all the files in one go.
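One way to check which attributes differ is to compare the `ncdump -h` header of each file against a reference file. The following is a rough sketch, assuming the netCDF utilities are installed and the monthly files sit in the current directory; the helper `strip_name_line` and the temporary file name `ref_header.txt` are introduced here for illustration only.

```shell
#!/bin/sh
# Compare the header of each monthly file against the January file to spot
# attributes (for example the global history) that differ between files.

# Drop the first header line ("netcdf <name> {"), which always differs by file name.
strip_name_line() {
    sed '1d'
}

ref=AOD550_TOTAL_teafwa_pm2008jan.nc
if [ -f "$ref" ] && command -v ncdump >/dev/null 2>&1; then
    ncdump -h "$ref" | strip_name_line > ref_header.txt
    for f in AOD550_TOTAL_teafwa_pm2008???.nc; do
        echo "== $f"
        # diff prints nothing when the headers are identical
        ncdump -h "$f" | strip_name_line | diff ref_header.txt -
    done
    rm -f ref_header.txt
else
    echo "reference file or ncdump not found; nothing compared"
fi
```

Any line that diff reports is a candidate for the metadata mismatch that stops the files from being merged.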

If you try doing the CIS eval command directly on the pp files to sum the modes you should be able to use the resultant file for collocation with no problem.

> Secondly, I cannot use the original AERONET *.lev20 file as a reference. I have to use the monthly mean file I created instead. Another conclusion is that cis col does not do temporal collocation by default. It is not clear how to turn this on.

CIS will automatically interpolate across all the dimensions it needs to, including time - as demonstrated by the successful collocation above. The trouble you were having is that, because you were collocating one file at a time, CIS didn't know about the neighbouring time points, so it couldn't do the interpolation for you. By passing all the time points to the collocation, CIS will collocate in space and time.

So the basic workflow should be (I don't know what the actual AOD variable names are!):

cis eval AOD_DUST,AOD_SO4,AOD_BC:*.pp "AOD_DUST+AOD_SO4+AOD_BC" 1 -o AOD_TOTAL:total_aod.nc
cis col AOD_TOTAL:total_aod.nc *.lev20 -o collocated_aod.nc

Hopefully that will work!

Masaru Yoshioka

Thank you for your reply. I appreciate it. But oh dear, it requires identical global attributes across all files...? That sounds like quite a restriction. OK, then I will either concatenate the files before using them or try cis eval.

But as you saw above,

t = 46.5, 76.5, 106.5, 137, 167.5, 198, 228.5, 259.5, 290, 320.5, 351, 381.5 ;

are just as in the original data and were not collocated. Are you saying this was because these data are included in one file? Or because the reference file has one value in time? I've got an impression that you are saying neither of these, but then I don't understand why the data wasn't collocated in time. I would think in theory these can be linearly interpolated to AERONET measurement and return one value instead of twelve.
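For reference, the arithmetic of a linear interpolation in time can be sketched with the twelve values from the ncdump output above. This only illustrates the calculation and is not how CIS implements it. The target time of 213 days (2008-07-01 00:00 in "days since 2007-12-01") is a hypothetical sample time chosen for the example, and `interp` is a helper written for this sketch.

```shell
#!/bin/sh
# Linear interpolation in time between the two model points that bracket a target time.
# t (days since 2007-12-01) and AOD550_TOTAL values are taken from the ncdump output above.
t_vals="46.5 76.5 106.5 137 167.5 198 228.5 259.5 290 320.5 351 381.5"
aod_vals="0.2121584 0.2231532 0.2253381 0.3172731 0.1427087 0.1464469 0.2167568 0.2769181 0.1928095 0.2244199 0.1814614 0.1897959"
target=213   # hypothetical sample time: 2008-07-01 00:00

interp() {
    # $1 = target time, $2 = space-separated times, $3 = space-separated values
    awk -v x="$1" -v ts="$2" -v vs="$3" 'BEGIN {
        n = split(ts, t, " "); split(vs, v, " ")
        for (i = 1; i < n; i++) {
            if (x >= t[i] && x <= t[i + 1]) {
                w = (x - t[i]) / (t[i + 1] - t[i])      # weight of the later point
                printf "%.4f\n", v[i] + w * (v[i + 1] - v[i])
                exit 0
            }
        }
        exit 1   # no bracketing pair, e.g. when only one time point is given
    }'
}

result=$(interp "$target" "$t_vals" "$aod_vals")
echo "$result"
```

With a single time point, such as the July file on its own, there is no bracketing pair and `interp` returns a non-zero status, which mirrors why one time step is not enough to interpolate in time.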

Thanks.
Masaru

duncanwp

> Thank you for your reply. I appreciate it. But oh dear, it requires identical global attributes across all files...? That sounds like quite a restriction. OK, then I will either concatenate the files before using them or try cis eval.
Yes - this is a pain! Unfortunately it stems from the way the Iris library merges Cubes (see e.g. https://github.com/SciTools/iris/pull/469). I work around it with the following plugin, which 'pops' the history attribute off and which you might find useful: https://github.com/duncanwp/cis_plugins/blob/master/multi-netcdf.py
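A possible alternative to the plugin, not taken from the reply above, is to delete the global history attribute from each file with NCO's ncatted before loading them. This is a sketch under the assumption that history is the only attribute that differs. It edits files in place, so it is safer to run it on copies. `strip_history` is a helper name introduced here.

```shell
#!/bin/sh
# Remove the global "history" attribute so the monthly files share identical metadata.
# This is an NCO-based alternative, not the plugin linked above.
strip_history() {
    # -a history,global,d,, : delete (d) the global attribute "history"
    # -h : do not record this ncatted call in a new history attribute
    # -O : overwrite the file in place without prompting
    ncatted -O -h -a history,global,d,, "$1"
}

count=0
for f in AOD550_TOTAL_teafwa_pm2008???.nc; do
    [ -f "$f" ] || continue          # the glob matched no files
    strip_history "$f" && count=$((count + 1))
done
echo "history removed from $count file(s)"
```

Running `ncdump -h` on one of the files afterwards shows whether the attribute is gone.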

> I would think in theory these can be linearly interpolated to AERONET measurement and return one value instead of twelve.
Apologies - I got a bit lost and misunderstood your earlier comment. I see now what you mean. Could you post the results of cis info for each of the variables and then the result of the collocation?

Many thanks

Masaru Yoshioka

Hi Duncan,
Thank you for these. Of course I understand it's very easy to get lost in this long thread. cis info... OK.

Masaru Yoshioka
Reply to #7

Hi Duncan,
Thank you for these. Of course I understand it's very easy to get lost in this long thread. cis info... here they are. White spaces seem to be collapsed when posted. Do these tell you something?
Masaru

[myoshioka@jasmin-sci3 teafw]$ cis info AOD550_TOTAL_teafwa_pm2008_12mo.nc
AOD550_TOTAL
[myoshioka@jasmin-sci3 teafw]$ cis info AOD550_TOTAL:AOD550_TOTAL_teafwa_pm2008_12mo.nc
DOUBLE CALL AEROSOL OPTICAL DEPTHS SUMMED FOR ALL MODES / (unknown) (t: 12; latitude: 145; longitude: 192)
Dimension coordinates:
t x - -
latitude - x -
longitude - - x
Attributes:
NCO: "4.5.5"
description1: fields 2500+2501+2502+2503+2504+2505
history: Thu Dec 14 17:07:15 2017: ncrename -v AODs_TOTAL,AOD550_TOTAL AOD550_TOTAL_teafwa_pm2008_12mo.nc
Fri...
name: AEROSOL OPTICAL DEPTHS
nco_openmp_thread_number: 1
source: Unified Model Output (Vn 8.4):
title: AEROSOL OPTICAL DEPTHS
valid_min: 0.0
Cell methods:
mean: job
mean: pseudo

[myoshioka@jasmin-sci3 monave]$ cis info AOT_500_Izana_monthly_2008-07.nc
AOT_500_num_points
AOT_500_std_dev
AOT_500
[myoshioka@jasmin-sci3 monave]$ cis info AOT_500:AOT_500_Izana_monthly_2008-07.nc
AOT_500 / (1) (longitude: 1; latitude: 1; altitude: 1; time: 1)
Dimension coordinates:
longitude x - - -
latitude - x - -
altitude - - x -
time - - - x
Attributes:
Conventions: CF-1.5
history: 2017-12-13T17:08:50Z Aggregated using CIS version 1.5.4
variables: ['AOT_500']
...

duncanwp

OK, I think the problem is that the time dimension in your combined model AOD has nothing identifying it as such. If you change the variable name to 'time', or add a 'time' standard_name attribute then CIS should be able to match it with the corresponding Aeronet coordinate.
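Both suggestions can be tried with NCO on the existing concatenated file. The sketch below assumes NCO is installed; the output file name is hypothetical, and it is worth checking the result with `ncdump -h` because ncrename behaviour for coordinate variables can depend on the NCO version and file format.

```shell
#!/bin/sh
# Two ways to mark the model's "t" coordinate as time, using NCO.
infile=AOD550_TOTAL_teafwa_pm2008_12mo.nc
outfile=AOD550_TOTAL_teafwa_pm2008_12mo_time.nc   # hypothetical output name

if [ -f "$infile" ] && command -v ncrename >/dev/null 2>&1; then
    # Option 1: rename both the dimension (-d) and the coordinate variable (-v).
    ncrename -O -d t,time -v t,time "$infile" "$outfile"

    # Option 2 (alternative): keep the name "t" but add standard_name = "time".
    # -a standard_name,t,o,c,time : overwrite (o) a character (c) attribute on variable t
    # ncatted -O -a standard_name,t,o,c,time "$infile" "$outfile"

    # Confirm the result before collocating.
    ncdump -h "$outfile" | grep -i time
else
    echo "input file or NCO not found; nothing changed"
fi
```

Writing to a separate output file leaves the original untouched, so the collocation can be rerun on either version for comparison.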

I've created a CIS issue to try and be more explicit about how we match coordinates as it's not always obvious what it's doing under the hood: https://jira.ceh.ac.uk/browse/JASCIS-376.

Masaru Yoshioka
Closing this thread

Great. It looks like you are right.

I found that the time dimension and variable are extracted as 't' when using xconv but as 'time' when using an iris cube. The AOD data created from variables extracted with iris cube and processed with python were collocated in both space and time;

cis col aod550_total:aod550_total_teafwa_pb20080101.nc /group_workspaces/jasmin2/crescendo/Data/AERONET/AOT/LEV20/ALL_POINTS/920801_171209_Izana.lev20:collocator=lin[extrapolate=False] -o aod550_total_teafwa_pb20080101_Izana.nc

(notice variable name is in lower case because iris cube converts upper case letters into lower case. so to be consistent I changed the file name into lower case as well.)

I already have a huge amount of data extracted using xconv and processed with nco and IDL, but for this purpose I will use iris cube and python.

I'm closing this thread but the cis col done here was not completely successful unfortunately. I will open another thread for that.

Very many thanks for your help.
Masaru
