The GriddedData class
This notebook introduces the GriddedData class of pyaerocom, the fundamental base class for the analysis of model data. The underlying data type is iris.cube.Cube, which GriddedData extends, for instance by allowing netCDF files to be imported directly when an instance is created (i.e. by passing the file name and specifying the variable name on initialisation). The following cells walk through some of the features of the GriddedData class, starting with some imports:
[1]:
import pyaerocom as pya
from warnings import filterwarnings
filterwarnings('ignore')
pya.change_verbosity('critical')
Initating pyaerocom configuration
Checking database access...
Checking access to: /lustre/storeA
Access to lustre database: True
Init data paths for lustre
Expired time: 0.016 s
Let’s get a test file to load
[2]:
test_files = pya.io.testfiles.get()
for name, filepath in test_files["models"].items():
    print("%s\n%s\n" % (name, filepath))
aatsr_su_v4.3
/lustre/storeA/project/aerocom/aerocom-users-database/CCI-Aerosol/CCI_AEROSOL_Phase2/AATSR_SU_v4.3/renamed/aerocom.AATSR_SU_v4.3.daily.od550aer.2008.nc
ecmwf_osuite
/lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/renamed/aerocom.ECMWF_OSUITE_NRT_test.daily.od550aer.2018.nc
Let’s pick out the ECMWF OSUITE test file and load the data directly into an instance of the GriddedData class. The GriddedData class takes either a preloaded instance of the iris.cube.Cube class or a valid netCDF file path as input. The latter requires the variable name to be specified, which is then used to select the variable from the data stored in the netCDF file (which may contain multiple variables). The following example imports the data for the aerosol optical depth at 550 nm. The string representation of the GriddedData class (see print at the end of the following code cell) was slightly adapted from that of the underlying Cube object.
Read ECMWF OSUITE AOD data
[3]:
fpath = test_files["models"]["ecmwf_osuite"]
data = pya.GriddedData(input=fpath, var_name="od550aer", data_id="ECMWF_OSUITE")
print(data)
Overwriting unit unknown in cube od550aer with value "1"
pyaerocom.GriddedData: ECMWF_OSUITE
Grid data: Dust Aerosol Optical Depth at 550nm / (1) (time: 365; latitude: 451; longitude: 900)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Attributes:
Conventions: CF-1.0
NCO: 4.7.2
computed: False
concatenated: False
data_id: ECMWF_OSUITE
from_files: ['/lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/rename...
history: Tue Mar 20 13:08:49 2018: ncks -7 -O -o test.nc -x -v time_bnds od550aer.test.orig.nc
Tue...
history_of_appended_files: Tue Mar 20 02:09:15 2018: Appended file /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT/renamed//aerocom.ECMWF_OSUITE_NRT.daily.od550bc.2018.nc...
invalid_units: ~
is_at_stations: False
nco_openmp_thread_number: 1
outliers_removed: False
reader: None
region: None
regridded: False
ts_type: daily
var_name: od550aer
var_name_read: n/d
vert_code:
year: 2018
Cell methods:
mean: time
Remark on longitude definition
If the longitudes in the original NetCDF file are defined as:
$$0 \leq \mathrm{lon} \leq 360$$
then pyaerocom automatically converts them to:
$$-180 \leq \mathrm{lon} \leq 180$$
when an instance of the GriddedData class is created (see the print statement above: Rolling longitudes to -180 -> 180 definition). This is, for instance, the case for the ECMWF OSUITE data files.
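The arithmetic behind such a longitude roll can be sketched in a few lines of plain Python. This is only an illustration and not pyaerocom's implementation: the helper name roll_longitudes is hypothetical, and the input is assumed to use a 0 to 360 convention. The returned index order is what would have to be applied to the data along the longitude axis as well.

```python
def roll_longitudes(lons):
    """Map longitudes from a 0..360 convention to -180..180 and sort them.

    Hypothetical helper for illustration only. Returns (new_lons, order),
    where order[i] is the input index of the value placed at output position i.
    """
    # Shift by 180, wrap into [0, 360), then shift back into [-180, 180)
    wrapped = [((lon + 180.0) % 360.0) - 180.0 for lon in lons]
    # Sort so that the new longitude axis increases monotonically
    order = sorted(range(len(wrapped)), key=lambda i: wrapped[i])
    return [wrapped[i] for i in order], order


lons_0_360 = [0.0, 90.0, 180.0, 270.0]
new_lons, order = roll_longitudes(lons_0_360)
print(new_lons)  # [-180.0, -90.0, 0.0, 90.0]
print(order)     # [2, 3, 0, 1]
```

Note that with this formula a longitude of exactly 180 ends up at -180, which matches the minimum of -180.0 reported for the longitude coordinate of the example data.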
Basic attributes of the GriddedData class
In the following cells, some of the most important attributes are introduced. Most of them are reimplementations of attributes of the underlying iris.cube.Cube data object, which is stored in the GriddedData instance and can be accessed via the GriddedData.cube attribute. For instance, GriddedData.longitude returns GriddedData.grid.coord("longitude"), GriddedData.latitude returns GriddedData.grid.coord("latitude"), and GriddedData.time returns the time dimension coordinate (GriddedData.grid.coord("time")).
[4]:
data.data_id
[4]:
'ECMWF_OSUITE'
[5]:
data.var_name
[5]:
'od550aer'
[6]:
data.units
[6]:
Unit('1')
[7]:
data.ts_type
[7]:
'daily'
Side note: the unit is not specified in this dataset (hence the message "Overwriting unit unknown in cube od550aer" printed on import), which is unfortunately part of the game when working with external data…
[8]:
type(data.longitude)
[8]:
iris.coords.DimCoord
[9]:
data.longitude.points.min(), data.longitude.points.max()
[9]:
(-180.0, 179.60000610351562)
[10]:
type(data.latitude)
[10]:
iris.coords.DimCoord
[11]:
data.latitude.points.min(), data.latitude.points.max()
[11]:
(-90.0, 90.0)
[12]:
type(data.time)
[12]:
iris.coords.DimCoord
[13]:
data.time.points.min(), data.time.points.max()
[13]:
(0.0, 364.0)
[14]:
tstamps = data.time_stamps()
print(tstamps[0], tstamps[-1])
2018-01-01T00:00:00.000000 2018-12-31T00:00:00.000000
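The relation between the numerical time points (0.0 to 364.0) and the time stamps printed above can be sketched as follows. The sketch assumes that the time axis counts days since 2018-01-01, which is consistent with the values shown but is not read from the file here; the helper days_to_datetime is hypothetical and not part of pyaerocom.

```python
from datetime import datetime, timedelta

# Assumption for illustration: the time unit is "days since 2018-01-01"
REFERENCE = datetime(2018, 1, 1)


def days_to_datetime(days, reference=REFERENCE):
    """Convert a numerical time point (in days) to a datetime object."""
    return reference + timedelta(days=days)


first = days_to_datetime(0.0)
last = days_to_datetime(364.0)
print(first.isoformat(), last.isoformat())
# 2018-01-01T00:00:00 2018-12-31T00:00:00
```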
If you do not specify the variable name, an exception is raised that gives you some information about which variables are available in the file (if the file is readable using the iris.load method).
[15]:
try:
    data = pya.GriddedData(input=fpath)
except pya.exceptions.NetcdfError as e:
    print("This did not work...error message: %s" % repr(e))
This did not work...error message: NetcdfError("Could not load single cube from /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/renamed/aerocom.ECMWF_OSUITE_NRT_test.daily.od550aer.2018.nc. Please specify var_name. Input file contains the following variables: ['od550dust', 'od550oa', 'od550aer', 'od550so4', 'od550bc']")
Also, if you pass an invalid variable name, you will get a hint.
[16]:
try:
    data = pya.GriddedData(input=fpath, var_name="Blaaa")
except Exception as e:
    print("This also did not work...error message: %s" % repr(e))
This also did not work...error message: NetcdfError('Variable Blaaa not available in file /lustre/storeA/project/aerocom/aerocom1/ECMWF_OSUITE_NRT_test/renamed/aerocom.ECMWF_OSUITE_NRT_test.daily.od550aer.2018.nc')
You can have a quick look at the data using the class's own quickplot_map method:
[17]:
data.quickplot_map(time_idx=0, vmin=0, vmax=1, c_over="r");
Why not load some of the other variables…
[18]:
data_bc = pya.GriddedData(fpath, var_name="od550bc", data_id="ECMWF_OSUITE")
data_so4 = pya.GriddedData(fpath, var_name="od550so4", data_id="ECMWF_OSUITE")
Overwriting unit unknown in cube od550bc with value "1"
Overwriting unit unknown in cube od550so4 with value "1"
… and plot them as well
[19]:
data_bc.quickplot_map();
Apply custom crop and plot
[20]:
data_so4_cropped = data_so4.crop(lon_range=(-30, 30),
lat_range=(10, 60))
data_so4_cropped.quickplot_map(time_idx='25-2-2018', xlim=(-100, 100), ylim=(-70, 70));
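Conceptually, cropping a regular grid amounts to selecting the coordinate indices that fall inside the requested ranges. The following is a minimal sketch of that idea on a coarse toy grid, not the code used by GriddedData.crop: the helper crop_indices is hypothetical, and treating both range bounds as inclusive is an assumption.

```python
def crop_indices(coords, lower, upper):
    """Return the indices of all coordinate values inside [lower, upper].

    Hypothetical helper; inclusive bounds are an assumption.
    """
    return [i for i, value in enumerate(coords) if lower <= value <= upper]


# Coarse toy grid with 30 degree spacing (not the grid of the test file)
lons = [-180.0 + 30.0 * i for i in range(12)]  # -180, -150, ..., 150
lats = [-90.0 + 30.0 * i for i in range(7)]    # -90, -60, ..., 90

lon_idx = crop_indices(lons, -30, 30)
lat_idx = crop_indices(lats, 10, 60)
print([lons[i] for i in lon_idx])  # [-30.0, 0.0, 30.0]
print([lats[i] for i in lat_idx])  # [30.0, 60.0]
```

The resulting index lists would then be used to slice the data array along its longitude and latitude dimensions.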
Change resolution
Downscale to a resolution of 2° x 2°:
[21]:
import numpy as np  # not imported in the first cell, but needed for np.arange

lons = np.arange(-180, 180, 2)
lats = np.arange(-90, 90, 2)
data_lowres = data.interpolate(longitude=lons, latitude=lats)
Interpolating data of shape (365, 451, 900). This may take a while.
Successfully interpolated cube
And plot:
[22]:
fig = data_lowres.quickplot_map()
Area weighted mean
Retrieve the area-weighted mean time series from the data:
[23]:
area_mean = data.get_area_weighted_timeseries()
area_mean.plot_timeseries('od550aer');
Trying to infer ts_type in StationData ECMWF_OSUITE for variable od550aer
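To illustrate why area weighting matters on a latitude-longitude grid, here is a minimal sketch that weights each grid cell by the cosine of its latitude, a common approximation for regular grids. It is not necessarily how get_area_weighted_timeseries computes its weights (these may be derived from the actual cell bounds); the function area_weighted_mean and the toy field are hypothetical.

```python
import math


def area_weighted_mean(field, lats):
    """Mean of a 2D field (rows = latitudes) with cos(latitude) weights.

    Hypothetical helper: field[i][j] is the value at latitude lats[i].
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats):
        # Grid cells shrink towards the poles, roughly in proportion to cos(lat)
        weight = math.cos(math.radians(lat))
        for value in row:
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total


lats = [-60.0, 0.0, 60.0]
field = [[1.0, 1.0],   # values at 60 S
         [2.0, 2.0],   # values at the equator
         [1.0, 1.0]]   # values at 60 N

mean_weighted = area_weighted_mean(field, lats)
mean_plain = sum(sum(row) for row in field) / 6
print(mean_weighted, mean_plain)
```

In this toy case the weighted mean is close to 1.5, while the unweighted mean is about 1.33, because the equatorial row represents a larger area than the two rows at 60 degrees latitude. For a time series, the same calculation would be repeated for each time step.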
… more to come
This tutorial is not yet complete, as the GriddedData class is currently under development.