Diving deeper into the analysis and API
This notebook gives a quick introduction to the main features and workflows of pyaerocom.
This includes brief introductions to the following features:
It introduces, in graphical form, the main data objects and processing routines for model and observation comparisons with pyaerocom, illustrated in the following flowchart:
Note that in this example the “Data server” is the local computer, which holds the minimal test dataset as an example dataset.
[1]:
import pyaerocom as pya
import logging
logging.getLogger().setLevel(logging.ERROR)  # Set the level of information that pyaerocom outputs to the console.
pya.__version__
[1]:
'0.24.0'
Should be at least 0.14.X
Check access to testdata
NOTE: details regarding test data access and initialization are covered in the tutorial notebook getting_started_setup.ipynb.
You should see the data we downloaded in the previous tutorial.
[2]:
import os
dataloc = './data/testdata-minimal/'
os.listdir(dataloc)
pya.const.add_data_search_dir(dataloc + 'modeldata')
pya.const.DATA_SEARCH_DIRS
[2]:
['/lustre/storeB/project/aerocom/aerocom1/',
'/lustre/storeB/project/aerocom/aerocom2/',
'/lustre/storeB/project/aerocom/aerocom-users-database/CMIP6',
'/lustre/storeB/project/aerocom/aerocom-users-database/DOMOS',
'/lustre/storeB/project/aerocom/aerocom-users-database/C3S-Aerosol',
'/lustre/storeB/project/aerocom/aerocom-users-database/ECLIPSE',
'/lustre/storeB/project/aerocom/aerocom-users-database/SATELLITE-DATA/',
'/lustre/storeB/project/aerocom/aerocom-users-database/CCI-Aerosol/CCI_AEROSOL_Phase2/',
'/lustre/storeB/project/aerocom/aerocom-users-database/ACCMIP/',
'/lustre/storeB/project/aerocom/aerocom-users-database/ECMWF/',
'/lustre/storeB/project/aerocom/aerocom2/EMEP_COPERNICUS/',
'/lustre/storeB/project/aerocom/aerocom2/EMEP/',
'/lustre/storeB/project/aerocom/aerocom2/EMEP_GLOBAL/',
'/lustre/storeB/project/aerocom/aerocom2/EMEP_SVN_TEST/',
'/lustre/storeB/project/aerocom/aerocom2/NorESM_SVN_TEST/',
'/lustre/storeB/project/aerocom/aerocom2/INCA/',
'/lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/',
'/lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-II/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-I/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III-2019/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III-Trend/',
'/lustre/storeB/project/aerocom/aerocom-users-database/CCI-Aerosol/CCI_AEROSOL_Phase1/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II-IND3/',
'/lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II-IND2/',
'/lustre/storeB/project/fou/kl/CAMS61/',
'/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/PYAEROCOM/',
'/lustre/storeB/project/fou/kl/CAMS2_40/task4041/',
'/lustre/storeB/project/aerocom/aerocom-users-database/DOMOS/',
'./data/testdata-minimal/modeldata']
Model data: Reading of and working with gridded data
This section provides an introduction to the following pyaerocom classes and architectures:
`pyaerocom.io.ReadGridded <https://pyaerocom.met.no/api.html#module-pyaerocom.io.readgridded>`__
`pyaerocom.GriddedData <https://pyaerocom.met.no/api.html#module-pyaerocom.griddeddata>`__
You may click the links to see the online documentation of these classes.
Pre-remark on the ReadGridded class
As you could see in the tutorial getting_started_setup.ipynb, the ReadGridded class makes extensive use of the AeroCom file naming conventions. If your model data is stored using different conventions (e.g. CMIP6), this class will not (yet) be of much help for finding the correct files to read. In that case you may locate a model NetCDF file yourself and pass it directly into a GriddedData object on initialisation.
The test dataset contains data from the TM5 model, which is used in the following. You can use the browse_database function of pyaerocom to find model IDs (which can sometimes be quite cryptic) using wildcard pattern search.
[3]:
from pyaerocom.io.utils import browse_database
browse_database('*TM5*')
Pyaerocom ReadGridded
---------------------
Data ID: TM5JRCCY2IPCCV1_SR6SA
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5JRCCY2IPCCV1_SR6SA/renamed
Available experiments: ['SR6SA']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['MMR_BCSR6SA', 'MMR_NO3SR6SA', 'MMR_POMSR6SA', 'MMR_SO4SR6SA']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-JRC-cy2-ipcc-v1_SR1
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5-JRC-cy2-ipcc-v1_SR1/renamed
Available experiments: ['SR1']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['vmro3']
Pyaerocom ReadGridded
---------------------
Data ID: TM5JRCCY2IPCCV1_SR6EU
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5JRCCY2IPCCV1_SR6EU/renamed
Available experiments: ['SR6EU']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['MMR_BCSR6EU', 'MMR_NO3SR6EU', 'MMR_POMSR6EU', 'MMR_SO4SR6EU']
Pyaerocom ReadGridded
---------------------
Data ID: TM5JRCCY2IPCCV1_SR1
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5JRCCY2IPCCV1_SR1/renamed
Available experiments: ['SR1']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['SCONCBC', 'SCONCNO3', 'SCONCPM25', 'SCONCPOM', 'SCONCSO4']
Pyaerocom ReadGridded
---------------------
Data ID: TM5JRCCY2IPCCV1_SR6NA
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5JRCCY2IPCCV1_SR6NA/renamed
Available experiments: ['SR6NA']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['MMR_BCSR6NA', 'MMR_NO3SR6NA', 'MMR_POMSR6NA', 'MMR_SO4SR6NA']
Pyaerocom ReadGridded
---------------------
Data ID: TM5JRCCY2IPCCV1_SR6EA
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/HTAP-PHASE-I/TM5JRCCY2IPCCV1_SR6EA/renamed
Available experiments: ['SR6EA']
Available years: [2001]
Available frequencies ['monthly']
Available variables: ['MMR_BCSR6EA', 'MMR_NO3SR6EA', 'MMR_POMSR6EA', 'MMR_SO4SR6EA']
Reading failed for TM5_B. Error: AttributeError("'NoneType' object has no attribute 'experiment'")
Pyaerocom ReadGridded
---------------------
Data ID: TM5-V3.A2.PRE
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II/TM5-V3.A2.PRE/renamed
Available experiments: ['']
Available years: [1850]
Available frequencies ['daily' 'monthly']
Available variables: ['abs550aer', 'abs550dryaer', 'airmass', 'asyaer', 'clt', 'drybc', 'drydms', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'drynoy', 'dryoa', 'dryso2', 'dryso4', 'dryss', 'ec550aer', 'ec550dryaer', 'emibc', 'emidms', 'emidust', 'eminh3', 'eminox', 'emioa', 'emiso2', 'emiso4', 'emiss', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'precip', 'pressure', 'ps', 'rsds', 'rsdscs', 'rsdscsdif', 'rsdscsvis', 'rsdt', 'rsus', 'rsut', 'rsutcs', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'temp', 'vmrdms', 'vmrhno3', 'vmrno', 'vmrno2', 'vmrpan', 'vmrso2', 'wet3Dbc', 'wet3Ddu', 'wet3Dhno3', 'wet3Dnh4', 'wet3Dnoy', 'wet3Doa', 'wet3Dso2', 'wet3Dso4', 'wet3Dss', 'wetbc', 'wetdms', 'wetdust', 'wethno3', 'wetnh4', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'ang4487aer', 'od550gt1aer', 'fmf550aer', 'concNhno3', 'wetdu']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-V3.A2.HCA-0
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II/TM5-V3.A2.HCA-0/renamed
Available experiments: ['']
Available years: [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009]
Available frequencies ['daily' 'monthly']
Available variables: ['abs550aer', 'abs550dryaer', 'airmass', 'asyaer', 'drydms', 'drydust', 'dryso2', 'dryso4', 'dryss', 'ec550aer', 'ec550dryaer', 'emibc', 'emidms', 'emidust', 'emioa', 'emiso2', 'emiso4', 'emiss', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'pmid3d', 'precip', 'pressure', 'ps', 'scatc550dryaer', 'sconcbc', 'sconcdust', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'temp', 'wetbc', 'wetdms', 'wetdust', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'ang4487aer', 'od550gt1aer', 'fmf550aer', 'pmid']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-V3.A2.CTRL
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II/TM5-V3.A2.CTRL/renamed
Available experiments: ['']
Available years: [2006]
Available frequencies ['daily' 'monthly' 'hourly']
Available variables: ['abs550aer', 'abs550dry1Daer', 'abs550dryaer', 'airmass', 'ang4487aer', 'asyaer', 'conccn1Dmode01', 'conccn1Dmode02', 'conccn1Dmode03', 'conccn1Dmode04', 'conccn1Dmode05', 'conccn1Dmode06', 'conccn1Dmode07', 'conccnmode01', 'conccnmode02', 'conccnmode03', 'conccnmode04', 'conccnmode05', 'conccnmode06', 'conccnmode07', 'drybc', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'drynoy', 'dryoa', 'dryso2', 'dryso4', 'dryss', 'ec550aer', 'ec550dry1Daer', 'ec550dryaer', 'emibc', 'emidms', 'emidust', 'eminh3', 'eminox', 'emioa', 'emiso2', 'emiso4', 'emiss', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'mmr1Daerh2o', 'mmr1Dtr01', 'mmr1Dtr02', 'mmr1Dtr03', 'mmr1Dtr04', 'mmr1Dtr05', 'mmr1Dtr06', 'mmr1Dtr07', 'mmr1Dtr08', 'mmr1Dtr09', 'mmr1Dtr10', 'mmr1Dtr11', 'mmr1Dtr12', 'mmr1Dtr13', 'mmr1Dtr14', 'mmr1Dtr15', 'mmr1Dtr16', 'mmr1Dtr17', 'mmr1Dtr18', 'mmr1Dtr19', 'mmraerh2o', 'mmrbc', 'mmrdu', 'mmrno3', 'mmroa', 'mmrso4', 'mmrss', 'mmrtr01', 'mmrtr02', 'mmrtr03', 'mmrtr04', 'mmrtr05', 'mmrtr06', 'mmrtr07', 'mmrtr08', 'mmrtr09', 'mmrtr10', 'mmrtr11', 'mmrtr12', 'mmrtr13', 'mmrtr14', 'mmrtr15', 'mmrtr16', 'mmrtr17', 'mmrtr18', 'mmrtr19', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'pmid3d', 'precip', 'pressure', 'ps', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'temp', 'vmrdms', 'vmrhno3', 'vmrno', 'vmrno2', 'vmrpan', 'vmrso2', 'wet3Dbc', 'wet3Ddu', 'wet3Dhno3', 'wet3Dnh4', 'wet3Dnoy', 'wet3Doa', 'wet3Dso2', 'wet3Dso4', 'wet3Dss', 'wetbc', 'wetdust', 'wethno3', 'wetnh4', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'od550gt1aer', 'fmf550aer', 'concNhno3', 'pmid', 'wetdu']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-V3.A2.HCA-IPCC
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-II/TM5-V3.A2.HCA-IPCC/renamed
Available experiments: ['']
Available years: [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009]
Available frequencies ['daily' 'monthly' 'hourly']
Available variables: ['abs550aer', 'abs550dry1Daer', 'abs550dryaer', 'airmass', 'asyaer', 'clt', 'conccn1Dmode01', 'conccn1Dmode02', 'conccn1Dmode03', 'conccn1Dmode04', 'conccn1Dmode05', 'conccn1Dmode06', 'conccn1Dmode07', 'conccnmode01', 'conccnmode02', 'conccnmode03', 'conccnmode04', 'conccnmode05', 'conccnmode06', 'conccnmode07', 'drybc', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'drynoy', 'dryoa', 'dryso2', 'dryso4', 'dryss', 'ec550aer', 'ec550dry1Daer', 'ec550dryaer', 'emibc', 'emidms', 'emidust', 'eminh3', 'eminox', 'emioa', 'emiso2', 'emiso4', 'emiss', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'mmr1Daerh2o', 'mmr1Dtr01', 'mmr1Dtr02', 'mmr1Dtr03', 'mmr1Dtr04', 'mmr1Dtr05', 'mmr1Dtr06', 'mmr1Dtr07', 'mmr1Dtr08', 'mmr1Dtr09', 'mmr1Dtr10', 'mmr1Dtr11', 'mmr1Dtr12', 'mmr1Dtr13', 'mmr1Dtr14', 'mmr1Dtr15', 'mmr1Dtr16', 'mmr1Dtr17', 'mmr1Dtr18', 'mmr1Dtr19', 'mmraerh2o', 'mmrbc', 'mmrdu', 'mmrno3', 'mmroa', 'mmrso4', 'mmrss', 'mmrtr01', 'mmrtr02', 'mmrtr03', 'mmrtr04', 'mmrtr05', 'mmrtr06', 'mmrtr07', 'mmrtr08', 'mmrtr09', 'mmrtr10', 'mmrtr11', 'mmrtr12', 'mmrtr13', 'mmrtr14', 'mmrtr15', 'mmrtr16', 'mmrtr17', 'mmrtr18', 'mmrtr19', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'pmid3d', 'precip', 'pressure', 'ps', 'rsds', 'rsdscs', 'rsdscsdif', 'rsdscsvis', 'rsdt', 'rsus', 'rsut', 'rsutcs', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'temp', 'vmrdms', 'vmrhno3', 'vmrno', 'vmrno2', 'vmrpan', 'vmrso2', 'wet3Dbc', 'wet3Ddu', 'wet3Dhno3', 'wet3Dnh4', 'wet3Dnoy', 'wet3Doa', 'wet3Dso2', 'wet3Dso4', 'wet3Dss', 'wetbc', 'wetdust', 'wethno3', 'wetnh4', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'ang4487aer', 'od550gt1aer', 'fmf550aer', 'concNhno3', 'pmid', 'wetdu']
Pyaerocom ReadGridded
---------------------
Data ID: TM5_AP3-CTRL2016
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III/TM5_AP3-CTRL2016/renamed
Available experiments: ['AP3']
Available years: [2006, 2008, 2010]
Available frequencies ['monthly' '3hourly']
Available variables: ['abs350aer', 'abs440aer', 'abs440dryaer', 'abs550aer', 'abs550dryaer', 'abs550drylt1aer', 'abs870aer', 'abs870dryaer', 'airmass', 'asyaer', 'asydryaer', 'deltaz3d', 'depbc', 'depdms', 'depdust', 'dephno3', 'depmsa', 'depn', 'depnh3', 'depnh4', 'depnhx', 'depno2', 'depno3', 'depnoy', 'depo3', 'depoa', 'deps', 'depso2', 'depso4', 'depss', 'dh', 'drybc', 'drydms', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'dryno3', 'drynoy', 'dryo3', 'dryoa', 'dryso2', 'dryso4', 'dryss', 'ec440dryaer', 'ec550aer', 'ec550dryaer', 'ec550drylt1aer', 'ec870dryaer', 'emibc', 'emico', 'emidms', 'emidust', 'emiisop', 'emin', 'eminh3', 'eminox', 'emioa', 'emis', 'emiso2', 'emiso4', 'emiss', 'emiterp', 'humidity3d', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'od350aer', 'od440aer', 'od550aer', 'od550aer3d', 'od550aerh2o', 'od550bc', 'od550dryaer', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550lt1ss', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'pr', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcnh4', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'ta', 'temp', 'vmrch4', 'vmrco', 'vmrno', 'vmrno2', 'vmro3', 'vmroh', 'wetbc', 'wetdms', 'wetdust', 'wethno3', 'wetmsa', 'wetnh3', 'wetnh4', 'wetno3', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'ang4487aer', 'angabs4487aer', 'od550gt1aer', 'vmrox', 'fmf550aer', 'concNnh4', 'deltaz', 'humidity']
Pyaerocom ReadGridded
---------------------
Data ID: TM5_AP3-INSITU
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III/TM5_AP3-INSITU/renamed
Available experiments: ['AP3']
Available years: [2010]
Available frequencies ['monthly' 'daily' 'hourly']
Available variables: ['abs350aer', 'abs440aer', 'abs440dryaer', 'abs550aer', 'abs550dryaer', 'abs550drylt1aer', 'abs870aer', 'abs870dryaer', 'airmass', 'asyaer', 'asydryaer', 'depbc', 'depdms', 'depdust', 'dephno3', 'depmsa', 'depn', 'depnh3', 'depnh4', 'depnhx', 'depno2', 'depno3', 'depnoy', 'depo3', 'depoa', 'deps', 'depso2', 'depso4', 'depss', 'dh', 'drybc', 'drydms', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'dryno3', 'drynoy', 'dryo3', 'dryoa', 'dryso2', 'dryso4', 'dryss', 'ec440dryaer', 'ec550aer', 'ec550dryaer', 'ec550drylt1aer', 'ec870dryaer', 'emibc', 'emico', 'emidms', 'emidust', 'emiisop', 'emin', 'eminh3', 'eminox', 'emioa', 'emis', 'emiso2', 'emiso4', 'emiss', 'emiterp', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'mmrbc', 'mmrdust', 'mmrmsa', 'mmrnh4', 'mmrno3', 'mmroa', 'mmrso4', 'mmrss', 'od350aer', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550lt1ss', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'od870aer', 'plev', 'pr', 'precip', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'ta', 'temp', 'vmrch4', 'vmrco', 'vmrno', 'vmrno2', 'vmro3', 'vmroh', 'wetbc', 'wetdms', 'wetdust', 'wethno3', 'wetmsa', 'wetnh3', 'wetnh4', 'wetno3', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetss', 'ang4487aer', 'angabs4487aer', 'od550gt1aer', 'vmrox', 'fmf550aer']
Pyaerocom ReadGridded
---------------------
Data ID: TM5_AP3-CTRL2015
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III/TM5_AP3-CTRL2015/renamed
Available experiments: ['AP3']
Available years: [2010]
Available frequencies ['monthly']
Available variables: ['depbc', 'depdust', 'depno3', 'depoa', 'depso4', 'depss', 'drybc', 'drydust', 'dryno3', 'dryoa', 'dryso4', 'dryss', 'emibc', 'emidms', 'emidust', 'eminox', 'emioa', 'emiso2', 'emiso4', 'emiss', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadss', 'od550aer', 'od550bc', 'od550dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'sconcbc', 'sconcdust', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcss', 'wetbc', 'wetdust', 'wetno3', 'wetoa', 'wetso4', 'wetss']
Pyaerocom ReadGridded
---------------------
Data ID: TM5_AP3-INSITU-TIER3
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III/TM5_AP3-INSITU-TIER3/renamed
Available experiments: ['AP3']
Available years: [2010]
Available frequencies ['hourly']
Available variables: ['abs440dryaer', 'abs550aer', 'abs550dryaer', 'abs550drylt1aer', 'abs550rh40aer', 'abs550rh55aer', 'abs550rh65aer', 'abs550rh75aer', 'abs550rh85aer', 'abs870dryaer', 'airmass', 'asydryaer', 'dh', 'ec440dryaer', 'ec550aer', 'ec550aerh2o', 'ec550bc', 'ec550dryaer', 'ec550drylt1aer', 'ec550dust', 'ec550no3', 'ec550oa', 'ec550rh40aer', 'ec550rh55aer', 'ec550rh65aer', 'ec550rh75aer', 'ec550rh85aer', 'ec550so4', 'ec550ss', 'ec870dryaer', 'hus', 'mmrbc', 'mmrdust', 'mmrmsa', 'mmrnh4', 'mmrno3', 'mmroa', 'mmrso4', 'mmrss', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550no3', 'od550oa', 'od550so4', 'od550ss', 'plev', 'pr', 'ta']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-met2010_AP3-CTRL2019
Data directory: /lustre/storeB/project/aerocom/aerocom-users-database/AEROCOM-PHASE-III-2019/TM5-met2010_AP3-CTRL2019/renamed
Available experiments: ['AP3']
Available years: [1850, 2010]
Available frequencies ['monthly' 'daily']
Available variables: ['abs350aer', 'abs440aer', 'abs440dryaer', 'abs550aer', 'abs550dryaer', 'abs550drylt1aer', 'abs870aer', 'abs870dryaer', 'airmass', 'asyaer', 'asydryaer', 'depbc', 'depdms', 'depdust', 'dephno3', 'depmsa', 'depn', 'depnh3', 'depnh4', 'depnhx', 'depno2', 'depno3', 'depnoy', 'depo3', 'depoa', 'deps', 'depso2', 'depso4', 'depsoa', 'depss', 'dh', 'drybc', 'drydms', 'drydust', 'dryhno3', 'drynh3', 'dryno2', 'dryno3', 'drynoy', 'dryo3', 'dryoa', 'dryso2', 'dryso4', 'drysoa', 'dryss', 'ec440dryaer', 'ec550aer', 'ec550dryaer', 'ec550drylt1aer', 'ec870dryaer', 'emibc', 'emico', 'emidms', 'emidust', 'emiisop', 'emin', 'eminh3', 'eminox', 'emis', 'emiso2', 'emiso4', 'emiss', 'emiterp', 'emivoc', 'hus', 'loadbc', 'loaddust', 'loadno3', 'loadoa', 'loadso4', 'loadsoa', 'loadss', 'od350aer', 'od440aer', 'od550aer', 'od550aerh2o', 'od550bc', 'od550dust', 'od550lt1aer', 'od550lt1dust', 'od550lt1ss', 'od550no3', 'od550oa', 'od550so4', 'od550soa', 'od550ss', 'od870aer', 'pr', 'sconcbc', 'sconcdust', 'sconcmsa', 'sconcnh4', 'sconcno3', 'sconcoa', 'sconcso4', 'sconcsoa', 'sconcss', 'ta', 'temp', 'vmrch4', 'vmrco', 'vmrno', 'vmrno2', 'vmro3', 'vmroh', 'wetbc', 'wetdms', 'wetdust', 'wethno3', 'wetmsa', 'wetnh3', 'wetnh4', 'wetno3', 'wetnoy', 'wetoa', 'wetso2', 'wetso4', 'wetsoa', 'wetss', 'ang4487aer', 'angabs4487aer', 'od550gt1aer', 'vmrox', 'fmf550aer', 'concNnh4']
Pyaerocom ReadGridded
---------------------
Data ID: TM5-met2010_CTRL-TEST
Data directory: data/testdata-minimal/modeldata/TM5-met2010_CTRL-TEST/renamed
Available experiments: ['AP3']
Available years: [2010, 9999]
Available frequencies ['daily' 'monthly']
Available variables: ['abs550aer', 'od550aer']
[3]:
['TM5JRCCY2IPCCV1_SR6SA',
'TM5-JRC-cy2-ipcc-v1_SR1',
'TM5JRCCY2IPCCV1_SR6EU',
'TM5JRCCY2IPCCV1_SR1',
'TM5JRCCY2IPCCV1_SR6NA',
'TM5JRCCY2IPCCV1_SR6EA',
'TM5_B',
'TM5-V3.A2.PRE',
'TM5-V3.A2.HCA-0',
'TM5-V3.A2.CTRL',
'TM5-V3.A2.HCA-IPCC',
'TM5_AP3-CTRL2016',
'TM5_AP3-INSITU',
'TM5_AP3-CTRL2015',
'TM5_AP3-INSITU-TIER3',
'TM5-met2010_AP3-CTRL2019',
'TM5-met2010_CTRL-TEST']
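The wildcard search used above can be illustrated with shell-style pattern matching from the Python standard library. This is only a sketch of the idea: how browse_database matches patterns internally is not shown in this tutorial, so the use of fnmatch and the helper name match_ids are assumptions made for illustration. The data IDs are copied from the output above.

```python
from fnmatch import fnmatchcase

# A few of the data IDs listed in the output of browse_database above.
data_ids = [
    "TM5-V3.A2.CTRL",
    "TM5_AP3-CTRL2016",
    "TM5-met2010_AP3-CTRL2019",
    "TM5-met2010_CTRL-TEST",
]


def match_ids(pattern, ids):
    """Return all IDs matching a shell-style wildcard pattern (case-sensitive).

    Hypothetical helper for illustration, not part of pyaerocom.
    """
    return [data_id for data_id in ids if fnmatchcase(data_id, pattern)]


met2010_ids = match_ids("TM5-met2010*", data_ids)
test_ids = match_ids("*TEST", data_ids)
print(met2010_ids)
print(test_ids)
```

A narrower pattern such as `'*TEST'` is a convenient way to single out the local test dataset among the many TM5 entries.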
[4]:
model_id = 'TM5-met2010_CTRL-TEST'
reader = pya.io.ReadGridded(model_id)
You can have a look at the individual files and corresponding metadata using the file_info attribute:
[5]:
reader.file_info
[5]:
| | var_name | year | ts_type | vert_code | data_id | name | meteo | experiment | perturbation | is_at_stations | 3D | filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | abs550aer | 2010 | daily | Column | TM5-met2010_CTRL-TEST | TM5 | met2010 | AP3 | CTRL2019 | False | False | aerocom3_TM5-met2010_AP3-CTRL2019_abs550aer_Co... |
| 3 | abs550aer | 2010 | monthly | Column | TM5-met2010_CTRL-TEST | TM5 | met2010 | AP3 | CTRL2019 | False | False | aerocom3_TM5-met2010_AP3-CTRL2019_abs550aer_Co... |
| 2 | abs550aer | 9999 | daily | Column | TM5-met2010_CTRL-TEST | TM5 | met2010 | AP3 | CTRL2019 | False | False | aerocom3_TM5-met2010_AP3-CTRL2019_abs550aer_Co... |
| 0 | od550aer | 2010 | daily | Column | TM5-met2010_CTRL-TEST | TM5 | met2010 | AP3 | CTRL2019 | False | False | aerocom3_TM5-met2010_AP3-CTRL2019_od550aer_Col... |
| 1 | od550aer | 2010 | monthly | Column | TM5-met2010_CTRL-TEST | TM5 | | AP3 | CTRL2016 | False | False | aerocom3_TM5_AP3-CTRL2016_od550aer_Column_2010... |
You can also filter this attribute based on what you are interested in. E.g.:
[6]:
files = reader.filter_files(var_name='od550aer')
files
[6]:
| | var_name | year | ts_type | vert_code | data_id | name | meteo | experiment | perturbation | is_at_stations | 3D | filename |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | od550aer | 2010 | daily | Column | TM5-met2010_CTRL-TEST | TM5 | met2010 | AP3 | CTRL2019 | False | False | aerocom3_TM5-met2010_AP3-CTRL2019_od550aer_Col... |
| 1 | od550aer | 2010 | monthly | Column | TM5-met2010_CTRL-TEST | TM5 | | AP3 | CTRL2016 | False | False | aerocom3_TM5_AP3-CTRL2016_od550aer_Column_2010... |
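To make the link between file names and the metadata columns more concrete, the following sketch splits two od550aer file names of the test dataset into fields and then filters on one of them. The assumed layout is inferred from the table above, not taken from the pyaerocom source, and the function parse_aerocom3_filename is a hypothetical helper, not a pyaerocom API.

```python
from pathlib import Path


def parse_aerocom3_filename(filename):
    """Split an aerocom3-style file name into metadata fields.

    Assumed layout (inferred from the file_info table, not from pyaerocom):
    aerocom3_<name>[-<meteo>]_<experiment>-<perturbation>_<var>_<vert>_<year>_<ts_type>.nc
    """
    parts = Path(filename).stem.split("_")
    if len(parts) != 7 or parts[0] != "aerocom3":
        raise ValueError(f"unexpected file name layout: {filename}")
    _, model, exp_pert, var_name, vert_code, year, ts_type = parts
    # The meteo part is optional; partition() yields '' if there is no '-'.
    name, _, meteo = model.partition("-")
    experiment, _, perturbation = exp_pert.partition("-")
    return {
        "var_name": var_name,
        "year": int(year),
        "ts_type": ts_type,
        "vert_code": vert_code,
        "name": name,
        "meteo": meteo,
        "experiment": experiment,
        "perturbation": perturbation,
    }


# The two od550aer files of the minimal test dataset.
filenames = [
    "aerocom3_TM5-met2010_AP3-CTRL2019_od550aer_Column_2010_daily.nc",
    "aerocom3_TM5_AP3-CTRL2016_od550aer_Column_2010_monthly.nc",
]
infos = [parse_aerocom3_filename(f) for f in filenames]

# Filtering then reduces to a simple comparison on one of the fields.
monthly = [info for info in infos if info["ts_type"] == "monthly"]
print(monthly)
```

This simple split would not cope with model names that themselves contain hyphens or underscores, which is one reason to rely on ReadGridded for real data.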
[7]:
od550aer = reader.read_var('od550aer')
[8]:
od550aer.quickplot_map();
Oops, this looks rather incomplete. The reason is that pyaerocom picked the available daily dataset, which is cropped in the minimal test dataset to save storage space. Let’s try monthly.
[9]:
od550aer_tm5 = reader.read_var('od550aer', ts_type='monthly')
od550aer_tm5.quickplot_map();
Looking better. You may wonder why only January is displayed here. This is because quickplot_map picks the first available timestamp in the dataset unless you specify one explicitly.
Under the hood, pyaerocom.GriddedData is based on the iris.cube.Cube class (iris library) and features very similar functionality (and more).
The loaded Cube instance can be accessed via:
[10]:
od550aer_tm5.cube
[10]:
Atmosphere Optical Thickness Due To Ambient Aerosol (1) | time | latitude | longitude |
---|---|---|---|
Shape | 12 | 90 | 120 |
Dimension coordinates | |||
time | x | - | - |
latitude | - | x | - |
longitude | - | - | x |
Cell methods | |||
0 | longitude: latitude: point | ||
1 | time: mean | ||
Attributes | |||
Conventions | 'CF-1.6' | ||
computed | False | ||
concatenated | False | ||
contact | 'Twan van Noije (noije@knmi.nl)' | ||
data_id | 'TM5-met2010_CTRL-TEST' | ||
experiment | 'AP3' | ||
experiment_id | 'AP3-CTRL2016' | ||
from_files | ['data/testdata-minimal/modeldata/TM5-met2010_CTRL-TEST/renamed/aerocom3_TM5_AP3-CTRL2016_od550aer_Column_2010_monthly.nc'] | ||
institute_id | 'KNMI' | ||
institution | 'Royal Netherlands Meteorological Institute, De Bilt, The Netherlands' | ||
meteo | '' | ||
model_id | 'TM5' | ||
outliers_removed | False | ||
perturbation | 'CTRL2016' | ||
proj_info | None | ||
project_id | 'AeroCom Phase 3' | ||
reader | None | ||
references | 'Van Noije, T.P.C., et al. (Geosci. Model Dev., 7, 2435-2475, 2014); Van ...' | ||
region | None | ||
regridded | False | ||
source | 'TM5-mp: CTM ERA-Interim 3x2 34L' | ||
title | 'TM5 model output prepared for AeroCom Phase 3' | ||
ts_type | 'monthly' | ||
var_name_read | 'undefined' | ||
vert_code | 'Column' |
If you have not heard of xarray, you should check it out. If you have heard of it (or maybe even used it already) you may convert a GriddedData object to an xarray.DataArray via:
[11]:
xarr = od550aer_tm5.to_xarray()
xarr
[11]:
<xarray.DataArray 'od550aer' (time: 12, lat: 90, lon: 120)> Size: 518kB
dask.array<filled, shape=(12, 90, 120), dtype=float32, chunksize=(12, 90, 60), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) object 96B 2010-01-15 12:00:00 ... 2010-12-15 12:00:00
  * lat      (lat) float64 720B -89.0 -87.0 -85.0 -83.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float64 960B -178.5 -175.5 -172.5 ... 172.5 175.5 178.5
Attributes: (12/25)
    standard_name:  atmosphere_optical_thickness_due_to_ambient_aerosol
    long_name:      Ambient Aerosol Optical Thickness at 550 nm
    institution:    Royal Netherlands Meteorological Institute, De Bilt, T...
    institute_id:   KNMI
    source:         TM5-mp: CTM ERA-Interim 3x2 34L
    model_id:       TM5
    ...             ...
    computed:       False
    concatenated:   False
    meteo:
    experiment:     AP3
    perturbation:   CTRL2016
    cell_methods:   longitude: latitude: point time: mean
To get an overview of what is in a GriddedData object, simply print it.
[12]:
print(od550aer)
pyaerocom.GriddedData: (od550aer, TM5-met2010_CTRL-TEST)
atmosphere_optical_thickness_due_to_ambient_aerosol / (1) (time: 365; latitude: 11; longitude: 11)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Cell methods:
0 longitude: latitude: point
1 time: mean
Attributes:
Conventions 'CF-1.6'
NCO '4.7.2'
computed False
concatenated False
contact 'Twan van Noije (noije@knmi.nl)'
data_id 'TM5-met2010_CTRL-TEST'
experiment 'AP3'
experiment_id 'AP3-CTRL2019'
from_files ['data/testdata-minimal/modeldata/TM5-met2010_CTRL-TEST/renamed/aerocom3_TM5-met2010_AP3-CTRL2019_od550aer_Column_2010_daily.nc']
history 'Wed Jul 8 10:31:53 2020: ncks -d lat,20,30 -d lon,20,30 raw/aerocom3_TM5-met2010_AP3-CTRL2019_od550aer_Column_2010_daily.nc ...'
institute_id 'KNMI'
institution 'Royal Netherlands Meteorological Institute, De Bilt, The Netherlands'
meteo 'met2010'
model_id 'TM5'
outliers_removed False
perturbation 'CTRL2019'
proj_info None
project_id 'AeroCom Phase 3'
reader None
references 'Van Noije, T.P.C., et al. (Geosci. Model Dev., 7, 2435-2475, 2014); Bergman, ...'
region None
regridded False
source 'TM5-mp, r1058: CTM ERA-Interim 3x2 34L'
timedim-corrected True
title 'TM5 model output prepared for AeroCom Phase 3'
ts_type 'daily'
var_name_read 'undefined'
vert_code 'Column'
Dimension coordinates can be accessed using either the [] or the . operator, e.g.:
[13]:
od550aer['latitude']
[13]:
<DimCoord: latitude / (degrees) [-49., -47., ..., -31., -29.]+bounds shape(11,)>
[14]:
od550aer.longitude
[14]:
<DimCoord: longitude / (degrees) [61.5, 64.5, ..., 88.5, 91.5]+bounds shape(11,)>
They are instances of iris.coords.DimCoord, as defined in the underlying Cube instance used in the GriddedData object.
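Supporting both access styles is a common Python pattern. The following minimal sketch shows how a container can expose the same objects via [] and via attribute access. It is not how GriddedData is implemented; the class CoordContainer is hypothetical, and plain lists stand in for the DimCoord objects. The coordinate values reproduce the ones shown in the outputs above.

```python
class CoordContainer:
    """Hypothetical container exposing coordinates via [] and . access."""

    def __init__(self, **coords):
        self._coords = dict(coords)

    def __getitem__(self, name):
        # Bracket access: container['latitude']
        return self._coords[name]

    def __getattr__(self, name):
        # Called only if normal attribute lookup fails: container.latitude
        try:
            return self.__dict__["_coords"][name]
        except KeyError:
            raise AttributeError(name) from None


# Cell centres of the cropped daily test data (11 x 11 grid points).
grid = CoordContainer(
    latitude=[-49.0 + 2.0 * i for i in range(11)],
    longitude=[61.5 + 3.0 * i for i in range(11)],
)
print(grid["latitude"][0], grid.latitude[-1])
print(grid.longitude[0], grid["longitude"][-1])
```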
Time stamps
Time stamps are represented as numerical values with respect to a reference date and frequency, according to the CF conventions. They can be accessed via the time attribute of the data class (if the data contains a time dimension).
[15]:
od550aer_tm5.time
[15]:
<DimCoord: time / (days since 1850-01-01 00:00) [2010-01-15 12:00:00, ...] shape(12,)>
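The numerical representation can be reproduced with the standard library. The sketch below converts between a datetime and "days since 1850-01-01 00:00", the units shown in the output above. It assumes the standard (Gregorian) calendar; CF also allows other calendars (e.g. 360-day or no-leap), which this sketch does not handle. The function names are chosen for illustration only.

```python
from datetime import datetime, timedelta

# Reference date taken from the units "days since 1850-01-01 00:00".
REFERENCE = datetime(1850, 1, 1)


def datetime_to_days(stamp):
    """Number of (fractional) days between the reference date and stamp."""
    return (stamp - REFERENCE) / timedelta(days=1)


def days_to_datetime(days):
    """Inverse conversion: numerical value back to a datetime."""
    return REFERENCE + timedelta(days=days)


first = datetime(2010, 1, 15, 12)  # first monthly time stamp shown above
first_days = datetime_to_days(first)
print(first_days)
print(days_to_datetime(first_days))
```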
You may also want the time stamps in the form of actual datetime-like objects. These can be computed using the time_stamps() method:
[16]:
od550aer.time_stamps()[0:3]
[16]:
array(['2010-01-01T00:00:00.000000', '2010-01-02T00:00:00.000000',
'2010-01-03T00:00:00.000000'], dtype='datetime64[us]')
As introduced above, maps of individual time stamps can be plotted using the quickplot_map method. Above we used the default call, which chooses the first available timestamp. You may also specify which date you are interested in:
[17]:
od550aer_tm5.quickplot_map('2010-10');
If you want more control over the input parameters of the map plotting function (e.g. color binning, lower and upper limits, colorbar, etc.), you may use the underlying plot function (which is also used by GriddedData.quickplot_map), available as `pya.plot.mapping.plot_griddeddata_on_map <https://pyaerocom.met.no/api.html#pyaerocom.plot.mapping.plot_griddeddata_on_map>`__, e.g.:
[18]:
pya.plot.mapping.plot_griddeddata_on_map(od550aer_tm5[1], xlim=(-60, 60), ylim=(-30, 30), vmin=0, vmax=0.4, log_scale=False);
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/pyaerocom/mathutils.py:232: RuntimeWarning: divide by zero encountered in log10
return np.floor(np.log10(abs(np.asarray(num)))).astype(int)
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/pyaerocom/mathutils.py:232: RuntimeWarning: invalid value encountered in cast
return np.floor(np.log10(abs(np.asarray(num)))).astype(int)
Regional filtering can be performed using the Filter class.
Rectangular regions
An overview of rectangular AeroCom default regions can be accessed via:
[19]:
print(pya.const.OLD_AEROCOM_REGIONS)
['ALL', 'ASIA', 'AUSTRALIA', 'CHINA', 'EUROPE', 'INDIA', 'NAFRICA', 'SAFRICA', 'SAMERICA', 'NAMERICA']
Let’s choose North Africa as an example and create an instance of the Filter class:
[20]:
f = pya.Filter('NAFRICA')
You can print its region attribute to see the edges:
[21]:
print(f.region)
pyaeorocom Region
Name: NAFRICA
Longitude range: [-17, 50]
Latitude range: [0, 40]
Longitude range (plots): [-17, 50]
Latitude range (plots): [0, 40]
Now apply to the model data object:
[22]:
od550aer_nafrica = f(od550aer_tm5)
Compare shapes:
[23]:
od550aer_nafrica
[23]:
pyaerocom.GriddedData: (od550aer, TM5-met2010_CTRL-TEST)
<iris 'Cube' of atmosphere_optical_thickness_due_to_ambient_aerosol / (1) (time: 12; latitude: 22; longitude: 23)>
[24]:
od550aer_tm5
[24]:
pyaerocom.GriddedData: (od550aer, TM5-met2010_CTRL-TEST)
<iris 'Cube' of atmosphere_optical_thickness_due_to_ambient_aerosol / (1) (time: 12; latitude: 90; longitude: 120)>
As you can see, the filtered object is reduced in the longitude and latitude dimensions. Let’s have a look:
[25]:
od550aer_nafrica.quickplot_map('March 2010');
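The reduced shape can be understood with a small stand-alone sketch. It rebuilds the cell centres of the 2° x 3° TM5 grid (as listed in the xarray output above) and keeps every cell whose bounds overlap the NAFRICA ranges. Selecting by overlapping cell bounds is an assumption made here because it reproduces the 22 x 23 shape shown above; the actual cropping is done by the underlying library, and the helper names are hypothetical.

```python
def cell_centres(start, step, count):
    """Centres of a regular grid, e.g. -89, -87, ..., 89 for latitude."""
    return [start + step * i for i in range(count)]


def overlapping_cells(centres, step, lower, upper):
    """Keep cells whose bounds [c - step/2, c + step/2] touch [lower, upper]."""
    half = step / 2
    return [c for c in centres if c + half >= lower and c - half <= upper]


lats = cell_centres(-89.0, 2.0, 90)     # 90 latitude cells
lons = cell_centres(-178.5, 3.0, 120)   # 120 longitude cells

# NAFRICA ranges as printed by f.region above.
lat_sel = overlapping_cells(lats, 2.0, 0, 40)
lon_sel = overlapping_cells(lons, 3.0, -17, 50)
print(len(lat_sel), len(lon_sel))
```

Note that the selection includes the cells centred at -1° and 41° latitude, since their bounds touch the edges of the region.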
Binary region masks
Available HTAP binary filter masks can be accessed via:
[26]:
print(pya.const.HTAP_REGIONS)
['PAN', 'EAS', 'NAF', 'MDE', 'LAND', 'SAS', 'SPO', 'OCN', 'SEA', 'RBU', 'EEUROPE', 'NAM', 'WEUROPE', 'SAF', 'USA', 'SAM', 'EUR', 'NPO', 'MCA']
And they are handled in the same way as the rectangular regions:
[27]:
pya.Filter('OCN')(od550aer_tm5).quickplot_map();
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/pyaerocom/mathutils.py:232: RuntimeWarning: invalid value encountered in cast
return np.floor(np.log10(abs(np.asarray(num)))).astype(int)
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/pyaerocom/griddeddata.py:2430: RuntimeWarning: overflow encountered in scalar absolute
vstr = f"{mean:.{abs(exponent(mean)) + 1}f}"
As you can see, the provided HTAP region masks are only valid within 60°S to 60°N.
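Conceptually, a binary mask is a grid of zeros and ones with the same shape as the data, and applying it keeps values where the mask is 1 and blanks out the rest. The following toy sketch illustrates this with nested lists. The values and the helper apply_mask are made up for illustration and do not reflect the actual HTAP masks or the way pyaerocom applies them.

```python
import math


def apply_mask(data, mask):
    """Keep values where mask is 1; replace all other values by NaN."""
    return [
        [value if keep else math.nan for value, keep in zip(data_row, mask_row)]
        for data_row, mask_row in zip(data, mask)
    ]


# Toy 2 x 3 field and mask (hypothetical values, 1 = inside the region).
data = [[0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6]]
mask = [[1, 0, 1],
        [0, 0, 1]]

masked = apply_mask(data, mask)
print(masked)
```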
Filtering of time
Filtering of time is not included in the Filter class (which only allows for regional filtering) but can easily be performed directly on the GriddedData object. If you know the indices of the time stamps you want to crop, you can simply use numpy indexing syntax (remember that we have a 3D array containing time, latitude and longitude).
Let’s say we are interested in the (northern hemispheric) summer months of June to August.
Since the time dimension corresponds to the first index in the 3D data (time, lat, lon), and since we know that we have monthly 2010 data (see above), we may use:
[28]:
od550aer_summer = od550aer_tm5[5:8]
od550aer_summer.time_stamps()
[28]:
array(['2010-06-15T00:00:00.000000', '2010-07-15T12:00:00.000000',
'2010-08-15T12:00:00.000000'], dtype='datetime64[us]')
However, this approach is not always handy (imagine you have a 10-year dataset of 3-hourly sampled data and want to extract three months in the 6th year …). In that case, you can perform the cropping using the actual timestamps:
[29]:
od550aer_tm5.crop(time_range=('6-2010', '9-2010')).time_stamps()
[29]:
array(['2010-06-15T00:00:00.000000', '2010-07-15T12:00:00.000000',
'2010-08-15T12:00:00.000000'], dtype='datetime64[us]')
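The difference between the two approaches can be sketched in plain Python, independent of pyaerocom. The mid-month timestamps below are simplified stand-ins for the monthly 2010 data, and `crop_by_time` is a hypothetical helper, not a pyaerocom function:

```python
from datetime import datetime

# Simplified stand-in for the time axis of a monthly 2010 dataset
time_stamps = [datetime(2010, month, 15) for month in range(1, 13)]

# Approach 1: positional slicing, requires knowing the indices in advance
by_index = time_stamps[5:8]

# Approach 2: selection by time range (hypothetical helper for illustration)
def crop_by_time(stamps, start, stop):
    """Return all stamps t with start <= t < stop."""
    return [t for t in stamps if start <= t < stop]

by_range = crop_by_time(time_stamps, datetime(2010, 6, 1), datetime(2010, 9, 1))

print([t.strftime("%Y-%m") for t in by_range])
```

Both selections contain the same three months here, but only the second one keeps working unchanged when the dataset covers more years or has a different sampling frequency.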
Data selection over multiple dimensions
Inspired by the xarray.DataArray.sel method, a similar method was implemented in GriddedData:
[30]:
od550aer_tm5.sel(time='April 2010', longitude=(90, 179), latitude=(-50, 20)).quickplot_map();
NOTE: Before the release of version 0.10.0, there was a bug that led to a crash if a time range (i.e. time=(start, stop)) was passed into the sel method.
You may regrid GriddedData using the regrid method (for spatial regridding) or the resample_time method (for temporal resampling). As already done above, the calls may be combined, e.g.:
[31]:
lowres = od550aer_tm5.regrid(lat_res_deg=10, lon_res_deg=20).resample_time('yearly')
lowres
[31]:
pyaerocom.GriddedData: (od550aer, TM5-met2010_CTRL-TEST)
<iris 'Cube' of atmosphere_optical_thickness_due_to_ambient_aerosol / (1) (time: 1; latitude: 18; longitude: 18)>
As you can see, the time dimension only has one entry. This is expected, as the data only contains 2010 timestamps and we computed a yearly average. The lat and lon dimensions are reduced accordingly.
[32]:
lowres.quickplot_map();
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/iris/coords.py:1979: IrisGuessBoundsWarning: Coordinate 'longitude' is not bounded, guessing contiguous bounds.
warnings.warn(
/home/thlun8736/Documents/work/pyaerocom-tutorials/venv/lib/python3.10/site-packages/iris/coords.py:1979: IrisGuessBoundsWarning: Coordinate 'latitude' is not bounded, guessing contiguous bounds.
warnings.warn(
Regional averaging
The actual cell sizes of latitude and longitude coordinates vary depending on where you are, that is, they are largest close to the equator and smallest near the poles. When computing a regional average, this needs to be considered (i.e. values need to be weighted by their actual cell size). This area weighting is implemented in the iris library and is used by default in GriddedData, for instance, when calling:
[33]:
od550aer_tm5.mean()
[33]:
0.11864813532841474
You may specify if you do not want to use area weighting:
[34]:
od550aer_tm5.mean(areaweighted=False)
[34]:
0.0982569
Makes quite a difference, doesn’t it?
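The effect can be reproduced with a small sketch in plain Python. It assumes that, on a regular latitude / longitude grid, the cell area is approximately proportional to the cosine of the latitude of the cell centre, which is a simplified stand-in for what iris computes from the actual cell bounds. The function names and the zonal values are made up for illustration:

```python
import math

def area_weights(latitudes_deg):
    """Approximate relative cell areas on a regular lat / lon grid."""
    return [math.cos(math.radians(lat)) for lat in latitudes_deg]

def weighted_mean(values, weights):
    """Mean of values, each weighted by the corresponding weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Made-up zonal mean values: low near the poles, high in the tropics
latitudes = [-80, -40, 0, 40, 80]
values = [0.02, 0.10, 0.30, 0.15, 0.03]

plain = sum(values) / len(values)
weighted = weighted_mean(values, area_weights(latitudes))
print(f"unweighted: {plain:.3f}, area weighted: {weighted:.3f}")
```

Because the small polar cells carry low values in this example, the unweighted mean is lower than the area weighted one.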
Time-series at individual coordinates can be extracted from a GriddedData object via:
[35]:
ts_data = od550aer_tm5.to_time_series(latitude=60, longitude=11)
ts_data
[35]:
[StationData: {'dtime': [], 'var_info': BrowseDict: {'od550aer': {'units': Unit('1')}}, 'station_coords': {'latitude': None, 'longitude': None, 'altitude': None}, 'data_err': BrowseDict: {}, 'overlap': BrowseDict: {}, 'numobs': BrowseDict: {}, 'data_flagged': BrowseDict: {}, 'filename': None, 'station_id': None, 'station_name': None, 'instrument_name': None, 'PI': None, 'country': None, 'country_code': None, 'ts_type': 'monthly', 'latitude': 61.0, 'longitude': 10.5, 'altitude': nan, 'data_id': 'TM5-met2010_CTRL-TEST', 'dataset_name': None, 'data_product': None, 'data_version': None, 'data_level': None, 'framework': None, 'instr_vert_loc': None, 'revision_date': None, 'website': None, 'ts_type_src': None, 'stat_merge_pref_attr': None, 'od550aer': 2010-01-15 12:00:00 0.049607
2010-02-14 00:00:00 0.061162
2010-03-15 12:00:00 0.069986
2010-04-15 00:00:00 0.097556
2010-05-15 12:00:00 0.103770
2010-06-15 00:00:00 0.107482
2010-07-15 12:00:00 0.146354
2010-08-15 12:00:00 0.145518
2010-09-15 00:00:00 0.078066
2010-10-15 12:00:00 0.077722
2010-11-15 00:00:00 0.037447
2010-12-15 12:00:00 0.039024
dtype: float32}]
As you can see from the output, the return value of this method is a list that contains one pyaerocom.StationData object. The reason why this method returns a list is that it is usually called with many input coordinates (e.g. all site locations of an observation network), and thus returns a list of StationData objects, one for each input coordinate.
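Note that the coordinates stored in the output (61.0, 10.5) differ slightly from the requested ones, which is what one would expect if the value is taken from the closest model grid cell. The following plain-Python sketch illustrates such a nearest-neighbour lookup. The grid below (cell centres of a 2° x 3° grid) and the helper `nearest_index` are assumptions for illustration, not taken from pyaerocom:

```python
def nearest_index(coords, target):
    """Index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))

# Assumed cell centres of a 2 x 3 degree grid (illustrative values)
lats = [-89 + 2 * i for i in range(90)]
lons = [-178.5 + 3 * i for i in range(120)]

# A slightly different latitude than above is used to avoid a tie
# between two equally distant cell centres
i_lat = nearest_index(lats, 60.4)
i_lon = nearest_index(lons, 11)
print(lats[i_lat], lons[i_lon])
```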
The StationData object is basically a dictionary-like object with some extra functionality.
[36]:
station = ts_data[0]
The actual time-series is a pandas.Series object and can be accessed through the variable name (remember, GriddedData instances are single variable).
[37]:
ts = station['od550aer']
ts
[37]:
2010-01-15 12:00:00 0.049607
2010-02-14 00:00:00 0.061162
2010-03-15 12:00:00 0.069986
2010-04-15 00:00:00 0.097556
2010-05-15 12:00:00 0.103770
2010-06-15 00:00:00 0.107482
2010-07-15 12:00:00 0.146354
2010-08-15 12:00:00 0.145518
2010-09-15 00:00:00 0.078066
2010-10-15 12:00:00 0.077722
2010-11-15 00:00:00 0.037447
2010-12-15 12:00:00 0.039024
dtype: float32
[38]:
ax = ts.plot()
ax.set_title('TM5 AOD Oslo')
ax.set_ylabel('AOD [550 nm]');
Let’s have a closer look at the observations. After all, the main purpose of the AeroCom initiative is to compare models with observations. As we shall see below, the just introduced StationData object will play a key role when bringing gridded model data (GriddedData) together with ungridded observational data, such as measurements of a certain variable at a given site location.
In the following section the reading of ungridded data is illustrated based on the example of AERONET version 3 (level 2) data.
Observational data: Reading of and working with ungridded data
This section provides brief introductions into the following pyaerocom classes and architectures:
`pya.io.ReadUngridded <https://pyaerocom.met.no/api.html#pyaerocom.io.readungridded.ReadUngridded>`__
`pya.UngriddedData <https://pyaerocom.met.no/api.html#pyaerocom.ungriddeddata.UngriddedData>`__
`pya.StationData <https://pyaerocom.met.no/api.html#pyaerocom.stationdata.StationData>`__
Primer on observational data
Unlike model data, which can be provided as a gridded object over a certain domain (e.g. latitude, longitude, time) and, in that sense, can be considered fully sampled, observational data is usually sparsely sampled in space and time.
That is, consider a network of observations of a certain variable (e.g. od550aer, or AOD), with many different site locations around the globe. Each of these sites is measuring the variable at that exact location, and the whole network of sites makes a point cloud of site locations in the latitude, longitude domain. In addition, since these are real world measurements, the temporal sampling itself between the different sites is not synchronised, that is, each site is measuring independently of any other site.
For instance, the AERONET network is a global network of sun photometer measurements, that can measure the AOD at several wavelengths based on measurements of the solar irradiance. Thus, at the least, these measurements require 2 things:
Daylight
A clear sky
Needless to say, a site in Antarctica cannot measure at the same time as a site in Ny-Ålesund (actually, that is not strictly true either, as AERONET now also provides AOD measurements based on the lunar irradiance, but I hope you get the point anyway).
This should illustrate that it is more difficult to define a harmonised and yet flexible data format for such observational databases. In pyaerocom, the UngriddedData object is designed for such point cloud data and typically holds the data belonging to a whole observation network, that is:
The UngriddedData object can be considered a point-cloud-like data object that holds individual time-series from many locations around the globe and the associated metadata for each site and measurement.
Moreover, since observational data typically comes from many different observation networks, the formats in which these data are stored typically vary from network to network. This makes the data harder to read than model data, which typically comes as NetCDF files that, these days, most often follow some metadata conventions such as the CF conventions.
Data from the AERONET network (that is introduced in the following), for instance, is provided in the form of column-separated text files per measurement station, where columns correspond to different variables and data rows to individual time stamps.
As a result, custom reading routines for individual observation networks need to be implemented, and pyaerocom provides reading support for many commonly used observational databases such as AERONET, EBAS or EARLINET.
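To give an impression of what such a custom reading routine has to do, here is a minimal sketch in plain Python that reads one column-separated station file into one time-series per variable. The file content, the column layout and the helper `read_station_file` are made up for illustration; real AERONET files have more header lines and many more columns:

```python
import csv
import io
from datetime import datetime

# Made-up file content: one row per time stamp, one column per variable
raw = """date,od440aer,od870aer
2010-06-01,0.21,0.09
2010-06-02,0.18,0.07
2010-06-04,0.25,0.11
"""

def read_station_file(text):
    """Return {variable: [(timestamp, value), ...]} from comma-separated text."""
    reader = csv.DictReader(io.StringIO(text))
    data = {name: [] for name in reader.fieldnames if name != "date"}
    for row in reader:
        stamp = datetime.strptime(row["date"], "%Y-%m-%d")
        for name in data:
            data[name].append((stamp, float(row[name])))
    return data

station = read_station_file(raw)
print(sorted(station))
```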
The basic workflow for reading of ungridded data, such as AERONET data, is very similar to the reading of gridded data (comprising a reading class that handles a query and returns a data class, here UngriddedData). However, under the hood, the implementation is a little more complicated, as there are reading classes for each supported network, as illustrated in the following flowchart:
The actual classes handling the reading of data (for a given dataset) are indicated in blue. The orange ReadUngridded class is a factory class that knows about the blue reading classes via a unique ID (similar to the gridded reading). Thus, as indicated, as a user, you do not need to know which exact reading class you need, you just need the ID and ReadUngridded will know which (blue) reader to use. To summarise, what you need for reading an ungridded dataset is:
A path where the actual data files are located
A unique ID that links that path with a name
A reader class that can read the data
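The interplay of these three ingredients can be sketched with a tiny factory in plain Python. All class names, dataset IDs and paths below are hypothetical and only mirror the idea described above; this is not how ReadUngridded is actually implemented:

```python
class BaseReader:
    """Hypothetical base class: each reader lists the dataset IDs it supports."""
    SUPPORTED_DATASETS = []

    def __init__(self, data_id, data_dir):
        self.data_id = data_id
        self.data_dir = data_dir

class ReadNetworkA(BaseReader):
    SUPPORTED_DATASETS = ["NetworkA.daily", "NetworkA.hourly"]

class ReadNetworkB(BaseReader):
    SUPPORTED_DATASETS = ["NetworkB"]

class ReaderFactory:
    """Picks the reader class that declares support for a dataset ID."""
    SUPPORTED_READERS = [ReadNetworkA, ReadNetworkB]

    def __init__(self, data_dirs):
        self.data_dirs = data_dirs  # maps dataset ID -> data directory

    def get_reader(self, data_id):
        for reader_cls in self.SUPPORTED_READERS:
            if data_id in reader_cls.SUPPORTED_DATASETS:
                return reader_cls(data_id, self.data_dirs[data_id])
        raise ValueError(f"Dataset {data_id} is not supported")

factory = ReaderFactory({"NetworkA.daily": "/path/to/a", "NetworkB": "/path/to/b"})
reader = factory.get_reader("NetworkB")
print(type(reader).__name__, reader.data_dir)
```

The user only passes the ID; the factory resolves both the data directory and the reader class.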
The first 2 points are available via:
[39]:
pya.const.OBSLOCS_UNGRIDDED
[39]:
{'AeronetSunV3Lev1.5.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev1.5.daily/renamed',
'AeronetSunV3Lev1.5.AP': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev1.5.AP/renamed',
'AeronetSunV3Lev2.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev2.0.daily/renamed',
'AeronetSunV3Lev2.AP': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev2.0.AP/renamed',
'AeronetSDAV3Lev1.5.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.SDA.V3L1.5.daily/renamed',
'AeronetSDAV3Lev1.5.AP': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/',
'AeronetSDAV3Lev2.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.SDA.V3L2.0.daily/renamed',
'AeronetSDAV3Lev2.AP': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/',
'AeronetInvV3Lev1.5.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.Inv.V3L1.5.daily/renamed',
'AeronetInvV3Lev2.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.Inv.V3L2.0.daily/renamed',
'EBASMC': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EBASMultiColumn/data',
'EEAAQeRep': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EEA_AQeRep/renamed',
'EARLINET': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EarlinetV3/download',
'GAWTADsubsetAasEtAl': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/PYAEROCOM/GAWTADSulphurSubset/data',
'DMS_AMS_CVO': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/PYAEROCOM/DMS_AMS_CVO/data',
'GHOST.EEA.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EEA_AQ_eReporting/daily',
'GHOST.EEA.hourly': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EEA_AQ_eReporting/hourly',
'GHOST.EEA.monthly': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EEA_AQ_eReporting/monthly',
'GHOST.EBAS.daily': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EBAS/daily',
'GHOST.EBAS.hourly': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EBAS/hourly',
'GHOST.EBAS.monthly': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GHOST/data/EBAS/monthly',
'EEAAQeRep.NRT': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EEA_AQeRep.NRT/renamed/',
'EEAAQeRep.v2': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EEA_AQeRep.v2/renamed/',
'AirNow': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/MACC_INSITU_AirNow',
'MarcoPolo': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/CHINA_MP_NRT',
'CNEMC': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/MEP/aggregated/',
'ICOS': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/ICOS/aggregated/',
'ICPFORESTS': '/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/icp-forests/dep/',
'TROPOMI_XEMEP_R01x01': '/lustre/storeB/project/fou/kl/sesam/work/CSO-gridded/xEMEP__r01x01/data/'}
And the reader classes that are supposed to be used for each of these IDs are provided in the ReadUngridded class header:
[40]:
pya.io.ReadUngridded.SUPPORTED_READERS
[40]:
[pyaerocom.io.read_aeronet_invv3.ReadAeronetInvV3,
pyaerocom.io.read_aeronet_sdav3.ReadAeronetSdaV3,
pyaerocom.io.read_aeronet_sunv3.ReadAeronetSunV3,
pyaerocom.io.read_earlinet.ReadEarlinet,
pyaerocom.io.read_ebas.ReadEbas,
pyaerocom.io.read_aasetal.ReadAasEtal,
pyaerocom.io.read_airnow.ReadAirNow,
pyaerocom.io.read_eea_aqerep.ReadEEAAQEREP,
pyaerocom.io.read_eea_aqerep_v2.ReadEEAAQEREP_V2,
pyaerocom.io.cams2_83.read_obs.ReadCAMS2_83,
pyaerocom.io.gaw.reader.ReadGAW,
pyaerocom.io.ghost.reader.ReadGhost,
pyaerocom.io.cnemc.reader.ReadCNEMC,
pyaerocom.io.icos.reader.ReadICOS,
pyaerocom.io.icpforests.reader.ReadICPForest,
pyaerocom.io.pyaro.read_pyaro.ReadPyaro]
The link between ID (keys of const.OBSLOCS_UNGRIDDED) and reader is available in the actual readers themselves, e.g.:
[41]:
pya.io.read_aeronet_sunv3.ReadAeronetSunV3.SUPPORTED_DATASETS
[41]:
['AeronetSunV3Lev1.5.daily',
'AeronetSunV3Lev1.5.AP',
'AeronetSunV3Lev2.daily',
'AeronetSunV3Lev2.AP']
But these are details that you usually do not need to worry about. If you want to register a new observation dataset, you need the 3 points specified above and can add it via:
[42]:
aeronet_sun_datadir = './data/testdata-minimal/obsdata/AeronetSunV3Lev2.daily/renamed'
pya.const.add_ungridded_obs(obs_id='Bla',
                            data_dir=aeronet_sun_datadir,
                            reader=pya.io.read_aeronet_sunv3.ReadAeronetSunV3)
Now, we basically have 2 names for the same dataset:
[43]:
pya.io.read_aeronet_sunv3.ReadAeronetSunV3.SUPPORTED_DATASETS
[43]:
['AeronetSunV3Lev1.5.daily',
'AeronetSunV3Lev1.5.AP',
'AeronetSunV3Lev2.daily',
'AeronetSunV3Lev2.AP',
'Bla']
That is, the data under the above directory is now accessible via the new ID Bla, which is handled by the same reader as AeronetSunV3Lev2.daily.
Before continuing with the reading of observational data, some things need to be said about the caching of UngriddedData objects.
Caching of UngriddedData
Reading of ungridded data is often rather time-consuming. Therefore, pyaerocom uses a caching strategy that stores loaded instances of the UngriddedData class as pickle files in a cache directory (illustrated in the flowchart shown above). The location of the cache directory can be accessed via:
[44]:
pya.const.CACHEDIR
[44]:
'/home/thlun8736/MyPyaerocom/_cache/thlun8736'
You may change this directory if required.
[45]:
f'Caching is active? {pya.const.CACHING}'
[45]:
'Caching is active? True'
Deactivate / Activate caching
[46]:
pya.const.CACHING = False
[47]:
pya.const.CACHING = True
Note: if caching is active, make sure you have enough disk quota or change the location where the cache files are stored.
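The general idea behind such a pickle-based cache can be sketched in a few lines of plain Python. The helper `load_with_cache`, the dummy reader `slow_read` and the file naming are hypothetical and only illustrate the principle; the actual pyaerocom cache also has to deal with details that are left out here, such as detecting outdated cache files:

```python
import pickle
import tempfile
from pathlib import Path

def load_with_cache(data_id, read_func, cache_dir, caching=True):
    """Return data for data_id, reading from a pickle file if one exists."""
    cache_file = Path(cache_dir) / f"{data_id}.pkl"
    if caching and cache_file.exists():
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    data = read_func(data_id)  # the slow part
    if caching:
        with open(cache_file, "wb") as f:
            pickle.dump(data, f)
    return data

calls = []

def slow_read(data_id):
    """Dummy reader that records how often it is actually called."""
    calls.append(data_id)
    return {"data_id": data_id, "values": [0.1, 0.2, 0.3]}

with tempfile.TemporaryDirectory() as cache_dir:
    first = load_with_cache("Bla", slow_read, cache_dir)
    second = load_with_cache("Bla", slow_read, cache_dir)

print(f"reader was called {len(calls)} time(s)")
```

The second call is served from the cache file, so the slow reader runs only once.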
Read Aeronet Sun v3 level 2 data
As illustrated in the flowchart above, ungridded observation data can be imported using the ReadUngridded class. As for the model data, observation datasets can be searched as follows:
[48]:
browse_database('Aeronet*');
Dataset name: AeronetSunV3Lev1.5.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev1.5.daily/renamed
Supported variables: ['od340aer', 'od440aer', 'od500aer', 'od870aer', 'ang4487aer', 'ang44&87aer', 'od550aer', 'od550lt1ang', 'proxyod550aerh2o', 'proxyod550bc', 'proxyod550dust', 'proxyod550nh4', 'proxyod550oa', 'proxyod550so4', 'proxyod550ss', 'proxyod550no3', 'proxyzaerosol', 'proxyzdust']
Last revision: 20240913
Dataset name: AeronetSunV3Lev1.5.AP
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev1.5.AP/renamed
Supported variables: ['od340aer', 'od440aer', 'od500aer', 'od870aer', 'ang4487aer', 'ang44&87aer', 'od550aer', 'od550lt1ang', 'proxyod550aerh2o', 'proxyod550bc', 'proxyod550dust', 'proxyod550nh4', 'proxyod550oa', 'proxyod550so4', 'proxyod550ss', 'proxyod550no3', 'proxyzaerosol', 'proxyzdust']
Last revision: 20200201
Dataset name: AeronetSunV3Lev2.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev2.0.daily/renamed
Supported variables: ['od340aer', 'od440aer', 'od500aer', 'od870aer', 'ang4487aer', 'ang44&87aer', 'od550aer', 'od550lt1ang', 'proxyod550aerh2o', 'proxyod550bc', 'proxyod550dust', 'proxyod550nh4', 'proxyod550oa', 'proxyod550so4', 'proxyod550ss', 'proxyod550no3', 'proxyzaerosol', 'proxyzdust']
Last revision: 20240913
Dataset name: AeronetSunV3Lev2.AP
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev2.0.AP/renamed
Supported variables: ['od340aer', 'od440aer', 'od500aer', 'od870aer', 'ang4487aer', 'ang44&87aer', 'od550aer', 'od550lt1ang', 'proxyod550aerh2o', 'proxyod550bc', 'proxyod550dust', 'proxyod550nh4', 'proxyod550oa', 'proxyod550so4', 'proxyod550ss', 'proxyod550no3', 'proxyzaerosol', 'proxyzdust']
Last revision: 20211120
Dataset name: AeronetSDAV3Lev1.5.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.SDA.V3L1.5.daily/renamed
Supported variables: ['od500gt1aer', 'od500lt1aer', 'od500aer', 'ang4487aer', 'od500dust', 'od550aer', 'od550gt1aer', 'od550dust', 'od550lt1aer']
Last revision: 20240913
Reading failed for AeronetSDAV3Lev1.5.AP. Error: NetworkNotSupported('Could not fetch reader class: Input network AeronetSDAV3Lev1.5.AP is not supported by ReadUngridded')
Dataset name: AeronetSDAV3Lev2.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.SDA.V3L2.0.daily/renamed
Supported variables: ['od500gt1aer', 'od500lt1aer', 'od500aer', 'ang4487aer', 'od500dust', 'od550aer', 'od550gt1aer', 'od550dust', 'od550lt1aer']
Last revision: 20240913
Reading failed for AeronetSDAV3Lev2.AP. Error: NetworkNotSupported('Could not fetch reader class: Input network AeronetSDAV3Lev2.AP is not supported by ReadUngridded')
Dataset name: AeronetInvV3Lev1.5.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.Inv.V3L1.5.daily/renamed
Supported variables: ['abs440aer', 'angabs4487aer', 'od440aer', 'ang4487aer', 'ssa675aer', 'ssa670aer', 'abs550aer', 'od550aer']
Last revision: 20240907
Dataset name: AeronetInvV3Lev2.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/Aeronet.Inv.V3L2.0.daily/renamed
Supported variables: ['abs440aer', 'angabs4487aer', 'od440aer', 'ang4487aer', 'ssa675aer', 'ssa670aer', 'abs550aer', 'od550aer']
Last revision: 20240907
Reading failed for AERONET. Error: AttributeError("'NoneType' object has no attribute 'experiment'")
Reading failed for AERONET_TESTBED-SK. Error: AttributeError("'NoneType' object has no attribute 'experiment'")
The search routine found several datasets that belong to the 3 different AERONET data products: Sun, SDA, and Inv (inversion). You may read more about the different products at the AERONET website.
Let’s continue with the “Sun” product (AERONET Direct Sun algorithm). As you can see from the output above, this dataset contains daily averages, which is convenient to use for model evaluation.
[49]:
obs_id = 'AeronetSunV3Lev2.daily'
[50]:
obs_reader = pya.io.ReadUngridded(obs_id)
print(obs_reader)
Dataset name: AeronetSunV3Lev2.daily
Data directory: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/AeronetSunV3Lev2.0.daily/renamed
Supported variables: ['od340aer', 'od440aer', 'od500aer', 'od870aer', 'ang4487aer', 'ang44&87aer', 'od550aer', 'od550lt1ang', 'proxyod550aerh2o', 'proxyod550bc', 'proxyod550dust', 'proxyod550nh4', 'proxyod550oa', 'proxyod550so4', 'proxyod550ss', 'proxyod550no3', 'proxyzaerosol', 'proxyzdust']
Last revision: 20240913
Let’s read the data (you can read a single or multiple variables at the same time). For now, we only read the AOD at 550 nm:
[ ]:
od550aer_aeronet = obs_reader.read(vars_to_retrieve='od550aer')
od550aer_aeronet
As you can see, the data object is of type UngriddedData. Unlike GriddedData, UngriddedData can hold an arbitrary number of variables, and even networks. The number of metadata units indicates the number of data files that have been read.
Plot all station coordinates
To get an overview, you can plot all site coordinates contained in the dataset. You can also plot multiple times into the same map with different input criteria. For instance, below we first plot all site locations available in the data (in red), and then, on top of it, in green, we plot sites that contain data in 2010.
[ ]:
ax = od550aer_aeronet.plot_station_coordinates(markersize=80)
od550aer_aeronet.plot_station_coordinates(color='lime', var_name='od550aer', start=2010, stop=2011, markersize=20, ax=ax)
Access of individual stations
For intercomparison with model data, we are interested in time-series from individual sites. You can check out all existing site-location names via:
[ ]:
od550aer_aeronet.unique_station_names
To access individual site location data as StationData you can simply do:
[ ]:
station_data = od550aer_aeronet['La_Paz'] # this is fully equivalent to od550aer_aeronet.to_station_data('La_Paz')
station_data
As you can see, the returned object is of type StationData, which has been introduced above (remember, we extracted a time series from the TM5 model for the location of Oslo).
As mentioned above, it can be used like a dictionary, and the variable time-series can be accessed via:
[ ]:
station_data['od550aer'].plot()
You may also plot directly from the StationData object (and do some other, hopefully self-explanatory, things):
[ ]:
ax = station_data.plot_timeseries('od550aer', marker='x', ls='none')
station_data.resample_time(var_name='od550aer', ts_type='monthly').plot_timeseries('od550aer', marker=' ', ls='-', lw=3, ax=ax)
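Conceptually, resampling daily data to monthly resolution means grouping the values by calendar month and averaging each group. The following plain-Python sketch shows this for a few made-up values; `monthly_means` is a hypothetical helper and ignores any data coverage requirements that a real resampling routine may apply:

```python
from collections import defaultdict
from datetime import date

def monthly_means(daily):
    """Average (date, value) pairs per calendar month."""
    groups = defaultdict(list)
    for day, value in daily:
        groups[(day.year, day.month)].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(groups.items())}

# Made-up daily values
daily = [
    (date(2010, 1, 3), 0.10),
    (date(2010, 1, 20), 0.20),
    (date(2010, 2, 11), 0.30),
]
means = monthly_means(daily)
print(means)
```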
Back to UngriddedData: You may also retrieve the StationData by specifying more constraints using to_station_data (e.g. in monthly resolution and only for the year 2010). And you can overlay different curves by passing the axes instance returned by the plotting method:
[ ]:
ax = od550aer_aeronet.to_station_data('La_Paz',
                                      start=2010, stop=2011,
                                      freq='daily').plot_timeseries('od550aer')
ax = od550aer_aeronet.to_station_data('La_Paz',
                                      start=2010,
                                      freq='monthly').plot_timeseries('od550aer', ax=ax)
ax.legend()
ax.set_title('La Paz AODs 2010');
You can also plot the time-series directly from UngriddedData:
[ ]:
od550aer_aeronet.plot_station_timeseries('La_Paz', 'od550aer', ts_type='monthly',
start=2018).set_title('Monthly AOD in La Paz, 2018');
Computing trends (BETA API, will likely see some revisions)
Trends can be computed using the same methodology as Mortier et al., 2020, which is also used in the Aerosol trends interface. You may also read about the method in the methods section therein.
[ ]:
te = pya.trends_engine.TrendsEngine
timeseries_monthly = station_data.resample_time('od550aer', ts_type='monthly')['od550aer']
result = te.compute_trend(data=timeseries_monthly, start_year=2008, stop_year=2019, ts_type='monthly', min_num_yrs=7)
result
Colocation of model and obsdata
Now that we have a gridded model dataset and an ungridded observation dataset loaded we can continue with colocation of both datasets. Colocation essentially describes the process of matching observations and model in space and time, which makes it possible to compare both and ultimately, to assess how well the model is performing.
As the observations are usually sparse, they define the set of locations and times to be extracted from the model (for comparison). Essentially, what needs to be done is simple:
Decide on a time interval in which you want to colocate the observations with the model data.
Decide on an output frequency.
Find all site location coordinates from the observations in the time period and extract the model values from the model dataset at these locations.
Match the time interval and frequency.
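The steps above can be sketched in plain Python for a tiny made-up example. The data layout, the site and the helper functions `nearest_index` and `colocate` are hypothetical and only illustrate the principle; time resampling is left out by assuming that model and observations are already monthly:

```python
def nearest_index(coords, target):
    """Index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))

def colocate(model, lats, lons, sites):
    """Pair monthly site observations with the nearest model grid cell.

    model: nested list indexed as model[month][i_lat][i_lon]
    sites: {name: {"lat": ..., "lon": ..., "obs": {month_index: value}}}
    Returns {name: [(month_index, obs_value, model_value), ...]}
    """
    out = {}
    for name, site in sites.items():
        i = nearest_index(lats, site["lat"])
        j = nearest_index(lons, site["lon"])
        out[name] = [(m, obs, model[m][i][j]) for m, obs in sorted(site["obs"].items())]
    return out

# Tiny made-up example: 2 months on a 2 x 2 grid and one site
lats, lons = [-45.0, 45.0], [-90.0, 90.0]
model = [
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.15, 0.25], [0.35, 0.45]],
]
sites = {"SiteA": {"lat": 50.0, "lon": 80.0, "obs": {0: 0.38, 1: 0.50}}}

pairs = colocate(model, lats, lons, sites)
print(pairs)
```

Note that the observations drive the selection: only months and locations with observations end up in the result.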
pyaerocom has some methods that can do this for you and these methods return an instance of the `ColocatedData <https://pyaerocom.met.no/api.html#pyaerocom.colocateddata.ColocatedData>`__ object.
Low-level colocation routine(s)
Let’s colocate the TM5 model data with the AERONET AOD subset for the year 2010 and in monthly resolution. Since we already have both data objects loaded, we can go straight to the low-level colocation routine:
[ ]:
from pyaerocom.colocation.colocation_utils import colocate_gridded_ungridded
coldata = colocate_gridded_ungridded(od550aer_tm5,
                                     od550aer_aeronet,
                                     ts_type='monthly',
                                     start=2010,
                                     filter_name='ALL-noMOUNTAINS')
The filter name ALL-noMOUNTAINS denotes that all available AERONET sites are used, except for high-altitude sites (located above 1000 m a.s.l.). A more detailed introduction into available regions and region filters is provided in the getting_started_setup.ipynb tutorial.
You may create a scatter plot from these colocated monthly means, which includes relevant statistical parameters that help to assess model performance:
[ ]:
coldata.plot_scatter(loglog=True);
Does not look too bad, you can see that this result is from 8 sites and 62 data points (monthly averages). The normalised-mean-bias (NMB) is -15.5%, which means that the model slightly underestimates AOD at these locations.
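For reference, here is a minimal sketch of how a normalised mean bias can be computed, assuming the common definition NMB = sum(model - obs) / sum(obs). The values are made up and do not reproduce the numbers in the scatter plot:

```python
def normalised_mean_bias(model, obs):
    """NMB = sum(model - obs) / sum(obs); negative if the model is too low."""
    return sum(m - o for m, o in zip(model, obs)) / sum(obs)

# Made-up colocated pairs of observed and modelled AOD
obs = [0.20, 0.40, 0.10, 0.30]
model = [0.15, 0.35, 0.10, 0.25]

nmb = normalised_mean_bias(model, obs)
print(f"NMB = {nmb:.1%}")
```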
A more illustrative view of the model biases can be retrieved by plotting a bias map:
[ ]:
pya.plot.mapping.plot_nmb_map_colocateddata(coldata);
The fact that you can barely see most of the sites is a good sign, since 0% bias is mapped to white color which is the same as the background color here. The largest bias is found in Amsterdam Island, in the southern Indian Ocean, which could be an indication that the model is simulating too little sea-salt aerosol in this very remote and clean region.
Under the hood …
… the ColocatedData object is an xarray.DataArray:
[ ]:
coldata.data
As you can see, model and obs (stored in the data_source dimension) now share the same coordinates (dimension station_name) and time stamps (dimension time). The data_source dimension always contains the observation data at the first index and the model data at the second:
[ ]:
obsdata = coldata.data[0]
obsdata
[ ]:
modeldata = coldata.data[1]
modeldata
High-level colocation routine
For the purpose of this notebook we read the data individually before colocating, but normally we do not want to go through that hassle. Thus, pyaerocom has a high-level interface that can do colocation straight from the observation and model IDs (under the hood, of course, it uses the same routines that have been used here). By default, this high-level interface also stores all produced ColocatedData objects as NetCDF files, for later analysis. To use it, we first need to define a ColocationSetup object. This object stores all the necessary information about how the colocation should be done:
[ ]:
colocatorsetup = pya.ColocationSetup(
    model_id=model_id, obs_id=obs_id, obs_vars='od550aer',
    ts_type='monthly',
    model_ts_type_read='monthly',
    filter_name='OCN', # let's try to better isolate the ocean stations
    reanalyse_existing=True,
    save_coldata=True,
    start="2010-01-01",
    stop="2011-01-01"
)
colocatorsetup
We can now create the colocator:
[ ]:
colocator = pya.Colocator(colocatorsetup)
colocator
Quite a few options, a lot of them are for the even higher-level automatic web-processing tools that feed the Aerocom Evaluation websites, so let’s not get lost in these details here.
The colocation can be run as follows:
[ ]:
colocator.run()
As you can see in the last line of the output, the colocated data object was stored as a NetCDF file. The default directory for these files can be accessed (and modified) in the const class:
[ ]:
pya.const.COLOCATEDDATADIR
[ ]:
import os
os.listdir(pya.const.COLOCATEDDATADIR)
And you can see that there is a subdirectory which contains all colocated data objects that have been created for the TM5 model. The loaded colocated data object can also be accessed via:
[ ]:
coldata2 = colocator.data['od550aer']['od550aer']
[ ]:
coldata2.plot_scatter();
We can see that we have multiple stations in the ocean.
As a last step for this tutorial, let’s make sure that the stations are indeed in the ocean:
[ ]:
pya.plot.mapping.plot_nmb_map_colocateddata(coldata2);
Looks like it! Ciao!