API of AeroVal tools
Documentation of the pyaerocom AeroVal API, for high level web processing tools.
Tools for AeroVal experiment setup
High level analysis setup for AeroVal experiment
- class pyaerocom.aeroval.setup_classes.CAMS2_83Setup(*, use_cams2_83: bool = False)[source]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.EvalRunOptions(*, clear_existing_json: bool = True, only_json: bool = False, only_colocation: bool = False, only_model_maps: bool = False, obs_only: bool = False)[source]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.EvalSetup(*, io_aux_file: Annotated[Path | str, '.py file containing additional read methods for modeldata'] = '', var_web_info_file: Annotated[Path | str, 'config file containing additional variables'] = '', var_scale_colmap_file: Annotated[Path | str, 'config file containing scales/ranges for variables'] = '', **extra_data: Any)[source]
Composite class representing a whole analysis setup
This represents the level at which json I/O happens for configuration setup files.
- get_model_entry(model_name) dict [source]
Get model entry configuration
Since the configuration files for experiments are in json format, they do not allow the storage of executable custom methods for model data reading. Instead, these can be defined in a python module that may be specified via add_methods_file and that contains a dictionary FUNS that maps the method names to the callable methods.
As a result, by default, custom read methods for individual models in model_config do not contain the callable methods but only their names. This method takes care of this and returns a dictionary in which any custom method strings have been converted to the corresponding callable methods.
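The name-to-callable substitution described above can be illustrated with a minimal sketch. It does not use pyaerocom: the content of FUNS, the helper resolve_aux_funs, the model ID and the exact layout of the entry are assumptions made for illustration only, not the library's implementation.

```python
from copy import deepcopy


def add_fields(a, b):
    """Hypothetical auxiliary method: element-wise sum of two sequences."""
    return [x + y for x, y in zip(a, b)]


# Assumed content of the auxiliary python module: method names -> callables.
FUNS = {"add_fields": add_fields}

# A model entry as it could be stored in json: "fun" is only a name (str).
model_entry = {
    "model_id": "MY_MODEL",  # hypothetical model ID
    "model_read_aux": {
        "od550aer": {
            "vars_required": ["od550csaer", "od550so4"],
            "fun": "add_fields",
        },
    },
}


def resolve_aux_funs(entry, funs):
    """Return a copy of entry in which method names are replaced by callables."""
    resolved = deepcopy(entry)
    for aux in resolved.get("model_read_aux", {}).values():
        name = aux["fun"]
        if isinstance(name, str):
            aux["fun"] = funs[name]  # raises KeyError for unregistered names
    return resolved


resolved = resolve_aux_funs(model_entry, FUNS)
fun = resolved["model_read_aux"]["od550aer"]["fun"]
print(fun([1, 2], [3, 4]))
```

The original entry is left untouched, so it can still be written back to json.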
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'protected_namespaces': ()}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.ExperimentInfo(*, exp_id: str, exp_name: str = '', exp_descr: str = '', public: bool = False, exp_pi: str = 'docs', pyaerocom_version: str = '0.30.dev0', creation_date: str = '2025-04-24T09:41:45.083181Z')[source]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.ModelMapsSetup(*, maps_freq: Literal['hourly', 'daily', 'monthly', 'yearly', 'coarsest'] = 'coarsest', plot_types: dict[str, str | set[str]] | set[str] = {'contour'}, boundaries: BoundingBox = BoundingBox(west=-180.0, east=180.0, south=-90.0, north=90.0), right_menu: tuple[str, ...] | None = None, overlay_save_format: Literal['webp', 'png'] = 'webp')[source]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.OutputPaths(*, avdb_resource: Path | str | None = None, json_basedir: Path | str = '/home/docs/MyPyaerocom/aeroval/data', coldata_basedir: Path | str = '/home/docs/MyPyaerocom/aeroval/coldata', proj_id: str, exp_id: str)[source]
Setup class for output paths of json files and co-located data
This interface generates all paths required for an experiment.
- avdb_resource
An aerovaldb resource identifier as expected by aerovaldb.open()[1]. If not provided, pyaerocom will fall back to using json_basedir, for backwards compatibility.
[1] https://aerovaldb.readthedocs.io/en/latest/api.html#aerovaldb.open
- Type:
str, Path, None
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.ProjectInfo(*, proj_id: str)[source]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.StatisticsSetup(*, MIN_NUM: Annotated[int, Gt(gt=0)] = 1, weighted_stats: bool = True, annual_stats_constrained: bool = False, add_trends: bool = False, avg_over_trends: bool = False, obs_min_yrs: Annotated[int, Ge(ge=0)] = 0, stats_min_yrs: Annotated[int, Gt(gt=0)] = 0, sequential_yrs: bool = False, stats_tseries_base_freq: str | None = None, forecast_evaluation: bool = False, forecast_days: Annotated[int, Gt(gt=0)] = 4, use_fairmode: bool = False, use_diurnal: bool = False, obs_only_stats: bool = False, model_only_stats: bool = False, drop_stats: tuple[str, ...] = (), stats_decimals: int | None = None, round_floats_precision: int | None = None, **extra_data: Any)[source]
Setup options for statistical calculations
- weighted_stats
if True, statistics are calculated using area weights, this is only relevant for gridded / gridded evaluations.
- Type:
bool
- annual_stats_constrained
if True, then only sites are considered that satisfy a potentially specified annual resampling constraint (see pyaerocom.colocation.ColocationSetup.min_num_obs). For example, say you want to calculate statistics (bias, correlation, etc.) for monthly model / obs data for a given site and year, and there are only 8 valid months of data while 4 months are missing, so statistics will be calculated for that year based on 8 vs. 8 values. If pyaerocom.colocation.ColocationSetup.min_num_obs is specified in a way that requires e.g. at least 9 valid months to represent the whole year, then this station will not be considered if annual_stats_constrained is True; otherwise it will. Defaults to False.
- Type:
bool
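The 8-of-12-months example can be written down as a small sketch. It is a simplification: min_num_obs is treated here as a single integer (the required number of valid months), and the helper names and data values are made up for illustration.

```python
def n_valid(values):
    """Count entries that are not missing (None marks a missing month)."""
    return sum(v is not None for v in values)


def use_site(monthly_values, min_num_obs, annual_stats_constrained=False):
    """Decide whether a site enters the statistics for one year."""
    if not annual_stats_constrained:
        return True  # no constraint applied, site is always considered
    return n_valid(monthly_values) >= min_num_obs


# Made-up monthly data for one site and year: 8 valid months, 4 missing.
monthly_obs = [0.12, 0.10, None, 0.15, None, 0.21,
               0.25, None, 0.18, 0.14, None, 0.11]

print(n_valid(monthly_obs))
print(use_site(monthly_obs, min_num_obs=9, annual_stats_constrained=True))
print(use_site(monthly_obs, min_num_obs=9, annual_stats_constrained=False))
```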
- stats_tseries_base_freq
The statistics Time Series display in AeroVal (under Overall Evaluation) is computed in intervals of a certain frequency, which is specified via TimeSetup.main_freq (defaults to monthly). That is, monthly colocated data is used as a basis to compute the statistics for each month (e.g. if you have 10 sites, then statistics will be computed based on 10 monthly values for each month of the timeseries, 1 value for each site). stats_tseries_base_freq may be specified if a higher resolution should be used as the basis for computing the timeseries in the resolution specified by TimeSetup.main_freq (e.g. if daily is specified here, then for the above example 310 values would be used, 31 for each site, to compute the statistics for a month with 31 days).
- Type:
str, optional
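The sample counts quoted above (10 vs. 310 values) follow from simple arithmetic, sketched below under the assumption of complete data at every site. The function name is hypothetical and only the two frequencies from the example are handled.

```python
import calendar


def n_values_per_month(n_sites, year, month, base_freq="monthly"):
    """Number of values entering the statistics of one month (complete data)."""
    if base_freq == "monthly":
        return n_sites  # one monthly value per site
    if base_freq == "daily":
        days = calendar.monthrange(year, month)[1]
        return n_sites * days  # one value per site and day
    raise ValueError(f"frequency not covered by this sketch: {base_freq}")


print(n_values_per_month(10, 2010, 1, "monthly"))
print(n_values_per_month(10, 2010, 1, "daily"))
```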
- drop_stats
tuple of strings with names of statistics (as determined by keys in aeroval.glob_defaults.py’s statistics_defaults) to not compute. For example, setting drop_stats = (“mb”, “mab”) results in json files in hm/ts whose entries do not contain the mean bias and mean absolute bias, while the other statistics are preserved.
- Type:
tuple, optional
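The effect of drop_stats on a single statistics entry can be pictured as a dictionary filter. This is a sketch only: the helper name and the numbers are made up, and the keys are taken from statistics_defaults.

```python
def drop_statistics(stats, drop_stats=()):
    """Return a copy of stats without the keys listed in drop_stats."""
    return {name: val for name, val in stats.items() if name not in drop_stats}


# Made-up statistics for one model / obs / variable combination.
stats = {"mb": 0.02, "mab": 0.05, "R": 0.81, "nmb": 12.5}

filtered = drop_statistics(stats, drop_stats=("mb", "mab"))
print(filtered)
```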
- stats_decimals
If provided, overwrites the decimals key in glob_defaults for the statistics, which has a default of 3. Setting this higher or lower changes the number of decimals shown on the AeroVal webpage.
- Type:
int, optional
- round_floats_precision
Sets the precision argument for the function pyaerocom.aeroval.json_utils:set_float_serialization_precision.
- Type:
int, optional
- Parameters:
kwargs – any of the supported attributes, e.g. StatisticsSetup(annual_stats_constrained=True)
- model_config: ClassVar[ConfigDict] = {'extra': 'allow', 'protected_namespaces': ()}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.TimeSetup(*, DEFAULT_FREQS: ~typing.Literal['monthly', 'yearly'] = 'monthly', SEASONS: list[str] = ['all', 'DJF', 'MAM', 'JJA', 'SON'], main_freq: str = 'monthly', freqs: list[str] = ['monthly', 'yearly'], periods: list[str] = <factory>, add_seasons: bool = True, use_meteorological_seasons: bool = False)[source]
Time setup options
- add_seasons
if True, seasons will be [‘all’, ‘DJF’, ‘MAM’, ‘JJA’, ‘SON’], if False, just [‘all’].
- Type:
bool, default True
- use_meteorological_seasons
if True, then statistics are based on the meteorological definition of seasons. This is relevant for periods that are a single year. So if add_seasons is True, for a given year [‘DJF’] will refer to data from Dec of the previous year (if available) and Jan/Feb of the same year, while if use_meteorological_seasons is False, it will be based on data from Jan/Feb and December of the same year. Similarly, and whether or not add_seasons is True, if use_meteorological_seasons is True, [‘all’] (whole year) will refer to data from Dec of the previous year to Nov of the same year, while if False, it will refer to data from Jan to Dec of the same year.
- Type:
bool, default False
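The two season conventions can be made concrete by listing the (year, month) pairs that each one selects for a single-year period. The function below is a hypothetical helper written for illustration; it is not part of pyaerocom and ignores the "if available" fallback.

```python
def months_for_season(season, year, use_meteorological_seasons=False):
    """Return the (year, month) pairs covered by season for a one-year period."""
    if season == "DJF":
        if use_meteorological_seasons:
            # December is taken from the previous year.
            return [(year - 1, 12), (year, 1), (year, 2)]
        # Jan/Feb and December of the same year.
        return [(year, 1), (year, 2), (year, 12)]
    if season == "all":
        if use_meteorological_seasons:
            # December of the previous year to November of the same year.
            return [(year - 1, 12)] + [(year, m) for m in range(1, 12)]
        return [(year, m) for m in range(1, 13)]
    first = {"MAM": 3, "JJA": 6, "SON": 9}[season]
    return [(year, m) for m in range(first, first + 3)]


print(months_for_season("DJF", 2010, use_meteorological_seasons=True))
print(months_for_season("DJF", 2010, use_meteorological_seasons=False))
```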
- get_seasons()[source]
Get list of seasons to be analysed
Returns SEASONS if add_seasons is True, else [‘all’] (only whole year).
- Returns:
list of season strings for analysis
- Return type:
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.setup_classes.WebDisplaySetup(*, map_zoom: str = 'World', regions_how: ~typing.Literal['default', 'aerocom', 'htap', 'country'] = 'default', add_model_maps: bool = False, modelorder_from_config: bool = True, obsorder_from_config: bool = True, var_order_menu: tuple[str, ...] = (), obs_order_menu: tuple[str, ...] = (), stats_order_menu: tuple[str, ...] = (), model_order_menu: tuple[str, ...] = (), hide_charts: tuple[str, ...] = (), hide_pages: tuple[str, ...] = (), ts_annotations: dict[str, str] = <factory>, pages: tuple[str, ...] = ('maps', 'evaluation', 'intercomp', 'overall', 'infos'))[source]
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Specification of observation datasets
- class pyaerocom.aeroval.obsentry.BulkOptions(*, vars: tuple[str, str], model_exists: bool, mode: Literal['product', 'fraction'], units: str)[source]
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pyaerocom.aeroval.obsentry.ObsEntry(*, obs_vars: str | tuple[str, ...], obs_id: str | tuple[str, ...], obs_name: str | None = None, obs_ts_type_read: str | dict | None = None, obs_vert_type: Literal['Column', 'Profile', 'Surface', 'ModelLevel'] = 'Surface', obs_aux_requires: dict[str, dict] = {}, instr_vert_loc: str | None = None, is_superobs: bool = False, only_superobs: bool = False, is_bulk: bool = False, bulk_options: dict[str, BulkOptions] = {}, colocation_layer_limts: tuple[LayerLimits, ...] | None = None, profile_layer_limits: tuple[LayerLimits, ...] | None = None, web_interface_name: str | None = None, diurnal_only: bool = False, obs_type: str | None = None, read_opts_ungridded: dict = {}, only_json: bool = False, coldata_dir: str | Path | None = None, obs_use_climatology: ClimatologyConfig | bool = False, **extra_data: Any)[source]
Observation configuration for evaluation (BaseModel)
Note
Only obs_id and obs_vars are mandatory, the rest are optional.
- obs_id
ID of observation network in AeroCom database (e.g. ‘AeronetSunV3Lev2.daily’). Note that this can also be a custom supplied obs_id if and only if obs_aux_requires is provided.
- Type:
- obs_vars
tuple of pyaerocom variable names that are supposed to be analysed (e.g. (‘od550aer’, ‘ang4487aer’))
- obs_ts_type_read
may be specified to explicitly define the reading frequency of the observation data (so far, this only applies to gridded obsdata such as satellites). For ungridded reading, the frequency may be specified via obs_id, where applicable (e.g. AeronetSunV3Lev2.daily). Can be specified per variable in the form of a dictionary.
- obs_vert_type
Aerocom vertical code encoded in the model filenames (only AeroCom 3 and later).
- Type:
str, optional
- obs_aux_requires
information about required datasets / variables for auxiliary variables.
- Type:
dict, optional
- instr_vert_loc
vertical location code of observation instrument. This is used in the aeroval interface for separating different categories of measurements such as “ground”, “space” or “airborne”.
- Type:
str, optional
- is_superobs
if True, this observation is a combination of several others which all have to have their own obs config entry.
- Type:
bool
- only_superobs
this indicates whether this configuration is only to be used as part of a superobs network, and not individually.
- Type:
bool
- is_bulkfraction
If True, numerator and denominator are colocated separately before the fraction is calculated. For this to work, the numerator and denominator need to be given in bulk_options.
- Type:
- read_opts_ungridded
dictionary that specifies reading constraints for ungridded reading (cf. pyaerocom.io.ReadUngridded).
- Type:
dict, optional
- only_json
Only to be set if the obs entry already has colocated data files which were preprocessed outside of pyaerocom. Setting to True will skip the colocation and just create the JSON output.
- Type:
bool
- coldata_dir
Only to be set if the obs entry already has colocated data files which were preprocessed outside of pyaerocom. This is the directory in which the colocated data files are located.
- Type:
- obs_use_climatology
Configuration for climatology. If True is given, a default configuration is used. With False, climatology is turned off.
- Type:
ClimatologyConfig | bool, optional
- classmethod check_obs_vert_type(ovt)[source]
Check if obs_vert_type string is valid alias
- Parameters:
ovt (str) – obs_vert_type string
- Returns:
valid obs_vert_type
- Return type:
- Raises:
ValueError – if ovt is invalid
- has_var(var_name)[source]
Check if input variable is defined in entry
- Returns:
True if entry has variable available, else False
- Return type:
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow', 'validate_assignment': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
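Because obs entries may be supplied as plain dictionaries, a minimal entry can be sketched as below. The check function is a hypothetical stand-in for the validation that ObsEntry performs; it only covers the mandatory fields and the allowed obs_vert_type values listed in the class signature above.

```python
# Field names and example values are taken from the ObsEntry description above.
obs_entry = {
    "obs_id": "AeronetSunV3Lev2.daily",
    "obs_vars": ("od550aer", "ang4487aer"),
    "obs_vert_type": "Column",
}

MANDATORY = ("obs_id", "obs_vars")
VALID_VERT_TYPES = ("Column", "Profile", "Surface", "ModelLevel")


def check_obs_entry(entry):
    """Hypothetical pre-check of a dict before it is turned into an obs entry."""
    missing = [key for key in MANDATORY if key not in entry]
    if missing:
        raise ValueError(f"missing mandatory fields: {missing}")
    vert_type = entry.get("obs_vert_type", "Surface")  # documented default
    if vert_type not in VALID_VERT_TYPES:
        raise ValueError(f"invalid obs_vert_type: {vert_type}")
    return True


print(check_obs_entry(obs_entry))
```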
Specification of model datasets
- class pyaerocom.aeroval.modelentry.ModelEntry(*, model_id: str, model_ts_type_read: str | dict | None = '', model_name: str | None = None, model_use_vars: dict = {}, model_add_vars: dict[str, tuple[str, ...]] = {}, model_read_aux: dict = {}, model_rename_vars: dict = {}, flex_ts_type: bool = True, model_data_dir: str | None = None, gridded_reader_id: dict[str, str] = {'model': 'ReadGridded', 'obs': 'ReadGridded'}, model_kwargs: dict = {})[source]
Model configuration for evaluation (BaseModel)
Note
Only model_id is mandatory, the rest is optional.
- model_ts_type_read
may be specified to explicitly define the reading frequency of the model data. Not to be confused with ts_type, which specifies the frequency used for colocation. Can be specified per variable by providing a dictionary.
- model_use_vars
dictionary that specifies mapping of model variables. Keys are observation variables, values are strings specifying the corresponding model variable to be used (e.g. model_use_vars=dict(od550aer=’od550csaer’))
- Type:
dict
- model_add_vars
dictionary that specifies additional model variables. Keys are observation variables, values are lists of strings specifying the corresponding model variables to be used (e.g. model_add_vars=dict(od550aer=[‘od550csaer’, ‘od550so4’]))
- Type:
dict
- model_rename_vars
key / value pairs specifying new variable names for model variables in the output json files (is applied after co-location).
- Type:
dict
- model_read_aux
may be used to specify additional computation methods of variables from models. Keys are obs variables, values are dictionaries with keys vars_required (list of required variables for computation of var) and fun (method that takes list of read data objects and computes and returns var).
- Type:
dict
- property aux_funs_required
Boolean specifying whether this entry requires auxiliary variables
- get_vars_to_process(obs_vars: tuple) tuple [source]
Get lists of obs / mod variables to be processed
- Parameters:
obs_vars (tuple) – tuple of observation variables
- Returns:
list – list of observation variables (potentially extended from input list)
list – corresponding model variables which are mapped based on content of model_add_vars and model_use_vars.
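One way to picture the mapping of observation variables to model variables is the sketch below. It is an illustration under assumptions, not the library's code: the function name is hypothetical, and the ordering of the returned lists in pyaerocom may differ.

```python
def vars_to_process(obs_vars, model_use_vars=None, model_add_vars=None):
    """Pair each obs variable with the model variable(s) to compare against."""
    model_use_vars = model_use_vars or {}
    model_add_vars = model_add_vars or {}
    obs_out, mod_out = [], []
    for obs_var in obs_vars:
        # Default: same name in the model, unless remapped via model_use_vars.
        obs_out.append(obs_var)
        mod_out.append(model_use_vars.get(obs_var, obs_var))
        # Additional model variables extend the list of obs variables.
        for extra in model_add_vars.get(obs_var, ()):
            obs_out.append(obs_var)
            mod_out.append(extra)
    return obs_out, mod_out


obs_list, mod_list = vars_to_process(
    ("od550aer", "ang4487aer"),
    model_use_vars={"od550aer": "od550csaer"},
    model_add_vars={"od550aer": ("od550so4",)},
)
print(list(zip(obs_list, mod_list)))
```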
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Containers for model and observation setup
Collection classes to specify a number of model entries and a number of observation entries for a given AeroVal experiment.
- class pyaerocom.aeroval.collections.BaseCollection[source]
- abstract add_entry(key, value) None [source]
Abstract method to add an entry to the collection.
- Parameters:
key (Hashable) – The key of the entry.
value (object) – The value of the entry.
- abstract get_entry(key) object [source]
Abstract method to get an entry from the collection.
- Parameters:
key (Hashable) – The key of the entry to retrieve.
- Returns:
The entry associated with the provided key.
- Return type:
- keylist(name_or_pattern: str | None = None) list[str] [source]
Find model / obs names that match input search pattern(s)
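A pattern search over collection keys could look like the following sketch. It assumes shell-style wildcards (via the standard library's fnmatch module), which this page does not confirm; the function and the example keys are hypothetical.

```python
from fnmatch import fnmatchcase


def find_keys(keys, name_or_pattern=None):
    """Return all keys matching the pattern, or all keys if no pattern is given."""
    if name_or_pattern is None:
        return list(keys)
    # fnmatchcase gives case-sensitive matching on every platform.
    return [key for key in keys if fnmatchcase(key, name_or_pattern)]


keys = ["ModelA-v1", "ModelA-v2", "ModelB-v1"]  # hypothetical model names
print(find_keys(keys, "ModelA*"))
print(find_keys(keys))
```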
- class pyaerocom.aeroval.collections.ModelCollection[source]
Object that represents a collection of model entries
“Keys” are model names, values are instances of ModelEntry. Values can also be assigned as dict and will automatically be converted into instances of ModelEntry.
Note
Entries need not be models only; they may also be observations. Entries provided in this collection refer to the x-axis in the AeroVal heatmap display and must fulfill the protocol defined by ModelEntry.
- add_entry(key: str, entry: dict | ModelEntry)[source]
Abstract method to add an entry to the collection.
- Parameters:
key (Hashable) – The key of the entry.
value (object) – The value of the entry.
- get_entry(key: str) ModelEntry [source]
Get model entry configuration
- Parameters:
model_name (str) – name of model
- Returns:
Dictionary that specifies the model setup ready for the analysis
- Return type:
- class pyaerocom.aeroval.collections.ObsCollection[source]
Object that represents a collection of obs entries
“Keys” are obs names, values are instances of ObsEntry. Values can also be assigned as dict and will automatically be converted into instances of ObsEntry.
Note
Entries need not be observations only; they may also be models. Entries provided in this collection refer to the y-axis in the AeroVal heatmap display and must fulfill the protocol defined by ObsEntry.
- add_entry(key: str, entry: dict | ObsEntry)[source]
Abstract method to add an entry to the collection.
- Parameters:
key (Hashable) – The key of the entry.
value (object) – The value of the entry.
- property all_vert_types
List of unique vertical types specified in this collection
- get_all_vars() list[str] [source]
Get unique list of all obs variables from all entries
- Returns:
list of variables specified in obs collection
- Return type:
- get_entry(key: str) ObsEntry [source]
Getter for obs entries
- Raises:
KeyError – if input name is not in this collection
- get_web_interface_name(key: str) str [source]
Get webinterface name for entry
Note
Normally this is the key of the obsentry in obs_config; however, it might be specified explicitly via the key web_interface_name in the corresponding value.
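The fallback described in the note amounts to a lookup with a default. A minimal sketch with plain dictionaries follows; the obs_config content and the helper name are made up for illustration.

```python
def web_interface_name(obs_config, key):
    """Return the explicit web_interface_name of an entry, else its key."""
    entry = obs_config[key]  # raises KeyError for unknown entries
    return entry.get("web_interface_name") or key


# Hypothetical obs configuration with two entries.
obs_config = {
    "AERONET-Sun": {"obs_id": "AeronetSunV3Lev2.daily", "obs_vars": ("od550aer",)},
    "AERONET-ang": {
        "obs_id": "AeronetSunV3Lev2.daily",
        "obs_vars": ("ang4487aer",),
        "web_interface_name": "AERONET",
    },
}

print(web_interface_name(obs_config, "AERONET-Sun"))
print(web_interface_name(obs_config, "AERONET-ang"))
```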
Processing tools
Experiment processing engine
- class pyaerocom.aeroval.experiment_processor.ExperimentProcessor(cfg: EvalSetup)[source]
Processing engine for AeroVal experiment
By default, this class processes one configuration file, represented by EvalSetup. As such, an instance of EvalSetup represents an AeroVal experiment, comprising a list of models, a list of observations (and variables).
For each possible (or defined) model / obs / variable combination, the processing engine will perform spatial and temporal co-location and will store one co-located NetCDF file (e.g. if there are 2 models, 2 observation networks and 2 variables there will be 4 co-located NetCDF files). The co-location is done using pyaerocom.Colocator.
- run(model_name=None, obs_name=None, var_list=None, update_interface=True)[source]
Create colocated data and json files for model / obs combination
- Parameters:
model_name (str or list, optional) – Name or pattern specifying model that is supposed to be analysed. Can also be a list of names or patterns to specify multiple models. If None (default), then all models are run that are part of this experiment.
obs_name (str or list, optional) – Like model_name, but for specification(s) of observations that are supposed to be used. If None (default) all observations are used.
var_list (list, optional) – list of variables supposed to be analysed. If None, then all variables available are used. Defaults to None. Can also be str type. Must match at least some of the variables provided by an observation network.
update_interface (bool) – if True, relevant json files that determine what is displayed online are updated after the run, including the menu.json file; the model info table (minfo.json) file is also created and saved in exp_dir.
- Returns:
list containing all colocated data objects that have been converted to json files.
- Return type:
Model maps processing
Processing of super-observation entries
Super observations refer to merged observation datasets to increase the number of stations.
Low-level base classes for processing engines
- class pyaerocom.aeroval._processing_base.DataImporter(cfg: EvalSetup)[source]
Class that supports reading of model and obs data based on an eval config.
Depending on an EvalSetup, reading of model and obs data may have certain constraints (e.g. freq, years, alias variable names, etc.), which can be specified flexibly for each model and obs entry in an analysis setup (EvalSetup). These reading constraints and data import settings are handled in the pyaerocom.colocation.Colocator engine, therefore the reading in this class is done via the Colocator engine.
- read_gridded_obsdata(obs_name, var_name)[source]
Import gridded observation data, usually satellite data
- read_model_data(model_name, var_name)[source]
Import model data
- Parameters:
- Returns:
data – loaded model data.
- Return type:
- class pyaerocom.aeroval._processing_base.HasColocator(cfg: EvalSetup)[source]
Config class that also has the ability to co-locate
- class pyaerocom.aeroval._processing_base.HasConfig(cfg: EvalSetup)[source]
Base class that ensures that evaluation configuration is available
- exp_output
Manages output for an AeroVal experiment (e.g. path locations).
- Type:
- class pyaerocom.aeroval._processing_base.ProcessingEngine(cfg: EvalSetup)[source]
Abstract base for classes supposed to do one or more processing tasks
Requirement for a processing class is to inherit attrs from HasConfig and, in addition to that, to have implemented a method run() which runs the corresponding processing task and stores all the associated output files that are read by the frontend.
One example of an implementation is the pyaerocom.aeroval.modelmaps_engine.ModelMapsEngine.
Helpers for processing of auxiliary variables
- class pyaerocom.aeroval.aux_io_helpers.ReadAuxHandler(aux_file: str)[source]
Helper class for import of auxiliary function objects
- aux_file
path to python module containing function definitions (note: function definitions in module need to be stored in a dictionary called FUNS in the file, where keys are names of the functions and values are callable objects.)
- Type:
- Parameters:
aux_file (str) – input file containing auxiliary functions (details see Attributes section).
- import_all()[source]
Import all callable functions in module with their names
Currently, these are expected to be stored in a dictionary called FUNS which should be defined in the python module.
- Returns:
function definitions.
- Return type:
- import_module()[source]
Import aux_file as python module
Uses importlib.import_module() for import.
- Returns:
imported module.
- Return type:
module
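How a file with a FUNS dictionary can be imported as a module is sketched below with importlib.import_module(). The temporary file, its content, the module name and the helper import_funs are assumptions for illustration; the handling of sys.path in pyaerocom itself may differ.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed content of an auxiliary file: callables registered in FUNS.
AUX_SOURCE = '''
def double(values):
    return [2 * v for v in values]

FUNS = {"double": double}
'''


def import_funs(aux_file):
    """Import aux_file as a python module and return its FUNS dictionary."""
    path = Path(aux_file)
    sys.path.insert(0, str(path.parent))  # make the module importable by name
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(path.stem)
    finally:
        sys.path.remove(str(path.parent))
    return module.FUNS


with tempfile.TemporaryDirectory() as tmp:
    aux_file = Path(tmp) / "my_aux_funs_example.py"  # hypothetical file name
    aux_file.write_text(AUX_SOURCE)
    funs = import_funs(aux_file)

print(funs["double"]([1, 2, 3]))
```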
Conversion of co-located data to json output
- class pyaerocom.aeroval.coldatatojson_engine.ColdataToJsonEngine(cfg: EvalSetup)[source]
- process_coldata(coldata: ColocatedData)[source]
Creates all json files for one ColocatedData object
- Parameters:
coldata (ColocatedData) – colocated data to be processed.
- Raises:
NotImplementedError – DESCRIPTION.
ValueError – DESCRIPTION.
ConfigError – DESCRIPTION.
- Return type:
None.
Output management
- class pyaerocom.aeroval.experiment_output.ExperimentOutput(cfg: EvalSetup)[source]
JSON output for experiment
- add_forecast_entry(entry: dict, region: str, network: str, obsvar: str, layer: str, modelname: str, modvar: str)[source]
Adds a forecast entry to forecast
- Parameters:
entry – The entry to be added.
network – Observation network
obsvar – Observation variable
layer – Vertical layer
modelname – Model name
modvar – Model variable
- add_heatmap_entry(entry, frequency: str, network: str, obsvar: str, layer: str, modelname: str, modvar: str)[source]
Adds a heatmap entry to glob_stats
- Parameters:
entry – The entry to be added.
region – The region (eg. ALL)
obsvar – Observation variable.
layer – Vertical Layer (eg. SURFACE)
modelname – Model name
modelvar – Model variable.
- add_heatmap_timeseries_entry(entry: dict, region: str, network: str, obsvar: str, layer: str, modelname: str, modvar: str)[source]
Adds a heatmap entry to hm/ts
- Parameters:
entry – The entry to be added.
network – Observation network
obsvar – Observation variable
layer – Vertical layer
modelname – Model name
modvar – Model variable
- add_profile_entry(data: ColocatedData, profile_viz: dict, periods: list[str], seasons: list[str], location, network, obsvar)[source]
Adds an entry for the colocated data to profiles.json.
- clean_json_files() list[str] [source]
Checks all existing json files and removes outdated data
This may be relevant when updating a model name or similar.
- Returns:
The list of file paths that were modified / removed.
- Return type:
list[str]
- delete_experiment_data(also_coldata=True) None [source]
Delete all data associated with a certain experiment
Note
This simply deletes the experiment directory with all the json files and, if also_coldata is True, also the associated co-located data objects.
- Parameters:
also_coldata (bool) – if True and if output directory for colocated data is default and specific for input experiment ID, then also all associated colocated NetCDF files are deleted. Defaults to True.
Order of models in menu
Note
Returns an empty list if no specific order is to be used, in which case the models will be ordered alphabetically.
Order of observation entries in menu
- reorder_experiments(exp_order=None) None [source]
Reorder experiment order in evaluation interface
Puts experiment list into order as specified by exp_order, all remaining experiments are sorted alphabetically.
- Parameters:
exp_order (list, optional) – desired experiment order, if None, then alphabetical order is used.
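The ordering rule (explicit order first, remaining experiments alphabetically) can be sketched as follows. The function is a hypothetical helper; in particular, silently skipping names in exp_order that do not exist is an assumption.

```python
def reorder(experiments, exp_order=None):
    """Put experiments into exp_order, then append the rest alphabetically."""
    exp_order = exp_order or []
    first = [exp for exp in exp_order if exp in experiments]
    rest = sorted(exp for exp in experiments if exp not in first)
    return first + rest


experiments = ["exp-b", "exp-a", "exp-c"]  # hypothetical experiment IDs
print(reorder(experiments, exp_order=["exp-c"]))
print(reorder(experiments))
```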
- property results_available: bool
True if results are available for this experiment, else False
- Type:
- update_interface() None [source]
Update web interface
Steps:
Check if results are available, and if so:
Add entry for this experiment in experiments.json
Create/update ranges.json file in experiment directory
Update menu.json against available output and evaluation setup
Synchronise content of heatmap json files with menu
Create/update file statistics.json in experiment directory
Copy json version of EvalSetup into experiment directory
- Return type:
None
Update menu
The menu.json file is created based on the available json map files in the map directory of an experiment.
Global settings
Global defaults
- class pyaerocom.aeroval.glob_defaults.ScaleAndColmap[source]
simple dictionary container with only two keys, scale and colmap
- class pyaerocom.aeroval.glob_defaults.VarWebScaleAndColormap(config_file: str = '', **kwargs)[source]
- class pyaerocom.aeroval.glob_defaults.VariableInfo(menu_name, vertical_type, category)[source]
- category: CategoryType
Alias for field number 2
- menu_name
Alias for field number 0
- vertical_type: VerticalType
Alias for field number 1
- class pyaerocom.aeroval.glob_defaults.VerticalType(value)[source]
A 2D variable is defined under Column on the website, 3D is defined under Surface
- pyaerocom.aeroval.glob_defaults.statistics_defaults = {'R': {'colmap': 'RdYlGn', 'decimals': 2, 'forecast': True, 'longname': 'Correlation Coefficient', 'name': 'R', 'scale': [0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1], 'unit': '1'}, 'R_spearman': {'colmap': 'RdYlGn', 'decimals': 2, 'longname': 'R Spearman Correlation', 'name': 'R Spearman', 'scale': [0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1], 'time_series': True, 'unit': '1'}, 'data_mean': {'colmap': 'coolwarm', 'decimals': 2, 'longname': 'Model Mean', 'name': 'Mean-Mod', 'scale': None, 'time_series': True, 'unit': '1'}, 'fge': {'colmap': 'reverseColmap(RdYlGn)', 'decimals': 2, 'forecast': True, 'longname': 'Fractional Gross Error', 'name': 'FGE', 'scale': [0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2], 'unit': '1'}, 'mab': {'colmap': 'bwr', 'decimals': 1, 'longname': 'Mean Absolute Bias', 'name': 'MAB', 'scale': [0, 0.025, 0.05, 0.075, 0.1, 0.125, 0.15], 'unit': 'var'}, 'mb': {'colmap': 'bwr', 'decimals': 1, 'longname': 'Mean Bias', 'name': 'MB', 'scale': [-0.15, -0.1, -0.05, 0, 0.05, 0.1, 0.15], 'unit': 'var'}, 'mnmb': {'colmap': 'bwr', 'decimals': 1, 'forecast': True, 'longname': 'Modified Normalized Mean Bias', 'name': 'MNMB', 'scale': [-100, -75, -50, -25, 0, 25, 50, 75, 100], 'unit': '%'}, 'nmb': {'colmap': 'bwr', 'decimals': 1, 'forecast': True, 'longname': 'Normalized Mean Bias', 'name': 'NMB', 'scale': [-100, -75, -50, -25, 0, 25, 50, 75, 100], 'unit': '%'}, 'nrms': {'colmap': 'Reds', 'decimals': 1, 'longname': 'Normalized Root Mean Square Error', 'name': 'NRMSE', 'scale': [0, 25, 50, 75, 100, 125, 150, 175, 200], 'time_series': True, 'unit': '%'}, 'num_coords_with_data': {'colmap': None, 'decimals': 0, 'longname': 'Number of Stations with data', 'name': 'Nb. Stations', 'overall_only': True, 'scale': None, 'unit': '1'}, 'num_valid': {'colmap': None, 'decimals': 0, 'longname': 'Number of Valid Observations', 'name': 'Nb. Obs', 'overall_only': True, 'scale': None, 'unit': '1'}, 'refdata_mean': {'colmap': 'coolwarm', 'decimals': 2, 'longname': 'Observation Mean', 'name': 'Mean-Obs', 'scale': None, 'time_series': True, 'unit': '1'}, 'rms': {'colmap': 'coolwarm', 'decimals': 2, 'forecast': True, 'longname': 'Root Mean Square Error', 'name': 'RMSE', 'scale': None, 'unit': '1'}}
Default information for statistical parameters
- pyaerocom.aeroval.glob_defaults.statistics_trend = {'mod_trend': {'category': 'Regional Time Series', 'colmap': 'bwr', 'decimals': 1, 'forecast': False, 'longname': 'Modelled Trends', 'name': 'Mod-Trends', 'scale': [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 'unit': '%/yr'}, 'obs/mod_trend': {'category': 'Regional Time Series', 'colmap': 'bwr', 'decimals': 1, 'forecast': False, 'longname': 'Trends', 'name': 'Obs/Mod-Trends', 'scale': [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 'unit': '%/yr'}, 'obs_trend': {'category': 'Regional Time Series', 'colmap': 'bwr', 'decimals': 1, 'forecast': False, 'longname': 'Observed Trends', 'name': 'Obs-Trends', 'scale': [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 'unit': '%/yr'}}
Default information about trend display
Frontend variable naming conventions
- class pyaerocom.aeroval.varinfo_web.VarinfoWeb(var_name: str, cmap: str | None = None, cmap_bins: list | None = None, vmin: float | None = None, vmax: float | None = None)[source]
Additional variable information relevant for AeroVal web output
- Parameters:
var_name (str) – Name of variable (AeroCom name, not web display name)
cmap (str, optional) – Name of the colormap for web display. If None, the colormap associated with the input variable is used (via pyaerocom.variable.Variable.get_cmap()). Defaults to None.
cmap_bins (list, optional) – Value bins for web display. If None, they are inferred from the input vmin and vmax or, if those are also None, from the attributes pyaerocom.variable.Variable.minimum and pyaerocom.variable.Variable.maximum. If these are not defined either, an AttributeError is raised on initialisation.
vmin (float, optional) – Lower end of the value range.
vmax (float, optional) – Upper end of the value range.
- autofill_missing(vmin: float | None = None, vmax: float | None = None) None [source]
Autofill missing attributes related to cmap bins and cmap
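The fallback order described for cmap_bins can be sketched as follows. This only illustrates the documented precedence (explicit bins, then vmin/vmax, then the variable's own minimum/maximum). The function infer_bins, the linear spacing, the default bin count, and the per-value fallback are all assumptions and do not reproduce pyaerocom's actual binning.

```python
from __future__ import annotations


def infer_bins(
    cmap_bins: list[float] | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    var_minimum: float | None = None,
    var_maximum: float | None = None,
    num: int = 5,
) -> list[float]:
    """Resolve colour bins following the documented precedence (hypothetical helper)."""
    if cmap_bins is not None:
        # Explicit bins always win.
        return list(cmap_bins)
    # Fall back to the variable's own range where vmin / vmax are missing.
    lo = vmin if vmin is not None else var_minimum
    hi = vmax if vmax is not None else var_maximum
    if lo is None or hi is None:
        # Mirrors the documented failure when no range is defined anywhere.
        raise AttributeError("need cmap_bins, vmin/vmax, or a variable range")
    step = (hi - lo) / (num - 1)
    # Linearly spaced bins; an assumption made for this sketch only.
    return [lo + i * step for i in range(num)]


bins = infer_bins(vmin=0, vmax=1)
```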
High-level utility functions
- pyaerocom.aeroval.utils.compute_model_average_and_diversity(cfg, var_name, model_names=None, ts_type=None, lat_res_deg=2, lon_res_deg=3, data_id=None, avg_how=None, extract_surface=True, ignore_models=None, comment=None, model_use_vars=None)[source]
Compute median or mean model based on input models
Note
BETA version that will likely undergo revisions.
Time selection is currently not handled properly.
- Parameters:
cfg (AerocomEvaluation) – analysis instance
var_name (str) – name of variable
model_names (list, optional) – list of model names. If None, all entries in input engine are used.
ts_type (str, optional) – output freq. Defaults to monthly.
lat_res_deg (int, optional) – output latitude resolution, defaults to 2 degrees.
lon_res_deg (int, optional) – output longitude resolution, defaults to 3 degrees.
data_id (str, optional) – output data_id of ensemble model.
avg_how (str, optional) – how to compute averages (choose from mean or median), defaults to “median”.
extract_surface (bool) – if True (and if data contains model levels), surface level is extracted
ignore_models (list, optional) – list of models to be ignored
comment (str, optional) – comment string added to metadata of output data objects.
model_use_vars (dict, optional) – model variables to be used.
- Returns:
GriddedData – ensemble model for the input variable, averaged using the median or mean (set via avg_how). Default is median.
GriddedData – corresponding diversity field. If avg_how is “mean”, it is computed using the definition from Textor et al., 2006 (ACP), DOI: 10.5194/acp-6-1777-2006. If avg_how is “median”, the interquartile range relative to the median, (Q3-Q1)/Q2, is used.
GriddedData or None – Q1 field (only output if avg_how is median)
GriddedData or None – Q3 field (only output if avg_how is median)
GriddedData or None – standard deviation field (only output if avg_how is mean)
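To make the two diversity definitions concrete, here is a minimal sketch for a single grid cell in plain Python. It assumes the values from all models at one cell are given as a list. The function names are hypothetical, the quartile method ("inclusive") and the population standard deviation are arbitrary choices, and expressing the mean-based diversity as standard deviation over mean in percent is an assumption based on Textor et al. (2006), not something taken from the pyaerocom source.

```python
import statistics


def median_ensemble(values: list) -> dict:
    """Median, quartiles and IQR-based diversity (Q3 - Q1) / Q2 for one grid cell."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"avg": q2, "q1": q1, "q3": q3, "diversity": (q3 - q1) / q2}


def mean_ensemble(values: list) -> dict:
    """Mean, standard deviation and relative diversity in percent for one grid cell."""
    avg = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; an assumption of this sketch
    return {"avg": avg, "std": std, "diversity": 100.0 * std / avg}


cell_values = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values from five models
med = median_ensemble(cell_values)
avg = mean_ensemble(cell_values)
```

The two helpers return different keys on purpose, echoing the return values above: Q1 and Q3 exist only for the median case and the standard deviation only for the mean case.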
High-level functions for emep reporting
Global config for emep reporting pyaeroval runs
- pyaerocom.aeroval.config.emep.reporting_base.get_CFG(reportyear, year, model_dir) dict [source]
Get a configuration usable for emep reporting
- Parameters:
reportyear – year of reporting
year – year of data
model_dir – directory containing Base_hour.nc, Base_day.nc, Base_month.nc and Base_fullrun.nc or, for trends, a directory containing one subdirectory per year (e.g. 2005, 2010, 2015), each of which contains the above files
- The current working directory of the experiment should have the following files/directories by default:
data – output directory
coldata – output directory
user_var_scale_colmap.ini – optional user-defined colormaps for pyaerocom variables
omit_stations.yaml – optional user-defined yaml file of stations to omit
The default values can be changed in your program. If you want to permanently change the defaults, please agree upon these changes with the emep-modellers and contact the pyaerocom-developers.
Example runs with this config look like:
    import os
    import pyaerocom as pya
    from pyaerocom import const
    from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
    from pyaerocom.aeroval.config.emep.reporting_base import get_CFG

    # Setup for models used in analysis
    CFG = get_CFG(reportyear=2024,
                  year=2021,
                  model_dir="/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.3_metyear2021_emis2022")

    CFG.update(dict(
        # proj_id="status-2024",
        exp_id="test-2021met_2022emis",
        exp_name="Test runs for 2024 EMEP reporting",
        exp_descr=(
            "Test run from Agnes for 2024_REPORTING/EMEP01_rv5.3_metyear2021_emis2022, i.e. 2021met and 2022emis"
        ),
        exp_pi="S. Tsyro, A. Nyiri, H. Klein",
    ))

    # remove EEA
    # for obs in list(CFG["obs_cfg"].keys()):
    #     if obs.startswith("EEA"):
    #         del CFG["obs_cfg"][obs]
    #         print(f"removed {obs}")

    # remove "concCocpm10", not in model-output
    for obs in CFG["obs_cfg"]:
        if "concCocpm10" in CFG["obs_cfg"][obs]["obs_vars"]:
            CFG["obs_cfg"][obs]["obs_vars"].remove("concCocpm10")

    # remove "no, pm10, pm25" from EBAS-hourly
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concNno")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm10")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm25")

    # CFG["raise_exceptions"] = False
    # CFG["add_model_maps"] = False
    # CFG["only_model_maps"] = True

    stp = EvalSetup(**CFG)

    cdir = "./cache/"
    os.makedirs(cdir, exist_ok=True)
    const.CACHEDIR = cdir

    ana = ExperimentProcessor(stp)
    ana.update_interface()

    res = ana.run()
Another example for multiple model-evaluation:
    import os
    import pyaerocom as pya
    from pyaerocom import const
    from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
    from pyaerocom.aeroval.config.emep.reporting_base import get_CFG

    # Setup for models used in analysis
    CFG = get_CFG(
        reportyear=2024,
        year=2022,
        model_dir=f"/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.3_year2022_Status_Rep2024",
    )

    dir_versions = {
        "FFmod": "/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.3_year2022_Status_Rep2024_FFmod/",
        "MARS5.3": "/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.3_year2022_Status_Rep2024_MARS/",
        "MARS5.0": "/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.0_year2022_Status_Rep2023_emis2022/",
        "NoCations": "/lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.3_year2022_Status_Rep2024_noCation/",
    }

    # Comparison of several models
    MODEL = CFG["model_cfg"]["EMEP"]
    PLTTYPES = CFG["plot_types"]["EMEP"]

    for mid, fpath in dir_versions.items():
        CFG["model_cfg"][mid] = MODEL.copy()
        CFG["plot_types"][mid] = PLTTYPES.copy()
        CFG["model_cfg"][mid]["model_data_dir"] = fpath
        CFG["model_cfg"][mid]["model_id"] = mid

    del CFG["model_cfg"]["EMEP"]
    del CFG["plot_types"]["EMEP"]

    # change some config settings, usually not needed
    CFG.update(
        dict(
            proj_id="emepX",
            exp_id=f"2024-XXX_2022_ebas2",
            # exp_name="Evaluation of EMEP runs for 2023 EMEP reporting",
            exp_descr=(
                f"Evaluation of EMEP runs for 2024 EMEP reporting, MARS vs ISOROPIA. /lustre/storeB/project/fou/kl/emep/ModelRuns/2024_REPORTING/EMEP01_rv5.?_year2022_Status_Rep2024_*/, is compared against observations from EBAS."
            ),
            # periods=["2021"],
            # exp_pi="S. Tsyro, H. Klein",
            # add_model_maps=False,
        )
    )

    # remove "concCocpm10", not in model-output
    for obs in CFG["obs_cfg"]:
        if "concCocpm10" in CFG["obs_cfg"][obs]["obs_vars"]:
            CFG["obs_cfg"][obs]["obs_vars"].remove("concCocpm10")

    # remove "no, pm10, pm25" from EBAS-hourly
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concNno")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm10")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm25")

    # remove EEA
    for obs in list(CFG["obs_cfg"].keys()):
        if obs.startswith("EEA"):
            del CFG["obs_cfg"][obs]
            print(f"removed {obs}")

    # try to run anything, but don't fail on error
    # CFG["raise_exceptions"] = False

    stp = EvalSetup(**CFG)

    cdir = "./cache"
    os.makedirs(cdir, exist_ok=True)
    const.CACHEDIR = cdir

    ana = ExperimentProcessor(stp)
    ana.update_interface()

    # run everything
    res = ana.run()
and the example for trends:
    import os
    import pyaerocom as pya
    from pyaerocom import const
    from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
    from pyaerocom.aeroval.config.emep.reporting_base import get_CFG

    # Setup for models used in analysis
    CFG = get_CFG(reportyear=2023,
                  year=2021,
                  model_dir=f"/lustre/storeB/project/fou/kl/emep/ModelRuns/2023_REPORTING/TRENDS/pyaerocom_trends/")

    CFG.update(dict(
        proj_id="emep",
        exp_id=f"2023-trends",
        # exp_name="Evaluation of EMEP runs for 2023 EMEP reporting",
        exp_descr=(
            f"Evaluation of EMEP runs for 2023 EMEP reporting trend runs. 7 year obs-data availability per period. /lustre/storeB/project/fou/kl/emep/ModelRuns/2023_REPORTING/TRENDS/pyaerocom_trends is compared against observations from EBAS."
        ),
        periods=["1990-2021", "1990-1999", "2000-2009", "2010-2019", "2012-2021"],  # range(1990,2022)],
        # exp_pi="S. Tsyro, H. Klein",
        add_model_maps=False,
        # only_model_maps=True,
        # trend parameters
        freqs=["yearly", "monthly"],  # "weekly"],"daily"], # can't be hourly for trends, daily is too slow, weekly hardly ever needed
        main_freq="monthly",
        add_trends=True,
        avg_over_trends=True,
        obs_min_yrs=7,  # only stations with at least 7 yr
        stats_min_yrs=7,  # only stations with at least 7 yr
        sequential_yrs=False,
    ))

    # remove "no, pm10, pm25" from EBAS-hourly
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concNno")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm10")
    CFG["obs_cfg"]["EBAS-h-diurnal"]["obs_vars"].remove("concpm25")

    # remove EEA
    for obs in list(CFG["obs_cfg"].keys()):
        if obs.startswith("EEA"):
            del CFG["obs_cfg"][obs]

    # remove all hourly obs, e.g. for trends
    for obs in list(CFG["obs_cfg"].keys()):
        if "ts_type" in CFG["obs_cfg"][obs] and CFG["obs_cfg"][obs]["ts_type"] == "hourly":
            del CFG["obs_cfg"][obs]
            print(f"removed hourly {obs}")

    # remove all daily obs, e.g. for trends
    for obs in list(CFG["obs_cfg"].keys()):
        if "ts_type" in CFG["obs_cfg"][obs] and CFG["obs_cfg"][obs]["ts_type"] == "daily":
            del CFG["obs_cfg"][obs]
            print(f"removed daily {obs}")

    # remove "concCocpm10", not in model-output
    for obs in CFG["obs_cfg"]:
        if "concCocpm10" in CFG["obs_cfg"][obs]["obs_vars"]:
            CFG["obs_cfg"][obs]["obs_vars"].remove("concCocpm10")

    # try to run anything, but don't fail on error
    # CFG["raise_exceptions"] = False

    stp = EvalSetup(**CFG)

    cdir = "./cache"
    os.makedirs(cdir, exist_ok=True)
    const.CACHEDIR = cdir

    ana = ExperimentProcessor(stp)
    ana.update_interface()

    # run everything
    res = ana.run()
- Returns:
a dict of a model configuration usable for EvalSetup
- pyaerocom.aeroval.config.emep.emep4no_base_config.get_CFG(year: int, model_dir: str, *, file_pattern: str = '^RERUN2022_{freq}_.+\\.nc$')[source]
Basically the EMEP base config with minor changes to work with EMEP4NO input. Please also refer to the EMEP config documentation as that is where the bulk of the configuration takes place.
- Parameters:
year – Year of data
model_dir – Directory where EMEP4NO files are located.
file_pattern – Optional regular expression against which the base name of files will be matched. This can be used to override the default Base_{freq}.nc file matching in the EMEP reader.
Note that, for convenience, the string literal ‘{freq}’ can be included as part of the pattern and will be expanded to (hour|day|month|fullrun). This is recommended, as the presence of these strings is used to derive ts_type, which is currently necessary for reading.
- Returns:
A dict of model configuration which can be passed to EvalSetup.
Example
The following snippet shows how this config can be used.
    >>> from pyaerocom.aeroval.config.emep.emep4no_base_config import get_CFG
    >>> import pathlib
    >>>
    >>> if __name__ == "__main__":
    ...     import matplotlib.pyplot as plt
    ...     import pyaerocom as pya
    ...     from pyaerocom import const
    ...     from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
    ...
    ...     # Customize cache dir to avoid disk quota issues.
    ...     # cdir = pathlib.Path("./cache")
    ...     # cdir.mkdir(exist_ok=True)
    ...     # const.CACHEDIR = str(cdir)
    ...
    ...     cfg = get_CFG(2022, "/lustre/storeB/project/fou/kl/emep/ModelRuns/EMEP4NO/EMEP4NO_rerun_2022/")
    ...
    ...     # Change any experiment details.
    ...     cfg.update(
    ...         {
    ...             #"proj_id": "<project name>",
    ...             #"exp_id": "<experiment name>",
    ...             #"json_basedir": "/lustre/storeB/users/thlun8736/python/aeroval/data",
    ...             #"coldata_basedir": "/lustre/storeB/users/thlun8736/python/aeroval/coldata",
    ...         }
    ...     )
    ...
    ...     # Run the experiment.
    ...     stp = EvalSetup(**cfg)
    ...     ana = ExperimentProcessor(stp)
    ...     res = ana.run()
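The ‘{freq}’ expansion described for file_pattern can be sketched with the standard re module. This is an illustration of the idea only: the helper names and the file names are invented, and the way the EMEP reader actually derives ts_type from the match is not shown here.

```python
from __future__ import annotations

import re

# Alternation that '{freq}' is documented to expand to.
FREQ_GROUP = "(hour|day|month|fullrun)"


def expand_pattern(file_pattern: str) -> re.Pattern:
    """Replace the literal '{freq}' with the alternation group and compile."""
    return re.compile(file_pattern.replace("{freq}", FREQ_GROUP))


def freq_from_name(file_pattern: str, basename: str) -> str | None:
    """Return the matched frequency token, or None if the name does not match."""
    match = expand_pattern(file_pattern).match(basename)
    if match is None or not match.groups():
        return None
    return match.group(1)


# Default pattern from the signature above, written as a raw string.
default_pattern = r"^RERUN2022_{freq}_.+\.nc$"

# The file names below are invented for illustration.
freq = freq_from_name(default_pattern, "RERUN2022_day_example.nc")
no_match = freq_from_name(default_pattern, "Base_day.nc")
```

A pattern without ‘{freq}’ has no capture group in this sketch, so no frequency can be recovered from it, which is one way to see why including the placeholder is recommended.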
Helper modules
General helper functions
Helpers for coldat2json conversion
Helpers for conversion of ColocatedData to JSON files for web interface.
- pyaerocom.aeroval.coldatatojson_helpers.process_profile_data_for_regions(data: ColocatedData, region_id: str, use_country: bool, periods: list[str], seasons: list[str], use_meteorological_seasons: bool) dict [source]
This method populates the json files in data/profiles which are used for visualization. Analogous to _process_map_and_scat for profile data. Each json file corresponds to a region, obs network, and variable. Inside the json, the data is broken up by model. Each model has a key for “z” (the vertical dimension), “obs”, and “mod”. Each “obs” and “mod” is broken up by period.
- Parameters:
data (ColocatedData) – ColocatedData object for this layer
region_id (str) – Spatial subset to compute the mean profiles over
use_country (bool) – Passed to filter_region().
periods (list[str]) – Year part of the temporal range to average over
seasons (list[str]) – Seasonal part of the temporal range to average over
- Returns:
Dictionary to write to json
- Return type:
output (dict)
- pyaerocom.aeroval.coldatatojson_helpers.process_profile_data_for_stations(data: ColocatedData, station_name: str, use_country: bool, periods: list[str], seasons: list[str], use_meteorological_seasons: bool) dict [source]
This method populates the json files in data/profiles which are used for visualization. Analogous to _process_map_and_scat for profile data. Each json file corresponds to a station, obs network, and variable. Inside the json, the data is broken up by model. Each model has a key for “z” (the vertical dimension), “obs”, and “mod”. Each “obs” and “mod” is broken up by period.
- Parameters:
data (ColocatedData) – ColocatedData object for this layer
station_name (str) – Station to compute mean profiles over for period
use_country (bool) – Passed to filter_region().
periods (list[str]) – Year part of the temporal range to average over
seasons (list[str]) – Seasonal part of the temporal range to average over
- Returns:
Dictionary to write to json
- Return type:
output (dict)
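The layout described for the profile json files can be pictured as nested dictionaries. The sketch below is an assumption about the shape only (model name at the top level, then “z”, “obs” and “mod”, the latter two keyed by period). The model name, the period keys and all numbers are invented, and the helper add_profile is hypothetical rather than part of pyaerocom.

```python
import json


def add_profile(output: dict, model: str, period: str, z: list, obs: list, mod: list) -> dict:
    """Add one period's mean profiles for a model to the (hypothetical) layout."""
    # Create the model entry on first use; "z" is shared across periods.
    entry = output.setdefault(model, {"z": z, "obs": {}, "mod": {}})
    entry["obs"][period] = obs
    entry["mod"][period] = mod
    return output


profiles = {}
add_profile(profiles, "MODEL-A", "2010-all", z=[500, 1500], obs=[0.12, 0.08], mod=[0.10, 0.09])
add_profile(profiles, "MODEL-A", "2011-all", z=[500, 1500], obs=[0.11, 0.07], mod=[0.10, 0.08])

# Serialise in the same spirit as the files written to data/profiles.
text = json.dumps(profiles, indent=2)
```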
Model maps helper functions
- pyaerocom.aeroval.modelmaps_helpers.calc_contour_json(data: GriddedData, cmap: str, cmap_bins: list[float])[source]
Convert gridded data into contours for json output
- Parameters:
data (GriddedData) – input data
cmap (str) – colormap of output
cmap_bins (list) – list containing the bins to which the values are mapped.
- Returns:
dictionary containing contour data
- Return type: