pyfelyx

Generic access class to a felyx MDB

copyright: Copyright 2017 Ifremer / Cersat.
license: Released under GPL v3 license, see license.
class pyfelyx.mdb.MDB(identifier=None, cfgfile=None, config=None)[source]

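An abstraction class to read the MDB records. It can be initialized in one of three mutually exclusive ways: from a configuration identifier, from the path to a configuration file, or from a configuration dictionary.

An MDB configuration must define at least the following keys: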

  • mdb_output_root: the path to the root directory where the match-up files are stored (with YYYY/DDD subfolders)
  • full_identifier: dictionary of the datasets provided in the match-ups. Each key is a free-form name (used in the other configuration parameters) and each value is the corresponding prefix used in the match-up files (= full dataset identifier as configured in felyx)
Parameters:
  • identifier (str, optional) – configuration identifier. If provided, a configuration file named <identifier>.cfg must exist in your $HOME/.s3analysis/mdb directory.
  • cfgfile (str, optional) – explicit path to a configuration file
  • config (dict, optional) – a configuration dictionary to decode the MDB
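
As a minimal sketch of the dictionary-based initialization, the snippet below builds a configuration carrying the two required keys. The root path, the dataset key WST and its prefix value are placeholder assumptions, and missing_keys is a hypothetical helper written for this illustration, not part of pyfelyx.

```python
# Required keys as listed above; every value below is an illustrative placeholder.
REQUIRED_KEYS = ("mdb_output_root", "full_identifier")

config = {
    # root directory holding the YYYY/DDD sub-folders of match-up files
    "mdb_output_root": "/data/mdb",
    # free-form key -> prefix used in the match-up files (hypothetical values)
    "full_identifier": {"WST": "my_felyx_dataset_id"},
}


def missing_keys(cfg):
    """Return the required keys that are absent from a configuration dict."""
    return [key for key in REQUIRED_KEYS if key not in cfg]


print(missing_keys(config))  # []

# With pyfelyx installed, the dictionary would then be passed as:
# from pyfelyx.mdb import MDB
# mdb = MDB(config=config)
```

Checking the keys before constructing the object is only a convenience; a real configuration will usually carry further keys that are not described in this section.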
__init__(identifier=None, cfgfile=None, config=None)[source]
__str__(*args, **kwargs)[source]

extract_value(fieldname, data, row, cell)[source]

Extract the pixel pointed to by the row and cell indices.

Override this method for specific cases.

get_closest_pixel_indices(sat_lat, sat_lon, ins_lat, ins_lon)[source]

Return the indices and distances of the closest pixel to the in situ data in each satellite box.

The match-ups are provided as boxes of neighbour pixels centered on the dynamic site location (usually an in situ measurement). If some pixel latitudes and longitudes are masked, this function will return the location of the closest pixel with valid lat/lon for each match-up.

Parameters:
  • sat_lat (numpy.ma.array) – array of latitudes of all match-up box pixels
  • sat_lon (numpy.ma.array) – array of longitudes of all match-up box pixels
  • ins_lat (float) – latitude of the dynamic site location
  • ins_lon (float) – longitude of the dynamic site location
Returns:

Tuple of the indices of the closest pixel in each match-up and their respective distances. The indices are themselves a tuple, with row and cell indices as numpy arrays.

Return type:

tuple
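
The search described above can be sketched with numpy as follows. This is an illustration under stated assumptions, not the pyfelyx implementation: the function name closest_pixel, the haversine distance, the kilometre unit and the (n_matchups, n_rows, n_cells) array layout are all choices made for this example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumption: distances expressed in kilometres


def closest_pixel(sat_lat, sat_lon, ins_lat, ins_lon):
    """Return ((rows, cells), distances_km) of the nearest unmasked pixel.

    sat_lat, sat_lon: (masked) arrays of shape (n_matchups, n_rows, n_cells).
    ins_lat, ins_lon: in situ position(s), scalar or of shape (n_matchups,).
    """
    # pixels whose latitude or longitude is masked can never be selected
    invalid = np.ma.getmaskarray(sat_lat) | np.ma.getmaskarray(sat_lon)
    lat = np.radians(np.ma.filled(sat_lat, 0.0))
    lon = np.radians(np.ma.filled(sat_lon, 0.0))
    ins_lat = np.radians(np.asarray(ins_lat, dtype=float)).reshape(-1, 1, 1)
    ins_lon = np.radians(np.asarray(ins_lon, dtype=float)).reshape(-1, 1, 1)

    # haversine great-circle distance between each pixel and the in situ point
    a = (np.sin((lat - ins_lat) / 2) ** 2
         + np.cos(lat) * np.cos(ins_lat) * np.sin((lon - ins_lon) / 2) ** 2)
    dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    dist[invalid] = np.inf

    # argmin over each flattened box, then back to (row, cell) indices
    n, n_rows, n_cells = dist.shape
    flat = dist.reshape(n, -1)
    idx = flat.argmin(axis=1)
    rows, cells = np.unravel_index(idx, (n_rows, n_cells))
    return (rows, cells), flat[np.arange(n), idx]


# One 3x3 match-up box whose central pixel has no valid coordinates.
lat = np.ma.masked_invalid(np.array([[[10.0, 10.0, 10.0],
                                      [10.1, np.nan, 10.1],
                                      [10.2, 10.2, 10.2]]]))
lon = np.ma.masked_invalid(np.array([[[20.0, 20.1, 20.2],
                                      [20.0, np.nan, 20.2],
                                      [20.0, 20.1, 20.2]]]))
(rows, cells), dist = closest_pixel(lat, lon, 10.1, 20.12)
print(rows[0], cells[0])  # 1 2
```

Setting the distance of masked pixels to infinity keeps the argmin fully vectorised; a box with no valid coordinates at all would then return an infinite distance, which the caller has to test for.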

get_closest_valid_pixel_indices(mdbf, dataset, lat=None, lon=None, filters=None, prefixes=None)[source]

Return the index of the closest valid pixel to the in situ measurement location in each match-up satellite box.

The pixel validity is defined by the minimum expected quality level.

Parameters:
  • dataset (str) – the dataset used as reference for the lat/lon
  • filters (dict) – the fields and threshold ranges used to define the validity, per dataset (ex: {'WST': {'quality_level': [2, 5]}})
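
To illustrate how such a filters dictionary could translate into a pixel validity mask, here is a small numpy sketch. The helper validity_mask is hypothetical, and treating both bounds of the range as inclusive is an assumption of this example rather than documented behaviour.

```python
import numpy as np


def validity_mask(fields, dataset_filters):
    """Return True where every filtered field lies within its [min, max] range.

    fields: dict of field name -> array (masked or not), all of the same shape.
    dataset_filters: non-empty dict of field name -> [min, max].
    Inclusive bounds are an assumption of this sketch.
    """
    valid = None
    for name, (vmin, vmax) in dataset_filters.items():
        values = np.ma.asarray(fields[name])
        # masked pixels are treated as invalid
        ok = np.ma.filled((values >= vmin) & (values <= vmax), False)
        valid = ok if valid is None else valid & ok
    return valid


filters = {"WST": {"quality_level": [2, 5]}}
quality = np.ma.masked_equal(np.array([[0, 2, 5], [3, -1, 1]]), -1)
mask = validity_mask({"quality_level": quality}, filters["WST"])
print(mask.tolist())  # [[False, True, True], [True, False, False]]
```

Such a mask could then be combined with the lat/lon mask before searching for the closest pixel, so that only pixels passing every filter are candidates.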
classmethod get_config_root()[source]
get_filters(config)[source]
get_matchup_files(dates, source)[source]

Return the MDB files for a source and list of dates.

Assumes the MDB files are stored locally, following a year/day in year folder organization.

Parameters:
  • dates (list of datetime) – list of dates for which mdb files will be collected.
  • source (str) – the source of dynamic sites (usually a type of in situ data) for which to collect the MDB files.
Returns:

list of full paths to the collected MDB files

Return type:

list
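
The year/day-of-year layout mentioned above can be sketched with the standard library. The helper matchup_dirs and the root path are hypothetical, and because the file naming inside each folder is not described here, the sketch stops at the directory level.

```python
import os
from datetime import datetime


def matchup_dirs(root, dates):
    """Return one <root>/YYYY/DDD directory per date (DDD = day of year)."""
    return [os.path.join(root, d.strftime("%Y"), d.strftime("%j")) for d in dates]


# 1 February is day 032; 31 December 2017 is day 365 (2017 is not a leap year)
dirs = matchup_dirs("/data/mdb", [datetime(2017, 2, 1), datetime(2017, 12, 31)])
print(dirs)
```

The %j format code zero-pads the day of year to three digits, which matches a DDD sub-folder naming.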

load_config(configfile)[source]
read_insitu_data(source, day, fields, min_quality=None, closest_to_surface=True, end=None)[source]
Parameters:
  • source (str) – the site collections (cmems_drifter, cmems_argo, etc…)
  • day (datetime) – the date (or start date if end is also set) for which to retrieve the match-up in situ values
  • matchup_root (str) – the path to the root repository where the matchup files are stored.
  • fields (list) – a list of fields to read (remove the level suffix: for instance, use cmems_water_temperature instead of cmems_water_temperature_0 and the function will detect and reconstruct the two dimensional fields for you).
Returns:

a dict where keys are field names, and values the retrieved data

read_insitu_field(matchup_file, source, field, min_quality=None, closest_to_surface=True)[source]

Read the colocated in situ data in match-up files. Only the measurement closest in time to the satellite pixel is selected; the rest of the buoy or platform history is ignored.

Parameters:
  • matchup_file (Dataset) – a handler on a NetCDF match-up file
  • source (str) – the site collection (cmems_drifter, cmems_argo, etc…)
  • field (str) – the name of the field to be read (remove the level suffix: for instance, use cmems_water_temperature instead of cmems_water_temperature_0 and the function will detect and reconstruct the two dimensional fields for you).
  • min_quality (int) – minimum quality level for a measurement to be considered as valid (and returned in the result)
  • closest_to_surface (boolean) – if True, keep only the measurement closest to the surface.
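
The level-suffix convention described for field can be illustrated as follows. This sketch only shows how per-level variable names could be grouped under a base field name; group_levels is a hypothetical helper, the quality variable name is invented for the example, and how pyfelyx actually detects and stacks the levels is not shown in this section.

```python
import re


def group_levels(variable_names, field):
    """Return the variables named <field>_<n>, ordered by level index n."""
    pattern = re.compile(r"^%s_(\d+)$" % re.escape(field))
    levels = []
    for name in variable_names:
        match = pattern.match(name)
        if match:
            levels.append((int(match.group(1)), name))
    # numeric sort, so that level 10 comes after level 2
    return [name for _, name in sorted(levels)]


names = [
    "cmems_water_temperature_1",
    "cmems_water_temperature_0",
    "cmems_water_temperature_10",
    "cmems_water_temperature_quality_0",  # hypothetical name, must not match
    "time",
]
print(group_levels(names, "cmems_water_temperature"))
```

Stacking the arrays of the returned variables in this order would give the two-dimensional (match-up, level) field referred to above.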
read_satellite_data(source, start, end=None, felyx_fields=None, dataset_fields=None, full_box=False, filters=None, max_distance=None, subbox=None, reference=None)[source]
Parameters:
  • source (str) – the site collections (cmems_drifter, cmems_argo, etc…)
  • start (datetime) – the date (or start date if end is also set) for which to retrieve the match-up data
  • end (datetime) – the end date of the request match-up series. If None (default), only the start day is returned
  • felyx_fields (list) – a list of felyx internal fields
  • dataset_fields (dict) – the list of fields to read, per dataset
  • full_box (boolean) – if False (default), return only the closest valid value to the in situ measurement instead of the full box of neighbours. This closest valid value is evaluated with respect to the minimum quality level requested and max_distance. Pixels not meeting these criteria are discarded.
  • filters (dict) – validity range for a selection of fields, per dataset. This defines which pixels can be considered as valid when searching for the closest pixel to the in situ measurement in a match-up. Only applicable if full_box is set to False.
  • max_distance – distance beyond which match-ups are not selected. Only applicable if full_box is set to False.
  • subbox (int) – size of the box to extract (when full_box is set). This box size must be smaller than the match-up box size. The box centered on the in situ pixel is extracted.
  • reference (str) – the reference product in the match-ups. This is the product whose lat/lon are used to calculate the pixel distances to the in situ measurement.
Returns:

a dict where keys are field names, and values the retrieved data
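
As an illustration of the subbox option, the sketch below cuts a centered window out of each match-up box. It assumes the in situ pixel sits at the center of the box and that the box axes are the last two dimensions; extract_subbox is a hypothetical helper, not the pyfelyx code.

```python
import numpy as np


def extract_subbox(box, size):
    """Return the centered size x size window of the last two axes of box."""
    n_rows, n_cells = box.shape[-2:]
    if size >= n_rows or size >= n_cells:
        raise ValueError("subbox size must be smaller than the match-up box size")
    r0 = (n_rows - size) // 2
    c0 = (n_cells - size) // 2
    return box[..., r0:r0 + size, c0:c0 + size]


# two match-ups, each a 5x5 box of pixels
box = np.arange(2 * 5 * 5).reshape(2, 5, 5)
sub = extract_subbox(box, 3)
print(sub.shape)  # (2, 3, 3)
```

With odd box and sub-box sizes the window is exactly centered; with even sizes the integer division shifts it by half a pixel toward the origin.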

classmethod reduce(dynamic_data, matched_data, keep)[source]

Reduce the data arrays with respect to a selection filter

Parameters:
  • dynamic_data (dict) – the dictionary of data fields from the dynamic site source (in situ data)
  • matched_data (dict) – the dictionary of the fields of the matched datasets
  • keep (arr) – the selection of match-ups to keep. One-dimensional array where True indicates the match-ups to keep and False the match-ups to remove.
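
The effect of the keep selection can be sketched with plain numpy boolean indexing. The function reduce_matchups below is a hypothetical stand-in, and it assumes both dictionaries are flat mappings from field name to an array whose first axis is the match-up index; the actual layout of matched_data may differ.

```python
import numpy as np


def reduce_matchups(dynamic_data, matched_data, keep):
    """Keep only the match-ups flagged True in keep (first axis of each array)."""
    keep = np.asarray(keep, dtype=bool)
    dynamic = {name: values[keep] for name, values in dynamic_data.items()}
    matched = {name: values[keep] for name, values in matched_data.items()}
    return dynamic, matched


dynamic = {"lat": np.array([10.0, 11.0, 12.0])}
matched = {"sst": np.ma.masked_values([290.1, -999.0, 291.3], -999.0)}

# keep the match-ups for which the satellite value is not masked
keep = ~np.ma.getmaskarray(matched["sst"])
dyn, mat = reduce_matchups(dynamic, matched, keep)
print(dyn["lat"].tolist())  # [10.0, 12.0]
```

Applying the same boolean array to both dictionaries keeps the in situ and satellite arrays aligned, match-up by match-up.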