pyfelyx
Generic access class to a felyx MDB
copyright: Copyright 2017 Ifremer / Cersat.
license: Released under GPL v3 license, see license.
-
class pyfelyx.mdb.MDB(identifier=None, cfgfile=None, config=None)
An abstraction class to read the MDB records. It can be initialized in three different, mutually exclusive ways.
An MDB configuration must define at least the following keys:
- mdb_output_root: the path to the root directory where the match-up files are stored (with YYYY/DDD subfolders)
- full_identifier: dictionary of the datasets provided in the match-ups. The key is a free field (used in the other configuration parameters) and the value is the corresponding prefix used in the match-up files (= full dataset identifier as configured in felyx)
Parameters:
- identifier (str, optional) – configuration identifier. If provided, a configuration file named <identifier>.cfg must exist in your $HOME/.s3analysis/mdb directory.
- cfgfile (str, optional) – explicit path to a configuration file
- config (dict, optional) – a configuration dictionary to decode the MDB
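As an illustration of the two required keys, here is a minimal sketch of a configuration dictionary together with two small checks. The path and the dataset prefix are placeholder values, and the helpers `missing_keys` and `check_exclusive` are hypothetical (they are not part of pyfelyx). Treating "exclusive" as "exactly one of the three arguments is given" is an assumption drawn from the description above.

```python
# Placeholder values: replace the path and the dataset prefix with your own.
config = {
    "mdb_output_root": "/data/felyx/mdb",
    "full_identifier": {"WST": "example_wst_dataset"},
}

REQUIRED_KEYS = ("mdb_output_root", "full_identifier")


def missing_keys(cfg):
    """Return the required MDB configuration keys that are absent from cfg."""
    return [key for key in REQUIRED_KEYS if key not in cfg]


def check_exclusive(identifier=None, cfgfile=None, config=None):
    """Return True if exactly one of the three initialization arguments is set.

    Assumption: this is how the "exclusive" initialization modes are read here.
    """
    given = [arg for arg in (identifier, cfgfile, config) if arg is not None]
    return len(given) == 1
```

With pyfelyx installed, a dictionary like this would be passed through the documented `config` argument, as in `MDB(config=config)`.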
-
extract_value(fieldname, data, row, cell)
Extract the pixel pointed to by the row and cell indices.
Override this method to handle specific cases.
-
get_closest_pixel_indices(sat_lat, sat_lon, ins_lat, ins_lon)
Return the indices and distances of the closest pixel to the in situ data in each satellite box.
The match-ups are provided as boxes of neighbouring pixels centered on the dynamic site location (usually an in situ measurement). If some pixel latitudes and longitudes are masked, this function returns the location of the closest pixel with valid lat/lon for each match-up.
Parameters:
- sat_lat (numpy.ma.array) – array of latitudes of all match-up box pixels
- sat_lon (numpy.ma.array) – array of longitudes of all match-up box pixels
- ins_lat (float) – latitude of the dynamic site location
- ins_lon (float) – longitude of the dynamic site location
Returns: tuple of the indices of the closest pixel in each match-up and its respective distance. The indices are themselves a tuple, with row and cell indices as numpy arrays.
Return type: tuple
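The search described above can be sketched with NumPy as follows. This is not the pyfelyx implementation: the function name `closest_pixel`, the `(n_matchups, n_rows, n_cells)` array layout, the haversine formula and the kilometre unit are all assumptions made for illustration. Pixels with a masked latitude or longitude are given an infinite distance so that they can never be selected.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumption: distances expressed in kilometres


def closest_pixel(sat_lat, sat_lon, ins_lat, ins_lon):
    """Find the closest pixel with valid lat/lon in each match-up box.

    sat_lat, sat_lon: masked arrays of shape (n_matchups, n_rows, n_cells).
    ins_lat, ins_lon: in situ location, a scalar or one value per match-up.
    Returns ((rows, cells), distances), each a 1-D array of length n_matchups.
    """
    sat_lat = np.ma.asarray(sat_lat)
    sat_lon = np.ma.asarray(sat_lon)
    masked = np.ma.getmaskarray(sat_lat) | np.ma.getmaskarray(sat_lon)

    lat1 = np.radians(np.asarray(ins_lat, dtype=float)).reshape(-1, 1, 1)
    lon1 = np.radians(np.asarray(ins_lon, dtype=float)).reshape(-1, 1, 1)
    lat2 = np.radians(sat_lat.filled(0.0))
    lon2 = np.radians(sat_lon.filled(0.0))

    # Haversine great-circle distance between the site and every pixel
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    dist = 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    dist[masked] = np.inf  # pixels without valid coordinates are never chosen

    n_matchups = dist.shape[0]
    flat = dist.reshape(n_matchups, -1)
    best = flat.argmin(axis=1)
    rows, cells = np.unravel_index(best, dist.shape[1:])
    return (rows, cells), flat[np.arange(n_matchups), best]
```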
-
get_closest_valid_pixel_indices(mdbf, dataset, lat=None, lon=None, filters=None, prefixes=None)
Return the indices of the closest valid pixel to the in situ measurement location, in each match-up satellite box.
The pixel validity is defined by the minimum expected quality level.
Parameters:
- dataset (str) – the dataset used as reference for the lat/lon
- filters (dict) – the fields and threshold ranges used to define the validity, per dataset (e.g. {'WST': {'quality_level': [2, 5]}})
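To make the `filters` structure concrete, the sketch below turns the per-field ranges of one dataset into a boolean validity mask. The helper `validity_mask` is hypothetical, and treating both bounds of the range as inclusive is an assumption: the documentation does not say how pyfelyx interprets them. Masked input values are counted as invalid.

```python
import numpy as np


def validity_mask(fields, field_filters):
    """Return a boolean array that is True where every filtered field is in range.

    fields: dict mapping a field name to an array (masked or not).
    field_filters: dict mapping a field name to [min, max]
        (assumption: both bounds are inclusive).
    """
    masks = []
    for name, (low, high) in field_filters.items():
        values = np.ma.asarray(fields[name])
        in_range = (values >= low) & (values <= high)
        # Masked values cannot be checked, so they are treated as invalid
        masks.append(np.ma.filled(in_range, False))
    return np.logical_and.reduce(masks)


# The filters are organized per dataset, as in the documented example
filters = {"WST": {"quality_level": [2, 5]}}
quality = np.ma.masked_array([1, 2, 5, 6, 3],
                             mask=[False, False, False, False, True])
valid = validity_mask({"quality_level": quality}, filters["WST"])
```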
-
get_matchup_files(dates, source)
Return the MDB files for a source and a list of dates.
Assumes the MDB files are stored locally, in a year/day-of-year folder organization.
Parameters:
- dates (list of datetime) – list of dates for which MDB files will be collected.
- source (str) – the source of dynamic sites (usually a type of in situ data) for which to collect the MDB files.
Returns: list of full paths to the collected MDB files
Return type: list
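The year/day-of-year layout can be illustrated with a short sketch that builds the expected folder for each date. The helper `matchup_dirs` is hypothetical, the zero-padded three-digit day is inferred from the YYYY/DDD notation used in the configuration description, and the naming of the files inside each folder is not covered because the documentation does not specify it.

```python
import os
from datetime import datetime


def matchup_dirs(mdb_output_root, dates):
    """Return the <root>/YYYY/DDD folder expected for each date.

    %j is the zero-padded day of the year (001 to 366).
    """
    return [
        os.path.join(mdb_output_root, date.strftime("%Y"), date.strftime("%j"))
        for date in dates
    ]


# Placeholder root directory; 1 February is day 032 of the year
dirs = matchup_dirs("/data/felyx/mdb", [datetime(2017, 2, 1)])
```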
-
read_insitu_data(source, day, fields, min_quality=None, closest_to_surface=True, end=None)
Parameters:
- source (str) – the site collection (cmems_drifter, cmems_argo, etc.)
- day (datetime) – the date (or start date if end is also set) for which to retrieve the match-up in situ values
- matchup_root (str) – the path to the root repository where the match-up files are stored.
- fields (list) – a list of fields to read. Remove the level suffix: for instance, use cmems_water_temperature instead of cmems_water_temperature_0, and the function will detect and reconstruct the two-dimensional fields for you.
Returns: a dict where keys are field names and values are the retrieved data
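The level-suffix convention can be sketched as follows: variables named `<field>_<n>` are grouped by their integer level `n` and stacked into one row per match-up. Both helpers are hypothetical and only illustrate the idea. How pyfelyx actually detects and assembles the levels is not documented here, and the one-value-per-match-up layout is an assumption.

```python
import re


def group_levels(variable_names, field):
    """Return the names of the form <field>_<n>, sorted by integer level n."""
    pattern = re.compile(r"^" + re.escape(field) + r"_(\d+)$")
    levels = []
    for name in variable_names:
        match = pattern.match(name)
        if match:
            levels.append((int(match.group(1)), name))
    return [name for _, name in sorted(levels)]


def stack_levels(variables, field):
    """Build a two-dimensional field: one row per match-up, one column per level.

    variables: dict mapping a variable name to a list with one value per
    match-up (assumed layout).
    """
    columns = [variables[name] for name in group_levels(variables, field)]
    return [list(row) for row in zip(*columns)]


# Made-up values for illustration only
variables = {
    "cmems_water_temperature_0": [10.0, 11.0],
    "cmems_water_temperature_1": [9.5, 10.5],
    "cmems_depth_0": [0.2, 0.3],
}
temperature = stack_levels(variables, "cmems_water_temperature")
```

Sorting on the integer level rather than on the name keeps level 10 after level 2.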
-
read_insitu_field(matchup_file, source, field, min_quality=None, closest_to_surface=True)
Read the colocated in situ data in a match-up file. Only the measurement closest in time to the satellite pixel is selected; the rest of the buoy or platform history is ignored.
Parameters:
- matchup_file (Dataset) – a handler on a NetCDF match-up file
- source (str) – the site collection (cmems_drifter, cmems_argo, etc.)
- field (str) – the name of the field to be read. Remove the level suffix: for instance, use cmems_water_temperature instead of cmems_water_temperature_0, and the function will detect and reconstruct the two-dimensional fields for you.
- min_quality (int) – minimum quality level for a measurement to be considered valid (and returned in the result)
- closest_to_surface (boolean) – take only the measurement closest to the surface.
-
read_satellite_data(source, start, end=None, felyx_fields=None, dataset_fields=None, full_box=False, filters=None, max_distance=None, subbox=None, reference=None)
Parameters:
- source (str) – the site collection (cmems_drifter, cmems_argo, etc.)
- start (datetime) – the date (or start date if end is also set) for which to retrieve the match-up in situ values
- end (datetime) – the end date of the requested match-up series. If None (default), only the start day is returned
- felyx_fields (list) – a list of felyx internal fields
- dataset_fields (dict) – the fields to read, per dataset
- full_box (boolean) – if False (default), return only the closest valid value to the in situ measurement instead of the full box of neighbours. This closest valid value is evaluated with respect to the requested minimum quality level and max_distance. Pixels not meeting these criteria are discarded.
- filters (dict) – validity range for a selection of fields, per dataset. This defines which pixels can be considered valid when searching for the closest pixel to the in situ measurement in a match-up. Only applicable if full_box is set to False.
- max_distance – distance beyond which match-ups are not selected. Only applicable if full_box is set to False.
- subbox (int) – size of the box to extract (when full_box is set). This box size must be smaller than the match-up box size. The box centered on the in situ pixel is extracted.
- reference (str) – the reference product in the match-ups. This is the product whose lat/lon are used to calculate the pixel distances to the in situ measurement.
Returns: a dict where keys are field names and values are the retrieved data
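The `subbox` extraction can be pictured with the NumPy sketch below, which cuts the central window out of each match-up box. It assumes that the in situ pixel sits at the center of the box and that both sizes are odd. The function `extract_subbox` is hypothetical and is not the pyfelyx code.

```python
import numpy as np


def extract_subbox(box, subbox):
    """Return the central subbox x subbox window of each match-up box.

    box: array whose last two axes are (n_rows, n_cells).
    subbox: window size; must be smaller than the box size, mirroring the
        constraint stated in the parameter description.
    """
    n_rows, n_cells = box.shape[-2:]
    if subbox >= min(n_rows, n_cells):
        raise ValueError("subbox must be smaller than the match-up box size")
    row0 = (n_rows - subbox) // 2
    cell0 = (n_cells - subbox) // 2
    return box[..., row0:row0 + subbox, cell0:cell0 + subbox]


# One 5x5 match-up box reduced to its central 3x3 window
box = np.arange(25).reshape(1, 5, 5)
window = extract_subbox(box, 3)
```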
-
classmethod reduce(dynamic_data, matched_data, keep)
Reduce the data arrays with respect to a selection filter.
Parameters:
- dynamic_data (dict) – the dictionary of data fields from the dynamic site source (in situ data)
- matched_data (dict) – the dictionary of the fields of the matched datasets
- keep (arr) – the selection of match-ups to keep: a one-dimensional array where True marks the match-ups to keep and False the match-ups to remove.
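The effect of such a selection can be sketched with NumPy boolean indexing. The function `reduce_matchups` below is a stand-in and not the pyfelyx classmethod: it assumes that both dictionaries are flat, that every array has the match-ups along its first axis, and that new dictionaries are returned. Whether the real method works in place or handles nested dictionaries is not stated in this documentation.

```python
import numpy as np


def reduce_matchups(dynamic_data, matched_data, keep):
    """Apply the same boolean selection to every array of both dictionaries.

    keep: one-dimensional boolean selection, True for the match-ups to keep.
    Assumption: the first axis of every array indexes the match-ups.
    """
    keep = np.asarray(keep, dtype=bool)
    reduced_dynamic = {name: values[keep] for name, values in dynamic_data.items()}
    reduced_matched = {name: values[keep] for name, values in matched_data.items()}
    return reduced_dynamic, reduced_matched


# Made-up data: three match-ups, the second one is dropped
dynamic = {"temperature": np.array([1.0, 2.0, 3.0])}
matched = {"sst": np.arange(12).reshape(3, 2, 2)}
kept_dynamic, kept_matched = reduce_matchups(dynamic, matched, [True, False, True])
```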