resistics.gather module¶
Module for gathering data that will be combined to calculate transfer functions
There are two scenarios considered here. The first and simplest is quick processing outside the project environment. Data gathering is uncomplicated in this case: the workflow does not involve a data selector, meaning only a single step is required.
QuickGather to put together the out_data, in_data and cross_data
When inside the project environment, regardless of whether it is single site or multi site processing, the workflow follows:
Selector to select shared windows across all sites for a sampling frequency
Gather to gather the combined evaluation frequency data
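The two project steps can be chained as in the minimal sketch below. It relies only on the run signatures documented on this page. The function name gather_in_project is hypothetical, and the selector and gatherer are passed in as arguments so the sketch makes no assumption about how they are constructed.

```python
def gather_in_project(selector, gatherer, config_name, proj, site_names,
                      dec_params, tf, out_name):
    """Chain the two project steps: select shared windows, then gather."""
    # Step 1: find the windows shared across the sites
    selection = selector.run(config_name, proj, site_names, dec_params)
    # Step 2: gather the evaluation frequency data for those windows
    return gatherer.run(config_name, proj, selection, tf, out_name)
```

In a project, selector and gatherer would be instances of Selector and ProjectGather from resistics.gather.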
Warning
There may be some confusion in the code, which makes many references to both spectra data and evaluation frequency data. Evaluation frequency data, referred to below as eval_data, is an instance of SpectraData. It is named differently to highlight that it is not the complete spectra data, but spectra data at a reduced set of frequencies corresponding to the evaluation frequencies.
Within a project, there are separate folders for users who want to save both the full spectra data with all the frequencies as well as the evaluation frequency spectra data with the smaller subset of frequencies. Only the evaluation frequency data is required to calculate the transfer function, but the full spectral data might be useful for visualisation and analysis reasons.
- resistics.gather.get_site_evals_metadata(config_name: str, proj: resistics.project.Project, site_name: str, fs: float) Dict[str, resistics.spectra.SpectraMetadata] [source]¶
Get spectra metadata for a given site and sampling frequency
- Parameters
config_name (str) – The configuration name to get the right data
proj (Project) – The project instance to get the measurements
site_name (str) – The name of the site for which to gather the SpectraMetadata
fs (float) – The original recording sampling frequency
- Returns
Dictionary of measurement name to SpectraMetadata
- Return type
Dict[str, SpectraMetadata]
- resistics.gather.get_site_level_wins(meas_metadata: Dict[str, resistics.spectra.SpectraMetadata], level: int) pandas.core.series.Series [source]¶
Get site windows for a decimation level given a sampling frequency
- Parameters
meas_metadata (Dict[str, SpectraMetadata]) – The measurement spectra metadata for a site
level (int) – The decimation level
- Returns
A series for a single decimation level, indexed by the global windows of the site, whose values are the names of the measurements containing each global window
- Return type
pd.Series
See also
get_site_wins
Get windows for all decimation levels
Examples
An example getting the site windows for decimation level 0 when there are three measurements in the site.
>>> from resistics.testing import spectra_metadata_multilevel
>>> from resistics.gather import get_site_level_wins
>>> meas_metadata = {}
>>> meas_metadata["meas1"] = spectra_metadata_multilevel(n_wins=[3, 2, 2], index_offset=[3, 2, 1])
>>> meas_metadata["meas2"] = spectra_metadata_multilevel(n_wins=[4, 3, 2], index_offset=[28, 25, 22])
>>> meas_metadata["meas3"] = spectra_metadata_multilevel(n_wins=[2, 2, 1], index_offset=[108, 104, 102])
>>> get_site_level_wins(meas_metadata, 0)
3      meas1
4      meas1
5      meas1
28     meas2
29     meas2
30     meas2
31     meas2
108    meas3
109    meas3
dtype: object
>>> get_site_level_wins(meas_metadata, 1)
2      meas1
3      meas1
25     meas2
26     meas2
27     meas2
104    meas3
105    meas3
dtype: object
>>> get_site_level_wins(meas_metadata, 2)
1      meas1
2      meas1
22     meas2
23     meas2
102    meas3
dtype: object
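The window numbering in the example above can be reproduced with plain Python. This is a sketch only: the hypothetical helper level_wins takes the number of windows and the first global window index of each measurement directly, whereas get_site_level_wins reads its information from SpectraMetadata and returns a pandas Series.

```python
from typing import Dict, Tuple


def level_wins(meas_info: Dict[str, Tuple[int, int]]) -> Dict[int, str]:
    """Map each global window index to the measurement that contains it.

    meas_info maps measurement name to (n_wins, index_offset) for one level.
    """
    wins = {}
    for meas_name, (n_wins, index_offset) in meas_info.items():
        for win_idx in range(index_offset, index_offset + n_wins):
            wins[win_idx] = meas_name
    # Sort by global window index, as in the series shown above
    return dict(sorted(wins.items()))


# Level 0 values taken from the example above
level0 = level_wins({"meas1": (3, 3), "meas2": (4, 28), "meas3": (2, 108)})
print(level0)
```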
- resistics.gather.get_site_wins(config_name: str, proj: resistics.project.Project, site_name: str, fs: float) Dict[int, pandas.core.series.Series] [source]¶
Get site windows for all levels given a sampling frequency
- Parameters
config_name (str) – The configuration name to get the right data
proj (Project) – The project instance to get the measurements
site_name (str) – The site name
fs (float) – The recording sampling frequency
- Returns
Dictionary mapping decimation level (an integer) to the site windows for that level, with one entry for each decimation level
- Return type
Dict[int, pd.Series]
- Raises
ValueError – If no matching spectra metadata is found
- class resistics.gather.Selection(sites: List[resistics.project.Site], dec_params: resistics.decimate.DecimationParameters, tables: Dict[int, pandas.core.frame.DataFrame])[source]¶
Bases:
resistics.common.ResisticsData
Selections are output by the Selector. They hold information about the data that should be gathered for the regression.
- get_n_evals() int [source]¶
Get the total number of evaluation frequencies
- Returns
The total number of evaluation frequencies that can be calculated
- Return type
int
- get_n_wins(level: int, eval_idx: int) int [source]¶
Get the number of windows for an evaluation frequency
- Parameters
level (int) – The decimation level
eval_idx (int) – The evaluation frequency index in the decimation level
- Returns
The number of windows
- Return type
int
- Raises
ValueError – If the level is greater than the maximum level available
- get_measurements(site: resistics.project.Site) List[str] [source]¶
Get the measurement names to read from a Site
- Parameters
site (Site) – The site for which to get the measurements
- Returns
The measurements to read from
- Return type
List[str]
- get_eval_freqs() List[float] [source]¶
Get the evaluation frequencies
- Returns
The evaluation frequencies as a flat list of floats
- Return type
List[float]
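As an illustration of the flat list, the sketch below flattens evaluation frequencies stored per decimation level and converts a (level, eval_idx) pair into a position in that list. The names flatten_eval_freqs and flat_index, the example frequencies, and the assumption that every level holds the same number of evaluation frequencies are illustrative and are not taken from the resistics source.

```python
from typing import List


def flatten_eval_freqs(freqs_by_level: List[List[float]]) -> List[float]:
    """Flatten per-level evaluation frequencies into a single list."""
    return [freq for level_freqs in freqs_by_level for freq in level_freqs]


def flat_index(level: int, eval_idx: int, per_level: int) -> int:
    """Position of (level, eval_idx) in the flat list.

    Assumes every level has per_level evaluation frequencies.
    """
    return level * per_level + eval_idx


# Made-up frequencies: three levels, two evaluation frequencies each
freqs_by_level = [[32.0, 22.0], [16.0, 11.0], [8.0, 5.5]]
eval_freqs = flatten_eval_freqs(freqs_by_level)
n_evals = len(eval_freqs)
```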
- get_eval_wins(level: int, eval_idx: int) pandas.core.frame.DataFrame [source]¶
Limit the level windows to the evaluation frequency
- Parameters
level (int) – The decimation level
eval_idx (int) – The evaluation frequency index in the decimation level
- Returns
pandas DataFrame of the windows and the measurements from each site the window can be read from
- Return type
pd.DataFrame
- pydantic model resistics.gather.Selector[source]¶
Bases:
resistics.common.ResisticsProcess
The Selector takes Sites and tries to find shared windows across them. A project instance is required for the Selector to be able to find shared windows.
The Selector should be used for remote reference and intersite processing and single site processing when masks are involved.
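The idea of shared windows can be sketched as a set intersection over global window indices. The helper shared_wins, the site names and the window numbers below are hypothetical; the actual Selector also handles decimation levels and masks, which this sketch ignores.

```python
from typing import Dict


def shared_wins(site_wins: Dict[str, Dict[int, str]]) -> Dict[int, Dict[str, str]]:
    """Keep only the global windows that are present in every site.

    site_wins maps site name to {global window index: measurement name}.
    The result maps each shared window to the measurement to read per site.
    """
    sites = list(site_wins)
    common = set(site_wins[sites[0]])
    for site in sites[1:]:
        common &= set(site_wins[site])
    return {
        win: {site: site_wins[site][win] for site in sites}
        for win in sorted(common)
    }


site_wins = {
    "site1": {3: "meas1", 4: "meas1", 5: "meas1", 28: "meas2"},
    "remote": {4: "run1", 5: "run1", 6: "run1", 28: "run2"},
}
table = shared_wins(site_wins)
```

Windows 3 and 6 are dropped because each exists at only one of the two sites.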
Show JSON schema
{ "title": "Selector", "description": "The Selector takes Sites and tries to find shared windows across them. A\nproject instance is required for the Selector to be able to find shared\nwindows.\n\nThe Selector should be used for remote reference and intersite processing\nand single site processing when masks are involved.", "type": "object", "properties": { "name": { "title": "Name", "type": "string" } } }
- run(config_name: str, proj: resistics.project.Project, site_names: List[str], dec_params: resistics.decimate.DecimationParameters, masks: Optional[Dict[str, str]] = None) resistics.gather.Selection [source]¶
Run the selector
If a site repeats, the selector only considers it once. This might be the case when performing intersite or other cross site style processing.
- Parameters
config_name (str) – The configuration name
proj (Project) – The project instance
site_names (List[str]) – The names of the sites to get data from
dec_params (DecimationParameters) – The decimation parameters with number of levels etc.
masks (Optional[Dict[str, str]], optional) – Any masks to add, by default None
- Returns
The Selection information defining the measurements and windows to read for each site
- Return type
Selection
- field name: Optional[str] [Required]¶
- Validated by
validate_name
- pydantic model resistics.gather.SiteCombinedMetadata[source]¶
Bases:
resistics.common.WriteableMetadata
Metadata for combined data
Combined metadata stores metadata for measurements that are combined from a single site.
Show JSON schema
{ "title": "SiteCombinedMetadata", "description": "Metadata for combined data\n\nCombined metadata stores metadata for measurements that are combined from\na single site.", "type": "object", "properties": { "file_info": { "$ref": "#/definitions/ResisticsFile" }, "site_name": { "title": "Site Name", "type": "string" }, "fs": { "title": "Fs", "type": "number" }, "system": { "title": "System", "default": "", "type": "string" }, "serial": { "title": "Serial", "default": "", "type": "string" }, "wgs84_latitude": { "title": "Wgs84 Latitude", "default": -999.0, "type": "number" }, "wgs84_longitude": { "title": "Wgs84 Longitude", "default": -999.0, "type": "number" }, "easting": { "title": "Easting", "default": -999.0, "type": "number" }, "northing": { "title": "Northing", "default": -999.0, "type": "number" }, "elevation": { "title": "Elevation", "default": -999.0, "type": "number" }, "measurements": { "title": "Measurements", "type": "array", "items": { "type": "string" } }, "chans": { "title": "Chans", "type": "array", "items": { "type": "string" } }, "n_evals": { "title": "N Evals", "type": "integer" }, "eval_freqs": { "title": "Eval Freqs", "type": "array", "items": { "type": "number" } }, "histories": { "title": "Histories", "type": "object", "additionalProperties": { "$ref": "#/definitions/History" } } }, "required": [ "site_name", "fs", "chans", "n_evals", "eval_freqs", "histories" ], "definitions": { "ResisticsFile": { "title": "ResisticsFile", "description": "Required information for writing out a resistics file", "type": "object", "properties": { "created_on_local": { "title": "Created On Local", "type": "string", "format": "date-time" }, "created_on_utc": { "title": "Created On Utc", "type": "string", "format": "date-time" }, "version": { "title": "Version", "type": "string" } } }, "Record": { "title": "Record", "description": "Class to hold a record\n\nA record holds information about a process that was run. 
It is intended to\ntrack processes applied to data, allowing a process history to be saved\nalong with any datasets.\n\nExamples\n--------\nA simple example of creating a process record\n\n>>> from resistics.common import Record\n>>> messages = [\"message 1\", \"message 2\"]\n>>> record = Record(\n... creator={\"name\": \"example\", \"parameter1\": 15},\n... messages=messages,\n... record_type=\"example\"\n... )\n>>> record.summary()\n{\n 'time_local': '...',\n 'time_utc': '...',\n 'creator': {'name': 'example', 'parameter1': 15},\n 'messages': ['message 1', 'message 2'],\n 'record_type': 'example'\n}", "type": "object", "properties": { "time_local": { "title": "Time Local", "type": "string", "format": "date-time" }, "time_utc": { "title": "Time Utc", "type": "string", "format": "date-time" }, "creator": { "title": "Creator", "type": "object" }, "messages": { "title": "Messages", "type": "array", "items": { "type": "string" } }, "record_type": { "title": "Record Type", "type": "string" } }, "required": [ "creator", "messages", "record_type" ] }, "History": { "title": "History", "description": "Class for storing processing history\n\nParameters\n----------\nrecords : List[Record], optional\n List of records, by default []\n\nExamples\n--------\n>>> from resistics.testing import record_example1, record_example2\n>>> from resistics.common import History\n>>> record1 = record_example1()\n>>> record2 = record_example2()\n>>> history = History(records=[record1, record2])\n>>> history.summary()\n{\n 'records': [\n {\n 'time_local': '...',\n 'time_utc': '...',\n 'creator': {\n 'name': 'example1',\n 'a': 5,\n 'b': -7.0\n },\n 'messages': ['Message 1', 'Message 2'],\n 'record_type': 'process'\n },\n {\n 'time_local': '...',\n 'time_utc': '...',\n 'creator': {\n 'name': 'example2',\n 'a': 'parzen',\n 'b': -21\n },\n 'messages': ['Message 5', 'Message 6'],\n 'record_type': 'process'\n }\n ]\n}", "type": "object", "properties": { "records": { "title": "Records", "default": [], 
"type": "array", "items": { "$ref": "#/definitions/Record" } } } } } }
- field site_name: str [Required]¶
The name of the site
- field fs: float [Required]¶
Recording sampling frequency
- field system: str = ''¶
The system used for recording
- field serial: str = ''¶
Serial number of the system
- field wgs84_latitude: float = -999.0¶
Latitude in WGS84
- field wgs84_longitude: float = -999.0¶
Longitude in WGS84
- field easting: float = -999.0¶
The easting of the site in local Cartesian coordinates
- field northing: float = -999.0¶
The northing of the site in local Cartesian coordinates
- field elevation: float = -999.0¶
The elevation of the site
- field measurements: Optional[List[str]] = None¶
List of measurement names that were included in the combined data
- field chans: List[str] [Required]¶
List of channels, these are common amongst all the measurements
- field n_evals: int [Required]¶
The number of evaluation frequencies
- field eval_freqs: List[float] [Required]¶
The evaluation frequencies
- field histories: Dict[str, resistics.common.History] [Required]¶
Dictionary mapping measurement name to measurement processing history
- class resistics.gather.SiteCombinedData(metadata: resistics.gather.SiteCombinedMetadata, data: Dict[int, numpy.ndarray])[source]¶
Bases:
resistics.common.ResisticsData
Combined data is data that is combined from a single site for the purposes of regression.
All of the data that is combined should have the same sampling frequency, same evaluation frequencies and some shared channels.
Data is stored in the data attribute of the class. This is a dictionary mapping evaluation frequency index to data for the evaluation frequency from all windows in the site. The shape of data for a single evaluation frequency is:
n_wins x n_chans
The data is complex valued.
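A sketch of this layout follows, using nested Python lists as a stand-in for the numpy arrays held by the class. The channel names and values are made up for illustration.

```python
chans = ["Ex", "Ey", "Hx", "Hy"]

# evaluation frequency index -> rows are windows, columns are channels
data = {
    0: [
        [1 + 2j, 0.5 - 1j, 3 + 0j, -1 + 1j],
        [2 - 1j, 1.5 + 0j, 2 + 2j, 0 - 1j],
        [0 + 1j, -0.5 + 2j, 1 - 1j, 2 + 0j],
    ],
    1: [
        [1 + 0j, 2 + 1j, 0 - 2j, 1 + 1j],
        [3 - 3j, 0 + 1j, 1 + 0j, 2 - 1j],
        [1 + 1j, 1 - 1j, 2 + 2j, 0 + 0j],
    ],
}


def shape(eval_data):
    """Return (n_wins, n_chans) for one evaluation frequency."""
    return len(eval_data), len(eval_data[0])


n_wins, n_chans = shape(data[0])
```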
- class resistics.gather.GatheredData(out_data: resistics.gather.SiteCombinedData, in_data: resistics.gather.SiteCombinedData, cross_data: resistics.gather.SiteCombinedData)[source]¶
Bases:
resistics.common.ResisticsData
Class to hold data to be used by regression preparers
Gathered data has out_data, in_data and cross_data. The important point is that all three are aligned with respect to windows.
- pydantic model resistics.gather.ProjectGather[source]¶
Bases:
resistics.common.ResisticsProcess
Gather aligned data from a single or multiple sites in the project
Aligned data means that the same index of data across multiple sites points to data covering the same global window (i.e. the same time window). This is essential for calculating intersite or remote reference transfer functions.
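Alignment can be illustrated with a small sketch: once a list of shared global windows is fixed, each site's data is read in that same order, so row i refers to the same time window at every site. The helper align_rows, the window numbers and the values are hypothetical.

```python
from typing import Dict, List


def align_rows(shared: List[int], site_data: Dict[int, complex]) -> List[complex]:
    """Order one site's per-window values by the shared global windows."""
    return [site_data[win] for win in shared]


shared = [4, 5, 28]  # global windows available at both sites
local = {3: 1 + 1j, 4: 2 + 0j, 5: 3 - 1j, 28: 4 + 2j}
remote = {4: 0 + 1j, 5: 1 + 1j, 6: 9 + 9j, 28: 2 - 2j}

out_rows = align_rows(shared, local)
cross_rows = align_rows(shared, remote)
# out_rows[i] and cross_rows[i] now cover the same global window shared[i]
```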
Show JSON schema
{ "title": "ProjectGather", "description": "Gather aligned data from a single or multiple sites in the project\n\nAligned data means that the same index of data across multiple sites points\nto data covering the same global window (i.e. the same time window). This\nis essential for calculating intersite or remote reference transfer\nfunctions.", "type": "object", "properties": { "name": { "title": "Name", "type": "string" } } }
- field name: Optional[str] [Required]¶
- Validated by
validate_name
- run(config_name: str, proj: resistics.project.Project, selection: resistics.gather.Selection, tf: resistics.transfunc.TransferFunction, out_name: str, in_name: Optional[str] = None, cross_name: Optional[str] = None) resistics.gather.GatheredData [source]¶
Gather data for input into the regression preparer
- Parameters
config_name (str) – The config name for getting the correct evals data
proj (Project) – The project instance
selection (Selection) – The selection
tf (TransferFunction) – The transfer function
out_name (str) – The name of the output site
in_name (Optional[str], optional) – The name of the input site, by default None
cross_name (Optional[str], optional) – The name of the cross site, by default None
- Returns
The data gathered for the regression preparer
- Return type
GatheredData
- pydantic model resistics.gather.QuickGather[source]¶
Bases:
resistics.common.ResisticsProcess
Processor to gather data outside of a resistics environment
This is intended for quickly calculating a transfer function for a single measurement. Only a single spectra data instance is accepted as input.
Remote reference or intersite processing is not possible using QuickGather.
See also
ProjectGather
For more advanced gathering of data in a project
Show JSON schema
{ "title": "QuickGather", "description": "Processor to gather data outside of a resistics environment\n\nThis is intended for use when quickly calculating out a transfer function\nfor a single measurement and only a single spectra data instance is accepted\nas input.\n\nRemote reference or intersite processing is not possible using QuickGather\n\nSee Also\n--------\nProjectGather : For more advanced gathering of data in a project", "type": "object", "properties": { "name": { "title": "Name", "type": "string" } } }
- field name: Optional[str] [Required]¶
- Validated by
validate_name
- run(dir_path: pathlib.Path, dec_params: resistics.decimate.DecimationParameters, tf: resistics.transfunc.TransferFunction, eval_data: resistics.spectra.SpectraData) resistics.gather.GatheredData [source]¶
Generate the GatheredData object for input into regression preparation
The input is a single spectra data instance and is used to populate the in_data, out_data and cross_data.
- Parameters
dir_path (Path) – The directory path to the measurement
dec_params (DecimationParameters) – The decimation parameters
tf (TransferFunction) – The transfer function
eval_data (SpectraData) – The spectra data at the evaluation frequencies
- Returns
GatheredData for regression preparer
- Return type
GatheredData
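How a single spectra data instance can populate all three parts can be sketched as a column selection. This assumes the transfer function names its output, input and cross channels. The helper split_chans and the channel lists below are hypothetical and do not reproduce the QuickGather implementation.

```python
from typing import List


def split_chans(chans: List[str], rows: List[List[complex]],
                wanted: List[str]) -> List[List[complex]]:
    """Select the wanted channel columns from rows of window data."""
    idx = [chans.index(chan) for chan in wanted]
    return [[row[i] for i in idx] for row in rows]


chans = ["Ex", "Ey", "Hx", "Hy"]
# Two windows of made-up data at one evaluation frequency
rows = [
    [1 + 1j, 2 + 0j, 3 - 1j, 4 + 2j],
    [0 + 1j, 1 + 1j, 2 - 2j, 5 + 0j],
]

# Assumed channel roles: electric channels out, magnetic channels in and cross
out_data = split_chans(chans, rows, ["Ex", "Ey"])
in_data = split_chans(chans, rows, ["Hx", "Hy"])
cross_data = split_chans(chans, rows, ["Hx", "Hy"])
```

Because all three parts are cut from the same rows, they are aligned by window without any selection step.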