resistics.gather module

Module for gathering data that will be combined to calculate transfer functions

There are two scenarios considered here. The first and simplest is quick processing outside the project environment. In this case data gathering is straightforward: the workflow does not involve a data selector, meaning only a single step is required.

  • QuickGather to put together the out_data, in_data and cross_data

When inside the project environment, regardless of whether it is single site or multi site processing, the workflow is as follows:

  • Selector to select shared windows across all sites for a sampling frequency

  • Gather to gather the combined evaluation frequency data
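As a rough illustration of the selection step, shared windows can be pictured as the intersection of each site's global window indices. The sketch below uses plain pandas with made-up site names (site1, site2) and measurement names; it is an assumption about the general idea, not the resistics implementation.

```python
import pandas as pd

# Hypothetical per-site series: the index is the global window,
# the value is the measurement that holds that window
site1_wins = pd.Series(["meas1", "meas1", "meas1", "meas2"], index=[3, 4, 5, 28])
site2_wins = pd.Series(["run1", "run1", "run2"], index=[4, 5, 28])

# Keep only the global windows that are present at both sites
shared = site1_wins.index.intersection(site2_wins.index)

# One row per shared window, one column per site, giving the
# measurement to read that window from
table = pd.DataFrame(
    {"site1": site1_wins.loc[shared], "site2": site2_wins.loc[shared]}
)
print(table)
```

Window 3 exists only at site1, so it is dropped; the remaining rows tell a gathering step which measurement to read for each site.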

Warning

There may be some confusion in the code, which has many references to both spectra data and evaluation frequency data. Evaluation frequency data, referred to below as eval_data, is actually an instance of Spectra data. However, it is named differently to highlight the fact that it is not the complete spectra data, but spectra data at a reduced set of frequencies corresponding to the evaluation frequencies.

Within a project, there are separate folders for users who want to save both the full spectra data with all the frequencies as well as the evaluation frequency spectra data with the smaller subset of frequencies. Only the evaluation frequency data is required to calculate the transfer function, but the full spectral data might be useful for visualisation and analysis reasons.

resistics.gather.get_site_evals_metadata(config_name: str, proj: resistics.project.Project, site_name: str, fs: float) → Dict[str, resistics.spectra.SpectraMetadata]

Get spectra metadata for a given site and sampling frequency

Parameters
  • config_name (str) – The configuration name to get the right data

  • proj (Project) – The project instance to get the measurements

  • site_name (str) – The name of the site for which to gather the SpectraMetadata

  • fs (float) – The original recording sampling frequency

Returns

Dictionary of measurement name to SpectraMetadata

Return type

Dict[str, SpectraMetadata]

resistics.gather.get_site_level_wins(meas_metadata: Dict[str, resistics.spectra.SpectraMetadata], level: int) → pandas.core.series.Series

Get site windows for a decimation level given a sampling frequency

Parameters
  • meas_metadata (Dict[str, SpectraMetadata]) – The measurement spectra metadata for a site

  • level (int) – The decimation level

Returns

A series indexed by the global windows for the site, where each value is the measurement that contains that global window. The series covers a single decimation level

Return type

pd.Series

See also

get_site_wins

Get windows for all decimation levels

Examples

An example getting the site windows for decimation level 0 when there are three measurements in the site.

>>> from resistics.testing import spectra_metadata_multilevel
>>> from resistics.gather import get_site_level_wins
>>> meas_metadata = {}
>>> meas_metadata["meas1"] = spectra_metadata_multilevel(n_wins=[3, 2, 2], index_offset=[3, 2, 1])
>>> meas_metadata["meas2"] = spectra_metadata_multilevel(n_wins=[4, 3, 2], index_offset=[28, 25, 22])
>>> meas_metadata["meas3"] = spectra_metadata_multilevel(n_wins=[2, 2, 1], index_offset=[108, 104, 102])
>>> get_site_level_wins(meas_metadata, 0)
3      meas1
4      meas1
5      meas1
28     meas2
29     meas2
30     meas2
31     meas2
108    meas3
109    meas3
dtype: object
>>> get_site_level_wins(meas_metadata, 1)
2      meas1
3      meas1
25     meas2
26     meas2
27     meas2
104    meas3
105    meas3
dtype: object
>>> get_site_level_wins(meas_metadata, 2)
1      meas1
2      meas1
22     meas2
23     meas2
102    meas3
dtype: object

resistics.gather.get_site_wins(config_name: str, proj: resistics.project.Project, site_name: str, fs: float) → Dict[int, pandas.core.series.Series]

Get site windows for all levels given a sampling frequency

Parameters
  • config_name (str) – The configuration name to get the right data

  • proj (Project) – The project instance to get the measurements

  • site_name (str) – The site name

  • fs (float) – The recording sampling frequency

Returns

Dictionary mapping decimation level (integer) to the site windows for that level, with one entry for each decimation level

Return type

Dict[int, pd.Series]

Raises

ValueError – If no matching spectra metadata is found
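The shape of the returned dictionary can be pictured with plain pandas. The sketch below builds it by hand, reusing the window indices from the get_site_level_wins example; no project is read and the variable name site_wins is only illustrative.

```python
import pandas as pd

# Decimation level -> series of global window index -> measurement name.
# Values copy the get_site_level_wins example and are illustrative only.
site_wins = {
    0: pd.Series(
        ["meas1"] * 3 + ["meas2"] * 4 + ["meas3"] * 2,
        index=[3, 4, 5, 28, 29, 30, 31, 108, 109],
    ),
    1: pd.Series(
        ["meas1"] * 2 + ["meas2"] * 3 + ["meas3"] * 2,
        index=[2, 3, 25, 26, 27, 104, 105],
    ),
    2: pd.Series(
        ["meas1"] * 2 + ["meas2"] * 2 + ["meas3"],
        index=[1, 2, 22, 23, 102],
    ),
}

# Summarise the number of windows and contributing measurements per level
for level, wins in site_wins.items():
    print(level, len(wins), wins.unique().tolist())
```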

class resistics.gather.Selection(sites: List[resistics.project.Site], dec_params: resistics.decimate.DecimationParameters, tables: Dict[int, pandas.core.frame.DataFrame])

Bases: resistics.common.ResisticsData

Selections are output by the Selector. They hold information about the data that should be gathered for the regression.

get_n_evals() → int

Get the total number of evaluation frequencies

Returns

The total number of evaluation frequencies that can be calculated

Return type

int

get_n_wins(level: int, eval_idx: int) → int

Get the number of windows for an evaluation frequency

Parameters
  • level (int) – The decimation level

  • eval_idx (int) – The evaluation frequency index in the decimation level

Returns

The number of windows

Return type

int

Raises

ValueError – If the level is greater than the maximum level available

get_measurements(site: resistics.project.Site) → List[str]

Get the measurement names to read from a Site

Parameters

site (Site) – The site for which to get the measurements

Returns

The measurements to read from

Return type

List[str]

get_eval_freqs() → List[float]

Get the evaluation frequencies

Returns

The evaluation frequencies as a flat list of floats

Return type

List[float]
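To make "flat list" concrete, the sketch below flattens per-level evaluation frequencies into a single list. The frequencies are made up, and the flat index convention (level times evaluation frequencies per level, plus the index within the level) is an assumption for illustration rather than a documented property of Selection.

```python
from itertools import chain

# Hypothetical evaluation frequencies in Hz, two per decimation level
evals_per_level = [[32.0, 16.0], [8.0, 4.0], [2.0, 1.0]]
per_level = len(evals_per_level[0])

# Flatten to a single list, level by level
eval_freqs = list(chain.from_iterable(evals_per_level))
n_evals = len(eval_freqs)

# Assumed convention: flat index from (level, eval_idx)
level, eval_idx = 1, 1
flat_idx = level * per_level + eval_idx
print(n_evals, flat_idx, eval_freqs[flat_idx])
```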

get_eval_wins(level: int, eval_idx: int) → pandas.core.frame.DataFrame

Limit the level windows to the evaluation frequency

Parameters
  • level (int) – The decimation level

  • eval_idx (int) – The evaluation frequency index in the decimation level

Returns

pandas DataFrame of the windows and the measurements from each site the window can be read from

Return type

pd.DataFrame

pydantic model resistics.gather.Selector

Bases: resistics.common.ResisticsProcess

The Selector takes Sites and tries to find shared windows across them. A project instance is required for the Selector to be able to find shared windows.

The Selector should be used for remote reference and intersite processing, and for single site processing when masks are involved.

Show JSON schema
{
   "title": "Selector",
   "description": "The Selector takes Sites and tries to find shared windows across them. A\nproject instance is required for the Selector to be able to find shared\nwindows.\n\nThe Selector should be used for remote reference and intersite processing\nand single site processing when masks are involved.",
   "type": "object",
   "properties": {
      "name": {
         "title": "Name",
         "type": "string"
      }
   }
}

run(config_name: str, proj: resistics.project.Project, site_names: List[str], dec_params: resistics.decimate.DecimationParameters, masks: Optional[Dict[str, str]] = None) → resistics.gather.Selection

Run the selector

If a site repeats, the selector only considers it once. This might be the case when performing intersite or other cross site style processing.
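A repeated site can arise when, for example, the same site is passed as both the input and the cross site. One simple way to consider each site only once while keeping the order is sketched below; unique_site_names is a hypothetical helper, not part of resistics.

```python
from typing import List


def unique_site_names(site_names: List[str]) -> List[str]:
    """Drop repeated site names, keeping first-seen order (hypothetical helper)"""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(site_names))


# Output site, input site and cross site, where input and cross are the same
names = unique_site_names(["site1", "site2", "site2"])
print(names)
```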

Parameters
  • config_name (str) – The configuration name

  • proj (Project) – The project instance

  • site_names (List[str]) – The names of the sites to get data from

  • dec_params (DecimationParameters) – The decimation parameters with number of levels etc.

  • masks (Optional[Dict[str, str]], optional) – Any masks to add, by default None

Returns

The Selection information defining the measurements and windows to read for each site

Return type

Selection

field name: Optional[str] [Required]
Validated by
  • validate_name

pydantic model resistics.gather.SiteCombinedMetadata

Bases: resistics.common.WriteableMetadata

Metadata for combined data

Combined metadata stores metadata for measurements that are combined from a single site.

Show JSON schema
{
   "title": "SiteCombinedMetadata",
   "description": "Metadata for combined data\n\nCombined metadata stores metadata for measurements that are combined from\na single site.",
   "type": "object",
   "properties": {
      "file_info": {
         "$ref": "#/definitions/ResisticsFile"
      },
      "site_name": {
         "title": "Site Name",
         "type": "string"
      },
      "fs": {
         "title": "Fs",
         "type": "number"
      },
      "system": {
         "title": "System",
         "default": "",
         "type": "string"
      },
      "serial": {
         "title": "Serial",
         "default": "",
         "type": "string"
      },
      "wgs84_latitude": {
         "title": "Wgs84 Latitude",
         "default": -999.0,
         "type": "number"
      },
      "wgs84_longitude": {
         "title": "Wgs84 Longitude",
         "default": -999.0,
         "type": "number"
      },
      "easting": {
         "title": "Easting",
         "default": -999.0,
         "type": "number"
      },
      "northing": {
         "title": "Northing",
         "default": -999.0,
         "type": "number"
      },
      "elevation": {
         "title": "Elevation",
         "default": -999.0,
         "type": "number"
      },
      "measurements": {
         "title": "Measurements",
         "type": "array",
         "items": {
            "type": "string"
         }
      },
      "chans": {
         "title": "Chans",
         "type": "array",
         "items": {
            "type": "string"
         }
      },
      "n_evals": {
         "title": "N Evals",
         "type": "integer"
      },
      "eval_freqs": {
         "title": "Eval Freqs",
         "type": "array",
         "items": {
            "type": "number"
         }
      },
      "histories": {
         "title": "Histories",
         "type": "object",
         "additionalProperties": {
            "$ref": "#/definitions/History"
         }
      }
   },
   "required": [
      "site_name",
      "fs",
      "chans",
      "n_evals",
      "eval_freqs",
      "histories"
   ],
   "definitions": {
      "ResisticsFile": {
         "title": "ResisticsFile",
         "description": "Required information for writing out a resistics file",
         "type": "object",
         "properties": {
            "created_on_local": {
               "title": "Created On Local",
               "type": "string",
               "format": "date-time"
            },
            "created_on_utc": {
               "title": "Created On Utc",
               "type": "string",
               "format": "date-time"
            },
            "version": {
               "title": "Version",
               "type": "string"
            }
         }
      },
      "Record": {
         "title": "Record",
         "description": "Class to hold a record\n\nA record holds information about a process that was run. It is intended to\ntrack processes applied to data, allowing a process history to be saved\nalong with any datasets.\n\nExamples\n--------\nA simple example of creating a process record\n\n>>> from resistics.common import Record\n>>> messages = [\"message 1\", \"message 2\"]\n>>> record = Record(\n...     creator={\"name\": \"example\", \"parameter1\": 15},\n...     messages=messages,\n...     record_type=\"example\"\n... )\n>>> record.summary()\n{\n    'time_local': '...',\n    'time_utc': '...',\n    'creator': {'name': 'example', 'parameter1': 15},\n    'messages': ['message 1', 'message 2'],\n    'record_type': 'example'\n}",
         "type": "object",
         "properties": {
            "time_local": {
               "title": "Time Local",
               "type": "string",
               "format": "date-time"
            },
            "time_utc": {
               "title": "Time Utc",
               "type": "string",
               "format": "date-time"
            },
            "creator": {
               "title": "Creator",
               "type": "object"
            },
            "messages": {
               "title": "Messages",
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "record_type": {
               "title": "Record Type",
               "type": "string"
            }
         },
         "required": [
            "creator",
            "messages",
            "record_type"
         ]
      },
      "History": {
         "title": "History",
         "description": "Class for storing processing history\n\nParameters\n----------\nrecords : List[Record], optional\n    List of records, by default []\n\nExamples\n--------\n>>> from resistics.testing import record_example1, record_example2\n>>> from resistics.common import History\n>>> record1 = record_example1()\n>>> record2 = record_example2()\n>>> history = History(records=[record1, record2])\n>>> history.summary()\n{\n    'records': [\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example1',\n                'a': 5,\n                'b': -7.0\n            },\n            'messages': ['Message 1', 'Message 2'],\n            'record_type': 'process'\n        },\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example2',\n                'a': 'parzen',\n                'b': -21\n            },\n            'messages': ['Message 5', 'Message 6'],\n            'record_type': 'process'\n        }\n    ]\n}",
         "type": "object",
         "properties": {
            "records": {
               "title": "Records",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/Record"
               }
            }
         }
      }
   }
}

field site_name: str [Required]

The name of the site

field fs: float [Required]

Recording sampling frequency

field system: str = ''

The system used for recording

field serial: str = ''

Serial number of the system

field wgs84_latitude: float = -999.0

Latitude in WGS84

field wgs84_longitude: float = -999.0

Longitude in WGS84

field easting: float = -999.0

The easting of the site in local Cartesian coordinates

field northing: float = -999.0

The northing of the site in local Cartesian coordinates

field elevation: float = -999.0

The elevation of the site

field measurements: Optional[List[str]] = None

List of measurement names that were included in the combined data

field chans: List[str] [Required]

List of channels; these are common to all the measurements

field n_evals: int [Required]

The number of evaluation frequencies

field eval_freqs: List[float] [Required]

The evaluation frequencies

field histories: Dict[str, resistics.common.History] [Required]

Dictionary mapping measurement name to measurement processing history

class resistics.gather.SiteCombinedData(metadata: resistics.gather.SiteCombinedMetadata, data: Dict[int, numpy.ndarray])

Bases: resistics.common.ResisticsData

Combined data is data that is combined from a single site for the purposes of regression.

All of the data that is combined should have the same sampling frequency, same evaluation frequencies and some shared channels.

Data is stored in the data attribute of the class. This is a dictionary mapping evaluation frequency index to data for the evaluation frequency from all windows in the site. The shape of data for a single evaluation frequency is:

n_wins x n_chans

The data is complex valued.
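The layout described above can be mimicked with a plain dictionary of NumPy arrays. This is only a sketch of the data attribute's shape; the channel count and window counts are made-up values and random numbers stand in for real spectra.

```python
import numpy as np

n_chans = 4  # e.g. a hypothetical channel list ["Ex", "Ey", "Hx", "Hy"]
# Hypothetical number of windows for each evaluation frequency index
n_wins_per_eval = {0: 9, 1: 9, 2: 7}

rng = np.random.default_rng(0)
# Evaluation frequency index -> complex array of shape n_wins x n_chans
data = {
    eval_idx: rng.normal(size=(n_wins, n_chans))
    + 1j * rng.normal(size=(n_wins, n_chans))
    for eval_idx, n_wins in n_wins_per_eval.items()
}

for eval_idx, values in data.items():
    print(eval_idx, values.shape, values.dtype)
```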

class resistics.gather.GatheredData(out_data: resistics.gather.SiteCombinedData, in_data: resistics.gather.SiteCombinedData, cross_data: resistics.gather.SiteCombinedData)

Bases: resistics.common.ResisticsData

Class to hold data to be used by regression preparers

Gathered data has an out_data, in_data and cross_data. The important thing here is that the data is all aligned with regard to windows.

pydantic model resistics.gather.ProjectGather

Bases: resistics.common.ResisticsProcess

Gather aligned data from a single or multiple sites in the project

Aligned data means that the same index of data across multiple sites points to data covering the same global window (i.e. the same time window). This is essential for calculating intersite or remote reference transfer functions.
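A minimal NumPy sketch of what alignment means in practice, assuming two sites with partially overlapping global windows and one row of data per window. The window indices and values are invented, and this is not how ProjectGather itself reads data.

```python
import numpy as np

# Hypothetical global windows available at each site for one evaluation frequency
site1_wins = np.array([3, 4, 5, 28])
site2_wins = np.array([4, 5, 28, 29])

# Hypothetical single channel data, one row per window in the order above
site1_data = np.array([[30 + 0j], [40 + 0j], [50 + 0j], [280 + 0j]])
site2_data = np.array([[4 + 1j], [5 + 1j], [28 + 1j], [29 + 1j]])

# Shared global windows plus the row positions of those windows at each site
shared, idx1, idx2 = np.intersect1d(site1_wins, site2_wins, return_indices=True)

# After selection, row i of both arrays covers global window shared[i]
aligned1 = site1_data[idx1]
aligned2 = site2_data[idx2]
print(shared, aligned1[:, 0], aligned2[:, 0])
```

Rows for windows 3 and 29 are dropped because they exist at only one site.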

Show JSON schema
{
   "title": "ProjectGather",
   "description": "Gather aligned data from a single or multiple sites in the project\n\nAligned data means that the same index of data across multiple sites points\nto data covering the same global window (i.e. the same time window). This\nis essential for calculating intersite or remote reference transfer\nfunctions.",
   "type": "object",
   "properties": {
      "name": {
         "title": "Name",
         "type": "string"
      }
   }
}

field name: Optional[str] [Required]
Validated by
  • validate_name

run(config_name: str, proj: resistics.project.Project, selection: resistics.gather.Selection, tf: resistics.transfunc.TransferFunction, out_name: str, in_name: Optional[str] = None, cross_name: Optional[str] = None) → resistics.gather.GatheredData

Gather data for input into the regression preparer

Parameters
  • config_name (str) – The config name for getting the correct evals data

  • proj (Project) – The project instance

  • selection (Selection) – The selection

  • tf (TransferFunction) – The transfer function

  • out_name (str) – The name of the output site

  • in_name (Optional[str], optional) – The name of the input site, by default None

  • cross_name (Optional[str], optional) – The name of the cross site, by default None

Returns

The data gathered for the regression preparer

Return type

GatheredData

pydantic model resistics.gather.QuickGather

Bases: resistics.common.ResisticsProcess

Processor to gather data outside of a resistics environment

This is intended for quickly calculating a transfer function for a single measurement. Only a single spectra data instance is accepted as input.

Remote reference or intersite processing is not possible using QuickGather

See also

ProjectGather

For more advanced gathering of data in a project

Show JSON schema
{
   "title": "QuickGather",
   "description": "Processor to gather data outside of a resistics environment\n\nThis is intended for use when quickly calculating out a transfer function\nfor a single measurement and only a single spectra data instance is accepted\nas input.\n\nRemote reference or intersite processing is not possible using QuickGather\n\nSee Also\n--------\nProjectGather : For more advanced gathering of data in a project",
   "type": "object",
   "properties": {
      "name": {
         "title": "Name",
         "type": "string"
      }
   }
}

field name: Optional[str] [Required]
Validated by
  • validate_name

run(dir_path: pathlib.Path, dec_params: resistics.decimate.DecimationParameters, tf: resistics.transfunc.TransferFunction, eval_data: resistics.spectra.SpectraData) → resistics.gather.GatheredData

Generate the GatheredData object for input into regression preparation

The input is a single spectra data instance and is used to populate the in_data, out_data and cross_data.
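Conceptually, populating the three data sets from one array amounts to selecting the relevant channel columns. The sketch below assumes an impedance-style setup with electric output channels and magnetic input and cross channels; the channel names and the select_chans helper are hypothetical and do not reflect the internals of QuickGather.

```python
import numpy as np

# Hypothetical channels of the single evaluation frequency data array
chans = ["Ex", "Ey", "Hx", "Hy"]
# Assumed channel roles, as a transfer function definition might supply them
out_chans = ["Ex", "Ey"]
in_chans = ["Hx", "Hy"]
cross_chans = ["Hx", "Hy"]


def select_chans(values: np.ndarray, chans: list, wanted: list) -> np.ndarray:
    """Pick the columns for the wanted channels (hypothetical helper)"""
    idx = [chans.index(chan) for chan in wanted]
    return values[:, idx]


# Made-up complex data of shape n_wins x n_chans for one evaluation frequency
eval_values = np.arange(12).reshape(3, 4) * (1 + 1j)

out_data = select_chans(eval_values, chans, out_chans)
in_data = select_chans(eval_values, chans, in_chans)
cross_data = select_chans(eval_values, chans, cross_chans)
print(out_data.shape, in_data.shape, cross_data.shape)
```

Because all three arrays come from the same rows, they are aligned by window without any further selection.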

Parameters
  • dir_path (Path) – The directory path to the measurement

  • dec_params (DecimationParameters) – The decimation parameters

  • tf (TransferFunction) – The transfer function

  • eval_data (SpectraData) – The spectra data at the evaluation frequencies

Returns

GatheredData for regression preparer

Return type

GatheredData