.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorial-read/eg_02_read_time_bz2.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorial-read_eg_02_read_time_bz2.py:


Time data bz2
^^^^^^^^^^^^^

This example shows how to read time data from a compressed ASCII file using
the default ASCII data reader. In this case, the data has been compressed using
bz2.

To read such a compressed ASCII data file, a metadata file is required. The
example shows what information the metadata file needs and how to create it.

The dataset in this example has been provided for use by the SAMTEX consortium.
For more information, please refer to [Jones2009]_. Additional details about the
dataset can be found at https://www.mtnet.info/data/kap03/kap03.html.

The dataset is KAP148. A couple of notes:

- The data has a sample every 5 seconds, meaning a 0.2 Hz sampling frequency.
- Values of 1E32 have been replaced by NaN.

.. GENERATED FROM PYTHON SOURCE LINES 22-28

.. code-block:: default

    from pathlib import Path
    import bz2
    import plotly
    import pandas as pd
    from resistics.time import ChanMetadata, TimeMetadata, TimeReaderAscii, InterpolateNans

.. GENERATED FROM PYTHON SOURCE LINES 29-31

Define the data path. This is dependent on where the data is stored. Here, the
data path is given relative to the working directory.

.. GENERATED FROM PYTHON SOURCE LINES 31-34

.. code-block:: default

    time_data_path = Path("..", "..", "data", "time", "bz2")
    ascii_data_path = time_data_path / "kap148as.ts.bz2"

.. GENERATED FROM PYTHON SOURCE LINES 35-37

The folder contains a single ASCII data file. Let's have a look at the contents
of the file.

.. GENERATED FROM PYTHON SOURCE LINES 37-43

.. code-block:: default

    with bz2.open(ascii_data_path, "rt") as f:
        for line_number, line in enumerate(f):
            print(line.strip("\n"))
            if line_number >= 130:
                break

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    # time series file from tssplice
    # date: Mon Nov 7 05:27:46 2016
    #
    # Files spliced together:
    # kap148a1 2003-10-25 11:30:00-2003-11-02 10:52:04
    # kap148b1 2003-11-02 11:30:00-2003-11-12 11:15:34
    # kap148c1 2003-11-12 11:45:00-2003-11-21 13:43:14
    # kap148d1 2003-11-21 14:30:00-2003-11-29 10:14:10
    #
    # Following comment block from first file...
    #
    # time series file from mp2ts
    # date: Mon Nov 7 05:27:32 2016
    #
    # input file: kap148\kap148a1.1mp
    #
    # Machine endian: Little
    # UNIX set : F
    #
    # site description: suikerbosrand
    #
    # Latitude :025:55:40 S
    # Longitude :026:27:04 E
    #
    # LiMS acquisition code : 10.2
    # LiMS box number : 26
    # Magnetometer number : 26
    #
    # Ex line length (m): 100.00
    # Ey line length (m): 98.00
    #
    # Azimuths relative to: MAGNETIC NORTH
    # Ex azimuth; 0
    # Ey azimuth; 90
    # Hx azimuth; 0
    # Hy azimuth; 90
    #
    # FIRST 20 POINTS DROPPED FROM .1mp FILE TO
    # ACCOUNT FOR FILTER SETTLING
    #
    #F Filter block begin
    #F
    #F Filters applied to LiMS/LRMT data are:
    #F 1: Analogue anti-alias six-pole Bessel low-pass
    #F filters on each channel with -3 dB point at nominally 5 Hz.
    #F -calibrated values given below
    #F
    #F 2: Digital anti-alias multi-stage Chebyshev FIR filters
    #F with final stage at 2xsampling rate
    #F
    #F 1: Analogue single-pole Butterworth high-pass filters on the
    #F telluric channels only with -3 dB point at nominally 30,000 s
    #F -calibrated values given below
    #F
    #F Chan Calib Low-pass High-pass (s)
    #F 1 1.00 0.00 0.00
    #F 2 1.00 0.00 0.00
    #F 3 1.00 0.00 0.00
    #F 4 1.00 0.00 0.00
    #F 5 1.00 0.00 0.00
    #F
    #F In the tsrestack code, these filter responses are
    #F removed using bessel7.f and high17.f
    #F
    #F Filter block end
    >INFO_START:
    >STATION :kap148
    >INSTRUMENT: 26
    >WINDOW :kap148as
    >LATITUDE : -25.9277802
    >LONGITUDE : 26.4511108
    >ELEVATION : 1518.
    >UTM_ORIGIN: 27.
    >UTM_NORTH : -2867639
    >UTM_EAST : 445033
    >COORD_SYS :MAGNETIC NORTH
    >DECLIN : 0.
    >FORM :ASCII
    >FORMAT :FREE
    >SEQ_REC : 1
    >NCHAN : 5
    >CHAN_1 :HX
    >SENSOR_1 : 26
    >AZIM_1 : 0.
    >UNITS_1 :nT
    >GAIN_1 : 1.
    >BASELINE_1: 12410.8799
    >CHAN_2 :HY
    >SENSOR_2 : 26
    >AZIM_2 : 90.
    >UNITS_2 :nT
    >GAIN_2 : 1.
    >BASELINE_2: -245.759995
    >CHAN_3 :HZ
    >SENSOR_3 : 26
    >AZIM_3 : 0.
    >UNITS_3 :nT
    >GAIN_3 : 1.
    >BASELINE_3: -25784.3203
    >CHAN_4 :EX
    >SENSOR_4 : 26
    >AZIM_4 : 0.
    >UNITS_4 :mV/km
    >GAIN_4 : 1.
    >CHAN_5 :EY
    >SENSOR_5 : 26
    >AZIM_5 : 90.
    >UNITS_5 :mV/km
    >GAIN_5 : 1.
    >STARTTIME :2003-10-25 11:30:00
    >ENDTIME :2003-11-29 10:14:10
    >T_UNITS :s
    >DELTA_T : 5.
    >MIS_DATA : 1.00000003E+32
    >INFO_END :
    3.00425005 2.62300014 -0.381249994 2.5315001 2.44311237
    3.01950002 2.60774994 -0.457500011 2.62300014 2.34974504
    3.01950002 2.62300014 -0.488000005 2.51625013 2.33418369
    3.03474998 2.65350008 -0.442250013 2.45525002 2.48979592
    3.06524992 2.66875005 -0.427000016 2.50099993 2.58316326
    3.04999995 2.68400002 -0.488000005 2.54675007 2.55204082
    3.09575009 2.69924998 -0.564249992 2.63825011 2.38086748
    3.21775007 2.71449995 -0.610000014 2.60774994 2.28750014

.. GENERATED FROM PYTHON SOURCE LINES 44-46

Note that the metadata requires the number of samples. Pandas can be useful for
this purpose.

.. GENERATED FROM PYTHON SOURCE LINES 46-50

.. code-block:: default

    df = pd.read_csv(ascii_data_path, header=None, skiprows=123, sep=r"\s+")
    n_samples = len(df.index)
    print(df)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

                    0            1          2           3           4
    0        3.004250     2.623000  -0.381250    2.531500    2.443112
    1        3.019500     2.607750  -0.457500    2.623000    2.349745
    2        3.019500     2.623000  -0.488000    2.516250    2.334184
    3        3.034750     2.653500  -0.442250    2.455250    2.489796
    4        3.065250     2.668750  -0.427000    2.501000    2.583163
    ...           ...          ...        ...         ...         ...
    603885  67.700172  -132.892487  33.821594  -33.046749  178.160461
    603886  68.035675  -132.724747  33.836845  -32.955250  398.631897
    603887  67.791672  -133.105988  34.294342  -32.665501  509.894653
    603888  66.952919  -134.768250  34.919594  -32.345249  508.774261
    603889  66.571670  -135.332489  35.102593  -32.452000  503.001038

    [603890 rows x 5 columns]

.. GENERATED FROM PYTHON SOURCE LINES 51-52

Define other key pieces of recording information.

.. GENERATED FROM PYTHON SOURCE LINES 52-57

.. code-block:: default

    fs = 0.2
    chans = ["Hx", "Hy", "Hz", "Ex", "Ey"]
    first_time = pd.Timestamp("2003-10-25 11:30:00")
    last_time = first_time + (n_samples - 1) * pd.Timedelta(1 / fs, "s")

.. GENERATED FROM PYTHON SOURCE LINES 58-61

The next step is to create a TimeMetadata object. The TimeMetadata has
information about the recording and channels. Let's construct the TimeMetadata
and save it as a JSON file alongside the time series data file.

.. GENERATED FROM PYTHON SOURCE LINES 61-79

.. code-block:: default

    chans_metadata = {}
    for chan in chans:
        chan_type = "electric" if chan in ["Ex", "Ey"] else "magnetic"
        chans_metadata[chan] = ChanMetadata(
            name=chan, chan_type=chan_type, data_files=[ascii_data_path.name]
        )
    time_metadata = TimeMetadata(
        fs=fs,
        chans=chans,
        n_samples=n_samples,
        first_time=first_time,
        last_time=last_time,
        chans_metadata=chans_metadata,
    )
    time_metadata.summary()
    time_metadata.write(time_data_path / "metadata.json")

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {
        'file_info': None,
        'fs': 0.2,
        'chans': ['Hx', 'Hy', 'Hz', 'Ex', 'Ey'],
        'n_chans': 5,
        'n_samples': 603890,
        'first_time': '2003-10-25 11:30:00.000000_000000_000000_000000',
        'last_time': '2003-11-29 10:14:05.000000_000000_000000_000000',
        'system': '',
        'serial': '',
        'wgs84_latitude': -999.0,
        'wgs84_longitude': -999.0,
        'easting': -999.0,
        'northing': -999.0,
        'elevation': -999.0,
        'chans_metadata': {
            'Hx': {
                'name': 'Hx',
                'data_files': ['kap148as.ts.bz2'],
                'chan_type': 'magnetic',
                'chan_source': None,
                'sensor': '',
                'serial': '',
                'gain1': 1,
                'gain2': 1,
                'scaling': 1,
                'chopper': False,
                'dipole_dist': 1,
                'sensor_calibration_file': '',
                'instrument_calibration_file': ''
            },
            'Hy': {
                'name': 'Hy',
                'data_files': ['kap148as.ts.bz2'],
                'chan_type': 'magnetic',
                'chan_source': None,
                'sensor': '',
                'serial': '',
                'gain1': 1,
                'gain2': 1,
                'scaling': 1,
                'chopper': False,
                'dipole_dist': 1,
                'sensor_calibration_file': '',
                'instrument_calibration_file': ''
            },
            'Hz': {
                'name': 'Hz',
                'data_files': ['kap148as.ts.bz2'],
                'chan_type': 'magnetic',
                'chan_source': None,
                'sensor': '',
                'serial': '',
                'gain1': 1,
                'gain2': 1,
                'scaling': 1,
                'chopper': False,
                'dipole_dist': 1,
                'sensor_calibration_file': '',
                'instrument_calibration_file': ''
            },
            'Ex': {
                'name': 'Ex',
                'data_files': ['kap148as.ts.bz2'],
                'chan_type': 'electric',
                'chan_source': None,
                'sensor': '',
                'serial': '',
                'gain1': 1,
                'gain2': 1,
                'scaling': 1,
                'chopper': False,
                'dipole_dist': 1,
                'sensor_calibration_file': '',
                'instrument_calibration_file': ''
            },
            'Ey': {
                'name': 'Ey',
                'data_files': ['kap148as.ts.bz2'],
                'chan_type': 'electric',
                'chan_source': None,
                'sensor': '',
                'serial': '',
                'gain1': 1,
                'gain2': 1,
                'scaling': 1,
                'chopper': False,
                'dipole_dist': 1,
                'sensor_calibration_file': '',
                'instrument_calibration_file': ''
            }
        },
        'history': {'records': []}
    }

.. GENERATED FROM PYTHON SOURCE LINES 80-82

Now the data is ready to be read in by resistics. Read it in and print the
first and last sample values.

.. GENERATED FROM PYTHON SOURCE LINES 82-87

.. code-block:: default

    reader = TimeReaderAscii(extension=".bz2", n_header=123)
    time_data = reader.run(time_data_path)
    print(time_data.data[:, 0])
    print(time_data.data[:, -1])

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    [ 3.00425 2.6230001 -0.38125 2.5315 2.4431124]
    [ 66.57167 -135.33249 35.102592 -32.452 503.00104 ]

.. GENERATED FROM PYTHON SOURCE LINES 88-90

There are some invalid values in the data that have been replaced by NaN
values. Interpolate the NaN values.

.. GENERATED FROM PYTHON SOURCE LINES 90-92

.. code-block:: default

    time_data = InterpolateNans().run(time_data)

.. GENERATED FROM PYTHON SOURCE LINES 93-95

Finally, plot the data. By default, the data is downsampled using the LTTB
algorithm to avoid slow and large plots.

.. GENERATED FROM PYTHON SOURCE LINES 95-98

.. code-block:: default

    fig = time_data.plot(max_pts=1_000)
    fig.update_layout(height=700)
    plotly.io.show(fig)

.. raw:: html
    :file: images/sphx_glr_eg_02_read_time_bz2_001.html

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 8.911 seconds)

.. _sphx_glr_download_tutorial-read_eg_02_read_time_bz2.py:

.. only :: html

  .. container:: sphx-glr-footer
     :class: sphx-glr-footer-example

     .. container:: sphx-glr-download sphx-glr-download-python

        :download:`Download Python source code: eg_02_read_time_bz2.py `

     .. container:: sphx-glr-download sphx-glr-download-jupyter

        :download:`Download Jupyter notebook: eg_02_read_time_bz2.ipynb `

.. only:: html

  .. rst-class:: sphx-glr-signature

     `Gallery generated by Sphinx-Gallery `_
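
The ``last_time`` stored in the metadata follows from the first sample time, the number of samples and the sampling frequency. As a rough cross-check that needs neither pandas nor resistics, the same arithmetic can be sketched with the standard library. The helper name ``last_sample_time`` is hypothetical and is not part of resistics; the input values are the ones used in this example.

```python
from datetime import datetime, timedelta


def last_sample_time(first_time: datetime, n_samples: int, fs: float) -> datetime:
    """Return the timestamp of the final sample (hypothetical helper)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # n_samples samples span (n_samples - 1) sampling intervals of 1 / fs seconds
    return first_time + timedelta(seconds=(n_samples - 1) / fs)


first_time = datetime(2003, 10, 25, 11, 30, 0)
last_time = last_sample_time(first_time, n_samples=603890, fs=0.2)
print(last_time)  # 2003-11-29 10:14:05
```

The result agrees with the ``last_time`` value shown in the metadata summary of this example.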