reduction module

The reduction module converts the original Febus format into a reducted format, with the following options and properties:

  • transpose the section from the original [time x space] ordering into a [space x time] ordering, giving faster access for usual processing

  • decimate the section along the time and/or space dimension

  • select a range along the time and/or space dimension

  • eliminate redundant data

  • optimize the chunk size(s) in the HDF5 container

  • simplify metadata storage

reduction functions

These two functions perform the conversion:

reduction.reduction_transpose(filein, fileout, trange=None, drange=None, tdecim=1, ddecim=1, hpcorner=None, kchunk=10, trace_chunk=None, verbose=0, filterOMP=True, no_aliasing=True, compression=False, position=None)

Description

Read a DAS section from an HDF5 Febus-Optics file, extract a subset or read all of it, transpose the section, and optionally apply a highpass filter in the space domain.

!!! Time decimation is performed with lowpass filtering if requested !!!

After reduction, data is written to disk using the IEEE float32 representation.

Input

filein, fileout:

hdf5 (.h5) file to read/write

trange:

(tuple, list) time range to read; see core.parse_trange() for details (default=None, everything is read)

drange:

(tuple, list) distance range in meters (not km), as (start, end); see core.parse_drange() for details (default=None, everything is read)

ddecim:

(int) distance decimation factor (default=1)

tdecim:

(int) time decimation factor (default=1)

position:

(float ndarray[3,:]) 3 x ntraces array with x,y,z position (default=None)

hpcorner:

(float) corner wavelength for highpass spatial filtering (e.g. 600 m; default=None)

kchunk:

(int) the output time chunk size is set to input_time_chunk_size * kchunk (default=10)

trace_chunk:

(int) write trace_chunk traces per space block/chunk (default = ntrace/100)

verbose:

(int) be verbose (default=0, minimum message level)

filterOMP:

(bool) use the multithread sosfilter binary package instead of scipy.signal.sosfiltfilt (default True)

no_aliasing:

(bool) if tdecim > 1, apply a lowpass filter at 0.9 x F_nyquist before decimation (default=True)

compression:

(bool) compress data in H5 files (default False)

Usage example

>>> import a1das
>>> a1das.reduction.reduction_transpose('fichier_in.h5','fichier_out.h5')
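The effect of the tdecim and no_aliasing options can be sketched as plain numpy/scipy code. This is a minimal illustration of lowpass-then-decimate, not the package's actual implementation; the filter order and design here are arbitrary assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def decimate_time(section, fs, tdecim, no_aliasing=True):
    """Decimate along the time axis (axis 0). When no_aliasing is True
    and tdecim > 1, first lowpass at 0.9 x the decimated Nyquist,
    as the no_aliasing option describes (4th-order Butterworth is an
    arbitrary choice for this sketch)."""
    if tdecim > 1 and no_aliasing:
        cutoff = 0.9 * fs / (2 * tdecim)   # 0.9 x new Nyquist frequency
        sos = butter(4, cutoff, btype='low', fs=fs, output='sos')
        section = sosfiltfilt(sos, section, axis=0)
    return section[::tdecim, :]

# [time x space] section: 1000 samples at 1 kHz, 5 traces
sec = np.random.randn(1000, 5)
dec = decimate_time(sec, fs=1000.0, tdecim=5)
```

With tdecim=5 the 1000 time samples reduce to 200, and the lowpass step guarantees no energy above the new Nyquist folds back into the decimated signal.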

Reading transposed reducted output file

With a1das API:

>>> import a1das
>>> f=a1das.open(filename,format='reducted') #format is optional
>>> print(f) # or shortcut f<return>, print file header
>>> a=f.read() #read all data (or use options) into an a1Section object
>>> print(a) # print section header parameters
>>> gl=a['gauge_length'] #get header parameter
>>> dist=a['dist'] # get distance vector
>>> time=a['time'] # get time vector
>>> section=a.data # get 2D float data array
>>> f.close()

With h5py:

>>> import h5py
>>> f=h5py.File(filename)
>>> header=f['/header'].attrs   # header dictionary
>>> print(header.keys())       # list of metadata
>>> gl=header['gauge_length']  # pick one of metadata
>>> dist=f['/distance']         # distance vector
>>> time=f['/time']             # time vector
>>> section = f['/section'][:,:] # 2D das section
>>> f.close()
reduction.reduction_notranspose(filein, fileout, trange=None, drange=None, tdecim=1, ddecim=1, hpcorner=None, kchunk=10, trace_chunk=None, verbose=0, filterOMP=True, compression=False)

Description

Read a DAS section from an HDF5 Febus-Optics file, extract a subset or read all of it, optionally apply a highpass filter in the space domain, and remove data redundancy.

!!! Time decimation is performed without any lowpass filtering !!!

After reduction, the records are stored in a single 2D array (time x space) where space is the fast axis

Input

filein, fileout:

hdf5 (.h5) file to read/write

trange:

(tuple, list) time range to read; see core.parse_trange() for details (default=None, everything is read)

drange:

(tuple, list) distance range in meters (not km); see core.parse_drange() for details (default=None, everything is read)

ddecim:

(int) distance decimation factor (default=1)

tdecim:

(int) time decimation factor (default=1)

hpcorner:

(float) corner wavelength for highpass spatial filtering (e.g. 600 m; default=None)

kchunk:

(int) the output HDF5 chunk size is set to input_time_block_size * kchunk (default=10)

trace_chunk:

(int) output space chunk set to trace_chunk traces (default = ntrace/100)

verbose:

(int) be verbose (default=0, minimum message level)

filterOMP:

(bool) use the multithread sosfilter binary package instead of scipy.signal.sosfiltfilt (default True)

compression:

(bool) use compression in H5 files (default False)

Usage example

>>> import a1das
>>> a1das.reduction.reduction_notranspose('fichier_in.h5','fichier_out.h5')
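The hpcorner option of both functions removes long spatial wavelengths along the fiber. A rough sketch of such a spatial highpass, assuming hpcorner is a corner wavelength in meters as in the "600 m" example above (the 2nd-order Butterworth design is illustrative, not the package's actual code):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def spatial_highpass(section, dx, hpcorner):
    """Highpass the section along the space axis (axis 1).
    dx: spatial step (m); hpcorner: corner wavelength (m).
    Illustrative 2nd-order Butterworth in the wavenumber domain."""
    fs_space = 1.0 / dx        # spatial sampling rate (1/m)
    cutoff = 1.0 / hpcorner    # corner wavenumber (1/m)
    sos = butter(2, cutoff, btype='high', fs=fs_space, output='sos')
    return sosfiltfilt(sos, section, axis=1)

sec = np.random.randn(100, 2000)   # [time x space], dx = 1 m
flt = spatial_highpass(sec, dx=1.0, hpcorner=600.0)
```

Wavelengths much longer than hpcorner (including any constant offset along the fiber) are suppressed, while shorter wavelengths pass through unchanged.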

Reading a non-transposed reducted output file

With Python, using A1File.read(filename, format='reducted'), or as in the following example:

>>> import h5py
>>> f=h5py.File(filename)
>>> header=f['header'].attrs   # header dictionary
>>> print(header.keys())       # list of header keys
>>> dist=f['/distance']         # distance vector
>>> time=f['/time']             # time vector
>>> data=f['/section'][:,:]           # 2D strain[-rate] ndarray [ntime x nspace](float64)

reducted format

The reducted format uses an HDF5 container to store data and metadata (or header). It is simplified and optimized as much as possible. The file can be read either with a1das I/O functions or directly with HDF5 I/O calls.

HDF5 format detail

The format uses one group and three datasets:

  • /header group

    • contains metadata stored as attributes. Using h5py, they are read as a dictionary

  • /distance dataset

    • 1D distance vector of size nspace

  • /time dataset

    • 1D time vector of size ntime

  • /section dataset

    • 2D section array of size [nspace x ntime] or [ntime x nspace], depending on whether the data has been transposed or not
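The layout above can be reproduced in a few lines of h5py. This sketch writes and re-reads a tiny file using the group/dataset names of the reducted format; all values are dummies:

```python
import os
import tempfile
import h5py
import numpy as np

nspace, ntime = 4, 16
path = os.path.join(tempfile.mkdtemp(), 'mini_reducted.h5')

with h5py.File(path, 'w') as f:
    h = f.create_group('header')                    # /header group
    h.attrs['file_type'] = 'reducted_format'        # metadata as attributes
    h.attrs['gauge_length'] = 10.0                  # dummy value
    f.create_dataset('distance', data=np.arange(nspace) * 2.0)  # /distance
    f.create_dataset('time', data=np.arange(ntime) * 0.01)      # /time
    # transposed layout: [nspace x ntime]
    f.create_dataset('section',
                     data=np.zeros((nspace, ntime), dtype='float32'))

with h5py.File(path, 'r') as f:
    header = dict(f['/header'].attrs)
    section = f['/section'][:, :]
```

Because the format is just one group and three datasets, any HDF5-aware tool can consume a reducted file without knowing anything about a1das.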

Header content

  • file_type = 'reducted_format'

  • version = reducted format version number

plus all the metadata included in the A1Section header. The A1Section header is a class that mimics a Python dictionary and can be extended dynamically. Mandatory header values can be viewed by calling the A1Section.header() method:

A1Section Mandatory header fields
---------------------------------
(see in _a1headers.py "_required_header_keys" dict)

key             	|	 meaning
---------------------------------------------------
gauge_length    	|	 gauge length (meter)
sampling_res    	|	 original spatial resolution (cm), usually between 20 and 60 cm
prf             	|	 laser Pulse Rate Frequency (Hz)
data_type       	|	 data type = raw, strain, strain-rate, ...
axis1           	|	 dimension of data first axis: "time" or "space"
axis2           	|	 dimension of data second axis: "time" or "space"
dt              	|	 time step (sec) for time axis
ntime           	|	 number of time samples
otime           	|	 origin time. The absolute time of a sample is time + otime
dx              	|	 spatial step (m) for distance axis
nspace          	|	 number of space samples
ospace          	|	 origin position along fiber. The absolute distance is given by dist + ospace
dist            	|	 relative distance vector w/r to origin position
time            	|	 relative time vector w/r to origin time
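The otime/ospace conventions in the table reduce to plain array arithmetic. A small numpy illustration with made-up header values:

```python
import numpy as np

# dummy header values following the conventions above
otime = 1.6e9                  # origin time (e.g. a POSIX timestamp, s)
ospace = 120.0                 # origin position along the fiber (m)
time = np.arange(5) * 0.01     # relative time vector (dt = 0.01 s)
dist = np.arange(3) * 4.0      # relative distance vector (dx = 4 m)

abs_time = time + otime        # absolute time of each sample
abs_dist = dist + ospace       # absolute position along the fiber
```

Storing only the relative vectors plus the two scalar origins keeps the time and distance datasets small while preserving full absolute positioning.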


A1Section optional header fields
---------------------------------
(see in _a1headers.py "_other_header_keys" dict)

key             	|	 meaning
---------------------------------------------------
derivation_time 	|	 time interval (msec) used for time derivation
time_derivation_order 	|	 finite difference order for time derivation
space_derivation_order 	|	 finite difference order for space derivation