Modules

This page lists the internal methods of all modules, for development purposes.

PROFFASTpylot consists of four main modules:

  • Pylot contains the high level functionality,

  • Filemover creates a consistent file structure from the PROFFAST output,

  • Prepare derives all processing options from the input and

  • Pressure reads and interpolates the pressure data.

Pylot

class pylot.Pylot(input_file, logginglevel='info', external_logger=None, loggername=None)

Bases: FileMover

Start all PROFFAST processes.

_add_timezones_to(df)

Add UTC and local timezone at measurement location.

_call_external_program(command_list, **kwargs)

Call an external program. Return output and error.
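
As a rough illustration of returning both output and error from an external call, the following sketch uses subprocess.run. The helper name call_external_program and the choice of subprocess.run are assumptions made for this example; the actual method may be implemented differently.

```python
import subprocess
import sys


def call_external_program(command_list, **kwargs):
    """Run a command and return its decoded stdout and stderr."""
    completed = subprocess.run(
        command_list,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        **kwargs,
    )
    return completed.stdout, completed.stderr


# The running Python interpreter serves as a stand-in for an executable.
output, error = call_external_program([sys.executable, "-c", "print('ok')"])
```

Extra keyword arguments such as cwd are passed through to subprocess.run unchanged.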

_get_executable(program)

Return PROFFAST executable of the given program part.

Parameters:

program (str) – can be “prep”, “pcxs” or “inv”

Returns:

executable (str) – depending on the current operating system.

_get_merged_df()

Read all invparm.dat files as DataFrames and combine them.

_select_rename_cols(df)

Return df with selected and renamed columns.

_write_logfile(program_name, output)

Write the output of preprocess, pcxs and inv to a logfile.

clean_files()

Clean up the files that are no longer needed after execution.

combine_results()

Combine the generated result files and save as csv.

run(n_processes=1)

Execute all processes of PROFFAST.

Run preprocessing, pcxs and invers. The generated data is moved and merged in a result folder.

run_inv(n_processes=1)

Run inverse.

Loops over localdates, generates the input files and runs invers.

Parameters:

n_processes (int) – If n_processes == 1, run_inv_at is called directly. Otherwise it is called via run_parallel.
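
The dispatch on n_processes described above can be sketched as follows. This is only an illustration: run_for_all and fake_run_inv_at are hypothetical names, and a thread pool stands in for run_parallel, whose implementation is not shown on this page.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_all(dates, worker, n_processes=1):
    """Call worker(date) for every date, serially or via a pool."""
    if n_processes == 1:
        # Direct call: no pool overhead and easier to debug.
        return [worker(date) for date in dates]
    # Stand-in for run_parallel; a thread pool keeps the sketch portable.
    with ThreadPoolExecutor(max_workers=n_processes) as pool:
        return list(pool.map(worker, dates))


def fake_run_inv_at(date):
    """Hypothetical worker; the real one would start invers for this date."""
    return f"processed {date}"


serial = run_for_all(["220101", "220102"], fake_run_inv_at, n_processes=1)
parallel = run_for_all(["220101", "220102"], fake_run_inv_at, n_processes=2)
```

Both calls return the results in the order of the input dates, so the serial and the pooled path are interchangeable for the caller.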

run_pcxs(n_processes=1)

Run pcxs.

Loops over local dates and executes the following steps:
  • Check if the abscos bin file already exists.

  • Interpolate the mapfile. If no mapfile is found for a local date, the date is removed from self.local_dates.

  • Generate the input file.

  • Run pcxs (in parallel).

Parameters:

n_processes (int) – If n_processes == 1, run_pcxs_at is called directly. Otherwise it is called via run_parallel.

run_preprocess(n_processes=1)

Main method to run preprocess.

run_prf_with_inputfile(prf_inputfile, executable, popen_kwargs={})

Run PROFFAST with the given input file.

Filemover

class filemover.FileMover(input_file, logginglevel='info', external_logger=None, loggername=None)

Bases: Preparation

Copy, move and remove temporary PROFFAST files.

_create_analysis_cal_folders()

Create the analysis and cal folder.

Created folders:
  • analysis/<Site>_<Instrument>

  • analysis/<Site>_<Instrument>/<YYMMDD>/cal

    for the spectra of all measurement days.

_create_result_dir()

Create the result directories and a backup if previous results exist.

Within this folder the following subfolders are created:
  • input_files,

  • logfiles

  • raw_output_proffast

Backup behavior:

If backup_results is True and the result folder already exists, the existing folder is renamed by appending backupX, where X increases if another backup already exists. After renaming, a new folder is created.
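
A minimal sketch of this backup behavior, assuming the backup name is formed as <folder>_backupX (the exact naming scheme is an assumption, as is the helper name backup_and_recreate):

```python
import tempfile
from pathlib import Path


def backup_and_recreate(result_dir):
    """Rename an existing result folder to <name>_backupX and recreate it."""
    result_dir = Path(result_dir)
    backup = None
    if result_dir.exists():
        x = 1
        backup = result_dir.with_name(f"{result_dir.name}_backup{x}")
        # Increase X until an unused backup name is found.
        while backup.exists():
            x += 1
            backup = result_dir.with_name(f"{result_dir.name}_backup{x}")
        result_dir.rename(backup)
    result_dir.mkdir(parents=True)
    return backup


with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    first = backup_and_recreate(results)   # nothing to back up yet
    second = backup_and_recreate(results)  # results -> results_backup1
    third = backup_and_recreate(results)   # results -> results_backup2
    folder_names = sorted(p.name for p in Path(tmp).iterdir())
```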

_create_result_subdirs()

Create the subfolders in the result folder.

The folders ‘input_files’, ‘logfiles’ and ‘raw_output_proffast’ are only created if not existent.

_move_generallogfile_to_logdir()

Move the general logfile to the logdir.

This needs to be done at the end, since the folder is created by the program itself.

_move_logfile()

Move the logfile to the log-folder.

To do this, the file handler is closed, the file is moved and the handler is re-opened.
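
The close, move and re-open sequence can be sketched with the standard logging module. The function move_logfile and the file name pylot.log are illustrative assumptions, not the names used by PROFFASTpylot.

```python
import logging
import shutil
import tempfile
from pathlib import Path


def move_logfile(logger, handler, target_dir):
    """Close the file handler, move its file and attach a new handler."""
    old_path = Path(handler.baseFilename)
    new_path = Path(target_dir) / old_path.name
    logger.removeHandler(handler)
    handler.close()  # release the file before moving it
    shutil.move(str(old_path), str(new_path))
    new_handler = logging.FileHandler(new_path, mode="a")  # keep old lines
    new_handler.setFormatter(handler.formatter)
    logger.addHandler(new_handler)
    return new_handler


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "logfiles"
    log_dir.mkdir()
    logger = logging.getLogger("move_logfile_demo")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(Path(tmp) / "pylot.log")
    logger.addHandler(handler)
    logger.info("before move")
    handler = move_logfile(logger, handler, log_dir)
    logger.info("after move")
    logger.removeHandler(handler)
    handler.close()
    log_lines = (log_dir / "pylot.log").read_text().splitlines()
```

Opening the new handler in append mode keeps the messages written before the move.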

_move_prf_config_file()

Copy the PROFFASTpylot input file to the result folder.

check_abscosbin_summed_size()

Get size of all abscos.bin files. Give warning if too large.

delete_abscos_files()

Delete the abscos.bin files created by pcxs.

delete_input_files()

Delete the input files for preprocess, pcxs and inv.

handle_pT_VMR_files()

Copy or move the pT and VMR files created by pcxs.

If delete_abscosbin_files is True, the pT and VMR files are MOVED to the result folder. If delete_abscosbin_files is False, the pT and VMR files are COPIED to the result folder.

They contain the prior information and are therefore an important part of the result. Hence, they should appear in the result folder in any case.
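
The move-or-copy decision can be illustrated as follows. The helper transfer_files and the file name example_pT.dat are hypothetical; only the flag name delete_abscosbin_files is taken from this page.

```python
import shutil
import tempfile
from pathlib import Path


def transfer_files(files, result_dir, delete_abscosbin_files):
    """Move the files if the originals are deleted anyway, else copy them."""
    action = shutil.move if delete_abscosbin_files else shutil.copy2
    for file in files:
        action(str(file), str(Path(result_dir) / Path(file).name))


with tempfile.TemporaryDirectory() as tmp:
    source, result = Path(tmp) / "pT", Path(tmp) / "result"
    source.mkdir()
    result.mkdir()
    pt_file = source / "example_pT.dat"  # hypothetical filename
    pt_file.write_text("prior")
    transfer_files([pt_file], result, delete_abscosbin_files=False)
    kept_after_copy = pt_file.exists()
    transfer_files([pt_file], result, delete_abscosbin_files=True)
    kept_after_move = pt_file.exists()
    in_result = (result / "example_pT.dat").exists()
```

Either way the file ends up in the result folder; the flag only decides whether the original survives.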

init_folders()

Create all relevant folders on startup if they do not exist.

Check if the relevant PROFFAST folders exist. Folders to be created:
  • pT, cal directories

  • result folder (backup of previous results)

  • logfiles

move_input_files()

Move the input files for preprocess, pcxs and inv to the result folder.

move_results()

Move the generated files to the result folder.

The invparms_?.dat, job_?.spc and version_?.dat files are searched and moved to the result folder. If files are not found, a warning is printed.

The colsens.dat files are produced by pcxs and are
  • moved if delete_abscosbin_files is True

  • copied if delete_abscosbin_files is False.

This ensures that every run has them in the result folder, regardless of whether pcxs was executed or skipped in this run.

Prepare

class prepare.Preparation(input_file, logginglevel='info', external_logger=None, loggername=None)

Bases: object

Import input parameters, and create input files.

_check_mapfile_coordinates(header)

Check if the coordinates of the mapfile are consistent. Print a warning if not.

Parameters:

header (list of lines) – originating from a GGG2020 mapfile

_create_datelist(path)

Create datelist of given path. Skip elements that are not folders of the format “YYMMDD”.
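
One possible way to build such a date list is sketched below. The function name create_datelist and the sorted return value are assumptions; the skipping rule follows the description above.

```python
import datetime as dt
import tempfile
from pathlib import Path


def create_datelist(path):
    """Return the sorted dates of all sub-folders named like YYMMDD."""
    dates = []
    for entry in Path(path).iterdir():
        name = entry.name
        if not entry.is_dir() or len(name) != 6 or not name.isdigit():
            continue  # not a folder, or not six digits
        try:
            dates.append(dt.datetime.strptime(name, "%y%m%d"))
        except ValueError:
            continue  # six digits, but not a valid date (e.g. month 13)
    return sorted(dates)


with tempfile.TemporaryDirectory() as tmp:
    for folder in ["220102", "220101", "notes", "221345"]:
        (Path(tmp) / folder).mkdir()
    (Path(tmp) / "220103").write_text("a file, not a folder")
    found = create_datelist(tmp)
```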

_get_end_date_pos(end_date, dates)

Return position of the end date in dates.

_get_localtime_offset()

Return offset between measurement time and local time.

utc_offset + localtime_offset = total offset between local time and UTC, and thus localtime_offset = total_localtime_utc_offset - utc_offset.

_get_start_date_pos(start_date, dates)

Return position of the start date in dates.

_replace_backslash(line)

Replace backslash with slash if run on Linux.

_set_wet_vmr()

Set self.mapfile_wet_vmr, which sets the %WET_VMR% parameter, if it is not given in the input file:
  • GGG2014 map files: dry air (False)

  • GGG2020 map files: wet air (True)

The value can be given separately in the input file.

create_logger(logginglevel='info', loggername=None, external_logger=None)

Create and return a logger.

defaults = {'backup_results': True, 'coord_file': None, 'coords': {'alt': None, 'lat': None, 'lon': None}, 'delete_abscosbin_files': False, 'delete_input_files': False, 'delete_spc_files': True, 'ignore_interpolation_error': None, 'igram_pattern': '*.*', 'ils_parameters': None, 'instrument_parameters': 'em27', 'mapfile_wetair_vmr': None, 'min_interferogram_size': 3.7, 'note': None, 'start_with_spectra': False, 'utc_offset': 0.0}
generate_invers_input(local_date)

Fills the invers input file.

Calls get_inv_parameters with the local date.

Parameters:

local_date (dt.datetime) – the date in local time to be processed

Returns:

prf_input_files, skipped_spectra

  • prf_input_files (list):

    A list of paths to the input files.

  • skipped_spectra (list):

    List containing all spectra skipped on this day due to missing pressure values. This list is provided by get_spectra_pT_input called in get_inv_parameters.

generate_pcxs_input(local_date)

Fills the pcxs input file.

Calls get_pcxs_parameters with local_date.

Parameters:

local_date (dt.datetime) – the date in local time to be processed

Returns:

prf_input_file (str) – Path to the input file.

generate_preprocess_input(meas_date)

Fills the preprocess template file.

Calls self.get_prep_parameters(meas_date).

Parameters:

meas_date (dt.datetime) – the date in measurement time to be processed

Returns:

prf_input_file (str or None) – None if no interferograms are found, otherwise the path to the input file

get_coords()

Return dict of coords.

If coords were not given or contain None for at least one coordinate, the coord_file will be read. If the coord_file was also not given, operation will be terminated.

get_coords_from_file(date)

Return the coordinates from the coord file.

get_igrams(meas_date)

Search for interferograms on disk and return a list of files.

get_ils_from_file(date)

Read the ILS parameters from the given file.

Parameters:

date (dt.datetime) – If multiple ILS parameters are given in the list, the newest ILS parameters that are already valid at date are used.

Returns:

ils_parameters (tuple) – MEChan1, PEChan1, MEChan2, PEChan2
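
The selection rule for multiple ILS entries can be sketched like this. The data layout (a list of validity start dates with parameter tuples), the helper name select_ils and all numeric values are hypothetical; the format of the real ILS file is not described on this page.

```python
import datetime as dt


def select_ils(entries, date):
    """Return the parameters of the newest entry that is valid at date."""
    valid = [(start, params) for start, params in entries if start <= date]
    if not valid:
        raise ValueError(f"No ILS parameters valid at {date}")
    return max(valid, key=lambda entry: entry[0])[1]


# Hypothetical values: (valid since, (MEChan1, PEChan1, MEChan2, PEChan2))
ils_entries = [
    (dt.datetime(2020, 1, 1), (0.983, 0.002, 0.984, 0.003)),
    (dt.datetime(2022, 6, 1), (0.981, 0.001, 0.982, 0.002)),
]
ils_2021 = select_ils(ils_entries, dt.datetime(2021, 3, 15))
ils_2023 = select_ils(ils_entries, dt.datetime(2023, 3, 15))
```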

get_inv_parameters(local_date)

Return parameters to fill the invers20.inp template.

If spectra from two measurement days (i.e. different folders) belong to one local date, spectra_pT_input is a list with two elements; otherwise it has one element.

For each element in the spectra_pT_input list a dict containing the parameters for the input files is generated.

The list of spectra belonging to one local date is sorted by the measurement date of the spectra, which is determined by the function get_times_of(spectrum).

Parameters:

local_date (dt.datetime) – date in local time

Returns:

parameters, skipped_spectra

  • parameters (list): Contains one or two dict objects, depending on whether all spectra of the local date are stored in the same YYMMDD folder.

  • skipped_spectra (list): List containing all spectra skipped on this day due to missing pressure values. This list is provided by get_spectra_pT_input.

get_local_noon_utc(local_date)

Return local noon in UTC.

Local noon refers to 12:00 in local time. Daylight saving time is not considered for this transformation.

Parameters:

local_date (dt.datetime) – date in local time

Returns:

local_noon_utc – 12:00 in local time converted to UTC
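
As a small worked example of this conversion, the sketch below subtracts a fixed offset from 12:00 local time. The function name, the offset argument and the sign convention (local time = UTC + offset) are assumptions of this example.

```python
import datetime as dt


def local_noon_in_utc(local_date, total_utc_offset_hours):
    """Convert 12:00 local time to UTC using a fixed offset in hours."""
    local_noon = dt.datetime.combine(local_date.date(), dt.time(12, 0))
    # Assumed sign convention: local time = UTC + offset.
    return local_noon - dt.timedelta(hours=total_utc_offset_hours)


noon_utc = local_noon_in_utc(dt.datetime(2022, 7, 1), total_utc_offset_hours=2.0)
```

With an offset of +2 hours, local noon on 2022-07-01 corresponds to 10:00 UTC on the same day.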

get_localdate_spectra()

Return dict linking all spectra to local dates.

Returns:

localdate_spectra (dict) – contains a list of full paths to the spectra for each local date, in the format {local_date: [“path/YYMMDD_HHMMSSSN.BIN”, …]}

get_mapfiles(local_noon_utc)

Return mapfiles of date and following date of the local noon in UTC.

get_meas_dates(start_date=None, end_date=None)

Return a list of dates in measurement time for the site + instrument.

Truncate the list if start_date and end_date are given.

Parameters:
  • start_date (dt.date) – optional start date

  • end_date (dt.date) – optional end date

get_pcxs_parameters(local_date)

Return parameters to fill the pcxs20.inp template.

Parameters:

local_date (dt.datetime) – date in local time

Returns:

parameters (dict) – dict containing the parameters to fill the pcxs template.

get_prep_parameters(meas_date)

Return parameters to be replaced in the preprocess template.

Parameters:

meas_date (dt.datetime) – date in measurement time

Returns:

parameters (dict) – dict with parameters to fill the preprocess template.

get_prf_input_path(template_type, date=None)

Return path to the corresponding prf_input_file.

get_spectra(meas_date)

Return list of spectra for a given date (in measurement time).

Parameters:

meas_date (dt.datetime) – date in measurement time

Returns:

spectra (list) – with full path to all spectra of measurement date [“path_to/YYMMDD_HHMMSSSN.BIN”, …]

get_spectra_pT_input(local_date)

Return invers formatted pT infos for given local date.

If two measurement dates belong to one local date, spectra_pT_input contains two lists. The list is split based on the filename of the spectra, since the filename refers to measurement time.

The string has the format YYMMDD_HHMMSSSN.BIN, pressure, T_PBL

This function replaces the pt_intraday.inp file from older PROFFAST versions. Note that T_PBL is currently set to 0.0.

Parameters:

local_date (dt.datetime) – Date in local time

Returns:

spectra_pT_input (list), skipped_spectra (list)

  • spectra_pT_input (list): List containing a list of strings with spectra and pT infos.

  • skipped_spectra (list): List containing all spectra skipped on this day due to missing pressure values.

get_template_path(template_type)

Return path to the corresponding template file.

get_times_of(spectrum)

Read the measurement time from the filename and calculate local and UTC time. Check if the UTC time is consistent with the spectra header.

Parameters:

spectrum (str) – full path to a spectrum

Returns:

times (dict)

A dict with the following keys:
  • meas_time (dt.datetime): time parsed from the filename

  • local_time (dt.datetime): calculated local time

  • utc_time (dt.datetime): read from spectra header
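
A sketch of how the three times could be derived from a filename of the form YYMMDD_HHMMSSSN.BIN is given below. Unlike the method documented here, which reads utc_time from the spectra header, the sketch computes it from an offset; the function name and the sign conventions of both offsets are assumptions.

```python
import datetime as dt
from pathlib import Path


def times_from_filename(spectrum, utc_offset=0.0, localtime_offset=0.0):
    """Parse the measurement time from the filename and derive other times."""
    name = Path(spectrum).name
    # The first 13 characters hold YYMMDD_HHMMSS; "SN" and the suffix follow.
    meas_time = dt.datetime.strptime(name[:13], "%y%m%d_%H%M%S")
    # Assumed conventions: measurement time = UTC + utc_offset and
    # local time = measurement time + localtime_offset.
    utc_time = meas_time - dt.timedelta(hours=utc_offset)
    local_time = meas_time + dt.timedelta(hours=localtime_offset)
    return {"meas_time": meas_time, "local_time": local_time, "utc_time": utc_time}


times = times_from_filename("path_to/220701_235500SN.BIN", localtime_offset=2.0)
```

In this example the spectrum is stored under measurement date 2022-07-01 but belongs to local date 2022-07-02, which is the situation in which one local date spans two measurement folders.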

instrument_templates = {'em27': 'em27.yml', 'invenio': 'invenio.yml', 'ircube': 'ircube.yml', 'tccon_default_hr': 'tccon_default_hr.yml', 'tccon_default_lr': 'tccon_default_lr.yml', 'tccon_ka_hr': 'tccon_ka_hr.yml', 'tccon_ka_lr': 'tccon_ka_lr.yml', 'vertex': 'vertex.yml'}
interpolate_map_files(local_date)

Interpolate GGG2020 map files.

Generate a map file at 12:00 local time. This method is only called for mapfiles of type GGG2020. The mapfile is created in <result_folder>/interpolated_mapfiles

with the following filename:

“<site_abbrev><local_noon_utc>_Z.map”

The folder interpolated_mapfiles is created in this function.

Parameters:

local_date (dt.datetime) – datetime in local time

mandatory_options = ['instrument_number', 'site_name', 'site_abbrev', 'map_path', 'pressure_path', 'pressure_type_file', 'interferogram_path', 'analysis_path', 'result_path']
prepare_map_file(local_date)

Generate a map file if GGG2020 map files are used.

Parameters:

local_date (dt.datetime) – date in local time

Returns:

success (bool) – True if map files were found and created, False if no files were found.

replace_params_in_template(parameters, template_type, prf_input_file)

Generate a site specific input file by using a template.

Parameters:
  • parameters (dict) – Containing keys which match the variable names in the template file. They are replaced by the entries.

  • template_type (str) – Can be “prep”, “pt”, “inv” or “pcxs”

  • prf_input_file (str) – The filename of the input file
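
The replacement step can be sketched as a plain string substitution. The placeholder syntax %NAME% is inferred from the %WET_VMR% parameter mentioned on this page; the helper fill_template, the template text and the SITE key are hypothetical.

```python
def fill_template(template_text, parameters):
    """Replace every %KEY% placeholder with the matching parameter value."""
    filled = template_text
    for key, value in parameters.items():
        filled = filled.replace(f"%{key}%", str(value))
    return filled


# Hypothetical template lines; only the %WET_VMR% name appears on this page.
template = "site: %SITE%\nwet_vmr: %WET_VMR%\n"
filled = fill_template(template, {"SITE": "Example", "WET_VMR": True})
```

Placeholders without a matching key are left untouched, which makes missing parameters easy to spot in the generated file.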

template_types = {'inv': 'invers24', 'pcxs': 'pcxs24', 'prep': 'preprocess6'}
class prepare.PylotOnly(name='')

Bases: Filter

A filter which filters out all logs not originating from the pylot.

This is used when an external logger is provided, to prevent external logging messages from showing up in the PROFFASTpylot custom log file.

filenames = ['prepare.py', 'filemover.py', 'pylot.py', 'pressure.py']
filter(record)

Determine if the specified record is to be logged.

Returns True if the record should be logged, or False otherwise. If deemed appropriate, the record may be modified in-place.
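
A filter of this kind can be sketched with the standard logging module by comparing the filename of each record against the list above. The class name FilenameFilter and the helper make_record are illustrative; the actual filter may use different criteria.

```python
import logging


class FilenameFilter(logging.Filter):
    """Pass only records that were emitted from the listed source files."""

    filenames = ["prepare.py", "filemover.py", "pylot.py", "pressure.py"]

    def filter(self, record):
        return record.filename in self.filenames


def make_record(pathname):
    """Build a log record as if it had been emitted from the given file."""
    return logging.LogRecord("demo", logging.INFO, pathname, 1, "msg", None, None)


only_pylot = FilenameFilter()
passes = only_pylot.filter(make_record("/some/path/pylot.py"))
blocked = only_pylot.filter(make_record("/some/path/other_tool.py"))
```

Such a filter would typically be attached to the file handler with addFilter, so that only the custom log file is affected.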

Pressure

class pressure.PressureHandler(pressure_type_file, pressure_path, dates, logger, measurement_time=0)

Bases: object

Read, interpolate and return pressure data from various formats.

_append_dtype_to_csv_kwargs()

Return extended csv_kwargs to make sure the date and time column are interpreted as string.

_apply_pressure_offset_and_factor()

Multiply the pressure column by the pressure factor.

_check_mandatory()

Check mandatory options for completeness.

The options must satisfy the following:
  • a pressure key is given,

  • the time key XOR the datetime key is given and

  • the filename is not empty.

Raises:

RuntimeError – in case of a missing option.
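
The three conditions can be sketched as follows. All dictionary keys used here (pressure_key, time_key, datetime_key, basename) and the function name are hypothetical placeholders, since the real option names are not listed on this page.

```python
def check_mandatory(dataframe_parameters, filename_parameters):
    """Raise RuntimeError if the options are incomplete or contradictory."""
    if not dataframe_parameters.get("pressure_key"):
        raise RuntimeError("A pressure key is required.")
    has_time = bool(dataframe_parameters.get("time_key"))
    has_datetime = bool(dataframe_parameters.get("datetime_key"))
    if has_time == has_datetime:  # XOR violated: both given or both missing
        raise RuntimeError("Give either a time key or a datetime key.")
    if not filename_parameters.get("basename"):
        raise RuntimeError("The filename must not be empty.")


# A valid combination passes silently.
check_mandatory({"pressure_key": "p", "time_key": "t"}, {"basename": "pressure"})

# Giving both a time key and a datetime key violates the XOR condition.
try:
    check_mandatory(
        {"pressure_key": "p", "time_key": "t", "datetime_key": "dt"},
        {"basename": "pressure"},
    )
    xor_error = None
except RuntimeError as error:
    xor_error = str(error)
```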

_get_filename(date)

Return merged filename of pressure_type.

_parse_datetime_col(df, date=None)

Parse the dataframe for a suitable datetime.

Add the column ‘parsed_datecol’ to the dataframe. Depending on the options given, the datetime column is constructed from the combination of the separate time and date columns.

Parameters:

df (pandas.DataFrame) – pressure dataframe containing time information in arbitrary format.

Returns:

df (pandas.DataFrame) – with an additional datetime column.

_parse_pressure(date=None)

Parse the internal raw p_df and eliminate bad values.

_read_subdaily_files()

Read the subdaily AND daily files into the internal p_df.

_read_unregular_files()

Read irregular files. Save the result in the self.p_df DataFrame.

_read_yearly_files()

Read yearly files and return a dict containing the pressure for each day in dates.

_set_defaults(option)

Set defaults in dataframe, filename and data parameters dict. Check for mandatory options.

Parameters:

option (str) – “dataframe_parameters”, “filename_parameters” or “data_parameters”

Returns:

modified dict

default_options = {'data_parameters': {}, 'max_interpolation_time': 2, 'pressure_factor': 1.0, 'pressure_offset': 0.0, 'utc_offset': 0.0}
get_pressure_at(pressure_time)

Return the interpolated pressure at a given time.

If the value is rejected or an interpolation error occurred, p=0 is returned. The corresponding spectra will not be processed. This is determined in prepare.get_spectra_pT_input().

If the pressure measurements for a whole day are missing, the whole day is deleted from the processing list in pylot.run_inv().

Parameters:

pressure_time (datetime.datetime) – time in the timezone of the pressure file
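
The idea of interpolating and returning 0 on failure can be sketched in plain Python. This is not the PressureHandler implementation: the function name, the list-of-tuples input, the linear interpolation and the interpretation of max_interpolation_time as a gap in hours are all assumptions of this example.

```python
import datetime as dt


def pressure_at(samples, pressure_time, max_interpolation_time=2.0):
    """Linearly interpolate the pressure; return 0.0 if that is not possible.

    samples: time-sorted list of (datetime, pressure) tuples.
    max_interpolation_time: largest allowed gap between neighbours in hours
    (the unit is an assumption of this sketch).
    """
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t0 <= pressure_time <= t1:
            gap_hours = (t1 - t0).total_seconds() / 3600
            if gap_hours > max_interpolation_time:
                return 0.0  # neighbours too far apart, value is rejected
            if t1 == t0:
                return p0
            weight = (pressure_time - t0) / (t1 - t0)
            return p0 + weight * (p1 - p0)
    return 0.0  # requested time lies outside the measured range


samples = [
    (dt.datetime(2022, 7, 1, 10, 0), 1000.0),
    (dt.datetime(2022, 7, 1, 11, 0), 1002.0),
    (dt.datetime(2022, 7, 1, 15, 0), 1006.0),
]
p_ok = pressure_at(samples, dt.datetime(2022, 7, 1, 10, 30))
p_gap = pressure_at(samples, dt.datetime(2022, 7, 1, 13, 0))
```

A caller can then treat a returned value of 0 as the signal to skip the corresponding spectrum.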

mandatory_options = ['dataframe_parameters', 'filename_parameters', 'data_parameters', 'frequency']
parsed_dtcol = 'parsed_datetime'
prepare_pressure_df()

Read the pressure of a day from files with various frequencies.

The dataframe self.p_df is created as an attribute of the PressureHandler instance, containing the pressure and a datetime column.

The pressure column is multiplied by the pressure_factor given in the pressure input file.