Development

Online Source Code Documentation

TECA’s C++ sources are documented via Doxygen at the TECA Doxygen site.

Class Indices

Tip

The following tables contain a listing of some commonly used TECA classes. The TECA Doxygen site is a more complete reference.

Algorithms

TECA’s suite of algorithms that can be inserted in functional pipelines. (For more details, click on the class name)

Table 2 TECA Classes
Class	Description
teca_2d_component_area	An algorithm that computes the areas of labeled regions.
teca_apply_binary_mask	Applies a mask to a given list of variables.
teca_apply_tempest_remap	Moves data from one mesh to another using remapping weights generated by TempestRemap.
teca_bayesian_ar_detect	The TECA BARD atmospheric river detector.
teca_bayesian_ar_detect_parameters	An algorithm that constructs and serves up the parameter table needed to run the Bayesian AR detector.
teca_binary_segmentation	An algorithm that computes a binary segmentation.
teca_cartesian_mesh_coordinate_transform
teca_cartesian_mesh_regrid	Transfers data between spatially overlapping meshes of potentially different resolutions.
teca_cartesian_mesh_source	An algorithm that generates a teca_cartesian_mesh of the requested spatial and temporal dimensions with optional user defined fields.
teca_cartesian_mesh_subset	Applies a subset given in world coordinates to the upstream request.
teca_component_area_filter	An algorithm that applies a mask based on connected component area.
teca_component_statistics	Computes statistics about connected components.
teca_connected_components	An algorithm that computes connected component labeling.
teca_dataset_diff	Computes the element-wise difference between two datasets.
teca_derived_quantity	A programmable algorithm specialized for simple array based computations.
teca_descriptive_statistics	Computes descriptive statistics over a set of arrays.
teca_elevation_mask	Generates a mask indicating where mesh points with a vertical pressure coordinate lie above the surface of the Earth. The mask is set to 1 where data is above the Earth’s surface and 0 otherwise.
teca_evaluate_expression	An algorithm that evaluates an expression and stores the result in a new variable.
teca_face_to_cell_centering	An algorithm that transforms from face to cell centering.
teca_indexed_dataset_cache	Caches N datasets such that repeated requests for the same dataset are served from the cache.
teca_integrated_vapor_transport	An algorithm that computes integrated vapor transport (IVT)
teca_integrated_water_vapor	An algorithm that computes integrated water vapor (IWV)
teca_l2_norm	An algorithm that computes L2 norm.
teca_laplacian	An algorithm that computes the Laplacian from a vector field.
teca_latitude_damper	Inverted Gaussian damper for scalar fields.
teca_mask	An algorithm that masks a range of values.
teca_normalize_coordinates	An algorithm to ensure that Cartesian mesh coordinates follow conventions.
teca_python_algorithm
teca_pytorch_algorithm
teca_rename_variables	An algorithm that renames variables.
teca_simple_moving_average	An algorithm that averages data in time.
teca_table_calendar	An algorithm that transforms NetCDF CF-2 time variable into an absolute date.
teca_table_reduce	A reduction on tabular data over time steps.
teca_table_region_mask	An algorithm that identifies rows in the table that are inside the list of regions provided.
teca_table_remove_rows	An algorithm that removes rows from a table where a given expression evaluates to true.
teca_table_sort	An algorithm that sorts a table in ascending order.
teca_table_to_stream	An algorithm that serializes a table to a C++ stream object.
teca_tc_candidates	GFDL tropical storms detection algorithm.
teca_tc_classify	An algorithm that classifies storms using the Saffir-Simpson scale.
teca_tc_trajectory	GFDL tropical storms trajectory tracking algorithm.
teca_tc_wind_radii	Computes wind radius at the specified coordinates.
teca_threaded_python_algorithm
teca_unpack_data	An algorithm that unpacks NetCDF packed values.
teca_valid_value_mask	An algorithm that computes a mask identifying valid values.
teca_vertical_coordinate_transform	An algorithm that transforms the vertical coordinates of a mesh.
teca_vertical_reduction	The base class for vertical reductions.
teca_vorticity	An algorithm that computes vorticity from a vector field.

I/O

TECA’s I/O components to read datasets efficiently. (For more details, click on the class name)

Table 3 TECA Classes
Class	Description
teca_array_collection_reader	A reader for collections of arrays stored in NetCDF format.
teca_cartesian_mesh_reader	A reader for data stored in binary cartesian_mesh format.
teca_cartesian_mesh_writer	An algorithm that writes Cartesian meshes in VTK format.
teca_cf_block_time_step_mapper	Maps time steps to files in fixed sized blocks.
teca_cf_interval_time_step_mapper	NetCDF CF2 files time step mapper.
teca_cf_layout_manager	Puts data on disk using NetCDF CF2 conventions.
teca_cf_reader	A reader for Cartesian mesh based data stored in NetCDF CF format.
teca_cf_time_axis_data	A dataset used to read NetCDF CF2 time and metadata in parallel.
teca_cf_time_axis_data_reduce	Gathers the time axis and metadata from a parallel read of a set of NetCDF CF2 files.
teca_cf_time_axis_reader	An algorithm to read time axis and its attributes in parallel.
teca_cf_time_step_mapper	Defines the interface for mapping time steps to files.
teca_cf_writer	A writer for Cartesian meshes in NetCDF CF2 format.
teca_multi_cf_reader	A reader for data stored in NetCDF CF format in multiple files.
teca_shape_file_mask	Generates a valid value mask defined by regions in the given ESRI shape file.
teca_table_reader	A reader for data stored in binary table format.
teca_table_writer	An algorithm that writes tabular data in a binary or CSV (comma separated value) format that is easily ingested by most spreadsheet apps. Each page of a database is written to a file.
teca_wrf_reader	A reader for data stored in WRF ARW format.

Core

TECA’s core components. (For more details, click on the class name)

Table 4 TECA Classes
Class	Description
object
teca_algorithm	The interface to TECA pipeline architecture.
teca_algorithm_executive	Base class and default implementation for executives.
teca_bad_cast	An exception that may be thrown when a conversion between two data types fails.
teca_binary_stream	Serialize objects into a binary stream.
teca_dataset	Interface for TECA datasets.
teca_dataset_capture	An algorithm that takes a reference to dataset produced by the upstream algorithm it is connected to.
teca_dataset_source	An algorithm that serves up user provided data and metadata.
teca_index_executive	An executive that generates requests using an upstream or user defined index.
teca_index_reduce	Base class for MPI + threads map reduce reduction over an index.
teca_memory_profiler	MemoryProfiler - A sampling memory use profiler.
teca_metadata	A generic container for meta data in the form of name=value pairs.
teca_mpi_manager	A RAII class to ease MPI initialization and finalization.
teca_parallel_id	A helper class for debug and error messages.
teca_profiler	A class containing methods managing memory and time profiling.
teca_programmable_algorithm	An algorithm implemented with user provided callbacks.
teca_programmable_reduce	Callbacks implement a user defined reduction over time steps.
teca_thread_pool	A class to manage a fixed size pool of threads that dispatch I/O work.
teca_threaded_algorithm	This is the base class defining a threaded algorithm.
teca_threaded_programmable_algorithm	A threaded algorithm implemented with user provided callbacks.
teca_threadsafe_queue	A thread safe queue.
teca_time_event	A helper class that times its life.
teca_uuid	A universally unique identifier.
teca_variant_array	A type agnostic container for array based data.
teca_variant_array_impl	The concrete implementation of our type agnostic container for contiguous arrays.

Data

TECA’s data structures. (For more details, click on the class name)

Table 5 TECA Classes
Class	Description
teca_arakawa_c_grid	A representation of mesh based data on an Arakawa C Grid.
teca_array_collection	A collection of named arrays.
teca_cartesian_mesh	An object representing data on a stretched Cartesian mesh.
teca_curvilinear_mesh	Data on a physically uniform curvilinear mesh.
teca_database	A collection of named tables.
teca_mesh	A base class for geometric data.
teca_priority_queue	An indirect priority queue that supports random access modification of priority.
teca_table	A collection of columnar data with row based accessors and communication and I/O support.
teca_table_collection	A collection of named tables.
teca_uniform_cartesian_mesh	Data on a uniform Cartesian mesh.

Testing

TECA comes with an extensive regression test suite which can be used to validate your build. The tests can be executed from the build directory with the ctest command.

ctest --output-on-failure

Note that PYTHONPATH, LD_LIBRARY_PATH and DYLD_LIBRARY_PATH will need to be set to include the build’s lib directory and PATH will need to be set to include “.”.
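
The environment described in the note above could be set up along the following lines. This is a minimal sketch for a POSIX shell, not an official setup script. It assumes the commands are run from the top of the build directory, and the variable name build_dir is only illustrative. LD_LIBRARY_PATH is consulted on Linux and DYLD_LIBRARY_PATH on macOS, so setting both is harmless.

```shell
# Assumption: this is run from the top of the TECA build directory.
build_dir="$(pwd)"

# Prepend the build's lib directory, keeping any entries already present.
export PYTHONPATH="${build_dir}/lib${PYTHONPATH:+:${PYTHONPATH}}"
export LD_LIBRARY_PATH="${build_dir}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export DYLD_LIBRARY_PATH="${build_dir}/lib${DYLD_LIBRARY_PATH:+:${DYLD_LIBRARY_PATH}}"

# Let the tests find executables in the current directory.
export PATH=".:${PATH}"
```

With these variables exported, the ctest command can be run from the same shell.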

Timing and Profiling

TECA contains a built-in profiling mechanism that captures the run time of each stage of a pipeline’s execution, as well as a sampling memory profiler.

The profiler records the times of user defined events and samples memory use at a user specified interval. The resulting data is written in parallel to a CSV file in rank order. Times are stored in one file and memory use samples in another. Each memory use sample includes the time it was taken, so that memory use can be mapped back to the corresponding events.

Warning

In some cases TECA’s built in profiling can negatively impact run time performance as the number of threads is increased. For that reason one should not use it in performance studies. However, it is well suited to debugging and diagnosing scaling issues and understanding control flow.

Compilation

The profiler is not built by default and must be compiled in by adding -DTECA_ENABLE_PROFILER=ON to the CMake command line. Be sure to build in release mode with -DCMAKE_BUILD_TYPE=Release and also add -DNDEBUG to CMAKE_CXX_FLAGS_RELEASE. Once compiled, the built-in profiler may be enabled at run time via the environment variables described below or directly using its API.
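
A configure step with these options might look like the sketch below. The function name configure_teca and the teca_source_dir variable are hypothetical helpers introduced here for illustration, the default source location of .. is an assumption about where the build directory sits, and -O3 is an assumed optimization level (only -DNDEBUG is called for above).

```shell
# Hypothetical location of the TECA source tree; adjust for your checkout.
teca_source_dir="${TECA_SOURCE_DIR:-..}"

# Wrap the configure command so that extra CMake options can be appended.
configure_teca() {
    # -O3 is an assumed optimization level; -DNDEBUG is the required part.
    cmake \
        -DTECA_ENABLE_PROFILER=ON \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_CXX_FLAGS_RELEASE="-O3 -DNDEBUG" \
        "$@" \
        "${teca_source_dir}"
}
```

Calling configure_teca from an empty build directory would then configure a release build with the profiler compiled in. Any additional CMake options can be passed as arguments to the function.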

Runtime controls

The profiler is activated by the following environment variables, which are parsed in teca_profiler::initialize. Parsing should be automatic in most cases because teca_profiler::initialize is called from teca_mpi_manager, which is used by parallel TECA applications and tests.

Variable	Description
PROFILER_ENABLE	a binary mask that enables logging. 0x01 – event profiling enabled. 0x02 – memory profiling enabled.
PROFILER_LOG_FILE	path to write timer log to
MEMPROF_LOG_FILE	path to write memory profiler log to
MEMPROF_INTERVAL	float number of seconds between memory recordings
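
As an illustration, the variables in the table could be exported as follows before launching a TECA application. This is a sketch under stated assumptions: the log file names and the half second interval are arbitrary examples, not defaults. The table does not say whether PROFILER_ENABLE accepts hexadecimal notation, so the sketch uses the value 3, which reads the same in decimal and hexadecimal and sets both mask bits.

```shell
# Enable both event (0x01) and memory (0x02) profiling: 0x01 | 0x02 = 3.
export PROFILER_ENABLE=3

# Hypothetical output paths; any writable location should do.
export PROFILER_LOG_FILE="./teca_time_profile.csv"
export MEMPROF_LOG_FILE="./teca_mem_profile.csv"

# Sample memory use every half second (an arbitrary example value).
export MEMPROF_INTERVAL=0.5
```

Setting PROFILER_ENABLE to 1 or 2 instead would enable only event profiling or only memory profiling, respectively.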

Visualization

The command line application teca_profile_explorer can be used to analyze the log files. The application requires that a timer profile file and a list of MPI ranks to analyze be passed on the command line. Optionally, a memory profile file can be passed as well. For instance, the following command was used to generate Fig. 16.

./bin/teca_profile_explorer -e bin/test/test_bayesian_ar_detect \
   -m bin/test/test_bayesian_ar_detect_mem -r 0

When run, teca_profile_explorer creates an interactive window displaying a Gantt chart for each MPI rank. The chart is organized with a row for each thread. Threads with more events are displayed higher up. For each thread, a colored rectangle is rendered for every logged event. There can be tens to hundreds of unique events per thread, so it is impractical to display a legend. However, clicking on an event rectangle in the plot prints all the data associated with that event in the terminal. If a memory profile is passed on the command line, the memory profile is normalized to the height of the plot and shown on top of the event profile. The maximum memory use is added to the title of the plot. Example output is shown in Fig. 16.

Fig. 16 (_images/tpc_rank_profile_data_0.png): Visualization of TECA’s run time profiler for the test_bayesian_ar_detect regression test, run with 1 MPI rank and 10 threads.

Creating PyPI Packages

The typical sequence for pushing a package to PyPI and testing it is as follows. Be sure to add an rc number to the version in setup.py when testing, since version numbers are unique and cannot be reused.

python3 setup.py build_ext
python3 setup.py install
python3 setup.py sdist
python3 -m twine upload --repository-url https://test.pypi.org/legacy/ dist/*
pip3 install --index-url https://test.pypi.org/simple/ teca