Scientific Computing

Aspell don't backup

Aspell creates backup files with a .bak extension by default. To stop this, configure Aspell to not create them. Often there is no user configuration file “aspell.conf” present. Even if a config file is present, it can be overridden by the environment variable ASPELL_CONF:

export ASPELL_CONF="dont-backup;"

or do similarly through Control Panel in Windows.

Confirm the setting has taken effect by:

aspell dump config | more

and look for the line backup false.


Related: Aspell dictionary location

Clear temporary scratch files on HPC

Unix-like HPC systems often have shared temporary scratch directories mapped by environment variable $TMPDIR to a directory like “/scratch” or “/tmp”. $TMPDIR may be used for temporary files during build or computation. $TMPDIR is often shared among all users with no expectation of preservation or backup. If user files are left in $TMPDIR, the HPC system may email a periodic alert to the user.

If the user determines that $TMPDIR files aren’t needed after the HPC batch job completes, one can clear $TMPDIR files with a command near the end of the batch job script. Carefully consider whether this is appropriate for the specific use case, as the scratch files will be permanently deleted.

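rm -r -i "$TMPDIR"

Verify that this deletes only the user’s own files; each user should have write permission only for their own files. Once this is established, to use this command in batch scripts replace the “-i” with “-f” to make it non-interactive, and append 2>/dev/null to silence errors for files that can’t be removed. Don’t redirect stderr while using “-i”, because rm writes its confirmation prompts to stderr.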
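The ownership check can also be made explicit instead of relying on permissions. The following is a minimal Python sketch, assuming a Unix-like system; the function name clear_own_files and its dry_run parameter are hypothetical, not part of any HPC tooling. It lists, and optionally removes, only the entries under a directory that belong to the current user.

```python
import os
import tempfile
from pathlib import Path


def clear_own_files(root, dry_run=True):
    """Return entries under root owned by the current user; remove them unless dry_run."""
    uid = os.getuid()  # Unix-only
    targets = []
    # walk bottom-up so a directory is visited after its contents
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            p = Path(dirpath) / name
            if p.lstat().st_uid != uid:
                continue  # someone else's entry: leave it alone
            targets.append(p)
            if dry_run:
                continue
            if p.is_dir() and not p.is_symlink():
                try:
                    p.rmdir()
                except OSError:
                    pass  # directory still holds other users' files
            else:
                p.unlink()
    return targets


if __name__ == "__main__":
    # demonstrate on a throwaway directory rather than the real $TMPDIR
    with tempfile.TemporaryDirectory() as d:
        scratch = Path(d)
        (scratch / "job1").mkdir()
        (scratch / "job1" / "out.dat").write_text("x")
        print(clear_own_files(scratch, dry_run=True))
        clear_own_files(scratch, dry_run=False)
        print(list(scratch.iterdir()))
```

Running with dry_run=True first plays the same role as the “-i” check: it shows what would be deleted before anything is removed.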

CMake TARGET_RUNTIME_DLL_DIRS for CTest

Building Windows shared libraries generally creates DLLs whose directory must be on environment variable PATH when the executable target is run. Windows error code -1073741515, corresponding to hex error code 0xc0000135, is emitted when the necessary DLLs are not in the program’s working directory or on the Path environment variable. This will make CTest tests fail with error code 135.
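The two error numbers are the same 32-bit value read as signed or unsigned. A small Python sketch of the conversion follows; the helper names to_ntstatus and to_signed32 are made up for illustration.

```python
def to_ntstatus(exit_code: int) -> str:
    """Format a signed 32-bit exit code as unsigned hexadecimal."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


print(to_ntstatus(-1073741515))  # 0xc0000135
print(to_signed32(0xC0000135))   # -1073741515
```

This can help when a CI log shows the decimal form and a search engine only knows the hex form.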

The CMake generator expression TARGET_RUNTIME_DLL_DIRS along with test property ENVIRONMENT_MODIFICATION can be used to set the Path environment variable for the test, gathering all the directories of the DLLs CMake knows the target needs.

set_property(TEST adder PROPERTY ENVIRONMENT_MODIFICATION PATH=path_list_append:$<TARGET_RUNTIME_DLL_DIRS:main>)

A minimal example CMakeLists.txt uses the properties above to work correctly.

Limit code language standard

C++17 and C++20 standard code is used throughout projects of all sizes, perhaps with limited-feature fallback to older language standards. Some standards certifications require a specific language standard. High reliability and safety-critical projects may require specific language standards. Examples include FACE and MISRA C++.

To enforce an upper limit on the language standard, consider a check like the following in a header used throughout the project. This example limits the language standard to C++14 or earlier by halting the build if a higher standard is detected:

#if __cplusplus >= 201703L
#error "C++14 or earlier required"
#endif

For C code limited to, say, no higher than C99, consider the following in a header used throughout the project, which will halt the build if a higher standard is detected:

#if __STDC_VERSION__ >= 201112L
#error "C99 or earlier required"
#endif

Related: MSVC __cplusplus macro flag

CMake ignore Anaconda libraries and compilers

Activating an Anaconda Python environment with conda activate puts Conda directories first on environment variable PATH. This leads CMake to prefer finding Anaconda binaries (find_library, find_program, …) and Anaconda GCC compilers (if installed) over later directories on system PATH. Anaconda libraries and compilers are generally incompatible with the system or desired compiler. For certain libraries like HDF5, Anaconda is particularly prone to interfering with CMake.

Detect Anaconda environment by existence of environment variable CONDA_PREFIX.

Fix by putting code like the following in CMakeLists.txt.

NOTE: CMAKE_IGNORE_PREFIX_PATH does not take effect if set within Find*.cmake.

cmake_minimum_required(VERSION ...)

# ignore Anaconda compilers, which are typically not compatible with the system
if(DEFINED ENV{CONDA_PREFIX})
  set(CMAKE_IGNORE_PATH $ENV{CONDA_PREFIX}/bin)
endif()

project(example LANGUAGES C)

# Optional next two lines if needing Python in CMake project
unset(CMAKE_IGNORE_PATH)
find_package(Python ...)
# end optional lines

# exclude Anaconda directories from search
if(DEFINED ENV{CONDA_PREFIX})
  list(APPEND CMAKE_IGNORE_PREFIX_PATH $ENV{CONDA_PREFIX})
  list(APPEND CMAKE_IGNORE_PATH $ENV{CONDA_PREFIX}/bin)
  # need CMAKE_IGNORE_PATH to ensure system env var PATH
  # doesn't interfere despite CMAKE_IGNORE_PREFIX_PATH
endif()

To totally omit environment variable PATH from CMake find_* use CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH:

set(CMAKE_FIND_USE_SYSTEM_ENVIRONMENT_PATH false)

However, this can be too aggressive, i.e. CMake might then miss other programs on PATH that are actually wanted.
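Before changing CMakeLists.txt, it can help to see which PATH entries actually come from the active Conda environment. This is a diagnostic sketch in Python, not part of CMake; the function name conda_path_entries is hypothetical, and it assumes only that CONDA_PREFIX and PATH are set as described above.

```python
import os


def conda_path_entries(environ=None):
    """Return the PATH entries that lie at or under CONDA_PREFIX."""
    env = os.environ if environ is None else environ
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return []  # no active Conda environment detected
    prefix = os.path.normcase(os.path.normpath(prefix))
    hits = []
    for entry in env.get("PATH", "").split(os.pathsep):
        if not entry:
            continue
        norm = os.path.normcase(os.path.normpath(entry))
        # compare whole path components so "/opt/condax" doesn't match "/opt/conda"
        if norm == prefix or norm.startswith(prefix + os.sep):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    for entry in conda_path_entries():
        print(entry)
```

An empty result suggests that Conda directories on PATH are not the cause of the CMake find problem.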

Windows host cross-build for Linux target

Visual Studio supports cross-builds on Windows host for Linux targets. This requires either a remote Linux machine connection, or using WSL on the local computer.

A more robust solution without additional setup on developer computers is CI/CD such as GitHub Actions or many other online and offline choices such as Jenkins. When the developer Git pushes, the CD job provides binaries across operating systems.

An example of GitHub Actions CD is the Ninja project. They provide old (CentOS 7) Linux binaries, macOS and Windows. This could easily be extended to ARM etc.

Black format exclude multiple directories

To exclude multiple directories from the Black Python code formatter, use the following format in pyproject.toml. The multi-line regex format seems to be required; other formats we tried didn’t take effect.

Edit / add / remove as many directories as desired, using the following multi-line format (indentation is not important). Note the escaping needed for “.” since this is a regex.

This is particularly useful when using Black in a project with Git submodules to not disturb the Git submodule Python code with Black from the top-level project. Likewise for other tools such as flake8 and mypy set exclude in their settings for Git submodules.

[tool.black]
force-exclude = '''
/(
\.git
| \.mypy_cache
| \.venv
| _build
| build
| dist
)/
'''
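To check which paths such a pattern would catch before running Black, the regex can be tried directly in Python. This is a sketch under an assumption: that Black treats a multi-line pattern as a verbose regex (whitespace ignored) and searches it against the path with a leading “/”. The helper is_excluded is hypothetical, not a Black API.

```python
import re

# same pattern as force-exclude in pyproject.toml
FORCE_EXCLUDE = r"""
/(
\.git
| \.mypy_cache
| \.venv
| _build
| build
| dist
)/
"""

# re.VERBOSE ignores the whitespace and newlines inside the pattern
pattern = re.compile(FORCE_EXCLUDE, re.VERBOSE)


def is_excluded(posix_path: str) -> bool:
    """True if the project-relative file path falls under an excluded directory."""
    return pattern.search("/" + posix_path.lstrip("/")) is not None


for path in ("build/gen.py", "vendor/.venv/lib/site.py", "src/rebuild/tool.py"):
    print(path, is_excluded(path))
```

Because each alternative is wrapped in “/…/”, only whole directory names match, so a directory like “rebuild” is not excluded by “build”.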

CMake ExternalProject/FetchContent Git vs. URL archive

CMake ExternalProject and FetchContent can download from Git or URL archive. Archive download is usually much faster, especially for projects with a large number of Git commits. Checksum of the archive can optionally be verified with URL_HASH option.

Example

Git submodule

Since Git config can set fetchParallel, cloning Git submodules in parallel might at first glance be something the ExternalProject GIT_CONFIG option could do, but we have not tried this.

CMake changelog for older versions

Recent CMake changelog

CMake 3.26 adds CMake environment variable CTEST_NO_TESTS_ACTION, which is useful in CI to avoid silently passing when no tests are detected. Helpful for debugging is the elimination of CMakeOutput.log and CMakeError.log, replaced by CMakeConfigureLog.yaml. message(CONFIGURE_LOG) allows easy logging to CMakeConfigureLog.yaml.

CMake 3.25 adds workflow presets, making configure-build-test a single command.

CMake 3.24 adds cmake -Bbuild --fresh option to clean the build directory first. The LINK_GROUP generator expression is excellent for resolving complex link dependencies. The CMAKE_COMPILE_WARNING_AS_ERROR boolean option sets most compilers to error if a compile warning occurs, which is generally a good setting for CI systems. CMAKE_COLOR_DIAGNOSTICS environment variable is useful to colorize build system and compiler output.

CMake 3.23 further enhances --debug-find to allow debugging all find_* commands, or specific find_package or find_* variables. Header files can be more cleanly installed/affiliated with targets using target_sources file sets, which is particularly relevant for packaging and installing.

CMake 3.22 adds several CMake Environment Variables that are generally useful. CMAKE_BUILD_TYPE sets the default build type for single-configuration build systems. CMAKE_CONFIGURATION_TYPES sets the default available configurations for multi-config build systems like Ninja Multi-Config. CMAKE_INSTALL_MODE makes symlinks with copy fallback a good choice for installing programs from CMake. For CTest, the new ENVIRONMENT_MODIFICATION test property makes modifying environment variables for test(s) much easier.

CMake 3.21 adds more preset features, including making “generator” optional; the default CMake behavior will be used to determine the generator. cmake --install-prefix can be used instead of cmake -DCMAKE_INSTALL_PREFIX=. PROJECT_IS_TOP_LEVEL and <PROJECT-NAME>_IS_TOP_LEVEL identify if a project is at the top of the project hierarchy. ctest --output-junit gives test output in standard tooling format.

CMake 3.20 adds support for Intel LLVM compiler and NVIDIA HPC compiler. ExternalProject_Add() learned CONFIGURE_HANDLED_BY_BUILD, which avoids CMake commanding a reconfigure on each build. try_run(... WORKING_DIRECTORY) parameter was added. CMake presets in CMakePresets.json now cover configure, build and test, allowing many parameters to be declared with inheritance in JSON. CMake presets are a key feature for CI, as well as user configurations. ctest --test-dir build option avoids the need to manually cd build. cmake_path allows path manipulation and introspection without actually touching the filesystem.

CMake 3.19 added support for ISPC language. string(JSON GET|SET) parsing is useful to avoid hard-coding parameters. find_package() now accepts version ranges. Emits warning for cmake_minimum_required(VERSION) < 2.8.12. CMakePresets.json enables configure parameter declarations in JSON.

CMake 3.18 adds the CMake profiler. Adds REQUIRED parameter to find_*(). Adds file(ARCHIVE_CREATE) and file(ARCHIVE_EXTRACT), which are much more convenient than the execute_process(COMMAND ${CMAKE_COMMAND} -E tar ${archive} WORKING_DIRECTORY ${out_dir}) syntax.

CMake 3.17 adds Ninja Multi-Config generator. cmake --debug-find shows what find_*() is doing. Eliminates Windows “sh.exe is on PATH” error. Recognizes that Ninja works with Fortran.

CMake 3.16 adds precompiled headers, unity builds, many advanced project features.

CMake 3.15 adds CMAKE_GENERATOR environment variable that works like global -G option. Enhances Python interpreter finding. Adds cmake --install command instead of “cmake --build build --target install”. Added Zstd compression.

CMake 3.14 adds check_fortran_source_runs(). FetchContent was enhanced with simpler syntax. The transitive link resolution was considerably enhanced in CMake 3.14. Projects just work in CMake ≥ 3.14 that fail at link-time with CMake < 3.14.


We don’t recommend use of the older CMake versions below as they take significantly more effort to support.

CMake 3.13 adds ctest --progress and better Matlab compiler support. Many new linking options are added, along with fixes to Fortran submodule bugs. The very convenient cmake -B build incantation and target_sources() with absolute path are also added. It’s significantly more difficult to use CMake older than 3.13 with medium to large projects.

CMake 3.12 adds transitive library specification (out of same directory) and full Fortran Submodule support. get_property(_test_names DIRECTORY . TESTS) retrieves test names in current directory.

CMake 3.11 allows specifying targets initially without sources. FetchContent is added, allowing fast hierarchies of CMake and non-CMake projects.

Matlab / GNU Octave file checksum

Computing a file hash checksum allows some verification of file integrity, assuming the file is not maliciously altered to have the same checksum. MD5 and SHA-256 are among the methods available in the Matlab-stdlib (works on Matlab or GNU Octave) function file_checksum, which computes the checksum of a file. The file is read in chunks to avoid overflowing the Java VM memory.