Scientific Computing

pre-commit exclude files

When using Git pre-commit hooks with the wide range of file types typical of scientific computing, hooks may modify files that should be left alone, such as FITS files, which have a text header followed by binary data. Exclude such files with a regex pattern in the top-level exclude key of the .pre-commit-config.yaml file. For example, to case-insensitively exclude files with the .fit or .fits extension, use:

exclude: (?i)\.fits?$

Because a global pre-commit config is considered an anti-pattern, such exclusions and all other pre-commit configuration are set on a per-repo basis.
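pre-commit matches file patterns as Python regular expressions searched against the repository-relative path, so the pattern above can be tried out in Python before committing it to the config. The following is a minimal sketch; the file names are hypothetical and serve only as illustration.

```python
import re

# Same pattern as the "exclude" key above; (?i) makes the match case-insensitive.
EXCLUDE = re.compile(r"(?i)\.fits?$")


def is_excluded(path: str) -> bool:
    """Return True if the path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


# Hypothetical file names, for illustration only
for name in ("data/image.fits", "data/IMAGE.FIT", "notes/fits.txt", "src/fit.py"):
    print(name, is_excluded(name))
```

Only the first two names match, because the pattern requires a literal dot before "fit" and anchors the extension at the end of the path.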

Fortran compiler flags for legacy code

To compile legacy Fortran code, certain compiler flags can be used to enable non-standard Fortran syntax that was common before the Fortran 95 standard became widely adopted. It’s hard to pin an exact year for when developers transitioned to more standard Fortran code, but the mid-2000s is a reasonable estimate for when Fortran codebases started to modernize in significant numbers. Gfortran took over from g77 as the default Fortran compiler circa 2005 and was the first widely used free Fortran compiler capable of the Fortran 95 standard. Here are legacy-enabling flags for currently maintained Fortran compilers.

In addition to the flags below, it may be necessary to provide default real and / or integer precision flags to compile old code that relies on the default precision being different than the modern default of 4 bytes for real and 4 bytes for integer.

GNU GFortran

  • -std=legacy enables pre-Fortran 77 arbitrary length arrays, where A(1) was declared instead of A(*) in Fortran 77 or A(:) in modern Fortran.
  • -fdec- options enable various non-standard extensions in the DEC Fortran style.
  • see also Gfortran runtime options for issues reading binary files.

LLVM Flang

Flang has a “-std” option. Check whether the project sets it too restrictively for the legacy code.

Intel oneAPI

  • -nostand does not change compiler behavior–it disables standard-based warnings.
  • -f66 applies Fortran 66 semantic rules, including DO loops that always run at least once.

IBM OpenXL

Cray Fortran

The ftn compiler can disable some checks with flags like -d C

Git config files

Git uses per-user ~/.gitconfig and per-repository .git/config configuration files, along with environment variables, to customize Git behavior. The system gitconfig file is not covered here. A common pattern is for a developer to make most settings in the per-user Git config file, occasionally overriding them in the per-repository config file.

We have several articles on Git configuration, leading to a per-user Git config file with sections like:

pull HTTPS for speed, push SSH for security

[url "ssh://github.com/"]
	pushInsteadOf = https://github.com/
[url "ssh://gitlab.com/"]
	pushInsteadOf = https://gitlab.com/
[url "ssh://gist.github.com/"]
	pushInsteadOf = https://gist.github.com/scivision/
[url "ssh://gitlab.kitware.com/"]
    pushInsteadOf = https://gitlab.kitware.com/
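To make the effect of pushInsteadOf concrete, here is a rough Python sketch of the prefix rewrite that Git documents for this option: a push URL that starts with the pushInsteadOf value is rewritten to start with the section's base URL, and the longest matching prefix wins. The function name push_url and the example repository URL are hypothetical; this is an illustration of the rule, not Git's implementation.

```python
# Mapping of <base> -> pushInsteadOf prefix, mirroring part of the config above
PUSH_INSTEAD_OF = {
    "ssh://github.com/": "https://github.com/",
    "ssh://gitlab.com/": "https://gitlab.com/",
}


def push_url(url: str, rules: dict = PUSH_INSTEAD_OF) -> str:
    """Rewrite a remote URL for pushing; the longest matching prefix wins."""
    best_base, best_prefix = "", ""
    for base, prefix in rules.items():
        if url.startswith(prefix) and len(prefix) > len(best_prefix):
            best_base, best_prefix = base, prefix
    if not best_prefix:
        return url  # no rule matches: the push URL is unchanged
    return best_base + url[len(best_prefix):]


# Hypothetical repository URL: fetches keep using HTTPS, pushes go over SSH
print(push_url("https://github.com/user/repo.git"))
```

The fetch URL is never touched by these rules, which is what allows pulling over HTTPS while pushing over SSH.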

Use Meld for diff and merge

[diff]
	tool = meld
[merge]
	tool = meld

Maintain a linear Git history by allowing only fast-forward pulls by default, using pull.ff only

[pull]
    ff = only

Upon “git push”, check that all Git submodules have been pushed before the consuming repository is pushed, to avoid broken builds for others, using push.recurseSubmodules check

[push]
	recurseSubmodules = check

For any Git operation (except “git clone”) that changes the referenced Git submodule commit hashes, automatically update the Git submodules to match the new commit hashes with option submodule.recurse true

[submodule]
	recurse = true

Quote CMake JSON arguments

The CMake string(JSON) subcommands should have the “json-string” input variable quoted to avoid CMake interpreting any semicolon in the JSON string as a list separator. This avoids CMake string(JSON ...) failures when the JSON string contains semicolons.

string sub-command JSON failed parsing json string:

  {

    "key1" : 42,
    "key2" : "I like to write

  * Line 3, Column 12

    Syntax error: value, object or array expected.

  .

Example

Suppose a JSON string contains one or more values with semicolons in any value. In that case, the JSON string should be quoted to avoid the CMake string(JSON ...) failure.

cmake_minimum_required(VERSION 3.19)

string(JSON jstr SET "{}" "key1" 42)

if(CMAKE_VERSION VERSION_GREATER_EQUAL 4.3)
  string(JSON val STRING_ENCODE "I like to write; my blog is about tech.")
else()
  set(val [=["I like to write; my blog is about tech."]=])
endif()
string(JSON jstr SET "${jstr}" "key2" "${val}")

# Fails with a syntax error
# string(JSON a GET ${jstr} "key1")

# works as expected
string(JSON v1 GET "${jstr}" "key1")
string(JSON v2 GET "${jstr}" "key2")

message(STATUS "key1: ${v1}. key2: ${v2}")
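As a rough analogy in Python (not CMake's actual implementation), the failure can be reproduced by splitting the JSON text at the semicolon, which is what happens to an unquoted CMake variable reference, and handing only the first fragment to a JSON parser. The variable names here are illustrative.

```python
import json

jstr = '{"key1": 42, "key2": "I like to write; my blog is about tech."}'

# Quoted "${jstr}": the whole string arrives as one argument and parses.
print(json.loads(jstr)["key1"])

# Unquoted ${jstr}: the value is split at ";" into separate arguments,
# so the JSON parser only sees the first, truncated fragment.
first_fragment = jstr.split(";")[0]
try:
    json.loads(first_fragment)
    parsed = True
except json.JSONDecodeError as err:
    parsed = False
    print("parse failed:", err.msg)
```

The truncated fragment ends in the middle of the "key2" string, matching where the CMake error message above cuts off.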

Related: CMake JSON array iteration

CMake version recommendations and install

Downloading the latest release of CMake is usually easy. Admin / sudo is not required.

  • Linux: snap install cmake
  • macOS: brew install cmake
  • Windows: winget install Kitware.CMake
  • PyPI CMake package: python -m pip install cmake

For platforms where CMake binaries aren’t easily available, build CMake using scripts/build_cmake.cmake.

To see the merge requests for a certain release, use a URL like: https://gitlab.kitware.com/cmake/cmake/-/merge_requests?milestone_title=4.3.0&scope=all&state=all

Another CMake changelog summary we like.

CMake 4.3 additions include:

  • Common Package Specification (CPS) officially supported. This provides a solution for the long-standing problem of build system-agnostic JSON package configuration files, which can be generated by CMake’s install(EXPORT) and consumed by other build systems.
  • cmake --version=json-v1 gives details of the libraries vendored into CMake, which is useful for debugging and reporting issues with CMake’s bundled dependencies. This is complementary to cmake -E capabilities, which reports specific capabilities of the CMake executable.
  • CMake command line and file(ARCHIVE_CREATE) commands can specify compression level and algorithm in more detail
  • numerous generator string expressions were added

CMake 4.2 additions include:

CMake 4.1 additions include:

  • project() added COMPAT_VERSION that propagates to subdirectories and can be queried for the top-level COMPAT_VERSION.

CMake 4.0 additions include:

CMake 3.31 additions include:

  • CMake warns if cmake_minimum_required() is < 3.10.
  • TLS ≥ 1.2 is required by default for internet operations e.g. file(DOWNLOAD), ExternalProject, FetchContent, and similar.
  • file(ARCHIVE_CREATE) gained a long-needed WORKING_DIRECTORY parameter that is essentially necessary to avoid machine-specific paths being embedded in the archive.
  • CMAKE_LINK_LIBRARIES_STRATEGY allows specifying a strategy for ordering target direct link dependencies.

CMake 3.30 additions include:

  • C++26 support.
  • CMAKE_TLS_VERIFY environment variable was added to set TLS verification (true, false).
  • defaults CMAKE_TLS_VERIFY to on, where previously it was off.
  • CTest can use the --test-dir argument with --preset, which avoids needing to be locked into a specific build directory.

CMake 3.29 additions include:

  • cmake_language(EXIT code) to exit CMake script mode with a specific return code. This is useful when using CMake as a platform-agnostic scripting language instead of shell script.
  • Environment variable CMAKE_INSTALL_PREFIX is used to set the default install prefix across projects–it can be overridden as typical by cmake -DCMAKE_INSTALL_PREFIX= option.
  • Target property TEST_LAUNCHER allows specifying a test launcher. For MPI programs, this allows deduplicating test runner scripts or making them more programmatic.
  • Linker information variables including CMAKE_<LANG>_COMPILER_LINKER_ID have been added to allow programmatic logic like setting target_link_options() based on the particular linker.
  • ctest --parallel without a number or 0 will use unbounded test run parallelism.

CMake 3.28 additions include:

  • changes PATH behavior for Windows find_{library,path,file}() to no longer search PATH. This may break some projects that rely on PATH for finding libraries. MSYS2-distributed CMake is patched to include PATH like earlier CMake, which can be confusing for CI etc. not using MSYS CMake with that patch. Windows CI/user may need to specify CMAKE_PREFIX_PATH like

    cmake -DCMAKE_PREFIX_PATH=$Env:SYSTEMDRIVE/msys64/ucrt64/lib -B build

  • Support for C++20 modules is considerably improved and most users will want at least CMake 3.28 to make C++ modules usable.

  • Generator expressions $<IF> $<AND> $<OR> now short circuit.

  • Test properties now have a DIRECTORY parameter, useful for setting test parameters from the project’s top level CMakeLists.txt.

  • CMake 3.28.4 fixed a long-standing bug in Ninja Fortran targets that use include statements.

CMake 3.27 additions include:

ℹ️ Note

Fortran + Ninja was broken for OBJECT libraries in CMake 3.27.0..3.27.8 and fixed in 3.27.9.


Older CMake changelog

Get Matlab version from Python reading VersionInfo.xml

The Matlab version can be obtained “brute force” by running Matlab, but this can take tens of seconds on a network drive system such as HPC.

matlab -batch "disp(version)"

A much faster way to get the Matlab version is to read the VersionInfo.xml file that is included in the Matlab installation directory. This file contains the version information in XML format and is available at least back to Matlab R2016a.

This Python script quickly parses the VersionInfo.xml file to extract the Matlab version information without needing to run Matlab itself.
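A minimal sketch of the idea follows. It assumes VersionInfo.xml has a single root element with child elements named "release" and "version"; those element names, the function names, and the sample values are assumptions for illustration and should be checked against the file in the actual Matlab installation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_version_info(xml_text: str) -> dict:
    """Extract release and version strings from VersionInfo.xml content."""
    root = ET.fromstring(xml_text)
    # Element names "release" and "version" are assumed, not guaranteed.
    return {
        "release": (root.findtext("release") or "").strip(),
        "version": (root.findtext("version") or "").strip(),
    }


def matlab_version(matlab_root: Path) -> dict:
    """Read VersionInfo.xml from the top of a Matlab installation directory."""
    text = (Path(matlab_root) / "VersionInfo.xml").read_text(encoding="utf-8")
    return parse_version_info(text)


# Hypothetical file content, for illustration only
sample = (
    "<MathWorks_version_info>"
    "<version>9.14.0.0</version><release>R2023a</release>"
    "</MathWorks_version_info>"
)
print(parse_version_info(sample))
```

Calling matlab_version() with the Matlab installation directory reads the file directly, so Matlab itself never needs to start.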

macOS DYLD_LIBRARY_PATH security blocking

Environment variables that start with DYLD_ are restricted by macOS, and are not passed to child processes by default for security reasons. A workaround for this is to use a dummy environment variable that does not start with DYLD_, and then use a wrapper script to read that variable and set the DYLD_LIBRARY_PATH environment variable accordingly. This is necessary because DYLD_LIBRARY_PATH is read at program startup, and setting it within a program will not have any effect on the dynamic linker used by that program. The wrapper script can be used to set the DYLD_LIBRARY_PATH environment variable before executing the desired program, allowing it to find the necessary libraries without being blocked by macOS security restrictions.
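The wrapper approach described above might be sketched in Python as follows. The dummy variable name MYAPP_LIBRARY_PATH and the function names are hypothetical; any variable name that does not start with DYLD_ would serve. Whether the launched program honors DYLD_LIBRARY_PATH still depends on the macOS restrictions that apply to that particular binary.

```python
import os

# Hypothetical variable name; any name not starting with DYLD_ would do.
DUMMY_VAR = "MYAPP_LIBRARY_PATH"


def child_env(environ: dict, dummy_var: str = DUMMY_VAR) -> dict:
    """Copy the environment, mapping the dummy variable onto DYLD_LIBRARY_PATH."""
    env = dict(environ)
    if dummy_var in env:
        env["DYLD_LIBRARY_PATH"] = env[dummy_var]
    return env


def main(argv: list) -> None:
    """Replace this process with the program argv[0], using the adjusted environment."""
    # The variable is set before the target program starts, so its dynamic
    # linker sees the value at startup.
    os.execvpe(argv[0], argv, child_env(dict(os.environ)))


# The mapping can be inspected without launching anything:
print(child_env({"MYAPP_LIBRARY_PATH": "/opt/mylib/lib"})["DYLD_LIBRARY_PATH"])
```

A wrapper script would call main(sys.argv[1:]) so that the remaining command-line arguments name the program to run and its options.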

Matlab exit return code for CI

Continuous integration (CI) systems generally rely on an integer return code to detect success (== 0) or failure (!= 0). The error() and assert() functions of Matlab / GNU Octave return non-zero status that works well with CI systems. The Matlab unittest framework is the primary method for testing Matlab code.

The Matlab -batch option for running from the command line completely replaces -r and is more robust.

matlab -batch "assertSuccess(runtests)"

For pre-buildtool releases of Matlab (< R2022b) we use a helper script that invokes runtests with appropriate parameters for the Matlab release used. To test across Matlab releases without a specific CI Matlab version fanout, we can use Python Pytest as in this wrapper script that tests Matlab from R2017a to R2025b and beyond.
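The return-code convention can be checked from Python with a small sketch like the one below. It is not the helper or wrapper script mentioned above; the function names are hypothetical, and it simply assumes that an executable named "matlab" is on PATH.

```python
import shutil
import subprocess
from typing import Optional


def matlab_batch_cmd(code: str, exe: str = "matlab") -> list:
    """Build the command line for running Matlab code with -batch."""
    return [exe, "-batch", code]


def run_matlab(code: str) -> Optional[int]:
    """Run Matlab code and return its exit code, or None if Matlab is not found."""
    exe = shutil.which("matlab")
    if exe is None:
        return None
    return subprocess.run(matlab_batch_cmd(code, exe)).returncode


print(matlab_batch_cmd("assertSuccess(runtests)"))
```

In a Pytest test function, one could skip the test when run_matlab() returns None and otherwise assert that the return code equals 0.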

Reboot computer from terminal command

There are several ways to reboot a computer from the terminal. For Linux or macOS the “reboot” command (or shutdown -r) is commonly used. For Windows, the PowerShell command Restart-Computer is a standard way to reboot the system from Terminal.

Within Windows Subsystem for Linux (WSL), the “reboot” command applies only to the particular WSL instance and actually results in a shutdown of that WSL instance, not the entire Windows system. To verify this, run the following in Windows Terminal before and after the standard 8 second WSL shutdown time:

wsl.exe --list --running