Scientific Computing

CMake Zstd compression

Zstd is an open file compression standard. Zstd has become widely used and is incorporated in the Linux kernel and GCC. We use Zstd for data archiving, particularly for large files where size and speed are a concern. CMake supports Zstd compression throughout, including file(ARCHIVE_CREATE) and file(ARCHIVE_EXTRACT). Zstd is vendored into CMake, so there is no need to worry about system shared libraries for Zstd.

The WORKING_DIRECTORY option of file(ARCHIVE_CREATE ...) is necessary to avoid system-specific relative path issues.

set(archive "my.zst")
set(in_dir "data/")

file(ARCHIVE_CREATE
  OUTPUT ${archive}
  PATHS ${in_dir}
  COMPRESSION Zstd
  COMPRESSION_LEVEL 3
  WORKING_DIRECTORY ${in_dir}
  )

COMPRESSION_LEVEL: arbitrary; a bigger value gives more compression.
FORMAT: not used for Zstd.

CMake version recommendations and install

CMake ≥ 3.24 is recommended for general users for more robust and easier syntax. For project developers, we recommend CMake ≥ 3.29 for C++20 modules and IDE integration.

Downloading the latest release of CMake is usually easy. Admin / sudo is not required.

  • Linux: snap install cmake
  • macOS: brew install cmake
  • Windows: winget install Kitware.CMake
  • PyPI CMake package: python -m pip install cmake

For platforms where CMake binaries aren’t easily available, build CMake using scripts/build_cmake.cmake.

To see the merge requests for a certain release, use a URL like: https://gitlab.kitware.com/cmake/cmake/-/merge_requests?milestone_title=3.30.0&scope=all&state=all

CMake 3.31 warns if cmake_minimum_required() is < 3.10. TLS ≥ 1.2 is now required by default for internet operations such as file(DOWNLOAD), ExternalProject, and FetchContent.


CMake 3.30 adds C++26 support.


CMake 3.29 adds cmake_language(EXIT code) to exit CMake script mode with a specific return code. This is useful when using CMake as a platform-agnostic scripting language instead of shell script.

Environment variable CMAKE_INSTALL_PREFIX is used to set the default install prefix across projects. It can be overridden as usual by the cmake -DCMAKE_INSTALL_PREFIX= option.

Target property TEST_LAUNCHER allows specifying a test launcher. For MPI programs, this allows deduplicating test runner scripts or making them more programmatic.

Linker information variables including CMAKE_<LANG>_COMPILER_LINKER_ID have been added to allow programmatic logic like setting target_link_options() based on the particular linker.

ctest --parallel with the number omitted limits parallelism based on the number of processors. ctest --parallel 0 uses unbounded test run parallelism.


CMake 3.28 changes PATH behavior for Windows find_{library,path,file}() to no longer search PATH. This may break some projects that rely on PATH for finding libraries. The MSYS2-distributed CMake is patched to include PATH like earlier CMake, which can cause confusing differences on CI and other systems that use CMake without that patch. Windows CI or users may need to specify CMAKE_PREFIX_PATH like:

cmake -DCMAKE_PREFIX_PATH=$Env:SYSTEMDRIVE/msys64/ucrt64/lib -B build

Support for C++20 modules is considerably improved and most users will want at least CMake 3.28 to make C++ modules usable. Generator expressions $<IF> $<AND> $<OR> now short circuit. Test properties now have a DIRECTORY parameter, useful for setting test parameters from the project’s top level CMakeLists.txt. CMake 3.28.4 fixed a long-standing bug in Ninja Fortran targets that use include statements.


CMake 3.27 emits a warning for cmake_minimum_required(VERSION) < 3.5. CTest test properties TIMEOUT_SIGNAL_NAME and TIMEOUT_SIGNAL_GRACE_PERIOD specify a POSIX signal to send to a timed-out test process. The interactive CMake debugger, enabled by cmake --debugger, is used with an IDE such as Visual Studio. CMake script command cmake_file_api() allows querying the CMake File API from within CMake. NOTE: Fortran + Ninja was broken for OBJECT libraries in CMake 3.27.0..3.27.8 and fixed in 3.27.9.


Older CMake changelog

CMAKE_TLS_VERIFY global

TLS verification default is ON since CMake 3.31. Users can override this default for all projects with environment variable CMAKE_TLS_VERIFY, or per-project with CMake variable CMAKE_TLS_VERIFY. The default TLS version may be set by CMAKE_TLS_VERSION. If the system TLS certificate location needs to be specified, this can be done by CMAKE_TLS_CAINFO.

Meson build system uses TLS verification by default, warning if verification fails. TLS verification is part of CMake’s internal nightly testing.

The example uses badssl.com, which purposefully hosts URLs with a variety of certificate problems.


Reference: Issues that would have been caught with this default

CMake dependency graph

The cmake --graphviz option and the graphviz configure preset can generate GraphViz dependency graphs for CMake-supported project code languages including C, C++, and Fortran. Fortran executables and modules are shown in the directed dependency graph. Fortran submodules are not shown in the graph.

The GraphViz “dot” program converts the .dot files to PNG, SVG, etc. The dot program is available by installing the “graphviz” package via the system package manager.

Generating the dependency graph requires CMake configure and generate. Thus, the compiler and generator needed by the CMake project must be working. The project does not need to be compiled before generating the dependency graph. However, the user should select the same CMake configure options as they would for compiling the project.

Example: h5fortran HDF5 object-oriented Fortran dependency graph is below. SVG vector graphics can be zoomed arbitrarily large in a web browser. The “gfx/” directory is to avoid making files in the source directory.

cmake -B build --graphviz=gfx/block.dot

cd gfx

dot -Tpng -o block.png block.dot

dot -Tsvg -o block.svg block.dot

h5fortran dependency graph

The “dependers” files show only the nodes depending on a node.

Scripts and Viewing output

CMakeUtils graph.py converts a directory of CMake dot diagrams to SVG or PNG and collects them in an HTML document for easy viewing:

python cmakeutils/graph.py ~/myprog/gfx
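For readers without CMakeUtils, the conversion step can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the actual graph.py: the function names dot_command and convert_dir are hypothetical, the output naming (appending the format to the input file name) is our own choice, and the "*.dot*" glob assumes the per-target files share the base name given to --graphviz.

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(dot_file, fmt="svg"):
    """Build the GraphViz command line that converts one dot file."""
    dot_file = Path(dot_file)
    # assumption: output is named by appending the format, e.g. block.dot.svg
    out = dot_file.with_name(f"{dot_file.name}.{fmt}")
    return ["dot", f"-T{fmt}", "-o", str(out), str(dot_file)]


def convert_dir(directory, fmt="svg"):
    """Convert every dot file in a directory; return the output paths."""
    if shutil.which("dot") is None:
        raise FileNotFoundError("GraphViz 'dot' program not found on PATH")
    outputs = []
    for f in sorted(Path(directory).glob("*.dot*")):
        if f.name.endswith(f".{fmt}"):
            continue  # skip outputs left over from a previous run
        cmd = dot_command(f, fmt)
        subprocess.run(cmd, check=True)
        outputs.append(Path(cmd[3]))
    return outputs


# only builds the command; nothing is executed here
cmd = dot_command("gfx/block.dot")
```

Calling convert_dir("gfx") would then run dot once per file, raising an error if GraphViz is not installed.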

To open a web browser from the Terminal:

python -m webbrowser -t file:///$HOME/myprog/gfx/index.html

Note that “file:///” has three slashes and the file path must be absolute.
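The absolute-path requirement can be handled from Python rather than the shell. The sketch below uses pathlib to build the "file:///" URL; the helper names file_url and open_in_browser are hypothetical, and the relative path is only an illustration.

```python
import webbrowser
from pathlib import Path


def file_url(path):
    """Return a file:/// URL for path, making the path absolute first."""
    return Path(path).expanduser().resolve().as_uri()


def open_in_browser(path):
    """Open a local file in a new browser tab."""
    return webbrowser.open_new_tab(file_url(path))


# a relative path is resolved against the current working directory
url = file_url("gfx/index.html")
```

Calling open_in_browser("~/myprog/gfx/index.html") would open the page without typing the URL by hand.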


Related: Dependency graphs are also easily created in Python and Matlab.

Find executable path in Python

The full paths to executables on the system Path (and cwd on Windows) are discovered by Python shutil.which. On Windows, environment variable PATHEXT is used to search filename suffixes if not specified at the input to shutil.which().

Shell aliases are not found by shutil.which() since the shell is not invoked. Instead, append the directory of the desired executable to environment variable PATH, or pass that directory as shutil.which(..., path="/path/to/dir").

import shutil

# None if executable not found
exe = shutil.which('ls')

Since shutil.which() returns None when the executable is not found, it is convenient for pytest.mark.skipif.
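The same skip-if-missing pattern can be shown with only the standard library. The sketch below swaps in unittest.skipIf so that it runs without pytest installed; with pytest, the decorator would instead be pytest.mark.skipif(shutil.which(TOOL) is None, reason="..."). The executable name is a hypothetical placeholder chosen so that the test is skipped.

```python
import shutil
import unittest

# hypothetical executable name, assumed NOT to be installed
TOOL = "hypothetical-tool-not-installed"


class TestTool(unittest.TestCase):
    # the condition is evaluated once, when the class is defined
    @unittest.skipIf(shutil.which(TOOL) is None, f"{TOOL} not found on PATH")
    def test_tool_available(self):
        self.assertIsNotNone(shutil.which(TOOL))


# run the test case programmatically and collect the result
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTool)
result = unittest.TestResult()
suite.run(result)
```

A skipped test is reported separately from failures, so a missing optional program does not fail the test suite.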

For programs not on PATH where the executable path is known:

# path is the directory to search (a PATH-style string), not the executable file itself
shutil.which('myexe', path="/path/to/dir")
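A small self-checking sketch of the path= parameter: it creates a throwaway executable in a temporary directory and then searches only that directory. The helper name find_in_dir and the file name "myexe" are hypothetical. On Windows the file is given a .bat suffix, assuming the default PATHEXT includes it.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def find_in_dir(name, directory):
    """Search only the given directory, ignoring environment variable PATH."""
    return shutil.which(name, path=str(directory))


with tempfile.TemporaryDirectory() as d:
    # Windows needs a suffix listed in PATHEXT; elsewhere the executable bit suffices
    exe = Path(d) / ("myexe.bat" if os.name == "nt" else "myexe")
    exe.write_text("echo hello\n")
    exe.chmod(exe.stat().st_mode | stat.S_IXUSR)

    found = find_in_dir("myexe", d)
    # a directory that does not exist yields None rather than an error
    missing = find_in_dir("myexe", Path(d) / "nowhere")
```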

Install Gfortran or Flang compiler on macOS Homebrew

Homebrew can install Fortran compilers including GCC and LLVM Flang.

Gfortran

Gfortran comes with GCC Homebrew package:

brew install gcc

As a complete C / C++ / Fortran compiler package, Gfortran doesn’t require additional flags or environment variables.

To use GCC compilers, source a script like:

p=$(brew --prefix gcc)/bin
v=14

export CC=$p/gcc-$v CXX=$p/g++-$v FC=$p/gfortran-$v

# to avoid GCC include errors -- MacOSX15.sdk incompatible at the moment with Homebrew GCC
export SDKROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/

where v=14 is the major version number of the GCC compiler installed. The SDKROOT line may be necessary when the Homebrew GCC package hasn’t yet enabled the latest SDK; adjust to suit the system.

LLVM Flang

LLVM Flang is a separate package from the LLVM C/C++ compilers:

brew install flang

To use the Flang compiler (works with Clang, AppleClang, and GCC C/C++ compilers), source a script like:

export FC=$(brew --prefix flang)/bin/flang-new

To use LLVM Clang with Flang, source a script like:

p=$(brew --prefix llvm)/bin
export CC=$p/clang CXX=$p/clang++

export FC=$(brew --prefix flang)/bin/flang-new

Troubleshooting

When a new compiler version or macOS version or Xcode SDK is released, it may be necessary to adjust the environment variables or flags temporarily until Homebrew updates the package.

Some examples to try if needed:

  • path to libSystem.tbd and libc++.tbd

    export LIBRARY_PATH=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib
  • path to the C++ standard library

    export LDFLAGS=-lc++

CMake find_program script

CMake find_program() does not generally add file suffixes to the NAMES parameter unless they are specified manually. On Windows, the .com and .exe file suffixes are also considered, with search order:

  1. .com
  2. .exe
  3. no suffix

If on Windows and an executable “hello.exe” or “hello.com” exists, then CMake will find it. CMake would NOT find “hello.exe” on non-Windows platforms, where no file suffix is expected.

The full paths to executables on the system Path (and cwd on Windows) are found by find_program(). Shell aliases are not found since the shell is not invoked. Instead, specify find_program(... HINTS /path/to/exe).

NOTE: CMAKE_EXECUTABLE_SUFFIX ONLY affects find_program() in CMake role PROJECT

find_program(v NAMES hello)

Shell scripts of any file suffix on any operating system are found iff:

  1. (non-Windows) script file executable flag is set (find_program general requirement)
  2. script file suffix is specified as part of find_program(… NAMES) parameter

A complete standalone example:

Detect CMake generator from CMakeCache.txt

The CMAKE_GENERATOR cache variable records the CMake generator used to configure the CMake project in the CMakeCache.txt CMake cache file. Surprisingly, the cmake -LA option does not emit the CMAKE_GENERATOR value.

Thus, parsing CMakeCache.txt will give the previously used CMake generator. This is relevant in automated processes such as CI/CD systems that may build for numerous configurations and generators. This parsing can be trivially done in scripts in many coding languages. Here we give an example in CMake script “detect_gen.cmake”:

cmake_minimum_required(VERSION 3.21)

# bindir is given on the command line: -Dbindir=/path/to/build
file(REAL_PATH "${bindir}" bindir EXPAND_TILDE)

file(READ "${bindir}/CMakeCache.txt" _cache)

# capture to the end of the line: generator names like "Unix Makefiles" contain spaces
if(_cache MATCHES "CMAKE_GENERATOR:INTERNAL=([^\r\n]+)")
  set(CMAKE_GENERATOR "${CMAKE_MATCH_1}")
  message(STATUS "Detected CMake generator: ${CMAKE_GENERATOR}")
else()
  message(FATAL_ERROR "Failed to detect CMake generator")
endif()

This script is used like:

cmake -Dbindir=/path/to/build -P detect_gen.cmake
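As an illustration of doing the same parsing in another language, here is a minimal Python sketch. The function names parse_generator and detect_generator are hypothetical, and the sample text is a made-up excerpt rather than a real cache file.

```python
import re
from pathlib import Path


def parse_generator(cache_text):
    """Return the generator recorded in CMakeCache.txt text, or None."""
    # capture the whole line, since names like "Unix Makefiles" contain spaces
    m = re.search(r"^CMAKE_GENERATOR:INTERNAL=(.+)$", cache_text, flags=re.MULTILINE)
    return m.group(1).strip() if m else None


def detect_generator(bindir):
    """Read CMakeCache.txt from a build directory and parse the generator."""
    cache = Path(bindir).expanduser() / "CMakeCache.txt"
    return parse_generator(cache.read_text())


# made-up excerpt for illustration
sample = (
    "CMAKE_EXTRA_GENERATOR:INTERNAL=\n"
    "CMAKE_GENERATOR:INTERNAL=Unix Makefiles\n"
    "CMAKE_GENERATOR_PLATFORM:INTERNAL=\n"
)
generator = parse_generator(sample)
```

Anchoring the pattern at the start of the line keeps similarly named entries such as CMAKE_EXTRA_GENERATOR from matching.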

CMake rebuild cache

CMake directory property CMAKE_CONFIGURE_DEPENDS can be used to specify additional dependencies for the configuration step. For example, if a JSON file is used to specify source files, CMake wouldn’t detect if a source file was added, removed, or modified without CMAKE_CONFIGURE_DEPENDS.

Sometimes, the situation is too complicated to specify all dependencies manually. If a change is made that requires CMake to rebuild the cache, two equivalent ways to do this without modifying previously set options are:

cmake -Bbuild
# preserves prior options

or

cmake --build build -t rebuild_cache

Iterate Matlab versions with CMake

These techniques work with any versioned program or library; here we use Matlab as an example. When any version within a known-working range is acceptable, CMake find_package() with a version range is the simpler choice.

Many Matlab codes require a modern version of Matlab. It’s possible to select from an arbitrary min…max range of Matlab versions with CMake FindMatlab as follows.

foreach(v IN ITEMS 23.2 24.1 24.2)
  find_package(Matlab ${v} EXACT COMPONENTS MAIN_PROGRAM)
  if(Matlab_FOUND)
    add_test(NAME matlab-${v}
      COMMAND ${Matlab_MAIN_PROGRAM} -batch "buildtool"
      WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
      )
  endif()
endforeach()