Scientific Computing

Static environment variables in GitHub Actions

GitHub Actions environment variables have distinct scopes:

  • Workflow
  • Job
  • Step

It’s trivial to set static environment variables in each of these scopes. Dynamically setting environment variables is also possible.

Workflow

Set static workflow environment variables in GitHub Actions by using env: at the top level of a “.github/workflows/ci.yml” file like:

name: ci

env:
  CTEST_PARALLEL_LEVEL: 0
  CMAKE_BUILD_PARALLEL_LEVEL: 4
  CTEST_NO_TESTS_ACTION: error
  CMAKE_GENERATOR: Ninja
  CC: gcc

Job

Static job environment variables are set like:

jobs:

  base:
    runs-on: macos-latest

    strategy:
      matrix:
        cc: [gcc-13, clang]

    env:
      CMAKE_GENERATOR: Ninja
      CC: ${{ matrix.cc }}

Step

Set static step environment variables like:


    - run: cmake -B build
      env:
        CMAKE_GENERATOR: Ninja
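
For the dynamic case mentioned at the top of this section, GitHub Actions provides a file path in the GITHUB_ENV environment variable. Each NAME=value line appended to that file becomes an environment variable for later steps in the same job. A minimal Python sketch follows, assuming single-line values only; the helper name set_github_env is hypothetical.

```python
import os


def set_github_env(name: str, value: str, env_file: str = "") -> None:
    """Append NAME=value to the GITHUB_ENV file so later steps in the job see it.

    Hypothetical helper: only single-line values are handled.
    """
    env_file = env_file or os.environ.get("GITHUB_ENV", "")
    if not env_file:
        raise RuntimeError("GITHUB_ENV is not set; is this running in GitHub Actions?")
    if "\n" in value:
        raise ValueError("multi-line values need the delimiter syntax, not shown here")
    # Append rather than overwrite: earlier steps may already have added variables
    with open(env_file, "a", encoding="utf-8") as f:
        f.write(f"{name}={value}\n")
```

A step could call set_github_env("CMAKE_GENERATOR", "Ninja") from a Python script; the variable is visible to subsequent steps, not to the step that sets it.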

Install MSYS2 on Windows

MinGW has brought GNU compiler tools to Windows since the late 1990s. MSYS2 provides numerous developer tools, including MinGW, on Windows via the pacman package manager.

Install MSYS2 by:

winget install MSYS2.MSYS2

Start the MSYS2 UCRT console in the Windows Start menu. Update MSYS2 to get the latest packages in the MSYS2 terminal. Run this command up to twice, until it says “nothing to do”.

pacman -Syu

To use GCC and other MSYS2 / MinGW64 programs from PowerShell without disrupting other compiler use, create file “gcc.ps1” containing:

$r="$Env:SystemDrive/msys64"
# The MSYS2 root directory is where MSYS2 is installed. It's CPU architecture dependent.
# for x86_64 CPU:
$r="$r/ucrt64"
# for ARM64 CPU:
#$r="$r/clangarm64"

$b="$r/bin"

$Env:CC="$b/gcc"
$Env:FC="$b/gfortran"
$Env:CXX="$b/g++"

# important to put UCRT first to avoid "collect2.exe: error: ld returned 116 exit status" and DLL Hell
$Env:Path = "$b;$Env:Path"

$Env:CMAKE_PREFIX_PATH="$r"

To use MSYS2 GCC from a PowerShell prompt, run “gcc.ps1”.
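
The same per-session idea can be applied when builds are launched from Python rather than PowerShell. This sketch mirrors “gcc.ps1” by returning a modified copy of the environment for a child process, leaving the parent environment untouched. The function name msys2_gcc_env and the default root C:/msys64 are assumptions for illustration.

```python
import os
from typing import Dict, Optional


def msys2_gcc_env(msys_root: str = "C:/msys64", subdir: str = "ucrt64",
                  base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with MSYS2 GCC selected, like gcc.ps1."""
    env = dict(os.environ if base is None else base)
    r = f"{msys_root}/{subdir}"
    b = f"{r}/bin"
    env["CC"] = f"{b}/gcc"
    env["CXX"] = f"{b}/g++"
    env["FC"] = f"{b}/gfortran"
    # MSYS2 bin goes first on Path, as in gcc.ps1; Windows separates entries with ";"
    path_key = "Path" if "Path" in env else "PATH"
    env[path_key] = b + ";" + env.get(path_key, "")
    env["CMAKE_PREFIX_PATH"] = r
    return env
```

The result could be passed as subprocess.run(["cmake", "-B", "build"], env=msys2_gcc_env()).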

From MSYS2 command prompt, search for packages like:

pacman -Ss gcc

MSYS2 packages of interest for scientific computing include: gcc, gdb, gcc-fortran, clang, boost, lapack, scalapack, hdf5, make, pkgconf, aspell. For x86_64 Windows applications, install packages with the environment prefix “mingw-w64-ucrt-x86_64-”, for example Gfortran is “mingw-w64-ucrt-x86_64-gcc-fortran”. On ARM64 Windows, use the “mingw-w64-clang-aarch64-” environment prefix, for example Clang is “mingw-w64-clang-aarch64-clang”.

You may need to reorder directories in the Windows Path variable. For example GNU Octave may need to be moved lower in the Path list or removed from Path.

If you find that MSYS2 is using more than 500 MB, try clearing the package cache of old package versions:

pacman -Sc

The MSYS2 latest package update feed shows recently updated packages. The MSYS2 Install reference is also useful. Setting PowerShell per-session variables is a useful way to point CC, FC and CXX at the single intended compiler for build systems.

Alternatives

Compared with Cygwin, MSYS2 works from the Windows Command Prompt or PowerShell. MSYS2 provides native Windows binaries. Cygwin does not have a command-line package installer.

Windows Subsystem for Linux is strongly supported, but does not give Windows binaries unless cross-compiling.

Clang MSYS2 environment

Clang, LLVM Flang Fortran compiler, GCC, Boost and many more packages are easily available on Windows via MSYS2. Clang is also available via direct download.

It’s often useful to have separate development environments for each compiler. The PowerShell script “clang.ps1” creates a Clang LLVM environment. We don’t permanently put Clang on the user or system PATH, to avoid DLL conflicts. Running “clang.ps1” in PowerShell enables Clang until that PowerShell window is closed.

For MSYS2 Clang and LLVM Flang Fortran compiler, create “clang.ps1” like:

$r="$Env:SystemDrive/msys64"
# The MSYS2 root directory is where MSYS2 is installed. It's CPU architecture dependent.
# for x86_64 CPU:
$r="$r/ucrt64"
# for ARM64 CPU:
#$r="$r/clangarm64"

$b="$r/bin"

$Env:CC="$b/clang"
$Env:CXX="$b/clang++"
$Env:FC="$b/flang"

# important to put UCRT first to avoid "collect2.exe: error: ld returned 116 exit status" and DLL Hell
$Env:Path = "$b;$Env:Path"

$Env:CMAKE_PREFIX_PATH="$r"

For standalone (non-MSYS2) Clang, create “clang.ps1” like:

$Env:CC="clang"
$Env:CXX="clang++"
$Env:Path = "$Env:ProgramFiles/LLVM/bin;$Env:Path"

If you need to use the MSVC CL-like clang driver clang-cl, create “clang-cl.ps1” and run it when desired.

$Env:CC="clang-cl"
$Env:CXX="clang-cl"
$Env:Path = "$Env:ProgramFiles/LLVM/bin;$Env:Path"

Detect if program was compiled with optimizations

Users and developers might accidentally build a program or library without optimizations when they are desired. This could make the runtime 10 to 1000 times or more slower than it would be with optimizations. This could be devastating in computational cost on HPC and cause needless schedule delays. Programmatically detecting or using a heuristic to determine if a program was built with optimizations can help prevent this. Such methods are language-specific.

  • CMake: NDEBUG is set if CMAKE_BUILD_TYPE is Release or RelWithDebInfo.
  • Meson: NDEBUG is set if buildtype is release or debugoptimized with project(..., default_options: ['b_ndebug=if-release'])
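
On the build-system side, one hedged check is to read the build type that CMake recorded at configure time. This sketch parses the text of a build directory's CMakeCache.txt for the CMAKE_BUILD_TYPE entry. The helper names are hypothetical. Multi-config generators such as Visual Studio choose the configuration at build time, so an empty or missing value does not by itself mean unoptimized.

```python
import re
from typing import Optional


def cmake_build_type(cache_text: str) -> Optional[str]:
    """Return CMAKE_BUILD_TYPE from CMakeCache.txt contents, or None if absent."""
    # Cache entries look like NAME:TYPE=VALUE, one per line
    m = re.search(r"^CMAKE_BUILD_TYPE:[A-Z]+=(.*)$", cache_text, flags=re.MULTILINE)
    return m.group(1).strip() if m else None


def looks_optimized(cache_text: str) -> bool:
    """Heuristic: build types for which the list above says NDEBUG is set."""
    return cmake_build_type(cache_text) in {"Release", "RelWithDebInfo"}
```

Typical use would be looks_optimized(Path("build/CMakeCache.txt").read_text()), where “build” is whatever directory was passed to cmake -B.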

C / C++

There is currently no universal language standard method in C / C++ to determine if optimization was used on build. The standard library uses the presence of macro NDEBUG to disable assertions. Whether NDEBUG is defined can therefore serve as an indication of whether optimizations were used.

bool fs_is_optimized(){
// This is a heuristic, trusting the build system or user to set NDEBUG if optimized.
#if defined(NDEBUG)
  return true;
#else
  return false;
#endif
}

Fortran

If the Fortran code is compiled with preprocessing, a method using NDEBUG as above could be used. Fortran iso_fortran_env provides functions compiler_version and compiler_options. These could be used in a fine-grained, per compiler way to determine if optimizations were used.
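
As one possible per-compiler heuristic, a Fortran program could print compiler_options() and a script could scan that string for optimization flags. The sketch below assumes GCC-style -O flags, where the last -O flag given takes effect; other compilers spell their flags differently and would need their own rules. The function name is hypothetical, and treating -Og as a debug build is a judgment call.

```python
import re


def gnu_style_optimized(options: str) -> bool:
    """Return True if the last GCC-style -O flag in the string enables optimization.

    Assumes `options` is the text from Fortran's compiler_options()
    for a compiler that uses GCC-style flags.
    """
    level = None
    for tok in options.split():
        m = re.fullmatch(r"-O(\d|s|z|g|fast)?", tok)
        if m:
            level = m.group(1) or "1"  # plain -O is equivalent to -O1 in GCC
    # No -O flag at all means the GCC default, which is unoptimized
    return level is not None and level not in {"0", "g"}
```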

Python

Distributed Python environments would virtually always be optimized. One can use heuristic checks to help indicate if the Python executable was built in debug mode. I am not yet aware of a universal method to determine if the CPython executable was built with optimizations.

import sysconfig

debug = bool(sysconfig.get_config_var('Py_DEBUG'))
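
A second heuristic could be combined with the check above: CPython debug builds expose sys.gettotalrefcount, which is absent from release builds. The function name below is hypothetical, and neither check reveals the compiler optimization level used to build CPython.

```python
import sys
import sysconfig


def python_debug_build() -> bool:
    """Heuristic: True if this interpreter looks like a CPython debug build."""
    py_debug = bool(sysconfig.get_config_var("Py_DEBUG"))
    # sys.gettotalrefcount only exists in debug builds of CPython
    has_total_refcount = hasattr(sys, "gettotalrefcount")
    return py_debug or has_total_refcount
```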

HDF5 command line tools

HDF5 command line tools h5dump and h5ls are handy to quickly explore HDF5 files from the command line. They are particularly useful when accessing a remote computer such as HPC where the HDF5 files may be very large and would take a while to transfer to a local computer.


h5ls provides a high-level look at objects in an HDF5 file. Typically we start examining HDF5 files by printing the dataset hierarchy:

h5ls --recursive my.h5

Determine the filters used (e.g. was the data compressed):

h5ls --verbose my.h5

h5dump can print the entire contents of an HDF5 file to the screen. This can be overwhelming, so we typically print only the headers to start:

h5dump --header my.h5

Individual variables can be printed like:

h5dump --dataset=myvar my.h5

Determine the filters used (e.g. was the data compressed):

h5dump --properties --header --dataset=myvar my.h5
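
When the same inspection is repeated over many files, the h5dump commands above could be assembled from a script. This sketch builds the argument list from only the options shown above and runs it if h5dump is found on PATH. The function names are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def h5dump_args(filename: str, dataset: Optional[str] = None,
                header: bool = True, properties: bool = False) -> List[str]:
    """Build an h5dump command line using only the options shown above."""
    args = ["h5dump"]
    if properties:
        args.append("--properties")
    if header:
        args.append("--header")
    if dataset:
        args.append(f"--dataset={dataset}")
    args.append(filename)
    return args


def h5dump(filename: str, **kwargs) -> str:
    """Run h5dump and return its text output; raises if h5dump is not installed."""
    if shutil.which("h5dump") is None:
        raise FileNotFoundError("h5dump not found on PATH")
    result = subprocess.run(h5dump_args(filename, **kwargs), check=True,
                            capture_output=True, text=True)
    return result.stdout
```

For example, h5dump("my.h5", dataset="myvar", properties=True) corresponds to the last command above.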

Related: HDF5 data GUI

CB Radio 11m data telemetry

Mid-range radio control (1 km to 10+ km) and other data telemetry has long been legal in the 27 MHz 11m band across the world. In the USA, FCC Rules Part 95 subpart C addresses 27 MHz data transmissions. 27 MHz is still actively used for data telemetry, with manufacturers claiming up to 15 miles range with a 10 Watt 27.255 MHz data FSK transceiver. Another long-time 27 MHz data telemetry application is 27 MHz paging.

Wireless mice and keyboards in the early 2000s widely used the 27 MHz band. Unfortunately, those devices operating on 27.195 MHz “19A” would significantly interfere with the popular CB channel 19 on 27.185 MHz, and could be heard even when just driving by a house with a CB radio in the vehicle. The same held for the other 27 MHz channels, which bleed across several CB radio channels if the device is in a neighboring house or within, say, 50 meters of a CB radio. This is due to the liberal emissions mask of FCC Part 95.779(a), which allows significant bleedover of unwanted modulation products into adjacent channels. Thankfully, these 27 MHz mice and keyboards have few users these days. They are still sold as low-end inexpensive devices, so they might still be heard in some locales.

The data telemetry or mouse/keyboard transmissions typically use FSK modulation. On an AM receiver, FSK might sound like a quiet transmission with little modulation. Using an FM receiver, FSK typically sounds like a loud buzz or tone.

Detect 10m, 11m, 12m band openings

Detecting band openings in the 10m, 11m, and 12m radio bands can be done by listening to popular frequencies in these bands. The 10m and 12m bands are licensed amateur radio bands capable of global communications when ionospheric conditions are favorable. The 11m band is license-free and typically has more users such that an amateur radio operator may listen to 11m to determine if 12m and/or 10m are also experiencing enhanced skywave propagation.

A good 10m frequency to listen to is in the vicinity of 28.074 MHz and 28.078 MHz, which are the FT8 and JS8Call suppressed-carrier frequencies as tuned in upper-sideband (USB) mode. A converted CB radio in AM mode can tune these on 28.075 MHz. An AM mode radio tuned to 28.075 MHz will hear a seemingly random series of tones with a 15 second interval. The tones heard in an AM receiver come from multiple FT8 or JS8Call signals heterodyning.
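
The heterodyne effect can be illustrated with a little arithmetic. With no carrier present, an AM envelope detector produces audio at the difference frequencies between the signals in its passband. In this sketch only the 28.074 MHz dial frequency comes from the text; the audio offsets are made-up example values, not measurements, and the function names are hypothetical.

```python
from itertools import combinations

FT8_DIAL_HZ = 28_074_000  # USB suppressed-carrier (dial) frequency from the text


def rf_frequency_hz(dial_hz: int, audio_offset_hz: int) -> int:
    """In USB, a tone at audio_offset_hz is transmitted at dial + offset."""
    return dial_hz + audio_offset_hz


def beat_tones_hz(rf_hz):
    """Difference frequencies between every pair of signals, sorted ascending."""
    return sorted(abs(a - b) for a, b in combinations(rf_hz, 2))


# Hypothetical audio offsets of three simultaneous FT8 stations
offsets_hz = [700, 1500, 2300]
signals_hz = [rf_frequency_hz(FT8_DIAL_HZ, o) for o in offsets_hz]
print(beat_tones_hz(signals_hz))  # [800, 800, 1600]
```

In this example all three signals fall within about 1.3 kHz of 28.075 MHz. The set of stations transmitting changes each 15 second period, so the beat tones change with it.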

For 12m, listen to FT8 / JS8Call on USB 24.915 MHz or USB 24.922 MHz. With only a converted AM CB radio, tune 24.915 MHz or 24.925 MHz.

11m DX frequencies to monitor include:

  • AM 27.025 MHz (CB channel 6, high powered calling frequency)
  • AM 27.185 MHz (CB channel 19, road calling channel)
  • AM 27.065 MHz (CB channel 9, Spanish language calling frequency in Central and South America)
  • FM 26.805 MHz (FM 11m DX calling frequency)
  • USB 27.245 MHz (CB channel 25, JS8Call frequency)

C++ size_type property vs size_t

The C++ Standard Library uses size_type as a property of containers like std::vector, std::string, etc. This is generally recommended over using size_t directly.

Example C++ code snippets using size_type property:

std::vector<int> vec;

std::vector<int>::size_type L = vec.size();

//----------------------------------------------
std::string path = "/usr/bin:/usr/local/bin";
constexpr char pathsep = ':';

std::string::size_type start = 0;
std::string::size_type end = path.find_first_of(pathsep, start);

Related: ssize_t for Visual Studio

Install Intel oneAPI C++ and Fortran compiler

Intel oneAPI is a cross-platform toolset that covers several programming languages including C, C++, Fortran and Python. Intel oneAPI includes the C++ “icpx” compiler, Fortran “ifx” compiler, Intel MKL, and Intel MPI. oneAPI is free-to-use and no login is required to download and install. oneAPI can be installed via package manager or manually.

Package manager install

On Windows, Intel oneAPI can be installed via WinGet:

winget install Intel.OneAPI.BaseToolkit
winget install Intel.OneAPI.HPCToolkit

On Linux, Intel oneAPI is typically available via distro package managers.

Manual install

If the package manager install is not available, the manual install is accomplished by the “online installer” download. The “online” installer can be copied over SSH to an HPC user directory for example and installed from the Terminal.

Install the oneAPI Base Toolkit with options:

  • Math Kernel Library (oneMKL)
  • (optional) GDB debugger

Install oneAPI HPC toolkit with options:

  • Intel MPI library
  • Intel C++ compiler
  • Intel Fortran compiler

Intel oneAPI version support spans the last couple releases.

Usage

There are distinct usage patterns to access Intel oneAPI compilers on Windows vs. Linux. Set environment variables CC, CXX, FC via script. oneapi-vars sets environment variable CMAKE_PREFIX_PATH so don’t just blindly overwrite that environment variable.
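
One way to avoid clobbering the value set by oneapi-vars is to prepend to it rather than assign. In this sketch the function name and any example directories are hypothetical. It relies on the CMAKE_PREFIX_PATH environment variable using the platform's path-list separator, which is os.pathsep in Python.

```python
import os
from typing import MutableMapping


def prepend_prefix_path(new_dir: str, env: MutableMapping[str, str],
                        sep: str = os.pathsep) -> None:
    """Prepend new_dir to CMAKE_PREFIX_PATH in env, keeping any existing entries."""
    existing = env.get("CMAKE_PREFIX_PATH", "")
    entries = [p for p in existing.split(sep) if p]
    if new_dir not in entries:
        entries.insert(0, new_dir)
    env["CMAKE_PREFIX_PATH"] = sep.join(entries)
```

Calling prepend_prefix_path(some_dir, os.environ) after the oneAPI environment is loaded keeps the oneAPI entries and puts the extra directory first in the search order.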

Windows

Windows requires Visual Studio Community to be installed first; VS IDE integration is optional. If VS IDE integration is installed, cmake -G can be used to generate Visual Studio project files with CMake 3.29 or newer. Otherwise, CMake 3.25.0 or newer is adequate for oneAPI.

On Windows, a Start menu shortcut for a oneAPI command prompt is installed. PowerShell can also use “oneapi-vars.bat” to set the environment variables as per the oneapi.ps1 in the Gist above.

If CMake Visual Studio generator is desired, ensure:

Troubleshooting

If problems with finding packages with oneAPI on Windows and CMake occur, ensure that MSYS2 paths aren’t mixed in with the oneAPI environment. See the project CMakeConfigureLog.yaml and look for unwanted paths in the include commands.

Linux

On Linux, oneAPI requires the GNU GCC toolchain. Some HPC systems have a default GCC version that is too old for Intel oneAPI. This can cause problems with C++ STL linking. If needed, set environment variable CXXFLAGS to point Intel oneAPI at a newer GCC toolchain in a custom “oneapi.sh” like:

export CXXFLAGS=--gcc-toolchain=/opt/rh/gcc-toolset-12/root/usr/

which can be determined like:

scl enable gcc-toolset-12 "which g++"

If using a CMake toolchain file, instead of CXXFLAGS environment variable, one can set

set(CMAKE_CXX_COMPILER_EXTERNAL_TOOLCHAIN "/opt/rh/gcc-toolset-12/root/usr/")