Scientific Computing

CMake build parallel

The CMake environment variable CMAKE_BUILD_PARALLEL_LEVEL sets the default number of parallel build jobs. Parallel builds are virtually always desired to save build and rebuild time. As a starting point, set CMAKE_BUILD_PARALLEL_LEVEL equal to the number of physical or logical CPU cores in the user profile:

#!/bin/bash

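# Set a default only if the variable is unset or empty
if [[ -z "${CMAKE_BUILD_PARALLEL_LEVEL}" ]]; then
  n=8  # fallback when the OS is not recognized
  case "$OSTYPE" in
    linux*)
      n=$(nproc);;
    darwin*)
      n=$(sysctl -n hw.physicalcpu);;
    *bsd*)
      # OSTYPE is e.g. "freebsd14.0", so match "bsd" anywhere
      n=$(sysctl -n hw.ncpu);;
  esac
  export CMAKE_BUILD_PARALLEL_LEVEL=${n}
fi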

Or for Windows, in environment variable settings:

CMAKE_BUILD_PARALLEL_LEVEL=%NUMBER_OF_PROCESSORS%

If the computer runs out of RAM, reduce the specific command parallelism with the cmake --build --parallel N command line option. For Ninja build systems, specific targets can control the number of workers with job pools.
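One way to choose N when RAM is the limit is to cap the job count by available memory. The following is a minimal sketch, not a CMake feature: the helper name pick_jobs and the figure of about 2 GiB per compile job are assumptions for illustration and should be tuned to the project.

```shell
# pick_jobs CORES MEM_GIB GIB_PER_JOB  (hypothetical helper, not part of CMake)
# Prints the smaller of CORES and MEM_GIB / GIB_PER_JOB, but at least 1.
pick_jobs() {
  jobs_by_mem=$(( $2 / $3 ))
  if [ "$jobs_by_mem" -lt "$1" ]; then
    n_jobs=$jobs_by_mem
  else
    n_jobs=$1
  fi
  if [ "$n_jobs" -lt 1 ]; then
    n_jobs=1
  fi
  echo "$n_jobs"
}

# Example: 8 cores, 4 GiB of RAM, assuming about 2 GiB per compile job
pick_jobs 8 4 2
```

The result could then be passed along as cmake --build build --parallel "$(pick_jobs 8 4 2)", where build is an assumed build directory name.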

CMake / Meson compiler flag Wno form

The purpose of a compiler flag check is to test whether a flag is supported. Compiler flag checks in meta-build systems (such as CMake and Meson) must not test the “-Wno-” form of a warning flag. This is because several compilers, including Clang, GCC, and Intel oneAPI, emit a “success” return code 0 for the “-Wno-” form of an unsupported flag.

Incorrect

# CMake
check_compiler_flag(C "-Wno-badflag" has_flag)

# Meson
cc = meson.get_compiler('c')
has_flag = cc.has_argument('-Wno-badflag')

The incorrect CMake and Meson example scripts above will generally set “has_flag = true” for the “-Wno-” form of a warning flag, whether or not the flag is supported.

Correct way

# CMake
include(CheckCompilerFlag)

check_compiler_flag(C "-Wbadflag" has_flag)

if(has_flag)
  target_compile_options(myexe PRIVATE -Wbadflag)
endif()

# Meson
cc = meson.get_compiler('c')
has_flag = cc.has_argument('-Wbadflag')

if has_flag
  executable('myexe', 'myexe.c', c_args : '-Wbadflag')
endif

Meta build system multi-thread

From time to time, the topic of why meta-build systems like CMake and Meson are single-threaded sequentially executing processes is brought up. With desktop workstations (not to mention build farms) having 32, 64, and even 128+ CPU cores increasingly widespread, and the configure / generation step of meta-build systems taking tens of seconds to a few minutes on large projects, developers are understandably frustrated by the lack of parallelism.

A fundamental obstacle for CMake, Meson, and other meta-build systems is that a parallel configure step would require the user’s build scripts to be declarative (dataflow) rather than imperative. That would mean radically reworking both the script syntax and the meta-build system’s internal code.

In Meson, Python threading (only one thread executes at a time) is used for subprojects, which speeds up downloading subproject source code. There is no Python multiprocessing or ProcessPoolExecutor in the Meson configure step. Meson parallel execution is for build (Ninja) and test. Both build and test are also done in parallel in CMake. For CMake, the ExternalProject steps can already be run in parallel (including download) via the underlying build system.

A way to speed up meta-build configure time (here specific to CMake) is to stuff the CMakeCache.txt file with precomputed values and/or use a CMake toolchain file to do likewise, skipping configure tests when the host build system is static. CMakeCache.txt stuffing is a technique Meson uses to speed up configure time of CMake-based subprojects from Meson projects.
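As a rough sketch of cache stuffing from the command line, CMake’s -C option pre-loads a script that populates the cache before the project is configured. The file name precache.cmake, the build directory name, and the cached variable HAVE_UNISTD_H are assumptions for illustration; the names that matter in practice are whichever result variables the project’s own configure checks use.

```shell
# Write an initial-cache script holding a precomputed check result.
# HAVE_UNISTD_H is a hypothetical result variable; use the names your project checks.
cat > precache.cmake <<'EOF'
set(HAVE_UNISTD_H 1 CACHE INTERNAL "precomputed: unistd.h is present")
EOF

# Configure with the pre-loaded cache, only if this is a CMake project directory.
if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
  cmake -C precache.cmake -S . -B build
fi
```

This is only safe when the precomputed values really do match the host, which is why the technique suits static build environments.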

C, C++, Fortran GDB debugging

Debugging C, C++, and Fortran code with GNU Debugger “gdb” is akin to debugging Python with pdb.

Start the GDB debugger, assuming an executable myprog with arguments hello and 3:

gdb --args ./myprog hello 3

Run program in gdb (perhaps after setting breakpoints) with

r

A typical debugging task uses breakpoints. A breakpoint is a location where the debugger pauses the program until you type

c

Set breakpoints by functionName:lineNumber. Example: Set a breakpoint in function myfun on line 32

b myfun:32

For breakpoints in Fortran modules, prefix the module name. In this example, the breakpoint is on line 32 of function myfun in a module named mymod:

b mymod::myfun:32

List all variables in scope in two separate steps:

  1. local variables
  2. arguments to the function.

Variable type, size (bytes/element), and shape (elements/dim) are available for each variable “var” by

whatis var

Example: a Fortran iso_fortran_env real64 3-D array of size 220 x 23 x 83:

var = REAL(8) (220,23,83)

If “var” is a Fortran derived type, get the same information about each record (akin to “property”) of the derived type by:

whatis var%prop

Local variables are variables used only within the scope of the function, excluding arguments to the function. List their names and values with:

info locals

List the names and values of all arguments to the current function:

info args

Example: in integer function myfun(a,b) or subroutine mysub(a,b), upon info args you’d see perhaps

a = 1.5
b = 0.2

If a or b is an array or struct, its values are printed as well.
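The commands above can also be collected into a command file and run non-interactively, which may be handy for repeating a debugging session. This is a sketch that assumes the hypothetical file name debug.gdb and the same myprog and myfun used above; the -batch, -x, and --args options are standard gdb options.

```shell
# Collect the gdb commands from this section into a command file.
cat > debug.gdb <<'EOF'
break myfun
run
info args
info locals
continue
EOF

# Run only if gdb and the example executable are present.
if command -v gdb >/dev/null 2>&1 && [ -x ./myprog ]; then
  gdb -batch -x debug.gdb --args ./myprog hello 3
fi
```

In batch mode gdb exits after the last command, so the output of info args and info locals can be captured to a log file for comparison between runs.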

CMAKE_SYSTEM_NAME detect operating system

CMake OS name flags like APPLE and UNIX are terse and are frequently used. A possible downside is that they have broad, overlapping meanings.

In contrast, CMAKE_SYSTEM_NAME has more fine-grained values such as Linux, Darwin, Windows, and FreeBSD.

However, it is often more convenient (if using care) to use terse variables that are not as specific:

if(APPLE)
  # macOS
elseif(BSD)
  # FreeBSD, NetBSD, OpenBSD, etc.
elseif(LINUX)
  # Linux
elseif(UNIX)
  # Linux, BSD, etc. (including macOS if not trapped above)
elseif(WIN32)
  # Windows
else()
  message(FATAL_ERROR "Unsupported system: ${CMAKE_SYSTEM_NAME}")
endif()

There is not one “right” or “wrong” way.

Octave vs. SciLab vs. Python

John W. Eaton continues to be heavily involved with GNU Octave development as seen in the commit log. The GNU Octave developer community has been making approximately yearly major releases. Octave is useful to:

  • run Matlab code to determine if it’s worth porting a function to Python
  • use Matlab functions from Python with oct2py

Octave allows running Matlab “.m” code without changes for many tasks. “.m” code that calls proprietary toolboxes or advanced functions may not work in Octave.

I generally recommend learning and using Python unless one already has significant experience and a lot of code in Matlab. Practically what happens is that we choose a “good enough” language. What’s important is having a language that most other people are using so we can share results. The team might be building a radar or robot or satellite imager, and what’s being used in those domains is C, C++, Matlab, and Python.

I want a data analysis language that can scale from Cortex-M0 to Raspberry Pi to supercomputer. Yes, Matlab can use the Raspberry Pi as a target, works with software defined radio, etc. But will collaborators have the “right” version of Matlab and the toolbox licenses to replicate results? How can I debug 100 Raspberry Pis sitting out in a field? I need to use the GPIO and SDR, do machine learning processing, and forward packets, perhaps using coroutines.

Since 2014, MicroPython has been rapidly growing in the number of MPUs and SoCs it supports. For just a few dollars, numerous IoT wireless modules can run an expansive subset of Python 3 including exception handling, coroutines, etc. For rapid prototyping, one can get the prototype SoC running remote sensing code that passes data to the cloud before the first planning meeting. Consider the ease of development, tooling, and inherent memory safety of higher-level languages.

Like math systems such as Sage, SciLab allows integrating multiple numerical systems together. However, SciLab is its own language, with convenient syntax and a Matlab to SciLab converter. SciLab, IDL, Mathematica, and Maple suffer from a small audience and a limited number of third-party libraries.

Unfreeze lost SSH session

If an SSH session hasn’t been used for a while or the laptop goes to sleep, the SSH session typically disconnects. This leaves a frozen Terminal window that can’t be used. Usually the Ctrl+C keyboard combo does not work.

To avoid having to close the Terminal window, unfreeze the SSH client so that the same Terminal window can be used to reconnect to the SSH session. This avoids needless rearranging when you’ve already got a desired tab/window layout for the Terminal.

In the frozen Terminal window, press these keys in sequence:

Enter ~ .

Python distutils removal

Python distutils was removed in Python 3.12 as proposed in PEP 632. Setuptools 49.1.2 vendored distutils, but setuptools 50.x experienced some friction because so many packages monkeypatch distutils, which had been little maintained for several years.

With distutils deprecation in Python 3.10, migration to setuptools is a topic being worked on by major packages such as Numpy. Aside from major packages in the Scipy/Numpy stack, I don’t recall many current packages relying on distutils. However, there is code in some packages using import distutils that could break.
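To get a quick idea of whether a code base still imports distutils, the source tree can be searched. The following is a minimal sketch: the helper name find_distutils_imports is hypothetical, and the pattern only catches plain import statements at the start of a line, not dynamic imports.

```shell
# find_distutils_imports DIR  (hypothetical helper)
# Lists Python files under DIR containing a line that starts with
# "import distutils" or "from distutils".
find_distutils_imports() {
  find "$1" -type f -name '*.py' -exec \
    grep -lE '^[[:space:]]*(import|from)[[:space:]]+distutils([[:space:].]|$)' {} +
}
```

For example, find_distutils_imports . lists the affected files under the current directory.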

I applaud the decision to remove distutils from Python stdlib despite the fallout. The fallout is a symptom of the legacy baggage of Python’s packaging. Non-stdlib packages like setuptools are so much more nimble that sorely needed improvements can be made more rapidly.

Reference: bug report leading to PEP632

Compiler macro definitions

Compilers define macros that can be used to identify a compiler and platform from compiled code, such as C, C++, Fortran, et al. This can be used for platform-specific or compiler-specific code. If a significant amount of code is needed, it may be better to swap in different code files using the build system instead of lengthy #ifdef logic. There are numerous examples for C and C++ so here we focus on macros of Fortran compilers.

Gfortran compiler macros

Macro definitions are obtained in an OS-agnostic way by:

echo "" | gfortran -dM -E - > macros.txt

that creates a file “macros.txt” containing all the compiler macros.

Commonly used macros to detect operating system / compiler configuration include:

  • _WIN32 1
  • __linux__ 1
  • __unix__ 1
  • __APPLE__ 1
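To see which of these macros a given compiler reports, the generated macros.txt can be searched. This is a small sketch; has_macro is a hypothetical helper, and it assumes the “#define NAME value” line format that the -dM -E output uses.

```shell
# has_macro NAME FILE  (hypothetical helper)
# Succeeds if FILE (as produced by "-dM -E") contains a "#define NAME" line.
has_macro() {
  grep -qE "^#define[[:space:]]+$1([[:space:]]|\$)" "$2"
}

# Example: report the OS macros found in macros.txt, if the file exists
if [ -f macros.txt ]; then
  for m in _WIN32 __linux__ __unix__ __APPLE__; do
    if has_macro "$m" macros.txt; then
      echo "$m is defined"
    fi
  done
fi
```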

CAUTION: these macros are actually not available in the Gfortran compiled programs as they are in GCC. A workaround is to have the build system define these for the particular compiler, OS, etc.

Clang LLVM compiler macros

LLVM-like compilers (including AMD AOCC) macro definitions are obtained in an OS-agnostic way by:

echo "" | clang -dM -E - > macros.txt

that creates a file “macros.txt” containing all the compiler macros.

__VERSION__ and __clang_version__ contain string version information.

Intel oneAPI LLVM compiler macros

oneAPI macros set:

  • __INTEL_LLVM_COMPILER 1

to distinguish from oneAPI Classic compiler macros like __INTEL_COMPILER 1

Cray compiler macros

Detect the Cray compiler wrapper with the compiler macro __CRAYXT_COMPUTE_LINUX_TARGET rather than the non-universal __CRAYXC or __CRAYXE.

Alternatively, use the Cray environment variable PE_ENV and check for CRAY or PrgEnv-.

NVIDIA HPC compiler macros

Print compiler macros like:

touch main.c main.cpp main.f90

nvc -dryrun main.c
nvc++ -dryrun main.cpp
nvfortran -dryrun main.f90

NVIDIA HPC macros include:

  • __NVCOMPILER
  • __NVCOMPILER_LLVM__

Flang-f18 compiler macros

Flang-f18 (flang-new) Fortran compiler macros include __flang__ 1


Other Fortran compiler macros that identify the compiler and platform can be found in CMake source code.