Scientific Computing

Python for Windows on ARM

Anaconda Python is working toward Windows on ARM support. For now, Anaconda / Miniconda Python work on Windows on ARM via the built-in Prism emulation. To use native ARM64 Python, which can be useful for benchmarking or for best computing performance, install plain CPython for ARM64, for example:

winget install Python.Python.3.14

Upon installing and starting, one sees the ARM64 designation in the Python dialogs.
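
The architecture can also be checked from within Python itself. The following is a minimal sketch using only the standard library; the function name interpreter_arch is hypothetical. On Windows, sysconfig.get_platform() is expected to report "win-arm64" for a native ARM64 build and "win-amd64" for an x64 build running under emulation.

```python
import platform
import sys
import sysconfig


def interpreter_arch() -> dict:
    """Collect strings that reveal which architecture this Python was built for."""
    return {
        # platform tag of the interpreter build, e.g. "win-arm64" or "win-amd64"
        "sysconfig_platform": sysconfig.get_platform(),
        # machine type as reported to this process
        "machine": platform.machine(),
        # full version banner, which names the compiler and architecture on Windows
        "version": sys.version,
    }


if __name__ == "__main__":
    for key, value in interpreter_arch().items():
        print(f"{key}: {value}")
```

Comparing the output of this script between the Miniconda interpreter and the winget-installed interpreter should make the emulated versus native distinction visible.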

GDL GNU Data Language build

GDL (GNU Data Language) is a free/libre open-source program that runs a good percentage of IDL code. GDL is actively developed and easily installed by:

  • Linux: apt install gnudatalanguage
  • macOS: use the weekly gdl-macOS-arm64-standard.dmg build. We use this instead of Homebrew because the homebrew/science tap for gnudatalanguage is currently unmaintained.
  • Windows: get the latest release

Building GDL from source uses the GDL build script “scripts/build_gdl.sh” to get the prerequisites. If Anaconda Python is present, run conda deactivate first to avoid library problems when building GDL.

git clone https://github.com/gnudatalanguage/gdl

cd gdl/

cmake -B build --install-prefix=$HOME/gdl

cmake --build build --parallel

(optional) Test the build before installing. Several plots appear and disappear automatically during this test, which takes a few minutes.

ctest --test-dir build -V

Install (do not use sudo):

cmake --install build

Do not build on an ExFAT / FAT32 drive: the build will fail because these filesystems do not support symbolic links. If CMake reports that libeigen is too old, install LibEigen3 or disable Eigen3:

cmake -B build -DEIGEN3=OFF
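
Whether a drive supports symbolic links can be probed before starting a long build. The following is a minimal sketch, not part of the GDL build system; the function name supports_symlinks is hypothetical. It tries to create a symbolic link in a throwaway subdirectory of the intended build location.

```python
import os
import tempfile
from pathlib import Path


def supports_symlinks(directory: str) -> bool:
    """Return True if a symbolic link can be created inside `directory`."""
    # work inside a temporary subdirectory so that nothing is left behind
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        target = Path(tmp) / "target.txt"
        link = Path(tmp) / "link.txt"
        target.write_text("probe")
        try:
            link.symlink_to(target)
        except (OSError, NotImplementedError):
            # e.g. ExFAT / FAT32, or insufficient privilege to create links
            return False
        return link.is_symlink()


if __name__ == "__main__":
    print(supports_symlinks(os.getcwd()))
```

A False result suggests cloning and building GDL on a different drive.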

To use the Linux distro’s older version of GDL, just use /usr/local/bin/gdl or similar, or rename ~/.local/bin/gdl to ~/.local/bin/gdl0.98 or similar.

Troubleshooting build:

  • Runtime search path conflicts: temporarily comment out those paths in ~/.profile (typically from Anaconda Python, libreadline, libhistory, libz, libjpeg.so)
  • Problems with LZMA: try disabling HDF5 with cmake -B build -DHDF5=OFF

Clear Pacman database lock

Pacman operations may fail with a message like:

failed to synchronize all databases (unable to lock database)

This may occur if the system was interrupted during a Pacman operation, leaving a lock file that prevents further package management operations. The lock file path is given by:

$(pacman-conf DBPath)/db.lck

which is typically “/var/lib/pacman/db.lck”. Check that no other Pacman process is running:

ps -ef | grep pacman

Then the Pacman lock file can be removed:

rm $(pacman-conf DBPath)/db.lck
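
The check-then-remove sequence above can be combined so that the lock is never removed while Pacman is running. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the pgrep utility is assumed to be available, and the script is assumed to run with sufficient permission to delete the lock file.

```python
import subprocess
from pathlib import Path


def pacman_running() -> bool:
    """True if a process named exactly "pacman" exists (assumes pgrep is installed)."""
    proc = subprocess.run(["pgrep", "-x", "pacman"], capture_output=True)
    return proc.returncode == 0


def remove_stale_lock(db_path: str, is_running=pacman_running) -> bool:
    """Remove <db_path>/db.lck only when no Pacman process is running.

    Returns True if a lock file was removed, False if there was none.
    """
    lock = Path(db_path) / "db.lck"
    if not lock.is_file():
        return False  # nothing to clear
    if is_running():
        raise RuntimeError(f"Pacman appears to be running; not removing {lock}")
    lock.unlink()
    return True


if __name__ == "__main__":
    # DBPath comes from pacman-conf, as in the commands above
    db = subprocess.run(
        ["pacman-conf", "DBPath"], capture_output=True, text=True, check=True
    ).stdout.strip()
    print("removed lock" if remove_stale_lock(db) else "no lock file found")
```

The is_running parameter exists only so that the process check can be replaced, for example on a system without pgrep.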

GitHub outage workaround with SSH instead of HTTPS

Anecdotally, we have observed that during GitHub outages, Git over SSH operations may be more likely to succeed than Git over HTTPS operations. This includes cloning repositories.

Rather than reconfiguring global Git settings ~/.gitconfig to use SSH, simply clone using the SSH URL instead of the HTTPS URL.

For example, instead of:

git clone https://github.com/user/repo.git

use the SSH URL, assuming Git over SSH is set up on the computer:

git clone git@github.com:user/repo.git
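
When many repository URLs need converting, the rewrite from the HTTPS form to the SSH form can be scripted. The following is a minimal sketch; the function name https_to_ssh is hypothetical, and the SSH user is assumed to be "git", as in the GitHub example above.

```python
from urllib.parse import urlsplit


def https_to_ssh(url: str) -> str:
    """Convert https://host/user/repo.git to the scp-like git@host:user/repo.git form."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an HTTPS URL: {url}")
    path = parts.path.strip("/")
    if not path:
        raise ValueError(f"no repository path in: {url}")
    # assumption: the SSH user is "git"
    return f"git@{parts.hostname}:{path}"


if __name__ == "__main__":
    print(https_to_ssh("https://github.com/user/repo.git"))
    # prints git@github.com:user/repo.git
```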

Recursively clean CMake build directories

CMake build directories might contain hundreds of megabytes of files for large projects. Over time, a developer computer might accumulate forgotten build directories that waste tens of gigabytes of disk space. With Python, a script using send2trash allows safe removal of CMake build directories by first moving them to the OS Trash / Recycle Bin.

OS | Trash location
macOS | ~/.Trash
Linux | ~/.local/share/Trash/files
Windows | Hidden folder, accessed from PowerShell like: Get-ChildItem -Path 'C:\$Recycle.Bin' -Force

Unlike shutil.rmtree, this send2trash approach allows recovery of files if the deletion was accidental. The heuristic used to detect a CMake build directory was inspired by ctest_empty_binary_directory.
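
The approach described above can be sketched as follows. This is not the original script: the function names are hypothetical, and the detection heuristic is assumed to be simply "the directory contains CMakeCache.txt", loosely modeled on the check made by ctest_empty_binary_directory. The third-party send2trash package is imported only when directories are actually trashed.

```python
import os
from pathlib import Path


def find_cmake_build_dirs(root):
    """Yield directories under `root` that contain a CMakeCache.txt file."""
    for dirpath, dirnames, filenames in os.walk(root):
        if "CMakeCache.txt" in filenames:
            yield Path(dirpath)
            dirnames.clear()  # do not descend into a detected build directory


def trash_build_dirs(root, trash=None, dry_run=True):
    """Move detected build directories to the OS Trash; return the list found."""
    if trash is None and not dry_run:
        # third-party package: pip install send2trash
        from send2trash import send2trash
        trash = send2trash
    found = list(find_cmake_build_dirs(root))
    for directory in found:
        if dry_run:
            print("would trash:", directory)
        else:
            trash(str(directory))
    return found


if __name__ == "__main__":
    # default is a dry run that only lists candidates under the current directory
    trash_build_dirs(os.getcwd())
```

Defaulting to a dry run keeps the first invocation harmless; passing dry_run=False performs the move to the Trash.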

mpi_f08 Fortran on Windows

The Fortran statement use mpi_f08 is recommended for Fortran across computing platforms, including Windows.

For native Windows x86 (Intel / AMD CPU) binaries, currently only the free Intel oneAPI provides mpi_f08 for Fortran. As ARM64 CPUs become more widespread, including in Windows PCs, and given the complexity and disk space requirements of setting up Visual Studio for Intel oneAPI on Windows, it may be better (easier, faster, better performance) to use WSL for MPI on Windows. Under WSL, OpenMPI or MPICH provide mpi_f08. For Windows ARM CPU users, WSL is the only straightforward option for mpi_f08 in Fortran.

Git submodule shallow

Git projects that use submodules can default to shallow clones of those submodules to save space and time. Edit the “.gitmodules” file to set the “shallow = true” option for each Git submodule. This is particularly useful when the top-level project uses third-party libraries or libraries with a large Git revision history.

Example .gitmodules file with shallow Git submodules:

[submodule "proj1"]
	path = proj1
	url = https://github.invalid/nobody/proj1
	shallow = true

Then either a Git clone with the --recurse-submodules option:

git clone --recurse-submodules <url>

or, if already cloned, a Git submodule update with the --init --recursive options:

git submodule update --init --recursive

performs a shallow clone of the Git submodules.

Confirm that a submodule is shallow cloned by querying its repository:

git -C ./proj1 rev-parse --is-shallow-repository

This returns “true” when the submodule is shallow cloned. Repeat for each submodule.
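
For a project with many submodules, it can help to list which entries in “.gitmodules” still lack the shallow option. The following is a minimal sketch with hypothetical function names. It parses only the simple layout shown in the example above and is not a full Git config parser; git config -f .gitmodules --list remains the authoritative way to read the file.

```python
import re

# matches a section header like: [submodule "proj1"]
SECTION = re.compile(r'^\[submodule\s+"(?P<name>.+)"\]$')


def submodule_options(text: str) -> dict:
    """Map each submodule name to its key/value options."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()  # tolerate tab or space indentation
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        match = SECTION.match(line)
        if match:
            current = result.setdefault(match["name"], {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip().lower()] = value.strip()
    return result


def not_shallow(text: str) -> list:
    """Names of submodules that lack `shallow = true`."""
    return [
        name
        for name, opts in submodule_options(text).items()
        if opts.get("shallow", "false").lower() != "true"
    ]


if __name__ == "__main__":
    with open(".gitmodules", encoding="utf-8") as f:
        print(not_shallow(f.read()))
```

An empty list indicates that every submodule in the file requests a shallow clone.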

Fortran submodule file naming

A Fortran submodule may be defined in the same file as its parent Fortran module or in a different file. Meta-build systems such as CMake and Meson are aware that each Fortran compiler has distinct Fortran submodule naming conventions. Fortran module and submodule interface files are like generated header files.

The order of the commands for each compiler is significant and is handled by the meta-build system. The Fortran module must be built before the Fortran submodule, so that the necessary module interface files exist before the submodule is compiled. Each compiler creates corresponding basic.o and basic_sub.o object files.

  1. gfortran -c basic.f90 creates object and module files: basic.o demo.mod demo.smod
  2. gfortran -c basic_sub.f90 -o basic_sub.o creates object and submodule files: basic_sub.o demo@hi.smod
  3. gfortran basic.o basic_sub.o -o basic creates executable basic

In the table below, “module files” are the files generated by step #1, building the file containing the Fortran module. “submodule files” are the files generated by step #2, building the file containing the Fortran submodule.

Compiler | module files | submodule files
GCC gfortran | demo.mod demo.smod | demo@hi.smod
LLVM flang | demo.mod | demo-hi.mod
Nvidia HPC nvfortran | demo.mod | demo-hi.mod
Intel ifx | demo.mod | demo@hi.smod
IBM xlf | demo.mod | demo_hi.smod
Cray ftn < 19.0 | DEMO.mod | HI.mod
Cray ftn ≥ 19.0 | DEMO.mod | DEMO.HI.mod

  • GCC Gfortran module and submodule naming convention is defined in module.cc by the definitions “MODULE_EXTENSION” and “SUBMODULE_EXTENSION”.
  • LLVM Flang module and submodule naming convention is defined in Semantics.
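
The submodule file names in the table above follow a small set of patterns, which can be expressed as a lookup. The following is a minimal sketch: the function name and the compiler keys (including "ftn-pre19" for Cray ftn < 19.0) are hypothetical labels, and the patterns are taken only from the table rows.

```python
# (separator, extension) for compilers that keep the original name case
CONVENTIONS = {
    "gfortran": ("@", ".smod"),
    "flang": ("-", ".mod"),
    "nvfortran": ("-", ".mod"),
    "ifx": ("@", ".smod"),
    "xlf": ("_", ".smod"),
}


def submodule_file(compiler: str, module: str, submodule: str) -> str:
    """Expected submodule interface file name for a given compiler."""
    if compiler == "ftn-pre19":
        # Cray ftn < 19.0: upper-case submodule name only
        return f"{submodule.upper()}.mod"
    if compiler == "ftn":
        # Cray ftn >= 19.0: upper-case MODULE.SUBMODULE
        return f"{module.upper()}.{submodule.upper()}.mod"
    sep, ext = CONVENTIONS[compiler]
    return f"{module}{sep}{submodule}{ext}"


if __name__ == "__main__":
    for name in [*CONVENTIONS, "ftn-pre19", "ftn"]:
        print(name, submodule_file(name, "demo", "hi"))
```

In practice a meta-build system handles these names, so a lookup like this is mainly useful for clean-up scripts or for checking build outputs.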

The table above can be derived from the two-file example program below, or from the values of the CMake variables CMAKE_Fortran_SUBMODULE_EXT and CMAKE_Fortran_SUBMODULE_SEP.

file basic.f90

module demo
implicit none
real, parameter :: pi = 4.*atan(1.)
real :: tau

interface
  ! interface only; the procedure body is in submodule "hi"
  module subroutine hello(pi,tau)
    real, intent(in) :: pi
    real, intent(out) :: tau
  end subroutine hello
end interface
contains
end module demo

program sm
use demo
call hello(pi, tau)
print *,'pi=',pi, 'tau=', tau
end program

file basic_sub.f90

submodule (demo) hi
contains
! dummy arguments pi and tau are declared by the interface in module demo
module procedure hello
  tau = 2*pi
end procedure hello
end submodule hi
