MAJA on HPC: SLURM example

If you run MAJA on an HPC cluster, you will probably need to launch it through a workload manager such as SLURM. Here are a few tips:

  • MAJA uses multi-core parallelism: you can expect processing time to decrease as you add cores, up to about 8 cores, beyond which the benefit tends to diminish;
  • The amount of RAM to allocate per core is usually set by the infrastructure itself; refer to your HPC facility's documentation;
  • If the GIPP configuration files or the DTM still have to be downloaded, you may need to activate your proxy settings within the job to access the internet;
  • Remember that startmaja does not retrieve CAMS products that are missing for the dates of the L1C products. You need to build the CAMS archive before launching your job;
  • While your data directories (L1C, L2A, GIPP, DTM, CAMS…) may stay on the shared filesystem, you can still use the local ‘tmp’ directory of the compute node as the working directory, which avoids slow I/O between the node and the filesystem. To do so, point the ‘repWork’ parameter of ‘folders.txt’ to the node's local ‘$TMPDIR’.
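The tip about the CAMS archive can be turned into a guard at the top of the job script. The following is only a minimal sketch: the helper name `require_nonempty_dir` is hypothetical (it is not part of MAJA), and it only checks that a directory exists and contains at least one entry. It does not verify that the CAMS files actually cover the dates of your L1C products.

```shell
#!/bin/bash
# Hypothetical helper (not part of MAJA): succeed only if the given
# directory exists and contains at least one entry.
require_nonempty_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "ERROR: directory not found: $dir" >&2
        return 1
    fi
    # 'ls -A' also lists hidden entries, but not '.' and '..'
    if [ -z "$(ls -A "$dir")" ]; then
        echo "ERROR: directory is empty: $dir" >&2
        return 1
    fi
    return 0
}
```

In a job script it could be called before startmaja, for instance as `require_nonempty_dir /path/to/CAMS || exit 1`, so that the job stops early instead of spending allocated time on a run that cannot use CAMS data.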

Here is an example of a SLURM script:

#!/bin/bash
#SBATCH --job-name=maja11SMT      # job's name
# --output = name of the output file  --error= name of error file (%j = jobID )
#SBATCH --output=maja11SMT-%j.out
#SBATCH --error=maja11SMT-%j.err
#SBATCH -N 1                        # number of nodes (or --nodes=1)
#SBATCH -n 8                        # number of tasks (or --ntasks=8)
#SBATCH --mem-per-cpu=8G            # memory per core
#SBATCH --time=12:00:00             # Wall Time
#SBATCH --account=toto              # MANDATORY: account (run myaccounts to list your accounts)
#SBATCH --export=none               # start the job with a clean environment

# Go to the submit directory
cd "${SLURM_SUBMIT_DIR}"

# Set the TMPDIR path in folders.txt ($TMPDIR is unknown until the job is assigned to a node)
sed "s|REPLACEME|$TMPDIR|g" folders.template > folders.txt

# If the DTM or GIPP need to be downloaded, activate the proxy (for example, by sourcing your config):
source ~/.open_my_proxy

# Launch startmaja
/softs/MAJA/4.8.0/bin/startmaja -f folders.txt -t 11SMT -d 2023-05-01 -e 2023-06-20 -s LosAngeles --cams -y

In this example, the ‘repWork’ parameter is set by substituting a placeholder in a ‘folders.template’ file, for example:

[Maja_Inputs]
repWork = REPLACEME
repGipp = /work/Maja/maja-gipp
repMNT  = /work/Maja/DTM
repL1   = /work/Maja/L1C
repL2   = /work/Maja/L2A
exeMaja = /softs/MAJA/4.8.0/bin/maja
repCAMS = /work/Maja/CAMS

[DTM_Creation]
repRAW = /work/Maja/DTM/raw
repGSW = /work/Maja/DTM/gsw
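
The substitution step can be made a little more defensive. The sketch below is a variant of the `sed` line in the job script, under stated assumptions: the function name `render_folders` is hypothetical, and it assumes that `$TMPDIR` contains neither `|` nor `&` (both are special in the `sed` replacement used here). It refuses to write the output file when `$TMPDIR` is unset or when the template does not contain the `REPLACEME` placeholder.

```shell
#!/bin/bash
# Hypothetical helper: render a folders file from a template by replacing
# the REPLACEME placeholder with the node-local $TMPDIR.
render_folders() {
    local template="$1" output="$2"
    if [ -z "${TMPDIR:-}" ]; then
        echo "ERROR: TMPDIR is not set" >&2
        return 1
    fi
    if ! grep -q 'REPLACEME' "$template"; then
        echo "ERROR: no REPLACEME placeholder in $template" >&2
        return 1
    fi
    # '|' is used as the sed delimiter because paths contain '/'
    sed "s|REPLACEME|${TMPDIR}|g" "$template" > "$output"
}
```

Inside the job it would be used as `render_folders folders.template folders.txt || exit 1`, so that startmaja is never launched with a literal `REPLACEME` as its working directory.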
