This function submits a single script to a high-performance cluster using a scheduler (Slurm or TORQUE). It accepts a vector of arguments to be passed to the scheduler and a vector of environment variables that should be passed to the compute node at job execution.

The function returns the jobid of the scheduled job.

Usage

cluster_job_submit(
  script,
  scheduler = "slurm",
  sched_args = NULL,
  env_variables = NULL,
  export_all = FALSE,
  echo = TRUE,
  fail_on_error = FALSE,
  wait_jobs = NULL,
  wait_signal = "afterok",
  repolling_interval = 60,
  tracking_sqlite_db = NULL,
  tracking_args = list()
)

Arguments

script: A path to a script that should be executed by the scheduler or a single command to be run. When script does not point to an existing file, it is treated as a one-line command to execute. In the case of conflicts, the directives passed with sched_args will take precedence.
scheduler: Which scheduler to use for job submission. Options are 'qsub', 'torque', 'sbatch', 'slurm', or 'sh'. The terms 'qsub' and 'torque' are aliases (where 'torque' submits via the qsub command). Likewise for 'sbatch' and 'slurm'. The scheduler 'sh' does not submit to any scheduler at all, but instead executes the command immediately via sh.
sched_args: A character vector of arguments to be included in the scheduling command. On TORQUE, these will typically begin with '-l' such as '-l wall_time=10:00:00'.
env_variables: A named character vector containing environment variables and their values to be passed to the script at execution time. This is handled by the '-v' directive on TORQUE clusters and by '–export' on Slurm clusters. The names of this vector are the environment variable names and the values of the vector are the environment variable values to be passed in. If you want to propagate the current value of an environment variable to the compute node at runtime, use NA as the value of the element in env_variables. See examples.
export_all: Whether to export all environment variables to the compute node at runtime. Default: FALSE
echo: Whether to echo the job submission command to the terminal at the time it is scheduled. Default: TRUE.
fail_on_error: Whether to stop execution of the script (TRUE), or issue a warning (FALSE) if the job submission fails. Defaults to FALSE (i.e., issue a warning).
wait_jobs: a character string of jobs or process ids that should complete before this job is executed
wait_signal: on torque or slurm clusters, the signal that should indicate that parent jobs have finished.
repolling_interval: The number of seconds to wait before rechecking job status (used only for local scheduler)
tracking_sqlite_db: Path to a SQLite database used for job tracking. If provided, job submission metadata will be recorded, including job dependencies and parent-child relationships.
tracking_args: A named list of metadata fields to track about the submitted job. This may include the parent job ID, batch script path, and scheduler options. This is used in conjunction with tracking_sqlite_db to support job tracking.

Value

A character string containing the jobid of the scheduled job.

Author

Michael Hallquist

Examples

if (FALSE) { # \dontrun{
  #simple PBS submission
  cluster_job_submit('myscript.bash', scheduler="torque", 
     sched_args=c('-l walltime=10:00:00', '-l nodes=1:ppn=20'),
     env_variables=c(RUN_INDEX=2, MODEL_NAME='FSE21'))

  #To forward environment variables without explicitly providing values. Note that these must
  #  be in R's system environment (cf. Sys.getenv) at execution time to forward as expected.
  cluster_job_submit('myscript.sbatch', scheduler="slurm",
     sched_args=c('-p general', '-N 1', '-n 12', '--mem=10g', '-t 02-00:00:00'),
     env_variables=c(RUN_INDEX=2, R_HOME=NA, JAVA_HOME=NA))

  # submit a single command without a script file
  cluster_job_submit('echo "hello"', scheduler="slurm")
} # }