
PBS Examples

1 Submitting an interactive job to the Batch Scheduler
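If you would like to work interactively on a compute node (for example, to compile code or test a command before writing a batch script), you can ask qsub for an interactive session with the -I option together with the usual resource requests. A minimal sketch; the resource values shown here are only illustrative:

shell> qsub -I -lnodes=1:ppn=4 -lwalltime=1:00

When the scheduler starts the job, you are given a shell on one of the allocated compute nodes; exiting that shell ends the interactive job.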

 

2 Hello World

This is an extremely simple PBS script that will spawn a single process on a single node. In this case, it first determines the hostname, then it uses the echo command to print out "Hello World from host " followed by the hostname.

2.1 Bash Hello World PBS Script

This example uses the Bash shell to print a simple "Hello World" message. Note that it specifies the shell with the -S option. If you do not specify a shell using the -S option (either inside the PBS script or as an argument to qsub), then your default shell will be used.

hello-bash()
#PBS -lnodes=1:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be bash
#PBS -S /bin/bash

# print out a hello message
# indicating the host this is running on
export THIS_HOST=`hostname`
echo Hello World from host $THIS_HOST

2.2 Tcsh Hello World PBS Script

This example uses the tcsh shell to print a simple "Hello World" message. Note that it specifies the shell with the -S option. If you do not specify a shell using the -S option (either inside the PBS script or as an argument to qsub), then your default shell will be used.

hello-tcsh()
## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=1:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be tcsh
#PBS -S /bin/tcsh

# print out a hello message
# indicating the host this is running on
setenv THIS_HOST `hostname`
echo Hello World from host $THIS_HOST

 

3 Submitting a PBS Script to the Batch Scheduler

To run a PBS script on the cluster, submit it to the batch scheduler with the qsub command followed by the name of the script you would like to run.

In this example, we submit our Hello World PBS script to the batch scheduler using qsub. Notice that it returns the job identifier when the job is successfully submitted. You can use this job identifier to query the status of your job.

tcsh> qsub hello.pbs
64811.nano.nano.alliance.unm.edu

 

4 Checking on the status of a job

If you would like to check the status of your job, you can use the qstat command. The hello.pbs job may run so quickly that you never see it in the qstat output. The -a option causes qstat to display more information about the jobs currently in the scheduler.

If you would like to see just the status of this job, you would run the following from your shell:

shell> qstat 64811.nano.nano.alliance.unm.edu

Or, the shorter version with just the numeric portion of the job identifier:

shell> qstat 64811

My username is "download" and, for this example, the job identifier is 64811.nano.nano.alliance.unm.edu.

You should note that your job can be in one of three states while it is in the scheduler: Running, Queued, or Exiting, denoted by R, Q, and E respectively in the job State column (the column labelled "S").

tcsh> qstat -a

nano.nano.alliance.unm.edu:
                                                                  Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID   NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
64758.nano.nano.alli jruser   one_long frob0001     1049     1  --     -- 160:0 R 46:27
64760.nano.nano.alli jruser   one_long frob-1000    2037     1  --     -- 160:0 R 46:22
64761.nano.nano.alli jruser   one_long frob-3000    9944     1  --     -- 160:0 R 46:18
64762.nano.nano.alli jruser   one_long frob-6000   21219     1  --     -- 160:0 R 46:14
64763.nano.nano.alli jruser   one_long frob-12000     --     1  --     -- 160:0 Q    --
64764.nano.nano.alli jruser   one_long frob-18000     --     1  --     -- 160:0 Q    --
64765.nano.nano.alli jruser   one_long frob-28000     --     1  --     -- 160:0 Q    --
64766.nano.nano.alli jruser   one_long frob-38000     --     1  --     -- 160:0 Q    --
64770.nano.nano.alli alice    defaultq abcd        32682     4  --     -- 60:00 R 28:24
64797.nano.nano.alli bill     one_node blub11234   18940     1  --     -- 48:00 R 16:09
64799.nano.nano.alli fred     one_node blub112345  24055     1  --     -- 48:00 R 15:25
64800.nano.nano.alli fred     one_node blub112337  26151     1  --     -- 48:00 R 15:19
64801.nano.nano.alli bill     defaultq hoodger     24066     4  --     -- 80:00 R 06:41
64803.nano.nano.alli george   defaultq abc2        13111     2  --     -- 24:00 R 03:18
64804.nano.nano.alli george   defaultq abc4        16579     4  --     -- 24:00 R 03:17
64805.nano.nano.alli george   defaultq abc8           --     8  --     -- 24:00 Q    --
64811.nano.nano.alli download one_node hello.pbs      --     1  --     -- 00:01 Q    --
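If you would rather see only your own jobs instead of scanning the full listing, qstat also accepts a username filter with the -u option. A small sketch, using the username download from this example:

shell> qstat -u download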

 

5 Determining which nodes your job is using

If you would like to check which nodes your job is using, you can pass the -n option to qstat. Note that if you currently have a job running on a node of the cluster, you may freely log into that node in order to check on the status of your job. When your job is finished, your processes on that node will all be killed by the system.

tcsh> qstat -an 

nano.nano.alliance.unm.edu:
                                                                  Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID   NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
64758.nano.nano.alli jruser   one_long frob0001     1049     1  --     -- 160:0 R 46:27
   nano34+nano34+nano34+nano34
64760.nano.nano.alli jruser   one_long frob-1000    2037     1  --     -- 160:0 R 46:22
   nano28+nano28+nano28+nano28
64761.nano.nano.alli jruser   one_long frob-3000    9944     1  --     -- 160:0 R 46:18
   nano12+nano12+nano12+nano12
64762.nano.nano.alli jruser   one_long frob-6000   21219     1  --     -- 160:0 R 46:14
   nano11+nano11+nano11+nano11
64763.nano.nano.alli jruser   one_long frob-12000     --     1  --     -- 160:0 Q    --
   --
64764.nano.nano.alli jruser   one_long frob-18000     --     1  --     -- 160:0 Q    --
   --
64765.nano.nano.alli jruser   one_long frob-28000     --     1  --     -- 160:0 Q    --
   --
64766.nano.nano.alli jruser   one_long frob-38000     --     1  --     -- 160:0 Q    --
   --
64770.nano.nano.alli alice    defaultq abcd        32682     4  --     -- 60:00 R 28:24
   nano27+nano27+nano27+nano27+nano25+nano25+nano25+nano25+nano24+nano24+nano24
   +nano24+nano23+nano23+nano23+nano23
64797.nano.nano.alli fred     one_node blub11234   18940     1  --     -- 48:00 R 16:09
   nano20+nano20+nano20+nano20
64799.nano.nano.alli fred     one_node blub12345   24055     1  --     -- 48:00 R 15:25
   nano17+nano17+nano17+nano17
64800.nano.nano.alli fred     one_node blub12337   26151     1  --     -- 48:00 R 15:19
   nano16+nano16+nano16+nano16
64801.nano.nano.alli bill     defaultq hoodger     24066     4  --     -- 80:00 R 06:41
   nano26+nano26+nano26+nano26+nano22+nano22+nano22+nano22+nano19+nano19+nano19
   +nano19+nano18+nano18+nano18+nano18
64803.nano.nano.alli george   defaultq abc2        13111     2  --     -- 24:00 R 03:18
   nano32+nano32+nano32+nano32+nano31+nano31+nano31+nano31
64804.nano.nano.alli george   defaultq abc4        16579     4  --     -- 24:00 R 03:17
   nano29+nano29+nano29+nano29+nano21+nano21+nano21+nano21+nano15+nano15+nano15
   +nano15+nano14+nano14+nano14+nano14
64805.nano.nano.alli george   defaultq abc8           --     8  --     -- 24:00 Q    --
   --
64811.nano.nano.alli download one_node hello.pbs      --     1  --     -- 00:01 Q    --
   --
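Once you know which node(s) a running job occupies (for instance, nano34 for the first job in the listing above), you can log into one of them and look at your processes directly. A hypothetical session; the node name is taken from the listing above, and the commands shown are ordinary Linux tools rather than PBS commands:

shell> ssh nano34
shell> ps -u $USER
shell> exit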

 

6 Viewing output and error files

Once your job has completed, you should see two files in the directory that you submitted the job from. By default, these will be named <jobname>.oXXXXX and <jobname>.eXXXXX, where <jobname> is replaced by the name of the PBS script and the X's are replaced by the numeric portion of the job identifier returned by qsub. Any output the job sends to "standard output" will be written to the hello.pbs.oXXXXX file and any output sent to "standard error" will be written to the hello.pbs.eXXXXX file. These files are referred to as the "output file" and the "error file" respectively throughout this document.
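For example, assuming the job identifier 64811 returned by qsub above, a listing of the submission directory after the job completes might look roughly like this (your job number will differ):

shell> ls hello.pbs.*
hello.pbs.e64811  hello.pbs.o64811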

For my Hello World job, the error file is empty and the output file contains the following:

Nano Portable Batch System Prologue
Job Id: 64811.nano.nano.alliance.unm.edu
Username: download

prologue running on host: nano10

Hello World from host nano10

Nano Portable Batch System Epilogue

 

7 Multi-process Hello World (Single Machine)

In this example, we use the "mpirun" command to spawn the same process on each of the processors on the compute node. In this case, we spawn a shell on each of the processors available on the compute node that prints the MPI ID of the process and the total number of processes.

7.1 Bash

## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=1:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be bash
#PBS -S /bin/bash

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
export PATH=/opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# print out a hello message from each of the processors on this host
# indicating the host this is running on
export THIS_HOST=`hostname`
mpirun -np 4 -machinefile $PBS_NODEFILE /bin/sh -c \
    "echo Hello World from process \$MXMPI_ID of \$MXMPI_NP on host $THIS_HOST"

7.2 Tcsh

## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=1:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be tcsh
#PBS -S /bin/tcsh

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
setenv PATH /opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# print out a hello message from each of the processors on this host
# indicating the host this is running on
setenv THIS_HOST `hostname`
mpirun -np 4 -machinefile $PBS_NODEFILE /bin/sh -c \
    'echo Hello World from process $MXMPI_ID of $MXMPI_NP on host $THIS_HOST'
echo Hello World from host `hostname`


7.2.1 Output

In this job's output file, you should see something like this.

Nano Portable Batch System Prologue
Job Id: 64829.nano.nano.alliance.unm.edu
Username: download

prologue running on host: nano09

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

Hello World from process 1 of 16 on host nano09
Hello World from process 2 of 16 on host nano09
Hello World from process 3 of 16 on host nano09
Hello World from process 0 of 16 on host nano09
Hello World from process 6 of 16 on host nano08
Hello World from process 5 of 16 on host nano08
Hello World from process 4 of 16 on host nano08
Hello World from process 7 of 16 on host nano08
Hello World from process 8 of 16 on host nano06
Hello World from process 11 of 16 on host nano06
Hello World from process 10 of 16 on host nano06
Hello World from process 9 of 16 on host nano06
Hello World from process 14 of 16 on host nano05
Hello World from process 12 of 16 on host nano05
Hello World from process 15 of 16 on host nano05
Hello World from process 13 of 16 on host nano05

Nano Portable Batch System Epilogue

 

8 Multi-Node Hello World

In this example, we use the "mpirun" command to spawn the same process on each of the processors on the four compute nodes we've requested. In this case, we spawn a shell on each of the processors available to the job that prints the MPI ID of the process and the total number of processes.

8.1 Bash

## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=4:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be bash
#PBS -S /bin/bash

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
export PATH=/opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# print out a hello message from each of the processors
# indicating the host each process is running on
mpirun -np 16 -machinefile $PBS_NODEFILE /bin/sh -c \
    "echo Hello World from process \$MXMPI_ID of \$MXMPI_NP on host \`hostname\`"

8.2 Tcsh

## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=4:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be tcsh
#PBS -S /bin/tcsh

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
setenv PATH /opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# print out a hello message from each of the processors
# indicating the host each process is running on
mpirun -np 16 -machinefile $PBS_NODEFILE /bin/sh -c \
    'echo Hello World from process $MXMPI_ID of $MXMPI_NP on host `hostname`'

 

9 Multi-Node MPI Hello World (from C and Fortran77 Source Code)

The following examples show how to run an MPI "Hello World" program compiled from either C or Fortran77 source code. These examples each consist of a source code file, a Makefile, and a PBS script.

The C and Fortran programs are very similar. Both call MPI_Init to initialize MPI communications, MPI_Comm_size to determine the number of processes in the computation, and MPI_Comm_rank to determine this process's rank in the computation, and then print a message with the process rank and the computation size. The C version additionally calls gethostname and includes the hostname of the current machine in its message.

9.1 hello.c Source Code

hello.c()
/* Introductory Example
   Copyright (c) 2010 The Center for Advanced Research Computing
                            at The University of New Mexico */
/* Include the MPI header file */
#include "mpi.h"
/* headers for printf and gethostname */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int n, myid, numprocs, rc, i, j, k;
    char hostname[256];
    size_t len = 255;

    /* Initialize MPI */
    rc = MPI_Init( &argc, &argv );
    /* store the number of processors for this computation in numprocs */
    rc = MPI_Comm_size( MPI_COMM_WORLD, &numprocs );
    /* store the rank of this process in myid */
    rc = MPI_Comm_rank( MPI_COMM_WORLD, &myid );
    /* store the hostname in hostname */
    rc = gethostname( hostname, len );

    printf( "\nHello World from process: %d of %d on host: %s\n",
            myid, numprocs, hostname );

    /* Finalize MPI */
    rc = MPI_Finalize();

    return 0;
}

9.2 Makefile

Makefile()
## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
# Makefile to compile MPI Hello World C program: hello.c
MPI_INCLUDE   = -I/opt/local/mpich-mx-gnu-4.1.0/include
MPI_LIB_PATHS = -L/opt/local/mpich-mx-gnu-4.1.0/lib/ -L/opt/mx/lib 
MPI_LIBRARIES = -lmpich -lmyriexpress
hello: hello.o
        gcc hello.o $(MPI_LIB_PATHS) $(MPI_LIBRARIES) -o hello
hello.o: hello.c
        gcc -c $(MPI_INCLUDE) hello.c -o hello.o
.PHONY: clean
clean:
        rm -f hello.o hello

9.3 PBS Script

hello-c-pbs()
# PBS Script for "Hello World" MPI job
## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=4:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be tcsh
#PBS -S /bin/tcsh

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
setenv PATH /opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# run hello on 16 processors
mpirun -np 16 -machinefile $PBS_NODEFILE hello
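Putting the pieces together: with hello.c, the Makefile, and the PBS script above saved in the same directory, a typical build-and-submit sequence might look like the following. The script name hello-c.pbs is chosen here only for illustration; also note that a batch job starts in your home directory, so depending on where the hello binary lives you may need to give mpirun a full path to it or add a cd $PBS_O_WORKDIR line to the script.

shell> make
shell> qsub hello-c.pbs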

9.4 hello.f Source Code

hello.f()
! Introductory Example
! Copyright (c) 2010 The Center for Advanced Research Computing
!                          at The University of New Mexico
        program helloworld

      include 'mpif.h'
      integer comm, rank, numproc, ierror

! Initialize MPI.
      call MPI_INIT(ierror)
      call MPI_COMM_RANK(mpi_comm_world, rank, ierror)
      call MPI_COMM_SIZE(mpi_comm_world, numproc, ierror)

      print *,"Hello World from processor",rank,"of",numproc

      if (rank == 0) then
         print *,"Hello again from processor", rank
      endif

      call MPI_FINALIZE(ierror)

      end

9.5 Makefile

Makefile()
## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
# Makefile to compile MPI Hello World Fortran program: hello.f
MPI_INCLUDE   = -I/opt/local/mpich-mx-gnu-4.1.0/include
MPI_LIB_PATHS = -L/opt/local/mpich-mx-gnu-4.1.0/lib/ -L/opt/mx/lib 
MPI_LIBRARIES = -lmpich -lmyriexpress
hello: hello.o
        gfortran hello.o $(MPI_LIB_PATHS) $(MPI_LIBRARIES) -o hello

hello.o: hello.f
        gfortran -c $(MPI_INCLUDE) hello.f -o hello.o

.PHONY: clean
clean:
        rm -f hello.o hello

9.6 PBS Script

hello-fortran-pbs()
# PBS Script for "Hello World" MPI job
## Introductory Example
## Copyright (c) 2010 The Center for Advanced Research Computing
##                          at The University of New Mexico
#PBS -lnodes=4:ppn=4
#PBS -lwalltime=1:00
## Specify the shell to be tcsh
#PBS -S /bin/tcsh

# set up the PATH environment variable for the
# MX (Myrinet) version of mpirun
setenv PATH /opt/local/mpich-mx-gnu-4.1.0/bin/:$PATH

# run hello on 16 processors
mpirun -np 16 -machinefile $PBS_NODEFILE hello
