SGE Guide
Some of our clusters and servers use the SGE (Sun Grid Engine) scheduling and queueing system.
SGE queueing system
SGE main commands
Good references can be found at the following links:
- https://web.njit.edu/topics/HPC/basement/sge/SGE.html
- https://www.hoffman2.idre.ucla.edu/computing/sge-commands/
Start by logging in to the head node with one of the following commands:
ssh <username>@lecs2
ssh lecs2 -l <username>
Create a batch job script, which contains the following lines:
#!/bin/tcsh
cd executables
./a.out
Submit the script to one of the ‘resources’ you belong to, using the -l flag (lowercase ‘L’). For example, if you belong to Itay’s group:
qsub -l itaym <script>
This ‘resource’ parameter appears in the ‘complex_values’ field in the output of the command:
qconf -sq <queue name>
When a job is submitted, qsub prints the job ID that was assigned to the new job:
Your job 6790845 ("<script>") has been submitted
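When jobs are submitted from another script, the job ID can be pulled out of this message. The following is a minimal sketch in POSIX shell, assuming the message has exactly the form shown above; `extract_job_id` is a hypothetical helper name, not part of SGE.

```shell
#!/bin/sh
# Hypothetical helper (not an SGE command): read the qsub confirmation
# message on standard input and print only the numeric job ID.
extract_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*has been submitted.*$/\1/p'
}

# Demonstration with the message text from this guide (no cluster needed).
echo 'Your job 6790845 ("myscript.sh") has been submitted' | extract_job_id
```

On the cluster this could be used as `qsub <script> | extract_job_id`. Some SGE versions also offer `qsub -terse`, which prints only the job ID; check `man qsub` on the head node before relying on it.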
To see the status of your own jobs, run:
qstat -u <username>
This lists all running and queued jobs for the specified user.
The most common job states are:
qw – queued (waiting to run)
r – running
Eqw – in an error state
Rr – the job was rerun after a failure on one of the nodes
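The state codes listed above can be turned into readable text when post-processing qstat output. This is a small sketch covering only the four states in this guide; `describe_state` is a hypothetical helper name, and any other code is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: translate a state code from the qstat "state"
# column into the description used in this guide.
describe_state() {
    case "$1" in
        qw)  echo "queued, waiting to run" ;;
        r)   echo "running" ;;
        Eqw) echo "in an error state" ;;
        Rr)  echo "rerun after a failure on one of the nodes" ;;
        *)   echo "unknown state: $1" ;;
    esac
}

# Print the description of each state covered here.
for state in qw r Eqw Rr; do
    printf '%-4s %s\n' "$state" "$(describe_state "$state")"
done
```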
To see the status of all users' jobs, run:
qstat -u \*
To see the currently available queues and their CPU time and memory limits, run:
qstat -f -g c
To see a list of all the queues:
qconf -sql
To see the status of a specific job, you may run:
qstat -j <job number>
e.g. qstat -j 6790845
By default, the job's standard output and standard error are written to files in your home directory: <script>.o#n and <script>.e#n (where #n is the job number assigned to your job by the batch queueing system).
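The naming rule above can be sketched as a small shell function that builds both file names from the script name and the job number. `job_output_files` is a hypothetical helper, and the optional third argument (the directory, defaulting to the home directory) is an assumption added for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the default stdout/stderr file names for a job.
#   $1 = script name, $2 = job number, $3 = directory (defaults to $HOME)
job_output_files() {
    script_name=$1
    job_id=$2
    out_dir=${3:-$HOME}
    printf '%s/%s.o%s\n' "$out_dir" "$script_name" "$job_id"   # standard output
    printf '%s/%s.e%s\n' "$out_dir" "$script_name" "$job_id"   # standard error
}

# Example: files expected for job 6790845 submitted from myscript.sh.
job_output_files myscript.sh 6790845
```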
To delete a job, use the qdel command:
qdel <job number>
SGE file parameters
Instead of passing options on the qsub command line, you can embed them in the script itself as directives to the scheduler.
Each SGE directive line must start with the prefix ‘#$’. For reference, see http://bioinformatics.mdc-berlin.de/intro2UnixandSGE/sun_grid_engine_for_beginners/how_to_submit_a_job_using_qsub.html
An example script:
#!/bin/tcsh
#$ -e $JOB_NAME.ERR ⇒ error file name
#$ -o $JOB_NAME.OUT ⇒ output file name
#$ -cwd ⇒ run in current working directory
#$ -l dorothee ⇒ use resource dorothee
#$ -N script_name ⇒ job name
#$ -M dvory@tauex.tau.ac.il ⇒ my email address
#$ -m e ⇒ send an email when the script ends
#$ -l hostname=compute-5-0 ⇒ specific node to be used
# Below are regular commands
module load python/python-2.7.2
setenv SOME_ENV_VAR 55
cd /home/dvory
perl my_script.pl
The script can then be submitted with the command:
qsub script
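Before submitting, it can be useful to check which scheduler options a script carries. The sketch below lists the lines that start with the ‘#$’ prefix and strips the prefix; `list_sge_directives` is a hypothetical helper name, and the demo file is a shortened copy of the example script written to a temporary file.

```shell
#!/bin/sh
# Hypothetical helper: print the scheduler directives embedded in a job
# script, i.e. the lines starting with "#$", without the prefix.
list_sge_directives() {
    sed -n 's/^#\$[[:space:]]*//p' "$1"
}

# Demonstration on a temporary copy of a short job script.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/tcsh
#$ -cwd
#$ -N script_name
# Below are regular commands
perl my_script.pl
EOF
list_sge_directives "$demo"
rm -f "$demo"
```

The shebang line and ordinary comments are skipped because they do not begin with ‘#$’.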
Interactive session
Interactive sessions (line mode) are enabled using the command:
qsh
Example: running MATLAB
There are two ways to run MATLAB via SGE:
- Run an interactive job:
qsh -q all.q
bash
matlab
- Submit a MATLAB batch job to the queue:
qsub -q all.q my_table_script.sh
Where:
myTable.m ⇒ a MATLAB file that prints a table of computed values:
function [] = myTable()
% Print the table header.
fprintf('=======================================\n');
fprintf(' a b c d \n');
fprintf('=======================================\n');
% Endless loop: the job keeps printing rows until it is stopped (e.g. with qdel).
while 1
    for j = 1:10
        a = sin(10*j);
        b = a*cos(10*j);
        c = a + b;
        d = a - b;
        fprintf('%+6.5f %+6.5f %+6.5f %+6.5f \n',a,b,c,d);
    end
end
% Never reached, because the loop above does not terminate.
fprintf('=======================================\n');
my_table_script.sh ⇒ the script that runs the MATLAB program:
#!/bin/tcsh
#$ -N this_is_name
#$ -S /bin/tcsh
#$ -cwd
#$ -e /tmp/a.err
#$ -o /tmp/a.out
module load freesurfer
setenv KEY_MODELLER9v8 a
/usr/local.cc/bin/matlab -nodisplay -nojvm -nosplash -nodesktop -r "myTable; exit"