Distributing serial and parallel tasks on single-node and cluster servers

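On a single-node server, the command file test.sh can be executed line by line. test.sh contains 5,000 lines of commands: every 250 lines are chained serially into one chunk, and 20 such chunks are launched in parallel, which is roughly equivalent to 20 threads (strictly speaking, 20 background processes). The script, parallel.sh, is as follows: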

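#!/bin/bash
# Run test.sh line by line: 20 background chunks of 250 lines each
START=1
PER_TASK=250
for (( count=1; count<=20; count++ )); do
    END=$(( PER_TASK * count ))
    # Lines START..END run one after another inside this chunk
    for (( i=START; i<=END; i++ )); do
        # eval keeps pipes, redirections and quoting in the line intact
        eval "$(sed -n "${i}p" test.sh)"
    done &    # the trailing & sends the whole chunk to the background
    START=$(( START + PER_TASK ))
done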

nohup bash parallel.sh &

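Note that the job is reported as done almost immediately. parallel.sh only launches the 20 background loops and then exits without waiting for them, so the commands themselves are still running in the background. Be careful not to submit it again!!!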

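The & sits after the done of the second (inner) for loop and inside the first (outer) one: the lines within one inner loop run serially, while the 20 inner loops started by the outer loop run in parallel.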
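If you would rather have the script stay alive until every chunk has finished (so that "done" really means done), one option is to call wait after the outer loop. Below is a minimal sketch under that assumption. The function name run_chunks, the temporary demo directory and the six demo commands are all made up for illustration, they are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: run the lines of a command file in parallel chunks
# and block until every chunk has finished.
run_chunks() {
    local cmd_file=$1 per_task=$2 n_chunks=$3
    local count start end i
    for (( count = 1; count <= n_chunks; count++ )); do
        start=$(( (count - 1) * per_task + 1 ))
        end=$(( count * per_task ))
        for (( i = start; i <= end; i++ )); do
            # Lines inside one chunk run serially
            eval "$(sed -n "${i}p" "$cmd_file")"
        done &    # each chunk runs in the background
    done
    wait          # returns only after all background chunks have exited
}

# Small demo with a throwaway command file: 6 commands, 3 chunks of 2
demo_dir=$(mktemp -d)
for n in 1 2 3 4 5 6; do
    echo "echo line$n >> '$demo_dir/out.txt'"
done > "$demo_dir/cmds.sh"

run_chunks "$demo_dir/cmds.sh" 2 3
finished=$(grep -c . "$demo_dir/out.txt")
echo "finished commands: $finished"
```

For the real case the call would be run_chunks test.sh 250 20. Started with nohup ... &, the job then only shows as done once all 5,000 commands have completed.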

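On a cluster, for example one whose job resources are managed by Slurm, a job can be run across multiple nodes with multiple threads. The script is as follows: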

References:
https://crc.ku.edu/hpc/slurm/how-to/arrays#example
https://help.rc.ufl.edu/doc/SLURM_Job_Arrays

#!/bin/bash
#SBATCH -J codeml_array
#SBATCH -p a05208har3
#SBATCH --array=1-5
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --output=Array.%A_%a.log

# bash codeml.sh #Job1

pwd; hostname; date

#Set the number of runs that each SLURM task should do
PER_TASK=100

# Calculate the starting and ending values for this task based
# on the SLURM task and the number of runs per task.
START_NUM=$(( ($SLURM_ARRAY_TASK_ID - 1) * $PER_TASK + 1 ))
END_NUM=$(( $SLURM_ARRAY_TASK_ID * $PER_TASK ))

# Print the task and run range
echo This is task $SLURM_ARRAY_TASK_ID, which will do runs $START_NUM to $END_NUM

# Run the loop of runs for this task.
for (( run=$START_NUM; run<=END_NUM; run++ )); do
  echo This is SLURM task $SLURM_ARRAY_TASK_ID, run number $run
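  # Do your stuff here: execute line $run of codeml.sh
  # (eval keeps pipes, redirections and quoting in that line intact)
  eval "$(sed -n "${run}p" codeml.sh)"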
done

date
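
The START_NUM/END_NUM arithmetic above covers exactly array size × PER_TASK lines (5 × 100 = 500 here). If codeml.sh has a different number of lines, the array range has to be adjusted and the last task should stop at the end of the file. A small sketch of that calculation follows. The function name task_range and the value TOTAL=450 are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: print "START END" for one array task, clamping END
# to the number of lines that actually exist in the command file.
task_range() {
    local task_id=$1 per_task=$2 total=$3
    local start=$(( (task_id - 1) * per_task + 1 ))
    local end=$(( task_id * per_task ))
    if (( end > total )); then
        end=$total
    fi
    echo "$start $end"
}

TOTAL=450      # assumed line count; in practice e.g. $(wc -l < codeml.sh)
PER_TASK=100

# Ceiling division: number of array tasks needed to cover TOTAL lines
N_TASKS=$(( (TOTAL + PER_TASK - 1) / PER_TASK ))
echo "use --array=1-$N_TASKS"

for (( id = 1; id <= N_TASKS; id++ )); do
    echo "task $id: lines $(task_range "$id" "$PER_TASK" "$TOTAL")"
done
```

With 450 lines and 100 lines per task this suggests --array=1-5, and task 5 is given lines 401 to 450 instead of 401 to 500.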
