Policy to allocate resources
General policy:
1> System administrators have top priority to run jobs to maintain the system.
2> Every cluster user has an equal opportunity to allocate the resources.
3> If you have a very long job to run, please use the "long" queue; otherwise, please use the "normal" queue.
4> We STRONGLY suggest that you submit massive jobs during the evening or on Friday.
5> If you are compiling or quickly testing your program, please allocate one computing node and then ssh to that node as described in "how to run interactive jobs" (after you set up your working environment). Once you land on that computing node, you can run and test your program as on a regular Linux machine.
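As a rough sketch of rule 3, the snippet below picks a queue from the expected run time and prints the matching bsub command. The queue names "long" and "normal" come from this page; the 24-hour cutoff, the pick_queue helper and my_job.sh are hypothetical, since this page does not define how long "very long" is.

```shell
# ASSUMED cutoff in hours; this page does not define "very long".
LONG_JOB_HOURS=24

# Print the queue name for an expected run time given in whole hours.
pick_queue() {
    if [ "$1" -gt "$LONG_JOB_HOURS" ]; then
        echo long
    else
        echo normal
    fi
}

# Print (not run) the submit command for a 36-hour job.
# my_job.sh is a placeholder for your own job script.
echo "bsub -q $(pick_queue 36) ./my_job.sh"
```

The command is only printed, so you can check it before submitting it yourself.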
Job priority to allocate resources:
By default, LSF calculates the dynamic priority based on the following information about each user:
1> Number of shares assigned to the user (pre-assigned priority for user)
2> Resources used by jobs belonging to the user:
a. Number of job slots reserved and in use.
b. Run time of running jobs.
c. Cumulative actual (not normalized) CPU time.
d. Historical run time of finished jobs.
e. Committed run time, specified at job submission with the -W option of bsub, or in the queue with the RUN_LIMIT parameter.
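To give a feel for how these inputs interact, here is a minimal sketch of a simplified fairshare calculation: the number of shares divided by a weighted sum of resource usage. It is not the exact formula used on this cluster. The three weights are assumed example values (the real ones are set by the administrators in the LSF configuration), times are taken in hours, and the historical and committed run time terms are left out for brevity.

```shell
# Simplified dynamic-priority estimate. Arguments:
#   $1 shares, $2 CPU time (hours), $3 run time (hours), $4 job slots in use
# The three weights are ASSUMED example values, not the settings of this cluster.
dynamic_priority() {
    LC_ALL=C awk -v shares="$1" -v cpu="$2" -v run="$3" -v slots="$4" '
        BEGIN {
            cpu_factor = 0.7   # assumed weight for CPU time
            run_factor = 0.7   # assumed weight for run time
            job_factor = 3.0   # assumed weight for job slots
            usage = cpu * cpu_factor + run * run_factor + (1 + slots) * job_factor
            printf "%.4f\n", shares / usage
        }'
}

dynamic_priority 10 0 0 0   # user with no running jobs
dynamic_priority 10 5 5 4   # user with 4 slots busy for 5 hours
```

With equal shares, the user who has consumed more resources gets the lower value, so that user's pending jobs are dispatched later.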
Tips and Suggestions:
1> If you know how long your job will be running, specify a run time limit for the job with the "-W" option of bsub (the "-c" option limits CPU time, not wall-clock run time).
2> Please run small jobs before you run large jobs. Before allocating a large amount of resources, always test your software/program first!
3> If you have emergency jobs that need to be run, please contact us and we can make a temporary adjustment.
4> Avoid reading from or writing to disk if you can load all your data into memory. Avoid reading or writing through NFS during your computation if you can load/dump all your data on local disk (the /tmp directory of each computing node).
5> PLEASE DO NOT zip/unzip or cp/scp across NFS. For example, running "gzip" on computing node n10 to compress a large file located in your home directory could place a heavy load on the NFS file systems, causing your own jobs and those of others to run slowly.
6> WARNING: DO NOT bypass the queue system and ssh directly to a computing node at any time. Your job will be terminated immediately and your account will be suspended. Thanks!
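Tip 4 can be sketched as a small helper for use inside a job script: it loads the input onto the node's local disk once, runs the command against the local copy, and dumps the result once at the end. The function name run_on_local_disk and the SCRATCH_BASE variable are hypothetical; /tmp is the local directory named above.

```shell
# Run a command against a local copy of an input file and write the
# command's standard output to the output path in a single final copy.
# Usage: run_on_local_disk INPUT OUTPUT COMMAND [ARGS...]
run_on_local_disk() {
    input=$1
    output=$2
    shift 2
    # SCRATCH_BASE is an assumption; it defaults to the node-local /tmp.
    workdir=$(mktemp -d "${SCRATCH_BASE:-/tmp}/job.XXXXXX") || return 1
    cp "$input" "$workdir/input" &&
        "$@" < "$workdir/input" > "$workdir/output" &&
        cp "$workdir/output" "$output"
    status=$?
    # Always clean up local scratch so /tmp does not fill up.
    rm -rf "$workdir"
    return "$status"
}

# Example (inside a job script submitted through the queue):
#   run_on_local_disk "$HOME/data.txt" "$HOME/sorted.txt" sort
```

The command reads from standard input and writes to standard output, so all intermediate I/O stays on the local disk of the computing node.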