.\" Copyright (C) 1994-2018 Altair Engineering, Inc.
.\" For more information, contact Altair at www.altair.com.
.\"
.\" This file is part of the PBS Professional ("PBS Pro") software.
.\"
.\" Open Source License Information:
.\"
.\" PBS Pro is free software. You can redistribute it and/or modify it under the
.\" terms of the GNU Affero General Public License as published by the Free
.\" Software Foundation, either version 3 of the License, or (at your option) any
.\" later version.
.\"
.\" PBS Pro is distributed in the hope that it will be useful, but WITHOUT ANY
.\" WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
.\" FOR A PARTICULAR PURPOSE.
.\" See the GNU Affero General Public License for more details.
.\"
.\" You should have received a copy of the GNU Affero General Public License
.\" along with this program. If not, see .
.\"
.\" Commercial License Information:
.\"
.\" For a copy of the commercial license terms and conditions,
.\" go to: (http://www.pbspro.com/UserArea/agreement.html)
.\" or contact the Altair Legal Department.
.\"
.\" Altair’s dual-license business model allows companies, individuals, and
.\" organizations to create proprietary derivative works of PBS Pro and
.\" distribute them - whether embedded or bundled with other software -
.\" under a commercial license agreement.
.\"
.\" Use of Altair’s trademarks, including but not limited to "PBS™",
.\" "PBS Professional®", and "PBS Pro™" and Altair’s logos is subject to Altair's
.\" trademark licensing policies.
.\"
.TH pbs_mom 8B "8 November 2017" Local "PBS Professional"
.SH NAME
.B pbs_mom
- run the PBS job monitoring and execution daemon
.SH SYNOPSIS
.B pbs_mom
[-a ] [-C ]
.RS 8
[-c ] [-d ] [-L ]
.br
[-M ] [-N] [-n ] [-p|-r]
.br
[-R ] [-S ]
.br
[-s script_options]
.RE
.B pbs_mom
--version
.SH DESCRIPTION
The
.B pbs_mom
command starts the PBS job monitoring and execution daemon, called MoM.
The standard MoM starts jobs on the execution host, monitors and reports
resource usage, enforces resource usage limits, and notifies the server
when the job is finished. The MoM also runs any prologue scripts before
the job runs, and runs any epilogue scripts after the job runs.
The MoM performs any communication with job tasks and with other MoMs.
The MoM on the first vnode on which a job is running manages communication
with the MoMs on the remaining vnodes on which the job runs.
The MoM manages one or more vnodes. PBS may treat a host as a set of
virtual nodes, in which case one MoM manages all of the host's vnodes.
See the
.B PBS Professional Administrator's Guide.
.LP
.B Logging
.br
The MoM's log file is in PBS_HOME/mom_logs. The MoM writes an error
message in its log file when it encounters any error. If it cannot write
to its log file, it writes to standard error. The MoM writes events to its
log file. The MoM writes its PBS version and build information to the
logfile whenever it starts up or the logfile is rolled to a new file.
.LP
.B Required Permission
.br
The executable for
.B pbs_mom
is in PBS_EXEC/sbin, and can be run only by root on Linux, and Admin on Windows.
.LP
.B Cpusets
.br
A cpusetted machine can have a "boot cpuset" defined by the administrator.
A boot cpuset contains one or more CPUs and memory boards and is used to
restrict the default placement of system processes, including login.
If defined, the boot cpuset contains CPU 0.
Run parallel jobs exclusively within a cpuset for repeatability of performance.
HPE SGI states, "Using cpusets on an HPE SGI system improves cache locality
and memory access times and can substantially improve an application's
performance and runtime repeatability."
The CPUSET_CPU_EXCLUSIVE flag prevents CPU 0 from being used by the MoM
in the creation of job cpusets. This flag is set by default, so this is
the default behavior.
To find out which cpuset is assigned to a running job, use
.B qstat -f
to see the
.I cpuset
field in the job's
.I altid
attribute.
.LP
.B HPE SGI Machine Running Supported Versions of Performance Software - Message Passing Interface
.br
The cpusets created for jobs are marked cpu-exclusive.
MoM does not use any CPU which was in use at startup.
A PBS job can run across multiple machines that run supported versions of
Performance Software - Message Passing Interface.
PBS can run using HPE SGI's MPI (MPT) over InfiniBand. See the
.B PBS Professional Administrator's Guide.
.LP
.B Effect on Jobs of Starting MoM
.br
When MoM is started or restarted, her default behavior is to leave any
running processes running, but to tell the PBS server to requeue the jobs
she manages. MoM tracks the process ID of jobs across restarts.
In order to have all jobs killed and requeued, use the
.I -r
option when starting or restarting MoM.
In order to leave any running processes running, and not to requeue any
jobs, use the
.I -p
option when starting or restarting MoM.
.SH OPTIONS
.IP "-a " 10
Number of seconds before alarm timeout. Whenever a resource request is
processed, an alarm is set for the given amount of time.
If the request has not completed before
.I alarm timeout,
the OS generates an alarm signal and sends it to MoM.
Default: 10 seconds. Format: integer.
.IP "-C " 10
Specifies the path of the directory where MoM creates job-specific
subdirectories used to hold each job's restart files. MoM passes this path
to checkpoint and restart scripts. Overrides other checkpoint path
specification methods. Any directory specified with the
.I -C
option must be owned, readable, writable, and executable by root only
.I (rwx,---,---, or 0700),
to protect the security of the checkpoint files. See the
.I -d
option. Format: string.
.br
Default: PBS_HOME/spool/checkpoint.
.IP "-c " 10
MoM will read this alternate default configuration file upon starting.
If this is a relative file name it will be relative to PBS_HOME/mom_priv.
If the specified file cannot be opened,
.B pbs_mom
will abort. See the
.I -d
option.
MoM's normal operation, when the -c option is not given, is to attempt to
open the default configuration file PBS_HOME/mom_priv/config.
If this file is not present,
.B pbs_mom
will log the fact and continue.
.IP "-d " 10
Specifies the path of the
.I directory
to be used in place of PBS_HOME by
.B pbs_mom.
The default directory is given by $PBS_HOME. Format: string.
.IP "-L " 10
Specifies an absolute path and filename for the log file.
The default is a file named for the current date in PBS_HOME/mom_logs/.
See the
.I -d
option. Format: string.
.IP "-M " 10
Specifies the number of the port on which MoM will listen for server
requests and instructions. Overrides PBS_MOM_SERVICE_PORT setting in
pbs.conf and environment variable. Default: 15002.
Format: integer port number.
.IP "-n " 10
Specifies the priority for the
.B pbs_mom
daemon. Format: integer.
.IP "-N" 10
Specifies that when starting, MoM should not detach from the current session.
.IP "-p" 10
Specifies that when starting, MoM should allow any running jobs to continue
running, and not have them requeued. This option can be used for
single-host jobs only; multi-host jobs cannot be preserved.
Cannot be used with the
.I -r
option. MoM is not the parent of these jobs.
.RS
.IP "HPE SGI systems running Performance Software - Message Passing Interface" 5
The cpuset-enabled
.B pbs_mom
will, if given the
.I -p
flag, use the existing CPU and memory allocations for the /PBSPro cpusets.
The default behavior is to remove these cpusets. Should this fail, MoM will
exit, asking to be restarted with the
.I -p
flag.
.LP
.RE
.IP "-r" 10
Specifies that when starting, MoM should requeue any rerunnable jobs and
kill any non-rerunnable jobs that she was tracking, and mark the jobs as
terminated. Cannot be used with the
.I -p
option. MoM is not the parent of these jobs.
It is not recommended to use the
.I -r
option after a reboot, because process IDs of new, legitimate tasks may
match those MoM was previously tracking. If they match and MoM is started
with the
.I -r
option, MoM will kill the new tasks.
.IP "-R " 10
Specifies the number of the port on which MoM will listen for pings,
resource information requests, communication from other MoMs, etc.
Overrides PBS_MANAGER_SERVICE_PORT setting in pbs.conf and environment
variable. Default: 15003. Format: integer port number.
.IP "-S " 10
Specifies the port number on which
.B pbs_mom
initially contacts the server. Default: 15001. Format: integer port number.
.IP "-s