Overview of commands
====================

Here is an overview of the most common usage of the PTL commands. There are
many more options to control the commands; see the --help option of each
command for details.

.. _pbs_benchpress:

How to use pbs_benchpress
-------------------------

pbs_benchpress is PTL's test harness; it is used to drive testing, logging,
and reporting of test suites and test cases.

To list information about a test suite::

    pbs_benchpress -t <TestSuiteName> -i

To check a test file for compilation errors::

    python -m py_compile /path/to/your/test/file.py

Before running any test, export the following two paths::

    export PYTHONPATH=</path/to/install/location>/lib/python<python version>/site-packages
    export PATH=</path/to/install/location>/bin

To run a test suite and/or a test case:

1. To run the entire test suite::

       pbs_benchpress -t <TestSuiteName>

   where `TestSuiteName` is the name of the class in the .py file you
   created.

2. To run a test case that is part of a test suite::

       pbs_benchpress -t <TestSuiteName>.<test_case_name>

   where `TestSuiteName` is as described above and `test_case_name` is the
   name of the test method in the class.

3. You can run tests under various logging levels using the -l option::

       pbs_benchpress -t <TestSuiteName> -l DEBUG

   For the available logging levels, see :ref:`log_levels`.

4. To run all tests that inherit from a parent test suite class, run the
   parent test suite passing the `--follow-child` param to pbs_benchpress::

       pbs_benchpress -t <TestSuite> --follow-child

5. To exclude specific test suites, use the --exclude option::

       pbs_benchpress -t <TestSuite> --follow-child --exclude=<SomeTest>

6. To run a test by the name of its test file: for example, if a test
   class is defined in a file named pbs_XYZ.py, you can run it using::

       pbs_benchpress -f ./path/to/pbs_XYZ.py

7. To pass custom parameters to a test suite (see the sketch after this
   list)::

       pbs_benchpress -t <TestSuite> -p "<key1>=<val1>,<key2>=<val2>,..."

   Alternatively, you can pass --param-file pointing to a file where
   parameters are specified. The contents of the file should be one
   parameter per line::

       pbs_benchpress -t <TestSuite> --param-file=</path/to/file>

   For example, if the file is named "param_file", its contents should be
   as below::

       key1=val1
       key2=val2
       .
       .

   Once params are specified, a class variable called param is set in the
   test, which can then be parsed out to be used in the test. When
   inheriting from PBSTestSuite, the key=val pairs are parsed out and made
   available in the class variable ``conf``, so the test can retrieve the
   information using::

       if key1 in self.conf:
           ...

8. To check that the available Python version is above a minimum::

       pbs_benchpress --min-pyver=<version>

9. To check that the available Python version is less than a maximum::

       pbs_benchpress --max-pyver=<version>

10. On Linux, you can generate PBS coverage data using PTL. To collect
    coverage data using LCOV/LTP, first ensure that PBS was compiled using
    --set-cflags="--coverage" and make sure that you have the lcov utility
    installed. The lcov utility can be obtained at
    http://ltp.sourceforge.net/coverage/lcov.php

    Then, to collect PBS coverage data, run pbs_benchpress as follows::

        pbs_benchpress -t <TestName> --lcov-data=</path/to/gcov/build/dir>

    By default the output data is written to TMPDIR/pbscov-YYYYMMDD_HHMMSS;
    this can be controlled using the --lcov-out option. By default the lcov
    binary is expected to be available in the environment; if it isn't, you
    can set its path using the --lcov-bin option.

11. For tests that inherit from PBSTestSuite, to collect process
    information::

        pbs_benchpress -t <TestSuite> -p "procmon=<proc name>[:<proc name>],procmon-freq=<seconds>"

    where `proc name` is a process name such as pbs_server, pbs_sched, or
    pbs_mom. RSS, VSZ, and PCPU info will be collected for each
    colon-separated name.

12. Running PTL on a multinode cluster has two basic requirements:

    A. PTL is installed on all the nodes.
    B. Passwordless ssh is set up between all the nodes.

    Suppose we have a multinode cluster of three nodes (M1-type1, M2-type2,
    M3-type3). We can invoke pbs_benchpress as below::

        pbs_benchpress -t <TestSuite> -p "servers=M1,moms=M1:M2:M3"
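
As an illustration of passing custom parameters (item 7 above), a test
inheriting from PBSTestSuite might read a parameter as follows (a minimal
sketch; the suite name, parameter key, and import path are illustrative
assumptions, not part of this document)::

    from ptl.utils.pbs_testsuite import PBSTestSuite

    class TestCustomParams(PBSTestSuite):
        """
        Hypothetical suite, run as:
        pbs_benchpress -t TestCustomParams -p "key1=val1"
        """
        def test_custom_param(self):
            # self.conf holds the key=val pairs parsed by PBSTestSuite;
            # "key1" is only present when passed via -p or --param-file
            if 'key1' in self.conf:
                self.logger.info('key1 = %s' % self.conf['key1'])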

.. _log_levels:

Logging levels
~~~~~~~~~~~~~~

PTL uses the generic unittest log levels INFO, WARNING, DEBUG, ERROR, and
FATAL, plus three custom log levels: INFOCLI, INFOCLI2, and DEBUG2.

INFOCLI is used to log command-line calls, such that the output of a test
run can be read by anyone familiar with the PBS commands.

INFOCLI2 is used to log a wider set of commands run through PTL.

DEBUG2 is a verbose debugging level. It logs commands, including their
return code, stdout, and stderr.

.. _pbs_loganalyzer:

How to use pbs_loganalyzer
--------------------------

To analyze scheduler logs::

    pbs_loganalyzer -l </path/to/schedlog>

To display only a summary of scheduling cycles::

    pbs_loganalyzer -l </path/to/schedlog> -c

To analyze server logs::

    pbs_loganalyzer -s </path/to/serverlog>

To analyze MoM logs::

    pbs_loganalyzer -m </path/to/momlog>

To analyze accounting logs::

    pbs_loganalyzer -a </path/to/accountinglog>

To specify a begin and/or end time::

    pbs_loganalyzer -b "02/20/2013 21:00:00" -e "02/20/2013 22:00:00" <rest>

Note that for accounting logs, the file is 'cat' using the sudo command, so
the tool can be run as a regular user with sudo privilege.

To compute cpu/hour utilization against a given snapshot of nodes::

    pbs_loganalyzer -U --nodes-file=/path/to/pbsnodes-av-file
                    --jobs-file=/path/to/qstat-f-file
                    -a /path/acct

A progress bar can be displayed by issuing::

    pbs_loganalyzer --show-progress ...

To analyze the scheduler's estimated start times::

    pbs_loganalyzer --estimated-info -l <path/to/sched/log>

To analyze per-job scheduler performance metrics: time to run, time to
discard, time in scheduler (solver time as opposed to I/O with the server),
and time to calendar::

    pbs_loganalyzer -l </path/to/schedlog> -S

In addition to a scheduler log, a server log is required to compute the
time-in-scheduler metric; this is because the time in sched is measured as
the difference between a scheduler log's "Considering job to run" message
and the corresponding server log's "Job Run" message.
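
Accordingly, both logs can be passed in one invocation (combining the -l
and -s options shown above)::

    pbs_loganalyzer -l </path/to/schedlog> -s </path/to/serverlog> -S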

To output analysis to a SQLite file::

    pbs_loganalyzer --db-name=<name or path of database> --db-type=sqlite

Note that the sqlite3 module is needed to write out to the DB file.

To output to a PostgreSQL database::

    pbs_loganalyzer --db-access=</path/to/pgsql/cred/file>
                    --db-name=<name or path of database>
                    --db-type=psql

Note that the psycopg2 module is needed to write out to the PostgreSQL
database. The cred file should specify the following::

    user=<db username> password=<user's password> dbname=<databasename> port=<val>

To analyze the time (i.e., log record time) between occurrences of a
regular expression in any log file::

    pbs_loganalyzer --re-interval=<regex expression>

This can be used, for example, to measure the interval between occurrences
of E records in an accounting log::

    pbs_loganalyzer -a <path/to/accountlog> --re-interval=";E;"

A useful extension of the occurrence interval is to compute the number of
regular expression matches over a given period of time::

    pbs_loganalyzer --re-interval=<regex> --re-frequency=<seconds>

For example, to count how many E records are emitted over a 60-second
window::

    pbs_loganalyzer -a <acctlog> --re-interval=";E;" --re-frequency=60

When using --re-interval, the -f option can be used to point to an
arbitrary log file instead of depending on -a, -l, -s, or -m; however, all
of these log-specific options will still work.

A note about the regular expression used: every Python named group, i.e.,
an expression of the form (?P<name>...), will be reported as a dictionary
of items mapped to each named group.
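
For instance, a named group can be used to report the job identifier from
each E record (a hypothetical pattern; adjust it to the actual layout of
the records being matched)::

    pbs_loganalyzer -a <path/to/accountlog> --re-interval=";E;(?P<jobid>[^;]+)"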

.. _pbs_stat:

How to use pbs_stat
-------------------

pbs_stat is a useful tool to display filtered information from querying PBS
objects. The supported objects are nodes, jobs, resvs, server, and queues.
The supported operators for filtering attributes or resources are >, <, >=,
<=, and ~, the latter being a regular expression match on the value
associated with an attribute or resource.

In the examples below, one can replace the object type with any of the
alternative ones, with the appropriate changes in attribute or resource
names. Each command can be run with a -t <hostname> option to specify a
desired target hostname; by default (no -t), the localhost is queried.

To list a summary of all job equivalence classes on Resource_List.select::

    pbs_stat -j -a "Resource_List.select"

To list a summary of all node equivalence classes::

    pbs_stat -n

Note that node equivalence classes are collected by default on
resources_available.ncpus, resources_available.mem, and state. To specify
the attributes on which to create the equivalence classes, use -a/-r.
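
For instance, to build node equivalence classes on memory alone (an
illustrative attribute choice)::

    pbs_stat -n -a "resources_available.mem"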

To list all nodes that have more than 2 cpus::

    pbs_stat --nodes -a "resources_available.ncpus>2"

or, equivalently (for resources)::

    pbs_stat --nodes -r "ncpus>2"

To list all jobs that request more than 2 cpus and are in state 'R'::

    pbs_stat --jobs -a "Resource_List.ncpus>2&&job_state='R'"

To filter all nodes that have a host value that starts with n and ends with
a, i.e., "n.*a"::

    pbs_stat --nodes -r "host~n.*a"

To display information in a qselect-like format, pass the -s option to each
command; with -s, the selected attributes are displayed first, followed by
a list of names that match the selection criteria.

To display data with one entity per line, use the --sline option::

    pbs_stat --nodes --sline

To show what is available now in the complex (a.k.a. the backfill hole)::

    pbs_stat -b

By default the backfill hole is computed based on ncpus, mem, and state;
you can specify the attributes to compute it on by passing a
comma-separated list of attributes to the -a option. An alternative way to
compute the backfill hole is to use pbs_sim -b.

To show utilization of the system::

    pbs_stat -U [-r "<resource1,resource2,...>"]

The resources default to ncpus, memory, and nodes.

To show utilization of a specific user::

    pbs_stat -U --user=<name>

To show utilization of a specific group::

    pbs_stat -U --group=<name>

To show utilization of a specific project::

    pbs_stat -U --project=<name>

To count the grand total of the queried resource's values in the complex::

    pbs_stat -r <resource, e.g. ncpus> -C --nodes

Note that nodes that are not up are not counted.

To count the number of resources having the same values in the complex for
the queried resource::

    pbs_stat -r <resource, e.g. ncpus> -c --nodes

To show an evaluation of the formula for all non-running jobs::

    pbs_stat --eval-formula

To show the fairshare tree and fairshare usage::

    pbs_stat --fairshare

To read information from a file, use, for example::

    pbs_stat -f /path/to/pbsnodes/or/qstat_f/output --nodes -r ncpus

To list all resources currently set on a given object type::

    pbs_stat --nodes --resources-set

To list all resources defined in resourcedef::

    pbs_stat --resources

To list a specific resource by name from resourcedef (if it exists)::

    pbs_stat --resource=<custom_resource>

To show limits associated with all entities::

    pbs_stat --limits-info

To show limits associated with a specific user::

    pbs_stat --limits-info --user=<name>

To show limits associated with a specific group::

    pbs_stat --limits-info --group=<name>

To show limits associated with a specific project::

    pbs_stat --limits-info --project=<name>

To show entities that are over their soft limits::

    pbs_stat --over-soft-limits

The output of limits information shows the named entities associated with
each container (server or queue) to which a limit is applied. The entity's
usage as well as the limit set are displayed, along with a remainder usage
value that indicates whether an entity is over a limit (represented by a
negative value) or under a limit (represented by a positive or zero value).
In the case of a PBS_ALL or PBS_GENERIC limit setting, each entity is
displayed using the entity's name followed by "/PBS_ALL" or "/PBS_GENERIC",
as the case may be.

Here are a few examples. If a server soft limit is set to 0 for user user1
on the server object::

    qmgr -c "set server max_run_soft=[u:user1=0]"

then pbs_stat --limits-info will show::

    u:user1
        container = server:minita.pbspro.com
        limit_type = max_run_soft
        remainder = -1
        usage/limit = 1/0

If a server soft limit is set to 0 on generic users::

    qmgr -c "set server max_run_soft=[u:PBS_GENERIC=0]"

then pbs_stat --limits-info will show::

    u:user1/PBS_GENERIC
        container = server:minita.pbspro.com
        limit_type = max_run_soft
        remainder = -1
        usage/limit = 1/0

To print a site report that summarizes some key metrics from a site::

    pbs_stat --report

Optionally, pass the path to a pbs_diag via the -d option to summarize that
site's information.

To show the number of privileged ports in use::

    pbs_stat --pports

To show information directly from the database (requires the psycopg2
module)::

    pbs_stat --db-access=<path/to/dbaccess_file> --db-type=psql
             --<objtype> [-a <attribs>]

where the dbaccess file is of the form::

    user=<value>
    password=<value>
    # and optionally
    [port=<value>]
    [dbname=<value>]

.. _pbs_config:

How to use pbs_config
---------------------

pbs_config is useful in the following cases:

.. option:: --revert-config

   Revert the configuration of PBS entities, specified as one or more of
   --scheduler, --server, --mom, to the default configuration. Note that
   for the server, non-default queues and hooks are not deleted but
   disabled instead.

.. option:: --save-config

   Save the configuration of a PBS entity, one of --scheduler, --server,
   --mom, to file. The server saves the resourcedef, a qmgr print server,
   qmgr print sched, and qmgr print hook. The scheduler saves sched_config,
   resource_group, dedicated_time, and holidays. The MoM saves the config
   file.

.. option:: --load-config

   Load configuration from file. The changes will be applied to all PBS
   entities as saved in the file.

.. option:: --vnodify

   Create a vnode definition and insert it into a given MoM. There are many
   options to this command; see the help page for details.

.. option:: --switch-version

   Switch to a version of PBS installed on the system. This only supports
   modifying the PBS installed on a system that matches PBS_CONF_FILE.

.. option:: --check-ug

   Check whether the users and groups required for automated testing are
   defined as expected on the system.

.. option:: --make-ug

   Make the users and groups required for automated testing. This creates
   user home directories with 755 permissions. If a test user does not use
   this command for user creation, they have to make sure that the home
   directories have 755 permissions.

To set up, start, and add (to the server) multiple MoMs::

    pbs_config --multi-mom=<num> -a <attributes> --serverhost=<host>

The multi-mom option creates <num> pbs.conf files, prefixed by default with
pbs.conf_m followed by an incrementing number; each configuration file has
a unique PBS_HOME directory, defined by default to be PBS_m followed by the
same incrementing number as the configuration file. The configuration
prefix can be changed by passing the --conf-prefix option, and the PBS_HOME
prefix can be changed via --home-prefix.
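
For instance, a hypothetical invocation that creates three MoMs with custom
prefixes might look like::

    pbs_config --multi-mom=3 --serverhost=<host> --conf-prefix=pbs.conf_test --home-prefix=PBS_test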

To make the PBS daemons mimic the snapshot of a pbs_diag::

    pbs_config --as-diag=<path/to/diag>

This will set all server and queue attributes from the diag; copy
sched_config, resource_group, holidays, resourcedef, and all site hooks;
and create and insert a vnode definition that translates all of the nodes
reported by pbsnodes -av. There may be some specific attributes to adjust,
such as pbs_license_info, or users or groups, that may prevent submission
of jobs.

.. _pbs_py_spawn:

How to use pbs_py_spawn
-----------------------

The pbs_py_spawn wrapper can only be used when the pbs_ifl.h API is
SWIG-wrapped. The tool can be used to invoke a pbs_py_spawn action
associated with a job running on a MoM.

To call a Python script during the runtime of a job::

    pbs_py_spawn -j <jobid> </path/to/python/script/on/MoM>

To call a Python script that will detach from the job's session::

    pbs_py_spawn --detach -j <jobid> </path/to/python/script/on/MoM>

Detached scripts essentially background themselves and are attached back to
the job monitoring through pbs_attach, such that they are terminated when
the job terminates. The detached script must write out its PID as its first
output.
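
A minimal sketch of such a detached script is shown below (illustrative
only; the documented contract is just that the script backgrounds itself
and writes the detached PID as its first output)::

    #!/usr/bin/env python3
    # Hypothetical detached script for pbs_py_spawn --detach.
    import os
    import sys
    import time

    pid = os.fork()
    if pid > 0:
        # Parent: the first output must be the PID of the detached
        # process, which is used to monitor and terminate it.
        print(pid)
        sys.stdout.flush()
        sys.exit(0)

    # Child: detach from the session and carry on in the background.
    os.setsid()
    while True:
        time.sleep(60)  # placeholder for the script's real work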