.\" Copyright (C) 1994-2018 Altair Engineering, Inc.
.\" For more information, contact Altair at www.altair.com.
.\"
.\" This file is part of the PBS Professional ("PBS Pro") software.
.\"
.\" Open Source License Information:
.\"
.\" PBS Pro is free software. You can redistribute it and/or modify it under the
.\" terms of the GNU Affero General Public License as published by the Free
.\" Software Foundation, either version 3 of the License, or (at your option) any
.\" later version.
.\"
.\" PBS Pro is distributed in the hope that it will be useful, but WITHOUT ANY
.\" WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
.\" FOR A PARTICULAR PURPOSE.
.\" See the GNU Affero General Public License for more details.
.\"
.\" You should have received a copy of the GNU Affero General Public License
.\" along with this program. If not, see <http://www.gnu.org/licenses/>.
.\"
.\" Commercial License Information:
.\"
.\" For a copy of the commercial license terms and conditions,
.\" go to: (http://www.pbspro.com/UserArea/agreement.html)
.\" or contact the Altair Legal Department.
.\"
.\" Altair’s dual-license business model allows companies, individuals, and
.\" organizations to create proprietary derivative works of PBS Pro and
.\" distribute them - whether embedded or bundled with other software -
.\" under a commercial license agreement.
.\"
.\" Use of Altair’s trademarks, including but not limited to "PBS™",
.\" "PBS Professional®", and "PBS Pro™" and Altair’s logos is subject to Altair's
.\" trademark licensing policies.
.\"
.TH pbs_snapshot 8B "26 April 2018" Local "PBS Professional"
.SH NAME
.B pbs_snapshot
\- Linux only. Capture PBS data to be used for diagnostics
.SH SYNOPSIS
.B pbs_snapshot
-o <output directory path> [-H <server host>]
.RS 12
[-l <log level>] [--accounting-logs=<number of days>]
[--additional-hosts=<hostname list>]
.br
[--daemon-logs=<number of days>] [--map=<file path>]
[--obfuscate]
.RE
.br
.B pbs_snapshot
--version
.SH DESCRIPTION
Use
.B pbs_snapshot
to capture PBS data for diagnostics. This tool is written in Python
and uses PTL libraries, including PBSSnapUtils, to extract the data.
You can optionally anonymize the PBS data.

.B Required Privilege
.br
To run
.B pbs_snapshot,
you must be root on Linux or Admin on Windows.
.SH OPTIONS
.IP "-H <server host>" 5
Specifies the server hostname. By default,
.B pbs_snapshot
uses the value of the
.I PBS_SERVER
parameter in pbs.conf. When you use this option,
.B pbs_snapshot
uses
.I server host
instead.
.IP "-l <log level>" 5
Specifies the level at which
.B pbs_snapshot
writes its log. The log file is
pbs_snapshot.log, in the output directory path specified using the
.I -o <output directory path>
option.
.br
Valid values, from most comprehensive to least: DEBUG2, DEBUG,
INFOCLI2, INFOCLI, INFO, WARNING, ERROR, FATAL
.br
Default: INFOCLI2
.IP "--accounting-logs=<number of days>" 5
Specifies the number of days of accounting logs to be collected; this
count includes the current day.
.br
Value of number of days must be >=0:
.br
If number of days is 0, no logs are captured.
.br
If number of days is 1, only the logs for the current day are captured.
.br
Default:
.B pbs_snapshot
collects 30 days of accounting logs.
.IP "--additional-hosts=<hostname list>" 5
Specifies that
.B pbs_snapshot
should gather data from the specified list of non-server hosts.
.B pbs_snapshot
always gathers data from the server host.
.br
The command collects the following information from the specified hosts:
.br
MoM and comm logs, for the number of days specified via the
.I --daemon-logs=<number of days>
option
.br
The PBS_HOME/mom_priv directory
.br
System information
.br
Format for
.I hostname list
is a comma-separated list of one or more hostnames:
.br
.I <hostname>[, <hostname> ...]
.br
This option can greatly increase the size of the snapshot, and can cause
.B pbs_snapshot
to take a long time copying what may be large amounts of data over the network.
.IP "--daemon-logs=<number of days>" 5
Specifies the number of days of daemon logs to be collected; this count
includes the current day. All daemon logs are captured on the server host,
and if you specify
.I --additional-hosts=<hostname list>,
MoM logs are captured on those hosts as well.
.br
Value of number of days must be >=0:
.br
If number of days is 0, no logs are captured.
.br
If number of days is 1, only the logs for the current day are captured.
.br
Default:
.B pbs_snapshot
collects 5 days of daemon logs.
.IP "--map=<file path>" 5
Specifies path for file containing obfuscation map, which is a
<key>:<value> pair-mapping of obfuscated data. Path can be absolute
or relative to current working directory.
.br
Default:
.B pbs_snapshot
writes its obfuscation map in a file called obfuscate.map in the
location specified via the
.I -o <output directory path>
option.
.br
Can only be used with the
.I --obfuscate
option.
.IP "--obfuscate" 5
Obfuscates (anonymizes) or deletes sensitive PBS data captured by
.B pbs_snapshot.
.br
Obfuscates the following data:
.RS 8
euser, egroup, project, Account_Name, operators, managers, group_list,
Mail_Users, User_List, server_host, acl_groups, acl_users,
acl_resv_groups, acl_resv_users, sched_host, acl_resv_hosts,
acl_hosts, Job_Owner, exec_host, Host, Mom, resources_available.host,
resources_available.vnode
.RE
.IP " " 5
Deletes the following data:
.RS 8
Variable_List, Error_Path, Output_Path, mail_from, Mail_Points,
Job_Name, jobdir, Submit_arguments, Shell_Path_List
.RE
.IP "--version" 5
The
.B pbs_snapshot
command returns its PBS version information and exits.
This option can only be used alone.
.SH Arguments to pbs_snapshot
.IP "-o <output directory path>" 5
Path to directory where
.B pbs_snapshot
writes its output tarball. Required. Path can be absolute or
relative to current working directory.
.br
For example, if you specify
.I -o /temp,
.B pbs_snapshot
writes "/temp/snapshot_<timestamp>.tgz".
.br
The output directory path must already exist.
.SH Output
.B Output Location
.br
You must use the
.I -o <output directory path>
option to specify the directory where
.B pbs_snapshot
writes its output. The path can be absolute or relative to current
working directory. The output directory must already exist. As an
example, if you specify "-o /temp",
.B pbs_snapshot
writes "/temp/snapshot_<timestamp>.tgz".

.B Output Contents
.br
The
.B pbs_snapshot
command writes its output as a tarball. The tarball contains the
following directory structure and files:
.nf
Directory  Directory
or File    Contents             Description
------------------------------------------------------------------------
server/
           qstat_B.out          Output of qstat -B
           qstat_Bf.out         Output of qstat -Bf
           qmgr_ps.out          Output of qmgr print server
           qstat_Q.out          Output of qstat -Q
           qstat_Qf.out         Output of qstat -Qf
           qmgr_pr.out          Output of qmgr print resource
server_priv/                    Copy of the PBS_HOME/server_priv
                                directory.
                                Core files are captured separately;
                                see \fIcore_file_bt/\fP.
           accounting/          Accounting logs from
                                PBS_HOME/server_priv/accounting/
                                directory for the number of days
                                specified via \fI--accounting-logs\fP
                                option
server_logs/                    Server logs from the
                                PBS_HOME/server_logs directory for the
                                number of days specified
                                via \fI--daemon-logs\fP option
job/
           qstat.out            Output of qstat
           qstat_f.out          Output of qstat -f
           qstat_t.out          Output of qstat -t
           qstat_tf.out         Output of qstat -tf
           qstat_x.out          Output of qstat -x
           qstat_xf.out         Output of qstat -xf
           qstat_ns.out         Output of qstat -ns
           qstat_fx_F_dsv.out   Output of qstat -fx -F dsv
           qstat_f_F_dsv.out    Output of qstat -f -F dsv
node/
           pbsnodes_va.out      Output of pbsnodes -va
           pbsnodes_a.out       Output of pbsnodes -a
           pbsnodes_avSj.out    Output of pbsnodes -avSj
           pbsnodes_aSj.out     Output of pbsnodes -aSj
           pbsnodes_avS.out     Output of pbsnodes -avS
           pbsnodes_aS.out      Output of pbsnodes -aS
           pbsnodes_aFdsv.out   Output of pbsnodes -aFdsv
           pbsnodes_avFdsv.out  Output of pbsnodes -avFdsv
           qmgr_pn_default.out  Output of qmgr print node @default
mom_priv/                       Copy of the PBS_HOME/mom_priv
                                directory.
                                Core files are captured separately;
                                see core_file_bt/.
mom_logs/                       MoM logs from the PBS_HOME/mom_logs
                                directory for the number of days
                                specified via \fI--daemon-logs\fP option
comm_logs/                      Comm logs from the PBS_HOME/comm_logs
                                directory for the number of days
                                specified via \fI--daemon-logs\fP option
sched_priv/                     Copy of the PBS_HOME/sched_priv
                                directory, with all files.
                                Core files are not captured;
                                see core_file_bt/.
sched_logs/                     Scheduler logs from the
                                PBS_HOME/sched_logs directory for
                                the number of days specified
                                via \fI--daemon-logs\fP option
sched_priv_<multisched name>/   Copy of the
                                PBS_HOME/sched_priv_<multisched_name>
                                directory, with all files.
                                Core files are not captured;
                                see core_file_bt/.
sched_logs_<multisched name>/   Scheduler logs from the
                                PBS_HOME/sched_logs_<multisched_name>
                                directory for the number
                                of days specified
                                via \fI--daemon-logs\fP option
reservation/
           pbs_rstat_f.out      Output of pbs_rstat -f
           pbs_rstat.out        Output of pbs_rstat
scheduler/
           qmgr_lsched.out      Output of qmgr list sched
hook/
           qmgr_ph_default.out  Output of qmgr print hook @default
           qmgr_lpbshook.out    Output of qmgr list pbshook
datastore/
           pg_log/              Copy of the
                                PBS_HOME/datastore/pg_log directory
                                for the number of days specified
                                via \fI--daemon-logs\fP option
core_file_bt/                   Stack backtrace from core files
           sched_priv/          Files containing the output of thread
                                apply all backtrace full on all core
                                files captured from PBS_HOME/sched_priv
           sched_priv_          Files containing the output of thread
           <multisched name>/   apply all backtrace full on all core
                                files captured from
                                PBS_HOME/sched_priv_<multisched_name>
           server_priv/         Files containing the output of thread
                                apply all backtrace full on all core
                                files captured from
                                PBS_HOME/server_priv
           mom_priv/            Files containing the output of thread
                                apply all backtrace full on all core
                                files captured from PBS_HOME/mom_priv
           misc/                Files containing the output of thread
                                apply all backtrace full on any other
                                core files found inside PBS_HOME
system/
           pbs_probe_v.out      Output of pbs_probe -v
           pbs_hostn_v.out      Output of pbs_hostn -v $(hostname)
           pbs_environment      Copy of PBS_HOME/pbs_environment file
           os_info              Information about the OS
           process_info         List of processes running on the system
                                when the snapshot was taken. Output of
                                ps -aux | grep [p]bs on Linux systems,
                                or tasklist /v on Windows systems
           ps_leaf.out          Output of ps -leaf. Linux only.
           lsof_pbs.out         Output of lsof | grep [p]bs.
                                Linux only.
           etc_hosts            Copy of /etc/hosts file. Linux only.
           etc_nsswitch_conf    Copy of /etc/nsswitch.conf file.
                                Linux only.
           vmstat.out           Output of the command vmstat.
                                Linux only.
           df_h.out             Output of the command df -h.
                                Linux only.
           dmesg.out            Output of the dmesg command.
                                Linux only.
pbs.conf                        Copy of the pbs.conf file on the
                                server host
ctime                           Contains the time in seconds since
                                epoch when the snapshot was taken
pbs_snapshot.log                Log messages written by \fBpbs_snapshot\fP
.fi
.SH Examples
.IP "pbs_snapshot -o /tmp" 5
Writes a snapshot to /tmp/snapshot_<timestamp>.tgz that includes 30
days of accounting logs and 5 days of daemon logs from the server
host.
.IP "pbs_snapshot --daemon-logs=1 --accounting-logs=1 -o /tmp --obfuscate --map=mapfile.txt" 5
Writes a snapshot to /tmp/snapshot_<timestamp>.tgz that includes 1
day of accounting and daemon logs. Obfuscates the data and stores the
data mapping in the map file named "mapfile.txt".
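.\" Illustrative sketches, not part of the original page. They combine only
.\" options documented above. The hostnames momhost1 and momhost2 are
.\" hypothetical placeholders; the host list is written without spaces so
.\" that the shell passes it as a single argument (an assumption about
.\" quoting, not a rule stated on this page).
.IP "pbs_snapshot -o /tmp --additional-hosts=momhost1,momhost2 --daemon-logs=2" 5
As a sketch of the
.I --additional-hosts
option, this invocation would be expected to write a snapshot to
/tmp/snapshot_<timestamp>.tgz that includes the default 30 days of
accounting logs and 2 days of daemon logs (the current day and the day
before). In addition to the server host data, it should gather MoM and
comm logs, the PBS_HOME/mom_priv directory, and system information from
the two hypothetical hosts momhost1 and momhost2.
.IP "pbs_snapshot -o /tmp -l DEBUG --accounting-logs=0" 5
As a sketch of the
.I -l
option, this invocation would be expected to write pbs_snapshot.log at
the DEBUG level and to write a snapshot to /tmp/snapshot_<timestamp>.tgz
that includes no accounting logs and the default 5 days of daemon logs.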