Job Summarization Configuration
This guide explains how to configure the Job Summarization software.
Prerequisites
The Job Performance (SUPReMM) XDMoD module must be installed and configured before configuring the Job Summarization software.
Setup Script
The Job Summarization software includes a setup script to help you configure your installation. The script will prompt for information needed to configure the software and update the configuration files and databases. If you have modified your configuration files manually, be sure to make backups before running this command:
```shell
# supremm-setup
```
The setup script needs to be run as a user that has write access to the configuration files. You may either specify a writable path name when prompted (and then manually copy the generated configuration files) or run the script as the root user.
The setup script has an interactive ncurses-based menu-driven interface. A description of the main menu options is below:
Create configuration file
This section prompts for the configuration settings for the XDMoD datawarehouse and the MongoDB database. The script will automatically detect the resources from the XDMoD datawarehouse and prompt for the settings for each of them.
Create database tables
This section will create the database tables that are needed for the job summarization software.
The default connection settings are read from the configuration file (but can be overridden). It is necessary to supply the username and password of a database user account that has CREATE privileges on the XDMoD `modw_supremm` database.
Initialize MongoDB Database
This section will add required data to the MongoDB database.
The default connection settings are read from the configuration file (but can be overridden).
Configuration Guide
The SUPReMM job summarization software is configured using a json-style format file that uses json syntax but permits line-based comments (lines starting with `//` are ignored by the parser). This file is stored in `/etc/supremm/config.json` for RPM-based installs or under `[PREFIX]/etc/supremm/config.json` for source code installs, where `[PREFIX]` is the path that was passed to the install script.
The paths shown in this configuration guide show the default values for RPM-based installs. For source code installs you will need to adjust the paths in the examples to match the installed location of the package.
The top level properties are listed in the table below:
| Setting | Description |
|---------|-------------|
| `summary` | Contains configuration settings for the `summarize_jobs.py` script. |
| `resources` | Contains details about the compute resources. |
| `datawarehouse` | Contains configuration to access XDMoD's database. |
| `outputdatabase` | Contains configuration settings for the database used to store the job summary data. |
| `xdmodroot` | This optional setting defines the path to the XDMoD configuration directory. It is only used if the summarization software runs on the same machine as the XDMoD software. If present, the software will read the XDMoD database configuration directly from the XDMoD portal settings file, which obviates the need to redundantly specify database settings. |
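Putting the top-level properties together, a minimal `config.json` has the following overall shape. The values here are placeholders for illustration, not defaults; each section is described in detail below:

```json
{
    "summary": {
        "archive_out_dir": "/dev/shm/supremm",
        "subdir_out_format": "%r/%j"
    },
    "resources": {
        "my_cluster_name": {
        }
    },
    "datawarehouse": {
        "include": "xdmod://datawarehouse"
    },
    "outputdatabase": {
        "include": "xdmod://jobsummarydb"
    }
}
```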
Summary settings
The `summary` element contains configuration for the `summarize_jobs.py` script:
```json
{
    ...
    "summary": {
        "archive_out_dir": "/dev/shm/supremm_test",
        "subdir_out_format": "%r/%j"
    }
}
```
| Setting | Example value | Description |
|---------|---------------|-------------|
| `archive_out_dir` | `/dev/shm/supremm` | Path to a directory that is used to store temporary files. The summary script will try to create the directory if it does not exist. The default value uses a path under `/dev/shm` because this is the typical location of a tmpfs filesystem. Summarization performance is typically improved by using tmpfs for temporary files, but this is not required. |
| `subdir_out_format` | `%r/%j` | Specifies the path under `archive_out_dir` to be used for temporary files during the summarization of each job. Different subdirectories should be used for each job because jobs are processed in parallel. The format string supports the following substitutions: `%r` is replaced by the resource name and `%j` by the job identifier. Additionally, any valid format specifiers to the `strftime` function are permitted. The `strftime` function is called with the end time of the job. |
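As an illustration of how the format string is expanded, the following Python sketch applies the `%r` and `%j` substitutions and then passes the remainder to `strftime` with the job end time. This is a hypothetical helper for explanation only, not the actual implementation:

```python
from datetime import datetime

def expand_subdir_format(fmt, resource, jobid, end_time):
    """Hypothetical illustration of subdir_out_format expansion."""
    # %r and %j are replaced first so they are not interpreted as
    # strftime specifiers (both also have strftime meanings).
    expanded = fmt.replace("%r", resource).replace("%j", str(jobid))
    # Any remaining % specifiers are handled by strftime, which is
    # called with the end time of the job.
    return end_time.strftime(expanded)

# A per-month layout in addition to the resource/job subdirectories:
print(expand_subdir_format("%Y-%m/%r/%j", "my_cluster_name", 1234567,
                           datetime(2024, 3, 7, 12, 30)))
# -> 2024-03/my_cluster_name/1234567
```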
Resource settings
The “my_cluster_name” string and the value of the `resource_id` field should be set to the same values as the `code` and `id` columns in the Open XDMoD `modw.resourcefact` table in the datawarehouse:
```json
{
    ...
    "resources": {
        "my_cluster_name": {
            "enabled": true,
            "resource_id": 1,
            "batch_system": "XDMoD",
            "hostname_mode": "hostname",
            "pcp_log_dir": "/data/pcp-logs/my_cluster_name",
            "batchscript": {
                "path": "/data/jobscripts/my_cluster_name",
                "timestamp_mode": "start"
            }
        }
    }
}
```
The various settings are described in the table below:
| Setting | Allowed values | Description |
|---------|----------------|-------------|
| `enabled` | `true` \| `false` | If set to `false` then this resource will be ignored by the software. |
| `resource_id` | [integer] | The value from the `id` column in the `modw.resourcefact` table in the XDMoD database. |
| `batch_system` | `XDMoD` | Sets the module used to obtain job accounting information. This should be set to `XDMoD`. |
| `hostname_mode` | `hostname` \| `fqdn` | Determines how compute node names as reported by the resource manager are compared with the node name information from the PCP archives. If the resource manager reports just the hostname for compute nodes in the accounting logs, then this value should be set to `hostname`. If the resource manager reports fully qualified domain names in the accounting logs, then this value should be set to `fqdn` (see also the `host_name_ext` setting below). Typically, the Slurm resource manager reports just the hostname in the accounting logs. |
| `host_name_ext` | [domain name] | If the `hostname_mode` is `fqdn` and `host_name_ext` is specified, then the string will be appended to the node name from the PCP archives if it is absent. This is used to work around misconfigured `/etc/hosts` files on the compute nodes that result in only the hostname information being recorded in the PCP archive metadata. This setting is ignored if `hostname_mode` is set to `hostname` and may be omitted in that case. |
| `datasource` | `pcp` \| `prometheus` | Data collection software used to monitor the resource. |
| `pcp_log_dir` | [filesystem path] | Path to the PCP log files for the resource. |
| `prom_host` | [hostname] | Hostname of the Prometheus server monitoring the resource. |
| `prom_user` | [username] | Username for basic authentication to the Prometheus server. |
| `prom_password` | [password] | Password for basic authentication to the Prometheus server. |
| `batchscript.path` | [filesystem path] | Path to the batch script files. The batch scripts must be stored following the naming convention described in the job script documentation. Set this to an empty string if the batch script files are not saved. |
| `batchscript.timestamp_mode` | `start` \| `submit` \| `end` \| `none` | How to interpret the directory timestamp names for the batch scripts: `start` means that the directory name corresponds to the job start time, `submit` the job submit time, `end` the job end time, and `none` that the timestamp should not be included in the job lookup. |
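The earlier example shows a PCP-monitored resource. For a Prometheus-monitored resource, the `datasource` and `prom_*` settings are used instead of `pcp_log_dir`. The following is a sketch only; the hostname, username, and password are placeholders, not defaults:

```json
{
    ...
    "resources": {
        "my_cluster_name": {
            "enabled": true,
            "resource_id": 1,
            "batch_system": "XDMoD",
            "hostname_mode": "hostname",
            "datasource": "prometheus",
            "prom_host": "prometheus.example.org",
            "prom_user": "supremm",
            "prom_password": "CHANGE_ME",
            "batchscript": {
                "path": "",
                "timestamp_mode": "none"
            }
        }
    }
}
```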
Database authentication settings
The configuration file supports two different mechanisms to specify the access credentials for the Open XDMoD datawarehouse. Choose one of these options:

1. Specify the path to the Open XDMoD install location (and the software will use the Open XDMoD configuration directly), or
2. Specify the location and access credentials directly.

If the summarization software is installed on the same machine as Open XDMoD, then option (1) is recommended. Otherwise use option (2).
Option (1) XDMoD path specification
If the summarization software is installed on the same machine as Open XDMoD, then ensure the `config.json` has the following settings:
```json
{
    ...
    "xdmodroot": "/etc/xdmod",
    "datawarehouse": {
        "include": "xdmod://datawarehouse"
    }
}
```
Here `xdmodroot` should be set to the location of the XDMoD configuration directory, typically `/etc/xdmod` for RPM-based installs. Note that the user account that runs the summarization scripts will need read permission on the XDMoD configuration files. For an RPM-based install, the `xdmod` user account has the correct permission.
Option (2) Direct DB credentials
If the summarization software is installed on a dedicated machine (separate from the Open XDMoD server), then the XDMoD datawarehouse location and access credentials should be specified as follows:
Create a file called `.supremm.my.cnf` in the home directory of the user that will run the job summarization software. This file must include the username and password for the Open XDMoD datawarehouse MySQL server:
```ini
[client]
user=[USERNAME]
password=[PASSWORD]
```
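Because this file contains a database password, it is prudent (though not, as far as we know, required by the software) to make it readable only by its owner:

```shell
# Make the MySQL credentials file readable only by its owner
chmod 600 ~/.supremm.my.cnf
```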
Then ensure the “datawarehouse” section of the `config.json` file has settings like the following, where `XDMOD_DATABASE_FILL_ME_IN` should be set to the hostname of the XDMoD database server:
```json
{
    ...
    "datawarehouse": {
        "db_engine": "MySQLDB",
        "host": "XDMOD_DATABASE_FILL_ME_IN",
        "defaultsfile": "~/.supremm.my.cnf"
    }
}
```
MongoDB settings
If you used Option (1) XDMoD path specification in the datawarehouse configuration then use the following configuration settings:
```json
{
    ...
    "outputdatabase": {
        "include": "xdmod://jobsummarydb"
    }
}
```
Otherwise the MongoDB settings can be specified directly as follows:
The `outputdatabase.uri` should be set to the URI of the MongoDB server that will be used to store the job-level summary documents. The URI syntax is described in the MongoDB documentation. You must specify the database name in the connection URI string in addition to specifying it in the `dbname` field:
```json
{
    ...
    "outputdatabase": {
        "type": "mongodb",
        "uri": "mongodb://localhost:27017/supremm",
        "dbname": "supremm"
    }
}
```
Setup the Database
The summarization software uses relational database tables to keep track of which jobs have been summarized, when, and which version of the software was used. These tables are added to the `modw_supremm` schema that was created when the Open XDMoD SUPReMM module was installed. The database creation script is located in the `/usr/share/supremm/setup` directory and should be run on the XDMoD datawarehouse DB instance:
```shell
$ mysql -u root -p < [PATH TO PYTHON SITE PACKAGES]/supremm/assets/modw_supremm.sql
```
where `[PATH TO PYTHON SITE PACKAGES]` is the path to the Python site-packages install directory (`/usr/lib64/python2.7/site-packages` for a CentOS 7 RPM install and `/usr/lib64/python3.6/site-packages` for a Rocky 8 RPM install).
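If you are unsure of the site-packages path on your system, one way to find it (assuming the default `python3` is the interpreter the package was installed into) is:

```shell
# Print the primary site-packages directory of the default python3
python3 -c "import site; print(site.getsitepackages()[0])"
```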
Setup MongoDB
```shell
$ mongo [MONGO CONNECTION URI] [PATH TO PYTHON SITE PACKAGES]/supremm/assets/mongo_setup.js
```

where `[MONGO CONNECTION URI]` is the URI of the MongoDB database.