Job Persistence

An important feature of Accelerator is job persistence. After a job completes, its information remains in vovserver's memory until the job is forgotten, manually or automatically.

Jobs are automatically forgotten to limit vovserver memory consumption.
Note: The Accelerator default is to automatically forget jobs after a configurable time interval, as described in Automatic Forgetting.

You may override automatic forgetting by submitting jobs using the -keep or -keepfor options.

The benefits of job persistence:
  • You can re-execute a job using nc rerun jobID without the need to type the job command line again.
  • Duration information can be used to run the job on appropriate taskers, that is, taskers with enough time left to complete it.
  • Commands are easily edited in the GUI or the browser UI.
  • Job information is preserved for documentation and auditing.
  • Jobs in vovserver memory have full information accessible via the VTK API.

To enable persistence, use the -keep option when you submit the job, as shown in the example commands later in this section.
Important: Jobs submitted with this option remain in vovserver memory until they are explicitly forgotten, so there is a tradeoff vs. memory usage, and jobs should only be kept when needed.

Keep Jobs for Longer than the Default

If your Accelerator administrator has arranged to support it, you may also use the -keepfor option when submitting jobs. This option takes a VOV timespec, such as 4h or 14d. Jobs are automatically removed from the system once they are older than the specified age.

This is supported by a liveness script which examines the jobs in memory periodically. The example script may be found in $VOVDIR/etc/liveness/live_keepfor_jobs.tcl and should be copied into the Accelerator vovserver's 'tasks' subdirectory.

The scripts in the tasks subdirectory are triggered on every vovserver update cycle, about once a minute. The -keepfor script records the last time it was active in the property KEEPFOR_LASTRUN_TS, and reads the optional property KEEPFOR_FREQUENCY to determine how often to clean up kept jobs. The default frequency is 1800s (30m). Both properties are attached to the object with ID 1.
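
The interaction between the one-minute trigger and the cleanup frequency can be sketched as follows. This is an illustrative Python model, not the shipped Tcl script: the function name should_scan, the dictionary standing in for the properties on object 1, and the choice to scan immediately when no last-run timestamp exists are all assumptions.

```python
DEFAULT_FREQUENCY_S = 1800  # 30m, the default frequency stated above


def should_scan(now_ts, last_run_ts, frequency_s=DEFAULT_FREQUENCY_S):
    """Return True when a cleanup scan is due (hypothetical helper)."""
    if last_run_ts is None:
        # Assumption: with no recorded last run, scan right away.
        return True
    return now_ts - last_run_ts >= frequency_s


# Stand-in for the properties stored on the object with ID 1.
props = {"KEEPFOR_LASTRUN_TS": None}

# Simulate one hour of update cycles, one per minute.
scans = []
for cycle in range(61):
    now = cycle * 60
    if should_scan(now, props["KEEPFOR_LASTRUN_TS"]):
        props["KEEPFOR_LASTRUN_TS"] = now  # record when the scan ran
        scans.append(now)

print(scans)  # [0, 1800, 3600]
```

In this model the script is invoked 61 times but performs the expensive scan only three times, which is the purpose of the frequency property.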

When you submit a job using the -keepfor option, Accelerator attaches a KEEPFOR property to the job to record how long the job should be kept. The value is silently limited to the range 0 to 32000000 seconds (just over 1 year).
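
To make the limit concrete, the sketch below converts a simple timespec to seconds and clamps it to the documented range. It is a Python illustration only: the helper names are hypothetical, and the parser assumes a single integer followed by one of the unit letters s, m, h, d, or w, which may be narrower than what a real VOV timespec accepts.

```python
import re

# Assumed unit letters for a simple, single-unit timespec.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
KEEPFOR_MAX_S = 32_000_000  # upper limit stated above


def timespec_to_seconds(spec):
    """Convert a single-unit timespec such as '4h' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([smhdw])", spec.strip())
    if match is None:
        raise ValueError(f"unsupported timespec in this sketch: {spec!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def clamp_keepfor(seconds):
    """Limit a value to the documented range 0..32000000 seconds."""
    return max(0, min(seconds, KEEPFOR_MAX_S))


print(clamp_keepfor(timespec_to_seconds("2w")))    # 1209600
print(clamp_keepfor(timespec_to_seconds("500d")))  # 32000000
```

Because the limit is applied silently, a request such as 500d would be kept for about 370 days rather than the 500 days asked for.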

Example commands:
% nc run -r unix -e BASE -keep runregression daily
% nc run -r unix -e BASE -keepfor 2w runregression quick

CPU Effort Considerations

The script that implements -keepfor must scan all jobs in the system that are in the Done or Failed state. This scan takes some CPU time, and the time increases with the total number of jobs in the system. You should generally not need to reduce the scan interval below the default of 30m. In most cases you can make it longer, perhaps 8-24h, depending on the rate at which kept jobs are created in your system.

Forget Jobs from vovserver Memory

You can forget jobs from vovserver's memory using the following command. Common values for <job-spec> include a jobID, '-mine' and '-set' with a set name.

% nc forget <job-spec>

For more usage information, see nc forget.

Automatic Forgetting

Accelerator's default is to automatically forget jobs as listed below:
  • Successful (Done) jobs are forgotten after 1 hour
  • Failed jobs are forgotten after 2 days
  • Unscheduled (Idle) jobs are forgotten after 2 days

To change how long jobs are kept in the system, edit the vovserver configuration in the policy.tcl file. The parameters controlling this are autoForgetValid, autoForgetFailed, and autoForgetOthers.

For more details, refer to Autoforget Jobs.
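
The default rules above can be summarized in a small decision function. This Python sketch is a model for reasoning about the defaults, not vovserver's implementation. It assumes that autoForgetValid applies to Done jobs, autoForgetFailed to Failed jobs, and autoForgetOthers to everything else, and that a job submitted with -keep is never forgotten automatically. The source does not say from which moment a job's age is measured, so age_s is left to the caller.

```python
# Default ages in seconds, taken from the list above.
DEFAULT_CONFIG = {
    "autoForgetValid": 3600,         # Done jobs: 1 hour
    "autoForgetFailed": 2 * 86400,   # Failed jobs: 2 days
    "autoForgetOthers": 2 * 86400,   # Idle jobs: 2 days
}


def is_auto_forgotten(status, age_s, kept=False, config=None):
    """Return True if a job of this status and age would be forgotten (hypothetical model)."""
    if config is None:
        config = DEFAULT_CONFIG
    if kept:
        # Assumption: -keep exempts the job from automatic forgetting.
        return False
    if status == "Done":
        limit = config["autoForgetValid"]
    elif status == "Failed":
        limit = config["autoForgetFailed"]
    else:
        limit = config["autoForgetOthers"]
    return age_s > limit


print(is_auto_forgotten("Done", 2 * 3600))             # True
print(is_auto_forgotten("Failed", 2 * 3600))           # False
print(is_auto_forgotten("Done", 2 * 3600, kept=True))  # False
```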