Streamline document automation with XML job definitions

Job Definition Creation and Processing

Document Automation Server (DAS) Job Definitions are stored as XML files which are generated using the DAS Administrator.

A job definition file contains certain standard pieces of information (the source folder, for example) that are common to all jobs, together with one or more step definitions. See the Step Type for details on each step in the job.

It is these step definitions that are executed via the DAS service.

Job IDs

DAS uses a sequential integer job ID that starts at 1001. The “next job ID” value is held in the <Autobahn DX Installation folder>\temp\next_job_id.xml file and is updated each time a new job is created or copied using the administration tool.

The initial contents of the file are:

<?xml version="1.0" encoding="ISO8859-1" ?>

<next_job_id>1001</next_job_id>
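
As a minimal sketch of how an external script might inspect this counter, the Python below parses the file contents and returns the integer. It is read-only on purpose, since the administration tool is what updates the file. The function name read_next_job_id is hypothetical and not part of DAS.

```python
import xml.etree.ElementTree as ET


def read_next_job_id(xml_text: str) -> int:
    """Return the integer held in the <next_job_id> root element."""
    root = ET.fromstring(xml_text)
    if root.tag != "next_job_id":
        raise ValueError(f"unexpected root element: {root.tag}")
    # int() raises ValueError if the element is empty or not numeric.
    return int((root.text or "").strip())


# Content equivalent to the initial file (XML declaration omitted for brevity).
print(read_next_job_id("<next_job_id>1001</next_job_id>"))  # prints 1001
```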

Sample Job Definition File

Below is a simple example of a job definition that continuously (every 30 seconds) monitors a directory (C:\Faxes) for new .TIFF files, converts them to searchable PDF, and places the resulting files in C:\Processed Faxes. The step details have been removed for clarity and are covered in a separate section, Step Details.

<autobahnjob>
  <jobid>1009</jobid>
  <jobname>Monitor incoming faxes and OCR them</jobname>
  <adxversion>6.0</adxversion>
  <trigger />
  <scheduletype>continuous</scheduletype>
  <scheduleevery>30</scheduleevery>
  <scheduleeveryunits>Seconds</scheduleeveryunits>
  <schedulefrom>00:00:00</schedulefrom>
  <scheduleto>23:55:00</scheduleto>
  <scheduleat>16:00:00</scheduleat>
  <stopprocessingonerror>True</stopprocessingonerror>
  <sendemailalerts>False</sendemailalerts>
  <attachjobreport>False</attachjobreport>
  <attachlogfile>False</attachlogfile>
  <dontsendnoerror>False</dontsendnoerror>
  <dontsendnofiles>False</dontsendnofiles>
  <dontsendonsuccess>False</dontsendonsuccess>
  <SendEmailAlertsfromaddress />
  <SendEmailAlertstoaddress />
  <skipprocessedfiles>False</skipprocessedfiles>
  <skipmask>%FILENAME.pdf</skipmask>
  <SendEmailAlertstitle>%JOBNAME% %JOBSTATUS%!</SendEmailAlertstitle>
  <SendEmailAlertsmessage>Job: '%JOBNAME%' Status: '%JOBSTATUS%'.&lt;br&gt;
Log: %LOGFILE%.&lt;br&gt;Time: %DATESTAMP% %TIMESTAMP%&lt;br&gt;
Source: &lt;a href='%JOBSOURCE%'&gt;%JOBSOURCE%&lt;/a&gt;&lt;br&gt;
Target: &lt;a href='%JOBTARGET%'&gt;%JOBTARGET%&lt;/a&gt;&lt;br&gt;
  </SendEmailAlertsmessage>
  <jobprogresscsv>C:\Aquaforest\Autobahn DX\work\1009\JobProgress.csv</jobprogresscsv>
  <jobsteps>1</jobsteps>
  <joblogrention />
  <joblogmaxsize />
  <jobsourcetype>folder</jobsourcetype>
  <jobsource>C:\Faxes</jobsource>
  <jobtarget>C:\Processed Faxes</jobtarget>
  <jobwork>C:\Aquaforest\Autobahn DX\work\1009</jobwork>
  <jobarchive>C:\Aquaforest\Autobahn DX\work\Archive\1009</jobarchive>
  <inputfilesrename>%FILENAME%%TIMESTAMP%.%EXT%</inputfilesrename>
  <joberrors>C:\Aquaforest\Autobahn DX\work\1009\errors</joberrors>
  <joblogfile>C:\Aquaforest\Autobahn DX\logs\1009\%DATESTAMP%.txt</joblogfile>
  <jobCSVlogfile>C:\Aquaforest\Autobahn DX\logs\1009\%DATESTAMP%.csv</jobCSVlogfile>
  <jobtemp>C:\Aquaforest\Autobahn DX\work\1009\temp</jobtemp>
  <JobUseWorkFolder>False</JobUseWorkFolder>
  <DeleteEmptyFolders>False</DeleteEmptyFolders>
  <SkipErrorFolder>False</SkipErrorFolder>
  <inputfileprocessing>leave</inputfileprocessing>
  <jobfilterfile>include</jobfilterfile>
  <jobfiltertype />
  <jobfileorder>alphabetical</jobfileorder>
  <jobreturnstructure>False</jobreturnstructure>
  <filelength>False</filelength>
  <jobinerror>Copy to Error Folder</jobinerror>
  <jobstep>
...
  </jobstep>
</autobahnjob>
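
The SendEmailAlertsmessage value in the sample stores its HTML line breaks and links as escaped entities (&lt;br&gt; rather than a literal br tag). If a job definition is ever prepared outside the DAS Administrator, the same escaping can be produced with the Python standard library. This is only a sketch: the helper name escape_email_body is hypothetical, and the message text is a shortened version of the sample.

```python
from xml.sax.saxutils import escape, unescape


def escape_email_body(html: str) -> str:
    """Escape &, < and > so an HTML fragment can be stored as XML element text."""
    return escape(html)


html_body = "Job: '%JOBNAME%' Status: '%JOBSTATUS%'.<br>Log: %LOGFILE%.<br>"
escaped = escape_email_body(html_body)
print(escaped)

# Round trip: unescaping restores the original HTML fragment.
assert unescape(escaped) == html_body
```

The %JOBNAME%-style placeholders contain no markup characters, so they pass through unchanged.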
XML Element Description
jobid The job ID number (see Job ID section above).
jobname Job description. Default is “Job %JOBID%”.
adxversion DAS version
trigger Name of the trigger file. The job will not start automatically until the trigger file is in the input folder.
scheduletype Schedule type:
- Ad-hoc
- Continuous
- Onceperday
scheduleevery For continuous schedule type, interval between runs.
scheduleeveryunits For the continuous schedule type, unit of the interval between runs.
schedulefrom For continuous, start time for runs.
scheduleto For continuous, end time for runs.
scheduleat For onceperday, the time at which to run.
stopprocessingonerror If true, stop processing if an error occurs.
sendemailalerts If true, send an email alert.
attachjobreport Attach the job report to the email.
attachlogfile Attach the log file to the email.
dontsendnoerror If no errors, do not send email.
dontsendnofiles If no files are processed, do not send email.
dontsendonsuccess On success, do not send email.
SendEmailAlertsfromaddress Address of sender.
SendEmailAlertstoaddress Address of recipient.
skipprocessedfiles If true, skip files that have already been processed.
skipmask Mask of processed files to skip.
SendEmailAlertstitle Title of email.
SendEmailAlertsmessage Email body (HTML elements need to be escaped).
jobprogresscsv The location of the temporary progress file used to store status information while the job is running.
joblogfile Location of the job log file. By default, the log file is logs/%JOBID%/%DATESTAMP%.txt
jobsteps The number of job steps.
joblogrention Log file retention period (days).
joblogmaxsize Maximum log file size.
jobsourcetype File, folder or tree.
jobsource The source file or folder.
jobtarget The target folder.
joberrors Folder for files that cannot be processed. If not specified, these files are placed in jobwork/errors.
jobdeleteonsuccess If “yes”, when a job has successfully completed, all work files (hence input files) are deleted.
jobwork The root of the temporary work directories used by the job. The work directories themselves are named work1, work2 etc.
jobarchive The path of the folder used as the archive location when inputfileprocessing is set to move or copy to archive.
inputfilesrename Name template to rename input files after processing.
joblogfile The log file filepath template, default is to name the log file as the current date.
jobCSVlogfile The CSV log file filepath template, default is to name the log file as the current date.
jobtemp The temporary folder path.
JobUseWorkFolder Use intermediate work folders – required if the job has multiple steps or the input and output folder are the same.
DeleteEmptyFolders If true, delete any empty folders in the job source folder.
SkipErrorFolder If true, inaccessible folders will be skipped without throwing an error.
inputfileprocessing Action on input files:
- Copy to Archive after processing
- Move to Archive after processing
- Leave input files after processing
- Move input files to Target folder after processing
- Delete input files
jobfilterfile Filter files based on:
- Include files matching
- Exclude files matching
- Include with document count limit
- Include unprocessed PDFs only
- Include unprocessed PDFs only with document count limit
jobfiltertype File filter.
jobfileorder Ordering of files included (mainly used in conjunction with document count limit options):
- Alphabetical
- Created date/UTC date ascending or descending
- Modified date/UTC date ascending or descending.
jobreturnstructure If true, the input folder structure will be preserved in the output.
filelength If true, skip long file names.
jobinerror Action when job is in error:
- Move to error
- Copy to error
- Take no action.
jobstep Contains the definition of a job step. A job definition can contain multiple jobstep elements, one per step. See the Job Step section.
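
To illustrate how the schedule elements fit together, here is a minimal Python sketch that reads a job definition and summarises its schedule. It assumes, from the element names and the sample values, that schedulefrom and scheduleto bound the continuous window and that scheduleat is the once-per-day run time. The function describe_schedule is hypothetical, and any other scheduletype value is simply treated as unscheduled.

```python
import xml.etree.ElementTree as ET

# Schedule elements copied from the sample job definition.
JOB_XML = """
<autobahnjob>
  <jobid>1009</jobid>
  <scheduletype>continuous</scheduletype>
  <scheduleevery>30</scheduleevery>
  <scheduleeveryunits>Seconds</scheduleeveryunits>
  <schedulefrom>00:00:00</schedulefrom>
  <scheduleto>23:55:00</scheduleto>
  <scheduleat>16:00:00</scheduleat>
</autobahnjob>
"""


def describe_schedule(job_xml: str) -> str:
    """Return a one-line summary of the schedule elements that apply."""
    job = ET.fromstring(job_xml)
    kind = (job.findtext("scheduletype") or "").strip().lower()
    if kind == "continuous":
        return "every {} {} between {} and {}".format(
            job.findtext("scheduleevery"),
            job.findtext("scheduleeveryunits"),
            job.findtext("schedulefrom"),
            job.findtext("scheduleto"),
        )
    if kind == "onceperday":
        return "once per day at {}".format(job.findtext("scheduleat"))
    # Assumption: anything else is run on demand.
    return "not scheduled (run ad hoc)"


print(describe_schedule(JOB_XML))
# prints: every 30 Seconds between 00:00:00 and 23:55:00
```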

Job Step

The following is a sample job step. Other step types may have more step details and attribute values.

<jobstep>
    <stepsequence>1</stepsequence>
    <steptype>kingfisher</steptype>
    <stepdetails>
      <operation>kingfisher</operation>
      <sourcetype>folder</sourcetype>
      <source>C:\Aquaforest\Autobahn DX\work\1010\work1\In</source>
      <target>C:\Aquaforest\Autobahn DX\work\1010\work2\In</target>
      <inerror>Copy to Error Folder</inerror>
      <returnstructure>False</returnstructure>
      <errors>C:\Aquaforest\Autobahn DX\work\1010\errors</errors>
      <tempfolder>C:\Aquaforest\Autobahn DX\work\1010\temp</tempfolder>
      <joboptions />
      <advancedflags />
      <fileorder>alphabetical</fileorder>
      <kfjobid>10002</kfjobid>
      <docoptions />
      <logfile />
    </stepdetails>
    <attributevalues>
      <attribute>
        <attributeid>topdf2</attributeid>
        <currentvalue />
      </attribute>
      <attribute>
        <attributeid>topdf1</attributeid>
        <currentvalue>10002</currentvalue>
      </attribute>
    </attributevalues>
</jobstep>

A job step comprises four elements.

XML Element Description
stepsequence The number of the step in the sequence. The steps will be processed in stepsequence order (ascending).
steptype Identifier of the step type. This matches the name of a file (without its ordering number) found in <installation folder>\Autobahn DX\steptype
stepdetails Details of the step that are passed to the step processor. See Step Details for more information.
attributevalues Elements that are displayed in the UI. See Attribute Values for more information.
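
Because steps are processed in ascending stepsequence order, a script that inspects a job definition should sort the jobstep elements numerically rather than rely on their order in the file. The sketch below shows one way to do that in Python. The function ordered_steps is hypothetical, and the second step type name ("split") is a placeholder chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Two steps deliberately listed out of order.
JOB_XML = """
<autobahnjob>
  <jobstep>
    <stepsequence>2</stepsequence>
    <steptype>split</steptype>
  </jobstep>
  <jobstep>
    <stepsequence>1</stepsequence>
    <steptype>kingfisher</steptype>
  </jobstep>
</autobahnjob>
"""


def ordered_steps(job_xml: str) -> list:
    """Return (stepsequence, steptype) pairs sorted by ascending sequence."""
    job = ET.fromstring(job_xml)
    steps = [
        (int(step.findtext("stepsequence")), step.findtext("steptype"))
        for step in job.findall("jobstep")
    ]
    # Numeric sort, so that step 10 comes after step 9.
    return sorted(steps, key=lambda pair: pair[0])


print(ordered_steps(JOB_XML))  # prints [(1, 'kingfisher'), (2, 'split')]
```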

Step Details

The step details give the DAS service sufficient information to execute the step, in conjunction with the information in the Step Type definitions. Files are moved from the jobsource directory into the working directory specified by <source>, and the result files are placed in the <target> directory. Upon completion of all the steps, the service moves the files from the final work directory to the <jobtarget> directory.

The Step Details will vary by step type.

XML Element Description
operation The operation (for example, split). This is defined in the step definition file for the steptype.
sourcetype Folder, file or tree.
source Source file or folder.
target Target folder.
inerror Action to be undertaken in the event of an error.
returnstructure Retain file structure when copying/moving to error folder (based on inerror setting above).
errors Files that cannot be processed are placed in this directory. Inherited from the job definition.
tempfolder The temporary folder to use for this step (defaults to job temp folder).
joboptions These are steptype-specific parameters that are derived from the options selected in the Job Designer.
advancedflags Additional advanced steptype-specific parameters that can be entered manually only.
fileorder Order of file selection.
kfjobid Document Automation Server Content Extraction (Kingfisher) job ID. This is linked to the attribute topdf2 (see below).
docoptions PDF file open options derived from the options selected in the Job Designer.
logfile If specified, the output will be logged to a file with this name in %PDFJUNCTIONDIR%\logs or %TIFFJUNCTIONDIR%\logs.

Attribute Values

The attribute values element contains the attributes that are displayed and edited in the UI.

Each attribute element comprises paired attributeid and currentvalue elements.

The attributeid can be used to link to the matching stepdetail by examining the Step Type file.
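
The attributeid/currentvalue pairing maps naturally onto a dictionary. The following Python sketch collects the pairs from the sample job step. The function attribute_values is hypothetical, and representing an empty currentvalue element as an empty string is a choice made for this example.

```python
import xml.etree.ElementTree as ET

# attributevalues element copied from the sample job step.
STEP_XML = """
<jobstep>
  <attributevalues>
    <attribute>
      <attributeid>topdf2</attributeid>
      <currentvalue />
    </attribute>
    <attribute>
      <attributeid>topdf1</attributeid>
      <currentvalue>10002</currentvalue>
    </attribute>
  </attributevalues>
</jobstep>
"""


def attribute_values(step_xml: str) -> dict:
    """Map each attributeid to its currentvalue ('' when the element is empty)."""
    step = ET.fromstring(step_xml)
    values = {}
    for attribute in step.findall("./attributevalues/attribute"):
        key = attribute.findtext("attributeid")
        values[key] = attribute.findtext("currentvalue") or ""
    return values


print(attribute_values(STEP_XML))  # prints {'topdf2': '', 'topdf1': '10002'}
```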