Searchlight.config file

The Searchlight.config file contains advanced settings that should only be changed under guidance from the [support team] (mailto:[email protected]). The file is located at “[installation path]\config\Searchlight.config”.

After changing a setting in the config file, restart the Document Searchability service for the change to take effect: go to Settings > Advanced and turn the service off and then on again.

Some of the common settings available in the Searchlight.config file are described below.

| Setting | Description |
| --- | --- |
| skipEnumerationErrors | Set this to true to skip documents that can’t be enumerated due to permissions restrictions, long path errors, etc., instead of failing the whole job. |
| checkServiceEvery | The interval at which the status of the Document Searchability service is checked. If a job’s status is set to running when the service has stopped, the job is put into an error state. The default is to check the service every 60 minutes. |
| enumerationMaxParallelism | When enumerating documents from large SharePoint libraries, Document Searchability partitions the retrieval so that the documents are retrieved in chunks. These chunks can be retrieved in parallel, which can significantly speed up enumeration. This setting controls the maximum number of chunks that can be retrieved at once. Note, however, that the value is capped at the maximum number of cores your license permits. |
| deleteDocumentsAfterAudit | If the processing mode is “Audit and OCR” and there is enough space on the local computer where the Temp Folder is defined, the documents downloaded for the audit can be reused for OCR after all documents have been audited. However, if space is an issue, this setting lets the documents be deleted as soon as they have been audited; they are then downloaded again during the OCR process. |
| processSharepointList | By default, Document Searchability only processes SharePoint document libraries. Set this to “true” if you want to process attachments in SharePoint Lists as well. |
| skipCheckedOutDocument | Set this to true to skip checked-out documents (applies during the OCR stage only). |
| retainApprovalStatus | When Document Searchability processes documents in a SharePoint library that requires Content Approval, it sets them to ‘Pending’ after processing. Set this to “true” to retain the original Approval Status after the documents have been processed. |
| ignorePreviouslyOcredDocuments | Document Searchability may re-OCR documents that have already been processed if their modified date in SharePoint has changed since they were last processed and the options to process “Fully Searchable” and/or “Partially Searchable” documents are set in the Document Settings. The modified date can change if a document is replaced by a new one or its metadata/properties are modified in SharePoint. To avoid re-processing these documents irrespective of whether the modified date has changed, set this to “true”. The default value is false. |
| sharePointFailCheckinComment | When a SharePoint document is successfully OCRed, a comment indicating that the file was processed by Document Searchability is added during check-in. This check-in comment can be configured in the “Library Settings” tab. However, when a document fails to OCR, no comment is added. To force Document Searchability to add a comment to the non-OCRed document in SharePoint, specify a comment in this setting. |
| failOnPixelLimit | Forces a document to error out in Native mode if a page contains an image that exceeds the pixel limit (IRIS engine only). The default value is ‘false’, which causes the page to be skipped instead. Extended OCR has the following image limits:<br>- Max Height = 32,768 pixels<br>- Max Width = 32,768 pixels<br>- Max Size = 75,000,000 pixels |
| pdfTextOperators | The PDF text operators that need to be present in a page for it to be considered searchable. |
| downloadAndUploadRetries<br>sharePointRequestRetries | Occasionally, intermittent network problems or unusually heavy load on the SharePoint server can cause problems when processing SharePoint document libraries. To cope with this, retry mechanisms have been implemented that retry a particular task in the event of such problems (e.g., timeouts). There are two SharePoint retry settings:<br>- downloadAndUploadRetries: used when downloading or uploading a document fails.<br>- sharePointRequestRetries: used when executing a SharePoint query fails.<br>The number of retries and the amount of time to wait between retries can be controlled through the respective config settings. The value needs to be entered in the format “x,y”, where x is the number of retries and y is the time (in milliseconds) to wait before the first retry. For each subsequent retry, the wait time is twice the previous wait time. |
| databaseRetries | Sometimes, if a document library is set to process using multiple cores, Document Searchability may encounter problems when it tries to update the database because the database is ‘locked’ by concurrent updates. To overcome this, a retry mechanism has been implemented that retries the database update if it fails the first time. The number of retries and the amount of time to wait between retries can be controlled through this setting. The value needs to be entered in the format “x,y”, where x is the number of retries and y is the amount of time (in milliseconds) to wait before each retry. |
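
To get a feel for how an “x,y” retry value translates into actual wait times, the following Python sketch computes the wait before each retry. It is only an illustration of the rules described above, not the product's implementation: the function names and the sample value “3,1000” are assumptions made for this example. The doubling schedule corresponds to the description of downloadAndUploadRetries and sharePointRequestRetries, and the fixed schedule corresponds to the description of databaseRetries.

```python
def parse_retry_setting(value: str) -> tuple[int, int]:
    """Split an "x,y" value into (number of retries, wait in milliseconds)."""
    retries_text, wait_text = value.split(",")
    retries, wait_ms = int(retries_text), int(wait_text)
    if retries < 0 or wait_ms < 0:
        raise ValueError("retries and wait time must not be negative")
    return retries, wait_ms


def wait_schedule(value: str, doubling: bool) -> list[int]:
    """Return the wait (in ms) before each retry for an "x,y" value.

    doubling=True  -> each wait is twice the previous one (SharePoint settings)
    doubling=False -> every wait is the same (databaseRetries)
    """
    retries, wait_ms = parse_retry_setting(value)
    if doubling:
        return [wait_ms * 2**attempt for attempt in range(retries)]
    return [wait_ms] * retries


# "3,1000" is a hypothetical value chosen for illustration only.
print(wait_schedule("3,1000", doubling=True))   # [1000, 2000, 4000]
print(wait_schedule("3,1000", doubling=False))  # [1000, 1000, 1000]
```

Under these assumptions, a SharePoint retry value of “3,1000” could add up to 7 seconds of waiting to a failing request, which is worth keeping in mind when choosing larger values for x, because the total wait grows quickly when each wait doubles.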