Optimize document tagging settings for searchability

1. Select the document types to process.

2. The Temp Folder Location is where Tagging temporarily stores downloaded files for processing. Each downloaded file is deleted as soon as its processing is complete.
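
The download, process, delete cycle described above can be sketched in Python. This is only an illustration of the lifecycle, not the product's implementation: the function name process_in_temp_folder and the extract callback are hypothetical, and deleting the file even when processing fails is an assumption.

```python
import os
import tempfile


def process_in_temp_folder(name, content, temp_folder, extract):
    """Store a downloaded file in temp_folder, process it, then delete it.

    All names here are hypothetical; `extract` stands in for whatever
    tagging step reads the temporary copy.
    """
    os.makedirs(temp_folder, exist_ok=True)
    # Create a uniquely named temporary copy inside the configured folder.
    fd, path = tempfile.mkstemp(suffix="_" + os.path.basename(name), dir=temp_folder)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(content)
        return extract(path)
    finally:
        # Assumption: the copy is removed whether processing succeeded or raised.
        os.remove(path)
```

Because each file is removed right after it is processed, the folder only needs enough free space for the documents in flight at any one time.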

3. There are different options to filter documents:

  • Date Filter – filters by either the modified date or the creation date. Documents whose date falls within the specified range are included.

  • Exclude Specific Documents – documents that match the specified paths are excluded.

  • Filter Documents by Regular Expression – documents whose properties match the specified regular expressions are included.
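
As a rough sketch of how the three filters might combine, the Python below models them as a single inclusion test. The Document and FilterSettings classes, their field names, and the sample paths are all hypothetical. The sketch also assumes that the date range is inclusive, that excluded paths are matched as prefixes, and that a document must match every regular expression; the product may behave differently on each point.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Document:  # hypothetical stand-in for a document record
    path: str
    created: datetime
    modified: datetime
    properties: dict = field(default_factory=dict)


@dataclass
class FilterSettings:  # hypothetical stand-in for the filter options
    date_field: str = "modified"  # "modified" or "created"
    start: Optional[datetime] = None
    end: Optional[datetime] = None
    excluded_paths: tuple = ()
    property_patterns: dict = field(default_factory=dict)  # property name -> regex


def is_included(doc, settings):
    """Return True if doc passes the date, path, and regex filters."""
    stamp = getattr(doc, settings.date_field)
    # Date Filter: keep documents inside the (assumed inclusive) range.
    if settings.start is not None and stamp < settings.start:
        return False
    if settings.end is not None and stamp > settings.end:
        return False
    # Exclude Specific Documents: assumed to be prefix matching on the path.
    if any(doc.path.startswith(prefix) for prefix in settings.excluded_paths):
        return False
    # Filter by Regular Expression: assumed that every pattern must match.
    for name, pattern in settings.property_patterns.items():
        if re.search(pattern, str(doc.properties.get(name, ""))) is None:
            return False
    return True
```

For example, with a date range covering 2024, an excluded path of "/sites/hr/archive/", and the pattern "^Invoice" on a Title property, a document modified in March 2024 at "/sites/hr/contracts/a.pdf" with the title "Invoice 42" would pass all three checks.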

4. You can limit the number of documents to process in each run. This is helpful if you want to process the documents in batches.

Set the limit to ‘0’ to process all documents.
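
The batch limit, including the special meaning of 0, can be illustrated with a short Python sketch. The function name take_batch is hypothetical, and rejecting negative values is an assumption, since the text above does not say how they are handled.

```python
from itertools import islice


def take_batch(documents, limit):
    """Return the documents to process in one run.

    A limit of 0 means "no limit"; any positive value caps the run.
    """
    if limit < 0:
        # Assumption: negative limits are treated as invalid.
        raise ValueError("limit must be 0 or a positive integer")
    if limit == 0:
        return list(documents)
    return list(islice(documents, limit))
```

With a limit of 100, successive runs work through a large library 100 documents at a time, provided documents tagged in earlier runs are skipped.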

5. Set this to true if you want to re-process documents that have already been tagged. This can be useful if you tagged a document previously using one method (e.g. Zonal) and want to tag it again using another method (e.g. NLP).

6. This option must be used in conjunction with Nutrient Document Searchability OCR. Set this to true to process only PDF documents that Document Searchability OCR has already processed, which ensures they are text searchable before metadata extraction is attempted.

See Running Document Searchability Tagging with OCR for more information about this setting.
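
The re-process and OCR options from steps 5 and 6 both decide whether a given document is picked up. The Python sketch below shows one possible reading of that logic. The function and parameter names are hypothetical, and the treatment of non-PDF files under the OCR option (left unaffected here) is an assumption, not documented behavior.

```python
def should_process(is_tagged, is_pdf, is_ocr_processed,
                   reprocess_tagged=False, require_ocr=False):
    """Decide whether a document is picked up for tagging.

    reprocess_tagged mirrors the option in step 5 and require_ocr the
    option in step 6; both names are hypothetical.
    """
    # Step 5: already-tagged documents are skipped unless re-processing is on.
    if is_tagged and not reprocess_tagged:
        return False
    # Step 6: with the OCR option on, PDFs must have been through OCR first.
    # Assumption: non-PDF files are not affected by this option.
    if require_ocr and is_pdf and not is_ocr_processed:
        return False
    return True
```

Under this reading, a PDF that was tagged earlier and has since been through OCR is picked up again only when both options are set to true.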

7. You can also enable Retain Modified Date or Retain Modified By so that the Modified Date and Modified By columns in SharePoint are not changed when the documents are tagged with new metadata.