Product Overview

Aquaforest Searchlight is an in-place document processing tool that monitors document stores and makes the files within an organization searchable. It integrates with Microsoft SharePoint and Windows file systems.

The Business Problem: Documents that are not searchable.

Studies have shown that in most organizations over 20% of documents are not fully text searchable, and so will not be located by text search or discovery exercises. In addition, a greater percentage of documents may not be tagged with appropriate metadata. With the increase in distributed capture and ad-hoc publishing to document stores such as Microsoft SharePoint, there is a need for a solution to this problem that does not depend on a strict capture-time process.

Many types of documents are not searchable without special processing. For example:

  • Scanned TIFF Files

  • Image PDF Files

  • Image Files (BMP, PNG, JPG)

  • Faxes

These types of files need to be processed with Optical Character Recognition (OCR) technology to create a text version of the file contents. A searchable PDF is then created by merging the original page images with the recognized text, which is stored in the PDF file as a hidden layer overlaying each page image. This enables the file to be searched.

Documents stored in Microsoft SharePoint often lack the key metadata needed for straightforward metadata searches. For example, attributes such as “Keywords” or “Company” may not have been fully indexed when the document was stored in SharePoint. The Aquaforest Searchlight Metadata Extractor module can be configured to automatically add metadata to new and existing documents.

To enable searches across files in SharePoint, Windows Search or other Document Management Systems, the searchable files need to be indexed by the system. System iFilters handle this automatically for Microsoft Office documents, but PDF files require a separate iFilter. A free iFilter is available from Adobe; it does a good job but indexes only basic PDF content, not PDF titles, subjects, authors, keywords, annotations, bookmarks, attachments, creation date/time or page count.

The Solution: Aquaforest Searchlight

  • Audits document stores to determine which documents require processing.

  • Monitors document stores to deal with new and updated documents.

  • Provides a dashboard with a convenient summary of the state of all managed stores.

  • Provides detailed conversion reporting.

  • Offers a convenient GUI that enables management of all stores via a single interface.

  • Supports OCR in 100+ languages, including English, Spanish, German and French.
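
To make the idea of auditing a document store more concrete, the following is a minimal Python sketch of how files in a Windows folder tree might be grouped by whether they are likely to need OCR. It is an illustration only and does not represent how Aquaforest Searchlight works internally. The names `IMAGE_EXTENSIONS`, `classify` and `audit` are hypothetical, and the sketch assumes that the file extension alone is a good enough first-pass signal. PDF files are only flagged for inspection, since telling an image-only PDF from a text PDF requires looking at the file contents.

```python
from pathlib import Path

# Hypothetical extension groups, based on the file types listed earlier
# (scanned TIFF files, image files and PDF files).
IMAGE_EXTENSIONS = {".tif", ".tiff", ".bmp", ".png", ".jpg", ".jpeg"}
PDF_EXTENSIONS = {".pdf"}


def classify(path: Path) -> str:
    """Return a coarse audit category based only on the file extension."""
    suffix = path.suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "needs_ocr"      # image files carry no text of their own
    if suffix in PDF_EXTENSIONS:
        return "inspect_pdf"    # may be image-only or already searchable
    return "skip"               # everything else is left alone here


def audit(root: Path) -> dict[str, list[Path]]:
    """Walk a folder tree and group every file by audit category."""
    report: dict[str, list[Path]] = {"needs_ocr": [], "inspect_pdf": [], "skip": []}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            report[classify(path)].append(path)
    return report


if __name__ == "__main__":
    # Audit the current folder and print a count per category.
    for category, files in audit(Path(".")).items():
        print(f"{category}: {len(files)} file(s)")
```

A real audit would also need to open each PDF to check for an existing text layer and would need a separate mechanism for SharePoint libraries; both are outside the scope of this sketch.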