Optimizing document searchability in SharePoint
SharePoint and Office 365 Document Stores Concepts
Nutrient Document Searchability can be configured to monitor multiple SharePoint libraries. Below are some concepts that should be taken into consideration during configuration.
File and path lengths
The file path is everything after the server’s name and port number in the URL. It includes the names of the site and any subsites, the document library, the folders, and the file itself.
SharePoint type | Maximum file path length (characters) | Maximum file or folder name length (characters) |
---|---|---|
SharePoint Online (Office 365) | 400 | 400 |
SharePoint On-Premises 2019 | 400 | 400 |
SharePoint On-Premises 2016 | 256 | 128 |
SharePoint On-Premises 2013 | 256 | 128 |
SharePoint On-Premises 2010 | 256 | 128 |
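The limits in the table can be checked before a document is added to a monitored library. The following is a minimal sketch, not part of the product: the function name, the dictionary keys, and the choice to measure the decoded path without its leading slash are assumptions made for illustration.

```python
from urllib.parse import unquote, urlsplit

# Limits in characters, copied from the table above: (max path, max name).
SHAREPOINT_LIMITS = {
    "online": (400, 400),
    "2019": (400, 400),
    "2016": (256, 128),
    "2013": (256, 128),
    "2010": (256, 128),
}


def check_sharepoint_url(url: str, sharepoint_type: str) -> list[str]:
    """Return a list of limit violations for a document URL (empty if none)."""
    max_path, max_name = SHAREPOINT_LIMITS[sharepoint_type]
    # Assumption: lengths are measured on the percent-decoded path,
    # i.e. everything after the server name and port, without the leading slash.
    file_path = unquote(urlsplit(url).path).lstrip("/")
    problems = []
    if len(file_path) > max_path:
        problems.append(
            f"file path is {len(file_path)} characters (limit {max_path})"
        )
    # Every site, library, folder, and file name is checked individually.
    for name in file_path.split("/"):
        if len(name) > max_name:
            problems.append(
                f"name '{name[:20]}...' is {len(name)} characters (limit {max_name})"
            )
    return problems
```

For example, `check_sharepoint_url(url, "2016")` returns an empty list when the URL is within both limits for SharePoint On-Premises 2016.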
Versioning
Document Searchability processes documents in place, so the outcome depends on the library’s versioning setting. If versioning is turned on, the resulting PDF file is saved as a new version of the input file in SharePoint. If versioning is turned off, the resulting PDF file replaces the source file.
URL formats
Below are examples of SharePoint URL formats accepted by Document Searchability when setting up a document library. NOTE: Make sure the URLs start with “http” or “https”.
Example formats
Site/Web:
Document Library:
List:
OneDrive for Business:
- https://myCompany-my.sharepoint.com/personal/firstname_lastname_mycompany_onmicrosoft_com
- https://myCompany-my.sharepoint.com/personal/firstname_lastname_mycompany_onmicrosoft_com/myLibrary
If a full page URL (one ending with “.aspx”) is entered, as in the examples below, Document Searchability will try to convert it automatically to one of the accepted formats above:
- https://myCompany/sites/mySite/myLibrary/Forms/AllItems.aspx
- https://myCompany/sites/mySite/_layouts/15/start.aspx#/myLibrary/Forms/AllItems.aspx
- https://myCompany/sites/mySite/_layouts/15/start.aspx#/Lists/myList/AllItems.aspx
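To illustrate what this kind of automatic formatting involves, here is a rough sketch of one possible normalization. It is not the product’s actual logic: the function name, the regular expressions, and the resulting URL shapes are assumptions based only on the example URLs above.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns, derived from the example URLs in this section.
_START_PAGE = re.compile(r"/_layouts/\d+/start\.aspx$", re.IGNORECASE)
_VIEW_PAGE = re.compile(r"(/Forms)?/[^/]+\.aspx$", re.IGNORECASE)


def normalize_library_url(url: str) -> str:
    """Reduce a full .aspx page URL to a site, library, or list URL."""
    parts = urlsplit(url.strip())
    if parts.scheme.lower() not in ("http", "https"):
        raise ValueError("URL must start with http or https")
    path = parts.path
    # start.aspx URLs carry the real location in the fragment after '#'.
    if _START_PAGE.search(path) and parts.fragment.startswith("/"):
        path = _START_PAGE.sub("", path) + parts.fragment
    # Drop a trailing view page such as /Forms/AllItems.aspx or /AllItems.aspx.
    path = _VIEW_PAGE.sub("", path)
    return f"{parts.scheme}://{parts.netloc}{path.rstrip('/')}"
```

Under these assumptions, the first two example URLs both reduce to `https://myCompany/sites/mySite/myLibrary`, and a URL that does not end with “.aspx” is returned unchanged.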
Windows File System Stores Concepts
File and path lengths
Standard Windows File System
The maximum length of a path is 260 characters (D:\some 256-character path string<NUL>).
Windows File System (Unicode)
The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters.
This type of path is composed of components separated by backslashes, each up to 255 characters.
To specify an extended-length path, use the “\\?\” prefix. For example, “\\?\D:\very long path”.
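The path rules above can be expressed as a few small helpers. This is a sketch for illustration only: the function and constant names are hypothetical, the paths are assumed to be absolute, and the `\\?\UNC\` form for network shares comes from the Windows path conventions rather than from this document.

```python
MAX_PATH = 260        # standard limit, including the terminating NUL
MAX_COMPONENT = 255   # typical limit for one folder or file name

EXTENDED_PREFIX = "\\\\?\\"            # the four characters: \ \ ? \
UNC_EXTENDED_PREFIX = "\\\\?\\UNC\\"   # extended form for \\server\share paths


def exceeds_standard_limit(path: str) -> bool:
    """True if the path does not fit in the standard 260-character limit."""
    # The terminating NUL uses one of the 260 characters.
    return len(path) + 1 > MAX_PATH


def to_extended_length(path: str) -> str:
    """Add the extended-length prefix to an absolute path if it lacks one."""
    if path.startswith(EXTENDED_PREFIX):
        return path
    if path.startswith("\\\\"):
        # \\server\share\file becomes \\?\UNC\server\share\file
        return UNC_EXTENDED_PREFIX + path[2:]
    return EXTENDED_PREFIX + path


def long_components(path: str) -> list[str]:
    """Return the components that are longer than 255 characters."""
    return [c for c in path.split("\\") if len(c) > MAX_COMPONENT]
```

A caller might apply `to_extended_length` only when `exceeds_standard_limit` returns `True`, since extended-length paths are passed through with less normalization by Windows.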
Windows File System (long path)
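Starting in Windows 10, version 1607, it is possible to opt out of the MAX_PATH limitations in common Win32 file and directory functions. This is not the default behavior: long path support must be enabled in Windows (through the LongPathsEnabled registry value or the equivalent Group Policy setting), and the application must declare itself long-path aware.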
File Access Permissions
The Document Searchability Service must be configured with the security credentials of a user who has permission to access the configured file system location.
Azure File Storage Stores Concepts
The entire path, including the file name, must contain fewer than 2,048 characters.
The path is composed of components separated by backslashes (for example, in \A\B\C\D each letter is a component). Each component can be up to 255 characters in length.
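Both Azure File Storage limits can be validated with a short check. The sketch below is illustrative: the function and constant names are hypothetical, and accepting forward slashes as separators in addition to backslashes is an assumption made for convenience.

```python
import re

AZURE_FILES_MAX_PATH = 2048        # the full path must be shorter than this
AZURE_FILES_MAX_COMPONENT = 255    # limit for each directory or file name


def check_azure_file_path(path: str) -> list[str]:
    """Return a list of limit violations for an Azure Files path."""
    problems = []
    if len(path) >= AZURE_FILES_MAX_PATH:
        problems.append(
            f"path is {len(path)} characters (must be fewer than {AZURE_FILES_MAX_PATH})"
        )
    # Assumption: treat both '\' and '/' as component separators.
    for component in re.split(r"[\\/]", path):
        if len(component) > AZURE_FILES_MAX_COMPONENT:
            problems.append(
                f"component of {len(component)} characters "
                f"(limit {AZURE_FILES_MAX_COMPONENT})"
            )
    return problems
```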
Azure Blob Storage Stores Concepts
Blob storage is a flat storage scheme. Within one container, each blob name identifies a blob. It is possible to simulate a folder structure using delimiters within the blob name.
Blobs are identified by both a container name and a blob name.
Container names are between 3 and 63 characters in length.
A blob name must be at least one character long and cannot be more than 1,024 characters long.
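The following sketch shows the length rules and how a folder view can be simulated over flat blob names. The function names are hypothetical, `/` is assumed as the delimiter, and only the length limits stated above are checked (Azure applies further naming rules that are not covered here).

```python
def check_blob_address(container: str, blob_name: str) -> list[str]:
    """Check the container and blob name lengths stated above."""
    problems = []
    if not 3 <= len(container) <= 63:
        problems.append("container name must be 3 to 63 characters long")
    if not 1 <= len(blob_name) <= 1024:
        problems.append("blob name must be 1 to 1,024 characters long")
    return problems


def list_virtual_folder(blob_names, prefix, delimiter="/"):
    """Simulate a folder listing: return the immediate children of a prefix."""
    entries = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        head, sep, _ = rest.partition(delimiter)
        # A trailing delimiter marks a simulated subfolder.
        entries.add(head + sep)
    return sorted(entries)
```

With blob names such as `scans/2023/a.pdf` and `scans/readme.txt`, listing the prefix `scans/` yields the simulated subfolder `2023/` and the file `readme.txt`, even though the container itself stores no folders.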
Mixed Storage Types
A Document Searchability library can use one document store type as the source and a different document store type for the Archive location and for files that generate errors. However, mixing store types causes issues, because the store types differ in their maximum file path lengths and in the characters they accept in file paths.
For general use, it is recommended that a Document Searchability library use the same type of storage for all locations.
Use of Windows File System for Archive and Error locations has been tested, but there are issues with respect to path lengths and accepted characters as noted above.
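The path length mismatch can be made concrete with a small pre-check that tests whether a file’s relative path still fits once it is placed under an Archive or Error location in a different store. This is only a sketch: the store keys, the function name, and the way the target path is assembled are assumptions, and character restrictions are not checked.

```python
# Maximum path lengths in characters, taken from the limits described in this document.
MAX_PATH_LENGTH = {
    "sharepoint_online": 400,
    "sharepoint_2016": 256,
    "windows_standard": 259,   # 260 including the terminating NUL
    "azure_files": 2047,       # "fewer than 2,048 characters"
    "azure_blob": 1024,        # applies to the blob name
}


def fits_in_store(relative_path: str, target_root: str, target_store: str) -> bool:
    """True if target_root plus relative_path stays within the store's limit."""
    root = target_root.rstrip("\\/")
    relative = relative_path.lstrip("\\/")
    # One extra character accounts for the separator between root and path.
    total_length = len(root) + 1 + len(relative)
    return total_length <= MAX_PATH_LENGTH[target_store]
```

For instance, a 300-character relative path that is valid in SharePoint Online would fail this check against a standard Windows File System Archive location, which is the kind of issue noted above.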