This section provides administrators with detailed information about expected functionality, operational limits, and any constraints of iManage Insight+. It covers key technical specifications, common limitations, and includes a list of frequently asked questions (FAQ) for quick reference.
Indexing image-based files through OCR
Optical Character Recognition (OCR) is a technology that extracts text from image-based documents. OCR is required for any image-based document that doesn't contain embedded text, to make it searchable.
iManage Work at cloudimanage.com supports OCR using Azure Document Intelligence, and iManage Insight+ supports OCR on the same basis. This means that image-based documents (for example, PDFs or JPEGs) of up to 150 pages are indexed and searchable.
This feature is available to customers who have purchased OCR with iManage Work at cloudimanage.com.
For more information on supported filetypes, languages, and constraints, refer to OCR for cloudimanage.com.
Indexing extra-large files
iManage Insight+ supports indexing of extra-large documents, meaning files up to 100 million characters are indexed and searchable.
Most file types are supported, with a few exceptions:
RSS
Java
ZIP files with attachments
ZIP files containing extra-large files are processed up to a combined total limit of 100 million characters across all attachments. This limit keeps indexing and search performant.
For example, ZIP file A contains:
Attachment 1: 90 million characters
Attachment 2: 5 million characters
Attachment 3: 5 million characters
Attachment 4: 1 million characters
ZIP file A and the text of Attachments 1–3 will be indexed. Attachment 4 is displayed in the list of attachments to ZIP file A, but its text isn't indexed, because Attachments 1–3 already use the full 100 million character limit.
Emails and attachments
Similarly, for emails and their attachments, the combined total of indexed characters is capped at 100 million.
For example, Email B comprises:
Email: 10,000 characters
Attachment 1: 90 million characters
Attachment 2: 5 million characters
Attachment 3: 5 million characters
Attachment 4: 1 million characters
Email B will be indexed and the titles (subjects) of its attachments will be displayed in the results list UI.
Email B: fully indexed
Attachment 1: fully indexed
Attachment 2: fully indexed
Attachment 3: approximately 4.99 million characters of text indexed (the limit remaining after the email and Attachments 1 and 2)
Attachment 4: no text indexed
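The way the combined limit is consumed in the examples above can be sketched in a few lines of Python. This is only an illustration: the function name allocate_index_budget is hypothetical, and the sketch assumes that parts are processed in the order listed and that a part is truncated once the limit is reached. The actual processing order in Insight+ isn't documented here.

```python
LIMIT = 100_000_000  # combined character limit stated in this section


def allocate_index_budget(parts, limit=LIMIT):
    """Return (name, indexed_characters) for each part, in order.

    Assumption: parts are processed in the order given, and a part is
    truncated as soon as the combined limit is exhausted.
    """
    remaining = limit
    indexed = []
    for name, size in parts:
        take = min(size, remaining)  # index only what still fits
        indexed.append((name, take))
        remaining -= take
    return indexed


# Email B from the example above
email_b = [
    ("Email", 10_000),
    ("Attachment 1", 90_000_000),
    ("Attachment 2", 5_000_000),
    ("Attachment 3", 5_000_000),
    ("Attachment 4", 1_000_000),
]

for name, chars in allocate_index_budget(email_b):
    print(f"{name}: {chars:,} characters indexed")
```

Under these assumptions, Attachment 3 receives whatever remains after the email body and the first two attachments, and Attachment 4 receives nothing. Passing the sizes from the ZIP file A example gives full indexing for Attachments 1–3 and zero characters for Attachment 4.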
FAQs
How do document search suggestions work?
As you type in the simple search bar, Insight+ offers real-time search suggestions to help you find relevant content faster. These suggestions appear after you've entered at least three characters.
Where do the suggestions come from?
Suggestions are automatically extracted from documents in your search index. It works like this:
Up to 10 key phrases are extracted from each document.
A phrase will only be suggested if it appears in at least one indexed document.
Suggestions are permission-sensitive, meaning:
You'll only see suggestions from documents you have access to.
Suggestions also respect any filters you've applied.
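The lookup rules in this answer (the three-character minimum, permission and filter checks, lowercase normalization, and the cap of five suggestions) can be sketched as follows. This is a hedged illustration, not the Insight+ implementation: the names IndexedPhrase and suggest are hypothetical, and prefix matching is an assumption, since the matching strategy isn't specified here.

```python
from dataclasses import dataclass

MIN_QUERY_CHARS = 3  # suggestions start after three typed characters
MAX_SUGGESTIONS = 5  # at most five suggestions are shown at a time


@dataclass(frozen=True)
class IndexedPhrase:
    text: str    # key phrase extracted at indexing time
    doc_id: str  # document the phrase came from


def suggest(query, phrases, visible_doc_ids):
    """Return up to five lowercase suggestions for the typed query.

    visible_doc_ids: documents the user can access and that pass any
    active filters. Prefix matching is an assumption of this sketch.
    """
    if len(query) < MIN_QUERY_CHARS:
        return []
    needle = query.lower()
    results = []
    for phrase in phrases:
        if phrase.doc_id not in visible_doc_ids:
            continue  # never suggest from documents the user can't see
        text = phrase.text.lower()
        if text.startswith(needle) and text not in results:
            results.append(text)
        if len(results) == MAX_SUGGESTIONS:
            break
    return results


index = [
    IndexedPhrase("Lease Agreement", "doc-1"),
    IndexedPhrase("lease renewal terms", "doc-2"),
    IndexedPhrase("Leasehold improvements", "doc-3"),
]
print(suggest("lea", index, {"doc-1", "doc-3"}))
```

In this sketch, the phrase from doc-2 is never suggested because that document isn't in the visible set, and a two-character query returns no suggestions at all.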
Additionally, all suggestions are normalized to lowercase for consistency, and you'll see a maximum of five suggestions at a time.
How are phrases selected?
Insight+ uses an unsupervised algorithm to automatically identify important phrases in each document. It works without requiring any training data, dictionaries, or external sources, and supports multiple languages.
It analyzes each document using statistical features such as:
Word frequency
Word placement and context
Capitalization
How widely the word appears across the text
The algorithm then extracts short phrases (usually two to four words long), ranks them, and selects up to 10 from each document.
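To make the idea concrete, here is a much-simplified sketch of statistical phrase scoring that uses the four kinds of features listed above. It isn't the algorithm Insight+ uses. The function name extract_key_phrases, the scoring formula, and the rule that skips phrases beginning or ending with a very short word (a stand-in for a stopword list, since no dictionary is used) are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def extract_key_phrases(text, max_phrases=10, min_len=2, max_len=4):
    """Rank 2-4 word phrases using simple per-word statistics."""
    sentences = [s for s in re.split(r"[.!?\n]+", text) if s.strip()]
    tokenized = [re.findall(r"\w+", s) for s in sentences]
    total_words = sum(len(words) for words in tokenized) or 1

    freq = Counter()             # word frequency
    first_pos = {}               # placement: first position in the text
    capitalized = Counter()      # capitalized uses (not sentence-initial)
    spread = defaultdict(set)    # sentences in which the word appears
    pos = 0
    for s_idx, words in enumerate(tokenized):
        for w_idx, word in enumerate(words):
            key = word.lower()
            freq[key] += 1
            first_pos.setdefault(key, pos)
            if w_idx > 0 and word[0].isupper():
                capitalized[key] += 1
            spread[key].add(s_idx)
            pos += 1

    def word_score(key):
        earliness = 1.0 - first_pos[key] / total_words
        caps = capitalized[key] / freq[key]
        coverage = len(spread[key]) / len(tokenized)
        return freq[key] * (1.0 + earliness + caps + coverage)

    scores = {}
    for words in tokenized:
        keys = [w.lower() for w in words]
        for n in range(min_len, max_len + 1):
            for i in range(len(keys) - n + 1):
                gram = keys[i:i + n]
                # Assumption: very short edge words act like stopwords.
                if len(gram[0]) <= 3 or len(gram[-1]) <= 3:
                    continue
                scores[" ".join(gram)] = sum(map(word_score, gram)) / n

    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:max_phrases]


sample = (
    "The Master Services Agreement governs all work. "
    "The Master Services Agreement may be renewed annually. "
    "Renewal requires written notice."
)
print(extract_key_phrases(sample))
```

The sketch returns lowercase phrases of two to four words, capped at 10 per document, which matches the limits described above. The real feature weighting is likely to differ.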
When does phrase extraction happen?
Phrase extraction happens during the indexing process. This ensures that all phrases are stored securely in the index and filtered appropriately, so your users only see suggestions tied to documents they’re authorized to access.