This page explains search logic: how the system finds content based on the query you enter.
When you run a search in iManage Insight+, the following factors help determine which results are returned:
Determining relevance - Term Frequency-Inverse Document Frequency (TF-IDF)
The most relevant documents appear at the top, based on your keywords.
Relevancy increases in proportion to the number of times a word appears in the document and is offset by the number of documents in the repository that contain the word. This offset adjusts for how some words appear more frequently in general.
For example, the term “iManage” appears in almost every document in an iManage repository, often several times per document. Although its term frequency is high, the number of documents that contain it is also high. This marks it as a very common word, so its contribution to relevancy in the search results is reduced.
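The weighting described above can be sketched in a few lines of Python. This is the textbook TF-IDF formulation, shown only to illustrate the idea; the exact formula iManage Insight+ uses isn't stated on this page, and the function name `tf_idf` and the sample corpus are made up for the example.

```python
import math
from collections import Counter


def tf_idf(term, document, corpus):
    """Textbook TF-IDF score of `term` in `document` relative to `corpus`.

    Each document is a list of lowercase tokens. This is the classic
    formulation, not necessarily the formula the product uses.
    """
    tf = Counter(document)[term]                  # times the term appears in this document
    df = sum(1 for doc in corpus if term in doc)  # documents that contain the term
    if tf == 0 or df == 0:
        return 0.0
    idf = math.log(len(corpus) / df)              # rarer terms get a larger weight
    return tf * idf


# Hypothetical three-document repository.
corpus = [
    ["imanage", "lease", "agreement", "imanage"],
    ["imanage", "invoice"],
    ["imanage", "lease", "renewal"],
]

common = tf_idf("imanage", corpus[0], corpus)   # term is in every document
rarer = tf_idf("agreement", corpus[0], corpus)  # term is in one document only
```

In this sketch, “imanage” occurs twice in the first document but scores 0.0 because every document contains it, while “agreement” occurs once and scores higher because it is rare in the repository.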
Boolean and Proximity operators
iManage Insight+ search accepts simple or complex Boolean and bracketed Boolean expressions, and it returns a list of matching documents.
Operators can be combined to create more advanced expressions. In the absence of brackets, Boolean and proximity operators are treated in the order of precedence below.
NOT
NEAR (NEARn and PREn)
AND
OR
NOTE:
Boolean operators must be entered in uppercase (AND). If they're in lowercase (and) or combination (aNd), they're treated as literal words.
Boolean syntax is read from left to right.
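As an illustration of the uppercase rule and the precedence order above, the following Python sketch classifies query tokens. It isn't the product's parser: the names `PRECEDENCE` and `classify` are hypothetical, and it assumes that the proximity operators follow the same uppercase rule as the Boolean operators.

```python
import re

# Precedence as listed above: a lower number is evaluated first.
PRECEDENCE = {"NOT": 1, "NEAR": 2, "PRE": 2, "AND": 3, "OR": 4}

# NEAR or PRE, optionally followed by a distance such as NEAR3 (uppercase only).
PROXIMITY = re.compile(r"^(NEAR|PRE)(\d*)$")


def classify(token):
    """Return ("operator", token, rank) or ("term", token, None)."""
    if token in ("NOT", "AND", "OR"):
        return ("operator", token, PRECEDENCE[token])
    match = PROXIMITY.match(token)
    if match:
        return ("operator", token, PRECEDENCE[match.group(1)])
    # Anything else, including "and" or "aNd", is a literal word.
    return ("term", token, None)


classified = [classify(t) for t in "cease AND desist and aNd".split()]
```

Here only the uppercase `AND` is classified as an operator; `and` and `aNd` come back as ordinary terms.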
Table: Boolean and Proximity operators
| Operator | Description | Example |
|---|---|---|
| AND | Requires both terms on either side of the operator to be present for a match. | Input: Cease AND desist Output: This query returns documents that contain both Cease and Desist, even if the two terms appear in different fields, for example, Cease in Title and Desist in Comments. |
| OR | Requires that either term (or both terms) be present for a match. | Input: Cease OR desist Output: This query returns all documents that contain at least one of these terms. |
| NOT | Requires that the term before the operator is present and the term after it isn't. | Input: Cash NOT Credit Output: This query returns only documents that contain Cash and don't contain Credit. |
| NEARn | The two terms must appear within n words of each other, in either order. n is the maximum number of words allowed between the terms. If you don't specify n, it defaults to 5. | Input: Cash NEAR1 Agreement Output: This query returns documents in which at most 1 word separates Cash and Agreement. For example, documents that contain Cash Agreement or Agreement Cash are returned. Documents that contain Cash Credit Agreement are also returned, because 1 is the maximum number of words allowed between Cash and Agreement. |
| PREn | The first term must appear before the second term, with at most n words between them. If you don't specify n, it defaults to 5. | Input: Cash PRE1 Agreement Output: This query returns only documents in which Cash appears before Agreement with at most 1 word between them. For example, documents that contain Cash Agreement are returned. Documents that contain Agreement Cash aren't returned (because the order of the terms is incorrect). |
Operators inside a quoted phrase search are ignored, even if they're in uppercase. For example, in “Cease AND desist” and “Cash NOT credit”, AND and NOT are treated as literal words.
Proximity operators (NEAR and PRE) can be combined with wildcards.
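The NEARn and PREn rules from the table can be expressed as a short Python sketch. It works on a list of lowercase words and uses exact word matching only (no stemming or wildcards); the function names `positions`, `near`, and `pre` are hypothetical and this isn't the product's implementation.

```python
def positions(tokens, term):
    """Indexes at which `term` occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def near(tokens, first, second, n=5):
    """True if the terms occur with at most n words between them, in either order."""
    return any(
        0 < abs(i - j) <= n + 1
        for i in positions(tokens, first)
        for j in positions(tokens, second)
    )


def pre(tokens, first, second, n=5):
    """True if `first` occurs before `second` with at most n words between them."""
    return any(
        0 < j - i <= n + 1
        for i in positions(tokens, first)
        for j in positions(tokens, second)
    )


doc = "cash credit agreement".split()
near_hit = near(doc, "cash", "agreement", n=1)              # one word in between
pre_miss = pre("agreement cash".split(), "cash", "agreement", n=1)  # wrong order
```

With these definitions, `near_hit` is `True` and `pre_miss` is `False`, matching the Cash NEAR1 Agreement and Cash PRE1 Agreement examples in the table.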
Wildcards
A wildcard search in iManage Insight+ returns expanded variations of the search term(s). It lets you search for a term when some of its characters are unknown.
The Asterisk * and Question Mark ? wildcards are supported.
Asterisk (*): Can be used to replace multiple unknown characters. It can be used anywhere in a term, but using * at the start of a word isn't recommended as it can slow down the search.
Question Mark (?): Can be used to replace a single unknown character. You can use multiple ? in a search term. It shouldn't be the first character in a term.
NOTE: Wildcard searches may take longer to run than non-wildcard searches.
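To show how the two wildcards expand a term, here is a minimal Python sketch that translates a wildcard term into a regular expression. It is an illustration only, with two assumptions that this page doesn't confirm: `*` is taken to match zero or more characters, and matching is case-insensitive. The names `wildcard_to_regex` and `matches` are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a term containing * and ? into a case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")        # assumption: zero or more unknown characters
        elif ch == "?":
            parts.append(r"\w")         # exactly one unknown character
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


asterisk_hit = matches("custom*", "customized")
question_hit = matches("wom?n", "women")
question_miss = matches("wom?n", "womn")  # ? needs exactly one character
```

In this sketch, `custom*` matches customized, and `wom?n` matches both woman and women but not womn.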
Stemming
Stemming reduces a search term to its root form so that the search also matches other terms that share the same root. This broadens the search to include variations of the term.
By default, search terms are stemmed (variations of the root form are returned). For example, customize, customizes, and customized are associated with the same root form, custom.
Unstemmed matches are boosted higher than stemmed matches.
To search without stemming, enclose your search terms in double quotes.
Phrase Search - Exact, Dynamic
Using double quotes (“ ”) around search terms means that only documents containing the exact terms, unstemmed and in the same order as typed, are returned.
When searching for multi-word terms without double quotes, documents where the terms appear together in the same order are boosted higher than other results. This means you don’t necessarily need to use double quotes to find the phrase you're looking for.
Special Characters
Special characters are non-alphanumeric characters, for example full stops, hyphens, dashes, underscores, currency symbols, and other punctuation marks. They are treated as a space or word boundary.
By default, an AND is applied between the words of a multi-word term. Because a special character is treated as a space, it also results in an AND. For example, a search for ‘self-evaluation’ is treated as ‘self AND evaluation’, so it returns items containing self evaluation, self_evaluation, and self-evaluation.
Special characters within a quoted phrase search are ignored.
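The handling of special characters in an unquoted term can be sketched as follows. This is a simplified illustration, not the product's tokenizer: it covers ASCII letters and digits only, it expects a plain term with no operators in it, and the names `split_term` and `as_query` are hypothetical.

```python
import re


def split_term(term):
    """Treat every non-alphanumeric character as a space and return the words."""
    return re.sub(r"[^0-9A-Za-z]+", " ", term).split()


def as_query(term):
    """Join the words with the default AND."""
    return " AND ".join(split_term(term))


hyphenated = as_query("self-evaluation")
underscored = as_query("self_evaluation")
```

Both `hyphenated` and `underscored` come out as `self AND evaluation`, which is why the hyphenated, underscored, and space-separated spellings all match the same search.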
Synonyms
A default synonym list is used to recognize variations between US/UK spellings and match both. Synonyms aren't applied to quoted (exact match) searches.
Stop Words
Stop words are common words that typically modify the meaning of other words but carry no inherent meaning themselves, such as adverbs, conjunctions, and prepositions.
By design, iManage Insight+ doesn't use stop words, which allows for better phrase matching. For example, a search for ‘stop and search’ doesn't return matches for ‘stop the search’.
Versions
All versions of a document are available for search. Where multiple versions of a document exist, the latest version of the document containing matches for the terms is returned.
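One way to read the version rule is sketched below in Python: among the versions that contain the search term, the highest version number is the one returned. This is an interpretation of the sentence above rather than confirmed product behavior; the function name `version_to_return` and the shape of the `versions` data are hypothetical.

```python
def version_to_return(versions, term):
    """Return the highest version number whose text contains `term`, or None.

    `versions` maps a version number to that version's text (hypothetical shape).
    Matching is a plain lowercase word comparison, without stemming.
    """
    matching = [
        number
        for number, text in versions.items()
        if term in text.lower().split()
    ]
    return max(matching, default=None)


versions = {
    1: "draft lease agreement",
    2: "lease agreement with indemnity clause",
    3: "lease agreement final",
}

latest_overall = version_to_return(versions, "lease")         # every version matches
latest_with_term = version_to_return(versions, "indemnity")   # only version 2 matches
```

Under this reading, a search for lease returns version 3, while a search for indemnity returns version 2, because it is the latest version that contains a match.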