On 7th November, we will introduce an update to our stemming logic that returns more accurate word variants for your searches.
What is stemming?
If you have never come across stemming before, you can think of it as a way to search for a word and have related words returned. More technically, it maps the morphological variants of a word to a common root or base word. For example, it reduces "spraying", "sprayable", "sprayed" and "sprays" to the root word "spray". This allows you to search for a base word without also including its variants in your query.
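As a rough illustration of the idea, the following Python sketch strips a few common suffixes so that the "spray" variants above share one root. The suffix list and the function names (toy_stem, find_variants) are assumptions made for this example; this is neither the Porter algorithm nor the logic used on the platform.

```python
# Toy suffix-stripping stemmer, for illustration only.
# The suffix list is an assumption and covers just this example.
SUFFIXES = ("able", "ing", "ed", "s")


def toy_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def find_variants(query: str, words: list[str]) -> list[str]:
    """Return the words that share a stem with the query."""
    target = toy_stem(query)
    return [w for w in words if toy_stem(w) == target]


if __name__ == "__main__":
    corpus = ["spraying", "sprayable", "sprayed", "sprays", "paint"]
    print(find_variants("spray", corpus))
```

Searching for "spray" here matches all four variants and skips "paint", which is the behaviour described above: one base word in the query, several related words in the results.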
What are the benefits of the new stemming logic?
Currently, we use the Porter stemming algorithm, an industry-standard method used by many search and text-based tools.
Our new stemming logic improves upon Porter stemming, as the following example illustrates:
Previous: Understand=understands, understanding, understandable, understandably
New: Understand=understands, understanding, understandable, understandably, understood
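The extra variant in the example, "understood", is an irregular form that cannot be reached by removing a suffix from "understand". This announcement does not describe how the new logic handles such forms. One common approach, sketched below as an assumption, is to consult a lookup table of irregular forms before falling back to suffix stripping. The table contents, suffix list and function names are hypothetical.

```python
# Sketch: irregular-form lookup combined with simple suffix stripping.
# IRREGULAR_FORMS and SUFFIXES are hypothetical and cover only this example.
IRREGULAR_FORMS = {"understood": "understand"}
SUFFIXES = ("ably", "able", "ing", "s")


def strip_suffix(word: str) -> str:
    """Remove the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(word: str) -> str:
    """Prefer the irregular-form table; otherwise strip a suffix."""
    word = word.lower()
    if word in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[word]
    return strip_suffix(word)


if __name__ == "__main__":
    for w in ["understands", "understanding", "understandable",
              "understandably", "understood"]:
        print(w, "->", normalize(w))
```

Suffix stripping alone leaves "understood" unchanged, so it would not match a search for "understand"; the lookup step is what brings it into the same group as the regular variants.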
In addition, our new stemming logic stems words differently depending on whether you are searching for a verb or a noun, as the following example shows:
Verbs: Understate=Understated, understates, understating
To sum up, our new stemming logic lets you search with more confidence, because your results will include only relevant variants of the keywords you entered.
(Note: wildcards such as "*" are not affected by the release of our new stemming logic. If you use a wildcard in a search term, stemming is not applied to that term, because wildcards take precedence over stemming.)
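The precedence rule in the note can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, the stemmer is a toy, and "*" is assumed to mean "any run of characters", matched here with Python's standard fnmatch module (which also treats "?" and "[...]" as special).

```python
from fnmatch import fnmatchcase

# Toy suffix list, an assumption for this example only.
SUFFIXES = ("able", "ing", "ed", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_matches(term: str, word: str) -> bool:
    """Match a search term against a word; wildcards bypass stemming."""
    term, word = term.lower(), word.lower()
    if "*" in term:
        # Wildcard takes precedence: pattern match only, no stemming.
        return fnmatchcase(word, term)
    return toy_stem(word) == toy_stem(term)


if __name__ == "__main__":
    print(term_matches("spray", "sprayed"))   # stemmed comparison
    print(term_matches("spray*", "sprayer"))  # wildcard comparison
```

In this sketch, "spray" matches "sprayed" through stemming but not "sprayer", while "spray*" matches both because the pattern alone decides the result.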
What is the impact of this update?
This update mainly affects search queries, so anything tied to a search query could be affected. The areas of the platform that could be affected are Saved Searches, Workspace folders with an Automatic Update set to "All new and existing results", Email Alerts and Insights Dashboards.
How can I manage the change?
We anticipate that the search results for some of your queries may change. We recommend referring to the following FAQs to help you manage this change:
- Will I lose my results? You will not lose any existing work. However, if you would like to retain your original results, you can export your analyses or copy your Workspace folders into a version that is not automatically updated, so that the results do not change.
- How can I retain my original analyses? Any static analyses that you have performed will not be affected by this change. Examples include Matrix Analysis and analyses generated from static Workspaces. For automatically updating Workspaces: if you are using the "new results only" option, you do not need to do anything, as there will be no change in historical results. If you are using the "All new and existing results" option, you can sort by the most recently added documents, then review and remove these documents as you see fit.