What Is Semantic Search? How Does It Work?

PatSnap's Semantic Search is a quick and easy way to find patents relevant to a passage of text or a publication/application number that you provide.

You can enter a description of your product or invention idea, or the publication/application number of a patent of interest, as a starting point and use it to search across 128 million patents. These searches may surface similar inventions that could restrict freedom-to-operate, help estimate the value of neighbouring patented inventions, or spark further and stronger ideas through adjacent technologies and product methodologies.

How Does It Work?

The semantic search algorithm has been trained on our entire patent data set. Every patent is stored in a standardized, high-information format that can be read in just a few seconds to create a set of relevant search results for you.

When analyzing a patent, the algorithm takes the entire document into account. It ignores transitional phrases and indefinite articles, and stems words that carry prefixes or suffixes. It then builds relationships between the words to capture their essential meaning and the links between them. Finally, it compares the result with the hundreds of millions of documents in our database so that only the most similar results are returned.
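As a rough illustration of the clean-up step described above (dropping articles and transitional phrases, then reducing words to a stem), here is a minimal Python sketch. It is not PatSnap's implementation: the stop-word list, the suffix list, and the function names naive_stem and preprocess are all assumptions made for this example, and the stemmer handles suffixes only.

```python
import re

# Hypothetical stop-word list: a few articles and transitional words.
# The list actually used by the service is not documented here.
STOP_WORDS = {"a", "an", "the", "wherein", "whereby", "comprising",
              "of", "and", "to", "in", "is", "for"}

# Hypothetical suffixes for a deliberately naive stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase the text, drop stop words, and stem the remaining words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("A device comprising rotating blades"))
# prints ['device', 'rotat', 'blad']
```

A production system would more likely use an established stemmer or a learned text representation; the point here is only to show what "ignoring" and "stemming" mean in practice.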

So when you request a set of similar documents through semantic search, the algorithm quickly calculates a similarity score for every document in the dataset and then returns the most relevant ones: up to the top 1,000, limited to documents that meet the minimum similarity score.
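The score-then-filter step can be sketched as follows. This is an illustrative example only: cosine similarity over word counts stands in for whatever scoring the service actually uses, the min_score value of 0.1 is an arbitrary placeholder, and the document IDs and the function names cosine_similarity and rank_documents are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_tokens, corpus, top_k=1000, min_score=0.1):
    """Score every document, drop those below min_score, return the best top_k.

    top_k=1000 mirrors the limit mentioned in the text; min_score is an
    assumed placeholder, not a documented value.
    """
    query_vec = Counter(query_tokens)
    scored = []
    for doc_id, tokens in corpus.items():
        score = cosine_similarity(query_vec, Counter(tokens))
        if score >= min_score:
            scored.append((doc_id, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical three-document corpus of already-tokenized text.
corpus = {
    "P1": ["rotor", "blade", "wind"],
    "P2": ["battery", "cell"],
    "P3": ["rotor", "blade"],
}
result = rank_documents(["rotor", "blade"], corpus)
print([doc_id for doc_id, _ in result])
# prints ['P3', 'P1']
```

In this toy corpus, P2 shares no words with the query, scores 0, and is filtered out by the minimum score, while P3 ranks above P1 because it matches the query exactly.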

The goal is to find the most similar documents in much the same way a human being would if they read every single document in the dataset and ranked each one by how similar it appears overall.

For the best results, we recommend entering either a single publication/application number or, when entering text, ideally more than 2,000 characters.




