Bio Advanced Preferences

Below is a description of the most commonly used advanced preferences available on the Patsnap Bio platform.

Alignment Identity- The Alignment Identity is a number that describes how similar the query sequence is to the target sequence (how many characters in each sequence are identical) within the aligned region. The higher the percent identity is, the more significant the match.
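As a rough illustration of how a percent identity can be derived, the sketch below counts identical characters across two already-aligned sequences. The function name and the choice of denominator (the full aligned length, gap columns included) are assumptions for illustration and may not match how the platform calculates the value.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        return 0.0
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# 7 of the 8 aligned positions are identical.
print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
```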

Query Coverage- This specifies the percentage of your query sequence you wish to match. For example, if you search with a sequence of 100 amino acids and would like to see sequences that match at least 70 amino acids of your query, change the slider from 100% to 70%. This setting does not take the position of the matched amino acids into account.
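The 100-amino-acid example can be worked through in a few lines. This is a minimal sketch that assumes coverage is the aligned span of the query divided by the query length; the function names are hypothetical and are not part of the platform.

```python
def query_coverage(query_length: int, query_start: int, query_end: int) -> float:
    """Percent of the query spanned by the aligned region (1-based, inclusive)."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    aligned_residues = query_end - query_start + 1
    return 100.0 * aligned_residues / query_length


def meets_coverage(coverage: float, minimum_percent: float) -> bool:
    """True when a hit reaches the minimum coverage chosen on the slider."""
    return coverage >= minimum_percent


# A 100-residue query aligned from position 11 to 80 covers 70 residues.
coverage = query_coverage(query_length=100, query_start=11, query_end=80)
print(coverage, meets_coverage(coverage, 70))
```

Because only the size of the aligned span is used, the same result is returned wherever in the query the 70 residues fall.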

Match with Gaps- Allows sequences to match and be included in your results even when the result sequence shows a gap in place of an amino acid.

E-Value - This indicates how likely it is that a sequence is similar to yours simply by chance. For instance, if your sequence is very short, there is a higher likelihood that it appears in several locations simply by chance. The greater the E-value, the more likely it is that the match is just down to luck.
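To make the effect of sequence length concrete, the sketch below uses the textbook BLAST relationship E = m x n x 2^(-S'), where m is the query length, n is the database length, and S' is the bit score. It is a simplification (BLAST itself works with effective lengths), the numbers are made up, and whether the platform computes its E-values in exactly this way is an assumption.

```python
def expected_chance_hits(bit_score: float, query_length: int, database_length: int) -> float:
    """Approximate E-value: expected number of chance hits scoring at least bit_score."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Illustrative numbers only. A short query can only reach a low bit score,
# so its E-value stays high; a long, well-matched query reaches a high
# bit score and a very small E-value.
short_hit = expected_chance_hits(bit_score=20, query_length=10, database_length=1_000_000)
long_hit = expected_chance_hits(bit_score=200, query_length=100, database_length=1_000_000)
print(short_hit > long_hit)  # True
```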

Algorithm- There are 3 algorithms to choose from depending on the search you wish to perform.

MegaBlast - Great for comparing very similar sequences

Blastn - Standard nucleotide-nucleotide comparisons

Blastn-short - Optimized for sequences with fewer than 50 nucleotides
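The guidance above can be condensed into a small decision helper. This is only a sketch of the rule of thumb described here: the function and its parameter names are hypothetical, and the platform does not choose an algorithm for you.

```python
def suggest_algorithm(query_length: int, expect_very_similar: bool) -> str:
    """Suggest one of the three algorithms for a nucleotide query."""
    if query_length < 50:
        # Short queries: Blastn-short is optimized for fewer than 50 nucleotides.
        return "Blastn-short"
    if expect_very_similar:
        # Near-identical sequences: MegaBlast.
        return "MegaBlast"
    # Everything else: standard nucleotide-nucleotide comparison.
    return "Blastn"


print(suggest_algorithm(21, expect_very_similar=True))     # Blastn-short
print(suggest_algorithm(1500, expect_very_similar=True))   # MegaBlast
print(suggest_algorithm(1500, expect_very_similar=False))  # Blastn
```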

Search Results Cap

Within Advanced Preferences, you can change the maximum number of search results returned by adjusting 'Max Target Sequence'. The default selection is 5,000, but this can be changed to 1,000 or 10,000. For results greater than 10,000, there will be a range selection option to View Sources.

Bio Premium users have a maximum of 10,000 search results, while Bio Professional users can go up to 50,000 and 1,000,000 search results:

Advanced Preferences Search Parameters

We covered 2 of the 6 search parameters you can change to get broader or more specific results. The other 4 are the following:

Subject Length - This is the length of the subject the system will look at to match your query against. You can use this parameter to limit your search results based on how long you want your subject to be.

Query Identity (%) - This is the percentage of matching amino acids or nucleotides in the query.

Subject Coverage (%) - This is the percentage of the subject sequence that matches the query sequence. If you would like the entire subject to be present in the query sequence, select 100%.

Subject Identity (%) - This is the percentage of matching amino acids or nucleotides in the subject sequence.
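The sketch below shows one plausible way the coverage and identity percentages relate to each other for a single hit. The class, its field names, and the exact denominators (for example, identical positions divided by the full query length for Query Identity) are assumptions made for illustration; the platform's own calculations may differ.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query_length: int      # residues in the query sequence
    subject_length: int    # residues in the subject (result) sequence
    query_aligned: int     # query residues inside the aligned region
    subject_aligned: int   # subject residues inside the aligned region
    identical: int         # identical positions within the aligned region

    def query_coverage(self) -> float:
        return 100.0 * self.query_aligned / self.query_length

    def subject_coverage(self) -> float:
        return 100.0 * self.subject_aligned / self.subject_length

    def query_identity(self) -> float:
        return 100.0 * self.identical / self.query_length

    def subject_identity(self) -> float:
        return 100.0 * self.identical / self.subject_length


# A 100-residue query matching an 80-residue stretch of a 400-residue subject,
# with 72 identical positions.
hit = Hit(query_length=100, subject_length=400,
          query_aligned=80, subject_aligned=80, identical=72)
print(hit.query_coverage(), hit.subject_coverage())    # 80.0 20.0
print(hit.query_identity(), hit.subject_identity())    # 72.0 18.0
```

In this example the query is mostly covered while the subject is not, which is the typical pattern when a short query is found inside a much longer sequence.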

The image below summarizes the advanced search preferences with an example:

The following are some setting recommendations you might find helpful for different search types:

When the target sequence to be retrieved is similar in length to the query sequence, for example when using wild-type sequences to find mutant sequences, you might not want very short or very long sequences in your results. In this case, you can set Query Identity and Subject Coverage to 90-100.
When you want to retrieve short target sequences using a long query sequence, for example using a gene sequence of thousands of base pairs to find an siRNA or other short fragments, you can set Alignment Identity to 95-100 and Query Coverage to 0-10. Along with this, you can also set a lower word size and use the Blastn algorithm.
When using a short sequence to retrieve long sequences, for example if you know the sequence of a linker peptide (GGGGSGGGGSGGGGSGGGGS) and want to find long sequences that contain it, you can set Alignment Identity to 95-100 and Subject Coverage to 0-10. Along with this, you can use the Blastn-short algorithm.
To conduct a broad search, it is recommended to use the default settings and to enable Match with Gaps so that you don't miss relevant results.
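For readers who post-process exported results, the recommendations above can be written down as simple range filters. This is a hedged sketch: the preset names and metric keys are hypothetical, the ranges are read as inclusive lower and upper bounds (an assumption about how the sliders behave), and the algorithm and word size choices are left out.

```python
# Each preset maps a metric name to an inclusive (low, high) percentage range.
PRESETS = {
    "similar_length": {
        "query_identity": (90, 100),
        "subject_coverage": (90, 100),
    },
    "long_query_short_targets": {
        "alignment_identity": (95, 100),
        "query_coverage": (0, 10),
    },
    "short_query_long_targets": {
        "alignment_identity": (95, 100),
        "subject_coverage": (0, 10),
    },
}


def hit_passes(hit_metrics: dict, preset_name: str) -> bool:
    """True when every metric named by the preset lies inside its range."""
    ranges = PRESETS[preset_name]
    return all(
        low <= hit_metrics[name] <= high
        for name, (low, high) in ranges.items()
    )


# A 20-residue linker found intact inside a 500-residue sequence:
# 100% alignment identity, but only 4% of the subject is covered.
linker_hit = {"alignment_identity": 100.0, "subject_coverage": 4.0}
print(hit_passes(linker_hit, "short_query_long_targets"))  # True
```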
