Melvin's digital garden

Burgetz2006

CREATED: 201004101352 LINK: url:/home/melvin/Modules/Literature/Burgetz2006.pdf

Defines positional homologs as homologs whose neighboring genes are also homologous.
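A rough sketch of how that definition could be checked, not the paper's actual procedure. Everything here is an assumption made for illustration: the function names, the `window` parameter, representing each genome as an ordered list of gene IDs, and treating one homologous neighbor pair as sufficient.

```python
def neighbors(order, gene, window=1):
    """Return the genes within `window` positions of `gene` in an ordered gene list."""
    i = order.index(gene)
    lo = max(0, i - window)
    return [g for g in order[lo:i + window + 1] if g != gene]


def is_positional_homolog(gene_a, gene_b, order_a, order_b, homologs, window=1):
    """True if (gene_a, gene_b) are homologs and share at least one homologous neighbor pair.

    `homologs` is a set of (gene_in_genome_a, gene_in_genome_b) tuples.
    Requiring only one homologous neighbor pair is an assumption of this sketch.
    """
    if (gene_a, gene_b) not in homologs:
        return False
    return any(
        (na, nb) in homologs
        for na in neighbors(order_a, gene_a, window)
        for nb in neighbors(order_b, gene_b, window)
    )


# Toy data: a2 has two homologs in genome B, but only b2 sits in a conserved neighborhood.
order_a = ["a1", "a2", "a3"]
order_b = ["b1", "b2", "b3", "b4", "b9"]
homologs = {("a1", "b1"), ("a2", "b2"), ("a2", "b9")}

print(is_positional_homolog("a2", "b2", order_a, order_b, homologs))
print(is_positional_homolog("a2", "b9", order_a, order_b, homologs))
```

In this toy data, `b2` is the copy whose neighborhood is conserved, while `b9` is a duplicate sitting elsewhere in the genome.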

They found that positional homologs had on average relatively lower rates of substitution at the DNA level than duplicate homologs in different genomic locations, regardless of the level of protein sequence divergence.

Adding a stringency filter based on the second-best hits was an efficient way to remove dubious ortholog identifications in BLAST and FASTA analyses.

Adding a minimum length requirement improved the accuracy of prediction. Requiring 70% coverage of the longer protein performed better than requiring 70% coverage of the shorter protein.
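A minimal sketch of the two coverage variants, to make the difference concrete. It assumes coverage means aligned length divided by protein length; the function names and that exact definition are my own, not taken from the paper.

```python
def coverage_of_longer(aligned_length, query_length, subject_length):
    """Fraction of the longer protein covered by the alignment (assumed definition)."""
    return aligned_length / max(query_length, subject_length)


def coverage_of_shorter(aligned_length, query_length, subject_length):
    """Fraction of the shorter protein covered by the alignment (assumed definition)."""
    return aligned_length / min(query_length, subject_length)


def passes_coverage(aligned_length, query_length, subject_length, min_coverage=0.70):
    """Apply the stricter variant from the note: 70% coverage of the longer protein."""
    return coverage_of_longer(aligned_length, query_length, subject_length) >= min_coverage


# An 80-residue alignment between a 100-residue and a 200-residue protein:
# it covers most of the shorter protein but less than half of the longer one.
print(coverage_of_shorter(80, 100, 200))
print(coverage_of_longer(80, 100, 200))
print(passes_coverage(80, 100, 200))
```

Measuring against the longer protein rejects hits where a short protein matches only one domain of a much longer one, which is presumably why it is the stricter of the two.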

The largest score improvement came from imposing a strict E-value spread requirement. E-value spread is defined as the ratio between the E-values of the best hit and the next-best hit.
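A small sketch of an E-value spread filter. Several things are assumptions on my part rather than details from the paper: the direction of the ratio (next-best divided by best, so larger means a clearer winner), the `min_spread` threshold of 1e5, the handling of E-values reported as 0.0, and letting a query with a single hit pass.

```python
import math


def evalue_spread(best_evalue, next_best_evalue):
    """Ratio next_best / best (direction assumed); larger means a clearer best hit."""
    if best_evalue == 0.0:
        # Search tools report extremely small E-values as 0.0, so avoid dividing by zero.
        return 1.0 if next_best_evalue == 0.0 else math.inf
    return next_best_evalue / best_evalue


def passes_spread(evalues, min_spread=1e5):
    """Check one query's hits: is the best E-value well separated from the next best?

    `evalues` holds the E-values of all hits for a single query.
    `min_spread` is an illustrative threshold, not a value from the paper.
    """
    ranked = sorted(evalues)
    if not ranked:
        return False  # no hits, so nothing to call an ortholog
    if len(ranked) == 1:
        return True  # a lone hit has no competitor (assumption of this sketch)
    return evalue_spread(ranked[0], ranked[1]) >= min_spread


print(passes_spread([1e-50, 1e-10]))  # best hit far ahead of the runner-up
print(passes_spread([1e-50, 1e-48]))  # two nearly equivalent hits
```

The second call is the ambiguous case the filter is meant to catch: two hits of similar strength, where picking the top one as the ortholog is a coin toss.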
