Roth2008
[file:///home/melvin/Modules/Literature/Roth2008.pdf]
homologs
- orthologs
- paralogs
- in
- out
phylogeny based methods
pairwise methods
Compute pairwise sequence similarity
Local alignment between every pair of sequences. Pairs with significant alignment scores (> 85) are refined by finding the PAM that maximizes the alginment score. PAM 224 yield scores that are on average closest to the ones obtained in the refinement step. Refined alignments with score > 181 (E-value of 10^-4) are significant.
Using local alignment, we need a length tolerance criteria. The shorter of the aligned sequence should be at least l of the longest sequence. Use two tests to decide on an value of l = 0.61
Finding stable pairs
Stable pairs are genes from two genomes that are mutually the most closely related sequences.
Verifying stable pairs
Where an ortholog is missing, want to avoid classifying paralogs as orthologs. Verify stable pairs using a third genome, that act as a witness of evolution. Pairs that pass become verified pairs and those that fail are broken pairs
Grouping verified pairs
Cluster orthologs into orthologous groups. Pairs in a group are group pairs.