Melvin's digital garden

Roth2008

[file:///home/melvin/Modules/Literature/Roth2008.pdf]

homologs

  • orthologs
  • paralogs
    • in-paralogs
    • out-paralogs

phylogeny based methods

pairwise methods

Compute pairwise sequence similarity

Compute a local alignment between every pair of sequences. Pairs with significant alignment scores (> 85) are refined by finding the PAM distance that maximizes the alignment score. The PAM 224 matrix yields scores that are on average closest to the ones obtained in the refinement step. Refined alignments with score > 181 (E-value of 10^-4) are significant.

Because the alignments are local, we need a length tolerance criterion. The shorter of the two aligned sequences should be at least a fraction l of the length of the longer sequence. Two tests are used to decide on a value of l = 0.61.
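A minimal sketch of how the score and length filters could be combined, using the thresholds from this note (181 and l = 0.61). The `Alignment` record and its field names are hypothetical, and comparing the shorter aligned region against the longer full-length sequence is one reading of the criterion, not necessarily the paper's exact formula.

```python
from dataclasses import dataclass

SCORE_THRESHOLD = 181    # refined-score cutoff quoted in the note
LENGTH_TOLERANCE = 0.61  # l, the length tolerance quoted in the note


@dataclass
class Alignment:
    """Hypothetical record for one refined pairwise local alignment."""
    score: float
    aligned_len_a: int  # residues of sequence A covered by the alignment
    aligned_len_b: int  # residues of sequence B covered by the alignment
    full_len_a: int     # full length of sequence A
    full_len_b: int     # full length of sequence B


def passes_filters(aln, min_score=SCORE_THRESHOLD, l=LENGTH_TOLERANCE):
    """Keep a pair only if it is significant and long enough."""
    # Assumption: "shorter" refers to the aligned regions and
    # "longer" to the full-length sequences.
    shorter_aligned = min(aln.aligned_len_a, aln.aligned_len_b)
    longer_sequence = max(aln.full_len_a, aln.full_len_b)
    return aln.score > min_score and shorter_aligned >= l * longer_sequence


good = Alignment(score=200, aligned_len_a=70, aligned_len_b=65,
                 full_len_a=100, full_len_b=90)
too_short = Alignment(score=200, aligned_len_a=50, aligned_len_b=65,
                      full_len_a=100, full_len_b=90)
print(passes_filters(good), passes_filters(too_short))
```

Pairs that fail either test would simply be dropped before the stable-pair step.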

Finding stable pairs

Stable pairs are genes from two genomes that are mutually the most closely related sequences.
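The definition above amounts to a mutual best match between two genomes. The sketch below assumes a hypothetical `scores` dictionary keyed by (gene in genome X, gene in genome Y) and uses the alignment score as a stand-in for "most closely related". The paper may use a distance estimate with a tolerance instead, so treat this as an illustration of the idea only.

```python
def best_hits(scores):
    """Map each gene on the left side to its highest-scoring partner."""
    best = {}
    for (a, b), s in scores.items():
        # On ties, the first pair encountered is kept.
        if a not in best or s > best[a][1]:
            best[a] = (b, s)
    return {a: b for a, (b, _) in best.items()}


def stable_pairs(scores):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(scores)
    reverse = best_hits({(b, a): s for (a, b), s in scores.items()})
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical scores between genes of genome X and genome Y.
scores = {
    ("x1", "y1"): 500, ("x1", "y2"): 300,
    ("x2", "y1"): 400, ("x2", "y2"): 350,
}
print(stable_pairs(scores))
```

Here x2 prefers y1, but y1 prefers x1, so only (x1, y1) is mutual.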

Verifying stable pairs

Where an ortholog is missing, we want to avoid classifying paralogs as orthologs. Verify stable pairs using a third genome that acts as a witness of evolution. Pairs that pass become verified pairs, and those that fail are broken pairs.

Grouping verified pairs

Cluster orthologs into orthologous groups. Pairs in a group are group pairs.
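This note does not record which clustering rule is used, so the sketch below swaps in the simplest stand-in: connected components over the verified pairs, computed with union-find. The paper's rule may be stricter (for example, requiring every pair inside a group to be verified), and the function name is hypothetical.

```python
from collections import defaultdict


def group_verified_pairs(verified_pairs):
    """Group genes into connected components of the verified-pair graph."""
    parent = {}

    def find(g):
        # Register unseen genes as their own root, then walk up
        # to the root while halving the path.
        parent.setdefault(g, g)
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    # Merge the components of the two genes in each verified pair.
    for a, b in verified_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for g in list(parent):
        groups[find(g)].add(g)
    return sorted(sorted(members) for members in groups.values())


pairs = [("x1", "y1"), ("y1", "z1"), ("x2", "y2")]
print(group_verified_pairs(pairs))
```

With this stand-in, x1 and z1 land in the same group even though they were never verified directly against each other, which is exactly where a stricter rule would differ.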
