Melvin's digital garden

Lee2008

CREATED: 200705180606
LINK: url:~/Modules/Literature/Lee2008.pdf

** Motivation

  • recombination
  • affects the accuracy of inferred phylogenetic relationships
  • recombination is itself of biological significance

** Challenges

  • strength of recombination (amount of mutation)
  • timeline
  • mutation rates

** Approaches

  • Global reference tree
  • Sliding window approach
    • compare the sequences on the left and right sides of the window to detect a significant difference
    • based on phylogeny: PDM, Pruned PDM, RECOMP
    • based on computing distance: PhyPro, DSS
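
A minimal sketch of the distance-based sliding window idea, to make the left/right comparison concrete. It is not PhyPro or DSS: the function names, the use of plain Hamming distance, and the "closest reference changes" rule (in place of a real significance test) are all assumptions for illustration.

```python
def hamming(a, b):
    """Number of mismatching columns between two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def window_profile(query, refs, pos, w):
    """Distance from the query to every reference, left and right of pos."""
    left = {name: hamming(query[pos - w:pos], seq[pos - w:pos])
            for name, seq in refs.items()}
    right = {name: hamming(query[pos:pos + w], seq[pos:pos + w])
             for name, seq in refs.items()}
    return left, right


def candidate_breakpoints(query, refs, w):
    """Positions where the closest reference differs between the two half-windows.

    Hypothetical decision rule: a real method would test whether the
    left/right difference is statistically significant.
    """
    hits = []
    for pos in range(w, len(query) - w + 1):
        left, right = window_profile(query, refs, pos, w)
        if min(left, key=left.get) != min(right, key=right.get):
            hits.append(pos)
    return hits


# Toy alignment: the query follows reference A, then switches to B at column 6.
refs = {"A": "A" * 12, "B": "C" * 12}
query = "AAAAAACCCCCC"
print(candidate_breakpoints(query, refs, w=3))
```

With `w=3` the toy query is flagged at several positions around the true switch at column 6, not at a single column.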

** Weakness of distance based approaches

  • uses Hamming distance or edit distance, which is not informative: it does not take into account the distribution of the mutations
  • sensitive to choice of window length
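
A small illustration of the first weakness. The toy sequences are made up: both variants are at Hamming distance 4 from the reference, but one has its mutations packed at one end and the other has them spread evenly. The distance alone cannot tell the two apart.

```python
def hamming(a, b):
    """Number of mismatching columns between two aligned strings."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_positions(a, b):
    """Column indices at which two aligned strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


ref = "ACGTACGTACGTACGT"
clustered = "ACGTACGTACGTTGCA"  # all four mutations in the last four columns
scattered = "ATGTATGTATGTATGT"  # one mutation in every block of four columns

for name, seq in [("clustered", clustered), ("scattered", scattered)]:
    print(name, hamming(ref, seq), mismatch_positions(ref, seq))
```

The same toy data hints at the window-length issue: a window over only the last four columns sees distance 4 for `clustered` and 1 for `scattered`, while the full 16-column window sees 4 for both.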

** Contributions

  • concept of shared (k,m) mers as an informative distance measure
  • position-specific weights: heavier weights near the breakpoint (bp), lower weights away from it
  • scoring function considers chains of shared (k,m) mers
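
A rough sketch of how the first two ideas might fit together. This note does not record the paper's definitions, so everything here is an assumption: a shared (k,m) mer is taken to be a length-k window in which two aligned sequences differ in at most m columns, and the weight `1 / (1 + distance to bp)` is a hypothetical stand-in for the position-specific weights. Chains of shared (k,m) mers are not modelled.

```python
def shared_km_starts(a, b, k, m):
    """Start indices of length-k windows where a and b differ in at most m columns.

    Assumed definition of a shared (k,m) mer, not taken from the paper.
    """
    return [i for i in range(len(a) - k + 1)
            if sum(x != y for x, y in zip(a[i:i + k], b[i:i + k])) <= m]


def weighted_similarity(a, b, bp, k, m):
    """Sum of weights over shared windows; windows starting near bp count more.

    Hypothetical weighting: 1 / (1 + |start - bp|).
    """
    return sum(1.0 / (1 + abs(i - bp)) for i in shared_km_starts(a, b, k, m))


a = "ACGTACGT"
b = "ACGTTCGT"  # differs from a only at column 4

print(shared_km_starts(a, b, k=4, m=0))
print(shared_km_starts(a, b, k=4, m=1))
print(weighted_similarity(a, b, bp=4, k=4, m=1))
```

Allowing one mismatch (`m=1`) keeps every window that overlaps the single mutated column, whereas exact matching (`m=0`) keeps only the window that avoids it.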

** Experiments

  • three simulated datasets from Pruned PDM paper
  • HepB dataset, comparing detected breakpoints (bp) with those in the literature
  • finding circulating recombinant forms (CRFs) of HIV-1
