🌱

Melvin's digital garden

Sequence composition based on K-string composition

CREATED: 200910031527 ** Problems in phylogenomics

too few common genes, “tree of one percent” (Ciccarelli 2006)
depends on sequence alignment
gene tree vs species tree ** CVTree
$S(t)$ denote the number of occurrence of word $t$ in $S$
$S^0(atb) = S(at) S(tb) / S(t)$
$w(S,t) = [S(t) - S^0(t)] / S^0(t)$
$w(S)$ is the vector $[w(S, t)]$ for all $t$ (K-strings)

Links to this note

Literature notes