Melvin's digital garden

Qi2009

Title: Sorting Genomes by Translocations and Deletions, Dealing with Duplicated Genes

Translocation and deletions (without duplications)

Use break-point graph (cycle-graph)

Main idea: translocations first to bring genes that should be deleted together, deletions later

Q: Is this natural? A: Probably not. The computed TD distance is a lower bound

Dealing with duplications

Only non-tail genes can be duplicated.

Label the original genes (by finding minimum cover) and removing the others.

Example: $\Pi = (1,2,a,2,3), \Gamma = (1,2,3), \Pi^R = (1,2,\delta,3)$

Types of substrings: Sorted, Maximal, Normalized

Q: Is this formulation optimal? i.e. suppose we find the optimal minimum cover, does this mean the overall algorithm is minimum? A: Not sure.

Q: Is the problem of finding minimum cover NP-hard? A: The one defined in [Marron2004] is NP-hard, but is the one here the same one?

Approximation algorithm for MC

A greedy algorithm that extends the current cover as much as possible.

Dealing with overlap in MC

Gave some rule for deciding which copy is the original.

Bounds

There exists a cover of size $s$ and $s \le 1 + 2 * d$ where $d$ is the optimal translocation-deletion distance from $\Pi$ to $\Gamma$.

The whole algorithm has a bound of $9d+4$

Links to this note