Chauve2006

CREATED: 200801070356 LINK: url:~/Modules/Literature/Chauve2006.pdf TITLE: On common interval with errors

** Definitions

permutation/sequence
multiple genomes
quorum parameter
allowing errors

** Concepts

character set
set distance: symmetric set distance, set transformation distance
d-neighbourhood (absolution), $\epsilon$-neighbourhood (relative)
occurrences
location set
covering
d-cluster, $\epsilon$-cluster
maximality of a cluster
equivalence of clusters

** Algorithms *** common intervals as character sets Generate all possible character sets that could be a representative and checks if each of these character sets is a representative for a set of approximate occurrences that cover at least q sequences. Running time is $O(k^2 n^{2d + 5})$. *** common intervals as cliques in graphs Construct a vertex-labeled bipartite graph $G = (U \cup V, R)$ where U contains one vertex u for each character set in C^all and V contains one vertex v for each character set in N_d(C^all). Two vertices u and v are connected by an edge if the distance between them is not larger than d

Can reduce the problem to finding all maximal cliques, maximal cliques as equivalence classes of clusters *** common intervals as binary arrays Represent a set as a binary array and define a d-center of a set of binary arrays

** Discussion

cluster maximality is not efficiently taken into account
maximality is central to filter the enormous amount of output

Melvin's digital garden

Chauve2006

Links to this note