CS5228 project

CREATED: 200801250740 Group: Trung, Me

** Clustering without similarity measure

  • distance function difficult to apply to categorical data
  • distance function is only a proxy: really want to find consistent groups, but similarity is only pairwise, hence the need for metric properties
  • notion of a consistent group? information theory? See paper “Clustering without a Metric”
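
Rough sketch of the information-theory idea, to make "consistent group" concrete without any pairwise distance: score a whole group of categorical records by the sum of its per-attribute Shannon entropies (lower = more consistent). This is only one possible reading, not the method of the cited paper; the function names and toy records are made up for illustration.

```python
from collections import Counter
from math import log2

def column_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def group_entropy(records):
    """Sum of per-attribute entropies over a group of equal-length tuples.

    Assumption: attributes are treated as independent, so this is a crude
    consistency score for the group as a whole, with no pairwise distance.
    """
    return sum(column_entropy(col) for col in zip(*records))

# Hypothetical toy groups of (colour, size) records.
consistent = [("red", "small"), ("red", "small"), ("red", "large")]
mixed = [("red", "small"), ("blue", "large"), ("green", "medium")]

print(group_entropy(consistent) < group_entropy(mixed))
```

A clustering built on this would search for a partition that keeps the total group entropy low, instead of comparing records two at a time.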

** Outlier detection without clustering

  • is a notion of distance necessary?
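
One way to explore the question above: flag records whose attribute values are individually rare, with no distance and no clusters involved. The sketch below scores each categorical record by the summed surprisal (negative log frequency) of its values. It is an illustrative assumption, not a settled approach for the project; `rarity_scores` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def rarity_scores(records):
    """Score each record by the summed surprisal of its attribute values.

    Assumption: attributes are treated independently; a higher score means
    the record is made of rarer values, i.e. it is a more likely outlier.
    """
    n = len(records)
    # One frequency table per attribute (column).
    counts = [Counter(col) for col in zip(*records)]
    return [
        sum(-log2(counts[j][value] / n) for j, value in enumerate(record))
        for record in records
    ]

# Hypothetical toy data: the last record uses values seen nowhere else.
data = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y")]
scores = rarity_scores(data)
print(scores.index(max(scores)))
```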

** Data mining in structured data

  • KEGG database of pathways
  • Structured data
    • tuples (nominal, ordinal, interval)
    • set
    • sequence (document as sequences)
    • tree
    • graph

** Mining changes to data (Temporal mining)

  • evolution of sequences, e.g. gene order
  • evolution of graphs, e.g. social network

** Mining KDD Cup

