Melvin's digital garden

Conditional random field

CREATED: 200909180313 ** Sequence labeling

  • eg. Name entity recognition
  • feature-based approach
  • capture local correlations
  • weight: strength of correlation ** CRF
  • $P_\lambda(y|x) \propto Z_x(y) = \exp(\sum_i \sum_j \lambda_i f_i(x, y, j))$
  • Training: examples are best accounted by maximising likelihood of the data
  • Decoding/prediction: individual decoding vs Viterbi decoding ** Computational complexity
  • allowing arbitrary features makes computation infeasible as f
  • restrict f to use only a prefix of y but no restriction on x
  • order of f is the length of label pattern - 1
  • order 0 and order 1 have efficient algorithms based on DP ** Higher order CRF
  • basic algorithm is exponential in the order
  • sparsity of labels: not all combinations of labels occur in practice ** Key idea record only the probabilities of the prefixes of label patterns instead of all combination of patterns

Links to this note