Melvin's digital garden

SAP and disease association

Collection disease SAPs (single amino acid polymophism) and SNPs.

Define AA features

  • Single amino acid attributes

    • look at four AA before after the current one
    • consider aa helix, hydrophobicity and other features
  • BLOSUM and Grantham substitution matrix

  • PSI-BLAST to get PSSM

Key idea: different amino acids have different effects on disease

Separate aa into 20 groups and do machine learning using SVM. Gives better results as compared to running SIFT on the whole dataset.

Links to this note