Melvin's digital garden

Employee Attrition Prediction with R Accelerator

speaker: Le Zhang, Data Scientist at Microsoft
event: FOSSASIA pre-meetup

Client project done by the Microsoft Algorithms and Data Science team

Team Data Science Process (TDSP): https://github.com/Azure/Microsoft-TDSP

Identify what features are needed to build the model

  • Look at the reasons employees leave and map each reason to corresponding features

Data collection

  • HR dept
  • IT dept
  • direct reports
  • social media network

Data understanding

  • visualization
  • basic statistics

Feature engineering

  • statistics
  • time series model
  • text mining

Feature selection

  • correlation analysis
  • model based feature selection
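
As a rough illustration of the correlation analysis step, here is a minimal Python sketch that ranks candidate features by the absolute Pearson correlation with the attrition label. It is not from the talk or the R accelerator: the feature names (`overtime_hours`, `years_at_company`, `desk_number`), the toy values, and the helper names `pearson` and `rank_features` are all made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Sort (name, correlation) pairs by absolute correlation, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: one row per employee, 1 = left the company.
features = {
    "overtime_hours": [2, 10, 4, 12, 1, 11],
    "years_at_company": [8, 1, 6, 2, 9, 1],
    "desk_number": [3, 3, 3, 3, 3, 3],  # constant, so it carries no signal
}
left_company = [0, 1, 0, 1, 0, 1]

ranking = rank_features(features, left_company)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Features near the bottom of the ranking are candidates to drop. Correlation only captures linear, one-feature-at-a-time relationships, which is presumably why the notes pair it with model based feature selection.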

Model selection and validation

R accelerator

  • template with R scripts
  • helps to create a proof of concept

Resampling to balance the data classes

  • SMOTE
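
To make the idea behind SMOTE concrete, here is a small standard-library Python sketch of its core step: pick a minority-class sample, pick one of its k nearest minority-class neighbours, and create a synthetic sample at a random point on the line between them. This is a simplified sketch, not the implementation used in the R accelerator; the function name `smote`, its parameters, and the toy data are assumptions for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (minority-class rows).
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 employees who left versus 10 who stayed.
leavers = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
n_stayers = 10

# Create enough synthetic leavers to match the majority class size.
new_leavers = smote(leavers, n_synthetic=n_stayers - len(leavers))
balanced_leavers = leavers + new_leavers
print(len(balanced_leavers))
```

Resampling like this is normally applied to the training split only, so that validation scores still reflect the real class balance.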
