Employee Attrition Prediction with R Accelerator
Speaker: Le Zhang, Data Scientist at Microsoft
Event: FOSSASIA pre-meetup
Client project carried out by the Microsoft Algorithms and Data Science team
Follows the Team Data Science Process (TDSP): https://github.com/Azure/Microsoft-TDSP
Identify what features are needed to build the model
- Look at the reasons employees leave and the features that correspond to each reason
Data collection
- HR dept
- IT dept
- direct reports
- social media network
Data understanding
- visualization
- basic statistics
Feature engineering
- statistics
- time series model
- text mining
Feature selection
- correlation analysis
- model-based feature selection
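As a rough illustration of the correlation analysis step, the sketch below ranks candidate features by the absolute Pearson correlation with the attrition label. It is written in Python with the standard library only, not taken from the R accelerator, and the feature names and toy values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no signal; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical toy data: one row per employee, label 1 = left the company.
features = {
    "overtime_hours": [2, 10, 4, 12, 1, 9],
    "years_at_company": [8, 1, 6, 2, 9, 3],
    "desk_number": [5, 5, 5, 5, 5, 5],
}
left_company = [0, 1, 0, 1, 0, 1]

ranking = rank_features(features, left_company)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Features near the bottom of the ranking are candidates for removal. Correlation only captures linear relationships with the label, which is one reason to pair it with model-based feature selection.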
Model selection and validation
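One common way to validate models on imbalanced attrition data is stratified k-fold cross-validation, where every fold keeps roughly the same share of leavers. The notes do not say which validation scheme was used, so the sketch below is an assumption, written in plain Python with hypothetical helper names and toy labels.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Split row indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def train_test_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

# Hypothetical labels: 4 leavers (1) among 20 employees.
labels = [1] * 4 + [0] * 16
splits = list(train_test_splits(labels, k=4))
for train_idx, test_idx in splits:
    leavers = sum(labels[i] for i in test_idx)
    print(f"test size={len(test_idx)}, leavers in test={leavers}")
```

Each candidate model would be trained on the training indices and scored on the held-out fold, and the scores averaged across folds before comparing models.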
R accelerator
- template with R scripts
- helps to create a proof of concept
Resampling to balance the data classes
- SMOTE
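SMOTE (Synthetic Minority Over-sampling Technique) balances the classes by creating new minority-class rows: it picks a minority sample, picks one of its nearest minority neighbours, and interpolates a random point on the line between them. The following is a minimal Python sketch of that idea, not the accelerator's R code; the function name, the choice of k=3, and the toy points are assumptions for illustration.

```python
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic points by interpolating between minority neighbours.

    minority: list of numeric feature vectors (needs at least two rows).
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class rows (employees who left), two scaled features.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6)
for point in new_points:
    print([round(v, 3) for v in point])
```

Synthetic rows should be generated from the training split only, so that the held-out data still reflects the real class balance.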