Predict and Analyze Employee Retention using Machine Learning
Employee retention is one of the largest business concerns in HR analytics. Companies invest heavily in preparing their workers with future returns in mind, so when an employee leaves, the company incurs an opportunity cost. Mass recruitment agencies are particularly affected by retention. This article analyzes a summary of an employee database. To forecast the likelihood of turnover for any new employee, and to evaluate the predictions, we applied several algorithms, including logistic regression, linear regression, and a decision tree classifier. We performed a comparative analysis of the models using several model assessment criteria and selected the best model on the basis of accuracy. A correlation matrix and heat map are created to show the relationships between attributes. On the experimental side, histograms are generated to show how leaving (the left attribute) varies with pay, department, satisfaction level, and so on. We use five separate machine-learning algorithms for prediction, including linear regression, logistic regression, the decision tree, and K-means clustering. This paper offers guidance for reducing employee turnover in any company.
Index Terms – Logistic Regression, Linear Regression, K-means Clustering, Decision Tree Classifier
Employee retention is one of the major problems faced by any organization. In this age of cut-throat competition, many factors lead to dissatisfaction among employees. Long working hours, peer pressure, job location, job role, travel time, office space, office amenities, perks, and many other factors can affect employee retention. It is very important for the HR department to understand employee satisfaction levels. Sometimes an employee has no problem with the company, but another company offers a better profile with a better pay package, so the employee may be willing to leave. Retaining an employee requires insight into many areas. In this research, we try to identify the important factors that affect employee retention. The results of our model can be used by the HR department to plan a strategy before an employee submits a resignation. Unlike the majority of existing research, which focuses on a single algorithm for solving a business problem, this paper presents a comparative study of various algorithms using model evaluation metrics such as accuracy, precision, recall, FN rate, and F-measure. The reason for using multiple algorithms is that each has its own advantages and pitfalls. For example, when the event rate is low, the logistic model underperforms while random forests outperform. When a large share of the attributes are significant, ridge outperforms lasso. When the data is highly non-linear, the decision tree outperforms linear models. The comparative analysis therefore takes these advantages and pitfalls into account, enabling organizations to select the best model for their data. In this paper, we discuss a machine learning approach that offers insight into a company’s turnover by identifying its main factors. The proposed model is a holistic, two-stage approach to selecting the highest-weighted features.
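The evaluation metrics named above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, assuming that the label 1 means the employee left and 0 means the employee stayed; the function and field names are illustrative and are not taken from the study's own code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # left, and predicted to leave
    fp: int  # stayed, but predicted to leave
    fn: int  # left, but predicted to stay
    tn: int  # stayed, and predicted to stay


def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary labels (1 = left)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(c):
    """Accuracy, precision, recall, FN rate and F-measure from the counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        "fn_rate": safe_div(c.fn, c.fn + c.tp),  # equals 1 - recall
        "f_measure": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical labels for eight employees (not data from the study).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
print(evaluation_metrics(confusion_counts(actual, predicted)))
```

Computing every metric from the same counts makes it straightforward to compare several models on identical terms, since only the predicted labels change between runs.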
Our main objective is to reduce the number of features to those that are most dominant. Another problem, the hiring of replacements, imposes high costs on the company, including the costs of interviews, recruitment, and training. Forecasting worker turnover will help management develop its internal policies and strategies faster. A high-dimensional feature space makes a model susceptible to over-fitting the data set. It is also very difficult to visualize a large number of features. There are two ways to avoid over-fitting and allow proper visualization: dimensionality reduction and feature selection. We have chosen feature selection, with the goal of narrowing the feature set to the most significant features. An employee may leave a job for various reasons. The terms “turnover” and “retention” are opposites. In a company, there are different types of “turnover.”
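The text does not specify which feature-selection method was used. As one possible illustration, and not the authors' procedure, the sketch below ranks features by the absolute Pearson correlation between each feature and the target, then keeps the top k. The column names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_features(columns, target, k):
    """Rank features by |correlation with target| and keep the k strongest."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 1 = employee left, 0 = employee stayed.
left = [1, 1, 1, 0, 0, 0]
columns = {
    "satisfaction_level": [0.2, 0.3, 0.1, 0.8, 0.9, 0.7],
    "average_monthly_hours": [250, 260, 240, 160, 170, 250],
    "office_floor": [1, 2, 3, 1, 2, 3],
}

top_features, scores = select_top_features(columns, left, k=2)
print(top_features)
```

A correlation filter of this kind only captures linear, one-feature-at-a-time relationships, so it is best treated as a first screening pass before a model-based ranking.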
1.2. Employee Retention
Employee retention refers to the policies and practices that companies use to prevent valuable employees from leaving their jobs. One of the major problems facing businesses in a competitive marketplace is how to attract valuable employees. Not long ago, businesses accepted the “revolving door policy” as part of doing business and were able to find another willing candidate to fill a vacant job. Nowadays, companies frequently find that they spend considerable time, energy, and money training an employee only to have them turn into a valuable commodity and leave the company for greener pastures. In order to build a successful business, employers should explore as many options as possible for retaining workers while also securing their trust and loyalty, so that they have less desire to leave in the future. Employees need to be retained because companies need healthy, loyal, educated, and hard-working employees to run the business. Over the long term, such employees gain good product expertise, and a skilled employee can better handle clients and address peer issues that are unique to the organization. When an employee leaves, they take with them information about the company, such as ongoing projects. A high employee turnover rate damages the company’s reputation, and rivals keep trying to hire its best talent. Job performance is in large measure hindered. For example, if an employee leaves in the middle of an ongoing project, it is very difficult for someone to step in and fill that vacuum; a new employee cannot readily replace an experienced, talented one, which leads to delayed project completion and lower job satisfaction among other team members.
Employee attrition is a term that means “gradual wearing down”. Controlling employee attrition is currently the most challenging task for HR managers in industry. Attrition affects the wealth of the organization. It means that employees leave the organization, whether voluntarily or involuntarily. The percentage and frequency of employee attrition reflect the HR practices, profitability, and future of a business organization. Employee retention, by contrast, refers to the various policies and practices that encourage employees to stay with an organization for a longer period of time. Every organization invests time and money to groom a new joiner, make them corporate-ready, and bring them on par with existing employees. The organization is at a complete loss when employees leave once they are fully trained. Employee retention takes into account the various measures taken so that an individual stays in an organization for the maximum period of time.
This situation disrupts work in the organization and causes it losses. In Indian companies, attrition chiefly involves skilled employees. Employees mostly leave because of limited freedom and a poor quality of work life. When these employees leave, they not only vacate a skilled position but also take business away from the company, which affects the company’s wealth and image. Attrition is a problem in many industries, especially the automobile, manufacturing, pharmaceutical, agriculture, brewing, and steel industries. The total employee turnover rate alone does not explain attrition; there are numerous reasons why employees leave an organization. Better career opportunities and better pay have been found to be the reasons for attrition across most sectors. Foreign direct investment has also created a lot of competition among HR managers to hold on to present employees. Middle management has recorded the highest mobility, so the difficult task for the HR manager is to retain this talent. Experienced HR managers say that middle managers leave very frequently, which creates a sudden loss for the organization in the form of reduced productivity. Attrition is the result of low job satisfaction, and its most common cause is better career opportunities elsewhere: employees have many options to choose from because their skills are easily transferable. Attracting and retaining skilled talent is therefore the most immediate concern for organizations. Attrition is most dangerous when employees are lured away by immediate competitors, which has a sudden impact on productivity and customers and results in a large overall loss.
Employees are not happy to leave a job in which they find satisfaction, and even in organizations where employees are unhappy, they feel bad about saying goodbye to their present job. This is because employees are attached to their job through emotions, relationships, and loyalty to the organization, so the decision to leave is very difficult; if they get everything they desire under one roof, they will never take such a decision. Basically, people do not want to leave an organization that shows steady growth and gives work satisfaction. In today’s competitive, globalized world, where the price of the product remains the basic concern of every organization, it becomes difficult for an employer to take care of an employee’s remuneration, perks, and incentives, even though the employee is also a major factor of production. Employers are in business, and a new employee who is promised fringe and other benefits at the time of selection but does not receive them develops a negative image of the organization and will not want to stay. The employer, for its part, is also helpless in such situations, and the attrition rate of such organizations is found to be comparatively high.
Employee retention relates to an organization’s ability to retain its workforce. Employee retention may be defined by a simple statistic (for example, an 80 percent retention rate usually indicates that a company retained 80 percent of its employees over a given period). However, many find retention of workers to be linked to the efforts employers are making to keep the employees in their workforce. In that sense, retention is the tactic instead of the result.
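The retention statistic described above can be written as a short calculation. This is a minimal sketch that assumes the rate is measured as the share of employees on the payroll at the start of a period who are still employed at its end; the function names are illustrative.

```python
def retention_rate(headcount_at_start, still_employed_at_end):
    """Percentage of the starting headcount still employed at period end."""
    if headcount_at_start <= 0:
        raise ValueError("headcount_at_start must be positive")
    if not 0 <= still_employed_at_end <= headcount_at_start:
        raise ValueError("still_employed_at_end must lie between 0 and the starting headcount")
    return 100.0 * still_employed_at_end / headcount_at_start


def turnover_rate(headcount_at_start, still_employed_at_end):
    """Complement of the retention rate over the same period."""
    return 100.0 - retention_rate(headcount_at_start, still_employed_at_end)


# A company that starts the period with 200 employees and keeps 160 of them.
print(retention_rate(200, 160), turnover_rate(200, 160))
```

Employees hired during the period are deliberately excluded from both counts here, so that new hires do not mask departures.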
A distinction should be made between low-performing workers and top performers, and retention efforts should concentrate on valuable employees who contribute to the organization. Employee turnover is a symptom of deeper unresolved issues, which could include low employee morale, lack of a clear career path, lack of appreciation, poor employee-manager relationships, or many other issues. A lack of job satisfaction and of commitment to the organization can also lead an employee to leave and start looking for other opportunities. Pay does not always play as big a role in creating turnover as is usually thought.
The employer’s aim in a business setting is typically to decrease employee turnover, thereby reducing training costs, recruiting costs, and the loss of talent and organizational knowledge. By applying lessons learned from key concepts in organizational behavior, managers may increase retention rates and reduce the costs associated with high turnover. This is not always the case, though. Employers should strive for “good retention,” with the goal of retaining only those workers they find to be high performers.
Organizations that are more conscientious about the environment and sustainability will attract and retain workers in today’s environmentally conscious society. Employees like to work for environmentally friendly businesses.
1.3. Research Question
The research question focuses primarily on how to forecast employee turnover, identify the valuable employees among those likely to leave, and then use machine learning techniques to identify the most influential employee retention factors.
1.4. Problem Statements
The aim of this study is to apply Predictive Analytics for HR to the example of employee turnover and to investigate the variables affecting employee retention within the organization, using Machine Learning algorithms on employee data from Swan bank. Various Machine Learning algorithms are tried and their performance is tested on the organization’s data in order to pick the most accurate model. The data for this modeling problem is structured data from multiple sources, so data pre-processing is important. It includes employee demographic information, and the target value is the likelihood of an employee leaving the company. Accurate prediction of employee turnover would allow the company to make strategic decisions about employee retention and take the necessary steps.
• Analyze the employee turnover in a company.
• Predict the number of employees that leave the company.
• Analyze the performance of the employee in a company.
The research goal is to find the characteristics that are responsible for employee turnover and then forecast turnover using data mining techniques. Once the likely turnover is identified, the factors that determine which employees are valuable are researched, and after those factors are finalized a decision model for valuable employees is created. The retention factors are then identified, and the most effective retention factors for each employee are presented in order to improve employee retention.
Characteristics relevant to employee retention were identified with the aid of questionnaires, drawing on previous studies and surveys among HR professionals. The most accurate predictive model was likewise developed with the aid of previous studies and work on predicting employee retention. The decision model for valuable employees was then developed from previous case studies, analysis, and the methodological assumptions discussed in the methodology chapter. Finally, with further statistical assumptions, I identified the most effective retention factors and displayed them on the dashboard.
I chose this research subject because it involves Data Analytics and Machine Learning, and because I saw the difficulties faced by Line Managers, Project Managers, and HR Managers in previous companies, where several workers left at the same time and projects struggled to complete their tasks within the specified deadlines due to the unexpected turnover. I therefore decided to create an HR analytics tool to help HR professionals in the recruitment and retention of employees.
1.6. Scope and limitations of research
The research scope is divided into five parts. The first part covers the study of factors influencing employee turnover. The second part covers quantitative model analysis, along with model accuracy, to predict employee turnover. The third part covers the analysis of the qualities of valuable employees. The fourth part reviews the decision tree to be designed to assess which employees are more important to keep (such as those with a high performance rating). The fifth part integrates all the Machine Learning models and the data source into the prediction and analysis code, which shows predictive results, the model description, valuable employees, and a dashboard results table with graphs and plots. The ML model is developed in Python Shiny.
Employee turnover budgeting, which would show how much of the compensation budget the organization saves by retaining a successful employee, is not included in the study because it belongs to the entirely separate field of human resource management budgeting. I have therefore limited my research to improving employee retention.
The limitations of this research are: a small training data set, which constrained the implementation of the employee retention prediction system; the complexity of building the classification tree and algorithm, together with conflicting data, which constrained the implementation of the retention decision-making system; the accuracy of the decision outcome; and limited access to the employee data set due to GDPR.
The research makes a significant contribution to the field of HR Analytics by improving the accuracy of employee turnover prediction and by helping HR and project managers increase the retention rate of valuable employees: a decision tree picks out the valuable employees and identifies the factors that cause them to resign, so that those employees can be retained.
The aim of this study is to apply Predictive Analysis methodology to HR data in order to analyze the use of predictive analysis for HR. The metric chosen for prediction was employee turnover, because of its high importance to the organization. The goal of the thesis was thus to predict employee turnover. In this work, data preparation was carried out on HR datasets, and an example dataset was used to demonstrate the methods used for cleaning and preparing data. First, employee data from multiple sources was gathered and unified. Then missing values were imputed, outliers were removed, and skew was reduced. Afterward, the parameters of the selected Machine Learning algorithms were tuned using different methodologies. Finally, the Machine Learning algorithms were applied to the pre-processed employee data, and the algorithm with the best prediction accuracy was selected.
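The three cleaning steps mentioned above (imputation, outlier removal, skew reduction) can be sketched for a single numeric column as follows. The specific techniques chosen here, namely median imputation, the 1.5 × IQR rule, and a log transform, are assumptions made for illustration; the text does not state which methods were actually used, and the income values are hypothetical.

```python
import math
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def reduce_skew_log(values):
    """Compress a right-skewed, non-negative column with log(1 + x)."""
    return [math.log1p(v) for v in values]


# Hypothetical monthly incomes with two missing entries and one extreme value.
monthly_income = [3000, 3200, None, 2900, 3100, 50000, 3050, None, 2950]

imputed = impute_median(monthly_income)
cleaned = remove_outliers_iqr(imputed)
transformed = reduce_skew_log(cleaned)
print(len(imputed), len(cleaned), [round(v, 3) for v in transformed])
```

On a full table, outlier removal would drop whole employee records rather than single values, so the same row filter has to be applied to every column and to the target.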
Eventually, the model was applied to the current employee dataset, and the outcome predictions, along with prediction probabilities, were saved in a flat file. In addition, to understand how decisions were made, the model was interpreted by plotting the decision graph and identifying the most important features. The outcome of the thesis is the employee turnover prediction. The aim is reached, as the model was built and applied successfully. Using the output flat file, the turnover rate can be estimated, or retention actions can be applied for key employees. To improve prediction results in the future, other features can be added to the dataset, missing values can be eliminated by collecting the respective data, and more Machine Learning methods can be applied and evaluated. A strategic “Retention Plan” should be drawn up for each Risk Category group. In addition to the suggested steps for each feature listed above, face-to-face meetings between an HR representative and employees can be initiated for medium- and high-risk employees to discuss work conditions. A meeting with those employees’ Line Manager would also make it possible to discuss the work environment within the team and whether steps can be taken to improve it.
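Saving predictions with their probabilities and grouping employees into risk categories could look like the sketch below. The cut-off values (0.3 and 0.7), the 0.5 decision threshold, the column names, and the employee identifiers are all hypothetical, since the text does not define how the Risk Category groups are bounded.

```python
import csv
import io

# Hypothetical cut-offs; the study does not state its category boundaries.
MEDIUM_RISK_MIN = 0.3
HIGH_RISK_MIN = 0.7
DECISION_THRESHOLD = 0.5


def risk_category(leave_probability):
    """Map a predicted probability of leaving to low / medium / high risk."""
    if not 0.0 <= leave_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if leave_probability >= HIGH_RISK_MIN:
        return "high"
    if leave_probability >= MEDIUM_RISK_MIN:
        return "medium"
    return "low"


def write_predictions(predictions, handle):
    """Write (employee_id, probability) pairs to a CSV flat file."""
    writer = csv.writer(handle)
    writer.writerow(["employee_id", "leave_probability", "predicted_to_leave", "risk_category"])
    for employee_id, probability in predictions:
        writer.writerow([
            employee_id,
            f"{probability:.2f}",
            int(probability >= DECISION_THRESHOLD),
            risk_category(probability),
        ])


# Hypothetical model output; an in-memory buffer stands in for a file on disk.
predictions = [("E001", 0.82), ("E002", 0.15), ("E003", 0.45)]
buffer = io.StringIO()
write_predictions(predictions, buffer)
print(buffer.getvalue())
```

Keeping the probability alongside the category lets HR re-bucket employees later if the cut-offs are revised, without re-running the model.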
1.8. Conclusion and Recommendation
This concludes the research work and recommends future advancements in retention. The study finds that the factors responsible for attrition and retention are: Retention, Percent Salary Hike, Monthly Income, Years Since Last Promotion, Distance From Home, Job Role, Performance Rating, Job Level, Satisfaction, With Current Manager, Job Satisfaction, Work-Life Balance, Number of Companies Worked, Years At Company, Over-Time, Total Working Years. The methodology demonstrated the processes followed while carrying out the research and developing an analytical prediction prototype. Finally, the artifact design illustrated the development and working of the system architecture. Ways of improving the accuracy of the results are also mentioned, so that the system can be developed for commercial use. In this research, it is found that with the above attributes and the logistic regression algorithm, the most accurate prediction result is obtained when the training dataset is 80% of the total data.
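The combination of an 80% training split and logistic regression can be illustrated with a small self-contained sketch. It is not the study's implementation: it uses a plain-Python gradient-descent logistic regression, a single hypothetical feature (satisfaction level), and synthetic data, purely to show the shape of the procedure.

```python
import math
import random


def split_80_20(rows, labels, train_fraction=0.8, seed=0):
    """Shuffle indices reproducibly, then cut at train_fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train = [(rows[i], labels[i]) for i in indices[:cut]]
    test = [(rows[i], labels[i]) for i in indices[cut:]]
    return train, test


def standardizer(train_rows):
    """Build a scaling function from training-set means and deviations only."""
    columns = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return lambda row: [(v - m) / s for v, m, s in zip(row, means, stds)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, learning_rate=0.5, epochs=500):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Predict 1 (leaves) when the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) >= 0.5 else 0


# Synthetic data: low satisfaction -> left (1), high satisfaction -> stayed (0).
rows = [[0.10 + 0.01 * i] for i in range(25)] + [[0.66 + 0.01 * i] for i in range(25)]
labels = [1] * 25 + [0] * 25

train, test = split_80_20(rows, labels)
scale = standardizer([x for x, _ in train])
weights, bias = fit_logistic([scale(x) for x, _ in train], [y for _, y in train])
test_accuracy = sum(predict(weights, bias, scale(x)) == y for x, y in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2f}")
```

The scaling parameters are computed from the training portion alone and then reused on the test portion, so that no information from the held-out 20% leaks into training.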
As a recommendation, the above analytical prototype can be integrated with human resource management finance budgeting to predict the overall profit or savings in the human resource management process, which includes attrition, retention, the hiring of new employees, the amount spent on training and developing new employees, and the loss to a project from losing a valuable employee, and to prescribe further actions to be taken. The company can thus keep track of the budget it has spent on human resource management and the budget to be spent in future, and take the necessary actions. The system can also be put on the cloud, with the data taken directly from cloud storage through server connections using FTP and SFTP commands in a Unix environment.