Employee efficiency prediction of garment production line based on machine learning
Objective Variations in employee productivity strongly affect the balance of apparel production lines, and manually scheduled operations that lack support from historical data analysis often fall short of targeted productivity levels. This research uses machine learning models to predict actual employee efficiency, giving management insight for goal setting and decision-making so as to improve production profitability and, to some extent, prevent erroneous decisions.
Method To predict efficiency, on-site surveys were conducted at factory A, gathering 526 historical production records from 13 orders. Through feature engineering, 15 initial prediction datasets were constructed, and efficiency levels were categorized using quantile division. Considering the characteristics of the production data, RandomForest regression and classification models were selected for efficiency prediction. To validate their predictive performance, they were compared with eight other models. Pearson and Spearman correlation coefficient analyses were performed to investigate the impact of variables on the model predictions. Finally, recursive feature elimination was employed to optimize the models by selecting, from the initial feature set, the feature subset that maximized predictive performance.
Results Using a random split function, 20% of the prediction dataset was set aside for validation, while the remaining 80% was divided into training and testing sets for ten-fold cross-validation. R2 and RMSE were chosen as regression metrics, and the F1 score as the classification metric. The RandomForest regression model showed the best predictive performance, with the smallest range of goodness of fit and root mean square error in ten-fold cross-validation, an R2 of 0.826, and an RMSE of 0.126. In the classification task, the random forest model outperformed most models, with a balanced F1 score of 0.809 on the validation set, slightly lower than that of the gradient boosting classification model. Prior to model optimization, correlation coefficient and feature importance analyses revealed the crucial role of the auxiliary variable "annual efficiency" in predictions. Based on this variable analysis, recursive feature elimination was used to select the optimal feature set for both the RandomForest regression and classification models. In the regression task, the optimal combination comprised eight features and yielded a validation-set R2 of 0.836. In the classification task, the random forest model's predictive performance grew relatively gradually, and the optimal combination comprised nine features, resulting in a validation F1 score of 0.823. After optimization, setting the threshold for the difference between RandomForestRegressor predictions and actual results to 30% identified only three outliers, accounting for 3.16% of the data. For the RandomForestClassifier model, the classification results showed a very low recall rate for sample 3, which contributed to the relatively lower F1 score.
Conclusion Comparative experiments on predictive performance led to the selection of the RandomForest model as the model to optimize. Recursive feature elimination was chosen for model optimization based on the analysis of variable impacts on efficiency prediction. The results demonstrate that machine learning can accurately predict employee efficiency. Because of limitations imposed by the experimental factory, parameter collection was restricted. Future efficiency prediction research could add more feature parameters to enhance model generalization. Additionally, considering the influence of time series, recurrent neural networks (RNNs) could be employed to model production efficiency. In the future, we will continue to optimize this predictive model and apply it to the scheduling and arrangement of actual apparel assembly line workers.
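The evaluation protocol described in the abstract (a 20% validation hold-out, ten-fold cross-validation on the remaining 80%, and recursive feature elimination around a random forest regressor) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the 15 columns stand in for the initial feature set, the hyperparameters are arbitrary, and scikit-learn's `RFECV` is used as one possible way to pick the subset size, since the abstract does not say how that choice was made.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_validate, train_test_split

# Synthetic stand-in for the factory data (assumption): 526 records, 15 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(526, 15))
# Hypothetical target: only the first three features carry signal.
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 2] + rng.normal(scale=0.1, size=526)

# Hold out 20% for validation; the remaining 80% is used for cross-validation.
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Arbitrary illustrative hyperparameters, not taken from the paper.
model = RandomForestRegressor(n_estimators=30, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Ten-fold cross-validation with the two regression metrics named in the abstract.
scores = cross_validate(
    model, X_dev, y_dev, cv=cv,
    scoring=("r2", "neg_root_mean_squared_error"),
)
cv_r2 = scores["test_r2"].mean()
cv_rmse = -scores["test_neg_root_mean_squared_error"].mean()

# Recursive feature elimination: drop the least important feature one at a
# time and keep the subset size with the best cross-validated R2.
selector = RFECV(model, step=1, cv=cv, scoring="r2", min_features_to_select=1)
selector.fit(X_dev, y_dev)

# Score the reduced model on the held-out validation set.
y_pred = selector.predict(X_val)
val_r2 = r2_score(y_val, y_pred)
val_rmse = float(np.sqrt(mean_squared_error(y_val, y_pred)))

print(f"CV R2={cv_r2:.3f}  CV RMSE={cv_rmse:.3f}")
print(f"selected {selector.n_features_} features: {np.flatnonzero(selector.support_)}")
print(f"validation R2={val_r2:.3f}  RMSE={val_rmse:.3f}")
```

The same pattern would apply to the classification task by swapping in `RandomForestClassifier` and an F1-based scorer; the numbers printed here come from the synthetic data and say nothing about the results reported above.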
garment production data; machine learning; employee efficiency; recursive feature elimination; flexible scheduling
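The quantile division of efficiency levels and the 30% difference threshold mentioned in the abstract can be illustrated with a small standard-library sketch. Two assumptions are made here that the abstract does not confirm: efficiency is split into four levels at the quartiles, and the 30% threshold is read as a relative error, |predicted - actual| / actual. The efficiency values are invented for illustration.

```python
from bisect import bisect_left
from statistics import quantiles


def quantile_levels(values, n_levels=4):
    """Map each value to a level 0..n_levels-1 using quantile cut points.

    Assumption: four levels (quartiles); the abstract does not state the count.
    """
    cuts = quantiles(values, n=n_levels, method="inclusive")
    # A value equal to a cut point falls into the lower level.
    return [bisect_left(cuts, v) for v in values]


def relative_outliers(actual, predicted, threshold=0.30):
    """Return indices where the relative prediction error exceeds the threshold.

    Assumption: the 30% difference is relative to the actual value,
    and actual values are non-zero.
    """
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(p - a) / abs(a) > threshold
    ]


# Invented efficiency values (fraction of target achieved), for illustration only.
actual = [0.42, 0.55, 0.61, 0.68, 0.74, 0.80, 0.88, 0.95]
predicted = [0.60, 0.50, 0.63, 0.70, 0.70, 0.85, 0.55, 0.90]

levels = quantile_levels(actual)
outliers = relative_outliers(actual, predicted)
outlier_share = len(outliers) / len(actual)

print(levels)    # [0, 0, 1, 1, 2, 2, 3, 3]
print(outliers)  # [0, 6]
print(f"{outlier_share:.2%} of samples exceed the threshold")
```

If the threshold were instead meant as an absolute difference in efficiency, the comparison in `relative_outliers` would simply become `abs(p - a) > threshold`.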