Abstract
This study develops an early warning system for higher education institutions that predicts the academic performance of first-year undergraduate students. The factors that significantly affect first-year student success are identified and discussed so that they can inform policy development by the relevant bodies. The dataset used in the experimental analyses contains records of 11,698 first-year students. The problem is formulated as a binary classification task: predicting whether a student will be successful or unsuccessful at the end of the first year. A total of 69 input variables are used in the models. Naive Bayes, decision tree and random forest algorithms are compared in terms of predictive performance. The random forest models outperformed the others, reaching 90.2% accuracy. The findings show that models that include the fall semester CGPA variable performed substantially better. The student’s programme name and university placement exam score are identified as the next most significant variables. A critical discussion based on the findings is provided. The developed model may be used as an early warning system so that, after the second week of the spring semester, necessary actions can be taken for students predicted to be unsuccessful, with the aim of increasing their success and preventing attrition.
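
The comparison of the three classifiers described above could be set up roughly as follows. This is only an illustrative sketch, not the authors' pipeline: it assumes scikit-learn, replaces the student records with synthetic data of the same width (69 input variables), and uses an arbitrary 70/30 train/test split and default hyperparameters, none of which are stated in the abstract. The accuracies it prints therefore say nothing about the reported 90.2%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the student data (assumption): 69 input variables
# and a binary label (1 = successful, 0 = unsuccessful).
X, y = make_classification(
    n_samples=2000, n_features=69, n_informative=10, random_state=0
)

# Hold out 30% of the records for evaluation (assumed split, stratified by label).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three algorithm families named in the abstract, with default settings.
models = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and record its test accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Report the models from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

With real data, the categorical inputs (for example the programme name) would need to be encoded before fitting, and accuracy alone may be misleading if the successful and unsuccessful classes are imbalanced.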