Academic Dishonesty or Academic Integrity? Using Natural Language Processing (NLP) Techniques to Investigate Positive Integrity in Academic Integrity Research

Abstract

Is academic integrity research presented from a positive integrity standpoint? This paper uses Natural Language Processing (NLP) techniques to explore a data set of 8,507 academic integrity papers published between 1904 and 2019.

Two main techniques are used to linguistically examine paper titles: (1) bigram (word pair) analysis and (2) sentiment analysis. The analysis finds that the three most common bigrams in paper titles are “academic integrity” (2.38%), “academic dishonesty” (2.06%) and “plagiarism detection” (1.05%). When only highly cited papers are considered, negative integrity bigrams dominate positive integrity bigrams. For example, the 100 most cited academic integrity papers of all time are three times more likely to have “academic dishonesty” in their titles than “academic integrity”. Similarly, sentiment analysis shows negative sentiment outweighing positive sentiment in the most cited papers.
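To make the two techniques concrete, the following is a minimal sketch in Python, not the paper's actual pipeline. It assumes that a bigram's percentage is the share of titles containing it at least once, and it uses a tiny hand-made word list for sentiment. The function names, the word lists and the sample titles are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; a real study would use an established sentiment lexicon.
POSITIVE = {"integrity", "honesty", "ethical"}
NEGATIVE = {"dishonesty", "cheating", "plagiarism", "misconduct"}


def tokenize(title):
    """Lowercase a title and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", title.lower())


def title_bigrams(title):
    """Return the adjacent word pairs (bigrams) in a title."""
    tokens = tokenize(title)
    return list(zip(tokens, tokens[1:]))


def bigram_shares(titles):
    """Share of titles containing each bigram (assumed definition of the percentages)."""
    counts = Counter()
    for title in titles:
        # set() ensures a bigram is counted at most once per title
        counts.update(set(title_bigrams(title)))
    return {" ".join(bg): n / len(titles) for bg, n in counts.items()}


def title_sentiment(title):
    """Positive word count minus negative word count; above 0 leans positive."""
    tokens = tokenize(title)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


# Invented sample titles, not taken from the data set.
titles = [
    "Promoting academic integrity in higher education",
    "Academic dishonesty among college students",
    "Plagiarism detection using text similarity",
    "Academic integrity and academic dishonesty: a review",
]

shares = bigram_shares(titles)
scores = [title_sentiment(t) for t in titles]
```

On these four invented titles, "academic integrity" and "academic dishonesty" each appear in half of the titles, and the last title scores zero because its positive and negative terms cancel out. Restricting `titles` to a highly cited subset would allow the same comparison the abstract describes.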

The history of academic integrity research is seen to place the field at a disadvantage due to negative portrayals of integrity. Despite this, analysis shows that change towards positive integrity is possible. The titles of papers by the ten most prolific academic integrity researchers are found to use positive terminology more often than not. This suggests an approach for emerging academic integrity researchers to follow.
