Russian scientists have created a computer model that predicts students' academic performance with 94% accuracy from their posts on social networks, and are now adapting it to detect depressive conditions and assess the psychological health of adolescents and students at the level of an educational institution. The results were published in the journal EPJ Data Science.
“In our new work, we tried to predict the performance of students in schools and universities from their posts on VKontakte and Twitter. Learning ability is a very complex human characteristic. It is influenced not only by character traits but also by psychological well-being, for example, the presence of various disorders. Alas, the latter is not measured at the level of an institution, unlike academic success, data on which are also publicly available. We are developing a system that would be able to identify psychological difficulties, in particular depression, from a person’s activity on a social network. But one cannot be sure how such a model will perform unless it is first validated on characteristics for which information is widely available, such as academic performance,” says project leader Ivan Smirnov, PhD, head of the Laboratory of Computational Social Sciences at the Institute of Education of the Higher School of Economics.
At the first stage, the scientists used posts from the open VKontakte pages of 2,468 subjects who took the PISA test in 2012, which assesses literacy and the ability to apply knowledge in practice. The researchers trained the model to map words from posts to “vectors”: each word has its place in a space of meanings. The model was then trained to distinguish between the posts of students with high and low PISA scores. After that, the system was applied to the posts of students from hundreds of the country’s largest universities, and its results were compared with official data showing the average Unified State Exam (USE) scores of applicants and graduates of each institution, as well as general information on academic performance.
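The pipeline described above (words mapped to vectors, a model trained on labelled posts, predictions aggregated per institution) can be illustrated with a toy sketch. This is not the authors' model: the two-dimensional word vectors, the example posts, the nearest-centroid classifier and every function name below are assumptions chosen only to keep the example small and runnable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical 2-D word vectors. A real system would learn embeddings
# with many more dimensions from a large text corpus.
WORD_VECTORS = {
    "physics": (0.9, 0.1),
    "literature": (0.8, 0.2),
    "think": (0.7, 0.3),
    "horoscope": (0.1, 0.9),
    "accident": (0.2, 0.8),
    "army": (0.15, 0.85),
}

def embed(post):
    """Average the vectors of the known words in a post (None if there are none)."""
    vecs = [WORD_VECTORS[w] for w in post.lower().split() if w in WORD_VECTORS]
    if not vecs:
        return None
    return tuple(mean(dim) for dim in zip(*vecs))

def train_centroids(labelled_posts):
    """Nearest-centroid stand-in for the trained model: one mean vector per label."""
    groups = defaultdict(list)
    for post, label in labelled_posts:
        vec = embed(post)
        if vec is not None:
            groups[label].append(vec)
    return {label: tuple(mean(dim) for dim in zip(*vecs))
            for label, vecs in groups.items()}

def predict(post, centroids):
    """Return the label whose centroid is closest to the post embedding."""
    vec = embed(post)
    if vec is None:
        return None

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def share_high_by_institution(posts_by_institution, centroids):
    """Share of posts labelled 'high' per institution, skipping unscorable posts."""
    shares = {}
    for institution, posts in posts_by_institution.items():
        labels = [predict(p, centroids) for p in posts]
        labels = [label for label in labels if label is not None]
        if labels:
            shares[institution] = sum(label == "high" for label in labels) / len(labels)
    return shares

# Invented training posts with invented "high"/"low" score labels.
TRAIN = [
    ("I think physics is beautiful", "high"),
    ("literature and physics", "high"),
    ("my horoscope today", "low"),
    ("army and accident news", "low"),
]
centroids = train_centroids(TRAIN)
```

The per-institution shares returned by `share_high_by_institution` are the kind of aggregate that could then be compared with official figures such as average exam scores.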
The model revealed that the texts of “excellent students” tend to be longer, are written in richer language, and contain long and foreign words. Such students often discuss physics and literature and tend to use expressions that describe the thought process. Errors, emoticons, exclamations and words written in capital letters are typical of “poor” students, who often talk about horoscopes, military service and road accidents. The accuracy of the model was 94%. The new approach may help identify depression that affects academic achievement. The results also showed once again how vulnerable user privacy on social networks is.
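The surface signals mentioned above (text length, long words, capitalised words, exclamations, emoticons) are simple to count. The sketch below is only an illustration: the function name, the emoticon pattern and the eight-letter threshold for a “long” word are assumptions, not details taken from the study.

```python
import re

# Rough pattern for a few Western-style emoticons such as :) ;-( :D
EMOTICON = re.compile(r"[:;=]-?[)(DPp]")

def surface_features(post, long_word_len=8):
    """Count simple surface signals in a post; thresholds are illustrative."""
    # Remove emoticons first so that the "D" in ":D" is not counted as a word.
    text = EMOTICON.sub(" ", post)
    # Letters only; the pattern is Unicode-aware, so Cyrillic words match too.
    words = re.findall(r"[^\W\d_]+", text)
    n = len(words)
    return {
        "n_words": n,
        "avg_word_len": sum(len(w) for w in words) / n if n else 0.0,
        "long_word_share": sum(len(w) >= long_word_len for w in words) / n if n else 0.0,
        "caps_word_share": sum(len(w) > 1 and w.isupper() for w in words) / n if n else 0.0,
        "exclamations": post.count("!"),
        "emoticons": len(EMOTICON.findall(post)),
    }
```

For instance, `surface_features("WOW this is SO cool!!! :) :D")` counts five words, two of them in capitals, three exclamation marks and two emoticons.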