Our team developed a machine learning model to automatically score essays based on the Holistic Scoring Rubric.

Project period: May 2024 - June 2024

Project team:

Le Thi Minh Phuong (Team leader)
Vo Hoang Hoa Vien
Pham Le Tu Nhi
Huynh Tri Nhan
Hoang Trung Nam
Huynh Cao Khoi

Role:

Data processing
Modeling and evaluation

Tools: Python, NLTK, SpellChecker, LightGBM (LGBM)

Overview

This project was part of my course, “Intelligent Data Analysis”. In this course, I was introduced to the fundamental concepts of data analysis, various methods for conducting effective analysis, and how to develop an analytical mindset.

Together with my teammates, I participated in a Kaggle competition in which we tried to automatically score essays based on the Holistic Scoring Rubric. Our team developed a machine learning model to tackle this challenge.

In this project, my main tasks were:

Applying feature engineering and NLP techniques to extract semantic features from essays.
Analyzing the baseline’s performance and conducting experiments to improve the model’s accuracy.
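
To give a sense of what the feature engineering step can look like, here is a minimal sketch using only the Python standard library. It is not our actual pipeline, which relied on NLTK and SpellChecker. The function name `extract_features`, the tiny `KNOWN_WORDS` vocabulary, and the chosen features are all illustrative assumptions, and they are simple surface features rather than true semantic ones.

```python
import re

# Hypothetical mini-vocabulary standing in for a real spell-check dictionary.
KNOWN_WORDS = {"the", "cat", "sat", "on", "mat", "it", "was", "happy"}


def extract_features(essay, known_words=KNOWN_WORDS):
    """Return a dict of simple surface features for one essay (illustrative only)."""
    # Lowercased alphabetic tokens; a real pipeline would use a proper tokenizer.
    words = re.findall(r"[A-Za-z']+", essay.lower())
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    return {
        "word_count": n_words,
        "sentence_count": len(sentences),
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        # Vocabulary richness: unique words divided by total words.
        "type_token_ratio": len(set(words)) / n_words if n_words else 0.0,
        # Rough proxy for misspellings: tokens missing from the vocabulary.
        "unknown_word_count": sum(1 for w in words if w not in known_words),
    }


features = extract_features("The cat sat on the mat. It was hapy!")
print(features)
```

Each essay becomes one row of numeric features like these, which can then be passed to a gradient-boosted model such as LightGBM.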

Conclusion

In the end, our team built a competitive model that improved overall accuracy and contributed to a higher team ranking on the competition leaderboard.
