Towards enhanced assessment question classification: a study using machine learning, deep learning, and generative AI
Journal
Connection Science
ISSN
0954-0091
Date Issued
2025-01-02
Author(s)
Mohammed Osman Gani
Saadat M. Alhashmi
Khondaker Sajid Alam
Anbuselvan Sangodiah
Khondaker Khaleduzzman
Chinnasamy Ponnusamy
DOI
10.1080/09540091.2024.2445249
Abstract
This study benchmarks the performance of machine learning (ML), deep learning (DL), and generative AI (GenAI) models in categorising assessment questions based on Bloom's Taxonomy. Previous studies have lacked comprehensive investigations into the performance of these approaches. Further, GenAI remains unexplored, offering a promising avenue for investigation. Therefore, we explore the effectiveness of various ML models by incorporating domain-specific term weighting and utilising word embeddings. The study also analyses the performance of Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) with and without bidirectional connections, as well as an approach that combines RNNs and CNNs. Furthermore, we fine-tune and evaluate several transformer-based models, and we assess the GenAI models text-davinci-003, gpt-3.5-turbo, PaLM2, and Gemini Pro in zero-shot classification settings. The results demonstrate that ML models outperformed DL models, achieving a best accuracy of 0.871 and a best F1 score of 0.872. Additionally, domain-specific term weighting is found to be superior to word embeddings. Most ML and DL models also performed better than GenAI models, which achieved a best accuracy of 0.618 and a best F1 score of 0.627. The results therefore suggest using ML models with domain-specific term weighting as benchmark models in future research.
File(s)
Name
j.png
Size
17.27 KB
Format
PNG
Checksum
MD5: 85f5e85fa8f8c13d7350540217a227b6
