Evaluating an Artificial Intelligence (AI) Model Designed for Education to Identify Its Accuracy: Establishing the Need for Continuous AI Model Updates
Article
Article Title | Evaluating an Artificial Intelligence (AI) Model Designed for Education to Identify Its Accuracy: Establishing the Need for Continuous AI Model Updates |
---|---|
ERA Journal ID | 200411 |
Article Category | Article |
Authors | Verma, Navdeep, Getenet, Seyum, Dann, Christopher and Shaik, Thanveer |
Journal Title | Education Sciences |
Journal Citation | 15 (4) |
Article Number | 403 |
Number of Pages | 22 |
Year | 2025 |
Publisher | MDPI AG |
Place of Publication | Switzerland |
ISSN | 2227-7102 |
Digital Object Identifier (DOI) | https://doi.org/10.3390/educsci15040403 |
Web Address (URL) | https://www.mdpi.com/2227-7102/15/4/403 |
Abstract | The growing popularity of online learning brings with it inherent challenges that must be addressed, particularly in enhancing teaching effectiveness. Artificial intelligence (AI) offers potential solutions by identifying learning gaps and providing targeted improvements. However, to ensure their reliability and effectiveness in educational contexts, AI models must be rigorously evaluated. This study aimed to evaluate the performance and reliability of an AI model designed to identify the characteristics and indicators of engaging teaching videos. The research employed a design-based approach, incorporating statistical analysis to evaluate the AI model’s accuracy by comparing its assessments with expert evaluations of teaching videos. Multiple metrics were employed, including Cohen’s Kappa, Bland–Altman analysis, the Intraclass Correlation Coefficient (ICC), and Pearson/Spearman correlation coefficients, to compare the AI model’s results with those of the experts. The findings indicated low agreement between the AI model’s assessments and those of the experts. Cohen’s Kappa values were low, suggesting minimal categorical agreement. Bland–Altman analysis showed moderate variability with substantial differences in results, and both Pearson and Spearman correlations revealed weak relationships, with values close to zero. The ICC indicated moderate reliability in quantitative measurements. Overall, these results suggest that the AI model requires continuous updates to improve its accuracy and effectiveness. Future work should focus on expanding the dataset and utilising continual learning methods to enhance the model’s ability to learn from new data and improve its performance over time. |
Keywords | video conferencing; AI; teachers’ movements; design-based research; teachers’ behaviours; online student engagement |
Article Publishing Charge (APC) Funding | Other |
Contains Sensitive Content | Does not contain sensitive content |
ANZSRC Field of Research 2020 | 390399. Education systems not elsewhere classified |
Byline Affiliations | School of Education |
https://research.usq.edu.au/item/zwx63/evaluating-an-artificial-intelligence-ai-model-designed-for-education-to-identify-its-accuracy-establishing-the-need-for-continuous-ai-model-updates
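The abstract compares AI and expert assessments using agreement metrics. As a rough illustration of what three of them measure, the sketch below computes Cohen's Kappa, Bland–Altman bias with 95% limits of agreement, and a Pearson correlation using only the Python standard library. It is not the authors' analysis code: the function names and the sample labels are hypothetical, and the ICC and Spearman correlation are omitted for brevity.

```python
from collections import Counter
from statistics import mean, stdev


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def bland_altman(scores_a, scores_b):
    """Return (bias, lower limit, upper limit) with the usual 1.96 * SD limits."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical labels for six videos (illustrative only, not study data).
ai_labels = ["present", "absent", "present", "present", "absent", "absent"]
expert_labels = ["present", "present", "absent", "present", "absent", "present"]
kappa = cohens_kappa(ai_labels, expert_labels)
```

With these made-up labels the raters agree on half of the items, which is exactly the rate expected by chance, so the kappa is zero. This is the kind of minimal categorical agreement the abstract reports, even though raw agreement of 50% might look moderate at first glance.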