Comparison of artificial intelligence systems in answering prosthodontics questions from the dental specialty exam in Turkey
Main Authors:
Format: Article
Language: English
Published: Elsevier, 2025-07-01
Series: Journal of Dental Sciences
Online Access: http://www.sciencedirect.com/science/article/pii/S199179022500025X
Summary: Background/purpose: Artificial intelligence (AI) is increasingly vital in dentistry, supporting diagnostics, treatment planning, and patient education. However, AI systems face challenges, especially in delivering accurate information within specialized dental fields. This study aimed to evaluate the performance of seven AI-based chatbots (ChatGPT-3.5, ChatGPT-4, Gemini, Gemini Advanced, Claude AI, Microsoft Copilot, and Smodin AI) in correctly answering prosthodontics questions from the Dental Specialty Exam (DUS) in Turkey.
Materials and methods: The dataset consists of 128 multiple-choice prosthodontics questions administered between 2012 and 2021 in the DUS, a national exam conducted in Turkey by the Student Selection and Placement Center (ÖSYM). Chatbot performance was assessed by categorizing the questions as case-based or knowledge-based.
Results: ChatGPT-4 achieved the highest overall accuracy (75.8%), while Gemini AI had the lowest (46.1%); Gemini AI also gave more incorrect answers (69) than correct ones (59). ChatGPT-4 and ChatGPT-3.5 were significantly more accurate on knowledge-based questions than on case-based ones (p < 0.05). On case-based questions, Gemini and Gemini Advanced had the lowest accuracy (36.4%), while the other chatbots averaged 45.5%. On knowledge-based questions, ChatGPT-4 performed best (78.6%) and Gemini AI worst (47%).
Conclusion: ChatGPT-4 excelled in knowledge-based prosthodontic questions, showing potential to enhance dental education through personalized learning and clinical reasoning support. However, its limitations in case-based scenarios highlight the need for optimization to better handle complex clinical situations. These findings suggest that AI models can contribute meaningfully to dental education and clinical practice.
ISSN: 1991-7902
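The abstract reports p < 0.05 differences between knowledge-based and case-based accuracy but does not state which statistical procedure was used. The following is a minimal, illustrative Python sketch of one way such a comparison could be carried out, assuming a chi-square test of independence on per-question-type correct/incorrect counts; both the counts and the choice of test are hypothetical placeholders, not data or methods taken from the paper.

```python
# Illustrative sketch only: compare a chatbot's accuracy on knowledge-based vs.
# case-based questions with a chi-square test of independence.
# The counts below are hypothetical placeholders, not values from the study.
from scipy.stats import chi2_contingency

# 2x2 contingency table: rows = question type, columns = (correct, incorrect)
table = [
    [91, 25],  # knowledge-based: correct, incorrect (hypothetical)
    [6, 6],    # case-based: correct, incorrect (hypothetical)
]

# chi2_contingency applies Yates' continuity correction by default for 2x2 tables
chi2, p_value, dof, expected = chi2_contingency(table)

for label, (correct, incorrect) in zip(("knowledge-based", "case-based"), table):
    total = correct + incorrect
    print(f"{label}: {correct}/{total} correct ({correct / total:.1%})")

print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```

A table like this could be built separately for each chatbot; with small case-based counts, Fisher's exact test would be a reasonable alternative to the chi-square approximation.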