EVALUATION OF THE POTENTIAL OF AN ARTIFICIAL INTELLIGENCE LANGUAGE LEARNING ALGORITHM IN THE INTERPRETATION OF RADIOLOGIC REPORTS


Çelik S., Baysal O., Seki U., Sinanoğlu E. A.

2. GENÇ ORAD SEMPOZYUMU, Adana, Turkey, 5 - 9 March 2024, pp. 28 (Summary Text)

  • Publication Type: Conference Paper / Summary Text
  • City: Adana
  • Country: Turkey
  • Page Numbers: pp.28
  • Kocaeli University Affiliated: Yes

Abstract

The Success of Natural Language Processing Artificial Intelligence Algorithms in Interpreting Radiological Reports

Introduction-Objective:

Artificial Intelligence Natural Language Processing Algorithms (NLPAs), already widely used across many sectors, are finding growing application in dentistry. One such NLPA, ChatGPT, is used in education, research, and dental practice. The aim of this study is to evaluate the effectiveness of two versions of ChatGPT in diagnosing and interpreting cone-beam computed tomography (CBCT) reports.

Material and Methods:

Ten cases were selected from the CBCT report archive of the Department of Oral and Maxillofacial Radiology, Kocaeli University Faculty of Dentistry. Personal information, preliminary diagnoses, differential diagnoses, and recommendations were removed from these reports. ChatGPT-3.5 and ChatGPT-4 were prompted to provide a radiological preliminary diagnosis and differential diagnoses for each case based on the remaining information. The preliminary diagnosis sections were then restored to the same reports, and both NLPA versions were asked to explain the reports to a layperson without medical knowledge. The responses of both NLPA versions were evaluated by Oral, Dental, and Maxillofacial Radiology specialists using a Likert scale.
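
A minimal sketch of how the two-step prompting described above could be reproduced programmatically, assuming the OpenAI Python SDK and the API model names gpt-3.5-turbo and gpt-4; the study itself queried the ChatGPT interface directly, so the prompts, the findings text, and the ask helper below are illustrative assumptions rather than the authors' protocol:

    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    # Hypothetical de-identified CBCT findings text (placeholder, not study data).
    findings = (
        "Well-defined unilocular radiolucency associated with the crown of an "
        "impacted mandibular third molar."
    )

    def ask(model: str, task: str) -> str:
        """Send one report-related task to the given model and return its answer."""
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system",
                 "content": "You assist with oral and maxillofacial radiology reports."},
                {"role": "user", "content": f"{task}\n\nFindings:\n{findings}"},
            ],
        )
        return response.choices[0].message.content

    for model in ("gpt-3.5-turbo", "gpt-4"):
        # Step 1: preliminary and differential diagnoses from the findings alone.
        print(model, ask(model, "Give a radiological preliminary diagnosis and differential diagnoses."))
        # Step 2: lay explanation once the preliminary diagnosis is part of the report.
        print(model, ask(model, "Explain this report to a patient with no medical knowledge."))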

Results:

Diagnostic and explanatory scores for ChatGPT-3.5 and ChatGPT-4 on the selected CBCT cases were compared using the Mann-Whitney U test. The mean diagnostic score was 5.80 for ChatGPT-3.5 and 15.20 for ChatGPT-4, a statistically significant difference (p < 0.001). There was no statistically significant difference between the two NLPA versions in mean explanatory scores (p = 0.143).
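
A between-version comparison of this kind can be computed with a standard Mann-Whitney U implementation; the sketch below uses scipy.stats.mannwhitneyu with placeholder Likert-based scores (illustrative values only, not the study's data, of which only the group means are reported above):

    from scipy.stats import mannwhitneyu

    # Placeholder Likert-based diagnostic scores for ten cases (illustrative values
    # only, not the study's data; the abstract reports only the group means).
    scores_gpt35 = [5, 6, 4, 7, 5, 6, 5, 7, 6, 7]
    scores_gpt4 = [14, 16, 15, 17, 13, 16, 15, 14, 16, 16]

    # Two-sided Mann-Whitney U test comparing the two versions' scores.
    stat, p_value = mannwhitneyu(scores_gpt35, scores_gpt4, alternative="two-sided")
    print(f"U = {stat}, p = {p_value:.4f}")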

Discussion-Conclusion:

In the interpretation and evaluation of CBCT reports by NLPAs, the ChatGPT-4 version was found to be more successful. Although there was no significant difference in the ability to explain CBCT reports to patients, both versions achieved satisfactory scores, averaging 35/50 for ChatGPT-3.5 and 37.7/50 for ChatGPT-4. The strong performance of the newer NLPA version suggests that, as the technology advances, NLPAs will become valuable tools in radiology reporting, as in many other fields.

 Keywords: Natural Language Processing Artificial Intelligence Algorithms, Cone-Beam Computed Tomography, ChatGPT