2. GENÇ ORAD SEMPOZYUMU, Adana, Turkey, 5 - 09 March 2024, pp.28, (Summary Text)
The Success of Natural Language Processing Artificial
Intelligence Algorithms in Interpreting Radiological Reports
Introduction-Objective:
Artificial Intelligence Natural Language Processing Algorithms (NLPAs), widely
used across various sectors, are increasingly applied in dentistry as well.
One such NLPA, ChatGPT, is seeing growing use in education, research, and
dental practice. The aim of this study is to evaluate the effectiveness of two
versions of ChatGPT in diagnosing and interpreting cone-beam computed
tomography (CBCT) reports.
Material and Methods:
Ten cases were selected from the archive of CBCT reports at
the Department of Oral and Maxillofacial Radiology, Kocaeli University Faculty
of Dentistry. Personal information, preliminary diagnosis, differential
diagnosis, and recommendations were removed from these reports. ChatGPT-3.5
and ChatGPT-4 were prompted to provide radiological preliminary and
differential diagnoses for these cases based on the provided information.
Subsequently, the preliminary diagnosis sections were added to the same
reports, and both NLPA versions were asked to explain these reports to a
layperson without medical knowledge. The responses from both NLPA versions
were scored by Oral, Dental, and Maxillofacial Radiology specialists using a
Likert scale.
Results:
Scores for the diagnostic and explanatory responses of ChatGPT-3.5 and
ChatGPT-4 were compared using the Mann-Whitney U test. For diagnostic
capacity, ChatGPT-3.5 averaged 5.80 points and ChatGPT-4 averaged 15.20, a
statistically significant difference (p < 0.001).
There was no statistically significant difference in the average scores for
explanatory capacity between the two NLPA versions (p = 0.143).
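As an illustration of the comparison described above, the following is a
minimal Python sketch of the rank-sum computation behind the Mann-Whitney U
test. The function names and the scores are hypothetical and are not the
study's data. One observation, offered as an assumption rather than something
the abstract states: if each version received one score per case, the mean
ranks of two groups of ten must sum to 21, and the reported 5.80 and 15.20 do
sum to 21.00, so those figures may be mean ranks rather than raw Likert
averages.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # i and j are 0-based positions
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b, mean_rank_a, mean_rank_b) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    return u_a, u_b, rank_sum_a / n_a, rank_sum_b / n_b


# Hypothetical Likert scores for ten cases per version (NOT the study's data).
scores_v35 = [1, 2, 2, 1, 3, 2, 1, 2, 3, 2]
scores_v4 = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]

u_35, u_4, mr_35, mr_4 = mann_whitney_u(scores_v35, scores_v4)
print("U statistics:", u_35, u_4)
print("Mean ranks:", mr_35, mr_4)
```

The sketch stops at the U statistics and mean ranks; a statistics package
would normally be used to obtain the p-value as well.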
Discussion-Conclusion:
ChatGPT-4 was more successful than ChatGPT-3.5 in interpreting and evaluating
CBCT reports. Although there was no significant difference in
the ability to explain CBCT reports to patients, both versions yielded
satisfactory scores, with an average of 35/50 for ChatGPT-3.5 and 37.7/50 for
ChatGPT-4. The stronger performance of the newer version suggests that, as
the technology advances, NLPAs will be valuable tools in radiology reporting
processes, as in many other fields.
Keywords:
Natural Language Processing Artificial Intelligence Algorithms, Cone-Beam
Computed Tomography, ChatGPT