Objectives: To evaluate the interobserver reliability of the Kellgren–Lawrence (KL) classification among orthopedic surgeons and to compare their assessments with artificial intelligence (AI) systems.

Methods: One hundred anteroposterior weight-bearing knee radiographs from patients aged 65 years and older were retrospectively analyzed. Four orthopedic surgeons and two AI systems independently graded all radiographs according to the KL classification, blinded to clinical information and to each other's evaluations. Interobserver agreement was assessed using quadratically weighted Cohen's kappa (κ) and intraclass correlation coefficients (ICC).

Results: Interobserver agreement among the orthopedic surgeons demonstrated good reliability (mean weighted κ = 0.780; ICC = 0.784). Agreement between the orthopedic consensus and ChatGPT was moderate (κ = 0.481), whereas Gemini showed moderate-to-good agreement (κ = 0.561). Agreement between the two AI systems was also moderate (κ = 0.484).

Conclusion: The KL classification demonstrated good reliability among orthopedic surgeons. The AI systems showed only moderate agreement with orthopedic experts and may serve as supportive screening tools rather than as diagnostic replacements.

Keywords: Artificial intelligence
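For readers unfamiliar with the statistic, the quadratically weighted kappa used in the Methods penalizes rater disagreements by the squared distance between ordinal grades, so a KL 1 vs. KL 2 disagreement costs far less than KL 0 vs. KL 4. The sketch below is illustrative only and is not taken from the study; the rater data are hypothetical, and the computation uses scikit-learn's cohen_kappa_score.

```python
# Minimal sketch: quadratically weighted Cohen's kappa for two raters
# assigning ordinal KL grades (0-4). Data are hypothetical.
from sklearn.metrics import cohen_kappa_score

rater_a = [0, 1, 2, 2, 3, 4, 1, 2, 3, 4]  # hypothetical KL grades, rater A
rater_b = [0, 1, 2, 3, 3, 4, 2, 2, 3, 3]  # hypothetical KL grades, rater B

# weights="quadratic" applies the (i - j)^2 penalty to off-diagonal cells
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"Quadratically weighted kappa: {kappa:.3f}")
```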