medRxiv

Performance of readers and an artificial intelligence tool for grading of radiographic knee osteoarthritis at prespecified thresholds: Statistical analysis plan

Mathias Willadsen Brejneboel, Mikael Boesen, Kay Geert A. Hermann, Edwin Oei, Huib Ruitenbeek, Katharina Ziegeler, Jacob J. Visser, Anders Lenskjold, Philip Hansen, Janus Uhd Nybing
doi: https://doi.org/10.1101/2024.03.13.24304202
Mathias Willadsen Brejneboel
1Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
  • For correspondence: mathiaswbrejne{at}outlook.com
Mikael Boesen
1Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
Roles: investigator
Kay Geert A. Hermann
2Department of Radiology, Charité Universitätsmedizin, Berlin, Germany
Roles: investigator
Edwin Oei
3Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
Roles: investigator
Huib Ruitenbeek
3Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
Roles: investigator
Katharina Ziegeler
2Department of Radiology, Charité Universitätsmedizin, Berlin, Germany
Roles: investigator
Jacob J. Visser
3Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
Roles: investigator
Anders Lenskjold
1Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
Roles: investigator
Philip Hansen
1Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
Roles: investigator
Janus Uhd Nybing
1Department of Radiology, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark
Roles: investigator

1. ABSTRACT

Background and rationale Knee osteoarthritis (OA) is a common disease characterized by reduced function, stiffness, and pain. This clinical diagnosis is commonly supported with radiography of the weight-bearing knee. Radiographic measures, such as the Kellgren-Lawrence (KL) grade, are used as eligibility criteria for clinical studies while others, such as the OARSI grades and minimal joint space width, are used as endpoints for structural OA progression. A higher preoperative KL grade has been correlated with better pain and functional outcomes after knee arthroplasty surgery. Consequently, the KL grade is a common requirement for approving knee arthroplasty among health insurance providers, and orthopedic surgeons commonly use it when determining knee arthroplasty candidacy.

Historically, a radiologist was required to draw on and grade radiographs of the knee to extract the features. With increasing computational power and the increased use of deep convolutional neural networks, off-the-shelf artificial intelligence (AI) tools have become available for automatic extraction of these features. They have received regulatory approval for commercialization, but it is apparent that more diligent external validation is required. Finally, as AI tools begin to mature, new versions are released. It is important to assess how these developments change the current performance of the tool.

Objectives The aim of this analysis is to evaluate the performance of a commercially available AI tool and of readers with different experience levels in orthopedic surgery and radiology at clinically relevant Kellgren-Lawrence grading system thresholds. Additionally, the performance of the AI tool for OARSI grades and patellar osteophytes will be evaluated across two versions of the AI tool.

Methods This study is a secondary analysis of the data from the AutoRayValid-RBknee study, a retrospective observer performance study. It consists of non-fixed-flexion radiographs acquired from the production picture archiving and communication systems (PACS) of three European centers. The primary outcome will be the difference in area under the receiver operating characteristic curve (AUC) between the readers and the AI tool at the prior authorization clinical criteria threshold (KL ≥ 3). Key secondary outcomes will be radiographic knee osteoarthritis (KL ≥ 2), osteoarthritis clinical trial inclusion (2 ≤ KL ≤ 3), and weight-loss trial inclusion (1 ≤ KL ≤ 3). The AUC of the readers will be computed using the summary receiver operating characteristic (SROC) approach as proposed by Oakden-Rayner et al. Further, the performance of the AI tool for grading ordinal OARSI grades will be evaluated using the ordinal ROC as proposed by Obuchowski et al., and the AUC will be used for estimating binary OARSI-grade and patellar osteophyte classification performance.
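The prespecified targets above can be illustrated with a minimal Python sketch that turns reference KL grades into binary labels and computes an empirical AUC. This is not the authors' analysis code: the function names, the use of the AI tool's ordinal grade as the score, and the example grades are all assumptions made for illustration, and the SROC approach for readers is not shown.

```python
from typing import Callable, Dict, List, Sequence

# Prespecified KL-grade targets named in the Methods paragraph.
# The dictionary keys are hypothetical labels chosen for this sketch.
TARGETS: Dict[str, Callable[[int], bool]] = {
    "prior_authorization": lambda kl: kl >= 3,
    "radiographic_oa": lambda kl: kl >= 2,
    "oa_trial_inclusion": lambda kl: 2 <= kl <= 3,
    "weight_loss_trial_inclusion": lambda kl: 1 <= kl <= 3,
}


def binarize(kl_grades: Sequence[int], target: str) -> List[int]:
    """Map KL grades (0-4) to 1/0 labels for one prespecified target."""
    rule = TARGETS[target]
    return [int(rule(grade)) for grade in kl_grades]


def empirical_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a positive case scores higher than a negative case.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example grades, not study data.
reference_kl = [0, 1, 2, 3, 4, 3]
ai_kl = [0, 2, 2, 3, 4, 2]  # assumption: the ordinal grade is used as the score

labels = binarize(reference_kl, "prior_authorization")
auc_ai = empirical_auc(labels, ai_kl)
```

For the band-type targets (2 ≤ KL ≤ 3 and 1 ≤ KL ≤ 3) the raw ordinal grade is not a monotone score, so a real analysis would need a score suited to that target; the sketch only shows how the labels are formed.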

Population Patients with knee pain referred for radiography on suspicion of knee osteoarthritis.

Readers Each center will recruit four readers from across radiology and orthopedic surgery, one in-training and one board-certified for each specialty.

AI tool RBknee-2.2.0 (CE version, KL-grading, OARSI grading, patellar osteophytes) and RBknee-2.1.0 (CE version, KL-grading, OARSI grading, patellar osteophytes) will be used to perform the change impact analysis of advancing product development.

Reference test The reference standard will be determined by the majority vote of three readers, one from each participating hospital, each a board-certified musculoskeletal radiology consultant with expertise in clinical and research evaluation of knee osteoarthritis, including extensive experience using the KL grade.
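A majority vote over three reference readers can be sketched as follows. The abstract does not say how a case is resolved when all three readers assign different grades, so this hypothetical helper simply returns None for such cases and leaves the resolution rule open.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(grades: Sequence[int]) -> Optional[int]:
    """Return the grade chosen by more than half of the readers.

    Returns None when no grade has a strict majority (for example, three
    readers giving three different grades); how such cases are handled
    is not stated in the source.
    """
    grade, count = Counter(grades).most_common(1)[0]
    return grade if count > len(grades) / 2 else None


# Made-up example grades, not study data.
agreed = majority_vote([2, 2, 3])    # two of three readers agree
unresolved = majority_vote([1, 2, 3])  # no strict majority
```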

Sample size Not applicable as this is a secondary analysis.

Framework This is a diagnostic test accuracy study assessing the performance of a commercially available AI tool for radiographic evaluation of knee osteoarthritis according to established grading systems. Additionally, change impact analysis will be performed where multiple versions of the AI tool are available.

Confidence intervals and P values All confidence intervals will be 95% intervals, and P values will be assessed at a significance level (alpha) of 5%.

Multiplicity No explicit multiplicity correction will be performed. Instead, a hierarchical approach will be taken based on tabular order of the tested hypotheses in Table 3.
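One common way to implement a hierarchical approach is fixed-sequence testing: hypotheses are tested in a prespecified order at the full alpha, and formal testing stops at the first non-significant result. The sketch below assumes that interpretation. Table 3 is not reproduced on this page, so the hypothesis order and the P values in the example are hypothetical placeholders, not study results.

```python
from typing import List, Sequence, Tuple


def fixed_sequence(
    ordered_p_values: Sequence[Tuple[str, float]], alpha: float = 0.05
) -> List[Tuple[str, bool]]:
    """Test hypotheses in the given order, each at the full alpha.

    Once one hypothesis fails to reach significance, every later
    hypothesis is reported as not significant regardless of its P value.
    """
    results = []
    gate_open = True
    for name, p_value in ordered_p_values:
        significant = gate_open and p_value < alpha
        results.append((name, significant))
        if not significant:
            gate_open = False
    return results


# Hypothetical order and made-up P values for illustration only.
example = fixed_sequence(
    [("KL >= 3", 0.01), ("KL >= 2", 0.20), ("2 <= KL <= 3", 0.001)]
)
```

In this example the third hypothesis is not declared significant despite its small P value, because the second one closed the gate.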

Statistical software R version 4.2.2 (or newer).

Competing Interest Statement

One author, Mikael Boesen, is a medical advisor for and shareholder of Radiobotics ApS.

Funding Statement

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 954221 for the EIC SME Instrument project AutoRay. The work only reflects the authors' view and the European Commission is not responsible for any use that may be made from the information it contains.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Danish Patient Safety Authority waived ethical approval for this work. The IRB of Charité Universitätsmedizin Berlin waived ethical approval for this work. The IRB of Erasmus Medical Center waived ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • Revision, 2024 May 27. Expanded knowledge on the clinical relevance of the prior authorization clinical criteria at the Kellgren-Lawrence threshold of KL ≥ 3 has led to a thorough revision of this SAP. As originally planned, we will still evaluate the current and previous versions of the AI tool for the full set of semi-quantitative grades that it can analyze. Additions focus on the above-mentioned clinical threshold. To better estimate the real-world value of the AI tool, we have added a comparison of the AI tool and the readers in the original AutoRayValid-RBknee study. This comparison will be based on the SROC approach proposed by Oakden-Rayner et al., where each reader is treated as a diagnostic test study in a meta-analysis. The KL ≥ 3 threshold will be the primary outcome of the study, with key secondary outcomes being other clinically relevant thresholds: KL ≥ 2 (radiographic knee osteoarthritis threshold), 2 ≤ KL ≤ 3 (inclusion criteria in osteoarthritis trials), and 1 ≤ KL ≤ 3 (inclusion criteria in osteoarthritis-related weight-loss trials). We removed the analysis of the joint space width measurements. These will be deferred to future research in which fixed-flexion images will be included as well.

Data Availability

All data produced in the present study are available upon reasonable request to the corresponding author.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted May 27, 2024.