Abstract
Background: Real-time computer vision-based artificial intelligence (CV-AI) systems for surgical video analysis are rapidly advancing. Current evaluation strategies and clinical-readiness reporting, however, remain inconsistent. This scoping review mapped contemporary CV-AI task domains, performance metrics, and evidence of readiness for real-time intraoperative deployment within general surgery.
Methods: This study followed the Joanna Briggs Institute methodology for scoping reviews and was reported in accordance with PRISMA-ScR. Eligible studies were identified through a systematic literature search of the MEDLINE, Embase, PubMed, and Scopus databases. All studies published between 1 June 2015 and 1 June 2025 were eligible.
Results: A total of 490 articles were screened, with 113 studies meeting the inclusion criteria after full-text review. Retrospective feasibility analyses predominated, with only 13 studies (12%) evaluating real-time intraoperative integration. Five task domains were identified: phase recognition, anatomy identification, action-event recognition, instrument tracking, and skill assessment. Forty-one unique performance metrics were reported, with predominant use of discrimination-style summary measures (e.g., accuracy, recall, F1 score) and comparatively sparse reporting of class imbalance, boundary-aware measures (e.g., Hausdorff distance), or real-time workflow factors (e.g., latency/stability, interface design, surgeon feedback). External validation was described in 13 studies (12%). Nine studies (8%) referenced artificial intelligence-specific reporting frameworks.
Conclusion: Surgical CV-AI is advancing technically but remains predominantly at an early feasibility stage. Variability in metric application and a scarcity of real-time clinical evaluation constrain comparability, applicability, and widespread adoption. Standardised metrics, evaluation frameworks, prospective clinical trials, and collaborative end-user engagement are critical to translate conceptual promise into reliable real-time decision-support tools that support surgeon judgement and integrate seamlessly into routine operative workflows.