Abstract
Objectives: To investigate the performance of Doctor AI in answering orthodontic frequently asked questions (FAQs) and to analyse its advantages and potential limitations.
Methods: Twenty validated FAQs were entered into ChatGPT-4 and DeepSeek-v3. The unmodified responses generated by the two models were assessed and compared by expert panels from English-speaking, Chinese-speaking, and bilingual institutions. Responses were rated across six dimensions: accuracy, comprehensiveness, detail, relevance, readability, and logic.
Results: Both AI models generated high-quality answers, scoring above 8/10 across all dimensions. ChatGPT outperformed DeepSeek in relevance (9.01 ± 0.33 vs. 8.69 ± 0.42; p = 0.005), while DeepSeek scored higher in comprehensiveness (9.21 ± 0.31 vs. 8.27 ± 0.65; p < 0.001). No significant differences were observed in accuracy, detail, readability, or logic (p > 0.100 for all). The word counts generated by ChatGPT (243.20 ± 58.27) and DeepSeek (257.35 ± 44.65) across 20 FAQs were similar (p = 0.223).
Conclusions: Doctor AI tools such as ChatGPT and DeepSeek can deliver high-quality responses to orthodontic FAQs. ChatGPT may provide more relevant answers, while DeepSeek offers greater comprehensiveness. Combining their outputs may improve patient understanding, but AI-generated information should supplement, not replace, guidance from qualified orthodontic professionals.