Performance Evaluation of ChatGPT on the National Physical Therapist Examination | Rigaku Ryoho Journal (理学療法ジャーナル), Vol. 58, No. 3

Short Report

Authors: 澤村彰吾¹, 尾藤貴宣², 安藤貴洋², 増田健人², 古桧山建吾³

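Affiliations: ¹Physical Therapy Course, Department of Rehabilitation, Heisei College of Health Sciences (平成医療短期大学); ²Department of Rehabilitation, Gifu University Hospital (岐阜大学附属病院); ³Occupational Therapy Course, Department of Rehabilitation, Heisei College of Health Sciences (平成医療短期大学)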

Pages: 363-366

Article Summary

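Abstract. [Purpose] This study aimed to evaluate the performance of ChatGPT-3.5 and its upgraded version, ChatGPT-4, on the National Physical Therapist Examination. [Methods] ChatGPT-3.5 and ChatGPT-4 were asked to generate answers to the questions of the 57th and 58th National Physical Therapist Examinations. Questions containing images and questions that the Ministry of Health, Labour and Welfare judged to be inappropriate were excluded. [Results] On the 57th examination, the correct answer rate was 47.6% (79/166 questions) for ChatGPT-3.5 and 80.7% (134/166 questions) for ChatGPT-4. On the 58th examination, the correct answer rate was 55.5% (96/173 questions) for ChatGPT-3.5 and 72.3% (125/173 questions) for ChatGPT-4. [Conclusion] ChatGPT-3.5 did not meet the passing criteria for either the 57th or the 58th examination, whereas ChatGPT-4 reached the passing criteria. However, when use in clinical or educational settings is considered, the accuracy of the generated answers needs to be checked and the information verified.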
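
The correct answer rates reported in the abstract can be re-derived from the stated correct/total counts. The following is a minimal Python sketch of that arithmetic; the dictionary layout and the names `RESULTS`, `accuracy_percent`, and `ACCURACY` are illustrative assumptions and are not taken from the paper.

```python
# Counts of (correct, total) as reported in the abstract.
# The data layout and all names below are hypothetical, for illustration only.
RESULTS = {
    ("57th", "ChatGPT-3.5"): (79, 166),
    ("57th", "ChatGPT-4"): (134, 166),
    ("58th", "ChatGPT-3.5"): (96, 173),
    ("58th", "ChatGPT-4"): (125, 173),
}


def accuracy_percent(correct: int, total: int) -> float:
    """Return the correct answer rate as a percentage, rounded to one decimal."""
    return round(100 * correct / total, 1)


# Recompute the rate for every exam/model pair.
ACCURACY = {key: accuracy_percent(c, t) for key, (c, t) in RESULTS.items()}

if __name__ == "__main__":
    for (exam, model), pct in ACCURACY.items():
        print(f"{exam} examination, {model}: {pct}%")
```

The recomputed values agree with the percentages given in the abstract (47.6%, 80.7%, 55.5%, and 72.3%).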

Journal Information

Publisher: Igaku-Shoin Ltd. (株式会社医学書院)

Electronic ISSN: 1882-1359

Print ISSN: 0915-0552
