The Archives of Bone and Joint Surgery, Volume 13, Issue 4, pp. 212-222

English title: From Algorithms to Academia: An Endeavor to Benchmark AI-Generated Scientific Papers against Human Standards
English abstract:
Objectives: This study quantitatively investigates the accuracy of text generated by AI large language models and compares its readability and likelihood of acceptance by a scientific journal with those of human-authored papers on the same topics.
Methods: The study consisted of two papers written by ChatGPT, two papers written by Assistant by scite, and two papers written by humans. Six independent reviewers, blinded to the authorship of each paper, graded each subsection on a scale of 1 to 4. Each reviewer was also asked to guess whether the paper was written by a human or by AI and to explain their reasoning. The study authors additionally graded each AI-generated paper on the factual accuracy of its claims and citations.
Results: The human-written calcaneus fracture paper received the highest score (3.70/4), followed by the Assistant-written calcaneus fracture paper (3.02/4), the human-written ankle osteoarthritis paper (2.98/4), the ChatGPT calcaneus fracture paper (2.89/4), the ChatGPT ankle osteoarthritis paper (2.87/4), and the Assistant ankle osteoarthritis paper (2.78/4). The human calcaneus fracture paper was rated statistically significantly higher than the ChatGPT calcaneus fracture paper (P = 0.028) and the Assistant calcaneus fracture paper (P = 0.043). For factual accuracy, the ChatGPT ankle osteoarthritis review was 100% accurate, the ChatGPT calcaneus fracture review was 97.46% accurate, the Assistant calcaneus fracture review was 95.56% accurate, and the Assistant ankle osteoarthritis review was 94.98% accurate. For citation accuracy, the ChatGPT ankle osteoarthritis paper was 90% accurate, the ChatGPT calcaneus fracture paper was 69.23% accurate, the Assistant calcaneus fracture paper was 39.68% accurate, and the Assistant ankle osteoarthritis paper was 35.14% accurate.
Conclusion: While AI holds the promise of enhancing knowledge sharing, it must be used responsibly and in conjunction with comprehensive fact-checking procedures to maintain the integrity of scientific discourse. Level of evidence: III
English keywords: Artificial intelligence, ChatGPT, Large Language Models, Natural Language Processing, Prompt Engineering

Authors | Jackson Woodrow
Foot & Ankle Research and Innovation Lab (FARIL), Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA


| Nour Nassour
Foot & Ankle Research and Innovation Lab (FARIL), Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA


| John Kwon
Foot & Ankle Research and Innovation Lab (FARIL), Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA


| Soheil Ashkani-Esfahani
Foot & Ankle Research and Innovation Lab (FARIL), Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA


| Mitchel Harris
Foot & Ankle Research and Innovation Lab (FARIL), Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA



URL: https://abjs.mums.ac.ir/article_25053.html
Language of published article: en
Type of published article: RESEARCH PAPER