Evaluating research quality with Large Language Models: An analysis of ChatGPT’s effectiveness with different settings and inputs

Author: Mike Thelwall

Affiliation: Information School, University of Sheffield, Sheffield S10 2TN, UK

Publication: Journal of Data and Information Science (数据与情报科学学报, English edition)

Year, volume, issue: 2025, Vol. 10, No. 1

Pages: 7-25

Abstract: Purpose: Evaluating the quality of academic journal articles is a time-consuming but critical task for national research evaluation exercises, appointments and *** is therefore important to investigate whether Large Language Models (LLMs) can play a role in this ***/methodology/approach: This article assesses which ChatGPT inputs (full text without tables, figures, and references; title and abstract; title only) produce better quality score estimates, and the extent to which scores are affected by ChatGPT models and system ***: The optimal input is the article title and abstract, with average ChatGPT scores based on these (30 iterations on a dataset of 51 papers) correlating at 0.67 with human scores, the highest ever *** 4o is slightly better than 3.5-turbo (0.66), and 4o-mini (0.66). Research limitations: The data is a convenience sample of the work of a single author, it only includes one field, and the scores are *** implications: The results suggest that article full texts might confuse LLM research quality evaluations, even though complex system instructions for the task are more effective than simple ***, whilst abstracts contain insufficient information for a thorough assessment of rigour, they may contain strong pointers about originality and ***, linear regression can be used to convert the model scores into the human scale scores, which is 31% more accurate than ***/value: This is the first systematic comparison of the impact of different prompts, parameters and inputs for ChatGPT research quality evaluations.
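
The abstract describes three numerical steps: averaging repeated ChatGPT scores per paper, correlating those averages with human scores, and using linear regression to map model scores onto the human scale. The following is a minimal sketch of that pipeline, not the article's own code. All paper names and score values are made up for illustration, the helper names `pearson` and `fit_line` are hypothetical, and ordinary least squares is assumed as the regression method.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical data: repeated model scores per paper (the study used 30
# iterations on 51 papers; four runs on four papers keep the sketch short).
model_runs = {
    "paper_a": [3, 3, 4, 3],
    "paper_b": [2, 3, 3, 3],
    "paper_c": [3, 4, 4, 4],
    "paper_d": [3, 3, 3, 3],
}
# Hypothetical human quality scores for the same papers.
human_scores = {"paper_a": 3.0, "paper_b": 2.0, "paper_c": 4.0, "paper_d": 3.0}

papers = sorted(model_runs)
avg_model = [mean(model_runs[p]) for p in papers]  # step 1: average the runs
human = [human_scores[p] for p in papers]

r = pearson(avg_model, human)  # step 2: agreement with human scores
slope, intercept = fit_line(avg_model, human)  # step 3: fit the rescaling
rescaled = [slope * x + intercept for x in avg_model]

print(f"correlation: {r:.2f}")
for paper, raw, adj in zip(papers, avg_model, rescaled):
    print(f"{paper}: model mean {raw:.2f} -> human-scale estimate {adj:.2f}")
```

Averaging before correlating matters because a single run is noisy; the regression step then corrects for the model's averages occupying a narrower range than the human scores.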

Keywords: ChatGPT; Large Language Models; LLMs; Scientometrics; Research Assessment

Subject classification: 0502 [Literature: Foreign Languages and Literatures]; 050201 [050201]; 05 [Literature]

DOI: 10.2478/jdis-2025-0011

Holdings number: 203156501...
