Can ChatGPT help science writers?
https://www.science.org/content/blog-post/can-chatgpt-help-science-writers
Abigail Eisenstadt
2/11/2026 · 3 min read
When the artificial intelligence (AI) company OpenAI released the generative AI platform ChatGPT nearly 3 years ago, people began to speculate about what the arrival of the large language model (LLM) meant for many creative industries, including journalism and other writing.
In December 2023, the press office for Science (and the Science family of journals) decided to explore whether ChatGPT Plus had potential as a tool to help its writers, the Science Press Package team (SciPak), convey information about upcoming research papers to the media. We sought to evaluate whether ChatGPT Plus could adhere to SciPak's specific writing style.
SciPak’s experiment addressed the question: Can ChatGPT Plus successfully produce news briefs that emulate the style of our trained SciPak writers? Referencing SciPak style tightened the experiment’s framing, keeping it smaller in scale and more pointed.
The SciPak team crafts news briefs for journalists using a standard narrative structure, the so-called inverted pyramid, which places the most crucial information at the beginning of the brief, with supporting details following in decreasing order of importance. Many science writers use this structure, but they assemble it in different ways. SciPak writers first deconstruct the inverted pyramid using a foundational outline called the "5 bits": we identify the study's premise, then tackle its methods and context, and we only draft the critical first sentences of the brief after we've assembled the rest. This helps us understand the study's intricacies and avoid misrepresenting any information in the first sentence.
The experimental design called for selecting two papers (research or commentary) each week, over the course of 1 year, from among those already published in the Science family of journals (Science, Science Advances, Science Robotics, Science Translational Medicine, Science Immunology, and Science Signaling). These candidates for ChatGPT Plus–generated summaries had to meet one of several qualifications, such as featuring a controversial topic, technical content with high levels of jargon, or a nontraditional format such as a Policy Forum. We assigned ChatGPT Plus to write three summaries for each paper, based on three prompts: one asked for a summary accessible to a general audience of nonexperts; another asked for a summary written in precise language, similar to that used in peer-reviewed papers; and the final one asked for a summary written like a professional journalist. (We used updated versions of ChatGPT as they became available over the year, choosing the "Plus" version because of its ability to ingest new content.) Each ChatGPT-authored summary was then reviewed by the SciPak writer who had written the earlier news brief, and each writer completed a survey with quantitative and qualitative questions about the LLM's performance. Certainly, the experiment had limitations; most notably, it relied on human evaluations of ChatGPT-generated text and did not account for human biases.
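The three-prompt design described above can be sketched in a few lines of Python. Note that the template wording below is a paraphrase of the prompt styles as described in this post, not SciPak's actual prompt text, and the function name is purely illustrative.

```python
# Sketch of the three-prompt setup: each paper gets three summary requests,
# one per prompt style. The template text is an assumption, not SciPak's wording.

PROMPT_TEMPLATES = {
    "accessible": (
        "Summarize the following paper for a general, nonexpert audience, "
        "avoiding technical terms and jargon:\n\n{paper_text}"
    ),
    "precise": (
        "Summarize the following paper using precise language, similar to "
        "that used in peer-reviewed papers:\n\n{paper_text}"
    ),
    "journalistic": (
        "Summarize the following paper in the style of a professional "
        "journalist:\n\n{paper_text}"
    ),
}


def build_prompts(paper_text: str) -> dict:
    """Return the three prompt variants for one paper, keyed by style."""
    return {
        style: template.format(paper_text=paper_text)
        for style, template in PROMPT_TEMPLATES.items()
    }


prompts = build_prompts("Example paper abstract ...")
print(len(prompts))  # 3 summaries requested per paper
```

Each of the three resulting strings would then be sent to the LLM as a separate request, yielding the three summaries per paper that the writers later evaluated.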
The results were mixed. The LLM did summarize scientific findings in language that was accessible to nonexperts (it avoided technical terms and jargon, for example). It also effectively summed up commentary content, such as Science's Policy Forums, for a lay audience. However, it tended to sacrifice accuracy for simplicity. The ChatGPT Plus summaries required rigorous fact-checking by SciPak writers, as well as extensive editing for hyperbole; ChatGPT Plus had a fondness for the word "groundbreaking," for example. It also struggled to highlight more than one result from multifaceted studies, and when asked to summarize two papers at once, it could only cover the first of the two submitted. It ultimately defaulted to jargon when challenged with research particularly dense in information, detail, and complexity.
The conclusion of this experiment is that ChatGPT Plus did not meet SciPak’s standards. These technologies may have potential as helpful tools for science writers, but they are not ready for “prime time” at this point for the SciPak team. Although this project has ended, SciPak plans to keep an eye on further updates to ChatGPT Plus and to monitor the capabilities of other LLMs.
For an in-depth description of the methods and results of this experiment, you can read the white paper. For those who have questions or comments, please reach out to aeisenstadt@aaas.org.
Abigail Eisenstadt is a writer for the Science Press Package, American Association for the Advancement of Science, Washington, DC, USA. aeisenstadt@aaas.org
