Partnerzy

Klaudia Watros

Ostatnie publikacje

Olszewski R., Brzeziński J.^♦, Watros K.^♦, Rysz J.^♦, Quantifying Readability in Chatbot-Generated Medical Texts Using Classical Linguistic Indices: A Review, Applied Sciences, ISSN: 2076-3417, DOI: 10.3390/app16031423, Vol.16, No.3, pp.1-29, 2026

Streszczenie:
The rapid development of large language models (LLMs), including ChatGPT, Gemini, and Copilot, has led to their increasing use in health communication and patient educa-
tion. However, their growing popularity raises important concerns about whether the lan-guage they generate aligns with recommended readability standards and patient health literacy levels. This review synthesizes evidence on the readability of medical information generated by chatbots using established linguistic readability indices. A comprehensive search of PubMed, Scopus, Web of Science, and Cochrane Library identified 4209 records, from which 140 studies met the eligibility criteria. Across the included publications, 21 chatbots and 14 readability scales were examined, with the Flesch–Kincaid Grade Level and Flesch Reading Ease being the most frequently applied metrics. The results demon-strated substantial variability in readability across chatbot models; however, most texts
corresponded to a secondary or early tertiary reading level, exceeding the commonly rec-ommended 8th-grade level for patient-facing materials. ChatGPT-4, Gemini, and Copilot
exhibited more consistent readability patterns, whereas ChatGPT-3.5 and Perplexity pro-duced more linguistically complex content. Notably, DeepSeek-V3 and DeepSeek-R1 gen-
erated the most accessible responses. The findings suggest that, despite technological ad-vances, AI-generated medical content remains insufficiently readable for general audi-
ences, posing a potential barrier to equitable health communication. These results under-score the need for readability-aware AI design, standardized evaluation frameworks, and future research integrating quantitative readability metrics with patient-level comprehen-
sion outcomes.

Słowa kluczowe:
medical chatbots, readability, health communication, digital health, artificial intelligence

Afiliacje autorów:

Olszewski R.	-	IPPT PAN
Brzeziński J.	-	inna afiliacja
Watros K.	-	inna afiliacja
Rysz J.	-	Medical University of Lodz (PL)

100p.

Brzeziński J.^♦, Watros K.^♦, Mańczak M.^♦, Owoc J.^♦, Jeziorski K.^♦, Olszewski R., Readability and source transparency of AI‑generated health information on human metapneumovirus: A comparative evaluation of five chatbots, Journal of Public Health, ISSN: 1613-2238, DOI: 10.1007/s10389-025-02643-6on., pp.1-8, 2025

Streszczenie:
Aim This study aimed to evaluate the readability and citation practices of artificial intelligence (AI)-generated responses to questions about human metapneumovirus, a respiratory virus of growing public health concern. Subject and methods Five widely used AI chatbots—ChatGPT-4, Copilot, Gemini,Claude.ai, and Grok—were prompted with 14 standardized questions based on official guidelines from the World Health Organization, the Centers for Disease Control and Prevention, and the Australian National Health and Medical Research Council. Responses were anonymized and assessed using six established readability metrics: Flesch–Kincaid Reading Ease and Flesch–Kincaid Grade Level, Gunning Fog Index, SMOG (Simple Measure of Gobbledygook) Index, Coleman–Liau Index, and Automated Readability Index. Scores were compared to standards recommended by the American Medical Association and the National Institutes of Health. Citation frequency and credibility were also analyzed. Results Among 70 chatbot responses, only one met the recommended readability level. Median readability scores ranged from grade 10.4 to 16.0, indicating high complexity. One chatbot generated the most readable content, while another scored lowest. Only two chatbots included source citations. One cited 68 reliable sources, primarily from health organizations and academic institutions, while the other referenced 31 sources of varying quality. Conclusion AI-generated health content often exceeds recommended readability thresholds and lacks consistent citation practices. These issues may hinder understanding and trust. Improving default readability settings and integrating real-time citation features could enhance the accessibility and credibility of chatbot-based health communication.

Słowa kluczowe:
Human metapneumoviru, Artificial intelligence, Chatbots , Readability

Afiliacje autorów:

Brzeziński J.	-	inna afiliacja
Watros K.	-	inna afiliacja
Mańczak M.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Owoc J.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Jeziorski K.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Olszewski R.	-	IPPT PAN

40p.

Olszewski R., Watros K.^♦, Mańczak M.^♦, Owoc J.^♦, Jeziorski K.^♦, Brzeziński J.^♦, Assessing the response quality and readability of chatbots in cardiovascular health, oncology, and psoriasis: A comparative study, International Journal of Medical Informatics, ISSN: 1386-5056, DOI: 10.1016/j.ijmedinf.2024.105562, Vol.190, No.105562, pp.1-7, 2024

Streszczenie:
Background: Chatbots using the Large Language Model (LLM) generate human responses to questions from all
categories. Due to staff shortages in healthcare systems, patients waiting for an appointment increasingly use
chatbots to get information about their condition. Given the number of chatbots currently available, assessing the
responses they generate is essential.
Methods: Five chatbots with free access were selected (Gemini, Microsoft Copilot, PiAI, ChatGPT, ChatSpot) and
blinded using letters (A, B, C, D, E). Each chatbot was asked questions about cardiology, oncology, and psoriasis.
Responses were compared to guidelines from the European Society of Cardiology, American Academy of
Dermatology and American Society of Clinical Oncology. All answers were assessed using readability scales
(Flesch Reading Scale, Gunning Fog Scale Level, Flesch-Kincaid Grade Level and Dale-Chall Score). Using a 3-
point Likert scale, two independent medical professionals assessed the compliance of the responses with the
guidelines.
Results: A total of 45 questions were asked of all chatbots. Chatbot C gave the shortest answers, 7.0 (6.0 – 8.0), and Chatbot A the longest 17.5 (13.0 – 24.5). The Flesch Reading Ease Scale ranged from 16.3 (12.2 – 21.9)
(Chatbot D) to 39.8 (29.0 – 50.4) (Chatbot A). Flesch-Kincaid Grade Level ranged from 12.5 (10.6 – 14.6) (Chatbot A) to 15.9 (15.1 – 17.1) (Chatbot D). Gunning Fog Scale Level ranged from 15.77 (Chatbot A) to 19.73 (Chatbot D). Dale-Chall Score ranged from 10.3 (9.3 – 11.3) (Chatbot A) to 11.9 (11.5 – 12.4) (Chatbot D).
Conclusion: This study indicates that chatbots vary in length, quality, and readability. They answer each question
in their own way, based on the data they have pulled from the web. Reliability of the responses generated by
chatbots is high. This suggests that people who want information from a chatbot need to be careful and verify the answers they receive, particularly when they ask about medical and health aspects.

Słowa kluczowe:
Chatbots,Readability,Cardiovascular health,Oncology

Afiliacje autorów:

Olszewski R.	-	IPPT PAN
Watros K.	-	inna afiliacja
Mańczak M.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Owoc J.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Jeziorski K.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Brzeziński J.	-	inna afiliacja

140p.

Olszewski R., Watros K.^♦, Brzeziński J.^♦, Owoc J.^♦, Mańczak M.^♦, Targowski T.^♦, Jeziorski K.^♦, COVID-19 health communication strategies for older adults: Chatbots and traditional media, Advances in Clinical and Experimental Medicine, ISSN: 2451–2680, DOI: 10.17219/acem/195242, pp.1-9, 2024

Streszczenie:
Background. The coronavirus disease 2019 (COVID-19) pandemic has significantly accelerated the development and use of new healthcare technologies. While younger individuals may have been able to quickly embrace virtual advancements, older adults may still have different needs in terms of health communication.

Objectives. To identify areas of interest and preferred sources of information related to the COVID-19 pandemic among older adults and to verify their eHealth competencies.

Materials and methods. The study was conducted between February 2022 and July 2022. It included listeners from the University of the Third Age (U3A) and younger students. Both groups received information about the HealthBuddy+ chatbot, a questionnaire that addressed respondents’ interests about COVID-19, and the PL-eHEALS (eHealth Literacy Scale) questionnaire to measure their eHealth competencies.

Results. There were 573 participants in the study (U3A listeners – 303 participants, median age: 73 years (interquartile range (IQR): 69–77); young adult students – 270, median age: 24 years (IQR: 23–24). The primary source of information about COVID-19 for older adults was television (84.5%), and for younger adults, internet (84.4%). Among the older adults, only 17% ever interacted with a chatbot (younger adults – 78% respectively), and 19% considered it a trustworthy source of information on COVID-19 compared to 79% of younger respondents. Older adults and younger adults in our study were most interested in COVID-19 treatment methods (45.5% and 69.3%, respectively), symptoms of the disease (36.6% and 35.2%, respectively) and chronic diseases coexisting with COVID-19 (35.0% and 51.5%, respectively). However, their eHealth competencies were generally low (median (Me): 34; IQR: 30–39) compared to younger adults (Me: 42; IQR: 40–47).

Conclusions. Health education for older adults should be appropriately tailored to their current needs and differentiated. The level of eHealth competencies of older adults suggests that much work remains to narrow the gap between the eHealth competencies of the younger and older generations.

Słowa kluczowe:
health education,older adults,information seeking,COVID-19

Afiliacje autorów:

Olszewski R.	-	IPPT PAN
Watros K.	-	inna afiliacja
Brzeziński J.	-	inna afiliacja
Owoc J.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Mańczak M.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)
Targowski T.	-	National Institute of Geriatrics, Rheumatology and Rehabilitation (PL)
Jeziorski K.	-	National Institute of Geriatrics Rheumatology and Rehabilitation (PL)