Notices and Events

  • Professor Taegyoon Kim Publishes Paper in Scientific Data
  • Administrator
  • 2025-12-12 10:25:42
Professor Taegyoon Kim's paper has been published in Scientific Data.

- Title: A benchmark dataset for evaluating gender sensitivity in Korean political discourse with large language models
- Journal / year: Scientific Data / forthcoming
- Authors: Sunkyoung Park, Eunbi Cho, Chan Young Jung, Woo Chang Kang, Taegyoon Kim, Eunah Park, Sanghoun Song. 

- Abstract: Large language models (LLMs) are increasingly used to analyze political discourse, yet their ability to detect subtle, culturally grounded variations in gender sensitivity remains underexplored. We introduce a benchmark dataset of 1,222 transcripts from the Korean National Assembly, annotated for gender sensitivity across 6,024 utterances. Each utterance is labeled as high or low in gender sensitivity, based on contextual indicators of bias, discrimination, or inclusion, and tagged for the target group (e.g., women, men, sexual minorities, all genders). The dataset covers legislative sessions from 1948 to 2024, including plenary sessions, committees, and hearings. Annotation reliability was ensured through dual coding and adjudication, yielding high intercoder agreement (Cohen’s κ = 0.96; Krippendorff’s α = 0.95). When tasked with labeling utterances by gender sensitivity, GPT-4.1 achieved F1-scores of 87.5% (zero-shot) and 91.2% (18-shot) for high gender sensitivity labels, while GPT-4o reached 90.4% and 91.1%, respectively. While incorporating in-domain examples enhanced model performance, limitations in distinguishing between criticisms and reinforcements of inequality, culturally specific terminology, and extended contexts were observed for both models. Our results demonstrate the dataset’s utility as a robust benchmark for analyzing gender sensitivity in Korean political speech and evaluating multilingual LLMs’ sociocultural alignment.
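
For readers unfamiliar with the two kinds of figures reported in the abstract (intercoder agreement and per-label F1), the following is a minimal sketch of how Cohen's κ and the F1-score for the "high" label can be computed. The function names and the toy labels are hypothetical and made up for illustration; they are not taken from the paper, its dataset, or its evaluation code.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(coder_a)
    # Observed agreement: share of items on which both coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


def f1_for_label(gold, pred, positive):
    """F1-score treating `positive` as the target class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels only (hypothetical, not data from the benchmark).
coder_a = ["high", "high", "low", "low", "high", "low", "low", "high"]
coder_b = ["high", "high", "low", "low", "high", "low", "high", "high"]
model = ["high", "low", "low", "low", "high", "high", "low", "high"]

kappa = cohens_kappa(coder_a, coder_b)
f1_high = f1_for_label(coder_a, model, positive="high")
print(kappa)    # 0.75
print(f1_high)  # 0.75
```

In this toy case the two coders agree on 7 of 8 items with a chance agreement of 0.5, which gives κ = 0.75; the model has 3 true positives, 1 false positive, and 1 false negative for "high", which gives F1 = 0.75.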

For details, please see the link below.
* https://www.nature.com/articles/s41597-025-06344-3