RVBench: Role values benchmark for role-playing LLMs
Main Authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-08-01 |
Series: | Computers in Human Behavior: Artificial Humans |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2949882125000684 |
Summary: | With the rapid development of Large Language Models (LLMs), demand for role-playing agents has grown sharply, driven by applications such as personalized digital companions and artificial society simulation. In LLM-driven role-playing, an agent's values lay the foundation for its attitudes and behaviors, so value alignment is crucial to enhancing the realism of interactions and enriching the user experience. However, no benchmark yet exists for evaluating values in role-playing LLMs. In this study, we built a Role Values Dataset (RVD) containing 25 roles as the ground truth. Additionally, inspired by psychological tests for humans, we proposed a Role Values Benchmark (RVBench) comprising values rating and values ranking methods, which evaluate the values of role-playing LLMs through subjective questionnaires and observed behavior, respectively. The values rating method tests value orientation through the revised Portrait Values Questionnaire (PVQ-RR), which provides a direct, quantitative comparison against the roles to be played. The values ranking method assesses whether agents' behaviors are consistent with the hierarchical organization of their values when they face dilemma scenarios. Subsequent testing on a selection of open-source and closed-source LLMs revealed that GLM-4 exhibited values most closely mirroring the roles in the RVD. However, compared with the preset roles, a gap remains in the role-playing ability of LLMs, including in the consistency, stability, and flexibility of value dimensions. These findings point to a clear need for further research on refining the role-playing capacities of LLMs from a value alignment perspective. The RVD is available at: https://github.com/northwang/RVD. |
---|---|
ISSN: | 2949-8821 |