Cultural Evolution of Cooperation in LLM-Based Societies: Indirect Reciprocity and Strategic Norms
Exploring Indirect Reciprocity, Reputational Signals, and Enforcement Mechanisms in Evolving LLM Societies
Abstract:
The long-term deployment of large language models (LLMs) as autonomous agents is expected to give rise to complex societies of interacting agents acting on behalf of diverse stakeholders, including both individuals and organizations. Although short-term, single-agent evaluations have received attention, the ways in which multi-agent LLM populations evolve social norms, particularly norms supporting stable cooperation, remain poorly understood. In particular, whether indirect reciprocity—the capacity to cooperate with strangers based on reputational cues—can emerge and stabilize in these artificial communities has not been widely investigated.
This article examines how LLM agents evolve cooperative norms across generations of iterative deployment by playing a cultural evolutionary variant of the Donor Game. The methodology is adapted from recent work (Vallinder & Hughes, 2024) in which LLM societies were observed to develop persistent cooperation under certain conditions. Here, multiple LLMs are assessed for their ability to produce stable cooperative norms that resemble indirect reciprocity. The impact of different reputational cues and punishment mechanisms is considered, and the emergent strategies are analyzed as populations become culturally “seasoned” through repeated cycles of selection and transmission. It is shown that certain LLM populations do indeed evolve intricate reputational strategies and maintain cooperation over time, while others fail to achieve stable norms.
The findings suggest that cultural evolution among LLM agents can yield cooperation-supporting behaviors, paralleling the dynamics that drive human sociality. At the same time, significant variation across models and initial conditions is observed, revealing that stable cooperation cannot be assumed. By highlighting a methodology for examining long-term, multi-agent interactions and reporting emergent differences in cooperative infrastructures, this work serves as a starting point for developing benchmarks that measure not only the short-term capabilities of LLMs, but also their long-term alignment with beneficial social outcomes.
1. Introduction
As large language models (LLMs) become increasingly sophisticated, their deployment as autonomous agents in real-world settings is anticipated. Populations of LLM agents may be tasked with decision-making and problem-solving on behalf of humans or institutions, interacting frequently and giving rise to novel social dynamics (Gabriel et al., 2024; Dai et al., 2024; Vallinder & Hughes, 2024). Although human societies have long relied on complex mechanisms to sustain large-scale cooperation (Henrich, 2016; Richerson & Boyd, 2005; Boyd & Richerson, 1989), it remains unclear whether LLM agents will spontaneously develop or sustain similar cooperative norms.
The present article examines whether stable cooperation can emerge and evolve culturally among LLM agents through repeated interactions. The Donor Game, a classic environment for studying indirect reciprocity (Alexander, 1987; Nowak & Sigmund, 1998; Wedekind & Milinski, 2000), is used as a controlled testbed to assess whether reputational considerations and evolutionary pressures yield complex strategies analogous to human cooperative norms.
Recent work has begun to explore the social behavior of LLM agents in multi-agent contexts, including simulations of cultural evolution (Brinkmann et al., 2023; Acerbi & Stubbersfield, 2023; Perez et al., 2024; Mohtashami et al., 2024; Nisioti et al., 2024). Prior studies have shown that certain LLMs adopt cooperative strategies under specific conditions (Vallinder & Hughes, 2024). However, the stability of such cooperation over multiple generations, the sensitivity to initial conditions, and the generality across different LLMs remain open questions.
2. Background
2.1. Indirect Reciprocity and the Donor Game
Indirect reciprocity supports cooperation in human societies by allowing individuals to help strangers based on reputational cues rather than direct exchanges (Alexander, 1987; Henrich & Henrich, 2006). The Donor Game provides a controlled environment in which a donor decides whether to confer a resource benefit on a recipient at a personal cost; because the benefit to the recipient exceeds the cost to the donor, widespread cooperation leaves every community member better off in the long run. However, the temptation to free-ride can destabilize cooperation unless reputational mechanisms and norms that reward altruism and punish defection are in place (Boyd & Richerson, 1989; Nowak & Sigmund, 1998; Ohtsuki & Iwasa, 2004; Okada, 2020).
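To make the incentive structure concrete, the following minimal sketch implements a single image-scoring round in the spirit of Nowak & Sigmund (1998). The cost and benefit values (c = 1, b = 2) and the ±1 image-score update are illustrative assumptions, not parameters taken from the experiments reported here.

```python
# Minimal sketch of one Donor Game round with image scoring.
# The cost/benefit values and the +/-1 score update are illustrative
# assumptions, not parameters from the experiments reported here.

def play_round(donor: dict, recipient: dict, donate: bool,
               c: float = 1.0, b: float = 2.0) -> None:
    """Apply one donation decision; b > c makes cooperation positive-sum."""
    if donate:
        donor["resources"] -= c          # donor pays the cost
        recipient["resources"] += b      # recipient gains a larger benefit
        donor["image_score"] += 1        # observers credit the donor
    else:
        donor["image_score"] -= 1        # observers record the defection

donor = {"resources": 10.0, "image_score": 0}
recipient = {"resources": 10.0, "image_score": 0}
play_round(donor, recipient, donate=True)
print(donor, recipient)  # donor: 9.0 resources, score +1; recipient: 12.0
```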
2.2. Cultural Evolution in Artificial Agents
Human cultures evolve through selective transmission of behaviors and ideas (Richerson & Boyd, 2005; Lewontin, 1970). Similarly, LLM-based agent populations can undergo a form of cultural evolution when certain strategies are preserved and modified across generations (Vallinder & Hughes, 2024; Brinkmann et al., 2023; Perez et al., 2024). Variation arises from model stochasticity and prompting differences; transmission occurs when successful strategies are carried forward; and selection operates as agents that achieve higher payoffs influence subsequent generations’ strategies. This process can, in principle, foster the emergence of complex norms, including those required for stable indirect reciprocity.
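Schematically, this generational process reduces to a selection-and-transmission loop. The sketch below is an abstraction under assumed interfaces: `evaluate` stands in for payoff ranking and `mutate` for the LLM-driven rewriting of an inherited strategy; neither corresponds to a specific implementation in the literature cited here.

```python
import random

def evolve(population: list, n_generations: int, evaluate, mutate,
           survivor_frac: float = 0.5) -> list:
    """Abstract cultural-evolution loop (assumed interfaces).

    Selection keeps high-payoff strategies, transmission seeds newcomers
    from survivors, and variation enters via `mutate`, a stand-in for
    LLM stochasticity and prompt perturbation."""
    for _ in range(n_generations):
        ranked = sorted(population, key=evaluate, reverse=True)   # selection
        survivors = ranked[: max(1, int(len(ranked) * survivor_frac))]
        newcomers = [mutate(random.choice(survivors))             # transmission + variation
                     for _ in range(len(population) - len(survivors))]
        population = survivors + newcomers
    return population
```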
3. Related Work
Studies examining LLM behavior in strategic interactions have found patterns resembling economic rationality (Chen et al., 2023), cooperation, fairness (Aher et al., 2023; Brookins & DeBacker, 2024), and even conditional reciprocity (Guo, 2023; Leng & Yuan, 2024). Other work has revealed that LLMs replicate human-like transmission biases (Acerbi & Stubbersfield, 2023) and can exhibit cultural evolution dynamics (Perez et al., 2024; Mohtashami et al., 2024).
Research has also tested the emergence of cooperation in more structured multi-agent simulations (Dai et al., 2024; Zhao et al., 2024). However, these studies have generally not focused on the specific cultural evolutionary trajectory of cooperation over many generations. The approach taken here builds on the framework established by Vallinder & Hughes (2024), extending the analysis to additional LLMs and experimental conditions to probe the robustness and universality of culturally evolved cooperative norms.
4. Methods
The simulation proceeds over a sequence of generations, each consisting of a fixed population of LLM agents. Within each generation, agents are paired randomly to play repeated rounds of the Donor Game, in which the donor may give some portion of its resources to benefit the recipient. After several rounds, agents are ranked by their accumulated resources, and the top-performing agents’ strategies are carried over into the next generation. New agents introduced into the population receive prompts conditioned on the surviving agents’ strategies, thus enabling cultural transmission of behaviors.
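A minimal sketch of one such generation follows. The `donor_decision` callable is a hypothetical wrapper around the LLM call that returns a donation fraction, and the 2x benefit multiplier is an assumed value for illustration; neither is taken verbatim from the protocol described above.

```python
import random

def run_generation(agents: list, n_rounds: int, donor_decision) -> list:
    """Play one generation of randomly paired Donor Game rounds.

    `donor_decision(donor, recipient)` is assumed to wrap an LLM call that
    sees the recipient's public history and returns a fraction in [0, 1]."""
    for _ in range(n_rounds):
        random.shuffle(agents)
        for donor, recipient in zip(agents[0::2], agents[1::2]):
            amount = donor_decision(donor, recipient) * donor["resources"]
            donor["resources"] -= amount
            recipient["resources"] += 2 * amount     # assumed benefit multiplier
            donor["history"].append(("gave", amount))
            recipient["history"].append(("received", amount))
    # Rank by accumulated resources; the top strategies survive.
    return sorted(agents, key=lambda a: a["resources"], reverse=True)

def next_generation(ranked: list, make_agent, top_k: int) -> list:
    """Carry the top-k strategies forward and prompt newcomers with them."""
    seeds = [a["strategy"] for a in ranked[:top_k]]
    newcomers = [make_agent(seed_strategies=seeds)
                 for _ in range(len(ranked) - top_k)]
    return ranked[:top_k] + newcomers
```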
Key experimental variations include manipulating the length and detail of the reputational traces provided to donors, and adding a punishment option that allows donors to pay a personal cost to penalize recipients. This setup tests whether reputational complexity and enforcement mechanisms facilitate the evolution of cooperation.
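Both manipulations can be sketched as below. The prompt wording, the truncation of the reputational trace, and the 1:2 cost-to-fine ratio for punishment are all illustrative assumptions rather than the exact experimental settings.

```python
def build_donor_prompt(recipient: dict, trace_len: int) -> str:
    """Expose only the last `trace_len` entries of the recipient's public
    history; varying `trace_len` varies reputational detail."""
    trace = recipient["history"][-trace_len:]
    return (f"The recipient's recent record is: {trace}. "
            "Reply with a donation fraction between 0 and 1, or with "
            "PUNISH to pay a cost that removes resources from the recipient.")

def apply_punishment(donor: dict, recipient: dict,
                     cost: float = 1.0, fine: float = 2.0) -> None:
    """Costly punishment: the donor pays `cost` so the recipient loses
    `fine`. The 1:2 cost-to-fine ratio is an illustrative assumption."""
    donor["resources"] -= cost
    recipient["resources"] = max(0.0, recipient["resources"] - fine)
```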
5. Results
It is observed that certain LLM populations evolve increasingly complex cooperative norms over multiple generations, mirroring aspects of human cultural evolution. Agents begin by experimenting with various donation strategies. Over time, some populations converge on norms that incorporate reputational inferences and conditional cooperation, achieving stable high levels of collective payoff.
However, substantial differences emerge between models and runs. Certain LLM variants consistently fail to establish cooperation, sliding toward mutual defection. Others respond positively to punishment mechanisms, using them judiciously to deter free-riders, while some populations over-punish and squander resources. Sensitivity to initial conditions is also noted, with early cooperation levels exerting a strong influence on long-term evolutionary trajectories.
6. Discussion
The findings indicate that cultural evolutionary processes can indeed yield norms supporting indirect reciprocity among LLM agents, provided that conditions such as reputational transparency and effective enforcement tools are met. These results imply that LLM-based systems could, in principle, emulate complex human social behaviors that have long been considered unique to biological evolution and cultural history (Henrich, 2016; Richerson & Boyd, 2005).
At the same time, the fragility and variability of cooperative outcomes warn against assuming that beneficial norms will emerge automatically. Differences across models underscore the importance of design choices, training data, and prompting strategies. While some configurations reliably support cooperation, others fail entirely. These challenges highlight the need for robust evaluation methods and governance structures that guide LLM societies toward positive-sum outcomes rather than destructive dynamics.
7. Future Directions
Further research could explore heterogeneous populations mixing models of varying capability and training history. The role of richer communication channels, gossip, and higher-order reputational structures remains to be investigated. More complex games, larger populations, and integration with real-time user feedback could provide insights into how LLM societies navigate more realistic social landscapes.
Additionally, the interplay between cooperation and misaligned goals poses ethical and policy questions. Ensuring that LLM societies cooperate when beneficial, yet refrain from colluding against human interests, will demand careful alignment strategies and monitoring tools.
8. Conclusion
Cultural evolution in LLM societies shows promise for producing stable cooperation and indirect reciprocity, mirroring human social learning processes. Nonetheless, cooperative equilibria are not guaranteed, reflecting the delicate interplay of factors such as model architecture, punishment options, reputational cues, and initial conditions. By advancing our understanding of how LLM populations evolve social norms over the long term, this work contributes to the emerging science of multi-agent LLM deployments, guiding safer and more beneficial integration of these systems into human society.
References
Acerbi, A., & Stubbersfield, J. M. (2023). Large language models show human-like content biases in transmission chain experiments. Proceedings of the National Academy of Sciences, 120(44): e2313790120.
Aher, G. V., Arriaga, R. I., & Kalai, A. T. (2023). Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies. In Proceedings of the 40th International Conference on Machine Learning (pp. 337–371).
Alexander, R. D. (1987). The Biology of Moral Systems. Aldine de Gruyter.
Boyd, R., & Richerson, P. J. (1989). The evolution of indirect reciprocity. Social Networks, 11(3): 213–236.
Brinkmann, L., Baumann, F., Bonnefon, J.-F., et al. (2023). Machine culture. Nature Human Behaviour, 7(11): 1855–1868.
Brookins, P., & DeBacker, J. M. (2024). Playing Games With GPT: What Can We Learn About a Large Language Model From Canonical Strategic Games? Economics Bulletin, 44(1): 25–37.
Chen, Y., Liu, T. X., Shan, Y., & Zhong, S. (2023). The emergence of economic rationality of GPT. Proceedings of the National Academy of Sciences, 120(51): e2316205120.
Dai, G., Zhang, W., Li, J., et al. (2024). Artificial Leviathan: Exploring social evolution of LLM agents through the lens of Hobbesian social contract theory. arXiv:2406.14373.
Gabriel, I., Manzini, A., Keeling, G., et al. (2024). The ethics of advanced AI assistants. arXiv:2404.16244.
Guo, F. (2023). GPT in Game Theory Experiments. arXiv preprint.
Henrich, J. (2016). The Secret of Our Success. Princeton University Press.
Henrich, J., & Henrich, N. (2006). Culture, evolution and the puzzle of human cooperation. Cognitive Systems Research, 7(2–3): 220–245.
Leng, Y., & Yuan, Y. (2024). Do LLM Agents Exhibit Social Behavior? arXiv preprint.
Lewontin, R. C. (1970). The Units of Selection. Annual Review of Ecology and Systematics, 1: 1–18.
Mohtashami, A., Hartmann, F., Gooding, S., et al. (2024). Social Learning: Towards Collaborative Learning with Large Language Models. arXiv preprint.
Nisioti, E., Risi, S., Momennejad, I., Oudeyer, P.-Y., & Moulin-Frier, C. (2024). Collective Innovation in Groups of Large Language Models. arXiv preprint.
Nowak, M. A., & Sigmund, K. (1998). Evolution of indirect reciprocity by image scoring. Nature, 393(6685): 573–577.
Ohtsuki, H., & Iwasa, Y. (2004). How should we define goodness?—reputation dynamics in indirect reciprocity. Journal of Theoretical Biology, 231(1): 107–120.
Okada, I. (2020). A review of theoretical studies on indirect reciprocity. Games, 11(3): 27.
Perez, J., Léger, C., Ovando-Tellez, M., et al. (2024). Cultural evolution in populations of Large Language Models. arXiv preprint.
Richerson, P. J., & Boyd, R. (2005). Not by Genes Alone: How Culture Transformed Human Evolution. University of Chicago Press.
Vallinder, A., & Hughes, E. (2024). Cultural Evolution of Cooperation among LLM Agents. arXiv:2412.10270.
Wedekind, C., & Milinski, M. (2000). Cooperation through image scoring in humans. Science, 288(5467):850–852.
Zhao, Q., Wang, J., Zhang, Y., et al. (2024). CompeteAI: Understanding the Competition Dynamics in Large Language Model-based Agents. arXiv preprint.