A recent study by Benjamin Spiegel and colleagues, posted to the preprint server arXiv, demonstrates that artificial intelligence systems can develop proto-writing systems through visual theory of mind, mirroring how early human writing evolved from pictographs to abstract symbols.
The concept of visual theory of mind, central to Spiegel's research, refers to the ability to reason about how others perceive visual signals. This cognitive capability allows both the creator and the interpreter of a pictograph to maximize successful communication by leveraging shared visual understanding.12 In the study's computational framework, AI agents demonstrate this ability when senders create pictographs that are visually distinctive and receivers infer the intended meaning by weighing it against the possible alternatives.13
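The receiver-side inference described above can be illustrated with a small sketch. It is not the study's implementation: the function names, the two referents, the two signals, and the resemblance scores are all hypothetical, and the recursion is a generic "reason about what the sender would have drawn" scheme chosen for illustration.

```python
# Illustrative sketch only. All names and numbers are hypothetical,
# not taken from the study.

def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {key: value / total for key, value in weights.items()}

def literal_receiver(signal, resemblance):
    """Guess the referent purely from how much the signal looks like it."""
    return normalize({ref: scores[signal] for ref, scores in resemblance.items()})

def sender(referent, resemblance):
    """Prefer signals that a literal receiver would decode as `referent`."""
    return normalize({
        signal: literal_receiver(signal, resemblance)[referent]
        for signal in resemblance[referent]
    })

def inferential_receiver(signal, resemblance):
    """Ask which referent would most plausibly lead a sender to draw `signal`."""
    return normalize({ref: sender(ref, resemblance)[signal] for ref in resemblance})

# resemblance[referent][signal]: assumed visual similarity in (0, 1].
resemblance = {
    "bird":  {"wings": 0.9, "wings+beak": 0.9},
    "plane": {"wings": 0.9, "wings+beak": 0.1},
}

literal = literal_receiver("wings", resemblance)          # plane: roughly 0.5
inferential = inferential_receiver("wings", resemblance)  # plane: roughly 0.7
print(round(literal["plane"], 2), round(inferential["plane"], 2))
```

In this toy setup, a bare "wings" drawing looks equally like both referents, so the literal receiver is undecided. The inferential receiver leans toward "plane" because a sender meaning "bird" would more likely have added the beak.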
This research challenges previous assumptions about AI language models and theory of mind capabilities. While some have claimed that large language models like ChatGPT have achieved theory of mind, experts remain skeptical about these assertions.4 Spiegel's work provides a more nuanced understanding by focusing specifically on visual theory of mind in the context of proto-writing development. It suggests that this cognitive mechanism was likely crucial when early humans invented the first writing systems, which relied on symbols that leveraged shared visual understanding within communities.15
The researchers developed a multi-agent reinforcement learning testbed called the "Signification Game" to study emergent communication between AI agents.1 In this framework, agents communicate by creating and interpreting pictorial symbols drawn as splines on a canvas, without relying on pre-existing language or specialized communication hardware.2 This naturalistic approach allows for clearer analogies to human and animal cognition compared to previous computational studies of pictographic systems.1
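To make the idea of a symbol "drawn as splines on a canvas" concrete, here is a minimal sketch that rasterizes one curved stroke onto a small grid. The study's actual spline parameterization and canvas format are not given here, so a single quadratic Bézier segment, an 8x8 binary grid, and the names `bezier_point` and `draw_stroke` are all assumptions made for illustration.

```python
# Illustrative sketch only. The curve type, grid size, and function names
# are assumptions, not details taken from the study.

def bezier_point(p0, p1, p2, t):
    """Point on a quadratic Bezier curve at parameter t in [0, 1]."""
    a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
    x = a * p0[0] + b * p1[0] + c * p2[0]
    y = a * p0[1] + b * p1[1] + c * p2[1]
    return x, y

def draw_stroke(canvas, p0, p1, p2, samples=50):
    """Mark every grid cell the curve passes through (coordinates in [0, 1])."""
    size = len(canvas)
    for i in range(samples + 1):
        x, y = bezier_point(p0, p1, p2, i / samples)
        # Map unit coordinates to cell indices, clamped to the grid.
        col = min(size - 1, max(0, int(x * (size - 1) + 0.5)))
        row = min(size - 1, max(0, int(y * (size - 1) + 0.5)))
        canvas[row][col] = 1
    return canvas

SIZE = 8
canvas = [[0] * SIZE for _ in range(SIZE)]

# One stroke: start, control point, end. Row 0 is printed first, so y grows downward.
draw_stroke(canvas, (0.0, 0.0), (0.5, 1.0), (1.0, 0.0))

for row in canvas:
    print("".join("#" if cell else "." for cell in row))
```

A sender in such a setup would choose the control points, and a receiver would see only the resulting grid. This is what makes the channel visual rather than a discrete token channel.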
Unlike systems that learn solely through reward-maximization, which the study found to be severely limited for language acquisition, the Signification Game incorporates inferential communication mechanisms.3 This framework is situated within a broader formalism for animal communication, providing insights into both the cognitive and cultural processes that underlie the emergence of proto-writing systems.1 The model's design enables researchers to observe how agents develop communication strategies that evolve from simple pictographs toward more abstract symbolic representations, similar to the trajectory of human writing systems throughout history.45
A fundamental challenge identified in Spiegel's research is the "signification gap": the disparity between what needs to be communicated and what can be effectively represented through simple pictographs. This gap emerges when the complexity of a concept exceeds the representational capacity of basic drawings, making certain ideas difficult to convey through visual means alone.12
To bridge this gap, the AI agents in the study employed visual theory of mind capabilities rather than relying solely on reward-based learning. This approach enabled them to create increasingly sophisticated communication systems by reasoning about how others would perceive and interpret their visual signals. The research demonstrates that simple reward-maximization is insufficient for developing complex communication systems like writing, highlighting the essential role of inferential communication in the evolution from crude pictographs to effective symbolic representation.34
The research by Spiegel and colleagues reveals that the symbol systems of AI agents and early human writing systems follow a similar evolutionary trajectory, beginning with iconic pictographs that directly resemble their referents before gradually transforming into more abstract symbolic forms12. This evolution occurs naturally as communication needs grow more complex and more efficient representation becomes necessary.
Within the computational framework, AI agents initially create detailed pictorial representations but progressively develop more stylized, abstract symbols as they continue to communicate13. This pattern mirrors archaeological evidence of human writing systems, where early pictographs (like early Sumerian pictographs and Egyptian hieroglyphs) gradually evolved into more abstract scripts such as cuneiform and hieratic. The study suggests this transition isn't merely coincidental but represents a fundamental cognitive process in the development of semiotic systems, whether artificial or human45.