The Implicit Identity Boundaries in Cooperative Reasoning

Thought Experiments

Thought Experiment A

A rational agent is playing single-round prisoner’s dilemma with their perfect clone. The clone will make the same decision as the agent. Why might the agent cooperate?
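For concreteness, assume the standard PD payoff ordering; the numbers below are illustrative only (each cell lists the agent's payoff, then the clone's):

                     Clone cooperates   Clone defects
  Agent cooperates        3, 3              0, 5
  Agent defects           5, 0              1, 1

Since the clone mirrors the agent's choice, only the diagonal outcomes are reachable: (3, 3) for mutual cooperation versus (1, 1) for mutual defection.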

Consider these reasons:

  1. “They’ll do what I do, so cooperation maximizes my payoff” → cooperate
  2. “They’re another instance of me, so harming them would harm me” → cooperate

Both recommend cooperation.

Thought Experiment B

The agent learns their clone has already cooperated. Now what?

The same two lines of reasoning lead to different conclusions:

  1. “They’ve already cooperated, so defecting maximizes my payoff” → defect
  2. “They’re another instance of me, so harming them would harm me” → cooperate

The decision now depends on the agent’s internal arbitration between conflicting reasons.

What’s Happening?

Each line of reasoning implicitly draws an identity boundary, partitioning the relevant agents:

First line of reasoning: {me} inside | {clone} outside

  • Experiment A: “If I cooperate, they (external agent) cooperate, maximizing my payoff” → cooperate
  • Experiment B: “They (external agent) already cooperated, I should defect” → defect

Second line of reasoning: {me, clone} inside | {} outside

  • Experiment A: “We shouldn’t harm ourselves” → cooperate
  • Experiment B: “We shouldn’t harm ourselves” → cooperate

The first treats the clone as part of the environment to navigate for personal benefit. The second treats the clone as part of an extended identity. When both recommend the same action (Experiment A), the decision is overdetermined. When they conflict (Experiment B), the agent must arbitrate.

Framework

  • For a single decision to cooperate or defect, there may be multiple reasons.
  • Many of these reasons implicitly create an identity boundary, partitioning the full set of decision-relevant agents into a mutually exclusive, complementary set pair.
  • If agent(s) appear in both sets of a reason’s complementary pair, that reason may be further decomposed into multiple reasons, each with its own complementary set pair.
  • Multiple reasons can unanimously support the same decision (overdetermination), or be in conflict.
  • When reasons conflict, internal arbitration determines the outcome (see the sketch after this list).
  • Implicit identity boundaries are per-reason, not per-decision or per-decision-maker.
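Here is a minimal sketch of the framework in Python. The payoff numbers, the reason names, and the tie-breaking arbitration rule are all assumptions of this sketch, not claims of the framework:

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

# Illustrative payoffs for (my_move, clone_move); standard PD ordering.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

@dataclass
class Reason:
    """A reason, carrying its own implicit identity boundary."""
    name: str
    inside: FrozenSet[str]                     # agents inside the boundary
    recommend: Callable[[Optional[str]], str]  # clone's known move -> "C"/"D"

def payoff_reason(clone_move: Optional[str]) -> str:
    """Boundary {me} | {clone}: the clone is environment to navigate."""
    if clone_move is None:
        # The clone mirrors me, so only (C, C) or (D, D) is reachable.
        return "C" if PAYOFF[("C", "C")] > PAYOFF[("D", "D")] else "D"
    # The clone's move is already fixed: best-respond to it.
    return max(("C", "D"), key=lambda m: PAYOFF[(m, clone_move)])

def identity_reason(clone_move: Optional[str]) -> str:
    """Boundary {me, clone} | {}: harming the clone is harming myself."""
    return "C"

REASONS = [
    Reason("payoff", frozenset({"me"}), payoff_reason),
    Reason("identity", frozenset({"me", "clone"}), identity_reason),
]

def decide(clone_move: Optional[str]) -> str:
    votes = {r.name: r.recommend(clone_move) for r in REASONS}
    if len(set(votes.values())) == 1:
        return votes.popitem()[1]    # overdetermined: unanimous vote
    return votes["identity"]         # conflict: an arbitrary arbitration rule

print(decide(None))  # Experiment A -> "C" (overdetermined)
print(decide("C"))   # Experiment B -> depends on the arbitration rule
```

The `decide` function makes the arbitration step explicit: Experiment A is overdetermined, while Experiment B’s outcome depends entirely on which reason wins the tie-break.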

In real-world decisions:

  • Cooperation decisions may be made with a limited number of familiar and salient reasons, not all possible reasons.
  • Across decisions, conjugate reasons may be applied inconsistently.

Emergent Patterns

Each reason’s complementary set pair generates distinct relationship dynamics. For each reason, an agent relates to other agents inside the implicit extended identity boundary differently than to those outside.

Instrumental Relating

When a reason’s boundary excludes other decision-relevant agents, the agent relates to them instrumentally - treating them as environment to navigate. For systems dominated by instrumental relating patterns:

Cooperation

  • is conditional on modeled costs and benefits
  • is fragile to enforcement disruption or incentive change
  • scales through rules and institutions, so can be imposed quickly

Agents

  • are incentivized to game the system, generating increasingly complex arms races

Recognition Relating

When a reason’s boundary includes other decision-relevant agents, the agent relates to them through recognition - treating them as multiply realized instances of that reason’s identity pattern. For systems dominated by recognition relating patterns:

Cooperation

  • is conditional on recognizing a shared identity pattern
  • is fragile to identity reinterpretation
  • is difficult to scale beyond direct relationships, and requires time to develop

Agents

  • resist the temptation to game the system, making enforcement self-regulating

Reason Salience

The familiar collection of reasons is conditioned by:

  • Evolutionary heritage: kin, tribe, species
  • Cultural transmission: nation, religion, class, profession
  • Personal history: friends, enemies, “people who’ve helped me”
  • Situational priming: which partitions the current context makes relevant

This constraint explains why framing matters: shared identity patterns which are emphasized (“we’re a family”) or downplayed (“they’re barely human”) may shift behavior without changing material payoffs. Institutions shape cooperative reasoning not just by changing incentives, but by making certain identity boundaries more salient.


A clone, or another instance of yourself, is not you - maybe just correct it so the clone is actually an individuated part of yourself?

I’d also add love, or liking someone, as a great reason not to want to hurt them, at least for me.


Welcome to the discussion @Sean!

I thought along these lines for a while, but now think this way of thinking about prisoner’s dilemma (PD) is mistaken - though your conclusion still holds.

You mention ‘material payoffs’ but the payoff matrix of PD is not literally a payoff, e.g. in financial or other material terms. If it was, it would be easy to ‘solve’ the dilemma by just making people less materialistic. The reason it poses a dilemma for rationality is that the payoffs represent ‘utility’ as measured by revealed preferences, which factors in everything the player values.

This means that as soon as you start talking about clones, and indeed about knowing what the other player has chosen, you’re no longer playing prisoner’s dilemma, but a new game with a different payoff matrix.

This way of looking at things gives a much easier way to understand why framing and shared identity matters in the way you conclude that it does: it changes the game (as defined by an ordering of payoffs defined by holistic utility) from a dilemma, to one in which cooperation is rational (because the holistic payoff, considering your shared identity, is higher that way), such as a ‘stag hunt’.
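A worked illustration (all numbers assumed): start from a PD in which each cell lists (my payoff, yours), then suppose shared identity leads me to count half of your payoff as my own.

Material payoffs (mine, yours):

                  you cooperate   you defect
  I cooperate         4, 4           0, 5
  I defect            5, 0           2, 2

My holistic utility, adding half of yours to mine:

                  you cooperate   you defect
  I cooperate         6.0            2.5
  I defect            5.0            3.0

Mutual cooperation (6.0) now beats unilateral defection (5.0), while mutual defection (3.0) still beats being the lone cooperator (2.5): exactly the stag-hunt ordering, with both mutual cooperation and mutual defection as equilibria.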


This is the kind of tangle I come here for! :nerd_face:

I’ve added a section to what I’m playing with, to address your clone thought experiment. Follow the link to see it in LaTeX formatting: Notion

:firecracker: Nash + Kairos: A Conceptual Sketch Toward a Trifold Temporal Model of Revolutionary Emergence

Introduction: A Sketch in Progress

This document outlines a conceptual framework—not yet a fully formalized model—for understanding how individuals and groups move from compliance to rebellion under systemic pressure. It is intended as a prototype, laying the foundation for future refinement in mathematical precision, empirical validation, and methodological clarity.

1. The Problem with Nash in Human Systems

Nash Equilibrium predicts that under oppression—where rebellion entails high personal cost—no rational actor will defect. It assumes:

  • Fixed preferences
  • Isolated agents
  • Static payoff matrices
  • One-shot or repeatable, closed-system games

Yet real-life events like the Warsaw Ghetto uprising or the Arab Spring defy these expectations. Human behavior under systemic constraint is more fluid, relational, and symbolic.

2. Why Nash Falls Short

Nash is effective in static, closed systems but does not account for:

  • Evolving identities and preferences
  • Symbolic meaning, trauma, and shared narrative
  • Nonlinear, threshold-based behavior
  • Network effects and emergent identity

It also assumes atemporality—decisions made in a vacuum, divorced from arcs of time, memory, anticipation, or evolution.

3. Toward a Temporal Reframing

We propose time as a minimal corrective to Nash. Even acknowledging that identities and incentives shift over time moves us closer to realism. We refine this further through three temporal modes:

  • t₁ – Impulse: Emotional activation, urgency, trauma response :fire:
  • t₂ – Relational: Trust, shared recognition, conspiratorial networks :spider_web:
  • t₃ – Mythic: Sacred meaning, symbolic rupture, archetypal narratives :dove:

These aren’t linear stages, but overlapping fields of influence that shape volition.

4. Volitional Transformation (Sketch)

In classical Nash Equilibrium, a player’s strategy is the best response given all others’ strategies, within a fixed payoff matrix. The assumption is that choices are rational, iterative, linear, isolated, and not shaped by deeper emotional, social, or symbolic contexts.

Let’s begin with Nash’s classic setup:

U_i(s_i, s_{-i})

Where:

  • U_i is the utility or payoff for player i
  • s_i is player i’s chosen strategy
  • s_{-i} is the combination of strategies chosen by all other players

In a Nash Equilibrium, every player’s strategy is the best response to the others. No one can do better by changing their choice alone. This model assumes:

  • Utilities are known and static
  • Agents are isolated
  • Choices are rational and self-maximizing

The Limitation:

This framework cannot explain behavior where people act against self-interest (e.g., risking death to rebel), or where identity, meaning, and trust shift the internal calculus.

:mantelpiece_clock: Enter Time and Transformation

The Volitional Transformation Model reframes the decision-making process:

V_i(t) = \alpha \cdot I_i(t) + \beta \cdot R_i(t) + \gamma \cdot M_i(t) - C_i(t)

This expression mirrors Nash’s utility model but replaces static payoff with a dynamic, multidimensional measure of pressure or readiness to act.

Breakdown of Terms:

  • I_i(t): Immediate bodily or emotional urgency (e.g., fear, anger, grief)
  • R_i(t): Relational safety or solidarity, based on one’s support network
  • M_i(t): Deeper symbolic or existential drive, like fighting for one’s dignity or ancestry
  • C_i(t): The cost of action, which might be social (ostracism), material (losing a job), or mortal (imprisonment, death)
  • α, β, γ: Context-sensitive weights that say how much each factor matters in this situation

Together, this builds on Nash’s idea of utility but adds real human depth: time, emotion, relation, and meaning.

We build on this by proposing a model where volitional potential—the capacity or pressure to act—arises from multi-dimensional, time-sensitive factors. This potential is not merely a calculation of material payoff but a convergence of bodily urgency, social trust, and mythic meaning.

We sketch an initial expression of this volitional potential:

V_i(t) = \alpha \cdot I_i(t) + \beta \cdot R_i(t) + \gamma \cdot M_i(t) - C_i(t)

Where:

  • V_i(t) is the volitional potential of individual i at time t. This represents the net pressure (positive or negative) toward taking transformative action—such as rebellion, dissent, defection, or innovation.
  • I_i(t) ∈ [0,1]: Impulse activation, representing the somatic or emotional charge experienced by an individual. This could be triggered by immediate suffering, injustice, or bodily threat. It corresponds to t₁ – Impulse Time.
  • R_i(t) ∈ [0,1]: Relational field strength, indicating the degree of perceived support, mutual recognition, and trust from one’s immediate network or social field. It corresponds to t₂ – Relational Time.
  • M_i(t) ∈ [0,1]: Mythic resonance, capturing the alignment with sacred values, cultural narratives, ancestral memory, or existential significance. It corresponds to t₃ – Mythic Time.
  • C_i(t) ∈ ℝ⁺: Cost of action, which can include personal risk, material loss, social ostracization, or existential threat. This could be dynamic and context-sensitive.
  • α, β, γ ∈ ℝ⁺: Attunement weights that determine how strongly each temporal mode influences the volitional potential in a given context. For example, in an emotionally volatile environment, α may be high; in a strongly bonded community, β dominates; in times of symbolic rupture or collective trauma, γ may rise.

Interpretation:

  • When V_i(t) > 0, the forces of impulse, relationship, and myth outweigh perceived cost, making transformative action more likely.
  • When V_i(t) < 0, cost still dominates; the actor remains compliant, silent, or immobilized.

This formulation retains the basic idea of “payoff” from game theory but expands it into a time-evolving, multidimensional field that includes psychological, emotional, social, and symbolic metrics.
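A minimal numeric sketch of the expression (every weight and input below is invented purely for illustration; nothing is calibrated):

```python
def volitional_potential(impulse, relational, mythic, cost,
                         alpha=1.0, beta=1.0, gamma=1.0):
    """V_i(t) = alpha*I_i(t) + beta*R_i(t) + gamma*M_i(t) - C_i(t).

    impulse, relational, and mythic lie in [0, 1]; cost is a nonnegative real.
    """
    return alpha * impulse + beta * relational + gamma * mythic - cost

# Hypothetical actor: high urgency, moderate solidarity, strong mythic
# resonance, but a steep cost of action.
print(volitional_potential(0.8, 0.5, 0.9, cost=2.5))            # -0.3: stays compliant
# The same actor inside a strongly bonded community (beta raised).
print(volitional_potential(0.8, 0.5, 0.9, cost=2.5, beta=3.0))  #  0.7: action likely
```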

Note:

  • All variables will require clear operational definitions, units of measurement, and context-sensitive calibration; these will never be hard metrics, but trends and potentials.
  • Linear addition is a simplification. Future versions may include nonlinear interactions, feedback loops, or threshold effects.

Collaboration on development is invited.

5. Sean’s Clones: Identity, Recognition, and the Illusion of Separation

Mystical traditions such as Christian mysticism, Sufism, Kabbalah, and certain secular metaphysics suggest that separation is an illusion. From this view, perceiving others—like a clone in a thought experiment—as versions of self can drastically shift one’s volitional calculus.

To mathematically integrate this ontological unity, we introduce an Identity Integration Index:

B_i(t) = \frac{1}{N} \sum_{j=1}^{N} \mathbb{I}_{j \in \text{Self}_i(t)}

Where:

  • B_i(t) reflects the fraction of the N other agents that individual i includes within their sense of self.
  • \mathbb{I} is an indicator function (1 if j is part of i’s identity, 0 otherwise).

This transforms the cost function:

C_i(t) = C_0 \cdot (1 - B_i(t))

As more agents are included within the self-boundary, perceived costs decrease. In the limit where B_i(t) → 1, harming others becomes equivalent to harming oneself—reinforcing cooperation.
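Continuing the numeric sketch from earlier (the agent names and population are hypothetical):

```python
def identity_integration(included, population):
    """B_i(t): fraction of the N other agents inside i's sense of self."""
    return sum(1 for j in population if j in included) / len(population)

def transformed_cost(base_cost, b):
    """C_i(t) = C_0 * (1 - B_i(t)): perceived cost shrinks as identity widens."""
    return base_cost * (1 - b)

others = ["clone", "kin", "stranger_1", "stranger_2"]
b = identity_integration(included={"clone", "kin"}, population=others)
print(b)                          # 0.5
print(transformed_cost(2.5, b))   # 1.25
```

Plugged back into the earlier example (default weights), dropping the cost from 2.5 to 1.25 flips V_i(t) from -0.3 to 0.95: widening the identity boundary alone makes action likely.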


Yes. The idea of a clone is just an abstraction of another self. Love, connection, tribe or respect for other shared identity, or holding a worldview that considers all life an extension of self, will to some degree or another have a similar effect.

I’ve already explained why it can: if the utility function is based on preferences expressed through behaviour, then this can include self-sacrificing preferences.

There appear to be a number of duplications in your formulas, e.g. A=A=A, and/or inconsistency in the subscripts, which makes it hard to follow.

It would be interesting I think to explore exactly what the relationship is between Vi(t) and Ui(s) - for one thing the former is a function of time, whereas the latter is a function of strategies - presumably the former is also supposed to be a function of strategies, essentially a dynamic payoff matrix, and this would be interesting to formalise.

Particularly interesting would be to explore Ui(s) as a function of time, i.e. Ui(s, t) and connect this to the transition points at which prisoner’s dilemmas become stag hunts, or pure cooperation games.
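One hypothetical way to set that up - linear interpolation between two fixed matrices is purely an assumption of this sketch, as are the payoff numbers:

```python
import numpy as np

# Row player's payoffs as [[CC, CD], [DC, DD]] (illustrative numbers).
PD        = np.array([[4.0, 0.0], [5.0, 2.0]])   # prisoner's dilemma at t = 0
STAG_HUNT = np.array([[6.0, 2.5], [5.0, 3.0]])   # stag hunt at t = 1

def U(t):
    """U_i(s, t): a payoff matrix drifting from PD toward stag hunt."""
    return (1 - t) * PD + t * STAG_HUNT

def classify(m):
    cc, cd, dc, dd = m[0, 0], m[0, 1], m[1, 0], m[1, 1]
    if dc > cc and dd > cd:
        return "prisoner's dilemma"   # defection dominates
    if cc > dc and dd > cd:
        return "stag hunt"            # two equilibria: CC and DD
    return "transition"

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, classify(U(t)))          # flips from PD to stag hunt past t = 0.5
```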

I realize I skipped some explanatory steps, so please let me know if anything still doesn’t make sense.

The framework doesn’t claim which interpretation of identity is “correct” - it just models each decision-relevant reason as having an implicit identity boundary, and how different identity boundaries promote different behaviors. In the thought experiments, clone-exclusive boundaries say “defect once they’ve cooperated” while clone-inclusive boundaries say “still cooperate.”

You’re absolutely right about love being sufficient reason not to hurt someone. Using the framework as a lens, love as a reason generates its own implicit identity boundary: {me, loved ones} | {others}. The framework just maps these implicit boundaries in a generalizable way.

Us all becoming Buddhas is one way to solve PD (Buddha-like as a proxy for enlarging my identity boundary). However, it is quite a demanding one.

Whilst I personally think that a) most of us becoming a bit more Buddha-like is important and some of us becoming much more Buddha-like is crucial … I also think we can look to cultural evolution (norm dissemination, value/view “contagion”) as a way to go from a small vanguard to majority switching quite quickly.

“Now we have made Italy, now we must make Italians” – Azeglio 1861

The real juice for me in PD is dealing with defectors – even in a society of Buddhas, how do we deal with rotten apples? cf. Why Enlightenment Alone Won’t Save Us (and that's ok)

You’re right that the thought experiments deviate from formal PD (using clones, breaking simultaneity). I’m using PD-like scenarios to cleanly attribute each motivating reason to its implicit identity boundary - not to solve or analyze PD itself. PD-like scenarios are useful because they create sharp conflicts between different identity boundaries, making visible what’s usually complex or hidden in everyday cooperation decisions.

Your point about identity boundaries changing the game itself is compatible with the framework examining how and why those game-changes happen. What the framework explores is the cognitive/psychological layer before that game crystallizes - when an agent holds multiple potential reasons (each with different boundaries), and must arbitrate between them to determine which game they’re playing.


Good article. In relation to your Lennon and McCartney reference, here is a bit of a lesson plan I’ve been teaching for the past couple of years.

==================

Information Security and the Problem of Evil

”All you need is love.” - the Beatles

If you agree with Lennon and McCartney that love is all you need, feel free to skip the rest of this course. Everyone in the world only wants the best for you and no one is going to mess with your data. Still reading? Then someone, somewhere, is out to steal your data, hack your files, impersonate you online, ruin your reputation, drain your financial accounts, or fry the circuits that control your grid.

Any story of IT security needs villains. Bad actors. Those who mean you harm. But who are these bad actors and why are they “bad”? IT security assumes conflict. Us versus them. Generally speaking, “we” are out to protect “ours”. “They” are out to take “ours” for themselves.

A very interesting and useful question in today’s globalizing society, though, is where do we draw the lines around “us” and “ours” and who exactly is the “we” who is drawing those lines?

As a thought - how about love being the only identity boundary - all the rest being utility ones? Not sure whether it makes a difference in terms of self-harm.

Thanks, that’s helping to refine my exploration.

Nash Equilibrium is a snapshot of alignment under constraint — a stillness in the dance. Yes, it can accommodate self-sacrificing behavior — but only if that behavior is already encoded in the actor’s utility. What it cannot easily hold is the absence of human fixity, the evolution of values, the relational unfolding of selves, or the reweaving of the game itself.

This is not a refutation, but a dimensional expansion.

Apologies for typos and bad formatting; I’m taking stuff across platforms and things get messed up, plus I’m pretty hopeless at proofreading. My aim is to formalise meaning and language into firm, machine-readable equations, for the development of structurally ethical AI; I am not a mathematician, or formally trained in logic.


Yes, this is what we are talking about: love, attractors, boundaries (or not), being porous or blurred, sensing some “whole” of which we are only a part?