The hostile telepaths problem — LessWrong (2024)

Epistemic status: model-building based on observation, with a few successful unusual predictions. Anecdotal evidence has so far been consistent with the model. This puts it at risk of seeming more compelling than the evidence justifies just yet. Caveat emptor.

Imagine you're a very young child. Around, say, three years old.

You've just done something that really upsets your mother. Maybe you were playing and knocked her glasses off the table and they broke.

Of course you find her reaction uncomfortable. Maybe scary. You're too young to have detailed metacognitive thoughts, but if you could reflect on why you're scared, you wouldn't be confused: you're scared of how she'll react.

She tells you to say you're sorry.

You utter the magic words, hoping that will placate her.

And she narrows her eyes in suspicion.

"You sure don't look sorry. Say it and mean it."

Now you have a serious problem. You don't have an internal "actually mean it" button. And yet here's Mom peering into your soul and demanding that you both have that button and press it. Trying to appease her didn't work. She needs you to be different — and she's checking.

What can you do now?

This is a template for what I've come to call "the hostile telepaths problem". I think it's a common feature of social problems. The hostile telepaths problem is when you're dealing with a being (a) who can kind of read your internal experiences and (b) whom you don't trust not to make your situation worse because of what they find in you.

There are lots of solutions to the hostile telepaths problem. I don't claim to know all of them. But recognizing some common ones has helped clarify a lot of my thinking — particularly around self-deception and akrasia.

And getting very clear on the nature of the problem makes identifying real solutions way easier. This fact produces some previously-surprising-to-me predictions, especially for trauma processing and for making emotionally difficult decisions.

I'll try to spell out what I mean with some theory and a few examples.

There's one really tricky solution to the hostile telepaths problem. It deserves some special front-loaded attention before I name some other solutions.

Here I'll try to spell out its logic with a modification of Newcomb's Problem.

Suppose that instead of Omega, you're dealing with Omega-V. Omega-V is an extremely good but imperfect predictor of someone's box choice.

Through some clever sleuthing, you deduce why. Omega-V is scanning only folks' visual thinking for a little while prior to their box choice. It's extremely good at tracking the difference between the mental visualizations of one-boxers vs. two-boxers… provided they don't know Omega-V is scanning specifically their visual thinking.

So now you have an edge. Maybe you can trick Omega-V. Maybe you can visually think like a one-boxer and use other modes of reasoning (like inner talk) to plan on actually two-boxing. You just have to be very careful to have any "leaks" of reasoning into visualization look totally normal for a one-boxer.

If you decide it's worth the risk to try for this deception, the visual reasoning part of your mind has to be kind of siloed from the rest. It's like you have a subagent that truly honestly believes you're a one-boxer. It's the non-visual reasoning parts of you that are aware of the real plan.

Another way of saying this is: in normal Newcomb's Problem, you can't tell whether you're in the real world being faced with Omega's offer or if you're instead Omega's simulation of real-you and you're revealing what real-you will do. The situation is similar with Omega-V, but in this case it's specifically your visual reasoner (viewed as kind of a subagent) that has to be built in such a way that it can't tell whether it's a simulation. Many other parts of you can tell perfectly well what's going on![1] But for the deception to work, the way in which you plan on two-boxing cannot be something your inner visual reasoner can even deduce.
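The gamble against Omega-V can be made concrete with a rough expected-value sketch. Everything numeric here is an assumption: the post names no payoffs, so the usual $1,000,000 / $1,000 figures from the standard statement of Newcomb's Problem are borrowed, and `accuracy` and `leak` are hypothetical parameters standing in for "extremely good but imperfect" and for the risk of the real plan leaking into visualization.

```python
# Assumed payoffs from the usual statement of Newcomb's Problem; not from the post.
BIG = 1_000_000   # opaque box, filled only if one-boxing was predicted
SMALL = 1_000     # transparent box, always filled


def expected_payoff(takes_both: bool, p_predicted_one_box: float) -> float:
    """Expected winnings, given the chance that one-boxing was predicted."""
    return p_predicted_one_box * BIG + (SMALL if takes_both else 0)


def compare(accuracy: float, leak: float) -> dict:
    """Compare three strategies against a predictor that reads one channel.

    accuracy: chance Omega-V classifies the scanned channel correctly (assumed).
    leak: chance the hidden two-boxing plan shows up in the scanned channel (assumed).
    """
    # If the plan leaks, the channel looks like a two-boxer's; otherwise a one-boxer's.
    p_one_box_predicted = (1 - leak) * accuracy + leak * (1 - accuracy)
    return {
        "honest_one_box": expected_payoff(False, accuracy),
        "honest_two_box": expected_payoff(True, 1 - accuracy),
        "siloed_two_box": expected_payoff(True, p_one_box_predicted),
    }


if __name__ == "__main__":
    for leak in (0.0, 0.001, 0.1):
        print(leak, compare(accuracy=0.99, leak=leak))
```

Under these assumptions the siloed plan beats honest one-boxing only when `leak` is below `SMALL / ((2 * accuracy - 1) * BIG)`, which is about 0.1% at 99% accuracy. That is one way to read "very careful": the silo has to be nearly airtight before the deception is worth the risk.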

Now, let's take this one step further.

Let's suppose we're instead dealing with Omega-C. Also an imperfect predictor. But this time you learn that Omega-C scans specifically all of — and only — your conscious thinking.

At this point, the right strategy is to consciously decide you're going to one-box.

But some people still find themselves two-boxing and winning. Even while consciously intending to one-box. Sometimes they get confused and say the wrong thing or push the wrong button. Some of them trip and accidentally grab both boxes. Some laugh and grab both boxes and find them both full.

Assuming that at least some of those outcomes are strategic… how?

Hint: now you (by which I mean conscious parts of you) are maybe the siloed part. Basically an illusion meant to trick Omega-C.

The thing is, you cannot tell. Because if you could, then the deception — if it were there — wouldn't work.

Which is to say: on the inside, you cannot tell the difference between running this Omega-C deception strategy and honestly truly just being a one-boxer.

And furthermore, it's actually to your advantage not to check. Even if you were to find some way of being able to tell the difference!

But because of that, if you're aware of this deception strategy, it might work best if you honestly (from your perspective) look for some way to tell if you're running it. Actively trying to "come clean". That's a pretty powerful signal to Omega-C that you really are a one-boxer.

So if you can make it so that it seems to you like you really are looking, including accounting for awareness of possible non-conscious "shadow" reasoning, and you find none, but you still two-box in the end…

…well, you win!

…while the conscious parts of you are confused and surprised. And possibly quite dismayed.
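The claim that the conscious part cannot tell the two cases apart can be sketched as a toy model. This is only an illustration under the assumption that an agent splits cleanly into a conscious channel and a hidden one; the class and function names are hypothetical and not part of the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    conscious_plan: str  # what the agent sincerely believes it will do
    hidden_plan: str     # what actually drives the final action


def omega_c_predicts(agent: Agent) -> str:
    """Omega-C reads conscious thinking and nothing else."""
    return agent.conscious_plan


def introspect(agent: Agent) -> str:
    """Introspection is itself conscious thinking, so it sees the same channel."""
    return agent.conscious_plan


def act(agent: Agent) -> str:
    """The action follows the hidden plan, whatever the conscious one says."""
    return agent.hidden_plan


sincere = Agent(conscious_plan="one-box", hidden_plan="one-box")
self_deceiver = Agent(conscious_plan="one-box", hidden_plan="two-box")

if __name__ == "__main__":
    for agent in (sincere, self_deceiver):
        print(omega_c_predicts(agent), introspect(agent), act(agent))
```

Both agents look identical to `omega_c_predicts` and to `introspect`; they differ only in `act`. In this toy version, any check the conscious part can run is a function of `conscious_plan` alone, which is why looking harder does not separate the two cases.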

Sketch of a real-world version

I call this pattern "Newcomblike self-deception" as a nod to Nate Soares' point that Newcomblike problems are the norm.

The deal is that people are kind of telepathic, in the sense of using things like vocal tone and body language and microexpressions and so on to intuitively deduce what might otherwise be hidden in others. They're scanning for things like intent and strategy and emotional tone. Often this telepathy is in service of synching up ("Hey, you seem upset, is something wrong?"), but it doubles as threat detection.

This telepathy is imperfect. Which means that sometimes Newcomblike self-deception is in fact a viable strategy.

I'd like to name one way I think this type of self-deception can actually happen in a person. It might be the main way it happens, or it might be rare. I honestly don't know. But it's one I've in fact seen in myself[2] and I think I've observed in some others.

By some mysterious method, it's possible to contract your awareness — by which I mean, the space of things you're actually aware of can be smaller than the space of sensory inputs (including mental experiences like thoughts and memories). Lots of people experience this when watching TV (losing awareness of the room), or when deep in flow work (not noticing hunger for hours while programming).

If you construct a sort of fake self in your mind, and then contract your awareness around that fake self, it can seem to you on the inside like you really are the way depicted in the fake self. Like it's not fake, it really is who you are.

If you also build up explanations to your fictitious self about why things outside that fiction either are consistent with it or don't matter, then you both (a) can honestly display to hostile telepaths that you (here meaning fake you) are being fully sincere in not hiding anything and (b) possibly give the telepaths ways of discounting the unavoidable signals that you (here meaning you holistically) are hiding something.

For instance, as a child whose mother says to you "Say you're sorry and mean it", you might be able to strategically misinterpret your fear of Mom's Wrath as "being really sorry". As long as you're not aware that that's what you're doing, it might work very well! She might read your distress as you really meaning it. ("I'm sorry I'm sorry I won't do it again please Mom I'm sorry…!") And you can keep yourself from being aware of this whole strategy by keeping your awareness contracted on the fictitious version of yourself that's "bad" and "very sorry", and keeping your understanding of the real problem outside of your awareness.

Possible examples in real life

Here are some examples I think I've actually seen — in culture, in others, and in myself:

  • I think the thing with kids that I sketched above really does happen. More generally, I think similar applications of Newcomblike self-deception are the root cause of (a certain very common kind of) shame: it's a strategic mislabeling of one's pain as being about one's "flaws".
  • Relatedly, lots of folk mislabel their experience as "I hate math." Most people I've talked to who say this actually hate the coercion and gaslighting used almost universally in math classes. The real problems most folk are focused on in math class are social, like "Appease the teacher" and "Get Mom & Dad off my back." But teachers and parents might insist to a student that "you need to try harder" with the math itself while seeming to sort of telepathically scan them for whether they are in fact trying. I think this can sometimes lead students to strategically mislabel their distress about the situation to themselves.
  • Gurus getting involved in sex scandals. I'm sure that at least some of them have been very sincere about what amounts to real Jungian shadow work. But somehow all that sincerity mysteriously ends up hiding and serving (instead of revealing and dealing with) an underlying drive to just get laid.
  • Likewise people "accidentally" cheating. Sometimes folk really are just surprising-to-them vulnerable in some situation and don't have the right kind of discipline when they turn out to need it. But the fact that that ever happens can act as a cover. It's especially obvious in cases of repeated "accidental" cheating.
  • I've seen four friends, as mothers, stay with and defend abusive partners (boyfriends or husbands) for years. Each would often insist that he's just stressed, or it's a frequent misunderstanding but they love each other, etc. In three cases it became possible for the mother to consider that he might be abusive after a change in her work gave her enough money to support herself and her child without him if need be. In the fourth case, the mother got a lot of social support such as a place to live and people she trusted to take care of her and her child, and then she had room to consider her partner's actions as abusive.
  • If I'm upset with a friend and I'm worried that they can't handle what I'm upset with them about, sometimes I can't think straight about what my problem with them is while I'm talking to them. My mind gets foggy, my concepts seem mushy even to me, the words I remember from journaling about it before now form what feels like a gibberish argument, etc. Often this fog suddenly clears up if I get a vivid sense from my friend that our friendship will be fine after we talk. It also gets clearer if the issue is so big that I realize I'm fine with them not being in my life after we talk.
  • Badly wanting someone to like you can make them like you less. So how do you get them to like you? Not by being aware that you're asking that question! But maybe if you do things for them without knowing that's why you're doing them…? ("Oh, I forgot Bob likes sushi! I just got some because I felt like it, honest!") And maybe if you add an extra dose of self-loathing ("God, I'm being a creep, aren't I? I always do that!") you can pass others' scrutiny here by eliciting care & concern when you might otherwise get caught.

I'm not trying to be exhaustive here. There are tons more examples.

We can't actually penetrate our own Newcomblike self-deception without having another viable (to us) strategy for dealing with hostile telepath problems.

However, if we do have another strategy in a given instance, then in that instance it can be safe to look. The self-deception can lift.

Having power

One alternative strategy type is coming to trust that you're able to handle the consequences of being accurately seen.

Such as the moms in the abusive partners example above: each one could acknowledge her self-deception once it was safe for her abusive partner to know too. She got enough power (financial or social) to protect herself and her child, making the telepathic scan no longer a dire threat.

I think a lot of "trauma processing" amounts to this self-empowerment strategy. But it's more like, noticing you already have power. I bet a lot of foundational self-deception habits come from being a child faced with telepaths (adults) who have a lot of power over them. A kid who deals with Mother's "Say sorry and mean it" demand with self-deception might then grow up to become really apologetic and "have low self-esteem". But it's just an old strategy for dealing with Mother that hasn't made contact with the fact that Mother isn't that powerful over them anymore. It's now actually just fine for her to know they're not "really sorry". If this raw physical truth comes into contact with the impulse to "be sorry", the mental firewall might simply collapse, and the mislabeling will stop.

So in many cases, "trauma processing" can basically mean noticing you're not a child anymore. You have power. So you don't have to appease the hostile telepaths just because they're adults. They can just know your internal state, and you (trust that you) can handle the consequences of them knowing.

Building emotional resilience is like this, I think. If you (trust that you) can handle the emotional and somatic sensations of others being upset with you, then you don't have to hide the parts of you that might make them upset. They can just be upset. While you might not like it, you know you'll be fine.

(Not to say anything about what's ultimately good to do here. Caring about others' reactions totally makes sense for other reasons, like the health of the community we're in. Here I'm focusing specifically on what can solve the hostile telepaths problem without self-deception.)

Occlumency

Another solution type is occlumency. Which is to say, if you trust you can keep your real goals and/or strategies hidden from a hostile telepath even if you consciously know what your goals/strategies are, then it's safe to consciously know them.

(This is something like switching from Omega-C to Omega-V.)

A classic example is in WWII when Nazis come knocking and ask if you're harboring any Jews. The analog of one-boxing here is just not harboring Jews. Newcomblike self-deception doesn't seem plausible to me here. You very much don't have the power to handle the consequences of being caught "two-boxing". So if you're helping refugees, you probably have to lie convincingly. And if self-deception were a plausible strategy here, you wouldn't need it to the extent that you trust your ability to hide the truth from the Nazis even if you know the truth.

I think many psychopaths[3] use occlumency quite a lot. I've met some who know full well that they're trying to manipulate others and are presenting a façade to do so. It works for them in part because they don't send implicit distress signals around thinking they're bad for being manipulative: they're not nervous, so they don't need to explain their nervousness away.

There's a moral tangle here. Honesty is important for connection, integrity, and communal health. But you might not trust that it's safe to reveal the truth to a hostile telepath.[4] In this case, the moral injunction not to lie makes occlumency harder (because of fear of being caught, plus doubt about whether you should be using occlumency at all). This situation can leave self-deception as your only viable solution, which incidentally means you're still not being honest!

I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it's helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it's safe for you to eventually be fully honest.

Solution space is maybe vast

I've named three solutions to the hostile telepaths problem:

  • Newcomblike self-deception
  • Having power
  • Occlumency

These aren't the only ones. A pretty simple one is simply running away and avoiding them. Another is investigating whether the telepaths are in fact hostile and discovering they're not (if that's true). Yet another is to jam telepathic scans with emotional charge that backs privacy norms. ("It's none of your business whether I 'really am' sorry!")

The important part isn't that we have a full taxonomy. That might be helpful, I don't know. The important part, as far as I'm concerned, is that by being very clear about what problem we're solving, we can tell when something is — and is not — a solution.

By this model, to end (Newcomblike) self-deception, we have to remove the need for it. This means solving each instance of the hostile telepath problem some other way.

This is kind of tricky in practice. When you use self-deception to deal with a hostile telepath, you can't know that that's what you're doing. You[5] can't even know which hostile telepath problem you're solving! So how do you come up with another solution?

I don't have a provably general answer, but I have a pretty general approach that makes sense to me and has clearly worked several times. I'll share that approach here.

Welcome self-deception

First is welcoming that I'll self-deceive.

But this isn't "Well, I'm going to do it anyway, so I might as well be okay with it." That's nonsense: you probably can't just "be okay" with it. And trying probably makes the problem worse![6]

I mean something more wholehearted. If I self-deceive, it's because it's the best solution I have to some hostile telepath problem. If I don't have a better solution, then I want to keep deceiving myself. I don't just tolerate it. I actively want it there. I'll fight to keep it there!

This is somewhat akin to dealing with Omega-C by saying:

Look, I know it's possible I'm running a deception strategy. I could spend a bunch of energy trying to suss it out as a costly signal that it's not there. But at a policy level I'm just not going to do that. Not because I have evidence that I'm not following up on, but because I don't want to add stress to myself in the world where I really am self-deceiving. Since I'm doing this regardless of whether the deception strategy is running, it's not information about whether I'm secretly trying to two-box.

This relieves pressure. If I have some sense that I'm self-deceiving, and my attitude is to back the deception instead of trying to penetrate it, then the hidden part of me running the deception doesn't have to engage in an internal arms race with me. We become same-sided.
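The last sentence of that policy statement is a claim about evidence, and it can be checked with Bayes' rule. The sketch below assumes the "observation" is that I decline to audit myself for hidden plans; the prior and likelihood values are made-up numbers for illustration only.

```python
def posterior(prior: float, p_obs_if_deceiving: float, p_obs_if_sincere: float) -> float:
    """P(deceiving | observation) by Bayes' rule."""
    numerator = prior * p_obs_if_deceiving
    denominator = numerator + (1 - prior) * p_obs_if_sincere
    return numerator / denominator


if __name__ == "__main__":
    prior = 0.3  # assumed prior chance that a deception strategy is running

    # Blanket policy: I decline to audit in both worlds, so the likelihoods match.
    print(posterior(prior, p_obs_if_deceiving=1.0, p_obs_if_sincere=1.0))

    # No policy: suppose declining were more typical of the deceiving world.
    print(posterior(prior, p_obs_if_deceiving=0.9, p_obs_if_sincere=0.2))
```

When the two likelihoods are equal, the posterior equals the prior, so the refusal to look carries no information. Only when declining is more likely in one world than the other does it shift the estimate, which is the situation the policy is meant to avoid.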

Look away when directed to

Once I really back my own self-deception, it becomes easier to notice signs I'm doing it.

This works way better if I trust my occlumency skills here. If I don't feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I'm still safe from hostile telepaths.

Seeing where I self-deceive doesn't mean I see what the deception is. In practice it's more indirect than that. What I mean are things like:

  • Revealed preferences. (Akin to noticing I two-boxed "by accident".)
  • My mind suddenly going foggy.
  • Forgetting what I was thinking about.
  • Mental chatter getting loud.
  • Suddenly being very uninterested in what I'm focused on.
  • Getting abruptly absorbed in something unrelated.
  • My attention scattering.
  • Losing awareness of my body, or parts of my body, or my body drives (like hunger).
  • Body activation signs: holding my breath, tensing my shoulders, quickened speech, etc.
  • Energy crash or getting really sleepy. (Like a freeze response.)
  • A sudden addictive impulse.
  • I feel shame, inadequacy, or otherwise think I'm broken or flawed or bad in some way.
  • Etc.

I don't mean this as an exhaustive list. Nor do I mean it as things to look out for. Nor do I mean that these always imply that self-deception is going on.

What I mean is, there are things a person does to maintain self-deception. If you basically promise the strategic not-conscious-to-you part that you really will respect the strategy, then it doesn't have to keep you so firmly out of the loop. Then you can potentially start picking up on some signposts like these ones.

Part of the deal is, when you notice such a possible signpost, you look away. You notice it and you drop the inquiry. Because until you have a non-self-deceptive strategy for whatever the real problem is, you don't want to break the one strategy you have.

For instance, sometimes I'll think about responding to an email… and I start getting sleepy. If I push, I start wanting to watch YouTube. These are signs that something in me doesn't trust it's safe for me to look there. Maybe it involves a decision that requires me to ask myself an unsafe question. I don't know — and I don't try to figure it out. At least not right away. Instead I back off and direct my attention elsewhere. Maybe I go cook something, or take a walk. I consciously distract myself from the tension point.

In my experience, this alone can often eliminate most of the stress involved in self-deception. It becomes fine. Annoying, glitchy, but no longer fraught with anxiety and self-doubt.

Hypothesize without checking

After a while I kind of get a "negative space" sense of what the self-deception is about. I continue not to look, out of something like respect. But I still have a hint.

Like if there's an email I keep freezing around. I can tell there's something there. I might even have some intuitive guesses about what it is!

But I do not check. I don't introspect on whether my guesses feel right.

Instead, I hypothesize. What hostile telepath problem might someone in my shoes be trying to solve such that this behavior arises?

For instance, let's suppose the person is asking for me to run an event this weekend. I might hypothesize like this, intentionally referring to myself in third person:

Maybe Valentine doesn't actually want to do it, but he's scared that letting them know will make them think he's actually uninterested in them in general, which might have them closing opportunities he wants with them in the future.

Importantly, I am not introspectively checking. I'm not asking if I think the above really is what's going on with me. I'm just noticing that, viewing myself in third person, this model does seem to fit the evidence.

I'm also not trying to construct a plan to verify what's going on! Here Nature wants her secrets kept. I do not try to peek under her skirt.

Instead, I notice what Valentine (i.e., me in third person) in this hypothetical could maybe do instead of Newcomblike self-deception. What would be a viable alternative strategy for him?

Maybe Valentine could meditate on their possible disapproval, and come up with a plan for what happens next in which he's okay. (Building power.)

At this point I could just implement this possible solution. I don't have to check if it's relevant to my situation: there's not much cost in leaving myself a line of retreat this way.

If it turns out there's been Newcomblike self-deception going on, and if this hypothetical solution really did resolve the core problem that the self-deception was solving, then the self-deception should basically just lift.[7]

And if I still have an ugh field around the email, then I haven't addressed the real problem yet. Which is fine. Not ideal, but I'm still going to back any self-deception that might be there while I don't have a better option!

I can repeat this process. Hypothesize without checking, implement solutions that would work in the hypothetical, and find out what happens.

…at least unless and until I start getting frozen about this process. That might mean I'm getting too close to understanding the strategy before it's safe to do so.

Then I back off.
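The loop described in this section can be written down as a rough procedural sketch. It is only a schematic of the steps above, not a claim about how anyone's mind works; the function name, the callbacks, and the example hypotheses are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# (third-person guess, alternative strategy to put in place)
Hypothesis = Tuple[str, Callable[[], None]]


def hypothesize_without_checking(
    hypotheses: Iterable[Hypothesis],
    aversion_remains: Callable[[], bool],
    feeling_frozen: Callable[[], bool],
) -> Optional[str]:
    """Try alternative strategies one at a time without verifying any guess.

    Returns the guess whose alternative coincided with the aversion lifting,
    or None if nothing lifted or the process itself started to freeze.
    """
    for guess, implement_alternative in hypotheses:
        if feeling_frozen():
            return None  # back off; it may not be safe to get closer yet
        implement_alternative()  # a cheap line of retreat, useful even if the guess is wrong
        if not aversion_remains():
            return guess  # the need for the old strategy is gone
    return None  # real problem not addressed yet; keep backing the old strategy


if __name__ == "__main__":
    state = {"has_plan_for_disapproval": False}
    guesses = [
        ("he is too busy this weekend", lambda: None),
        ("he fears their disapproval",
         lambda: state.update(has_plan_for_disapproval=True)),
    ]
    print(hypothesize_without_checking(
        guesses,
        aversion_remains=lambda: not state["has_plan_for_disapproval"],
        feeling_frozen=lambda: False,
    ))
```

Note that nothing in the loop inspects whether a guess is true. It only implements the alternative and observes whether the behavior changes, which mirrors the "do not check" rule.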

Does this solve self-deception?

I don't know.

I didn't originally set out to make sense of self-deception. I was just trying to understand why people sometimes view themselves as flawed and in need of fixing.

It just turned out that that question was tied to a lot of others. Self-deception being one of them. A lot of them unified by considering the problem of hostile telepaths.

It seems worth noting that a bunch of the method I describe here — particularly the "hypothesize without checking" part — is derived. It amounted to a prediction that I tested and discovered worked as the model anticipated.

Likewise, occlumency being helpful. There might be other explanations for why getting better at privacy makes more thoughts thinkable. But I derived it from this one. And, again, it (anecdotally) seems to have worked as predicted.

These approaches work remarkably well on shame too, by the way. I might write a separate post on shame. Its logic is a bit different, but with a few adjustments I've found that shame dissolves extremely well in contact with these ideas.

With all that said, I don't think I'm in a position to say that I've solved self-deception. I don't know how I could know that. I'm not even convinced I've solved Newcomblike self-deception! My method seems plausibly general, but I don't have even the sketch of a necessity argument yet.

So, more work needed.

It seems to me that self-deception is solving a real problem. If we don't solve that real problem differently in a given instance, then in that instance we can't stop self-deceiving.

It seems to me that the real problem is (at least sometimes) hostile telepaths.

When I view hostile telepaths as the real problem I'm trying to solve, the perspective suggests what alternative solutions might look like, and it lets me check whether a given approach even can work as a solution.

And it seems to me that when I implement those alternative solutions, the result is sometimes that self-deception visibly falls away, non-mysteriously. It becomes obvious to me what was going on, and why.

I don't know if this model captures all cases of what we might want to call "self-deception". Maybe it does. But my impression is that it at least captures some cases that matter, and quite a lot of them.

  1. ^

    Note that having non-visual ways of thinking isn't enough to know you're not a simulation. What tells you you're not an Omega-V simulation is that you can reason in ways that (a) cannot be derived from your visual thinking and (b) change what you in fact do.

  2. ^

    Of course, this is something I became aware of after unraveling the structure in a few cases. It's not something that reveals itself while the structure works.

  3. ^

    By "psychopath" I mean someone with the cluster B personality disorder. I don't mean something derogatory. Nor am I (necessarily) referring to Gervais Principle psychopaths.

  4. ^

    To be clear, "hostile telepath" is a role, not an identity. Someone is a hostile telepath to you when they're scanning your mind and you don't trust they won't create problems for you based on what they find. Someone being a hostile telepath is less like them being a criminal and more like them being your lover or your foe. I say this because it's not a solution to identify "the hostile telepaths" in a community and reform or expel them; that approach is gibberish made of confused reification.

  5. ^

    If I were carefully describing this from the outside, I'd say that your false self can't know. "Self-deception" is really false self deception (as a strategy for deceiving hostile telepaths). The thing is, on the inside it doesn't feel like "your false self". That's the whole point! I'm describing this model in a way that's hopefully legible to the internal experience of actually running the strategy. Otherwise any instructions might make theoretical sense but won't be actionable. Sadly, this way of talking results in some ambiguities — precisely because the whole point of the strategy is to make something difficult to see clearly. Hopefully you can correct for this confusion as needed, sort of shifting to third-person and renaming things when the theory isn't clear.

  6. ^

    Why? Well, you need to "be okay" with it. But you're not. So what do you do with the fact that you're not okay with it? Loosely speaking, you've just turned your own conscious mind into an internal hostile telepath!

  7. ^

    In practice I find that not only does this work quite often, but now it sometimes works once I think of the alternative solution. I don't always need to implement it first. It feels to me like this result comes from having built internal trust that I really can and will respect my need for some strategy.
