Treaties vs Fusion

If you have subagents A and B, where A wants as many apples as possible and B wants as many berries as possible, and each values every additional fruit the same no matter how many they already have (constant marginal utility), then there are two classes of ways you could combine them, with fundamentally different behavior.

If a person, “Trent”, were a treaty made of A and B, he would probably do something like alternating between pursuing apples and berries, no matter how lopsided the prospects for apples versus berries. The amount of time and resources spent on each would be decided by the relative bargaining power of the subagents, independently of how much each was actually getting.

To B, all the apples in the world are not worth one berry. So if bargaining power is equal and Trent has one dollar to spend, and 50 cents can buy either a berry or 1000 apples, Trent will buy one berry and 1000 apples. Not 2000 apples. Vice versa if berries are cheaper.
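
To make the contrast concrete, here’s a minimal sketch in Python, using the numbers above; the function names and the equal split of the budget under equal bargaining power are my own modeling assumptions:

```python
# Prices from the example: 50 cents buys either 1 berry or 1000 apples.
APPLES_PER_DOLLAR = 2000
BERRIES_PER_DOLLAR = 2

def treaty_purchase(budget, bargaining_power_a=0.5):
    """Treaty: split the budget by bargaining power, ignoring prices."""
    apple_dollars = budget * bargaining_power_a
    berry_dollars = budget - apple_dollars
    return (apple_dollars * APPLES_PER_DOLLAR,
            berry_dollars * BERRIES_PER_DOLLAR)

def total_fruit_purchase(budget):
    """What a maximizer of apples + berries would do: buy the cheaper fruit."""
    if APPLES_PER_DOLLAR >= BERRIES_PER_DOLLAR:
        return (budget * APPLES_PER_DOLLAR, 0.0)
    return (0.0, budget * BERRIES_PER_DOLLAR)

print(treaty_purchase(1.0))       # (1000.0, 1.0): 1000 apples and 1 berry
print(total_fruit_purchase(1.0))  # (2000.0, 0.0): 2000 apples
```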

A treaty is better than anarchy. After buying 1000 apples, A will not attempt to seize control on the way to the berry store and turn Trent around to go buy another 1000 apples after all. That means Trent wastes fewer resources on infighting, although A and B may occasionally scuffle to demonstrate power and demand a greater fraction of resources. Most of the time, A and B are both resigned to wasting a certain amount of resources on the other. Unsurprising: no matter how A and B are combined, the result must seem like at least partial waste from the perspective of at least one of them.

But it still feels like there’s some waste going on here, “objectively” somehow, right? Waste from the perspective of what utility function? What kind of values does Trent the coalition have? Well, there’s no linear combination of the utilities of apples and berries such that Trent maximizes that combined utility. Nor does making their marginal utilities nonconstant help, because Trent’s behavior doesn’t depend on how many apples and berries Trent already has. What determines the allocation of new resources is bargaining outcomes, which are determined by threats and by what happens in case of anarchy, which in turn is determined by what the subagents and the whole agent can do in the future. What they have from the past, independent of the whole person’s choices, is irrelevant. Trent doesn’t have a utility function over just apples and berries; to gerrymander a utility function out of this behavior, you need to reference the actions themselves as well.
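
To see the first claim concretely, with the prices from the example above (a quick sketch of the algebra, mine rather than anything canonical):

```latex
% Let x \in [0,1] be the fraction of the dollar spent on apples,
% so Trent buys 2000x apples and 2(1-x) berries. Any linear combination
\[
  U(x) = w_A \cdot 2000x + w_B \cdot 2(1 - x)
       = (2000 w_A - 2 w_B)\,x + 2 w_B
\]
% is linear in x, hence maximized at x = 1 (2000 apples) or x = 0
% (2 berries), or constant if 1000 w_A = w_B. No choice of weights
% singles out the interior split x = 1/2 (1000 apples and 1 berry)
% that Trent actually makes.
```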

But note that if there were a 50-50 chance of which fruit would be cheaper, both subagents get higher expected utility if the coalition is replaced by the fusion that maximizes apples + berries. It’s better to have a 50% chance of 2000 utility and a 50% chance of nothing than a 50% chance of 1000 and a 50% chance of 1. If you take veil-of-ignorance arguments seriously, pay attention to that.
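
Spelled out per subagent, with the prices above and (my assumption) equal bargaining power under the treaty:

```latex
\[
  \text{Treaty:}\quad \mathbb{E}[U_A] = \mathbb{E}[U_B]
    = 0.5 \cdot 1000 + 0.5 \cdot 1 = 500.5
\]
\[
  \text{Fusion:}\quad \mathbb{E}[U_A] = \mathbb{E}[U_B]
    = 0.5 \cdot 2000 + 0.5 \cdot 0 = 1000
\]
```

Each subagent’s expectation nearly doubles, before either knows which fruit will be cheap.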

Ever hear someone talking about how they need to spend time playing so they can work harder afterward? They’re behaving like a treaty between a play subagent and a work subagent. Analogous to Trent, they do not have a utility function over just work and play. If you change how much traction the work has in achieving what the work agent wants, or change how fun the play is, this model-fragment predicts no change in resource allocation. Perhaps you work toward a future where the stars will be harnessed for good things. How many stars are there? How efficiently can you make good things happen with a given amount of negentropy? What is your probability that you can tip the balance of history and win those stars? What is your probability that you’re in a simulation and the stars are fake and unreachable? What does it matter? You’ll work the same amount in any case. It’s a big number. All else is negligible. No amount of berries is worth a single apple. No amount of apples is worth a single berry.

Fusion is a way of optimizing values together so that they become fungible: you can make tradeoffs without keeping score, apply your full intelligence to optimizing additional parts of your flow chart, and realize gains from trade without the loss of agentiness that democracy entails.

But how?

I think I’m gonna have to explain some more ways how not to, first.

False Faces

When we lose control of ourselves, who is controlling us?

(You shouldn’t need to know about Nonviolent Communication to understand this. Only that it’s “hard” to actually do it.)

Rosenberg’s book Nonviolent Communication contains an example where a boy named Bill has been caught taking a car for a joy ride with his friends. The boy’s father attempts to use NVC. Here is a quote from Father.

Bill, I really want to listen to you rather than fall into my old habits of blaming and threatening you whenever something comes up that I’m upset about. But when I hear you say things like, “It feels good to know I’m so stupid,” in the tone of voice you just used, I find it hard to control myself. I could use your help on this. That is, if you would rather me listen to you than blame or threaten. Or if not, then, I suppose my other option is to just handle this the way I’m used to handling things.

Father wants to follow this flow chart.

But he is afraid he will do things he “doesn’t want to”. Blaming and threatening are not random actions. They are optimizations. They steer the world in predictable ways. There is intent behind them. Let’s call that intender the hidden Father. Here’s the real flow chart.

The hidden Father has promised Father he can get what he wants without threats and blame. The hidden Father doubts this but is willing to give it a try. When it doesn’t seem like it’ll work at first, the hidden Father helps out with a threat to take over. It’s a good cop / bad cop routine. Father, who uses only NVC, is a false face and a tool.

Father thinks that the hidden Father is irrational. It’s a legitimate complaint. The hidden Father is running some unexamined, unreflective, incautious software. That’s what happens when you don’t use all your ability to think to optimize a part of the flow chart. But Father can’t acknowledge that that’s something he’d do, and so can only do it stupidly. Father can’t look for ways to accomplish the unacknowledged goals, or any goals in worlds he cannot acknowledge might exist. He can’t look for backup plans to plans he can’t acknowledge might fail. The whole person’s self-identified self (Father) is the thrall of artifacts, so the hidden Father can only accomplish his goals without it.

Attributing revealed-preference motives like this to everything people do does not mean believing that everything someone does is rational. Just that virtually all human behavior has a purpose: it is based on at least some small algorithm that discriminates on some inputs and sometimes outputs that behavior. An algorithm which may be horribly misfiring, but which is executing some move that has been optimized to cause some outcome nonetheless.

So how can you be incorruptible? You can’t. But you already are. By your own standards. Simply by not wanting to be corrupted. And your standards are the best standards! Unfortunately, you are not as smart as you, and are easily tricked. In order not to be tricked, you need to use your full deliberative brainpower. You and you need to fuse.

I will save most of what I know of the fusion dance for another post. But the basic idea, from your perspective, is to anthropomorphize hidden parts of the flow chart, recognize their concerns as yours, be they values or possible worlds that must be optimized for, and then actually try to accomplish those optimizations using all the power you have. Here’s a trick you might be able to use to jump-start it. If you notice yourself “losing control”, use (in your own thoughts) the words the whole flow chart would speak. Instead of “I lost control and did X”, say “I chose to do X because…”. Turn your “come up with a reason why I did that” machinery on all your actions. Come up with something that’s actually true. “I chose to do X because I’m a terrible person” is doing it wrong. “I chose to do X because that piece of shit deserved to suffer” may well be doing it right. “I chose to do X instead of working because of hyperbolic discounting” is probably wrong. “I chose to do X because I believe the work I’d be doing is a waste of time” might well be doing it right. If saying that causes tension, because you think you believe otherwise, that is good. Raising that tension to visibility can be the beginning of the dialogue that fuses you.

Why just in your own thoughts? Well, false faces are often useful. For reasons I don’t understand, there are certain assurances that can be made from a false face that someone’s deep self knows are lies, but that still seem to make them feel reassured. “Yeah, I’ll almost certainly do that thing by Friday.” And I don’t even see people getting mad at each other when they do this.

Set up an artifact that says you tell the truth to others, and you’ll follow it into a sandboxed corner of the flow chart made of self-deception. But remember that self-deception is used effectively to get what people want in a lot of the default algorithms humans have. In my purism, I have probably broken some useful self-deceptive machinery for paying convincing lip service to socially expected myths, and I have yet to recover all the utility I’ve lost. I don’t know which lies are socially desirable, so I have to tell the truth, because the cost ratio between false negatives and false positives is lopsided. Beware. Beware, or follow your “always believe the truth” artifact into a sandboxed corner of the flow chart.

This sandboxing is the fate of failed engineering projects. And your immune system against artifacts is a good thing. If you want to succeed at engineering, every step on the way to engineering perfection must be made as the system you are before that step, and must be an improvement according to the parts really in control.

Engineering and Hacking your Mind

Here are two strategies for building things:

Engineering.

It’s about building things so that you can change one part without thinking about the whole thing. This allows you to build big things. Every part must reflect a global order which says how the parts should interact, and which aspects of total correctness depend on the correctness of which parts, so that if every part works, the whole works. In engineering, it’s common to “waste” effort meeting the specification in ways that will probably never be relied upon. This is so that when you design other parts, you only have to keep the order in mind as what they have to interact with, not the order plus guesses about whatever your past self (or other people) thought was reasonable.

Perfection in this approach is when you don’t even have to remember the exact interface behavior of the other modules. As you find yourself needing it, you just ask, “What would the ideal behavior be?”, assume it’s that, and build your new module on top of it. In practice I do this quite a lot with code I’ve written and with pieces of my mind. In engineering, bugs cancelling out bugs are still bugs, because anything that deviates from the order is liable to cause more problems later, when you assume the order holds.
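
A toy illustration of that last point, in Python with hypothetical names: two bugs that cancel so the output looks right, until a new module written against the order alone inherits one of them.

```python
# The global order (spec): dollars_to_cents(d) returns d * 100, and
# difference_cents(a, b) returns dollars_to_cents(a) - dollars_to_cents(b).

def dollars_to_cents(dollars):
    return dollars * -100  # Bug 1: wrong sign.

def difference_cents(a, b):
    # Bug 2: arguments swapped. The two sign errors cancel out here.
    return dollars_to_cents(b) - dollars_to_cents(a)

print(difference_cents(3.0, 1.0))  # 200.0 -- matches the spec, "works"

# A new module, written assuming only the order, inherits Bug 1:
def fee_in_cents(fee_dollars):
    return dollars_to_cents(fee_dollars) + 30  # expects positive cents

print(fee_in_cents(1.0))  # -70.0, not 130.0: the deviation surfaces later
```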

Hacking.

If engineering is like deontology, hacking is like consequentialism. What something is “really for” is part of the map, not the territory, and you aren’t attached to a particular use of something. Whatever works. Something being “broken” can make it more useful. Don’t waste time on abstractions and formal processes. They are not flexible enough to accommodate something you haven’t built yet. Think about the concrete things you have and need and can change.

Which should you use?

Engineering can be cumbersome to get off the ground, and can seem like it’s predictably always wasting more motion. But the things engineering creates can be more robust, and it can accommodate more complexity; it scales to large projects. Hacking stretches your cleverness, working memory, and creativity-in-using-things-unintuitively. Engineering stretches your foresight, wisdom, and creativity-in-fitting-to-a-format.

Straw-pragmatists pick hacking for large and long projects. If you are sufficiently ambitious, you need engineering. Implementing any kind of rationality in the human brain is a large and long project. Implementing rationality so you can gain the power to save the world, a common goal among my circles, is definitely ambitious enough to require engineering for the best chance of success.

“The human brain is so many hacks already. Engineering will never work. The only option is to pile on more hacks.”

Hardcore Buddhists are totally engineers, not hackers. You’ve seen them do things that’d be impossible for a hacker, like sitting still during self-immolation. Oh yeah: engineering yourself is DANGEROUS. I do not recommend building yourself according to an order that is not yourself. If you want to save the world, your art of rationality had better be at least that powerful AND automatically point itself in a better direction.

Hopefully my more concrete blog posts will help you understand an order that you can see delivers.

Self-Blackmail

I once had a file I could write commitments in. If I ever failed to carry one out, I knew I’d forever lose the power of the file. It was a self-fulfilling prophecy: any successful use of the file after a failure would be proof that a single failure didn’t have the intended effect, so there’d be no extra incentive left.

I used it to make myself do more work. It split me into a commander, who made the hard decisions beforehand, and a commanded, who did the suffering but had the comfort of knowing that if I just did the assigned work, the benevolent plans of a higher authority would unfold. As the commanded, the responsibility to choose wisely was lifted from my shoulders. I could be a relatively shortsighted animal and things would work out fine.

It lasted about half a year, until I put too much on it with too tight a deadline. Then I was cursed to be making hard decisions all the time. This seems to have improved my decisions, ultimately.

Good leadership is not something you can do only from afar. Hyperbolic discounting isn’t the only reason you can’t see/feel all the relevant concerns at all times. Binding all your ability to act to the concerns of the one subset of your goals manifested by one kind of timeslice of you is wasting potential, even if that’s an above-average kind of timeslice.

If you’re not feeling motivated to do what your thesis advisor told you to do, it may be because you only understand that your advisor (and maybe grad school) is bad for you and not worth it when it is directly and immediately your problem. This is what happened to me. But I classified it as procrastination out of “akrasia”.

I knew someone in grad school whose advisor had been out of contact for about a year. Far overdue for a PhD, they kept taking classes they didn’t need to graduate, so that they’d have structure to make them keep doing things. Not even auditing them; that way there was a reason to continue. They kept working a TA job which paid terribly compared to industry, in a field where a PhD is dubiously useful. If they had audited the classes, with only curiosity driving them to study, then perhaps the terrible realization that they needed a change of strategy would not have been kept in its cage.

Doing the right thing with your life is much more important than efficiency in the thing you’ve chosen. It’s better to limp in the right direction than run in the wrong one.

There are types of adulting that you can’t learn until you have no other recourse, and once learned, are far more powerful than crutches like commitment mechanisms. Learn to dialogue between versions of yourself, and do it well enough that you want to understand other selves’ concerns, or lose access to knowledge that just might be selected to be the thing you most need to know.

I am lucky that my universal commitment mechanism was badly engineered, that the clever fool versions of me who built it did not have outside help to wield even more cleverly designed power they did not have the wisdom not to use.

These days there’s Beeminder. It’s a far better designed commitment mechanism. At the core of typical use is the same threat by self-fulfilling prophecy: if you lie to Beeminder about having accomplished the thing you committed to, you either prove that Beeminder has no power over you, or prove that lying to Beeminder will not break its power over you, which means lying has no consequences, which means Beeminder has no power over you.

But Beeminder lets you buy back into its service.

It’s worse than a crutch, because it doesn’t just weaken you through lack of forced practice. You are practicing squashing down your capacity to act, in the moment, on “What do I want? What do I have? How can I best use the latter to get the former?” When you set your future self up to lose money if they don’t do what you say, you are practicing being blackmailed.

You’re practicing outsourcing the functions Freud would call superego, attributing them to something external. Look at any smart fundamentalist who sincerely believes that without God they’d have no morality to see the long-term effects of that. I have heard a Beeminder user say they’d become “a terrible person” if they lost Beeminder. They were probably exaggerating, but that sounds to me like the kind of exaggeration you’d make because you sort of believed it.

This does not mean that giving up on commitment devices will not damage you. That would be uncharacteristically fair of reality. Often, though, you have to break things to make them better.