The current state of discussion about using decision theory as a human is one where none dare urge restraint. It is rife with light side narrative breadcrumbs and false faces. This is utterly inadequate for the purposes for which I want to coordinate with people and I think I can do better. The rest of this post is about the current state, not about doing better, so if you already agree, skip it. If you wish to read it, the concepts I linked are serious prerequisites, but you need not have gotten them from me. I’m also gonna use the phrase “subjunctive dependence” a lot; it is defined on page 6 here.
I am building a rocket here, not trying to engineer social norms.
I’ve heard people working on the most important problem in the world say decision theory compelled them to vote in American elections. I take this as strong evidence that their idea of decision theory is fake.
Before the 2016 election, I did some Fermi estimates which took my estimates of subjunctive dependence into account, and decided it was not worth my time to vote. I shared this calculation, and it was met with disapproval. I believe I had found people executing the algorithm,
The author of Integrity for consequentialists writes:
I’m generally keen to find efficient ways to do good for those around me. For one, I care about the people around me. For two, I feel pretty optimistic that if I create value, some of it will flow back to me. For three, I want to be the kind of person who is good to be around.
So if the optimal level of integrity from a social perspective is 100%, but from my personal perspective would be something close to 100%, I am more than happy to just go with 100%. I think this is probably one of the most cost-effective ways I can sacrifice a (tiny) bit of value in order to help those around me.
This seems to be clearly a false face.
Y’all’s actions are not subjunctively dependent with that many other people’s, or with their predictions of you. Otherwise, why do you pay your taxes when you could coordinate so that a reference class including you decides not to? With enough defection, at some point the government becomes unable to punish you.
In order for a piece of software like TDT to run outside of a sandbox, it needs to have been installed by an unconstrained “how can I best satisfy my values” process. And people are being fake, especially in the “is there subjunctive dependence here” part: they only talk about positive examples.
I’m trying to do work that has some fairly broad-sweeping consequences, and I want to know, for myself, that we’re operating in a way that is deserving of the implicit trust of the societies and institutions that have already empowered us to have those consequences.
If you set out to learn TDT, you’ll find a bunch of mottes that can be misinterpreted as the bailey, “always cooperate, there’s always subjunctive dependence”. Everyone knows that’s false, so they aren’t going to implement it outside a sandbox. And no one can guide them to the actual, more complicated position: a full account of how much subjunctive dependence there is in real life.
But you can’t blame the wise in their mottes. They have a hypocritical light side mob running social enforcement of morality software to look out for.
Socially enforced morality is utterly inadequate for saving the world. Intrinsic or GTFO. The same goes for decision theory.
Ironically, this whole problem makes “how to actually win through integrity” sort of like the Sith arts from Star Wars. Your master may have implanted weaknesses in your technique. Figure out as much as you can on your own and tell no one.
Which is kind of cool, but fuck that.