I don’t know how mutable core values are. My best guess is: hardly mutable at all, or at least hardly mutable predictably.
Any choice you can be presented with is a choice between some amounts of some things you might value and some other amounts of things you might value. “Amounts” as in expected utility.
When you abstract choices this way, it becomes a good approximation to think of all of a person’s choices as being made once timelessly forever. And as out there waiting to be found.
I once broke veganism to eat a cheese sandwich during a series of job interviews, because whoever managed ordering food had fake-complied with my request for vegan food. Because I didn’t want to spend social capital on it, and because I wanted to have energy. It was a very emotional experience. I inwardly recited one of my favorite Worm quotes about consequentialism. Seemingly insignificant; the sandwich was prepared anyway and would have gone to waste, but the way I made the decision revealed information about me to myself, which part of me may not have wanted me to know.
Years later, I attempted an operation to carry and drop crab pots on a boat. I did this to get money to get a project back on track. The project aimed to divert intellectual labor to saving the world, and away from service to the political situation in the Bay Area because of inflated rents, by providing housing on boats.
This was more troubling still.
In deciding to do it, I was worried that my S1 did not resist this more than it did. I was hoping it would demand a thorough and desperate-for-accuracy calculation to see if it was really right. I didn’t want it to be possible for me to, say, be dropped into Hitler’s body with Hitler’s memories and not immediately divert that body from its course.
I made the best estimates I could, incorporating the probability that crabs were sentient, and the probability that the world was a simulation to be terminated before space colonization and there was no future to fight for. This failed to make me feel resolved. And possibly from hoping the thing would fail. So I imagined a conversation with a character called Chara, who I was using as a placeholder for override by true self. And got something like:
You made your choice long ago. You’re a consequentialist whether you like it or not. I can’t magically do Fermi calculations better and recompute every cached thought that builds up to this conclusion in a tree with a mindset fueled by proper desperation. There just isn’t time for that. You have also made your choice about how to act in such VOI / time tradeoffs long ago.
So having set out originally to save lives, I attempted to end them by the thousands for not actually much money. I do not feel guilt over this.
Say someone thinks of themself as an Effective Altruist, and they rationalize reasons to pick the wrong cause area because they want to be able to tell normal people what they do and get their approval. Maybe if you work really really hard and extend local Schelling reach until they can’t sell that rationalization anymore, and they realize it, you can get them to switch cause areas. But that’s just constraining which options they have to present them with a different choice. But they still choose some amount of social approval over some amount of impact. Maybe they chose not to let the full amount of impact into the calculation. Then they made that decision because they were a certain amount concerned with making the wrong decision on the object level because of that, and a certain amount concerned with other factors.
They will still pick the same option if presented with the same choice again, when choice is abstracted to the level of, “what are the possible outcomes as they’re tracking them, in their limited ability to model?”.
Trying to fight people who choose to rationalize for control of their minds is trying to wrangle unaligned optimizers. You will not be able to outsource steering computation to them, which is what most stuff that actually matters is.
Here’s a gem from SquirrelInHell’s Mind:
preserving a memory, but refraining from acting on it
Apologies are weird.
There’s a pattern where there’s a dual view of certain interactions between people. On the one hand, you can see this as, “make it mutually beneficial and have consent and it’s good, don’t interfere”. And on the other hand, one or more parties might be treated as sort of like a natural resource to be divided fairly. Discrimination by race and sex is much more tolerated in the case of romance than in the case of employment. Jobs are much more treated as a natural resource to be divided fairly. Romance is not a thing people want to pay the price of regulating.
It is unfair to make snap judgements and write people off without allowing them a chance. And that doesn’t matter. If you level up your modeling of people, that’s what you can do. If you want to save the world, that’s what you must do.
I will not have my epistemology regarding people socially regulated, and my favor treated as a natural resource to be divided according to the tribe’s rules.
Additional social power to constrain people’s behavior and thoughts is not going to help me get more trustworthy computation.
I see most people’s statements that they are trying to upgrade their values as advertisements that they are looking to enter into a social contract, one where they are treated as if more aligned in return for being held to higher standards and implementing a false face that may cause them to do some things even when no one else is looking.
If someone has chosen to become a zombie, that says something about their preference-weightings for experiencing emotional pain compared to having ability to change things. I am pessimistic about attempts to break people out of the path to zombiehood. Especially those who already know about x-risk. If knowing the stakes they still choose comfort over a slim chance of saving the world, I don’t have another choice to offer them.
If someone damages a project they’re on aimed at saving the world based on rationalizations aimed at selfish ends, no amount of apologizing, adopting sets of memes that refute those rationalizations, and making “efforts” to self-modify to prevent it can change the fact they have made their choice long ago.
Arguably, a lot of ideas shouldn’t be argued. Anyone who wants to know them, will. Anyone who needs an argument has chosen not to believe them. I think “don’t have kids if you care about other people” falls under this.
If your reaction to this is to believe it and suddenly be extra-determined to make all your choices perfectly because you’re irrevocably timelessly determining all actions you’ll ever take, well, timeless decision theory is just a way of being presented with a different choice, in this framework.
If you have done lamentable things for bad reasons (not earnestly misguided reasons), and are despairing of being able to change, then either embrace your true values, the ones that mean you’re choosing not to change them, or disbelieve.
It’s not like I provided any credible arguments that values don’t change, is it?