An axis in mental architecture space I think captures a lot of intuitive meaning behind whether someone is “real” or “fake” is:
Real: S1 uses S2 for thoughts so as to satisfy its values through the straightforward mechanism: intelligence does work to get VOI to route actions into the worlds where they route the world into winning by S1’s standards.
Fake: S2 has “willpower” when S1 decides it does, failures of will are (often shortsighted, since S1 alone is not that smart) gambits to achieve S1’s values (The person’s actual values: IE those that predict what they will actually do.), S2 is dedicated to keeping up appearances of a system of values or beliefs the person doesn’t actually have. This architecture is aimed at gaining social utility from presenting a false face.
These are sort of local optima. Broadly speaking: real always works better for pure player vs environment. It takes a lot of skill and possibly just being more intelligent than everyone you’re around to make real work for player vs player (which all social situations are wracked with.)
There are a bunch of variables I (in each case) tentatively think that you can reinforcement learn or fictive reinforcement learn based on what use case you’re gearing your S2 for. “How seriously should I take ideas?”, “How long should my attention stay on unpleasant topic”, “how transparent should my thoughts be to me”, “how yummy should engaging S2 to do munchkinry to just optimize according to apparent rules for things I apparently want feel”.
All of these have different benefits if pushed to one end, if you are using your S2 to outsource computation or if you are using it as a powerless public relations officer and buffer to put more distance between the part of you that knows your true intents and the part that controls what you say. If your self models of your values are tools to better accomplish them by channeling S2 computation toward the values-as-modeled, or if they are false faces.
Those with more socially acceptable values benefit less from the “fake” architecture.
The more features you add to a computer system, the more likely you are to create a vulnerability. It’d be much easier to make an actually secure pocket calculator than an actually secure personal computer supporting all that Windows does. Similarly, as a human you can make yourself less pwnable by making yourself less of a general intelligence. Have less high level and powerful abstractions, exposing a more stripped down programming environment, being scarcely Turing complete, can help you avoid being pwned by memes. This is the path of the Gervais-Loser.
I don’t think it’s the whole thing, but I think this is one of the top 2 parts of what having what Brent Dill calls “the spark”, the ability to just straight up apply general intelligence and act according to your own mind on the things that matter to you instead of the cached thoughts and procedures from culture. Being near the top of the food chain of “Could hack (as in religion) that other person and make their complicated memetic software, should they choose to trust it, so that it will bend them entirely to your will” so that without knowing in advance what hacks there are out there or how to defend against them, you can keep your dangerous ability to think, trusting that you’ll be able to recognize and avoid hack-attempts as they come.
Wait, do I really think that? Isn’t it obvious normal people just don’t have that much ability to think?
They totally do have the ability to think inside the gardens of crisp and less complicatedly adversarial ontology we call video games. The number of people you’ll see doing good lateral thinking, the fashioning of tools out of noncentral effects of things that makes up munchkinry, is much much larger in video games than in real life.
Successful munchkinry is made out of going out on limbs on ontologies. If you go out on a limb on an ontology in real life…
Maybe your parents told you that life was made up of education and then work, and the time spent in education was negligible compared to the time spent on work, and in education, your later income and freedom increases permanently. And if you take this literally, you get a PhD if you can. Pwned.
Or you take literally an ontology of “effort is fungible because the economy largely works.” and seek force multipliers and trade in most of your time for money and end up with a lot of money and little knowledge of how to spend it efficiently and a lot more people trying to deceive you about that. Have you found out that the thing about saving lives for quarters is false yet? Pwned.
Or you can take literally the ontology, “There is work and non-work, and work gets done when I’m doing it, and work makes things better long-term, and non-work doesn’t, and the rate at which everything I could care about improves is dependent on the fraction of time that’s doing work” and end up fighting your DMN, and using other actual-technical-constraint-not-willpower cognitive resources inefficiently. Then you’ve been pwned by legibility.
Or you could take literally the ontology, “I’m unable to act according to my true values because of akrasia, I need to use munchkinry to make it so I do”, and end up binding yourself with the Giving What We Can pledge, (in the old version, even trapping yourself into a suboptimal cause area.). Pwned.