Friday, August 7, 2015

Eva Vivalt Did Not Show QALYs/$ of Interventions Follow a Gaussian Curve

(Epistemic status: I have zero statistics background, but damned if I won’t give this a shot anyway.)

In a recent blog post, Robin Hanson said Eva Vivalt's data indicate that the distribution of charity impacts is close to Gaussian, and is not a fat-tailed power law like Effective Altruists claim. If that's true, it pretty much undermines Effective Altruism altogether, because it means that there's not a big difference between a decent intervention and the best intervention.

Suppose the following three interventions had identical effect sizes: feeding people carrots, handing out chess strategy manuals, and deworming.

I hope you're currently wondering what the hell I just asked you to suppose, because the previous sentence was nonsense. You can only talk about the “effect size” of carrots if you’re measuring an additional thing besides carrots. The additional thing is probably not “handing out chess strategy manuals”, because then the effect size of carrots would be a measure of how good carrots are at handing out chess strategy manuals.

How about if I’m studying “feeding people carrots to make them taller”, “handing out chess strategy manuals to make them smarter”, and “deworming to eliminate intestinal parasites”?

That’s much better! Now it makes sense to talk about effect sizes for these things. There’s some amount of taller people get when you give them carrots, some amount of smarter people get when you give them chess strategy manuals, and some amount of dewormed people become when you give them wormicide.

Now what does it mean for these things to have identical effect sizes?

There are actually several reasonable answers, but here’s the one I’m seeing in Eva’s slides.

Say that the average height is 5’7”, and most people are between 5’5” and 5’9”. So the usual variation in height is “within 2 inches of average”, or a range of 4 inches. When I give somebody a bundle of carrots, they grow an inch. 5’8” plus bundle of carrots equals 5’9”. We can express the effectiveness of carrots in terms of variation in the general population: If everybody gets their Height stat by rolling a four sided die to adjust away from human average of 5’7”, eating a bundle of carrots gives you a plus one to your Height roll. It’s like a 25% bonus to randomness.

Now say that the average IQ is 100, and most people are between 95 and 105: the usual variation in IQ is “within 5 points of average”, or a range of 10 points. When I give somebody a chess book, they gain two and a half IQ points. We can express the effectiveness of the book in terms of variation in the general population: If everybody gets their Intelligence stat by rolling a ten sided die to adjust away from human average of 100, reading a chess book gives you a plus 2.5 to your Int roll. It’s a 25% bonus compared to randomness.

Then we can (sort of) compare the effect sizes of carrots and chess books: Carrots give a 25% bonus, and chess books also give a 25% bonus.
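If you'd rather see that as plain arithmetic than as D&D flavor text, here's a tiny Python sketch. The numbers are the made-up ones from above, and I'm dividing by the "range of usual variation" the way I did above; a real standardized effect size (Cohen's d, say) divides by a standard deviation instead, but the calculation has the same shape.

```python
# A sketch of the "bonus relative to usual variation" idea, using the
# made-up numbers from above. A real standardized effect size (e.g.
# Cohen's d) divides by a standard deviation rather than by the full
# range of usual variation, but the arithmetic has the same shape.

def standardized_effect(raw_effect, usual_variation):
    """Express an effect as a fraction of the usual spread in the population."""
    return raw_effect / usual_variation

carrots = standardized_effect(raw_effect=1.0, usual_variation=4.0)       # +1 inch vs. a 4-inch spread
chess_books = standardized_effect(raw_effect=2.5, usual_variation=10.0)  # +2.5 IQ vs. a 10-point spread

print(carrots, chess_books)  # 0.25 0.25 -- "identical effect sizes"
```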

Is that useful information, though? Why does it matter if carrots give the same size of bonus to growth that chess books give to IQ?

Now if carrots happened to also make people smarter, then comparing effect sizes would be useful. We’d be talking about two different interventions aimed at the same outcome. Furthermore, we could dispense with the standardized statistical effect size stuff, and look directly at the absolute number of IQ points gained from a dollar’s worth of chess books, and the number of IQ points gained from a dollar’s worth of carrots.

If it turned out that all intelligence interventions gave the same Int bonus per dollar, then we might as well flip a coin to decide between carrots and chess books. Same thing if it turned out that we weren’t good enough at measuring things to tell the difference between the effects of carrots and chess books on intelligence. Any time spent “picking the best one” would be wasted.

But what if you don't know how much chess books and carrots cost? And what if you don't know how many carrots are in a bundle? Maybe you know that a "bundle of carrots" - whatever number of carrots the charity is distributing per person - has the same effect as a chess book, but you don't know that the chess book costs four times as much as the bundle of carrots. It would be premature to say that time spent choosing between carrots and chess books is wasted, because if you learned the cost, you'd fund the carrots.
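Here's that point as a sketch, with hypothetical prices (the chess book costing four times the carrot bundle, as I supposed above) and still pretending carrots raise IQ. None of these numbers come from anywhere real.

```python
# Why cost matters even when effect sizes are identical. The prices are
# hypothetical (the chess book costing four times the carrot bundle, as
# supposed above), and we're pretending carrots also raise IQ, as in the
# earlier hypothetical.

interventions = {
    "carrot bundle": {"iq_points": 2.5, "cost_usd": 2.00},
    "chess book":    {"iq_points": 2.5, "cost_usd": 8.00},  # four times the price
}

for name, d in interventions.items():
    points_per_dollar = d["iq_points"] / d["cost_usd"]
    print(f"{name}: {points_per_dollar:.3f} IQ points per dollar")

# carrot bundle: 1.250 IQ points per dollar
# chess book: 0.312 IQ points per dollar
```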

It might even be that the effect of carrots only seems to match the effect of chess books because the carrot people kept putting more carrots into the bundles until a bundle was as good as a chess book, and then they stopped. (Because that's where they reach some socially agreed-upon level of 'statistical significance', perhaps?) Maybe if you put twelve carrots into the bundle instead of six, you get twice as many IQ points as a chess book causes, and for half the price! But you don't know; you just know that as things are now, the carrot charity is making people about as much smarter as the chess charity.

And that, it seems to me, is what Eva's data actually say. When I emailed her for clarification, she said that, "Most of the interventions can't be statistically distinguished from each other for a given outcome." She also said cost wasn't factored in yet (though she suspected it wouldn't change much). So if she's right, then as far as we can tell, all the deworming interventions are currently equally good at killing worms, all the microfinance interventions are equally good at alleviating poverty, and so on for the top 20 international development programs.

If it’s true that the top 20 international development programs are just as good at whatever they do as all the other programs they’re directly competing with, even when you factor in cost, this has significant implications for Effective Altruism. It means we can stop evaluating individual charities once we've identified the "pretty good" ones.

But there’s a stronger claim it’s easy to confuse this with. (Eva’s presentation was called “Everything You Know Is Wrong”, and a couple of her slides said, “Anyone who claims they know what works is lying”. I tend to expect confusion when such strong System 1 language accompanies abstract statistical analysis.) The stronger claim is “the top 20 interventions are equally good at saving lives, regardless of how they go about it”. If that were true, it would chuck the premises of Effective Altruism right out the window.

If you want to compare two interventions with different outcomes - medicated mosquito nets vs. microfinance - you’re going to need some way of converting between malaria and poverty. When we’re talking about altruism, the common factor is Quality Adjusted Life Years.

There’s some amount of better a person’s life is when they don’t have malaria, and some amount of time they remain malaria-free after you give them mosquito nets. There’s also some amount of better a person’s life gets when they’re fifty bucks less poor, and some amount of time they stay fifty bucks less poor after you give them fifty bucks. So you can compare bug nets to microfinance through QALYs once you've got data on 1) effect sizes, 2) how nice it is to be healthy or less poor, 3) how long people stay healthy or less poor once they get that way via bug nets or microfinance.

You need all three of those things. The fact that the effect sizes are identical doesn't matter if a medicated mosquito net is worth orders of magnitude more QALYs than fifty bucks. Eva's data only include effect sizes.
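Here's a sketch of what I mean by needing all three. Every number below is invented for illustration; nothing here comes from Eva's data or anywhere else.

```python
# Why identical effect sizes don't imply identical QALYs per dollar.
# Every number below is invented for illustration: the quality-of-life
# weights, durations, and costs are placeholders, not data.

def qalys_per_dollar(quality_gain, years_it_lasts, cost_usd):
    """QALYs gained per dollar = (quality improvement) x (how long it lasts) / cost."""
    return quality_gain * years_it_lasts / cost_usd

# Suppose both interventions have the same standardized effect size, but:
bed_nets = qalys_per_dollar(quality_gain=0.15, years_it_lasts=3.0, cost_usd=10.0)
cash = qalys_per_dollar(quality_gain=0.01, years_it_lasts=0.5, cost_usd=50.0)

print(f"bed nets:     {bed_nets:.4f} QALYs per dollar")
print(f"$50 transfer: {cash:.4f} QALYs per dollar")
# bed nets:     0.0450 QALYs per dollar
# $50 transfer: 0.0001 QALYs per dollar
```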

It doesn't make sense to compare "how many worms a deworming intervention kills" with "how much AIDS a box of condoms prevents" until you know how much those problems affect quality of life, and for how long. So even if all deworming interventions are equally effective, the choice between “deworming” and “condoms” could still be massively important.

And that’s the central claim I take EA to be making.

Eva's analysis says nothing about the distribution of QALYs over all the EA interventions under consideration. Maybe it’s Gaussian after all, but this isn’t new evidence either way.
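If you want to poke at the distinction yourself, here's a toy simulation - not Eva's data, not anyone's data, just arbitrary parameters - of how much better the best intervention looks compared to the median one under a thin-tailed versus a fat-tailed distribution of QALYs per dollar.

```python
# A toy comparison of a thin-tailed (Gaussian) and a fat-tailed (lognormal,
# standing in for power-law-ish) distribution of QALYs per dollar across
# many hypothetical interventions. Parameters are arbitrary; the point is
# only the qualitative difference in how far the best outlier sits from
# the median.

import random

random.seed(0)
N = 1000  # hypothetical interventions

gaussian = [max(random.gauss(1.0, 0.3), 0.01) for _ in range(N)]
fat_tailed = [random.lognormvariate(0.0, 1.5) for _ in range(N)]

for name, sample in [("gaussian", gaussian), ("fat-tailed", fat_tailed)]:
    ordered = sorted(sample)
    median, best = ordered[N // 2], ordered[-1]
    print(f"{name}: best intervention is about {best / median:.0f}x the median")

# Under the Gaussian the best is only about twice as good as the median;
# under the fat-tailed distribution it's typically closer to a hundred
# times as good, or more. That gap is what makes it worth hunting for the
# best intervention rather than settling for a decent one.
```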

Thursday, August 6, 2015

Uninhibited Fancy Feet

Helpful context: Tortoise Skills page, Effective Rest Days, Inhibition: Game Plan, Tortoise Report 3: Empathy

I was thinking about ways to trick my brain into being more like it is on a rest day, even though there’s specific work I plan to do today, when I noticed a feeling of "not being allowed to do that". I inferred that I must have just wanted something and shut down the wanting without even being aware of it. So I examined my recent memory, and found a desire to get a pedicure. (That counts as “noticing an impulse”, at this stage, so I tapped my fingers together.)

My natural reaction was to wonder whether it makes sense for me to indulge the impulse, and why I shut it down to begin with. Why am I "not allowed" to get a pedicure? This isn't a new thing; I really like getting pedicures, but I hardly ever do it.

Money is one reason, but it's not sufficient to stop me getting the occasional pedicure just because. Doing things that make me happy is what having money is for. Even System 1 believes that at this point, so it's not where most of the "not allowed" is coming from.

Actually, I'm embarrassed about how beaten up my feet are from running. Although I’m not missing any toenails, the bottoms of my feet are covered in calluses and blisters. I’m averse to making the manicurist deal with them. (I note that this inhibition falls under “taking care of other people”, which I've hypothesized is the main kind of inhibition that wears me out.)

What does a manicurist experience when a runner comes in? If I imagine the situation from my own perspective, looking down from my chair, I simulate her with a disgusted look on her face, as though I've been terribly inconsiderate, coming to a nail salon with feet like that. She does her work as quickly as possible, with reluctance and resentment.

Empathy time (gosh, that one's turning out to be so useful!): If I imagine the situation from her perspective, looking at me and my feet, she thinks, "another runner," and goes about her business, doing as much as she can and not worrying about the rest, just like she does every time a runner comes in. Or anytime anyone comes in, really. It changes literally nothing in her routine.

If I try to color in the simulation with a specific emotion, it’s the one I'd feel in her position; something like "Her poor feet! I'm glad she came to me to have them cared for properly." When I offer that feeling to my brain with an interrogative tag, beside the “disgust/resentment” hypothesis, it clicks as a correctness feeling, while the other is rejected with a “that’s not the real world” feeling.

But back to inhibition: Even if she is disgusted and inconvenienced, I'm paying her to care for my feet. I never signed a contract saying I'd only bring in feet that are already in perfect condition.

This is not an inhibition I endorse. Getting a pedicure is a perfectly good way to help me into a more rest-day-like frame of mind. Decision: pedicure.

Wednesday, August 5, 2015

Inhibition: Game Plan

Observations:

  • I benefit a lot from “rest days” that are more about acting on my immediate impulses than about resting physically or cognitively.
  • Sometimes a rest day is an emergency, because I’ve completely exhausted myself in a particular way that makes the prospect of serving others feel threatening.
  • On rest days, I often find that I’ve suppressed an impulse that I should obviously have acted on, or that I’m in the process of suppressing such an impulse. (Example: I realize that I’ve been thirsty for the last half hour and have repeatedly denied the urge to get a glass of water because I wasn’t done reading an article I set out to read.)
  • I find social interaction exhausting, especially social interaction with large groups of strangers, even when I’m enjoying myself.
  • I find social interaction much less exhausting when I drink alcohol.
  • I am happier when I drink alcohol, in a way that feels more like a barrier to emotion being removed than like any particular extra sensation being pleasant. (I sometimes end up sadder if I start out sad.)
  • I am smarter, in a certain sense, when I drink alcohol. Ideas come more easily, combine with other ideas more easily, and inspire actions (like expressing ideas) that I immediately execute.
  • A successful rest day feels a lot like being slightly tipsy.
  • Both rest days and the effects of alcohol feel like they have something in common with the creative state of mind I enter when thinking of a mnemonic image.

Inferences and Speculation:

  • I ordinarily waste a lot of cognitive resources inhibiting impulses unnecessarily.
  • Social exhaustion is caused, at least in part, by the same inhibitory patterns that can cause rest days to be emergencies.
  • I have gained a little control over inhibition under some circumstances (rest days and mnemonics).

Hypotheses:

  • I will be substantially happier, smarter, and more socially resilient if I learn to be less inhibited.
  • I can learn to be less inhibited.

Planning

What does a failure to apply the skill look like? It looks like subconsciously exerting control to prevent an action whose outcome would be neutral or beneficial. Concretely: I’m riding my bike when I see a certain plant, and want to know what it looks like up close. I don’t slow down to find out, even though I don’t have any time constraints or other reasons not to slow down besides “I am biking”.

(Hm, this suggests I’m attached to my current activity, whatever that might be, by default. And looking at that thought is slightly painful. I’m afraid that if I believe my problem comes from being attached to my current activity, I’ll start to frequently tear myself away from my current activity, which will hurt. Message received: If this comes down to needing to frequently pull myself away from my current activity, I’ll be sure to find a pleasant, non-damaging way to do so. And I don't actually expect this to be a big problem. Remember what rest days are like? They're happy, not painful. Am I still clinging to being attached to my current activity even given that? No, I think I trust myself with this.)

Success will probably come in two stages. In the first stage, I’ll bring my impulses into direct conscious examination, choosing deliberately whether or not to act on them. That would be like noticing I want to know what the plant looks like up close, asking myself whether I should stop to look at it, and then stopping to look at it if I get a “yes”. In the second stage, I’ll have eliminated the bias toward inhibiting impulses, and I’ll act on impulses that seem beneficial or neutral without need for conscious effort. That would look like a smooth motion from wondering what the plant looks like to stopping to look at it.

My first hypothesis for a trigger is “wanting something”. I’ll tap my fingers together every time I notice myself wanting something (even when I’m not taking the day off). I expect this to be sufficient for the first stage of success; once I’m aware of wanting something, I expect deliberately choosing whether to have it will be easier than not deliberately choosing.

I'll also try exploratory study of the phenomenology of wanting and inhibition under the influence of alcohol, and during a mnemonics exercise.

I won’t be surprised if the second stage of success just comes with practice, but training may turn out to suggest faster or more reliable ways of internalizing the habit.

Do I risk losing important abilities I won't be able to get back if I succeed at this? Yes. I risk automatically acting on impulses it would have been better to inhibit. But that only seems like a risk with the second stage of success, not with the first; in the first stage I'll still be choosing deliberately. To mitigate it, I'll look for sensations that occur before I have an opportunity to notice actual wanting, and that can distinguish helpful inhibition from harmful inhibition. My goal will eventually be to predict when an impulse I should deny is about to occur, and when an impulse I should indulge is about to occur. The risk isn't obviously worse than the current state of affairs, and I'm happy to cross that bridge when I come to it, so I'll go ahead and begin to train "noticing wanting".

Results will be in the next Tortoise Report.

Monday, August 3, 2015

Tortoise Report 5: Defensiveness

What's a "Tortoise Report"? See the Tortoise Skills page.

Habit: Staying Sane While Defensive

Duration: 2 Months (This one took some time to get a handle on.)

Success: 7/10

Trigger: A feeling of being drawn into my solar plexus and closing a shield around myself for protection from attacks during interactions with other people

Action: Empathy

Result: I’m not sure I’ve reduced the frequency with which I get defensive very much, which is my long-term goal with this. But the feeling doesn’t get the chance to do nearly as much damage.

If I’m defensive and Eliezer says “that sounds like a bad idea,” I hear, “your idea is bad and you are bad and you should feel bad”. So I fear that he’s updated toward “I am bad”, and want to persuade him that he’s made an error, and in fact I am good. (My attempt is extremely clumsy given my state of unreflection and confusion, of course, and I end up completely undermining it right from the start). I fear he’ll enforce “you should feel bad” with further statements that will make me feel worse, so I feel I need to convince him that it’s false that I should feel bad. All of that defending, of course, gets tied up with a defense of the idea itself.

It’s a giant mess.

He never actually means anything like “your idea is bad and you are bad and you should feel bad”. When he says “that sounds like a bad idea”, he means something like “I predict that acting on the expressed beliefs and inferences will result in outcomes neither of us wants”. Which is blatantly obvious to me the moment I bother to simulate his mental state at all.

Empathy works surprisingly well against defensiveness for me. When I’m defensive, I tend to interpret everything that’s said to me as indicating a value judgement, which seems to be where most of the insanity comes from. Now, when I realize I’m defensive, I imagine what it might be like to be the other person, and what states of mind are most likely to be motivating their behaviors. I usually find my defensiveness-motivated interpretation was completely ridiculous, and I have the opportunity to check when it’s not so clear. I also have the opportunity to say, “I’m feeling defensive,” which can lead to having the rest of the discussion when I’m feeling more secure, or when I’ve eaten or exercised. But even when, upon reflection, I simulate the other person as actually wanting to hurt me, I end up feeling more compassion for them than any need to protect myself.

I don’t yet have an action that reliably leads to me leaving the defensive state of mind, nor a trigger that might allow me to prevent defensiveness in the first place. But being able to not suddenly go completely bonkers when I feel defensive is a pretty big deal.

Side note: one of my main methods when it comes to cognitive habit training is “seek opportunities to practice”. That does not seem to work for me with defensiveness. I had a Facebook thread where I asked people to post about a few topics I consider more or less “emotionally triggering” for me, or to post about things they expected would make me defensive, and it totally failed. Lots of great posts, no defensiveness. There was exactly one minor success, which caused something more like “competitiveness” than the thing that makes me crazy. (I felt compelled to spend many hours defending a certain interpretation of Indian Buddhist doctrine and my inferences from it, and to intellectually dominate the people who were wrong.) But for the most part, I felt a lot of closeness and trust with everyone in that thread, especially the people who expressed negative emotions about me specifically. I felt like, “This is beautiful, I wish I’d done this a long time ago!”

Then I tried reading internet criticisms of Eliezer through Tumblr and RationalWiki. It all felt silly and actually made me kind of happy, I’m not totally sure why but maybe because I’m proud to be serving someone who gets such strange and outrageous criticisms. “I don’t get the impression that he’s really an earth-shatteringly good mathematician.” Also, did you know? Less Wrong “hopes to make humanity more rational, protect the universe from the development of evil (or "unfriendly") AI and usher in a golden era where we will all be immortal deathless cyborgs having hot sex with sexbots on a terraformed Mars.” There’s some great stuff out there.

I think defensiveness is one of the things that mostly dissolves under scrutiny. I noticed big improvements in my reactions long before I felt like I had any idea what to do with the things I was feeling. There must be some kind of feedback loop in defensiveness that relies on my attention being elsewhere. And if I’m actively expecting to become defensive, the cycle can’t even complete its first loop.

Edit: This report should really contain detailed descriptions of my usual experience of defensiveness before training, and my usual experience of defensiveness now.

Unfortunately, my memories of defensiveness before two months ago are far less detailed, since I'd never explicitly paid attention to those experiences. But I do have some vague memories of, for instance, reading one of Person's blog posts criticizing the LW community about a year ago, and I recall stuff like this: Reading the title, I feel a flash of fear/foreboding, plus a strong attraction. While reading the article, I feel compelled to continue reading, the way I'm compelled to keep looking at a car accident for as long as possible while I drive by it. I feel sort of poised to pounce, on high alert for phrases and claims I might be able to use as ammunition. I also feel almost overwhelming "wanting to move away from the possibility of harsh criticisms that might be true". Afterward, it's like there's a movie playing on repeat in my head, containing bits and pieces of the article, anger directed at Person, counterarguments, and fantasies of publicly stomping on them intellectually while everyone else cheers. I probably don't actually write anything (or at least I don't recall ever having done so), but the thoughts themselves feel totally out of my control. There's definitely an absence of awareness of that fact as well, but that's just a retrospective observation about the memories, of course, not a particular thing I felt at the time.

And here's an experience of defensiveness from last night: Late at night I started talking about a thing with Eliezer. He told me that there's a pattern he'd like to break, where I keep waiting until he's much too sleepy to think clearly before I try to talk to him about anything interesting. I started trying to explain why I think it sort of makes sense for that to happen, emphasizing things more under his control than mine, like "whenever you're not tired, you're usually either working or reading, and I don't want to interrupt you at those times". As I was talking, I noticed that I'd been feeling the following sensations: "wanting to hide inside myself for safety", "holding onto something", "fear of losing something", "needing to protect something", "not having a very clear view of what was happening in my mind". I stopped explaining why it made sense for me to end up talking to him late at night, and became curious about what I was afraid of, what I was holding onto, and reasoned that there's probably something I cherish that part of me believes I can only get by talking to Eliezer late at night. The thing I wanted to protect was probably the cherished thing, and being slightly aggressive - convincing him that the things he wants lead to me talking to him late at night, suggesting that he'd have to give up things he wants if he forced me to stop talking to him late at night - served to reduce the risk that I'd lose my grasp on the cherished thing. I offered this hypothesis to my brain with an interrogative tag, and it responded with emotional sensations of "correctness" and "mild relief/security at having been understood". I mostly paused the conversation since he didn't want to talk late at night, but a fantasy version of the conversation continued in my head, and the topic changed to "what exactly do I think I can get only from talking to him late at night, and are there actually ways to get that elsewhere?" This was accompanied by a mild feeling of frustration and maybe indignation that I couldn't have the fantasy conversation out loud at that moment. The fantasy conversation felt deliberate, not obsessive, and it was easy to let go of when I decided to do that.

Tortoise Report 4: Verbal Processing

What's a "Tortoise Report"? See the Tortoise Skills page.

Habit: Verbal Processing

Duration: 1 Week

Success: 2/10

Trigger: Distress or loss of concentration when hearing more than one verbal stream at once, or when reading while people are talking nearby or music with lyrics is playing

Action: Reflective attention (didn't get any farther than that)

Result: I’m much more likely to invite conversation partners away from larger groups, and to immediately put in earbuds playing rain when trying to read or write while hearing music with lyrics. (Previously I’d waste time and attention attempting to focus despite distraction.)

This continues to not look like low-hanging fruit. It’s important and I hope to make progress on it eventually, but there are more important things right now.

Sunday, August 2, 2015

A Walking Meditation

There is a road stretching from here to a future I imagine.

In that future, there are experts of domain-general reasoning, of prediction, and of cognitive bootstrapping toward accuracy and effectiveness. Many of them have explicit knowledge of how to masterfully wield human intelligence, in the way a present-day fencing instructor knows how to wield a foil. The children there can become such masters in a single lifetime (though to be fair, a single lifetime is probably a lot more than 80 years).

What do you think the bricks on that road are made of?

These bubbles represent possible bricks you yourself could lay on the road to the future I imagine - things that might carry you toward it. What happens when you arrange them in order from the smallest, least important brick to the largest, most important brick?

What are the implications of the fact that you have an answer to that question, even if you're not very confident in your answer?

What is the largest brick you could lay right now?