Sunday, August 3, 2014

Explaining Effective Altruism to System 1

[This obviously borrows heavily from the ideas of Eliezer Yudkowsky. In particular, much of it recaps and expands on his talk at the Effective Altruism Retreat of 2014, though I suspect my own ideas fed into that talk anyway. There are also SPOILERS up to chapter 55 of Harry Potter and the Methods of Rationality.]

I donated to the Humane Society once. There was this charismatic twenty-something holding a clipboard, and I hadn't yet learned to walk past such people on the street. So I stood there and listened, while they told me about lonely puppies raised in tiny, dirty, wire cages; sick and shivering puppies deprived of proper veterinary care, affection, and adequate food and water; frightened puppies, abused and exploited for their market value.

I like puppies. They're fluffy and have great big eyes. They make cute little noises when I play tug of war with them. And it makes me very sad when I imagine them hurting. Clipboard Person told me I could rescue a puppy by donating just ten dollars a month to the Humane Society. So I did. I couldn't help myself.*

The Humane Society of the United States is a nonprofit organization working to reduce animal suffering in the US. The Machine Intelligence Research Institute, another nonprofit, is working to ensure prosperity for the entire future of humanity. HSUS stops puppy mills and factory farming from hurting animals. MIRI stops artificial general intelligence from destroying the world.**

In 2012, HSUS supporters outdid MIRI supporters a hundredfold in donations.***

Look at this popup.

In this popup--the first thing I see when I visit the HSUS website--I'm told I can be a hero. I'm shown these pink-pawed kissable baby dogs in the arms of their new loving owner [this might actually be a volunteer or police officer or something, whatever], and I'm implicitly led to imagine that if I don't donate, those puppies will suffer and die horribly. If I don't act, terrible things will happen to creatures I automatically care about, and I am personally responsible. This message is concrete, immediate, and heart-wrenching.

Animal advocacy activists have to do approximately zero work to speak to potential donors in the Language of System 1. Which means System 1 automatically gets the message. And guess who's primarily in charge of motivating such actions as pulling out your checkbook? (Hint: It's not System 2.)

This just isn't fair.

Wouldn't it be great if we could grok our support of such strange and abstract EA organizations as the Machine Intelligence Research Institute on the same automatic emotional level that we grok animal advocacy?

I think we can. It takes work. But I think it's possible, and I think I've got some ideas about how to do it. The basic idea is to translate "I should help MIRI" into a message that is similarly concrete, immediate, and heart-wrenching.

So what is the problem, exactly? Why is MIRI so hard for System 1 to understand?

I think the main problem is that the people MIRI's trying to save are difficult to empathize with. If my best friend were dying in front of me and there were a button beside him labeled "save best friend's life", I'd feel motivated to push it even if I had no idea how it worked. But even if I could give S1 an excellent understanding of how MIRI plans to accomplish its goal of saving everyone, it wouldn't change much unless my emotions were behind the goal itself.

VERY IMPORTANT: Do not employ these sorts of methods to get your emotions behind the goal before System 2 is quite certain it's a good idea. Otherwise, you might end up giving all your money to the Humane Society or something. 

Why are most of the people MIRI wants to save so hard to empathize with? I think my lack of empathy is overdetermined.

  1. There are too many of them (something on the order of 10^58), and System 1 can't get a handle on that. No matter how good S2 is at math, S1 thinks huge numbers aren't real, so huge numbers of people aren't real either.
  2. Most of them are really far away. Not only are they not right in front of me, but most of them aren't even on my planet, or in my galaxy. S1 is inclined to care only about the people in my immediate vicinity, and when I care about people who are far away, there's generally something else connecting us. S2 thinks this is bollocks, but isn't directly in charge of my emotions.
  3. They're also distant in time. Though S2 knows there's no sense in which people of the far distant future are any less real than the people who exist right now, S1 doesn't actually believe in them.
  4. They're very strange, and therefore hard to imagine concretely. Day-to-day life will change a lot over time, as it always has. People probably won't even be made of proteins for very much longer. The people I'm trying to empathize with are patterns of computations, and S1 completely fails to register that that's really what people are already. S1 doesn't know how such a thing would look, feel, taste, smell, or sound. It has no satisfying stories to tell itself about them.****
  5. I don't imagine myself as living in the future, and S1 is indifferent to things that don't directly involve me. [I feel this so strongly that the first version of 5 said "I don't live in the future," and it took several re-readings before I noticed how ridiculous that was.]

Note that most of these obstacles to S1 understanding apply to world poverty reduction and animal altruism as well. People in the developing world are numerous, distant, and tend to live lives very different from my own. This is true of most animals as well. The population of the far distant future is simply an extreme case.

So those are some S1 weaknesses. But S1 also has strengths to bring to bear on this problem. It's great at feeling empathy and motivation under certain circumstances.

  1. S1 can model individuals. It can imagine with solid emotional impact the experience of one single other person.
  2. It can handle things that are happening nearby.
  3. It can handle things that are happening right now.
  4. It feels lots of strong emotions about its "tribe", the people in its natural circle of concern (my family, friends, school, etc.).
  5. It cares especially about people with familiar experiences it can easily imagine in vivid sensory detail.
  6. It loves stories.
  7. It gets a better grip on ideas when things are exaggerated.
  8. It's self-centered, in the sense of caring much more about things that involve me directly.

To translate "I should help MIRI" (and relevant associated ideas) into the Language of System 1, you'd need to craft a message that plays to S1's strengths while making up for its weaknesses.

I did this myself, so I'll try to walk you through the process I used.


I started with the central idea and the associated emotion, which I decided is "saving people" or "protecting people". I searched my association network near "saving people" for something concrete I could modify and build on.

I quickly came across "Harry James Potter-Evans-Verres in Azkaban", which is further associated with "patronus", "dementors", and "the woman who cried out for his help when Professor Quirrell's quieting charm was gone". Yes, THERE is the emotion I want to work with. Now I'm getting somewhere.

Now to encode the relevant information in a modification of this story.

In my story, I'm the one walking the halls of Azkaban, rather than Harry. There are too many people in the future, so I'll focus on one person in one cell. And it will be someone close to me, a particular person I know well and care for deeply. One of my best friends.

My version of Azkaban will extend for a few miles in all directions--not far enough to truly represent reality, but just far enough to give me the emotional impression of "really far". The future doesn't feel real, so I'll populate my Azkaban with a bunch of those future people, and my representations of them exist right now in this brick-and-mortar building around me. Some of them are strange in maybe implausible but fairly specific ways--they're aquatic, or silicon crystals, or super-intelligent shades of the color blue, whatever. They're people, and the woman beside me is familiar.

The central message is "save them"--save them from what? From suffering, from death, and from nonexistence. Conveniently, canon dementors already represent those things.

And what's the "patronus"? That's easy too. In my mind, "effective altruism" is the muggle term for "expecto patronum".

Finally, with a broad outline in place, I begin the story and run my simulation in full detail.

I imagine Azkaban. Imagine myself there. A gray prison with cold, damp walls. There are countless cells--I'm not sure how many, but there are at least a dozen in this hall, and a dozen halls on this floor, and a dozen floors in this wing alone. And in every single cell is a person.

There could be animals here, too, if I wanted. Puppies, even. Because this isn't a prison where bad people are sent to be punished and kept from hurting others. This is a much more terrible place, where the innocent go, just for having been born too early, for having lived before anyone knew how to save them from death.

I imagine that I'm walking down the hallway, lined with cells on either side. I hear the eerie clicking of my shoes against the stone floor. Feel the fear of distant dementors washing over me. And as I walk by, on my left, a single person cries out.

I look through the bars, and I can see her. My friend, lying in shadow, whose weak voice I now recognize. She is old and wasting away. She is malnourished, and sickly, and she will die soon. The dementors have fed on her all these years, and she is calling to me with her last breaths. Just as everyone in Azkaban would do if they knew that I'm here, if they knew they were not alone.

I live in a time when things like this still happen to people. To everyone. Some of us live good lives for a while. Others exist always in quiet desperation. But eventually, the dementors become too much for us, and we waste away. It's not pretty. Our bodies fail, our minds fail. We drown in our own phlegm after forgetting the names of our children.

I imagine my friend crying in that cell, wishing to be healthy and happy again, to live one more day. She is just one chosen from among all the prisoners of the present and the vast future, who will die if I just watch, doing nothing. Only I can help. I am here, so she is mine to protect. Everyone in Azkaban is mine to protect. They have nobody else. And if I could be everywhere in Azkaban, the cries for help would echo off of every wall.

But it doesn't have to be like this. Azkaban is an evil place, and I do not have to stand for it. Death is not invincible. I can think, and choose my path, and act.

What is Effective Altruism, in the limit? It is healing every wound. Not praying that someone else will do it, but reaching out myself, with everything I have, to destroy every dementor. To tear down these walls. To carry every prisoner to safety, restore them to health, and ensure no one has to suffer and die like this ever again.

It is seeing this one suffering woman who needs my help and choosing to protect her from the darkness--and knowing that she is every person in the future extended before me.

Harry cast his patronus to protect the woman, in the original story. But then he stopped. Because it wasn't time. He didn't have the power. Like the altruists of two hundred years ago, he wasn't ready. There was only so much he could do.

But now the time has come. Today is the pivotal moment in all of human history when I have the power to intervene effectively. I can cast my patronus, and never let it stop until every dementor is destroyed, and every person has been protected.

A dementor approaches from one end of the hall, seeking its prey. I feel it, radiating emptiness and despair, and a woman whimpers, "help me, please". From the other end of the hall, others who share my goals race in to help me. They gather before the dementor. Leaders of EA organizations, others who have dedicated their lives to existential risk reduction. They draw their wands and prepare for battle.

I look at my friend in her cell, her eyes pleading desperately, as I draw my wand and move into the beginning stance for the patronus charm. "I will save you," I say to her.

Moving my thumb and forefinger just the right distance apart, I imagine her smiling, revived, prospering.

I flick my wand once, and promise she will be free. Twice, and promise to free all the prisoners in this wing. Thrice, and promise to free every prisoner in Azkaban. Four times, and promise no dementor will hurt another living person ever again.

We level our wands straight at the dementor, brandishing them to drive away the darkness. And with victory in our voices, together we shout,


The thought explodes from my wand, blazing with the brilliance of the MOST good. It joins with the patronuses of all the other effective altruists. The light burns down the hallway, freeing every prisoner it passes from despair and death. It burns through the walls, and they crumble. It burns in every direction, and one after another, the dementors are reduced to little piles of ashen cloth. Healing the wounds in the world. The light continues to grow, enveloping the patch of pebbles that once was Azkaban, our whole world, our galaxy, our future light cone.

Saving our people. Everyone. Everywhere. Forever.

"Effective altruism" is the muggle term for "expecto patronum". It needn't be merely an abstract idea we force ourselves to act on while our emotions lag behind. It can be our battle cry against death.

*I'd never heard of effective altruism then, of course. In fact, I didn't consider myself an altruist of any sort. I'm not sure I'd donated to anything at that point besides maybe SETI. The HSUS pitch was just really good.
**"Converting the reachable universe into quality adjusted life years is also cute." --Eliezer Yudkowsky, Effective Altruism Summit 2013
***In their 990s, HSUS reported $112,833,027 in grants and contributions, while MIRI reported $1,066,055.
****The Tale of the Far Future Person: "Once upon a time, there will have been an entity. The entity will have been alive and sentient. It will have had various experiences and values. Never dying, it will have satisfied its preferences ever after. The end."