Monday, August 27, 2012

Testing Is Bad

My father is a high school science teacher in the US.  Today he was feeling a bit overwhelmed by work, so I helped him grade tests from Bio 1, an introductory biology course for kids in their 9th or 10th years.  It’s early in the course, so they haven’t moved on yet from attempts to make sure everyone’s familiar with very basic and central notions that they should have learned in earlier years but likely didn’t.  I only graded fill-in-the-blank and definition type questions—no essays or short answers that would require significant interpretation.  I’ve never met the kids in the class, and since I only graded page two of the tests I didn’t even see their names.  Neither is this a common occurrence: I believe I’ve helped Dad with grading one other time in my whole life.  Just in case some readers wanted those disclaimers.  Anyway.

It was an enlightening experience.  It was very clear that the vast majority of the students were far more focused on exploiting the system during class and homework than on understanding the material at hand.  They were trying to learn what they needed to pass the test, and didn't feel at all that the test is merely evidence of what they've so far understood or failed to understand.  Their whole purpose as students is to pass the test.  

This is how I ended up grading one test on which the student defined "element" as "part of an atom which makes up an element".  I thought for a long time about what would have to happen in a child’s head for him to give an answer like this on a test.  When I was in high school myself, I never thought very hard about the minds of other students, and assumed people did poorly in school because they were stupid and lazy.  But now, I see that something else is going on here, something caused not by the stupidity or laziness of individual students but by a grave systemic flaw in US education.

There are two correct answers in this context to "define element".  The first is something along the lines of, "something without parts" (Dad often teaches via the history of science, so this would come from the ancient Greek notion), and the second is, "a substance made up entirely of one kind of atom".  I took Dad’s intro chemistry class way back when, and I remember his wording.

Here is the real problem.  It is threefold.  First, the students don't understand the goal of their lessons—they don’t know how to know what the teacher wants them to understand.  Second, they don’t know how to assess the content and level of their current understanding—they don't know how to know what they don't understand.  These combine to create the third part of the problem: they cannot identify the gap between what they know and what they’re meant to know, so they can’t focus their academic efforts on closing it.  

As it stands, high school students know what tests tend to look like and how to streamline the process of passing them.  They are rewarded for good performance and punished for poor performance, and no one has ever tried to explain to them the internal mechanisms of learning beyond that.  The reason they run into such huge problems with Dad's classes in particular is that his tests require a great deal more understanding as a prerequisite for good performance than do the tests they’ve encountered previously.  

The kind of test you write if you don’t want to spend much time grading—that is, understanding the minds of individual students—is the same kind of test you pass by knowing how to take tests.  An expert at test taking can pass a test on very difficult material without actually understanding the material, provided the test is written in a way that allows them to exercise their expertise.  This is how I got a B+ on a college-level psychology final last year without ever going to class or studying.  There were many things on that test whose answers I didn't really know, and sometimes I didn’t fully comprehend the question itself, but I could deduce what would be counted as correct in most cases because I know how to take tests.  Multiple choice, for instance, hardly ever requires understanding.  It only requires memorization of associated sets of terms.  It’s a skillset that takes a long time to develop, but nine or ten years is plenty long.

So the poor kid did exactly what he’d been conditioned over the course of a decade to do.  He threw together "part", "made up", "atom", and "element" into a grammatically well-formed sentence, and didn’t even notice that it was totally nonsensical.  It didn't occur to him to actually try to understand what "element" means.  

And why would he?  Imagine that you aren’t simply trying to be efficient so you can spend your time on other things that are more obviously worthwhile, which is itself understandable.  Imagine that experience has shown that you aren’t smart enough to understand complicated things even when you try.  This is a pet theory you pulled together after failing tests repeatedly early on.  It makes a lot of sense to spend what cognitive resources you know you do have on exploiting the rules of the system, getting by without anyone suspecting that you’re failing to learn (including yourself) and without being punished for your failure.  Your teachers believe that a good grade means you’ve learned, and you believe your teachers.  Because you’ve been doing this for as long as you can remember, you don’t even recognize anymore that there’s another way.  

This kid gave an answer that evidenced an almost total lack of understanding of anything that had happened in his biology class up to that point, but it's not because he’s dumb, and it's not because he isn't trying to succeed.  He definitely would have had to have studied to give that particular answer.  He's failing to learn because tests have taught him not to learn.

What if instead of doling out rewards and punishments in the forms of grades for being able to answer correctly on tests, we taught kids how to assess their own understanding?  What if we taught them that the first priority is to figure out what it is the teacher wants you to understand, the second is to figure out in what way and to what extent you currently understand it, and finally that the entire purpose of all of this class time and work and testing is to figure out how to close the gap between those two things?  There's no way anyone would be content with a nonsensical answer.  They’d have written what they did understand about the meaning of "element".

A few students seem to have done something similar: they defined element as something like, "all the things on the periodic table".  This is what I'd expect from kids who didn't know how to know what they were meant to understand, but did know what they understood.  They knew that they knew that the things on the periodic table are called "elements".  They knew that they knew what a definition is.  They failed to give a correct definition because they didn't know what they were meant to understand.

Here is an answer I would expect from someone who knows how to know what he's meant to understand, knows how to know what he currently understands, but hasn't quite completed the process of closing the gap.  "An element is a very tiny thing that builds bigger things and takes part in chemical reactions."  A kid who answered this way would have genuinely been learning about atoms, but wouldn't have finished refining his notion: he'd have yet to precisify his understanding enough to distinguish between elements and molecules.  

Not a single student gave this kind of answer.  In fact, I don't think anyone gave this kind of answer to any of the questions.  This suggests that even the kids who are getting the answers right probably don't actually understand the things the test is meant to assess.

People like me, people who love learning so much that they aspire to be professional academics, learn in spite of tests.  In most cases, we grew up believing ourselves to be so much smarter than everyone around us that we were always confident that if someone else was meant to understand, we sure as hell were going to understand as well.  We had confidence in our ability to learn better and faster than required, expected, or maybe imagined.  When faced with the prospect of a test that presented any sort of challenge, we stepped up our efforts, because we knew it would pay off.  By contrast, many students have little confidence not because of low ability but because of learned helplessness.  We did learn to exploit the system because often we just weren't interested in the material, but we never had to deal with a feeling of doubt about our abilities or intellectual worth. 

I think that not only have most people never been taught to apply what intelligence they possess, but they've been taught specifically to behave less intelligently than they would if left to their own devices.  They've thrown in the towel, they're flying blind, and the best they can do is to try to exploit the system, and to pray.

For a boatload of unequivocal empirical evidence that conventional testing is harmful, check out this ginormous meta-study by Paul Black and Dylan Wiliam.  If you’re convinced and want to know what to do about it, I suggest reading up on formative assessment, a good overview of which by D. Sadler can be found here.

Wednesday, August 22, 2012

Proposal for a Course in Which Students Invent Science

Part One: Multiplayer Mode

Every class begins with problem solving.  The question posed in the most recent assignment is written on the board.  The students, as a class, must solve it.  No one is allowed to give an actual answer to the question until a quorum has agreed on the best way to solve the problem, then executed the plan and interpreted the results.  The teacher can participate in the problem solving by posing leading questions, encouraging certain directions of thought, and suggesting that they try using tools they already have, but for the most part the students run this part of the show.  There are two main goals here: to develop their scientific toolboxes, and to encounter the inherent bugs of human minds (cognitive biases) so they can learn to recognize and patch them, thus solving problems more efficiently in the future.  Questions early in the course will emphasize revealing biases, and the later problems will emphasize empirical methods of inquiry and testing.  Overall, we’re working toward inventing something like Bayes' theorem or another broad philosophy of science.

Part Two: The Meta-quest

After the question is answered, the lecture begins.  The teacher recaps everything that just happened, pointing out which things worked and which didn’t.  The methods and biases involved are given short and simple names the students can remember, like “testable hypothesis” and “availability heuristic”.  This serves as an outline for the lecture.  Lecture materials for the next several weeks should be assembled beforehand to allow for maximum flexibility in presentation order.  The lecture explicitly covers only those methods discovered by the students, showing how they’ve been used historically and how they’ve improved over time.  (There is a little leeway here for closely related methods that are particularly difficult to discover in a class setting.)  When biases are identified, the lecture includes descriptions of studies and/or anecdotes evidencing or pinpointing the bias, a discussion (perhaps with class participation) of why we tend to think in that particular way, how we can notice when it presents an immediate danger to reasoning, and how to cope when it does.

Part Three: Personal Quests

An assignment is given at the end of every class: the students learn what question they’ll be answering the next day, and must come up with a plan for finding the answer.  These competing methods will duke it out in class debate the next day.  They must also propose problems whose solutions could be found by methods learned in class, which can be hypothetical or drawn from their lives or stories they’ve heard.

Part Four: Leveling Up

Tests will be given periodically, but their frequency will depend on how much has been discovered how quickly.  They will include simple questions about the material covered in the lectures, and a problem that can be solved only by using several if not all of the tools acquired since the last test.

Part Five: Winning the Game

The final will be cumulative.  There will be an in-class portion that is similar to the basic question and answer portions of previous tests.  The take-home portion of the test will have two parts.  The students have a choice on the first part.  They can either choose to answer one particularly difficult question, or they can answer three easier questions.  They must write an essay explaining how they went about solving the problem and why.  For the second part of the test, they must propose and defend a definition of Science.

What I'd really like to see in the comments here is a brainstorming session in which we generate a whole bunch of useful project ideas for a class like this.  In particular I'd like to focus on things geared toward 8th graders, but other thoughts are also welcome.

Monday, August 6, 2012

In Defense Of Semantics, Or: Until You Can Say What You Mean, You Cannot Mean What You Say

Semantics matters.  In a debate, it is all that matters.  What matters when you’re talking with someone is what you mean by what you say, and what the other person takes you to mean by it.  That’s what communication is.  If I could just project my thoughts into your head all at once, my choice of words and the order in which I arrange them wouldn’t matter.  Words wouldn’t matter.  But I’m not a dolphin, so I have to use language if I want to communicate.

Language is a social behavior in which symbols such as sounds or gestures are agreed by participants to denote entities in the world.  The symbols are arranged according to structural guidelines of temporal progression to denote larger concepts in an organized way.  Thus, an image of a concept is projected from the mind of the speaker into the world for listeners to observe and replicate in their own minds.  This process is called communication.

The success of the project of language depends on two things: syntax and semantics.  Syntax, the rules by which symbols are organized to denote more complex concepts, is ultimately a servant of semantics.  It allows for the communication of thoughts far deeper and more intricate than the mere vocabulary.  But it has no purpose whatsoever in the absence of semantics.

In natural language, “semantics” is a set of correspondences, some between symbols and the things they denote, and others among entire sentences.  For instance, the relationship between the sentences “math is exciting and challenging” and “math is challenging” is one of semantic entailment, because the meaning of the first entails the meaning of the second.  Suppose I formulate the following sentence and speak it aloud: “Oma cabeca djorglesnuff.”  Even if you know all the rules of the syntax I’m employing, my utterance will be completely useless as communication until I explain somehow that by “oma” I mean “cats”, by “cabeca” I mean “eat”, and by “djorglesnuff” I mean “mice”.  Only then can you understand what I’m trying to say, and respond with something equally meaningful that moves the discussion forward.  You can identify my declarative sentence as a specific claim.  “My oma,” you might say to me, “does not cabeca djorglesnuff.  I think you’re wrong to say they do.”

And here we’re at a point where the two of us might start “arguing semantics”, because the next thing I say is, “I didn’t mean that all oma cabeca djorglesnuff.  I only meant that some oma cabeca djorglesnuff.”  “Ah,” you say to me, “then you’re correct, but you should have specified that when you were explaining what you meant by ‘oma cabeca djorglesnuff’ in the first place.”  And you are perfectly right to call me out on that.

Why?  Because the sentence “some cats eat mice” entails the truth of different sentences than does “all cats eat mice”, and if I didn’t provide you with the tools to determine which sentences my utterances entail, then my words haven’t sufficiently meant and I’ve done a poor job of communicating.  
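
For readers who like to see this spelled out, here is a rough sketch of the difference in Python.  It is a toy model and not a serious theory of meaning: the two cats, their names, and the function names are all made up for illustration.  A "world" records which cats eat mice, and one sentence entails another if the second is true in every world where the first is true.

```python
from itertools import product

# Hypothetical toy universe: two cats, invented for this illustration.
CATS = ("tom", "felix")


def worlds():
    """Yield every possible world: each cat either eats mice or does not."""
    for values in product([True, False], repeat=len(CATS)):
        yield dict(zip(CATS, values))


def all_cats_eat_mice(world):
    return all(world.values())


def some_cats_eat_mice(world):
    return any(world.values())


def tom_eats_mice(world):
    return world["tom"]


def entails(premise, conclusion):
    """True if the conclusion holds in every world where the premise holds."""
    return all(conclusion(w) for w in worlds() if premise(w))


if __name__ == "__main__":
    # "All cats eat mice" commits the speaker to a claim about Tom in particular.
    print(entails(all_cats_eat_mice, tom_eats_mice))
    # "Some cats eat mice" does not: Felix alone might be the mouse-eater.
    print(entails(some_cats_eat_mice, tom_eats_mice))
```

In this toy model the "all" sentence entails the claim about Tom and the "some" sentence does not, which is exactly why the listener needs to know which one was meant before agreeing or objecting.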

Consider the following conversation.
A: God exists.
B: No he doesn’t.
A: Yes he does, and I can support my claim.  Behold!

A holds up a spoon.

B: What does that have to do with God?
A: It is God.  See?  It exists.  God exists.
B: You think that God is a spoon?
A: Well... yeah.  That’s what I meant by “God”.  You’re not going to argue mere semantics with me, are you?
B: You bet your boots I am.

The above two cases are perfectly legitimate grounds for substantial semantic disputes.  In both cases, one party has done a poor job of communicating, and the other rightly asked for more careful formulations of what is to be projected through language.  In the first case, the failure was a matter of ambiguity.  There were multiple propositions the speaker might have intended to convey, the distinction between the possible propositions was significant, and thus the misunderstanding was not the fault of the listener.  What the speaker actually said did mean something, but it didn’t mean as much as it should have.  What it meant was not precise enough for the purposes of the discussion at hand.  He did not mean what he said, because he did not say what he meant.

In the second case, the speaker meant by his words something outside of the standard, agreed-upon set of entities that might be denoted by them.  The reason the word “God” is mostly useless in discussions with people who are used to attending to fine conceptual distinctions is that the standard set of notions to which God might be taken to refer is very large and poorly defined; not only is there ambiguity, there is vagueness.  But in the case of a spoon, using the term “God” causes more confusion than usually comes with even that word.  The speaker did not mean what he said, because he did not say what he meant.  If A were to say “God exists,” and B were to say, “I think so too” but take “God” to mean “a porcupine”, the listener would also be making the same kind of mistake.

So you see, the more accurate and careful we are with our language, the more intricate, interesting, and useful will be our communications, and the more worthwhile will be our debates.  Clarity matters.  Precision matters.  Sensitivity to semantic distinctions is a valuable skill, as is the diligence to attend to them.  When philosophers argue semantics with you, the purpose is neither to annoy you nor to show off.  It’s to actually get somewhere with the conversation.  If we’re asking for precision and clarity, we’re doing so because we excel at identifying problems that derive from a lack of these, problems that would lead to frustrating, tiresome spirals of self-perpetuating confusion.  

There are important things to learn from people who shatter your semantic endeavors and ask you to rebuild them from the shards.  Developing the patience to face down the linguistic challenges of philosophers will lead you to wield language that is as sharp and strong as the edge of a samurai blade.  And if you choose instead to dismiss such attempts at careful communication as tiresome nitpicking, do not meddle in the affairs of philosophers, nor seek what they have sought.  

If you’re too lazy to say what you mean, how are you ever to mean what you say?  And why should I believe that you do?