Monday, January 25, 2016

Verbal Processing: Take Two

Who wants to do some SCIENCE!?

Back in August, as part of my Tortoise Skills project, I briefly examined my experience of linguistic processing. (My hearing is fine, and the problem extends to reading while people are talking.) Habit installation, as I was approaching it, didn't seem to be the right tool for the job, so I moved on.

Yesterday, I paid close attention to my experience of a 4-person conversation in a coffee shop, and found that this remains one of my main obstacles - maybe the main obstacle - to comfortable socialization. So much of my processing power seems to go to translating the sounds people are making into coherent language that there's hardly anything left over for thought, speech, or empathy.

In fact, it occurs to me for the first time that my apparently "low empathy" could result almost entirely from this, given that I have no problem reading fiction.

I'd like to take another stab at this, targeting the most common types of face-to-face conversation and using whatever tools seem appropriate.

To get started, I'm going to need some help.


  • 22 one-minute audio clips of someone talking. (Hundreds would be better if this happens to be trivial, since that would support a training program instead of just a test.) 10 should be of person 1 talking, and 12 should be of person 2. Both people should be male, and the clips should all be of equal audio quality. The topics should be something I'd have no trouble comprehending if I were reading the words instead of listening (so, like, not an advanced physics lecture).
  • a transcript of two of the clips.
  • some super basic mixing, just putting one clip on top of another and varying volumes
  • people to grade me by listening to the clips and reading my summaries
  • improvements to the design of this test

I can't do that stuff, 'cause I need to not know what's in the clips.


  • Trial 0: I'll read one of the transcripts, then immediately summarize it in writing from memory.
  • Trial 1: I'll listen to the first clip, then immediately summarize what I just heard in writing.
  • Trial 2: Same with clip 2, but with meaningless noise in the background.
  • Trial 3: Clips 3 and 4 simultaneously, with 4 at lower volume. I'll try to only pay attention to 3.
  • Trial 4: Clips 5 and 6 at the same volume, paying attention to 5.
  • Trial 5: Clips 7 and 8, paying attention to 7 but with 8 at higher volume.
  • Trial 6: Clips 9 and 10 at equal volume, paying attention to both at once, then summarizing each.
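For whoever picks up the mixing for Trials 3 through 6, here's a minimal sketch of what "one clip on top of another at varying volumes" could look like. It assumes both clips have already been decoded to lists of float samples between -1.0 and 1.0 at the same sample rate. The function name `mix_clips` and the specific decibel offsets are made up for illustration; the trial list only says "lower", "same", and "higher".

```python
def mix_clips(target, distractor, distractor_gain_db=0.0):
    """Overlay `distractor` on `target`, sample by sample.

    Both inputs are sequences of float samples in [-1.0, 1.0] at the
    same sample rate (an assumption of this sketch). The distractor is
    scaled by `distractor_gain_db` decibels relative to the target.
    """
    # Convert decibels to a linear amplitude factor.
    gain = 10.0 ** (distractor_gain_db / 20.0)

    # zip() stops at the shorter clip, so the mix is trimmed to fit both.
    mixed = [t + gain * d for t, d in zip(target, distractor)]

    # If the sum exceeds full scale, rescale the whole mix to avoid clipping.
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed


# Hypothetical volume offsets for the second clip in each two-clip trial.
# The actual values are a design choice that hasn't been made yet.
TRIAL_DISTRACTOR_GAIN_DB = {3: -6.0, 4: 0.0, 5: 6.0, 6: 0.0}
```

Something like `mix_clips(clip3, clip4, TRIAL_DISTRACTOR_GAIN_DB[3])` would then produce the Trial 3 audio, given `clip3` and `clip4` as sample lists.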

My summaries should be graded on detail and accuracy (separately), and I'll try to predict my scores in advance. Graders should compare to the raw clips without interference, and shouldn't know which trial they're grading.
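One way to keep graders from knowing which trial they're grading, and to check my predictions afterward, is sketched below. Everything in it is an assumption: the names `blind_summaries` and `prediction_error` are invented, the codes like "S01" are arbitrary, and the scoring scale is whatever the graders end up using. Each grader would still need to be told which clean clip to compare a summary against, just not the trial condition.

```python
import random


def blind_summaries(summaries, seed=None):
    """Shuffle summaries and replace trial labels with neutral codes.

    `summaries` maps a trial label to the summary text. Returns
    (packets, key): `packets` is a list of (code, text) pairs to hand
    to graders, and `key` maps each code back to its trial label. The
    key stays hidden until grading is finished.
    """
    rng = random.Random(seed)
    trials = list(summaries)
    rng.shuffle(trials)

    packets = []
    key = {}
    for i, trial in enumerate(trials, start=1):
        code = f"S{i:02d}"
        key[code] = trial
        packets.append((code, summaries[trial]))
    return packets, key


def prediction_error(predicted, actual):
    """Mean absolute gap between predicted and graded scores.

    Both arguments map a trial label to a number on the same scale.
    Run it once for detail scores and once for accuracy scores.
    """
    return sum(abs(predicted[t] - actual[t]) for t in actual) / len(actual)
```

Since detail and accuracy are graded separately, `prediction_error` would be called twice per test, once for each set of scores.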

Then I'll make some kind of training program, and take the test again (with different clips, of course) at the end.

Naturally, I'll blog about it all afterward.

So, is anybody out there excited by this idea? Interested in lending a hand with part or all of it? So interested that you want to take charge of adding more trials (like testing visual interference, music, other voices, more voices, etc.)? Want to try the test yourself when it's ready? Have clever ideas for training auditory processing? Know of a test just like this one that already exists and I should just take that instead?

Talk to me about any of these things.


James said...

While I think you are onto something with your training plan, I'll offer another consideration, which I hope helps, too.

You might be feeling "fight-or-flighty" due to anxiety which you aren't consciously aware of, because it's like a constant hum.

In an anxious state like this the limbic system is king and the cerebrum doesn't get as much of a say, so widened awareness, and speaking and listening, suffer.

Unknown said...

I'm excited

Brienne said...

James: The above is just a test to measure the effect of my training. I haven't made a training plan yet. The training plan could very well end up involving anxiety reduction.

Unknown said...

Voice to text machine learning systems are probably trained using a whole bunch of (audio, transcript) pairs. Such a data set would make a great source of training data for this project.

Unknown said...

Tried this with various these:

My ability to process tapped out at Trial 5, and I spent some time twiddling my thumbs and contemplating what half-considered TAPs of how-to-socialize-best this data finalizes in my head.

I may have made a mistake using wooey audio that is predisposed to vapidities; lack of coherent content makes it especially difficult to check my comprehension. If I find audio with more concrete topics, I can get some finer granularity on how the skill improves with training.

Anonymous said...

The DFTalk miniseries ( ) meets some of your criteria. There is a lot of material and there is a full transcript for each one. The main thing it misses is that it's more of an interview/monologue format than a discussion with an even amount of each person talking.