
This is Fun: How Well Can a Computer Transcribe Song Lyrics? [INFOGRAPHIC]
Years ago during my metalhead phase, I played drums in a power trio called…wow, I can’t remember our name. Snyper, maybe. Let’s go with that.
On guitar, we had Darren, a guy who looked freakishly like Eddie Van Halen under low light. On bass and vocals was my buddy Charlie. Having him as lead vocalist was a bit risky given his shyness and post-teenage dorkiness, but since neither Darren nor I was brave enough to raise our voices in song, the job as frontman fell to Charlie.
As part of his job as our singer, transcribing the lyrics of the songs we covered fell to him. Using a portable cassette player and sometimes a turntable, he dutifully went over the songs we chose, writing down all the words in block letters in a notebook he took everywhere. He spent hours agonizing over every syllable. The only lyric that ever failed him was from Goddo’s “Sweet Thing.” Years later, I asked Greg Godovitz about the line halfway through the first (and third) verse that Charlie could never understand. (Godovitz replied, “‘Lizard fuel.’ Sperm. Cum. Got it? Duh.”)
When we played live, that notebook was always open at his feet as a safety net–although I’m not sure how he could read what he’d written, given that his performing courage required two triple-rye-and-Cokes in quick succession before showtime.
Charlie guarded that notebook as if his life depended on it, never showing it to me or Darren. One day, though, he slipped up and left it where we could flip through it. So we did.
This is when we came across what he’d written for our cover of the Kinks’ “You Really Got Me,” which we played Van Halen-style.
Girl, you really got me now.
You got me so I don’t know what I’m doin’.
Girl, you really got me now
You got me so I don’t know what you weigh
Seriously. He’d been singing that for months–yet no one called him on it. Must’ve been the rye-and-Cokes.
Which brings me to this. Can today’s computers figure out song lyrics better than poor Charlie could? Let’s find out.
Will George Orwell’s Fiction Turn Into Reality?
Debates over the usefulness of artificial intelligence have raged for decades; think of George Orwell. Scientists, politicians and lay people all wonder whether computers will replace humans at most activities. Some believe the takeover is inevitable. After all, besides maintenance and software updates, there are few costs associated with a robot employee; meanwhile, humans require insurance, workers’ compensation and vacation days.
Yet, it seems that, for the near future, humans remain irreplaceable in most fields. So, to ascertain the continued viability of humans in the language translation industry, researchers devised an experiment.
The Research Design
There were two objectives. The first was the aforementioned test to see which party, humans or machines, proved the better transcriber. The second was to observe which proved better at creating mondegreens, those funny, incorrect lyrical phrases that human listeners often come up with when they mishear a song.
The research team decided that the best approach was to locate four songs unfamiliar to both the computer and the humans. They put the IBM Watson Translation Machine up against two human professional transcriptionists.
How Did They Do?
The two human transcriptionists proved themselves very capable for the job. Neither of them made any lyrical errors, and neither missed a single word: both transcribed what they heard with 100% accuracy. Overall, this performance was amazing.
As for Watson, well, he had a few problems getting things right. Overall, the IBM Machine made 62 total lyric and missing word errors.
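For the curious, here is a rough sketch of how a tally of "lyric and missing word errors" could be computed. The infographic does not say how the researchers actually counted, so everything here is an assumption: the helper names `tokenize` and `count_errors` are made up for illustration, and word-level edit distance is simply one common way to score a transcript against the real lyrics. The sample line is Charlie's version of "You Really Got Me."

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count_errors(reference, hypothesis):
    """Count wrong, missing, and extra words via word-level edit distance.

    Hypothetical helper: one common scoring approach, not necessarily
    the method used in the experiment described above.
    """
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    # cost[i][j] = (total, wrong, missing, extra) for ref[:i] vs hyp[:j]
    cost = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word is missing
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)  # every transcribed word is extra
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            # Option 1: pair the two words (free if they match).
            t, w, m, e = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (t, w, m, e)
            else:
                best = (t + 1, w + 1, m, e)
            # Option 2: the reference word was dropped by the transcriber.
            t, w, m, e = cost[i - 1][j]
            best = min(best, (t + 1, w, m + 1, e))
            # Option 3: the transcriber added a word that is not in the song.
            t, w, m, e = cost[i][j - 1]
            best = min(best, (t + 1, w, m, e + 1))
            cost[i][j] = best
    total, wrong, missing, extra = cost[-1][-1]
    return {"wrong": wrong, "missing": missing, "extra": extra, "total": total}


if __name__ == "__main__":
    real = "You got me so I don't know what I'm doin'"
    charlie = "You got me so I don't know what you weigh"
    print(count_errors(real, charlie))
```

Scored this way, Charlie's line comes out with two wrong words and none missing, which is about as good as a mondegreen gets.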
What Can We Make of All of This?
It does not take a rocket scientist to realize that the humans beat Watson handily. The machine found it difficult both to understand what it heard and to make phrasal guesses. In the end, Watson made a lot of errors, none of which replicated the mondegreens created by the human mind. It seems the machine was even unable to guess in the manner of a human.
As for the professional transcriptionists, they too failed to come up with any mondegreens, largely because they were just too good for the test. Watson might argue, if he could, that the humans had probably heard all the songs before. It is true that people can recall things they have heard before even when they believe they are hearing them for the first time. The machine would not have had this advantage.
What can be made of these results? Well, it seems that anyone who needs some music transcribed should depend upon a human for the time being. Nevertheless, in the future, perhaps Watson and other machines will become the Big Brother described in George Orwell’s 1984.