Since I am currently living in Paris, I decided to take advantage of the wonderful movie culture and see G.I. Joe: Retaliation (if you stumbled onto this post expecting to find information about the reptilian Obama, I apologize). Obviously the first question anyone has when watching this movie is, “Is this movie scientifically accurate?” Since I am not a real scientist (the kind that wears a white lab coat and uses Bunsen burners), I cannot speak to most of the questions, but I can touch on the aspects related to computer science.
In the middle of the movie, the surviving Joes have postulated that there must be something wrong with the president. Lady Jaye hits the magic analyze button on the computer (if she can cyberblast encrypted beacons, then there isn’t much outside her abilities) and produces some convincing linguistic analysis showing the current president is an impostor. She shows that his word choices have changed in the last few weeks. The focus is on his choice of filler words, the words you stick between the content of your sentences while thinking. These are the types of words teachers always try to get you to stop using when giving class presentations. Lady Jaye claims your use of these words is subconscious and that you cannot just change the words you use. Let’s assume she is correct; how could this analysis be done?
Obtaining large amounts of recorded speech of a president should not be a difficult task. Public speaking is a large part of his job, and hundreds, if not thousands, of hours must exist. While this is a large amount of speech, enough to build a president-dependent speech recognizer (can you believe there is not a single research paper titled “President-dependent speech recognition”?), it may not be the type of speech we are interested in. When the president gives a speech, it is usually a polished presentation, more akin to a news reporter reading from a teleprompter than a conversation. In such scripted speech, filler words are rarely used. If you consider that the speeches are written by someone else, then the variations in word choice might actually come from the speech writers and not the president himself. Even if sufficient data can be obtained, significant challenges lie ahead.
Once the president’s speech has been obtained, the actual words spoken have to be identified in the audio. This could be done automatically, but would require a huge effort. Also, even with the best system, at least 10% of the words would be incorrect. Another barrier is that most recognition systems do not recognize individual filler words. They are instead trained to output <fw> or <filler-word>, just some marker that a filler word happened. An additional system could be built to identify the actual filler words once their locations have been detected, but that would be another large obstacle.
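To make that 10% figure concrete, here is a minimal sketch of how the error rate of a transcript is usually measured: word error rate, the word-level edit distance between a human reference and the recognizer output, divided by the length of the reference. The function name and the toy sentences are my own illustration and do not come from any particular recognition toolkit.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# Toy example: the recognizer replaced the filler word with a generic marker.
reference = "um I think we should act now"
hypothesis = "<fw> I think we should act now"
wer = word_error_rate(reference, hypothesis)
```

In this toy case the only error is the filler word itself, which is exactly the word the analysis cares about, so even a low overall error rate can hide the information we need.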
Assuming the proper data can be collected and the transcripts can be produced, the actual analysis can begin. The task amounts to speaker verification, confirming the identity of a speaker based on a sample of data. Usually this task is done with audio data, but since the proposed impostor has an identical voice, the transcript must be used instead. I could find surprisingly little research in this area. My best guess is that the circumstances where this would be needed would be uncommon.
Due to the constraints of this particular situation, a simple approach would probably suffice. The simplest approach would be to count the number of times the president spoke each word in a given time period. From there you can calculate the frequency of each word. Given prior knowledge about the usage of words, comparisons could be made. For instance, you could compare the relative frequencies of soda and pop, or the frequencies of filler words, before and after the suspected switch. Even then, you would have to be careful you are not just anomaly hunting.
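The counting approach described above can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: the filler-word list is a hypothetical one I made up, the transcripts are toy strings, and a real analysis would need far more data plus a significance test before claiming anything.

```python
import re
from collections import Counter

# Hypothetical filler-word list, chosen for illustration only.
FILLERS = ("um", "uh", "like", "well", "so")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(text, vocabulary=FILLERS):
    """Fraction of all tokens in `text` accounted for by each vocabulary word."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in vocabulary}


def frequency_shift(before, after, vocabulary=FILLERS):
    """Change in relative frequency of each word between two transcripts."""
    f_before = relative_frequencies(before, vocabulary)
    f_after = relative_frequencies(after, vocabulary)
    return {w: f_after[w] - f_before[w] for w in vocabulary}


# Toy transcripts from before and after the suspected switch.
before = "um I think uh we should um act now"
after = "well I think well we should like act now"
shift = frequency_shift(before, after)
```

A large negative shift for one filler and a large positive shift for another is the kind of pattern Lady Jaye points at, though with transcripts this short any difference is noise.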
It probably comes as no surprise that it is unlikely anyone could quickly perform this analysis in an abandoned YMCA with a scavenged computer; a tremendous amount of time, effort, and computing resources would be required. The data needs to be collected, transcribed, and analyzed. In addition, this all ignores what is probably the most difficult step: formulating the initial hypothesis. The Joes would have to hypothesize that there would be a difference in word usage for the president after a particular date. Often in science, the process of testing an idea can seem rather straightforward after the fact, but that ignores the initial insight that was required.