I, for one, welcome our new robot overlords
February 15, 2011 9:15 AM Subscribe
After the first night, IBM's Watson has only played a single round of Jeopardy, and could be doing better. Stephen Baker (who wrote the book on this) and David Ferrucci (the project's chief scientist) analyze some of his mistakes. After his career as a game show contestant is over, will Watson be up for a role on House?
I didn't have any way to watch this, but I was really interested, and the post-mortem analysis is really welcome - thanks for the post!
posted by Wolfdog at 9:17 AM on February 15, 2011
This will be far more interesting once the supercomputer is reduced to the same footprint and volume as its human competitors, instead of a separate warehouse and dedicated refrigeration system.
posted by Thorzdad at 9:20 AM on February 15, 2011
Well, at least Watson didn't try to strap Trebek down and impregnate him - so it could have been worse.
posted by Joe Beese at 9:20 AM on February 15, 2011 [8 favorites]
The whole episode was less like a round of Jeopardy and more like a long commercial for IBM
posted by orville sash at 9:23 AM on February 15, 2011 [7 favorites]
Here's the j-archive page for the game if you want to review it in detail.
It seems like Watson does best when there's an unlikely word or phrase (like "Badar-Dur" or "died in a church and was buried along with her name") in the question which can be confidently linked to an answer. He (it?) is at his worst when the "key words" in the question are not as related to one another, like in "Name the Decade" (every one of which he missed).
The real test is going to be Final Jeopardy -- I'm not sure whether Watson would be at an advantage or a disadvantage there. Advantage, because the Final Jeopardy answer is usually a super-esoteric fact (which he's good at) rather than a confusingly worded clue (which he's not). Disadvantage, because the 30 seconds of thinking time doesn't really help him. I'm looking forward to tonight.
posted by theodolite at 9:25 AM on February 15, 2011 [1 favorite]
If you didn't get a chance to watch last night, it's up on youtube -- part 1, part 2.
posted by crunchland at 9:27 AM on February 15, 2011 [1 favorite]
The whole episode was less like a round of Jeopardy and more like a long commercial for IBM
Duh? That's kind of the goal, to show that IBM has a bunch of smart people and can be a great consulting company.
They're apparently so good, I now send my mortgage payment to IBM. (IBM Lender Business Process Services or something like that)
posted by SirOmega at 9:27 AM on February 15, 2011
I can't wait to show this episode to my currently nonexistent grandchildren, who will reply, "Why is this a big deal? It's not even winning, and our computer is faster."
posted by Felex at 9:27 AM on February 15, 2011 [6 favorites]
I thought it was interesting that it missed the Voldemort question, because the question was actually poorly written. Mad-Eye Moody was listed as one of Voldemort's victims, but in the books, it's unclear if Voldemort himself did the deed, or one of his minions. Mad-Eye is killed "off screen," so to speak.
When the question came up, I thought, "The computer has no chance here." I gave myself a little high-five.
I wonder how much human-provided A.I. is coded into the background. For example, how does Watson choose his categories and question values when it has the opportunity to choose? How does Watson choose to bet on Daily Doubles? Is it aware of the game clock and the other contestant scores? On a Daily Double, you're betting how much you know about the category ... so is it deriving a "confidence level" in the category's wording before it even sees the question?
posted by Cool Papa Bell at 9:28 AM on February 15, 2011 [1 favorite]
If you want numbers rather than post-game analysis, a Google Docs spreadsheet has Watson's stats, which should be updated as the shows progress.
posted by Bora Horza Gobuchul at 9:29 AM on February 15, 2011 [1 favorite]
Advantage, because the Final Jeopardy answer is usually a super-esoteric fact (which he's good at) rather than a confusingly worded clue (which he's not).
Actually a lot of Final Jeopardys tend to be relatively well-known facts but clued in an esoteric manner. So Watson might actually have some trouble, depending on how hard they make them. (FJs can have wildly varying difficulties.)
posted by kmz at 9:32 AM on February 15, 2011
a long commercial for IBM f'n sweet AI
It also seemed like Alex was reading the clues a hair more slowly than he usually does... maybe to give Watson just a hair more time to come up with an answer?
Overall, I was impressed with Watson's performance, and you could see that Ken and Brad were frustrated with its buzzer speed before the first commercial break (to the point that Ken later rang in before he had formulated a response, beginning with, "Gee, I don't know...").
Also note that Watson went right for the Daily Double (that square contains the Daily Double more often than any other on the board), and chose low-value clues while it had the lead to keep its opponents from catching up. Human competitors so rarely go "across the board" and slow the game down when they're ahead.
posted by uncleozzy at 9:32 AM on February 15, 2011 [4 favorites]
Apparently one of the questions Watson got wrong was a repeat of the wrong answer of one of the human players.
When I heard that I wondered if the designers had just forgotten to account for the condition where an answer can be categorically ruled out due to a prior attempt, or if Watson has to interpret the spoken question of the other players and failed to do so in that case. How, if at all, does Watson become aware of prior answers from the human players?
posted by Babblesort at 9:32 AM on February 15, 2011
Fascinating stuff. (We're rooting for the humans.)
I'm rooting for the machine. If Watson wins, it's not a victory for computers, it's a victory for humans. Watson was built by people and the QA insights resulting from his construction will serve people. I'm excited for the possibility of a Google that has that kind of language-parsing ability.
posted by theodolite at 9:33 AM on February 15, 2011 [6 favorites]
How, if at all, does Watson become aware of prior answers from the human players?
He doesn't know what they answered -- it's one of his known problems.
It would probably be trivially easy to add that level of speech recognition to Watson. Except if Watson did speech recognition, he'd be expected to listen to Alex read the clues instead of getting them electronically as a text file, and he needs the time while Alex is reading to actually figure out the answer -- if he had to wait to the end of the reading to start calculating, he would never do it in time to buzz in.
posted by jacquilynne at 9:37 AM on February 15, 2011 [6 favorites]
theodolite is a Cylon!
posted by kmz at 9:37 AM on February 15, 2011 [1 favorite]
Last night's episode was pretty fascinating in how it showed the limitations of the current technology. I was amused when Watson lost a question by repeating the answer another contestant had just given, because it's deaf and blind and can't hear the other players. Also interesting that Watson's specs aren't really all that huge for a supercomputer; IBM's Roadrunner is much bigger.
posted by octothorpe at 9:39 AM on February 15, 2011
if Watson has to interpret the spoken question of the other players and failed to do so in that case
The little intro Trebek gave described Watson as "deaf and blind", so I don't think it can hear what the other contestants are saying. There must be a Jeopardy employee off-stage telling the computer whether it got it right or wrong, and Trebek talking "to" Watson is for the benefit of the audience.
It was interesting to see what it got right and wrong. It seemed to have the most difficulty with the decades and that word association category. I was a bit surprised about the decades, since that really should just be pulling stats out of a database, no?
I was least surprised about the names category, since matching a name to a segment of a song lyric seems to my untrained view to be a perfect task for a computer running a comparison against a knowledge bank.
posted by backseatpilot at 9:39 AM on February 15, 2011
I was at a viewing party yesterday that included a Q&A session with one of Watson's developers, so maybe I can clear up a couple of details.
On a Daily Double, you're betting how much you know about the category ... so is it deriving a "confidence level" in the category's wording before it even sees the question?
Watson pays attention to the way questions are grouped into categories, but for the most part it ignores the category names themselves. Apparently they tried plugging in that information at one point, but the titles are so idiosyncratic and full of wordplay that it did more harm than good. The exception would be a few particularly common categories that need special handling, like "Before & After."
Apparently one of the questions Watson got wrong was a repeat of the wrong answer of one of the human players.
It doesn't have speech recognition, so it didn't have any way of knowing that Ken Jennings had already guessed the same answer it was thinking of. The question text is fed in electronically when it shows up on the screen, and the correct answer is given to it afterwards as feedback. But nobody's sitting there typing in the other contestants' incorrect attempts in real time.
posted by teraflop at 9:41 AM on February 15, 2011 [2 favorites]
It would probably be trivially easy to add that level of speech recognition to Watson. Except if Watson did speech recognition, he'd be expected to listen to Alex read the clues instead of getting them electronically as a text file, and he needs the time while Alex is reading to actually figure out the answer -- if he had to wait to the end of the reading to start calculating, he would never do it in time to buzz in.
Speech recognition plus OCR on the text of the clue would come closest to what human contestants do (since they usually finish reading the clue before Alex is done reading it aloud), but OCR on an all-caps, one-font source like the Jeopardy questions is probably functionally identical to just feeding in a text file.
theodolite is a Cylon!
Busted.
posted by theodolite at 9:41 AM on February 15, 2011
It would have been interesting to actually make Watson see and hear and have to use OCR, visual processing, etc to read the questions, buzz in, etc. But then I guess it would detract from the pure cognitive processing component of the "experiment".
posted by kmz at 9:41 AM on February 15, 2011
"could be doing better"? Really? I guess some people really do get trapped into focusing on how well the bear dances.
posted by DU at 9:41 AM on February 15, 2011 [2 favorites]
Would the Toaster...errr, Watson...only buzz in if it had greater than 50% confidence in an answer? As I watched, I wondered if that element (judgement about whether to buzz or not) would be missing from Watson, compared to human players.
posted by dry white toast at 9:43 AM on February 15, 2011
It would probably be trivially easy to add that level of speech recognition to Watson
Yes, it would be easy, but I don't think that would solve the problem. The issue is discounting an entire line of reasoning, not parsing the spoken answer. You want Watson to discount everything about the 1920s and every link to the 1920s in one fell swoop. You kind of want Watson to forget the 1920s ever existed and then recalculate.
posted by Cool Papa Bell at 9:43 AM on February 15, 2011 [1 favorite]
On second thought, I change my mind, because including a camera with an OCR system would allow for the possibility of a giant, unblinking red eye.
posted by theodolite at 9:43 AM on February 15, 2011 [5 favorites]
Would the Toaster...errr, Watson...only buzz in if it had greater than 50% confidence in an answer?
I think it actually goes one step beyond that. 50% is the theoretically "correct" threshold, but Watson can change it on the fly. In particular, it's more likely to take risks when it's behind and doesn't have as much to lose.
posted by teraflop at 9:48 AM on February 15, 2011
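The 50% figure follows from the scoring: a right answer adds the clue's value and a wrong one subtracts it, so the expected gain from buzzing is positive only above even odds. A minimal Python sketch of that arithmetic, plus a threshold that loosens when trailing. The function names, the 0.2 maximum shift, and the 10,000-point scale are illustrative assumptions, not anything IBM has described:

```python
def expected_gain(confidence: float, clue_value: int) -> float:
    """Expected score change from buzzing: +value if right, -value if wrong."""
    return confidence * clue_value - (1.0 - confidence) * clue_value


def buzz_threshold(my_score: int, best_opponent_score: int,
                   base: float = 0.5, max_shift: float = 0.2,
                   scale: int = 10_000) -> float:
    """Hypothetical rule: lower the bar when behind, keep the base otherwise."""
    deficit = max(0, best_opponent_score - my_score)
    # The shift grows linearly with the deficit and is capped at max_shift.
    shift = min(max_shift, max_shift * deficit / scale)
    return base - shift


def should_buzz(confidence: float, my_score: int,
                best_opponent_score: int) -> bool:
    """Buzz only when confidence clears the score-dependent threshold."""
    return confidence >= buzz_threshold(my_score, best_opponent_score)
```

With these made-up constants, a 40%-confidence answer would be passed up when leading but attempted when 10,000 points behind, which matches the "more risks when it's behind" behavior described above.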
But then I guess it would detract from the pure cognitive processing component of the "experiment".
Rodney Brooks, with whom I tend to agree, would probably say there's no such thing as a "pure cognitive processing component". You are interacting with the real world on a continuous sliding scale, or something.
posted by DU at 9:48 AM on February 15, 2011
I wonder how much human-provided A.I. is coded into the background. For example, how does Watson choose his categories and question values when it has the opportunity to choose? How does Watson choose to bet on Daily Doubles? Is it aware of the game clock and the other contestant scores?
I don't know about all of this, but I remember seeing that it has a fairly sophisticated "metagame AI", so it is certainly aware of the other contestants' scores and uses this to plan its strategy - for example, if it is far behind, it will take more risks (requiring less confidence in its answer in order to buzz in).
posted by dfan at 9:50 AM on February 15, 2011
The issue is discounting an entire line of reasoning, not parsing the spoken answer.
Well, yes and no. If you want to give Watson the ability to think of a different answer and then ring in, that's complicated on a level akin to the original problem of answering the question.
If you just want to give Watson the ability to recognize that his existing answer has been provided by another contestant, and marked as wrong, and thus, that Watson should not ring in and provide that answer again, that's trivial, and it would save him from losing money by answering incorrectly.
posted by jacquilynne at 9:51 AM on February 15, 2011 [1 favorite]
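The "trivial" half of that distinction can be sketched in a few lines of Python. This assumes, hypothetically, that the other contestants' wrong responses arrive as text; per the thread, Watson received no such feed, and the function name and 0.5 default threshold are made up for illustration:

```python
def should_ring_in(top_answer: str, confidence: float,
                   ruled_out: list[str], threshold: float = 0.5) -> bool:
    """Stay silent if the top answer was already given and judged wrong."""
    normalized = top_answer.strip().lower()
    blocked = {answer.strip().lower() for answer in ruled_out}
    if normalized in blocked:
        return False  # repeating a known-wrong answer can only lose money
    return confidence >= threshold
```

Note that this only compares strings after trimming and lowercasing, so a paraphrase of the same wrong answer would slip through. Picking a different answer instead of staying silent is the harder problem described above and is not attempted here.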
Odds that this is a preview of IBM's TBA web search technology, anyone?
posted by ZenMasterThis at 9:51 AM on February 15, 2011
Ooops, I meant web search site.
posted by ZenMasterThis at 9:56 AM on February 15, 2011
God I wish it would have answered "Vera Wang" on that question, in that lovely flat humorless voice.
posted by Wolfdog at 9:59 AM on February 15, 2011 [3 favorites]
Dry White Toast:
The toaster had an adjustable threshold of hit the buzzer confidence. I didn't hear how they established it but I'd figure his score in comparison to his opponents and the dollar value of the question played into that threshold.
posted by cmfletcher at 10:05 AM on February 15, 2011 [1 favorite]
I've brought this up more than once, so maybe I'm the only one who's missing it: Why doesn't Watson have a face? Is the swirly screen supposed to make him look inscrutable and scary? Seems like a total psychological misfire.
posted by DU at 10:09 AM on February 15, 2011
Things will get interesting once IBM designs a machine that invents a game for humans to be good at.
posted by eeeeeez at 10:11 AM on February 15, 2011 [1 favorite]
I understand why the monolith-like visual reference for the computer is so plain, as IBM doesn't want to distract from their real creation here with some fancy uncanny valley android, and the possibility of something going wrong with the animatronic system being associated with the real Watson software.
I really wish, though, that it was that Philip K. Dick animatronic that was stolen a few years back.
Also, it's been a long time since I've seen Jeopardy, and seeing Trebek now, especially with the segments with the actual server racks, combined with a few of those outtake reels where he's irritated and swearing, Trebek is really reminding me of the Winnebago Man.
posted by chambers at 10:12 AM on February 15, 2011
I think a computer with a human face avatar would be even creepier. Not to mention all the internal politics at IBM deciding what that face should look like. Imagine the legal wrangling trying to create a face that would not offend anybody. It's too white, it's not Asian enough, etc.
posted by COD at 10:13 AM on February 15, 2011 [1 favorite]
"Vera Wang" on that question, in that lovely flat humorless voice.
Out of all the 'rude words' I have made a computer say with those voices over the years, the most consistently funny over time is 'wang'.
posted by chambers at 10:15 AM on February 15, 2011 [1 favorite]
When I was a teen, one of my friends had a large Wang. (actually, it wasn't very big, it was the size of one of those mini fridges..stupid miniaturization)
posted by wierdo at 10:29 AM on February 15, 2011
Old school bloggy types might enjoy reading Robotwisdom's live tweeting of the games. It's a charming bit of weirdness to have Jorn watching pre-recorded TV and live tweeting it ahead of everyone else because he happens to get the news out first because wherever he lives shows Jeopardy earlier than the usual broadcast time. Also, it's Jorn.
posted by Nelson at 10:36 AM on February 15, 2011 [3 favorites]
If they used Max Headroom to represent Watson it would be pretty cool.
posted by Daddy-O at 10:37 AM on February 15, 2011 [4 favorites]
wherever he lives shows Jeopardy earlier than the usual broadcast time.
Is there a "usual broadcast time" for Jeopardy? As far as I know it varies quite a bit between markets.
posted by kmz at 10:41 AM on February 15, 2011
Just a warning that by 'ahead of everyone else', Nelson means he's already live-tweeted the game from tonight -- so if you don't want to know how it ends, don't click!
posted by jacquilynne at 10:43 AM on February 15, 2011
I can't wait to show this episode to my currently nonexistent grandchildren, who will reply, "Why is this a big deal? It's not even winning, and our computer is faster."
I'm sure that Watson also looks forward to showing this episode to its currently non-existent great-great-great-great-great-great-great-great-great-great-great-great-great-great-great-great-great grandchildren, who will ask "What are those pink squishy things standing next to you?"
posted by Strange Interlude at 10:59 AM on February 15, 2011 [4 favorites]
Nova had an episode on this last week. Some good discussion on the problems they had to solve to get Watson competitive. Nova: Smartest Machine on Earth
posted by calamari kid at 11:09 AM on February 15, 2011
Jennings did a Q&A thing today. Fun read.
posted by Perplexity at 11:11 AM on February 15, 2011 [1 favorite]
Via Hacker News: How I Beat IBM's Watson at Jeopardy (3 Times!) by Greg Lindsay.
The title refers to the "sparring rounds" that were held in secret to train Watson over the past year.
posted by teraflop at 11:17 AM on February 15, 2011 [1 favorite]
I wish they'd spent a little more time on the voice synthesis. He sounds nothing at all like Hal.
posted by crunchland at 11:20 AM on February 15, 2011 [1 favorite]
NYT : Play against I.B.M.’s question-answering supercomputer.
posted by crunchland at 11:27 AM on February 15, 2011
Jennings did a Q&A thing today.
...where he mentions GiveWell right at the bottom in relation to how he chose his charity for the potential winnings. Someone get this man a Metafilter account!
posted by backseatpilot at 11:30 AM on February 15, 2011 [1 favorite]
Q.
Did Watson's voice throw you off your game at all?
– February 15, 2011 10:30 AM Permalink
A.
KEN JENNINGS :
Yes, because I wanted him to sound like Darrell Hammond doing Sean Connery on SNL's Celebrity Jeopardy instead. "That'sh not what your mother shaid lasht night, Trebek!"
+1 favorite
posted by roomthreeseventeen at 11:33 AM on February 15, 2011 [5 favorites]
PBS Newshour : A: This Computer Could Defeat You at 'Jeopardy!' Q: What is Watson? (aired 2/14/11)
PBS Nova : Smartest Machine on Earth (aired 2/9/11)
posted by crunchland at 11:35 AM on February 15, 2011
I've followed Ken's blog for years and it's generally a delight. The only time I was sad about it was when the Prop 8 crap was going on and he was defending the Mormon Church's actions in California (even though he supports gay rights himself). Otherwise, it's a treasure trove of trivia, funny tidbits, nerdiness, etc.
posted by kmz at 11:43 AM on February 15, 2011
I wish they'd spent a little more time on the voice synthesis. He sounds nothing at all like Hal.
At this point, the HAL thing is way overdone. I'd like to hear a Jeopardy-computer with the voice and speech patterns of John Hodgman, dripping with ironic tweedy condescension. He is a PC, after all.
posted by Strange Interlude at 12:12 PM on February 15, 2011 [1 favorite]
will Watson be up for a role on House?
I'd love to hear this in audio with House's lines dubbed with Watson's voice.
posted by johnstein at 12:24 PM on February 15, 2011
There's an interesting comment on the Ferrucci link - that in the Nova thing it was suggested that they had solved the problem of repeating wrong answers, and that possibly the decades one was repeated because it distinguished between 'the twenties' and '1920s'.
But other than that, I am totally Team Watson on this one.
posted by Sparx at 12:44 PM on February 15, 2011
Team Watson here too. I would love a good voice rec/natural language processing system for home. So, when I got up, I could say something like "Computer, give me the news for last night and today, also, please send an email to my father stating I will have the site done today if all goes well."
posted by Samizdata at 12:47 PM on February 15, 2011
Also, I beat Watson on the NYT game pretty soundly, but got ripped off on some questions by bad answer handling. "What is minstrel?" versus the winning "What are minstrels?"
So, now the machines start working together to destroy us...
I'm moving to Mexico. I hear there's a hot chick down there looking for help...
posted by Samizdata at 12:50 PM on February 15, 2011
Does anyone know what kind of data Watson has access to? I know that it is not connected to the internet, but I don't know what information is lying around on those server racks. A cache of Wikipedia articles? (Half-joking.)
posted by dhens at 1:19 PM on February 15, 2011
According to that Newshour piece, Watson has the entire "World Book Encyclopedia," Wikipedia, the Internet Movie Database, much of The New York Times archive and the Bible.
posted by crunchland at 1:27 PM on February 15, 2011 [2 favorites]
Oh man, I hope there's a clue about a topic whose Wikipedia entry happened to be vandalized when IBM scraped it, and we get to hear Watson ask, "What is penis?"
posted by uncleozzy at 1:31 PM on February 15, 2011 [6 favorites]
I was impressed with Watson's pronunciation of "Jean Valjean." Especially since mispronunciation of a response often invalidates the response.
posted by yeti at 2:00 PM on February 15, 2011 [1 favorite]
Has it been explained how they decided how long it takes Watson to buzz in once he gets the answer? I know he has to physically press the button, but it just seems like he's faster at it than the poor humans, perhaps unfairly so. (Judging from the frustration of the other contestants who were obviously trying to buzz in and couldn't beat him at it.) It seemed like they only managed to get a shot when Watson didn't know it at all, with maybe one or two exceptions. So it doesn't seem quite fair in terms of who KNOWS more- it seems like for every answer that they ALL knew, Watson still got the points simply because he buzzes in quicker.
That said, I'm still enjoying this a lot. I'm particularly enjoying the attitude Trebek seems to be giving Watson, considering he can't uh, hear or think. (Re: the 1920's question, "NO, Watson, Ken already said that!")
posted by GastrocNemesis at 2:12 PM on February 15, 2011 [3 favorites]
You know, the point about the button pressing is pretty important. When I used to play computerized Jeopardy at local BBS parties, one of my best strategies was to buzz in as soon as possible and use the thinking time to answer the question, rather than waiting until I was sure I knew the answer to buzz in.
posted by Samizdata at 2:47 PM on February 15, 2011
Speaking of this, anyone familiar with Nell? I have been using some of my daily Twitter time to help clear up incorrect information.
posted by Samizdata at 2:51 PM on February 15, 2011
Duh? That's kind of the goal, to show that IBM has a bunch of smart people and can be a great consulting company.
I thought it was to show whether a computer could understand the semantics of the English language and beat masters of trivial knowledge. Instead, we got about 5 minutes of jeopardy and then promotional videos bookended by commercial breaks.
posted by orville sash at 3:00 PM on February 15, 2011
I've been looking forward to this and was really disappointed when I found out it wasn't going to be three games, but a single game split across three days with IBM infomercial as filler. Wednesday night stands to be all filler but for about two minutes of Final Jeopardy. Thank goodness for my PVR.
It was interesting to see Jennings and Rutter shut out at the beginning of the game -- surely a novel experience for them.
I've been saying for years that Rutter's waiting for Trebek to die... retire. And now he's moved to LA looking for acting or hosting gigs. Hmm...
posted by Zed at 3:31 PM on February 15, 2011
Zed- it's two games split over three days. The first one finished today, so I am assuming/hoping that they got all the promotional stuff out of the way, leaving tomorrow as just a regular game.
posted by GastrocNemesis at 3:48 PM on February 15, 2011
WHAT IS TORONTO???? HAHAHA U FUCKING IDI-- $947? Oh, you aso.
posted by herrdoktor at 4:59 PM on February 15, 2011 [3 favorites]
I usually don't watch Jeopardy, but I was wondering - if a contestant gives too much of an answer, is it considered wrong? Last night in the Beatles category, the correct answer was simply "Maxwell" (since the clue referred to "his silver hammer"), but Watson gave "Maxwell's Silver Hammer" as his answer, and it was considered correct. (There was another instance of it in tonight's game, but I don't recall what it was.)
posted by Lucinda at 4:59 PM on February 15, 2011
WHAT IS TORONTO???
Okay, since this has been broached... I know that teraflop said that Watson ignores the categories, for the most part, but you'd think that, for Final Jeopardy, and especially with such a cut-and-dried category, it would restrict its possible responses to those in the category's domain. Seems like an odd response. Although I suppose the correct response could have been category-related trivia rather than a member of the category.
I did think, though, that it would be a very difficult clue for an AI to solve when it was revealed.
posted by uncleozzy at 5:38 PM on February 15, 2011
speech patterns of John Hodgman, dripping with ironic tweedy condescension
That would be good, but I think computer voices should be either unironically condescending in tone (exhibit A B C D), or show toady-esque subservience (exhibit A B). It's also fun when those two personality types argue.
I guess that this show defines my computer personality preferences, rather than, say, HAL, Star Trek, or even GLaDOS for that matter.
posted by chambers at 6:27 PM on February 15, 2011
Giving too much information can sometimes cause an answer to be wrong, especially if part of that information is wrong -- for example, giving the wrong first name for an author will still be marked wrong, even if they would have accepted only the last name. But I've seen human players do the same thing Watson did with Maxwell's Silver Hammer (repeat part of the clue even though it was unnecessary) and not be penalized for it.
posted by jacquilynne at 7:31 PM on February 15, 2011 [1 favorite]
Wow he did pretty damn well tonight. The other contestants simply can't buzz in fast enough. Jennings seemed to be getting pretty frustrated.
posted by meta87 at 10:43 PM on February 15, 2011
I know I'm flagrantly anthropomorphizing here, but I found Watson's baffled guessing in Final Jeopardy (and associated wager) to be weirdly adorable.
posted by NMcCoy at 6:14 AM on February 16, 2011 [1 favorite]
This just proves what all true fans have known for years: jeopardy is all about who buzzes in the fastest.
posted by Potomac Avenue at 6:34 AM on February 16, 2011 [3 favorites]
I've only watched the first half of the second day, but Watson's scary. Did they tweak his code overnight? Does he "learn" from his mistakes?
Also, this is fun to watch.
posted by mccarty.tim at 7:17 AM on February 16, 2011
I know I'm flagrantly anthropomorphizing here, but I found Watson's baffled guessing in Final Jeopardy (and associated wager) to be weirdly adorable.
I think it's amazing that he can be baffled at all -- that he can have an answer but at the same time can be fully aware that he is not very sure of the answer. That's neat.
posted by notmydesk at 7:52 AM on February 16, 2011 [2 favorites]
Ok, Computer: Don’t believe the hype about the new super-machine on ‘Jeopardy!’. "if the supercomputer triumphs, it will probably be for another reason entirely: because it can activate the buzzer most quickly."
posted by Nelson at 8:06 AM on February 16, 2011
"if the supercomputer triumphs, it will probably be for another reason entirely: because it can activate the buzzer most quickly."
That's a bit disingenuous; it still has to come up with the correct response. For human competitors, when it's a given that everybody understands (and most can solve) the clue, the buzzer really is the bulk of the game. But the fact that Watson can even get to the point where it has a response to trigger the buzzer is plenty hype-worthy.
posted by uncleozzy at 8:26 AM on February 16, 2011 [2 favorites]
"if the supercomputer triumphs"
This was a triumph......
posted by schmod at 9:21 AM on February 16, 2011 [1 favorite]
"if the supercomputer triumphs, it will probably be for another reason entirely: because it can activate the buzzer most quickly."
Hah, this statement is absurd. Another reason entirely from what?!
Oh:
But here’s the rub: “Jeopardy!” is actually a terrible way of proving that Watson is more intelligent than its opponents.
That is not what this is about.
posted by milestogo at 10:29 AM on February 16, 2011
Hah, this statement is absurd. Another reason entirely from what?!
To be clear, my point is in agreement with uncleozzy above: being able to activate a buzzer doesn't automatically win Jeopardy, you still have to do the hard part of getting the questions right, which is the entire interesting part of the challenge.
posted by milestogo at 10:33 AM on February 16, 2011
it's two games split over three days
Very happy to have misunderstood that point. I'm looking forward to a full game tonight. My money's on Watson being most likely to win, and Jennings being most likely to throw down his signalling device in disgust (but that's still not very likely.)
posted by Zed at 11:26 AM on February 16, 2011
Oh, and the Chris Welty that was one of the IBM people talking last night taught my Intro to AI class when he was a Ph.D. student at RPI. (His stated goal at the time for what to do after he got his doctorate was to work on Porsches and not touch a computer again. Guess that part didn't work out for him.)
posted by Zed at 11:28 AM on February 16, 2011
So, what would happen if Watson were forced to delay its reaction time to equal 250 milliseconds after solving the problem, which is roughly the average human reaction time, before it hits the button?
I strongly want to see a rematch with that tweak.
Or actually, let's have this thing go on "Who Wants to Be a Millionaire?" It'd be interesting to see how it does taking an open question and choosing which set response matches best. Reaction time doesn't matter, as it's single player. It's more interesting to see how the machine answers questions, to me, than how quickly it answers questions. Granted, the clock mechanic would fly out the window, unless it was truly stumped. But it looks like Watson just makes low-confidence guesses, rather than getting stuck.
posted by mccarty.tim at 1:06 PM on February 16, 2011
The way the signalling works is that Alex reads the clue. (I presume that Watson is fed the text of the question as soon as the clue is revealed, just before Alex begins speaking -- like the human players, it's reading ahead -- the humans don't really rely on listening to Alex for the content of the clues either.) When Alex finishes speaking, a person hits a switch to indicate that the players may signal. This lights a light on the players' lecterns. If a player signals too soon, that player's signalling device is locked out for a short period -- I don't remember how long, but it's long enough that the other players would pretty much have to not signal at all or signal and get it wrong for the locked-out player to have any chance.
But the nearly-universal advice is that if you signal after you've seen the light, you're going to lose to a player who was just listening for Alex to stop speaking and has the rhythm right for when to signal. At high-level play, most of the players know most of the responses, and it becomes a signalling race and whoever captures this rhythm best is probably going to win.
The background stuff specified that Watson is physically triggering a standard Jeopardy signalling device, but it didn't talk about how it knows how to signal. I wonder if it has a photoelectric cell "looking" at the standard light or if it's being fed the signal more directly. Either way, I suspect that with Watson, we're seeing the Jeopardy player who can win the signalling race by watching the light.
posted by Zed at 1:37 PM on February 16, 2011 [3 favorites]
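As a rough illustration of the signalling race described in the comment above, here is a minimal Python sketch. The lockout length, the press times, and every name in it (`Player`, `effective_time`, `first_to_signal`, `LOCKOUT_SECONDS`) are assumptions made up for the example; nothing here reflects the actual Jeopardy! hardware or how Watson is wired to it.

```python
from dataclasses import dataclass

# Assumed value: the comment above only says the lockout is "a short period".
LOCKOUT_SECONDS = 0.25


@dataclass
class Player:
    name: str
    # First press, in seconds relative to the moment the signalling light
    # comes on. Negative means the player pressed too early.
    press_time: float


def effective_time(press_time: float, lockout: float = LOCKOUT_SECONDS) -> float:
    """Return the moment a press actually registers.

    Simplifying assumption: a locked-out player presses again the instant
    the lockout ends, and is locked out again if that is still too early.
    """
    t = press_time
    while t < 0:
        t += lockout
    return t


def first_to_signal(players: list[Player]) -> Player:
    """Pick the player whose press registers first."""
    return min(players, key=lambda p: effective_time(p.press_time))


if __name__ == "__main__":
    contestants = [
        Player("jumps the gun", -0.02),      # slightly early, gets locked out
        Player("times the rhythm", 0.03),    # anticipates the end of the clue
        Player("reacts to the light", 0.01), # hypothetical very fast reaction
    ]
    winner = first_to_signal(contestants)
    print(winner.name)
```

Under these made-up numbers, a player who can react to the light within a few hundredths of a second beats both the one who guesses the rhythm and the one who presses early, which is the scenario the comment speculates about.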
He can read the writing on the wall, that's for sure. (mild spoiler)
posted by teraflop at 6:46 PM on February 16, 2011
So, how long until we get web apps using Watson's software?
I mean, sure, I'd just screw with it 90% of the time, but the remaining 10% of the time could be pretty cool. Like with Wolfram Alpha (try using it to compare apples and oranges. It actually works quite well!)
And if it learns, it might precociously start using memes!
If IBM drags its feet or is overly restrictive, Google should get on this. I mean, they have the servers, the developers, and the raw cash. And they have the database of linked ideas.
posted by mccarty.tim at 6:50 PM on February 16, 2011
Ken's latest blog entry on the whole Watson thing is pretty amusing.
posted by jacquilynne at 1:54 PM on February 25, 2011
NJ congressman tops 'Jeopardy' computer Watson -- Turns out all it took to top Watson, the "Jeopardy"-winning computer, was a rocket scientist. U.S. Rep. Rush Holt of New Jersey is just such a scientist.
posted by ericb at 4:15 PM on March 1, 2011 [1 favorite]
This thread has been archived and is closed to new comments
posted by roomthreeseventeen at 9:17 AM on February 15, 2011