
BellKor Wins Netflix $1 Million By 20 Minutes

eldavojohn writes "As we discussed at the time, there was a strange development at the end of Netflix's competition in which The Ensemble passed BellKor's Pragmatic Chaos by 0.01% a mere twenty minutes after BellKor had submitted results past the ten percent mark required to win the million dollars. Unfortunately for The Ensemble, BellKor was declared the victor this morning because of that twenty-minute margin. For those of you following the story, The New York Times reports on how teams merged to form BellKor's Pragmatic Chaos and take the lead, which sparked an arms race of teams joining forces and merging their algorithms to produce better results. Now the Netflix Prize 2 competition has been announced." The Times blog quotes Greg McAlpin, a software consultant and a leader of the Ensemble: "Having these big collaborations may be great for innovation, but it's very, very difficult. Out of thousands, you have only two that succeeded. The big lesson for me was that most of those collaborations don't work."

Comments Filter:
  • Anonymous Coward (Score:1, Insightful)

    by Anonymous Coward

    the topic confuses me

    • Re: (Score:3, Insightful)

      by ksatyr ( 1118789 )
      The whole thing confuses me. Why are these extremely intelligent people doing research for Netflix that would otherwise have cost the company many times the prize money if it had hired them in-house? Are there at least share options down the road? I hope the ultimate solution(s) end up in the public domain.
      • The intelligent people (well, at least the ones who also had enough time to come close to winning...) have good research jobs already.

        I've also heard that, even before they won the prize, they were selling some of their tangential/spin-off ideas to Netflix... The prize seems to have been more of a trigger.

  • Bad Summary (Score:5, Informative)

    by Anonymous Coward on Tuesday September 22, 2009 @02:11AM (#29500935)

    The Ensemble beat BellKor by 0.01%... by their own reporting. According to Netflix, it was a tie. In the case of a tie, the first posted result wins.

    • Re:Bad Summary (Score:5, Informative)

      by tangent3 ( 449222 ) on Tuesday September 22, 2009 @04:39AM (#29501513)

      The Ensemble beat BellKor by 0.01% on the quiz set. Basically, there are 2.8 million records in the qualifying set whose ratings the teams must predict. Half of the records (which half is known only to Netflix) form the quiz set; the other half form the test set. Teams may submit their predictions at most once a day to get a result on the quiz set, but the final decision of who won is made on the result of the test set.

      So even though Ensemble beat BellKor on the quiz set, the test set results came back dead even.
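      (If anyone wants to see roughly what that scoring setup looks like, here's a minimal sketch in Python. The 50/50 split and all the names are my own assumptions based on the description above, not Netflix's actual code.)

          import random

          def split_qualifying_set(record_ids, seed=0):
              """Shuffle the ~2.8 million qualifying record IDs and cut them in
              half: one half becomes the quiz set, the other the test set.
              Which half is which is known only to the organizer."""
              rng = random.Random(seed)
              ids = list(record_ids)
              rng.shuffle(ids)
              mid = len(ids) // 2
              return ids[:mid], ids[mid:]   # (quiz_ids, test_ids)

          def score(predictions, truth, quiz_ids, test_ids):
              """Quiz RMSE is reported back to the team (at most once a day);
              test RMSE stays hidden and decides the final standings."""
              def rmse(ids):
                  se = [(predictions[i] - truth[i]) ** 2 for i in ids]
                  return (sum(se) / len(se)) ** 0.5
              return {"quiz_rmse_reported": rmse(quiz_ids),
                      "test_rmse_hidden": rmse(test_ids)}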

  • It was a tie... (Score:3, Interesting)

    by rm999 ( 775449 ) on Tuesday September 22, 2009 @02:18AM (#29500969)

    It was a tie...

    In football, I can see how a 20 second difference can make the difference between winning and losing the Super Bowl. In a contest like this, which took thousands of man-hours from some brilliant people, calling Ensemble "second place" due to a 20 second difference is just wrong. I don't know if there was a better solution, but something just seems wrong about it all.

    • Re:It was a tie... (Score:4, Informative)

      by dingen ( 958134 ) on Tuesday September 22, 2009 @02:21AM (#29500983)
      Although I think the actual 20 minute difference, instead of your imaginary 20 second difference, is a little less harsh, you're still right.
      • by rm999 ( 775449 )

        Hah whoops, I guess that weakens my football analogy, but I stick to my point.

        And it's not really about the money - a million dollars is nothing when you split it between the companies sponsoring the teams, but the right to say you won the contest means a lot. The 20 minutes realistically had nothing to do with winning or losing.

      • Re:It was a tie... (Score:4, Insightful)

        by aywwts4 ( 610966 ) on Tuesday September 22, 2009 @03:25AM (#29501223)

        Most football games didn't start in 2006, so proportionally 20 seconds is far too long. You didn't exaggerate near enough, someone else can do the math though. (I'm real sleepy, but the imaginary football game came down to roughly 45 milliseconds?)

        I'm really surprised Netflix didn't offer 2 million dollars to the two winning teams, or at least some sort of consolation prize, as it was effectively a tie in a culmination of years of work.

        These people did so much work that even at a million dollars they would likely have earned below minimum wage. Netflix has come a long way since 2006, and this kind of research would have cost many millions; they really can't lose here. Unless the contest took so long that the code isn't useful and they have already surpassed 10% in-house.

        • Re: (Score:3, Insightful)

          by dingen ( 958134 )
          Most football games last for a few minutes more than the standard 90 minutes, depending on the number of incidents during the match. The game would never be terminated in the middle of an interesting action and no proper referee cares about a few seconds.
          • I have to agree (especially after this past weekend's Manchester derby), but I believe the OP is making a reference to American football where the time (of play, not of the game) is much more tightly controlled.

            Which makes the analogy interesting as different sports have different concepts of deadlines and duration of play. As long as there was no violation of their stated rules, I do not see a problem as I think both teams deserved to win and there was only room for one on the podium. I am sure Ensemble wi

    • by aiht ( 1017790 )
      One thing that just seems wrong about your post is the fact that it was 20 minutes, not 20 seconds.
      Not that it makes much difference compared to thousands of man hours, but y'know, try to get it right.
    • It wasn't 20 seconds, it was 20 minutes. Technically, BellKor won the prize by virtue of reaching the target first; technically, their competitor did beat them by a hair's breadth, but not before the goal was reached, and that was apparently what the competition was about.

  • nonsense (Score:5, Insightful)

    by wizardforce ( 1005805 ) on Tuesday September 22, 2009 @02:34AM (#29501029) Journal

    The big lesson for me was that most of those collaborations don't work."

    Setting an arbitrary goal that only 0.2% of competitors could meet does not mean that most collaborations don't work. If 90% of the teams had met the target, you probably wouldn't be so quick to claim that the vast majority of collaborations do work; you'd say the goal wasn't set high enough.

    • I think he's pointing to one of the inefficiencies of prize systems as a way to spur innovation. Thousands of people tried, spending tens or hundreds of thousands of work-hours and other resources, and only a fraction got "winning results" (yes, according to the arbitrary way that winning was defined). But the point is that the prize probably resulted in a very inefficient use of resources. We could hypothesize that the same result might have been achieved with only 25% of the resources spent on the prize -

      • by martin-boundary ( 547041 ) on Tuesday September 22, 2009 @08:20AM (#29502415)

        for example, by making the cost of entry non-zero, you could have eliminated teams with no chance of winning from participating.

        This doesn't work. If you make the entry cost nonzero, you'll be much less efficient at doing *science*. Remember, the journey is much more important than the result. The benefit to society of disseminating knowledge of data mining technologies and good datasets largely dwarfs the value of the winning entry (think Metcalfe's law).

        • Re: (Score:3, Insightful)

          by langelgjm ( 860756 )

          The benefit to society of disseminating knowledge of data mining technologies and good datasets largely dwarfs the value of the winning entry (think Metcalfe's law).

          You're only considering the benefits to society that result from this particular competition. The argument about prize systems being inefficient has to do with the fact that while they generate huge interest in a particular topic (and yes, generate more returns than simply the winning entry), they also result in an inefficient allocation of resources to that one particular topic.

          I.e., some of the entrants would likely have benefited society more by flipping burgers or sweeping sidewalks than by wasting thei

          • You keep claiming inefficiency, yet you don't quote a relevant result. Remember, people who participate in an AI competition are self-selecting, i.e. they are following their preferences. If these (their) preferences had instead tended towards burgerflipping, they wouldn't be spending the time on number crunching. So, there simply is no shortage of burgerflippers in society as a direct result of the existence of the prize, only an increase in AI skills among a subpopulation.

            In fact, comparing the two

            • The burger flipping example was facetious, of course. The point being that it doesn't matter if people are following their preferences - people do not automatically prefer to do that which they are most efficient at doing.

              So, there simply is no shortage of burgerflippers in society as a direct result of the existence of the prize, only an increase in AI skills among a subpopulation.

              Assume the entrants all had moderate computer programming skill. There was likely a lot of duplicative effort in the competition (this happens in other types of research as well). Overall benefits may have been greater if 50% of the entrants worked on 100 different open-source software pro

              • people do not automatically prefer to do that which they are most efficient at doing.

                That's a strange way of viewing efficiency. Who gets to decide what is worthwhile to do for others? You're implying here a framework of value judgements independent of individual preferences, whereas typical definitions of efficiency [wikipedia.org] only require individual preferences.

                Now from the second part of your comment, I have to infer that these value judgements have something to do with a certain dislike of duplication. That's

                • That's a strange way of viewing efficiency. Who gets to decide what is worthwhile to do for others? You're implying here a framework of value judgements independent of individual preferences, whereas typical definitions of efficiency [wikipedia.org] only require individual preferences.

                  Your assumptions are 1) all entrants for the NetFlix prize prefer to spend their time on the NetFlix prize rather than something else (reasonable, and I agree to some extent); and 2) because these entrants prefer to spend their time doing this, it is efficient for them to do so (because you claim that efficiency is defined by preferences).

                  Your second assumption relies on a definition of efficiency as individual utility maximization, which in turn assumes that individual utility is defined by preferences. Th

          • I don't think you are talking about efficiency accurately. By your reasoning, every competitive activity, anywhere, should be done away with since the participants could have been of more value to society by doing any productive job. And yet, how much would you have to pay those people in order to do the janitorial work or burger flipping instead of whichever competitive activity they would choose to do voluntarily? That is the measure of the inefficiency of your argument.

            Any time participants in any activi

      • by radtea ( 464814 )

        Basically prize systems benefit from people's inability to accurately assess their real chances of winning - or put another way, prize systems free ride off of people's self-delusion.

        Pretty much. I had a look at the data early on, verified that by a tiny bit of cleverness I could hit the existing performance mark with far less iron than I'm sure NetFlix throws at the problem, recognized that getting improvements over that were going to take huge efforts in time and computing resources given the structure o

        • Re: (Score:3, Funny)

          by ostrich2 ( 128240 )

          Your experience was very different from mine.

          I found an obvious solution and wrote it down in the margin of a book. I even discovered a proof of this, but the margin was too narrow to contain it.

          • Go back to trolling sci.math, James :-(

      • >> I think he's pointing to one of the inefficiencies of prize systems as a way to spur innovation. Thousands of people tried, spending tens or hundreds of thousands of work-hours and other resources, and only a fraction got "winning results" (yes, according to the arbitrary way that winning was defined). But the point is that the prize probably resulted in a very inefficient use of resources. We could hypothesize that the same result might have been achieved with only 25% of the resources spent on th

      • perhaps there are incidental rewards to those resources having been used

        Right - everybody who seriously competed greatly enhanced their own personal knowledge of the field. I'd bet that most of that new working knowledge is not left to waste. There is a ripe market for prediction systems, and even the worst of the entrants can probably fulfil somebody's small need.

  • by Squiggle ( 8721 ) on Tuesday September 22, 2009 @02:37AM (#29501033)

    The big lesson for me was that big collaborations were the most successful.

    In creating solutions for hard problems, most everything fails and is horribly difficult. No big surprise there. Kinda odd that that was the quoted lesson...

    • Re: (Score:3, Interesting)

      by misnohmer ( 1636461 )

      I was just about to post the very same comment. By the contest rules, the contest ends once someone comes up with a winning solution. The fact that there were 2 solutions meeting the requirement so close together, and both resulting from collaborations, would rather suggest the collaborations worked really well. The other collaborations simply stopped once there was a winner. Concluding from this that collaborations don't work would be like concluding that the training athletes go through prior to the Oly

      • by radtea ( 464814 )

        The other collaborations simply stopped once there was a winner.

        If you look at the leaderboard you'll see that the performance of teams drops off dramatically, so that by number 19 you're already down to 9% improvement. To use your Olympic analogy, it's like 20 people running the 100 m and two of them coming in over a second behind the leader. It's remarkably difficult to find full reports of sprint times--including the losers--but given there's about a second between men's and women's times in the 100 m

        • Would you agree that the results warrant a conclusion that collaborations are more effective than individual teams? If so, your overall conclusion shouldn't be about collaborations, but simply that people, no matter how many of them (collaborating or not), are not that effective.

    • Precisely. The Wired post [wired.com] about this hits the nail on the head: -

      Arguably, the Netflix Prize's most convincing lesson is that a disparity of approaches drawn from a diverse crowd is more effective than a smaller number of more powerful techniques.

      If even Wired can pick up on this, it's kind of embarrassing that Slashdot decided to quote the one news source that got completely the wrong end of the stick.

    • This is a great way for a big corporation to get hundreds of researchers to work on a problem of economic importance to it and ONLY pay the researchers who had the best result, while the rest get nothing - if Netflix had to actually pay for all the researchers who went down blind alleys they would have spent millions more, or gasp... actually had to have their own R&D department - what a scam, and yet everyone celebrates them like it is some kind of game show... all I see are suckers....
      • Why are they suckers? You could make the same argument that most everyone working on open source software is a sucker, because they could be being paid to program. (Even at some place like RedHat, be paid to work on open source software even.)

        As others have said, apparently many people did this because they enjoy working on this type of a problem. The $1M prize is just that, a prize.

        Do most people who enter the World Series of Poker (and that costs $10,000 to enter) actually think they're going to win?

        • They are suckers because Netflix makes a profit from the work of all the researchers who worked on the problem yet only pays one group - the other groups tried different things that didn't pan out, but Netflix did not have to pay for that R&D effort. The other examples you cite, like open source, make utilities that everyone can benefit from - these open infrastructure projects are not exploitative in the same way - Netflix owns the algorithm developed by the group. Someone who pays 10K to be in a
  • Well at least.... (Score:3, Insightful)

    by russ1337 ( 938915 ) on Tuesday September 22, 2009 @02:49AM (#29501085)
    it's still good for the CV.....
  • by Anonymous Coward on Tuesday September 22, 2009 @02:49AM (#29501087)

    I agree that Ensemble "losing" because they posted 20 minutes later is a harsh result. However, those were the rules that Netflix set forth, and Ensemble, intentionally or not, was making a risky gamble by waiting until right before the deadline to submit their project. And perhaps the "tie goes to the earlier poster" rule makes some sense, because it encourages making your submission earlier than you would otherwise and not "sniping" unless you're absolutely sure your project is better than the rest. At least as far as I can understand, the rule set forth the proper tradeoff -- Ensemble got to see the score to beat (BellKor's) before it posted; however, in exchange for that, its score needed to have been better in order to win. Had Ensemble wanted the first-mover's advantage and the win in the event of a tie, it could have posted earlier than BellKor. The fact that BellKor posted only 20 minutes before the end of the competition suggests that Ensemble could have easily posted earlier without compromising its entry. That is, how much significant tinkering could have possibly been done in the last half hour of this multi-year competition?

    • Re: (Score:3, Insightful)

      I think it would qualify as harsh if the runner up had a simple algorithm, but in this case all the teams which qualified for the 10% threshold did so with complicated blends of many algorithms. There's really no way to identify whose work is more valuable and deserved most to win, from a scientific perspective.
      • by kelnos ( 564113 )
        True, but given a set of conditions including "value" or "complexity" of the submission, it'd be damn-near impossible to judge the final result. Your hypothetical example is easy: if the dirt-simple algo got a 10.2% improvement, whereas the algos with magnitudes greater complexity netted a 10.4% improvement, it'd be easy to say the simple algo is more "valuable." But what if the complexity differences are much smaller? How do you judge them when the improvement is also close? I think Netflix was smart b
    • It's funny and sad at the same time!

      And the irony is that Ensemble was able to calculate thousands of scenarios and permutations/combinations to break the 10% barrier (a significant achievement), but they failed to take into consideration a very basic scenario: that their final submission might be tied or inferior. Yes, they tested it against the quiz set, but there was no guarantee they would have gotten the same result against the test set.
    • I would have considered it harsh if BellKor had been in the lead for almost all of the multi-year contest and then suddenly lost in the final 20 minutes. But even if that happened I'm not sure that I would be clamoring for them to give out two prizes. It'd be nice of them to do that, since they got a lot of value out of all the teams. But that was true regardless of the 20 minute margin, and everyone knew there was no second million dollar prize from day 1. On the bright side, I'm sure even the second pl
  • The Objective (Score:2, Insightful)

    by maglor_83 ( 856254 )

    OK, so somebody won a prize, offered by NetFlix, to do... what exactly?

    • Re:The Objective (Score:4, Informative)

      by __aasqbs9791 ( 1402899 ) on Tuesday September 22, 2009 @03:05AM (#29501145)

      IIRC, it was to improve the prediction algorithm for ratings. Basically, if you rated these movies at these levels, then Netflix tries to predict that you will rate those movies at this many stars each, or something to that effect. I've found the old method they used seems to generally work pretty well for me, though there are times I've been surprised. Though I'm not convinced my ratings are really all that accurate anyway. I'm pretty sure if I'm in a certain mood before I see some movies I'd rate them quite a bit differently than at other times, though without some way to wipe my memories of seeing it the first time, I'm not sure how I'd actually test that.

      • Re:The Objective (Score:5, Informative)

        by martin-boundary ( 547041 ) on Tuesday September 22, 2009 @04:17AM (#29501451)

        Though I'm not convinced my ratings are really all that accurate anyway. I'm pretty sure if I'm in a certain mood before I see some movies I'd rate them quite a bit differently than at other times, though without some way to wipe my memories of seeing it the first time, I'm not sure how I'd actually test that.

        If you phrase it like that, you're somewhat missing the point. The target was to minimize an average prediction error over a large number of people, not the prediction error for a single person (eg you).

        Here's an analogy which might help: if you play the lottery and try to predict 6 numbers exactly, you'll have a vanishingly small chance of getting them right. But if you submit millions of sets of predictions, all different, your chance of getting the actual 6 right is much larger.

        Now, the Netflix contest required predicting a few million ratings, and even though any one rating might be very far off the target, the task only required making sure that a large proportion of the predictions were pretty close to their targets and the remaining ones were not too far off.

        The winners were able to make several million predictions that were, on average (in the RMSE sense used a lot in engineering), a distance of 0.85 from the real rating.

        In some instances their predictions might even be off by 4 (i.e. predicting 1 when the real rating is 5). For example, with 4 million predictions, if 1% of them are off by 4, that's 40,000 instances of being off by 4, and this has to be compensated for by several percent being off by 0 if you want to get 0.85 on average.
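        (Here's the rough arithmetic behind that claim as a short Python sketch; the 0.85 is the approximate winning RMSE, and the 4 million and 1% figures are just the illustrative numbers from above.)

            import math

            target_rmse = 0.85            # roughly the winning score
            budget = target_rmse ** 2     # allowed mean squared error: 0.7225

            # Suppose 1% of the predictions are off by 4 stars (error^2 = 16)...
            frac_bad = 0.01
            bad_contribution = frac_bad * 4.0 ** 2   # eats 0.16 of the budget

            # ...then the other 99% have to live on what's left.
            remaining = budget - bad_contribution    # 0.5625
            typical_error = math.sqrt(remaining / (1 - frac_bad))
            print(f"typical error on the other 99%: {typical_error:.2f} stars")  # ~0.75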

        • Re: (Score:3, Interesting)

          by LordKronos ( 470910 )

          Well it might not affect the average prediction as it relates to everybody else. However, from a user's perspective, the whole point of the system is to try and figure out what my taste is for movies based on how I rated those movies, match it up to other people's ratings, and try to predict what other movies I'd like. You can't statistically average out my ratings, as my ratings are the only significant factor on one side of the equation. There are no other users you can use to balance out what my tastes ar

          • It's really not as pointless as you claim, because you are not alone in the world. A company like netflix needs to serve all its customers, not just one, otherwise it wouldn't be profitable.

            Suppose your personal ratings change with your mood, and everyone else's ratings change with their mood too. The overall set of ratings remains 1-5 as before, and provided people are reasonably independent in their mood swings, then statistically the effect will be neutral. For any one individual whose mood swings up,

            • Yeah, I know that. But as my post was saying, you can't compensate for the side of the equation which only has a sample size of 1 person.

              Since I've already been accused by some AC of making an analogy, I might as well go and get one in (and I'll try to make it a bad one full of holes).

              Let's say there is a system which looks at your tax return and suggests activities that you might enjoy based on your level of income. That system works pretty well for most people. You are in a field that pays a $200k salary. Th

            • by AxelBoldt ( 1490 )
              People's daily moods do affect their movie ratings, and the winning algorithms account for that with parameters that vary by person and day. You can read about it in the winners' algorithm description [netflixprize.com].
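              (Very roughly, the idea looks something like this minimal Python sketch; this is my own illustration of a per-user, per-day bias term, not the winners' actual code, and all the names are made up.)

                  def predict(global_mean, item_bias, user_bias, user_day_bias,
                              user, item, day):
                      """Baseline-style prediction: global mean, plus a per-item
                      offset, plus a per-user offset, plus a per-(user, day) offset
                      that soaks up one-day mood effects. Missing offsets are 0."""
                      return (global_mean
                              + item_bias.get(item, 0.0)
                              + user_bias.get(user, 0.0)
                              + user_day_bias.get((user, day), 0.0))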
      • Ratings systems are inaccurate because people tend to cluster their ratings towards the extremes, for a number of reasons. (I would go into what I believe to be those reasons and the conditions under which they are triggered, but it's really late.)

        My proposed solution is to require ratings to conform to some probability distributions and fit some criteria (a rough sketch of checking the first criterion follows the list):
        1. A user's votes should be approximately normal, with some degree of deviation permitted.
        2. [Approximately] 90% of everything is crap/crud (the quantized
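        (Here's a rough Python sketch of what checking criterion 1 could look like; the 50% threshold and the whole approach are just my illustration of the idea, not a real statistical test.)

            from collections import Counter

            def looks_roughly_normal(ratings, max_extreme_share=0.5):
                """Flag users whose 1-5 star votes pile up at the two endpoints
                instead of spreading out around a middle value. Illustrative
                heuristic only."""
                if len(ratings) < 20:
                    return True                    # too few votes to judge
                counts = Counter(ratings)
                extreme_share = (counts[1] + counts[5]) / len(ratings)
                return extreme_share <= max_extreme_share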

        • Re:ratings systems (Score:4, Insightful)

          by retchdog ( 1319261 ) on Tuesday September 22, 2009 @10:26AM (#29503693) Journal

          I'm sure that every schmuck with a Netflix account would be willing to adhere to your stupid rules, and saddened by your unwillingness to pontificate on how you'd change human behavior.

          Seriously, this is what Netflix would be if it were invented by Stalin.

          • by bersl2 ( 689221 )

            I'm being compared to Stalin. This is a first. How interesting.

            My idea was to queue votes that had not yet been fit. If a user continued to have an excess of some certain rating level, the idea would be to suggest that the user manually normalize his votes and give him appropriate tools to do this (perhaps quasi-random suggestions for the casual user, an entire list for those who want it). People's minds change over time too, so this could encourage updates to old votes.

            I realize that this sort of interface

        • Re:ratings systems (Score:4, Insightful)

          by Geoff-with-a-G ( 762688 ) on Tuesday September 22, 2009 @01:19PM (#29506181)

          Your proposed solution would only make sense if people were forced to watch a completely random selection of movies. Once you factor in the fact that people are allowed to select which movies they want to watch, it makes sense that their ratings would cluster towards the high end of the spectrum. That is, in fact, the whole point of this ratings prediction system: to tell you, in advance, which movies you will like. If it worked perfectly, you'd never have to rate a movie below average, because you could avoid ever renting a movie which you wouldn't like.
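          (A toy Python simulation of that selection effect, with made-up numbers: each viewer has a noisy guess of how much they'll enjoy a film and only rents the ones that look good, and the observed ratings skew high even though underlying taste is symmetric around 3.)

              import random

              random.seed(1)
              observed = []
              for _ in range(100_000):
                  true_enjoyment = random.gauss(3.0, 1.0)              # symmetric around 3 stars
                  expectation = true_enjoyment + random.gauss(0, 0.7)  # noisy guess before renting
                  if expectation >= 3.5:                               # only rent what looks good
                      observed.append(max(1, min(5, round(true_enjoyment))))

              avg = sum(observed) / len(observed)
              print(f"watched {len(observed)} of 100000, average rating {avg:.2f}")
              # the average comes out well above 3, purely from self-selection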

      • yeah... it's the same as saying "hey, you liked Star Wars episodes 4, 5 and 6, you're going to loooooove episodes 1, 2, and the craptastic 3!"

        - I guess it's a fine incentive for people who want $1,000,000 to jump through their hoops, but did they actually help improve "things you might like"?

        - I also think they're missing a vital statistic: things you hate, stuff you loathe. They could probably have improved the rating system 100% by adding that measurement into it.

      • Re: (Score:3, Funny)

        by daybot ( 911557 ) *

        I'm in a certain mood before I see some movies I'd rate them quite a bit differently

        Absolutely. Every single film I first saw on a plane ranks very low for me.

      • This article has some interesting details on the ratings, and people's behavior of what and how they rate movies (from the target database of the netflix contest) - http://www.businessweek.com/technology/content/sep2009/tc20090921_645345.htm [businessweek.com]
    • Re:The Objective (Score:5, Informative)

      by crunchyeyeball ( 1308993 ) on Tuesday September 22, 2009 @04:59AM (#29501597)

      Basically, you were asked to predict how a number of users would rate a number of movies, based on their previous ratings of other movies.

      You were supplied with 100 million previous ratings (UserID, MovieID, Rating, DateOfRating), with the rating being a number between 1 and 5 (5=best), and asked to make predictions for a separate ("hidden") set comprising roughly 10% of the original data. You could then post a set of predictions to their website which would be automatically scored, and you'd receive an RMSE (Root Mean Squared Error) by email.

      To avoid the possibility of tuning your predictions based on the RMSE, you could only post one submission per day, and the final competition-winning results would be scored against a separate hidden set, independent of the daily scoring set.

      It really was a fantastic competition, and anyone with a little coding knowledge (or SQL knowledge) could have a decent go at it. Personally, I scored an RMSE of 0.8969, or a 5.73% improvement over Netflix's benchmark Cinematch algorithm, having learnt a huge amount based on the published papers and forum postings of others in the contest, and my own incoherent theories.

      In a way, everyone wins. Netflix gets a truly world-class prediction system based on the work of tens of thousands of researchers around the world hammering away for years at a time. Machine learning research moves a big step forward. BellKor et al get a big juicy cheque, and enthusiastic amateurs like myself get access to a huge amount of real-world research and data.
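      (For anyone curious what the scoring side of "a decent go at it" boils down to, here's a minimal Python sketch. The CSV layout follows the (UserID, MovieID, Rating, DateOfRating) fields described above; the file handling and names are my own assumptions.)

          import csv, math

          def load_ratings(path):
              """Read (UserID, MovieID, Rating, DateOfRating) rows into a dict
              keyed by (user, movie)."""
              ratings = {}
              with open(path, newline="") as f:
                  for user, movie, rating, _date in csv.reader(f):
                      ratings[(user, movie)] = float(rating)
              return ratings

          def rmse(predictions, truth):
              """Root Mean Squared Error over the (user, movie) pairs being judged."""
              se = [(predictions[k] - truth[k]) ** 2 for k in truth]
              return math.sqrt(sum(se) / len(se))

          # Cinematch's benchmark on the quiz set was about 0.9514, so the 10%
          # improvement target meant getting below roughly 0.856.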

    • Nice way to be off-topic.

      The Gray-Bell controversy, in essence, is about Bell possibly stealing Gray's invention and method. However, the issue in TFA is about someone being denied a prize based on the fact they submitted 20 minutes *after* someone else, and were only marginally better.

      The Gray-Bell discussion doesn't come close to this, because at the time, the Patent Laws stated that it's not the Patent Registration that gives someone the rights, but rather the time of invention, and the ability to
  • For those who would do this w/o interest in money because they have such a passion for this sort of thing, this result won't faze them. But for others, the sheer mortality rate of these attempted collaborations, tied in with the company's apparent lack of interest in providing something noteworthy to the other team due to a minor technicality, is going to discourage people. Imagine how the losing team could turn on each other..."If only you didn't have to take that 25 minute crap we'd be cashing in!" "If only we h
  • Out of thousands, you have only two that succeeded.

    Yes, because the Netflix rules were set up in such a way as to encourage winners to submit their results as soon as possible upon success. They're not going to wait around to give anyone else the chance to reach the same goal first. You might as well say "Only two people crossed the tape during that photo finish! The other thousand runners are failures!"

    The big lesson for me was that most of those collaborations don't work.

    By this standard, zero non-coll

  • I didn't see anything in the article about when Netflix may implement the new algorithms. I've rated a ton of stuff on Netflix and seem to have totally confused the current system, because I rarely get any recommendations and when I do they are totally off. For example, I rated a Japanese horror film highly and Netflix then suggested 3 European romantic films (one comedy and two dramas).
  • Maybe I am alone here, but the only real trend in my movie likes is that I only watch GOOD movies. I have seen nothing in any of the articles on this that accounts for that. If I enjoyed 12 Monkeys, don't be suggesting Battlefield Earth to me just because they are both SF movies. To me, a better suggestion for a fan of 12 Monkeys would be Memento.
    • by EboMike ( 236714 )

      Well, that's the whole point of this competition - find out what each user believes to be "GOOD", which is highly subjective. People who enjoy, say, Epic Movie, White Chicks, and hate movies like Insomnia are likely to dislike Memento as well.

      There is no objective "good" or "bad" for a movie - you can average the ratings given by all users, but according to IMDb, Insomnia (US version) has 7.2, and Die Hard 4 has 7.6 - which one would you prefer? (To me, Insomnia is clearly the "better" movie, but my opinion

  • The Hutter Prize [hutter1.net], itself modeled on the M-Prize [mprize.org], awards incremental prizes for progress and is a superior way of awarding prize money. There is continual reward for teams that contribute substantially, and no one team takes everything based on a technicality.
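    (As I read the Hutter Prize rules, the incremental payout amounts to something like the Python sketch below: each new record collects a share of the fund proportional to its relative improvement over the previous record. Treat the exact formula as my reading, not an official statement.)

        def incremental_award(prize_fund_eur, previous_record_bytes, new_record_bytes):
            """Pay out a slice of the fund proportional to the relative improvement
            over the standing record: shave ~1% off the record, collect roughly 1%
            of the fund."""
            if new_record_bytes >= previous_record_bytes:
                return 0.0
            improvement = (previous_record_bytes - new_record_bytes) / previous_record_bytes
            return prize_fund_eur * improvement

        # e.g. a ~13% improvement on a 50,000 EUR fund pays roughly 6,500 EUR,
        # in the same ballpark as the figure quoted in the reply below.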
    • The Hutter Prize is nonsense ... there's only been one Russian competing in it since 2006, and he's won the princely sum of about 6,700 Euros. Bit of a far cry from a million bucks.

      Also, the prize structure is flawed, in that it penalizes the very people who are achieving the best work. The first person to achieve an improvement in data compression gets a chunk of the prize money. Then someone else who comes in later and manages to compress it a bit further (arguably a harder task to make any gains), only

      • by Baldrson ( 78598 ) *
        daveime writes: And as it is patently obvious that the compression algorithms are NOT general purpose, but specifically tuned / optimized to the data set in question (a 100MB chunk of wiki data), it is probably going to be useless for any other data set.

        What would you suggest is a good English language corpus as a test of your assertion?

          • English corpus fine ... but the point is that the wiki data text set *isn't* standard English text, it's a form of specialised Wikipedia markup language (think XML) with abbreviations, wiki code numbers and dates.

          A lot of the so called "optimizations" have been achieved by identifying the structure of that *specific* markup to make extra gains in compression.

          It's like making a compressor that works specifically well on Windows PE exe file structure, and expecting it to do the same thing on a jpeg or plain

          • by Baldrson ( 78598 ) *
            daveime writes: It's like making a compressor that works specifically well on Windows PE exe file structure, and expecting it to do the same thing on a jpeg or plain text file in the English Language.

            Again: What English language corpus would you suggest to test your assertion that the Hutter Prize has, due to "specialization" to handle markup syntax, produced a compressor that does not outperform the others on English text?

          • by Baldrson ( 78598 ) *
            daveime: Notwithstanding that, I guess the thing that put me off was the whole "compression == AI" angle that Hutter tried to put on things

            Something you need to understand about the Hutter Prize is that it is not about writing a compressor -- it is about achieving the simplest representation of human knowledge. If you want to do it the way Doug Lenat has been trying with Cyc -- hiring a bunch of philosophy PhDs to manually construct an ontology that parsimoniously represents human knowledge -- then by al

            • Look I'm getting really tired of arguing with you, as you seem to have some special axe to grind on this.

              I'd suggest you look at the title of the page to start with ...

              "50'000 Prize for Compressing Human Knowledge"

              Then the opening paragraph ...

              "Being able to compress well is closely related to intelligence as explained below. While intelligence is a slippery concept, file sizes are hard numbers. Wikipedia is an extensive snapshot of Human Knowledge. If you can compress the first 100MB of Wikipedia better th

              • by Baldrson ( 78598 ) *
                The point of the Turing test is to model human intelligence. That is not the point of the Hutter Prize. The point of the Hutter Prize is to model optimal intelligence. Human intelligence is not optimal. The target of this intelligence is chosen as human knowledge as represented in Wikipedia. Optimal, or universal, intelligence is a field of pure mathematics: The goal is to mathematically define a unique model superior to any other model in any environment. From a presentation by Marcus Hutter [hutter1.net]:
                • The
                • Okay, one more shot before I sleep ...

                  The (optimal) AI model is unique in the sense that it has no parameters which could be adjusted to the actual environment in which it is used

                  Yes, this makes perfect sense, however, the algorithms thus far presented have been subject to many *human defined* parameters to optimize the compression. It's hardly unsupervised AI "learning".

                  The kind of thing described above is more akin to Kohonen Networks, which when trained on specific inputs (with no pre-specified outputs),

  • My Netflix queue contains movies chosen by me, my wife and my children, and sometimes chosen for a visiting friend. If they would only allow me to maintain separate queues or tag the content by who chose it, predicting what each of us likes would be much easier. It's the same with iTunes; the "Genius" must think I'm schizophrenic.
  • > Greg McAlpin, a software consultant and a leader of the Ensemble: "Having these
    > big collaborations may be great for innovation, but it's very, very difficult. Out of
    > thousands, you have only two that succeeded. The big lesson for me was that most of
    > those collaborations don't work."

    Tough luck on the loss. Oh, and you're an idiot.

    Saying "only two" worked is like saying "only one person actually found the car keys and all the other guys looking are a big Fail".

    See, they stop looking after it

"If it ain't broke, don't fix it." - Bert Lantz

Working...