
A Crowdsourcing Project To Make Predictions More Precise

databuff writes "Predictions are critical to modern life. Police predict where and when crimes are most likely to take place, banks predict which loan applicants are most likely to default, and hotels forecast seasonal demand to set room rates. A new project called Kaggle facilitates better predictions by providing a platform for forecasting competitions. The platform allows organizations to post their data and have it scrutinized by the world's best statisticians. It will offer a robust rating system, so it's easy to identify those with a proven track record. Organizations can choose either to follow the experts, or to follow the consensus of the crowd — which, according to New Yorker columnist James Surowiecki, is likely to be more accurate than the vast majority of individual predictions. The power of a pool of predictions was demonstrated by the Netflix Prize, a $1m data-prediction competition, which was won by a team of teams that combined 700 models. Kaggle's first competition is underway, and it is accessing the 'wisdom of crowds' to predict the winner of this May's Eurovision Song Contest." Understandably, participation requires registration.
  • First Post! (Score:1, Funny)

    by Anonymous Coward
    I predict a first post!!
  • by fusiongyro ( 55524 ) <faxfreemosquito@NOSPAM.yahoo.com> on Wednesday April 14, 2010 @04:02PM (#31849030) Homepage

    "Past prediction is not an indicator of future performance."

    While we're at it, why don't we let everyone pool together their lottery number predictions?

    • by blair1q ( 305137 )

      Logical. If we get enough, we're more likely to get it right.

    • "Past prediction is not an indicator of future performance."

      The general idea is that 'the crowd' has, collectively, access to more (inside) information than any one expert.
      This lets 'the crowd' do a better job of predicting future performance despite your inability to verify individual trustworthiness.

      As for the lottery, if the real world was that random, things like stock markets couldn't function.

      • In order for that to work, the crowd has to actually have access to better information. The problem is that we are extremely prone to seeing patterns where they don't exist. No amount of software collation can turn pattern-y non-information into information.

      • Let the "You just RickRolled me you insensitive b*tch" contest begin!
    • ...because none of us is as dumb as all of us.

    • Already been done: How to win the Lottery [wikipedia.org]
    • I totally agree, past performance does not guarantee future performance. However, the more forecasts you get statisticians to make, the less likely it is that their prediction-history reflects chance rather than skill.
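To put rough numbers on the chance-versus-skill point above, here is a minimal sketch. It assumes every forecast is a yes/no call that a forecaster with no skill gets right half the time; the function name and the 80% hit rate are made up for illustration and have nothing to do with how Kaggle actually rates people.

```python
from math import comb

def chance_of_lucky_record(n_forecasts, n_correct, p_guess=0.5):
    """Probability that a no-skill forecaster gets at least n_correct
    of n_forecasts independent yes/no calls right (binomial upper tail)."""
    return sum(
        comb(n_forecasts, k) * p_guess**k * (1 - p_guess) ** (n_forecasts - k)
        for k in range(n_correct, n_forecasts + 1)
    )

# Same 80% hit rate, increasingly long track records.
for n in (10, 50, 200):
    hits = n * 4 // 5
    print(n, hits, chance_of_lucky_record(n, hits))
```

Under these assumptions the probability of an 80% record arising from pure guessing shrinks quickly as the number of forecasts grows, which is the sense in which a longer history is harder to explain by luck.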
  • My crime predictions (Score:5, Interesting)

    by elrous0 ( 869638 ) * on Wednesday April 14, 2010 @04:05PM (#31849064)

    Behold my amazing precognitive abilities, as I look into the future of crime and predict:

    Most crime will take place in that part of town with the highest concentration of check-cashing and liquor stores, between 5 pm and 3 am. Most of the alleged defendants will not be college educated and will have prior criminal records. Very few actual crime arrests will involve white collar fraud or the elaborate, diabolically-planned crimes that make up the bulk of criminal activity shown in popular TV shows, comic books, and movies. The vast majority of accused criminals will be, in fact, guilty of the crime they are accused of. Very few criminals will be represented by a crusading public defender with the resources to conduct a thorough analysis of their case and order elaborate DNA tests to prove their innocence in a last-minute dramatic courtroom reveal.

    • by Vintermann ( 400722 ) on Wednesday April 14, 2010 @04:15PM (#31849224) Homepage

      Very few actual crime arrests will involve [...] diabolically-planned crimes

      Yes. Eurovision Song Contest entries occasionally qualify as diabolically-planned crimes in my opinion, but alas, they tend to get away with it.

    • by dgatwood ( 11270 )

      I'll go one step further and say that statistically speaking, the majority of the criminals will be poor (or in the case of drug dealers, moderately wealthy, but from poor families), and most will be members of a repressed minority in the country in question.

    • They're just legally prevented from interceding before the when. In fact, in my old home town, the police knew a certain criminal had been murdered because of the reduction in the crime rate at certain times and areas.

      Only a small proportion of real crimes and criminals are not predictable.

  • Where the bad guys get to the guy who has the best predicted performance, kidnap his loved one and make him commit a bank robbery or something.
    • Re: (Score:2, Funny)

      by SnarfQuest ( 469614 )

      How about this. Get a group of "police" that predict who will commit a crime, and arrest them beforehand. We could give it a silly name, like "minority report".

  • Police predict where and when crimes are most likely to take place,

    There's going to be, ... er, a crime. A big one. Yeah, that's it. Clear over on the other side of town. Send all the cops. Right away.

  • by pushing-robot ( 1037830 ) on Wednesday April 14, 2010 @04:15PM (#31849228)

    Crowdsourcing Project To Make Predictions More Precise

    I think they used to call them "polls".

  • Because if most people think something is true (or in this case, think something is going to happen a certain way), then it simply must be so.
  • If you take enough samples, with approximately the same error rate, you will get an accurate result if you average them together.
    Therefore I conclude that any answer can be calculated by running: answer = (answer+rand())/2; enough times
    • Averaging only reduces random error giving you a more precise result. It doesn't help with systematic errors and therefore not necessarily a more accurate result.

      Accuracy being closeness to the bullseye and precision being the grouping of the shots.
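The distinction can be sketched with a small simulation. Everything here is an assumption for illustration: the "true" length, the size of the systematic bias, and the Gaussian noise are invented numbers, not data from anywhere.

```python
import random
import statistics

def simulate_measurements(true_value, bias, noise_sd, n, seed=0):
    """Return n readings that share one systematic bias plus independent random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

true_length = 10.0  # hypothetical true value
readings = simulate_measurements(true_length, bias=0.5, noise_sd=1.0, n=10_000)

mean = statistics.fmean(readings)
# Standard error of the mean: shrinks like 1/sqrt(n), so the average gets more precise.
standard_error = statistics.stdev(readings) / len(readings) ** 0.5

print("average of readings:", mean)
print("standard error of the average:", standard_error)
print("distance from the true value:", abs(mean - true_length))
```

With many readings the standard error becomes small (tight grouping), but the average settles near the true value plus the bias rather than on the true value itself, so the result is precise without being accurate.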

      • Accuracy being closeness to the bullseye and precision being the grouping of the shots.

        A good way of expressing the difference - to most people they're synonyms.

        I put it like this: two people measure the length of a stick. One says it's ten inches, the other says it's ten point one inches. Which is more accurate? Most people will say the latter. But if the real length (we'll argue about what that means later) is exactly ten (or nine point nine) the first is more accurate. The second is merely more pre

    • Really? What if you go to Arkansas and ask how old the Earth is?

  • Can they accurately predict the global temperature 1000 years in the future, but have to estimate past values, the the Global Warming people?

    • I can confidently predict that the average temperature for June 2010 in the northern hemisphere will be higher than for April 2010. I am not nearly as confident that 15. June will be warmer than today.

      Predicting averages is easier than predicting point values. The wider the area averaged over, the easier it becomes.

      "The the global warming people" are in the business of predicting averages.

      The intrade contracts on global temperature averages are yours for the taking, if you think you know more than the exper

  • For example, what if we crowdsourced a prediction for which stocks will do well tomorrow?

    There's a sort of unavoidable feedback loop when the entity responsible for prediction is also responsible for execution, even if that entity is "everyone".
  • Apply Selectively (Score:4, Interesting)

    by delirium of disorder ( 701392 ) on Wednesday April 14, 2010 @04:32PM (#31849472) Homepage Journal

    The wisdom of crowds works when everyone is looking in the same area for the answer to a question with a somewhat fuzzy answer. The group average can often be better than any single expect that attempts to calculate it. However this is a poor approach when the crowd isn't even looking in the right place. Simple majority decision making would be disastrous for many of the big decisions organizations make. The public is massively ignorant on scientific issues and continues to be plagued by religious, corporate, and state imposed falsehoods. Freeing people from these shackles and providing full education for all could allow us to crowd source more important decisions and lead to a more efficient and just society.

    • *expert....I never claimed to be one regarding proofreading.

    • by blair1q ( 305137 )

      Crowds elected George W. Bush. Twice.

      Plural voting is not a reliable system of determining facts. It's better than asking one half-informed person, but not by much.

      • Crowds elected George W. Bush. Twice.

        Did they though?

        Oooooh. Yeah, think about that one, my friend.

      • Yeah, Bush was elected by a pretty big crowd. But bad as that was, letting a smaller crowd elect someone could easily have led to much worse results.

    • by MikeFM ( 12491 )
      I think combining one or more expert systems with the crowd can produce more interesting results. Single systems rarely accurately model a problem any better than a single person does but by polling many expert systems, as well as people, you can get a more accurate result.

      I can't really agree fully that more education makes people better decision makers. I think educated people tend to suffer from underestimating the value of their opinion and from thinking other people will be reasonable. The majority of
  • I think it was a short story in Analog or Asimov science fiction magazines. Someone got tired of the weather forecast being right only about half of the time and created a nation-wide betting pool for people to bet on the weather for the next few days for their area. The theory was that most people would bet on what they thought would actually occur instead of trying for long odds. In the story the forecasters eventually started subscribing to the pool because its predictions were accurate more often than t

    • by blair1q ( 305137 )

      The problem with the weather isn't that it's unpredictable, it's that the parameters that feed the prediction can change significantly in the time it takes you to propagate the prediction to the end users, and many features of the weather are local enough that a central weather forecast is incorrect for a significant portion of the user base.

      So no, the general public will not be a better predictor of the weather than the NWS could. And the system using wagers will be even worse, since many of those wagers w

      • by MikeFM ( 12491 )
        You could improve the system by letting people bet on their locality rather than a large area. Even with crude data there is a lot of correlation with previous patterns people have watched over time and learned to recognize without being able to clearly define. Would be interesting to tweak the system to recognize people that are more accurate and weight their predictions higher.
  • Do polls work so well because the people voting in the earlier polls influence the later polls?

    If the predictions were shared in real-time with the people they were to predict upon, would they still have the same accuracy?

    It seems to me that predicting is only useful when its use is unknown to those it's used on.

    • Do polls work so well because the people voting in the earlier polls influence the later polls?

      If the predictions were shared in real-time with the people they were to predict upon, would they still have the same accuracy?

      It seems to me that predicting is only useful when its use is unknown to those it's used on.

      I think the answers to those are:

      Not only, but it does have an effect.

      No, probably not.

      You're right that things trying to predict their own results face fundamentally insurmountable obstacles. It has discouraged computer scientists, and I think it will eventually discourage economists and brain researchers. But polls can still be useful (as can computers, economists, and brains)

  • This model seems to follow 'genetic programming' principles, but is flawed in many ways:

    (a) It assumes that most people know everything relevant to the problem under consideration - they often don't.

    (b) What the model is looking for is an expert among the crowd. On average, you can find an expert among 1024 people, to predict 10 coin tosses - this is with random data having no relation to specialized wisdom.

    (c) Eurovision (mentioned above) is in the rare category of scenarios that can make use of 'crowdsou
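The arithmetic behind point (b) can be checked directly. This is a minimal sketch assuming fair coins and independent guessers; the function names are made up for illustration.

```python
def expected_perfect_guessers(crowd_size, n_tosses):
    """Expected number of people who call every toss correctly by pure chance."""
    return crowd_size * 0.5 ** n_tosses

def prob_at_least_one_perfect(crowd_size, n_tosses):
    """Probability that at least one pure guesser ends up with a perfect record."""
    p_single = 0.5 ** n_tosses
    return 1 - (1 - p_single) ** crowd_size

print(expected_perfect_guessers(1024, 10))
print(prob_at_least_one_perfect(1024, 10))
```

Since 2^10 = 1024, the expected number of perfect records in a crowd of 1024 is exactly one, even though nobody in the crowd has any skill.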

    • Great comments, thanks! To address your most incisive comments (as I see them) c) Competitions on Kaggle aren't polls. Competitions are framed in a way that requires serious data analysis. For example the Eurovision Forecasting Comp requires contestants to forecast the voting matrix (who votes for who) rather than a simple who will win. b,d,e) getting people to do lots of predictions should separate the talented from the lucky. Having forecasters predict in the same place over and over is a good way
  • I don't know why I find these kinds of opening phrases amusing/annoying, but saying "Predictions are critical to modern life" seems to imply that somehow they are more important than ever. Aren't most major religions based on "predictions" of some kind and didn't they begin a wee bit before "modern life"?

    Crowd-sourcing predictions will undermine all sorts of religions, and we all know what happens when you threaten the monopoly on truth held by religion...

    • I do believe we rely on predictions more today than at any time in history because we can make them more reliably (we have so much historical data to base them on).
  • Thanks everyone for your comments! Sounds like many of you are skeptical that 'wisdom of crowds' can work in this setting. It'll be an interesting experiment, but I'm encouraged by the Netflix Prize case study. Out of interest, does anybody have any interesting ideas for prediction competitions? I'd love to hear from you either in the comments area or at statsbuff@gmail.com.
  • This is REALLY old news. 66 years ago, it was known as the DELPHI method, and it's been studied to death in the interim.

    • Thanks for the post. I hadn't heard of the DELPHI method - so now I'm a little bit wiser. According to the Wikipedia article, the DELPHI method tries to get a panel of experts to agree on a single forecast. Kaggle (assuming the wisdom of crowds is the method of choice), cherishes diversity. It takes everybody's forecasts and 'combines' them in the hope that individual forecast errors will cancel out.
    • by Teunis ( 678244 )
      The novel "Shockwave Rider" (John Brunner, 1975) proposes a computer-based model very similar to this one doing "crowd sourced predictions, with prizes". He even gave proper attribution, calling it "Delphi".

      so yeah, nothing new here - not even method.
  • As someone who knows a little about the Netflix Prize [slashdot.org], metalearning is not crowdsourcing. It's not a prediction market. It's none of these things. It's essentially taking a weighted average to match some prior data and then using that for new predictions. It's machine learning. It's not magic. If you wanted to draw an analogy in the real world, it would be like asking, who do you believe more when predicting changes to the climate? Some know-nothing wingnut who never went to college, but listens to c
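The "weighted average fitted to prior data" idea can be sketched in a few lines. This is only an illustration under stated assumptions: two models, a single least-squares weight, and invented numbers. It is not the method any Netflix Prize team used, and the function names are hypothetical.

```python
def fit_blend_weight(pred_a, pred_b, actual):
    """Least-squares weight w for the blend w*a + (1 - w)*b on held-out data."""
    num = sum((y - b) * (a - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        return 0.5  # the models agree everywhere, so any weight gives the same blend
    return num / den

def blend(pred_a, pred_b, w):
    """Combine two prediction lists using the fitted weight."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

def mse(pred, actual):
    """Mean squared error of a prediction list."""
    return sum((p - y) ** 2 for p, y in zip(pred, actual)) / len(actual)

# Made-up prior data: model A runs high, model B runs low.
actual = [3.0, 4.0, 2.0, 5.0]
model_a = [3.5, 4.5, 2.5, 5.5]
model_b = [2.5, 3.5, 1.5, 4.5]

w = fit_blend_weight(model_a, model_b, actual)
blended = blend(model_a, model_b, w)
print(w, mse(model_a, actual), mse(model_b, actual), mse(blended, actual))
```

The weight is fitted once against known outcomes and then reused on new predictions from the same models; nothing about it involves polling a crowd of people.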
