A Crowdsourcing Project To Make Predictions More Precise 69
databuff writes "Predictions are critical to modern life. Police predict where and when crimes are most likely to take place, banks predict which loan applicants are most likely to default, and hotels forecast seasonal demand to set room rates. A new project called Kaggle facilitates better predictions by providing a platform for forecasting competitions. The platform allows organizations to post their data and have it scrutinized by the world's best statisticians. It will offer a robust rating system, so it's easy to identify those with a proven track record. Organizations can choose either to follow the experts, or to follow the consensus of the crowd — which, according to New Yorker columnist James Surowiecki, is likely to be more accurate than the vast majority of individual predictions. The power of a pool of predictions was demonstrated by the Netflix Prize, a $1m data-prediction competition, which was won by a team of teams that combined 700 models. Kaggle's first competition is underway, and it is accessing the 'wisdom of crowds' to predict the winner of this May's Eurovision Song Contest." Understandably, participation requires registration.
First Post! (Score:1, Funny)
did we forget something? (Score:4, Funny)
"Past prediction is not an indicator of future performance."
While we're at it, why don't we let everyone pool together their lottery number predictions?
Re: (Score:2)
Logical. If we get enough, we're more likely to get it right.
Re: (Score:1)
Don't bet on number 24.018734 - it never comes up.
Re: (Score:2)
"Past prediction is not an indicator of future performance."
The general idea is that 'the crowd' has, collectively, access to more (inside) information than any one expert.
This lets 'the crowd' do a better job of predicting future performance despite your inability to verify individual trustworthiness.
As for the lottery, if the real world was that random, things like stock markets couldn't function.
Re: (Score:2)
In order for that to work, the crowd has to actually have access to better information. The problem is that we are extremely prone to seeing patterns where they don't exist. No amount of software collation can turn pattern-y non-information into information.
Re: (Score:2)
Crowd Sourced Predictions... (Score:2)
...because none of us is as dumb as all of us.
Re: (Score:1)
Re: (Score:1)
My crime predictions (Score:5, Interesting)
Behold my amazing precognitive abilities, as I look into the future of crime and predict:
Most crime will take place in that part of town with the highest concentration of check-cashing and liquor stores, between 5 pm and 3 am. Most of the alleged defendants will not be college educated and will have prior criminal records. Very few actual crime arrests will involve white collar fraud or the elaborate, diabolically-planned crimes that make up the bulk of criminal activity shown in popular TV shows, comic books, and movies. The vast majority of accused criminals will be, in fact, guilty of the crime they are accused of. Very few criminals will be represented by a crusading public defender with the resources to conduct a thorough analysis of their case and order elaborate DNA tests to prove their innocence in a last-minute dramatic courtroom reveal.
Re:My crime predictions (Score:4, Insightful)
Yes. Eurovision Song Contest entries occasionally qualify as diabolically-planned crimes in my opinion, but alas, they tend to get away with it.
Re: (Score:2)
I'll go one step further and say that statistically speaking, the majority of the criminals will be poor (or in the case of drug dealers, moderately wealthy, but from poor families), and most will be members of a repressed minority in the country in question.
The police already know who and where (Score:3, Informative)
They're just legally prevented from interceding before the when. In fact, in my old home town, the police knew a certain criminal had been murdered because of the reduction in the crime rate at certain times and areas.
Only a small proportion of real crimes and criminals are not predictable.
I can see a movie out of this project (Score:2)
Re: (Score:2, Funny)
How about this. Get a group of "police" that predict who will commit a crime, and arrest them beforehand. We could give it a silly name, like "minority report".
Yeah, that's the ticket (Score:2)
Police predict where and when crimes are most likely to take place,
There's going to be, ... er, a crime. A big one. Yeah, that's it. Clear over on the other side of town. Send all the cops. Right away.
Crowdsourcing predictions (Score:5, Funny)
Crowdsourcing Project To Make Predictions More Precise
I think they used to call them "polls".
Re:Crowdsourcing predictions (Score:4, Funny)
Re: (Score:3, Funny)
It puts the zeitgeist in the machine.
Re: (Score:2)
Looks to me like someone just misread.
Re: (Score:1)
Reminds me of Wikiality (Score:2)
Re: (Score:2)
it's just a rehash of PREDICTION MARKETS which is OLD news.
I absolutely agree. And prediction markets do it better than this approach. For Kaggle, I'd issue a prediction by an initial deadline, time would pass, and my prediction would then be judged and prizes awarded. What happens if I change my mind after the start? Tough luck. And the reward is significantly devalued by being put off.
Further, there are various ways to game this system that are much easier and lower cost than corresponding gaming of prediction markets. For example, I could create several accou
Re: (Score:1)
Re: (Score:2)
Kaggle, unlike prediction markets, is designed to deal with complex tasks where data modeling is required. For example, a prediction market can be used to get the crowd's view on who will win the Eurovision Song Contest. But Kaggle is asking contestants to forecast the voting matrix.
You can get similar coverage with a combinatorial market where the securities are themselves complex objects. Robin Hanson [gmu.edu] has a nice example (on paper) for how you could implement a prediction betting market. Among other things, Hanson's approach allows for conditional statements (if Greece gets the 2032 Summer Olympics, then Ethiopia gets the 2036 Summer Olympics). These things tend to have liquidity problems (the more esoteric the prediction, the less likely you are to find anyone to trade with) and Hans
Mod Statistics (Score:1)
Therefore I conclude that any answer can be calculated by running: answer = (answer+rand())/2; enough times
Mod Statistics (Score:2)
Averaging only reduces random error, giving you a more precise result. It doesn't help with systematic errors, so it doesn't necessarily give you a more accurate result.
Accuracy being closeness to the bullseye and precision being the grouping of the shots.
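A rough sketch of that distinction, for anyone who wants to see it in numbers. Everything here is made up for illustration: the true value of 10.0, the 0.5 bias, the noise level and the function name `simulate` are all assumptions, not anything from the article or from Kaggle.

```python
import random
import statistics

def simulate(true_value=10.0, bias=0.5, noise=1.0, n=10_000, seed=0):
    """Average n noisy guesses, with and without a shared systematic bias."""
    rng = random.Random(seed)
    # Guesses that are wrong only by random (zero-mean) error.
    unbiased = [true_value + rng.gauss(0, noise) for _ in range(n)]
    # The same guesses, but everyone shares the same systematic offset.
    biased = [g + bias for g in unbiased]
    return statistics.fmean(unbiased), statistics.fmean(biased)

mean_unbiased, mean_biased = simulate()
print(f"average of unbiased guesses: {mean_unbiased:.3f}")
print(f"average of biased guesses:   {mean_biased:.3f}")
```

With enough guesses the first average lands close to the true value, while the second settles just as tightly on a value that is off by roughly the bias. Both are precise; only one is accurate.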
Re: (Score:1)
A good way of expressing the difference - to most people they're synonyms.
I put it like this: two people measure the length of a stick. One says it's ten inches, the other says it's ten point one inches. Which is more accurate? Most people will say the latter. But if the real length (we'll argue about what that means later) is exactly ten (or nine point nine) the first is more accurate. The second is merely more pre
Re: (Score:1)
Really? What if you go to Arkansas and ask how old the Earth is?
Re: (Score:2)
Hint: I'm also 29 years old.
Weather predictions (Score:1)
Can they accurately predict the global temperature 1000 years in the future, but have to estimate past values, the the Global Warming people?
Re: (Score:2)
I can confidently predict that the average temperature for June 2010 in the northern hemisphere will be higher than for April 2010. I am not nearly as confident that 15. June will be warmer than today.
Predicting averages is easier than predicting point values. The wider the area averaged over, the easier it becomes.
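A minimal sketch of why that holds, under a deliberately simple assumption: each day's temperature is a known seasonal normal plus independent random noise. Real weather is correlated from day to day, so the real-world gain from averaging is smaller than this toy shows. The names `forecast_errors`, `days` and `noise` are made up for the example.

```python
import random
import statistics

def forecast_errors(days=30, noise=5.0, trials=2_000, seed=1):
    """Compare the typical error for one day against the error for a 30-day average.

    The forecast in both cases is simply "the seasonal normal", so the error
    is just the deviation from that normal.
    """
    rng = random.Random(seed)
    point_err, avg_err = [], []
    for _ in range(trials):
        deviations = [rng.gauss(0, noise) for _ in range(days)]
        point_err.append(abs(deviations[0]))                # one particular day
        avg_err.append(abs(statistics.fmean(deviations)))   # the whole period
    return statistics.fmean(point_err), statistics.fmean(avg_err)

point_error, average_error = forecast_errors()
print(f"typical error for a single day:   {point_error:.2f}")
print(f"typical error for the 30-day mean: {average_error:.2f}")
```

Under these assumptions the error of the average shrinks roughly with the square root of the number of days averaged over.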
"The the global warming people" are in the business of predicting averages.
The intrade contracts on global temperature averages are yours for the taking, if you think you know more than the exper
Sounds like a giant paradox (Score:1)
There's a sort of unavoidable feedback loop when the entity responsible for prediction is also responsible for execution, even if that entity is "everyone".
Apply Selectively (Score:4, Interesting)
The wisdom of crowds works when everyone is looking in the same area for the answer to a question with a somewhat fuzzy answer. The group average can often be better than any single expect that attempts to calculate it. However this is a poor approach when the crowd isn't even looking in the right place. Simple majority decision making would be disastrous for many of the big decisions organizations make. The public is massively ignorant on scientific issues and continues to be plagued by religious, corporate, and state-imposed falsehoods. Freeing people from these shackles and providing full education for all could allow us to crowd source more important decisions and lead to a more efficient and just society.
Re: (Score:2)
*expert....I never claimed to be one regarding proofreading.
Re: (Score:2)
Re: (Score:2)
Crowds elected George W. Bush. Twice.
Plural voting is not a reliable system of determining facts. It's better than asking one half-informed person, but not by much.
Re: (Score:1)
Crowds elected George W. Bush. Twice.
Did they though?
Oooooh. Yeah, think about that one, my friend.
Re: (Score:2)
Yeah, Bush was elected by a pretty big crowd. But bad as that was, letting a smaller crowd elect someone could easily have led to much worse results.
Re: (Score:2)
I can't really agree fully that more education makes people better decision makers. I think educated people tend to suffer from underestimating the value of their opinion and from thinking other people will be reasonable. The majority of
I read this in Sci-Fi many years ago (Score:2)
I think it was a short story in Analog or Asimov science fiction magazines. Someone got tired of the weather forecast being right only about half of the time and created a nation-wide betting pool for people to bet on the weather for the next few days for their area. The theory was that most people would bet on what they thought would actually occur instead of trying for long odds. In the story the forecasters eventually started subscribing to the pool because its predictions were accurate more often than t
Re: (Score:2)
The problem with the weather isn't that it's unpredictable, it's that the parameters that feed the prediction can change significantly in the time it takes you to propagate the prediction to the end users, and many features of the weather are local enough that a central weather forecast is incorrect for a significant portion of the user base.
So no, the general public will not be a better predictor of the weather than the NWS. And the system using wagers will be even worse, since many of those wagers w
Re: (Score:2)
Isaac Asimov and Hari Seldon's Psychohistory? (Score:1)
Do polls work so well because the people voting in the earlier polls influence the later polls?
If the predictions were shared in real-time with the people they were to predict upon, would they still have the same accuracy?
It seems to me that predicting is only useful when its use is unknown to those it's used on.
Re: (Score:2)
Do polls work so well because the people voting in the earlier polls influence the later polls?
If the predictions were shared in real-time with the people they were to predict upon, would they still have the same accuracy?
It seems to me that predicting is only useful when its use is unknown to those it's used on.
I think the answers to those are:
Not only, but it does have an effect.
No, probably not.
You're right that things trying to predict their own results face fundamentally insurmountable obstacles. It has discouraged computer scientists, and I think it will eventually discourage economists and brain researchers. But polls can still be useful (as can computers, economists, and brains)
Quality not quantity (Score:1)
This model seems to follow 'genetic programming' principles, but is flawed in many ways:
(a) It assumes that most people know everything relevant to the problem under consideration - they often don't.
(b) What the model is looking for is an expert among the crowd. On average, you can find an expert among 1024 people, to predict 10 coin tosses - this is with random data having no relation to specialized wisdom.
(c) Eurovision (mentioned above) is in the rare category of scenarios that can make use of 'crowdsou
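The arithmetic behind point (b) can be checked directly. This is only a sketch assuming fair coins and independent guessers; the function name `lucky_guessers` is made up for the example.

```python
from fractions import Fraction

def lucky_guessers(people=1024, tosses=10):
    """Chance-only 'experts': people who call every toss right by pure luck."""
    p_perfect = Fraction(1, 2) ** tosses             # one person gets all tosses right
    expected = people * p_perfect                    # expected number of perfect records
    p_at_least_one = 1 - (1 - p_perfect) ** people   # at least one perfect record exists
    return expected, float(p_at_least_one)

expected_perfect, chance_any = lucky_guessers()
print(f"expected perfect predictors: {expected_perfect}")
print(f"chance at least one exists:  {chance_any:.3f}")
```

So on average exactly one of the 1024 has a perfect record, and there is roughly a 63% chance that at least one does, with no skill involved at all.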
Re: (Score:1)
Modern life? (Score:1)
I don't know why, but I find these kinds of opening phrases amusing/annoying. Saying "Predictions are critical to modern life" seems to imply that somehow they are more important than ever. Aren't most major religions based on "predictions" of some kind and didn't they begin a wee bit before "modern life"?
Crowd-sourcing predictions will undermine all sorts of religions, and we all know what happens when you threaten the monopoly on truth held by religion...
Re: (Score:1)
Does anybody have prediction-competition ideas? (Score:1)
The DELPHI method, circa 1944 (Score:2)
This is REALLY old news. 66 years ago, it was known as the DELPHI method, and it's been studied to death in the interim.
Re: (Score:1)
Re: (Score:1)
so yeah, nothing new here - not even the method.
As Someone Who Knows Something (Score:1)
As someone who knows a little about the Netflix Prize [slashdot.org], metalearning is not crowdsourcing. It's not a prediction market. It's none of these things. It's essentially taking a weighted average to match some prior data and then using that for new predictions. It's machine learning. It's not magic. If you wanted to draw an analogy in the real world, it would be like asking, who do you believe more when predicting changes to the climate? Some know-nothing wingnut who never went to college, but listens to c