The 2009 GitHub Contest is winding to a close – less than 48 hours until the deadline to get your submissions in. If you haven’t pushed the source code for your entry yet (which is a lot of you), please remember to do so soon. You know what’s on the line:

I would also like to write up an overview of the entries. If you would like to be featured, please add to your README a description of what you’ve done and any further reading on your approaches, then send me an email (scott at github) – even if you didn’t get very high in the rankings. Unfortunately, the dataset I put out there wasn’t perfect – most people recovered a good percentage of my removed results by adding the parents of forked repositories, which by itself gave people a big boost. However, I’m not interested in writing up that type of GitHub-specific stuff – I want to know the rest of the algorithm – the parts that would be useful to any dataset or website trying to do something similar. Please let me know what you’ve done, what you tried, and what worked well – I would like to share it with everyone and point to your code.
A great example of a project that is both fantastically open and describes most of what they are doing this way is Jeremy Barnes entry – it is really an amazing writeup on one of the best performing entries out there.
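For readers who haven't seen it, the fork-parent trick mentioned above can be sketched roughly as follows. This is only an illustration, not any contestant's actual code: the names `watching` and `parent_of`, the data layout, and the `limit` of 10 guesses per user are all assumptions made for the example.

```python
def recommend_fork_parents(watching, parent_of, limit=10):
    """Suggest, for each user, the parents of forks they already watch.

    watching:  dict mapping user id -> set of watched repo ids (assumed layout)
    parent_of: dict mapping a forked repo id -> the repo id it was forked from
    limit:     maximum number of suggestions per user (assumed value)
    """
    recommendations = {}
    for user, repos in watching.items():
        candidates = []
        # Sort so the output is deterministic regardless of set ordering.
        for repo in sorted(repos):
            parent = parent_of.get(repo)
            # Skip non-forks, parents already watched, and duplicates.
            if parent is None or parent in repos or parent in candidates:
                continue
            candidates.append(parent)
        recommendations[user] = candidates[:limit]
    return recommendations
```

Anything smarter (ranking candidates, falling back to popular repositories when a user watches no forks) would sit on top of this, and that is the part that generalizes beyond GitHub.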
Good luck in the home stretch, everybody!



So, what time exactly will it end? Noon PST would be 20:00 UTC, but it looks to me like you might be thinking of PDT, which would be 19:00 UTC. Can you please clarify this?
Also, do I really have to release my source code before the deadline? Although I am not really in the top positions, I would be afraid that somebody could just steal my stuff if I were one of the better contestants.
That would be Jeremy Barnes' entry, since it belongs to him. >.>
All of us over 40% are almost certainly using GitHub-specific heuristics. Like Jeremy, I think it is something to be celebrated that people tried to understand the data rather than blindly applying algorithms.
Even simple heuristics can be incredibly effective. A lot of the later progress in the Netflix Prize was due to psychological heuristics being explored. The first to be talked about much was the anchoring effect - a tendency people have to not stray too far from the rating they gave the previous movie they watched.
What's needed is a sane way to blend inputs from heuristic and more generally applicable algorithms; there's still a lot of progress to be made in that area. Maybe that's a good topic for the next competition? :)
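To make the blending idea concrete, here is a minimal sketch of the simplest thing that could work: a weighted sum of per-source scores. It is an assumption-laden illustration, not what any entry actually does; the function name `blend_scores`, the source names, and the weights are all made up, and it assumes each source's scores are already on a comparable scale.

```python
def blend_scores(sources, weights):
    """Rank candidates by a weighted sum of scores from several sources.

    sources: dict mapping source name -> {candidate: score} (hypothetical layout)
    weights: dict mapping source name -> weight; missing sources count as 0
    """
    blended = {}
    for name, scores in sources.items():
        weight = weights.get(name, 0.0)
        for candidate, score in scores.items():
            blended[candidate] = blended.get(candidate, 0.0) + weight * score
    # Highest blended score first; ties broken by candidate name.
    return sorted(blended, key=lambda c: (-blended[c], c))


sources = {
    "heuristic": {"a": 1.0, "b": 0.0},            # e.g. a fork-parent rule
    "general": {"a": 0.2, "b": 0.9, "c": 0.5},    # e.g. a nearest-neighbour model
}
ranking = blend_scores(sources, {"heuristic": 0.5, "general": 0.5})
```

The hard part is choosing the weights sanely, for instance by tuning them on held-out data, which this sketch leaves open.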