Google Engineering Explains Microformat Support in Searches
by James Turner | comments: 3
You may also download this file. Running time: 18:24
Subscribe to this podcast series via iTunes. Or, visit the O'Reilly Media area at iTunes to find other podcasts from O'Reilly.
Today, Google is releasing support for parsing and display of microformat data in their search results. While the initial launch will be limited to a specific set of partners (including LinkedIn, Yelp and CNet reviews), the intent is that very quickly, anyone who marks their pages up with the appropriate microformat data will be able to make their information understandable by Google. This technology would allow you to explicitly search, for example, for only printers that had an average customer review of 3 stars or higher. Initial support will include things such as:
- Review Ratings
- Product Prices
- Personal Details
We talked this morning with Othar Hansson and RV Guha, two of the Google engineers responsible for the new functionality, and you can listen to them discuss it in this exclusive O'Reilly interview.
JAMES TURNER: Why don't you guys start by introducing yourselves?
OTHAR HANSSON: Sure. I'm Othar Hansson, and I'm a tech lead on this project. I'm in Google's Search UI Group.
RV GUHA: My name is Guha. I'm an engineer at Google and I do stuff across the board.
JT: So can you describe briefly, to start off, exactly what it is you're releasing today?
RVG: Okay. We are asking webmasters who have pieces of data like reviews or people profiles, and in an experimental form, things like information about organizations and products, to put the structured data representing the content on the webpage in a machine-understandable form on the webpage. Typically, what happens is that if you take a website, and having created Epinions, I can talk about the context of Epinions. You would typically have a database in the back-end which has lots of information about products. People write reviews about them. And you get information such as the number of reviews, the average rating of the reviews, the price of the product, who sells it, et cetera, et cetera, et cetera. It's stored in a structured database in your back-end. You then use some scripts to format it into HTML as per the site's design. Now going from the structured data to the HTML is quite straightforward. But going from the HTML back to the structured data in a fashion which works across sites is very, very, very hard. Now our search engine doesn't -- it's very difficult for a search engine to understand -- to sort of get back the structured data for all of the sites. Now if it were to understand that, if it were to understand that this is a review site where the product being reviewed is such and such and it has 30 reviews with an average rating of 3.2 and so on and so forth, we could do a better job of the search. In particular, we could do a better job of presenting the two or three lines of text that appear as part of the search result so that the user has a better idea of what to expect on that page. And from our experiments, it seemed that giving the user a better idea of what to expect on the page increases the click-through rate on the search results. So if the webmasters do this, it's really good for them. They get more traffic. It's good for users because they have a better idea of what to expect on the page. And, overall, it's good for the web.
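The asymmetry Guha describes can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the record, its field names, and the two site layouts are invented for illustration and are not taken from any real site or from Google's pipeline. Rendering the record to HTML is a one-line template per site, while an extractor written against one layout recovers nothing from the other.

```python
import re
from html import escape

# Hypothetical back-end record; the field names are illustrative only.
product = {"name": "Acme Laser Printer", "review_count": 30, "average_rating": 3.2}


def render_site_a(p):
    """One site's template: structured data -> HTML is straightforward."""
    return (f"<h1>{escape(p['name'])}</h1>"
            f"<p>{p['review_count']} reviews, average rating {p['average_rating']}</p>")


def render_site_b(p):
    """A second site presents the same facts with different wording and markup."""
    return (f"<div class='title'>{escape(p['name'])}</div>"
            f"<span>Rated {p['average_rating']} stars by {p['review_count']} customers</span>")


# A scraper tuned to site A's wording.
SITE_A_PATTERN = re.compile(r"(\d+) reviews, average rating ([\d.]+)")


def extract_for_site_a(html):
    """Recover the structured fields, or return None if the layout is unfamiliar."""
    match = SITE_A_PATTERN.search(html)
    if match is None:
        return None
    return {"review_count": int(match.group(1)),
            "average_rating": float(match.group(2))}


print(extract_for_site_a(render_site_a(product)))  # fields recovered from site A
print(extract_for_site_a(render_site_b(product)))  # None: same data, different layout
```

Explicit markup sidesteps this by having each site label its own fields, so the reader no longer has to guess at every layout.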
JT: So in some ways, that's in the same way that right now for certain sites, you'll give the internal structure of the site as part of the search result or for shopping results, you'll give price ranges and things like this. This is just, again, enriching and providing more structured -- more than just a snippet, giving more of a structured display of the information on that page?
RVG: Yes. If we have structured data, we can do lots of things. We're starting off by improving the snippets. It's an absolute no-brainer. It seems to be helping everybody. And, as you know us, we keep playing around with different ideas and different things. As structured data becomes more prevalent, there's a ton of ideas, both inside Google and outside Google, on how you might improve search.
tags: google, interviews, microformats, search, seo
Google Announces Support for Microformats and RDFa
by Timothy M. O'Brien | comments: 8
On Tuesday, Google introduced a feature called Rich Snippets, which provides users with a convenient summary of a search result at a glance. They have been experimenting with microformats and RDFa, and are officially introducing the feature and allowing more sites to participate. While the Google announcement makes it clear that this technology is being phased in over time, with no guarantee that your site's RDFa or microformats will be parsed, Google has given us a glimpse of the future of indexing. Read this article to find out about the underlying technology and how you can prepare your own content to work with this emerging technology.
What is RDFa?
While Google's announcement today focuses on microformats, they will soon release support for RDFa. From the W3C RDFa in XHTML Specification:
The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported into a user's desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing.
Let's take a quick look at a review from Amazon, and see how it would be marked up with RDFa to provide more information for Rich Snippets. First, here's a review from the Amazon site:

Next, let's take a look at a (very simplified) example of markup that might be used to generate this review:
<div>
  <div>79 of 98 people found the following review helpful:</div>
  <div>
    <span>5.0 out of 5 stars</span>
    <span><b>American Biographer: Jon Meacham</b></span>
  </div>
  <div>
    <a href="https://www.amazon.com/gp/pdp/profile/A2G8PQ9HNUY6NA/"><span>Marian the Librarian</span></a> (NY, NY) -
  </div>
  <div>
    <b>This review is from:
      <a href="https://www.amazon.com/American-Lion-Andrew-Jackson-White/dp/1400063256/">American Lion: Andrew Jackson in the White House (Hardcover)</a></b>
  </div>
  <div class="review">
    American Lion is a wonderfully crafted biography about an incredibly interesting and oft-overlooked American who helped shaped this country...
  </div>
</div>
Next, let's add the RDFa markup to this review that would allow Google to integrate this review into Google's Rich Snippets. To mark up this XHTML with RDFa, you use the https://data-vocabulary.org namespace and a set of attributes. To see a list of attributes that work with Google's indexing technology, see this RDF for data-vocabulary.org:
<div xmlns:v="https://rdf.data-vocabulary.org" typeof="v:review">
  <div>79 of 98 people found the following review helpful:</div>
  <div>
    <span property="v:rating">5.0 out of 5 stars</span>
    <span><b>American Biographer: Jon Meacham</b></span>
  </div>
  <div>
    <a href="https://www.amazon.com/gp/pdp/profile/A2G8PQ9HNUY6NA/"><span property="v:reviewer" about="https://www.amazon.com/gp/pdp/profile/A2G8PQ9HNUY6NA/">Marian the Librarian</span></a> (NY, NY) -
    <span property="v:dtreviewed">1st April 2009</span>
  </div>
  <div>
    <b>This review is from:
      <a property="v:itemreviewed" about="https://www.amazon.com/American-Lion-Andrew-Jackson-White/dp/1400063256/" href="https://www.amazon.com/American-Lion-Andrew-Jackson-White/dp/1400063256/">American Lion: Andrew Jackson in the White House (Hardcover)</a></b>
  </div>
  <div class="review" property="v:description">
    American Lion is a wonderfully crafted biography about an incredibly interesting and oft-overlooked American who helped shaped this country...
  </div>
</div>
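To give a feel for what a consumer of this markup gains, here is a minimal Python sketch that collects the text of every element carrying a `property` attribute. It is an assumption-laden illustration, not a conforming RDFa processor and not how Google parses pages: it ignores `about`, `typeof`, and namespace resolution, and it assumes every start tag has a matching end tag. The `PropertyCollector` class, the `extract_properties` helper, and the trimmed `SAMPLE` markup are all hypothetical names introduced here.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect the text of elements that carry a `property` attribute.

    Simplification: assumes well-formed markup with no unclosed void
    elements, and ignores `about`, `typeof`, and namespace prefixes.
    """

    def __init__(self):
        super().__init__()
        self.values = {}   # property name -> accumulated text
        self._open = []    # one entry per open element: its property name or None

    def handle_starttag(self, tag, attrs):
        self._open.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Credit the text to the innermost enclosing element with a property.
        for name in reversed(self._open):
            if name is not None:
                self.values[name] = self.values.get(name, "") + data
                break


def extract_properties(markup):
    """Return {property: whitespace-normalized text} for the given markup."""
    collector = PropertyCollector()
    collector.feed(markup)
    collector.close()
    return {name: " ".join(text.split()) for name, text in collector.values.items()}


# A trimmed version of the review markup, for illustration only.
SAMPLE = """
<div xmlns:v="https://rdf.data-vocabulary.org" typeof="v:review">
  <span property="v:rating">5.0 out of 5 stars</span>
  <span property="v:reviewer">Marian the Librarian</span>
  <span property="v:dtreviewed">1st April 2009</span>
  <div property="v:description">American Lion is a wonderfully crafted biography...</div>
</div>
"""

for name, text in extract_properties(SAMPLE).items():
    print(f"{name}: {text}")
```

The same few lines work on any site that labels its fields with these property names, which is the point of agreeing on a shared vocabulary.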
This initial release covers people and reviews, but Google will be slowly rolling out support for other RDFa vocabularies and microformats as they become available. For more information, see "Marking up content with RDFa" on the Google Webmaster/Site Owners Help site.
Analysis
While the Semantic Web has been around for years, it has yet to live up to the audacious promises that heralded its introduction to the world. What is the Semantic Web? Here's the definition from Wikipedia in case you need a refresher:
Humans are capable of using the Web to carry out tasks such as finding the Finnish word for "monkey", reserving a library book, and searching for a low price for a DVD. However, a computer cannot accomplish the same tasks without human direction because web pages are designed to be read by people, not machines. The semantic web is a vision of information that is understandable by computers, so that they can perform more of the tedious work involved in finding, sharing, and combining information on the web.
In short, the Semantic Web is about more "meaningful" content. We've perfected the art of scanning text and creating massive distributed indexes that produce highly relevant search results, but when you type in "Swine Flu" you are really still dealing with an inefficient indexing approach that doesn't know about the meaning of the text being parsed and indexed. Moving toward the Semantic Web will allow our searching technologies to become more intelligent and will set the stage for the next revolution in which computing systems can become more aware of the "meaningfulness of data".
We've already seen a shift toward "semantic search": Google has been augmenting search results with Google Maps and limited catalog searches, and more recent entries into the search market, such as Amazon's A9 and the yet-to-be-released Wolfram Alpha, differentiate themselves by the structured data and content that can be extracted from a search result. Until today, we had yet to see a compelling reason for webmasters to place RDFa or microformats into a site so that this semantic data could be mined; Google has now provided a social incentive for site designers. This shift toward semantic markup promises to disrupt existing SEO approaches, which are built atop the platform Google provides.
With Google in the game, it now becomes an imperative: sites that want to be listed in search results with Rich Snippets will need to think about RDFa and microformats. Tools that have been designed to present person and review data will now output RDFa and microformat markup compatible with Google by default. Blogging systems like Movable Type or WordPress, ecommerce tools like Magento, and content management tools like Alfresco and Drupal will, very quickly, adopt the formats supported by Google, and in five years' time, we won't be able to imagine a web that wasn't being supported by semantic markup. We will reminisce about the days when search results were produced by ad-hoc text processing technologies unsupported by meaningful data. The search result you are used to today will seem quaint in comparison to the rich data-centric experience of the emerging Semantic Web.
"The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. " - Tim Berners-Lee
UPDATE (3:52PM): We've had some response about failing to mention Yahoo's SearchMonkey which also supports RDFa and Microformats. Google is certainly not the first search engine to support RDFa and Microformats, but it certainly has the most influence on the search market. With 72% of the search market, Google has the influence to make people pay attention to RDFa and Microformats.
Four short links: 12 May 2009
Storage Superfluity, Data-Driven Design, Twit-Mapping, and DIY Biohacking
by Nat Torkington | comments: 1
- Lacie 10TB Storage -- for what used to be the price of a good computer, you can now buy 10TB of storage. Storage on sale goes for less than $100 a terabyte. This obviously promotes collecting, hoarding, packratting, and the search technology necessary to find what you've stashed away. Analogies to be drawn between McMansions full of Chinese-made crap and terabyte drives full of downloaded crap. Do we need to keep it? Are there psychological consequences to clutter? (via gizmodo)
- In Defense of Data-Driven Design -- a thoughtful response to the "Google hates design!" hashmob formed around designer Douglas Bowman's departure from Google. When you’ve got the enormous traffic necessary to work out if miniscule changes have some minor, statistically significant effect, then sure, if you can do it quickly, why wouldn’t you? But that’s optimization that should happen at the very end of the design cycle. The cart goes after the horse. Put it the other way ‘round and you have a broken setup. It doesn’t mean horses suck. It doesn’t mean carts suck. Carts are not the enemy of horses. Optimization is not the enemy of design. Get them in the right order and you have something really useful. Get them the wrong way around and you have something broken.
- Just Landed: Processing + Twitter + Metacarta + Hidden Data -- Jer searched Twitter for "just landed in", used Metacarta to extract the locations mentioned, and then used Processing to build visualizations.
- Do It Yourself Genetic Sleuthing -- MIT is starting a hotbed of DIY biologists. The 23-year-old MIT graduate uses tools that fit neatly next to her shoe rack. There is a vintage thermal cycler she uses to alternately heat and cool snippets of DNA, a high-voltage power supply scored on eBay, and chemicals stored in the freezer in a box that had once held vegan "bacon" strips. Aull is on a quirky journey of self-discovery for the genetics age, seeking the footprint of a disease that can be fatal but is easily treated if identified. But her quest also raises a broader question: If hobbyists working on computers in their garages can create companies such as Apple, could genetics follow suit? It's unclear what those DIY-started "genetics" companies would look like--the potential is there, but it has yet to meet the right problem. (via Andy Oram)
Just Landed - 36 Hours from blprnt on Vimeo.
What is the Right Amount of Swine Flu Coverage?
by Brady Forrest | comments: 5
Dr. Hans Rosling (Gapminder) has posted a short, but effective video comparing the coverage of Swine Flu to a more constant killer like Tuberculosis. He decries the fact that Swine flu has generated many orders of magnitude more coverage per death than Tuberculosis.
Dr. Rosling has a point. The media could be said to be disproportionately covering Swine Flu. However, how can the media not be expected to cover Swine Flu? It is new. It is spreading quickly. It is something that will potentially impact the daily lives of their readers (and themselves). Tuberculosis, while on the rise (see the chart to the right), is a known quantity, is relatively contained, and has a vaccine.
Which should the media focus on? Which would you expect them to? While the media coverage may be overblown (and I questioned putting this post up at all), I think it is understandable to want to track this potential new threat closely.
[Tuberculosis Growth Chart via Wikipedia]
Updated: I realized that this post was incomplete without checking some trend data to see how people's interests compare. Here's the Wikirank comparison chart:
And the Google Trends comparison:
For "fun" I included H1N1 to see if the name change was working. Based on search volume, it does not seem to have been an effective use of re-marketing dollars.
It's clear that the news is driving a lot of interest in Swine Flu and that there is very little residual interest in Tuberculosis. Whether this is the tail wagging the dog remains to be seen.
Vine, Disaster Tech From Microsoft
by Brady Forrest | comments: 4
Last week Microsoft started inviting users into Vine, a public-service tool that will be especially useful during disasters. Whether in an emergency or in everyday life, Vine will be a multi-platform, ad-free method of staying in touch with networks. Once Vine is launched it has the potential to become a very powerful communication platform. Last week I had a phone call with Tammy Savage, the GM of Microsoft’s Public Safety Initiative.
Vine's primary goal is to connect you with a small group of people, reach them wherever they are, and allow you to determine what conditions are like where they are. Vine will do this by letting you connect to it as you desire. Initially that means Facebook, LinkedIn, email, SMS, and the Vine Windows client.
Vine has three main functions and many supporting features:
Send An Alert - You can send a message to a pre-constructed group via its own email address and SMS keyword. All replies go to the group and the messages can later be found on the group's report.
Post A Report - You can also post to the report. This is structured info that can be shared. You can "Check In Safe and Well", "Report Upcoming Plans", "Report a Situation", or share "General Information". Each option is associated with a timestamp and a location, and provides different data fields (for example, "Check In Safe and Well" has a toggle for "Okay"/"Not Okay"; "Report Upcoming Plans" includes a date range). Tammy said that this isn't blogging; however, it seems like it will be very similar.
Research News and Safety Info - On Vine you can search for news and alerts in a geographic area. You will be able to include GeoRSS feeds from around the web. It provides situational awareness in cases of emergencies.
Vine will need to support many different platforms. In my discussion with Tammy she said that there will be web access, Twitter integration, and access for non-Windows users and mobile users. Tammy would not make a commitment to any platform; however, the most logical ones are Mac, Windows Mobile and the iPhone. An API is under consideration. Right now is a time of experimentation for the group. After they see their users' behavior the team will start making decisions about how to expand access.
Vine is a mashup made into a product. It uses a combination of eleven Microsoft services. The ones that I am aware of include: Live Search (for alerts), Messenger (for chat), Live ID (for identity), Hotmail/Live Mail, VE Maps and SQL Server on the backend. In the future we can expect Tellme's voice recognition to be added. The Vine Windows Client will use the new Windows 7 Location and Sensing API in the future.
Vine will not have ads. The team is rightly concerned that ads could be distracting in a crisis. Instead they will add on premium services, but there will always be a free version. I would bet that premium services will be web services (not clients). Enterprises and governments will also be interested in hosting their own version.
Vine is going to start testing in the Seattle area. I asked Tammy if this meant that there would be staged emergencies (ala Strong Angel) to test; there won't be. Instead they want people to use it in their daily lives. Over that time they'll see how people integrate Vine into their lives.
In times of crisis people fall back on what they know. Twitter has quite famously been used during emergencies, but it does not have all of the functionality necessary to be the only method of communication used. Vine will use Twitter's powerful ability to broadcast bits of information to many people from anywhere and supplement it with social networks, news reports, research ability and location-awareness. Tools like Twitter and Facebook need champions to make them suitable for disaster relief scenarios. Hopefully Vine (and InSTEDD's GeoChat) can create a platform that can and will save lives.
For more on Disaster Technology champions watch Jesse Robbins and Mikel Maron in their talk on DisasterTech at Where 2.0.
Four short links: 11 May 2009
Healthcare, Diagrams, Social Networking, and Email
by Nat Torkington | comments: 4
- OSCAR Canada -- open source healthcare (EMR) software, akin to VistA. See linuxmednews.com for more.
- Instaviz -- iPhone app for mindmapping/any other blob-and-line diagram. I'm hypnotised by the correction of a fuzzy hand-drawn circle into a clean crisp algorithmic circle.
- Buddypress -- open source software that turns a Wordpress installation into a social networking platform. Ok, so social networking software is now essentially free. What's the next big thing that will be as hard and new as social networking was in 2003?
- Getting Insight Into One's Own Email -- Thunderbird now shows interesting facts when there's no message to look at: recently read messages, messages most likely to be interesting, and a histogram of activity.
tags: email, healthcare, social networking, visualization
Goodreads vs Twitter: The Benefits of Asymmetric Follow
by Tim O'Reilly | comments: 44
I am never more painfully reminded of the limits of symmetric “friend”-based social networks than I am when I post a book review on Goodreads. I love books, and I love spreading the word about ones I enjoy (as well as ones I expected to enjoy, but didn’t quite). Most of the time, my reviews go out quietly to a small group of friends, whose book recommendations I also follow. It’s a lovely social network.
But every once in a while, I post a link to one of my reviews on Twitter, and am immediately deluged with friend requests. Some of them are from people I know, but whose taste in books I may not share (or even care about), and many are from complete strangers. If I say “yes” to any of them, I have to see every book they review as well. As you can imagine, it doesn’t scale.
I don’t mind if anyone in the world reads my reviews, and they are in fact all public on the site, but for someone to “follow” my reviews (get notified when I write them), they have to be accepted as my friend, in which case I see all their reviews as well. Asymmetric follow should at least be an option on any social network. It’s the way the world really works. We never find ourselves in clearly delineated friend-circles, where everyone has or wants complete visibility with everyone else, or none at all.
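The difference between the two models comes down to whether the relationship is stored as an undirected or a directed edge. The following Python sketch is a hypothetical illustration only: the class and method names are invented here and do not describe how Goodreads or Twitter is actually implemented.

```python
from collections import defaultdict


class SymmetricNetwork:
    """Friend model: one accepted request creates visibility in both directions."""

    def __init__(self):
        self._friends = defaultdict(set)

    def befriend(self, a, b):
        self._friends[a].add(b)
        self._friends[b].add(a)

    def feed_sources(self, user):
        """Whose posts land in this user's feed."""
        return set(self._friends.get(user, ()))


class AsymmetricNetwork:
    """Follow model: an edge from follower to followee, nothing implied in reverse."""

    def __init__(self):
        self._following = defaultdict(set)

    def follow(self, follower, followee):
        self._following[follower].add(followee)

    def feed_sources(self, user):
        return set(self._following.get(user, ()))

    def followers(self, user):
        return {f for f, targets in self._following.items() if user in targets}


symmetric = SymmetricNetwork()
symmetric.befriend("tim", "reader")
print(symmetric.feed_sources("tim"))    # tim now sees the reader's reviews too

asymmetric = AsymmetricNetwork()
asymmetric.follow("reader", "tim")
print(asymmetric.feed_sources("tim"))   # empty: tim's feed is unchanged
print(asymmetric.followers("tim"))      # the reader still gets tim's reviews
```

In the directed version, gaining a follower costs the person being followed nothing, which is why it scales to large audiences.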
If you’re even a minor-league celebrity like me, there are way more people who are interested in what you are doing or thinking than you can possibly keep up with. I can’t even keep up regularly with the 500+ people I do follow on Twitter; keeping up with the 400,000 who follow me would be impossible.
Asymmetric follow is why I use Twitter regularly and Facebook much less often. With Twitter’s model, I can find people I’m interested in, whether or not they know me, and learn about them and their lives and thoughts. Others can include me in their lists. You become “friends” with complete strangers over time, by communicating with them (responding with @messages for example), perhaps by mutual following. In fact, Twitter’s wonderful system of @ messages means that anyone can address me - and so I find myself having conversations with complete strangers as well. I actually follow my @ messages more faithfully than I do my planned Follow list.
On Facebook, I’m expected to approve every request, and alas, I turn down far more than I accept. Amazingly, few people who I don’t know even bother to explain who they are and why they want to be my friend. I sometimes do accept strangers who make a good case for why I’d be interested in them, but I always ignore those I don’t know who don’t bother to even say hello. Ditto for LinkedIn and Plaxo and all the other greedy networks that are clamoring for my time and attention while requiring me to take explicit steps to approve or deny each request.
(Meanwhile Dopplr has seemingly implemented a form of reverse friending, in which I am forced to see the trips of anyone who has requested the ability to see mine, a kind of Bizarro-world asymmetric follow that has rendered Dopplr completely useless to me.)
Asymmetric follow is also a good way to boost viral growth, as it encourages people to try the service without having to be an active user. We learned long ago from Usenet and mailing lists that there are always more lurkers than posters.
So, consider this a LazyWeb request to all social networks out there: even if you have your own ideas about how to organize social networks, have an option for users to turn on “Twitter-mode.” I think you’d be surprised how well it works.
tags: asymmetric follow, dopplr, facebook, goodreads, social networking, twitter
Hacking Primes in Mathematica
by Mike Loukides | comments: 8
If this is too esoteric, skip it. I couldn't figure out anywhere else to put it.
This morning, Tim Bray tweeted about a post on prime numbers and Benford's law. To cut the esoterica short, one of the big problems in prime numbers is that people don't know how they're distributed. This post suggests that Benford's Law describes the distribution of the first digit of prime numbers. One of the comments asked an important question: is this really just an artifact of base 10? Math really doesn't "know anything" about bases, so if this idea doesn't generalize to bases other than 10, it doesn't mean much.
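The base question is easy to probe numerically. The sketch below is in Python rather than Mathematica, and it is an illustration under stated assumptions, not the code from the original post: it tabulates the leading digit of every prime below a limit in an arbitrary base and prints it beside the classic Benford prediction, log_b(1 + 1/d). The function names and the limit of 100,000 are choices made here; it draws no conclusion about how close the two columns should be.

```python
import math
from collections import Counter


def primes_below(limit):
    """Sieve of Eratosthenes: all primes p with p < limit (limit must be >= 2)."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for n in range(2, math.isqrt(limit - 1) + 1):
        if sieve[n]:
            # Clear every multiple of n starting at n*n.
            sieve[n * n::n] = bytearray(len(range(n * n, limit, n)))
    return [n for n in range(limit) if sieve[n]]


def leading_digit(n, base):
    """Most significant digit of a positive integer n written in the given base."""
    while n >= base:
        n //= base
    return n


def first_digit_frequencies(limit, base):
    """Observed share of primes below limit with each leading digit 1..base-1."""
    primes = primes_below(limit)
    counts = Counter(leading_digit(p, base) for p in primes)
    return {d: counts[d] / len(primes) for d in range(1, base)}


def benford(d, base):
    """Classic Benford's Law probability of leading digit d in the given base."""
    return math.log(1 + 1 / d, base)


for base in (10, 16):
    observed = first_digit_frequencies(100_000, base)
    print(f"base {base}")
    for d in range(1, base):
        print(f"  digit {d:2d}: observed {observed[d]:.4f}, Benford {benford(d, base):.4f}")
```

Changing the tuple of bases or the limit makes it simple to see whether any pattern survives outside base 10.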
tags: math, mathematica, mathematica cookbook, mathematics, prime numbers
Who Will Cut The Gordian Knot of Healthcare Billing?
by Tim O'Reilly | comments: 10
In a story about open source medical records systems, I couldn't help but be struck by the irony in the following statement:
Referred to by health care quality guru Philip Longman as an "unrecognized national resource," VistA's open source code is constantly being improved and updated by its users. However, John Halamka MD, Chief Information Officer at the Beth Israel Deaconess Medical Center in Boston, is quick to note that VistA is not designed for complex billing scenarios that challenge large hospital systems because the VA is a single payer system unlike the health system for the general public.
It's true that VistA is designed for hospital management and patient care rather than billing, but isn't it a sign of something wrong when the billing tail wags the dog of care?
For so many problems in our society, solutions are dismissed as impossible because they would require changes that people don't want to make. That's why change so often comes from outside. Perhaps the simplicity of VistA is a feature, not a bug. In its early days, the internet was cited as inadequate -- too lightweight for serious networking -- by proponents of complex, over-built systems. Where's Alexander when we need him? Gordian Knots are everywhere.
Hackers wanted! Scholarships available to coders who'll come to journalism and help save democracy
by Brian Boyer | comments: 30
Guest blogger Brian Boyer is a hacker journalist who writes about the intersection of technology and journalism. He's worked at public-interest journalism site ProPublica and is now at the Chicago Tribune, building their new News Applications team.
It's not news that journalism is in crisis. CNN turned newspapers into first-day fishwrap and Craigslist killed the business model. Solutions are scarce, and our democracy is at risk. I don't have a chart to guide our way through the darkness to Citizenry 2.0, but there are some who can navigate the singularity.
Journalism needs great hackers. Not just nerds, but programmers who care -- about the values of journalism and the power of a free press to hold government accountable. Luckily, hackers are a freedom-minded bunch. The free software movement is rooted in many of the same principles that guide journalism. But news organizations aren't very sexy places to work -- especially now, as layoffs, bankruptcy and closures plague the industry. So how can we bring nerds to the news? One old-skool school is trying.
Free beer school!
Tell your programmer friends: The Medill School of Journalism at Northwestern University is giving away full scholarships, plus expenses, to software developers. They can get a master's degree in journalism, gratis, from one of the most prestigious J-schools around.
I recently graduated from the year-long program, during which I studied with one other hacker and ~45 brilliant 'normal' journalism students. I interviewed lawmakers, farmers and shopkeepers and wrote stories about agriculture, waterways, and the diabetes epidemic in Illinois. It was difficult to shake my introverted, google-first, face-to-face-as-a-last-resort programmer nature. But it was also thrilling.
Journalism is an info-geek's dream. You're constantly learning new topics, speaking with experts, and distilling real-world issues to their essence -- all in the mission of informing the folks who don't have time to soak up all that data. It's like being paid to write a new Wikipedia article every day.
We also wrote some software. My programmer colleague and I banged out enviroVOTE in a frenetic weekend of coding and coffee in the days preceding the election. The night of, we were tied to our keyboards, tallying results and tweeting updates while the rest of the world was watching TV. Such is the life of a journalist.
For our final project at Medill, the two coders and four non-coder new-media students built NewsMixer, an experiment in integrating social networks with news coverage. It was one of the first applications to roll out on Facebook Connect, and remains one of the only apps that explores its full potential. All the code is GPL'ed and has already spawned other open-source projects.
This is the time to remake journalism
Programmers have been making an impact in the news world for some time, but until recently most innovation in this space has been in creating new ways to present the old style. With a few shining exceptions like the datavisuals by the New York Times, most online news could have been written on a typewriter and mailed to Google for indexing.
Then, something amazing happened: Software won a Pulitzer Prize. Created by hacker journalist Matt Waite and other fantastically clever folks at the St. Petersburg Times, PolitiFact is a form of news that could only exist online. Aron Pilhofer, leader of the innovations team at the NYT, put it perfectly:
But is it journalism, some people asked? There's no lead per se, no narrative and no pyramids anywhere to be found, much less the inverted sort.
Journalism is about helping people make sense of important issues, and how those issues affect them personally. It's about uncovering that which someone wants to keep hidden. It's about holding people we place in high public office accountable. And by those definitions... PolitiFact more than meets the test. It takes a traditional form of newspaper reporting -- fact-checking what politicians say -- and scales it up in a way only possible on the web.
The NYT's Represent and its open-source cousin, Repsheet, are innovations much in the same vein, and their existence is a sign of the times. The tools now available to hackers are so great that we can think far beyond content management systems. The moment has come when a couple of great hackers can knock out a fully-fledged new form of media in a matter of weeks. Tell the Twitterati: there are lights in the distance.
Hackers wanted
The news is waiting to be saved. We have the technology, all we need is more nerds. So ditch your boring corporate gigs and come to journalism! Democracy is one hell of a fun problem to hack.
tags: education, journalism, open source, programming, web 2.0
Four short links: 8 May 2009
by Nat Torkington | comments: 2
- Citizen Journalism and Civic Reporting -- Gawker rebuts the nonsense that reporters will be the only people at council meetings: as a newspaper reporter who spent a few years covering a town much like Baltimore — Oakland, California — I often found that bloggers were the only other writers in the room at certain city council committee meetings and at certain community events. They tended to be the sort of persistently-involved residents newspapermen often refer to as "gadflies" — deeply, obsessively concerned about issues large and infinitesimal in the communities where they lived. I know my local newspaper only paraphrases council press releases, they rarely actually attend the meetings. (via waxy)
- Keeping Score (Rowan Simpson) -- It makes me wonder what other things we dismiss as being too simple to be useful. Inspired by Atul Gawande's books, which I highly recommend.
- The Extraordinaries -- micro-volunteer opportunities on the mobile phone. (Think of it as Mobile Turk) Another way to harness our great cognitive surplus.
- Visualization in Sports -- roundup of the use of computer graphics and visualization in sports. Sports is competitive, lucrative, and quite fast-paced. I love to see sport and business learning from each other. (via tomc on delicious)
tags: book related, crowdsourcing, journalism, mobile, visualization
Velocity 2009 - Big Ideas (early registration deadline)
by Jesse Robbins | comments: 5
My favorite interview question to ask candidates is: "What happens when you type www.(amazon|google|yahoo).com in your browser and press return?"
While the actual process of serving and rendering a page takes seconds to complete, describing it in real detail can take an hour. A good answer spans every part of the Internet from the client browser & operating system, DNS, through the network, to load balancers, servers, services, storage, down to the operating system & hardware, and all the way back again to the browser. It requires an understanding of TCP/IP, HTTP, & SSL deep enough to describe how connections are managed, how load-balancers work, and how certificates are exchanged and validated... and that's just the first request!
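As a rough illustration of just the client side of that first request, the sketch below walks through the stages in Python: DNS lookup, TCP connect, TLS handshake with certificate validation, and a single HTTP request. The function names (`build_request`, `fetch_head`) and the choice of a `HEAD` request are assumptions made for this example. A real browser does far more (caching, connection reuse, rendering), and everything behind the load balancer is invisible from here.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 HEAD request as raw bytes."""
    lines = [
        f"HEAD {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_head(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Resolve, connect, negotiate TLS, and return the response status line."""
    # 1. DNS: turn the hostname into one or more socket addresses.
    addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, socktype, proto, _, sockaddr = addrinfo[0]

    # 2. TCP: connect to the first resolved address (three-way handshake).
    with socket.socket(family, socktype, proto) as raw:
        raw.settimeout(timeout)
        raw.connect(sockaddr)

        # 3. TLS: the default context validates the certificate chain
        #    and checks that the certificate matches the hostname.
        context = ssl.create_default_context()
        with context.wrap_socket(raw, server_hostname=host) as tls:
            # 4. HTTP: send the request, then read until the server closes.
            tls.sendall(build_request(host))
            response = b""
            while chunk := tls.recv(4096):
                response += chunk

    # The first line of the response is the status line.
    return response.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `fetch_head` with a hostname performs all four steps over the network and returns the status line. Each numbered comment corresponds to a layer a candidate could spend several minutes describing.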
Web Performance & Operations is an emerging discipline which requires incredible breadth, focusing less on specific technologies and more on how the entire system works together. While people often specialize in particular components, great engineers always think of that component in relation to the whole. The best engineers are able to fly to the 50,000 foot view and see the entire system in motion and then zoom in to microscopic levels and examine the tiny movements of an individual part.
John Allspaw recently described this interconnectedness on his blog:
With websites, the introduction of change (for example, a bad database query) can affect (in a bad way) the entire system, not just the component(s) that saw the change. Adding handfuls of milliseconds to a query that’s made often, and you’re now holding page requests up longer. The same thing applies to optimizations as well. Break that [bad] query into two small fast ones, and watch how usage can change all over the system pretty quickly. Databases respond a bit faster, pages get built quicker, which means users click on more links, etc. This second-order effect of optimization is probably pretty familiar to those of us running sites of decent scale.
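To put rough numbers on that effect, here is a deliberately simplified back-of-the-envelope model in Python. It assumes every query on a page runs sequentially and that a fixed pool of workers each builds one page at a time. The function names and all the figures are made up for illustration and do not come from any real site.

```python
def page_time_ms(query_ms: float, queries_per_page: int, render_ms: float) -> float:
    """Time to build one page: fixed render cost plus sequential queries."""
    return render_ms + queries_per_page * query_ms


def max_pages_per_second(workers: int, page_ms: float) -> float:
    """Upper bound on throughput when each worker builds one page at a time."""
    return workers * 1000.0 / page_ms


# Hypothetical baseline: 10 queries of 5 ms each plus 50 ms of rendering.
baseline = page_time_ms(query_ms=5, queries_per_page=10, render_ms=50)

# The same page after a change adds 5 ms to that frequently made query.
slower = page_time_ms(query_ms=10, queries_per_page=10, render_ms=50)

for label, ms in (("baseline", baseline), ("slower query", slower)):
    capacity = max_pages_per_second(workers=20, page_ms=ms)
    print(f"{label}: {ms:.0f} ms per page, at most {capacity:.1f} pages/s")
```

In this toy model, a handful of extra milliseconds on one query stretches the page from 100 ms to 150 ms and cuts the ceiling on throughput by a third, before accounting for any change in how users behave.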
Working with these systems requires an understanding not only of the way technology interacts, but the way that people do as well. The structure, operation, and development of a website mirrors the organization that creates it, which is why so many people in WebOps focus on understanding and improving management culture & process.
Organizing a conference like Velocity is a wonderful challenge because it requires the same sort of thinking. We focus on the big concepts that everyone needs to know and then go deep into the technologies that change our understanding of the system. We find ways to share the unique experience that can only be gained by operating at scale. We make it safe to share as much of the "Secret Sauce" as we can.
Please join us at Velocity this year; we have an amazing lineup of speakers & participants. Early registration ends on Monday, May 11th at 11:59 PM Pacific. (Radar readers can use "vel09cmb" for an additional 15% discount.)
tags: cloud, data, infrastructure, operations, scale, velocity, velocity09, velocityconf, web, web2.0