Artur Bergman

Artur Bergman, hacker and technologist at-large, is the director of engineering at Wikia, supporting its mission to compile and index the world's knowledge. He is also an enthusiastic apologist for federated identity and a board member of the OpenID Foundation. His current interests include semantic search, large scale infrastructure, open source development, federated instant messaging, neurotransmitters, and the future of cyborgs.
Fri
Apr 3
2009
Savory: Native Kindle epub and PDF Converter
by Artur Bergman | comments: 1
In an editorial for Forbes, Tim called for the opening of the Kindle, warning that it will otherwise slowly turn obsolete. Since I love my Kindle, I am happy that my friend Jesse Vincent, a longtime open source contributor and OSCON speaker, is trying to open the Kindle. (You might remember him as the guy who discovered Amazon's USB-network easter-egg in the Kindle 2 last month.)
He is developing Savory, the first native Kindle application. Savory is an open source epub and PDF converter that actually runs natively on the Kindle. While it doesn't add anything that you couldn't do from a desktop, it streamlines the process, allowing you to copy epubs and PDFs to your Kindle over USB or download them from the web, and immediately read them offline. (O'Reilly provides Bookworm, which converts DRM-free epubs to HTML and lets you read them through the Kindle's web browser, as well as DRM-free .mobi formatted versions of much of O'Reilly's catalog at O'Reilly Ebook Bundles.) Here's Jesse on why he created Savory:
I'm in love with my Kindle. I've been reading ebooks on screens of various sorts for many years, but the Kindle2 is the first device that I actually enjoy reading as much as I enjoy reading paper books. I've tried other ebook readers, but for a variety of reasons, they just don't work for me. My goal is to make it easier for readers to read more free content on the Kindle.
Savory is based on the open source project Calibre -- a Python application that lets you convert between multiple ebook formats. The implementation is a background daemon that uses inotify to notice newly arrived files and immediately convert them to the mobi format. To get a performance boost, it uses unladen-swallow -- Google's optimized version of Python. I find it exciting that this paves the way for third-party applications on the Kindle.
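To make the watch-and-convert idea concrete, here is a minimal sketch in Python. It is not Savory's code: it swaps inotify for plain directory polling (Python's standard library has no inotify binding), and the `convert` callback, `scan_once`, and `watch` are hypothetical names standing in for whatever Savory and Calibre actually do.

```python
import os
import time

# Formats the post says Savory converts; the tuple itself is an assumption.
WATCHED_EXTENSIONS = (".epub", ".pdf")


def scan_once(directory, seen, convert):
    """Convert each not-yet-seen epub/PDF in `directory`; return the new targets."""
    outputs = []
    for name in sorted(os.listdir(directory)):
        if name in seen or not name.lower().endswith(WATCHED_EXTENSIONS):
            continue
        seen.add(name)
        source = os.path.join(directory, name)
        target = os.path.splitext(source)[0] + ".mobi"
        convert(source, target)  # hypothetical stand-in for the real converter
        outputs.append(target)
    return outputs


def watch(directory, convert, interval=1.0):
    """Poll `directory` forever; a real daemon would block on inotify events."""
    seen = set()
    while True:
        scan_once(directory, seen, convert)
        time.sleep(interval)
```

Polling wakes up on a timer even when nothing has changed, which is exactly the cost an event-driven inotify daemon avoids on a low-power device.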
While I wish that Amazon would follow Apple's path and make the Kindle DRM free, it is worthwhile to note that Savory itself does not deal with DRM at all.
Jesse says:
No. Savory does not include support for ebooks protected by DRM. DRM is an incredibly "hot" topic in the ebook world right now. There are varying opinions on its efficacy. My opinions on the matter aren't relevant, except to say that I am not touching the topic with a 10 foot pole. It will not convert DRM-protected ebooks into a format the Kindle will read. It will not add or remove DRM from any ebook.
Personally, I've been using Calibre to create and convert a daily operations report for all of Wikia -- and I look forward to being able to download the report from the web and just read it.
More information can be found on Jesse's blog, with the code available at savory.googlecode.com.
Thu
Feb 12
2009
Cloud Computing defined by Berkeley RAD Lab
by Artur Bergman | comments: 4
I am pleased to finally have found a paper that manages to bring together the different aspects of cloud computing in a coherent fashion, and suggests the requirements for it to develop further.
Written by the Berkeley RAD Lab (UC Berkeley Reliable Adaptive Distributed Systems Laboratory), the paper succinctly combines Software as a Service with Utility Computing to arrive at a workable definition of Cloud Computing. It is a recommended read.
The services themselves have long been referred to as Software as a Service (SaaS). The datacenter hardware and software is what we will call a Cloud. When a Cloud is made available in a pay-as-you-go manner to the general public, we call it a Public Cloud; the service being sold is Utility Computing. We use the term Private Cloud to refer to internal datacenters of a business or other organization, not made available to the general public. Thus, Cloud Computing is the sum of SaaS and Utility Computing, but does not include Private Clouds.
The paper explores the difference between the raw service of Amazon EC2 and the high-level, web-centered Google App Engine. The highlights are:
- Insight into the pay-as-you-go aspect with no commitments
- Analysis of cost with regard to peak and elasticity in the face of unknown demand
- Cost of data transfers versus processing time
- Seamless migration of user to cloud processing
- Limits and problems with I/O on shared hardware
- Availability of Service
- Data Lock-In
- Data Confidentiality and Auditability
- Data Transfer Bottlenecks
- Performance Unpredictability
- Scalable Storage
- Bugs in Large-Scale Distributed Systems
- Scaling Quickly
- Reputation Fate Sharing
- Software Licensing
I find the analysis of transportation cost versus computing cost particularly interesting: when is it more efficient to use EC2 than your own individual processing? I predict that the speed of light and the availability of raw transfer capacity are going to become an even larger obstacle (inside computers, between them on local LANs, and on WANs).
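The trade-off can be written down as a small back-of-the-envelope calculation. The sketch below is my own simplification, not the paper's model: it compares only elapsed time, ignores pricing, and every function and parameter name is hypothetical.

```python
def transfer_seconds(data_bytes, link_bits_per_second):
    """Time to push `data_bytes` over a link of the given raw capacity."""
    return data_bytes * 8 / link_bits_per_second


def cloud_is_faster(data_bytes, link_bits_per_second,
                    local_compute_seconds, cloud_compute_seconds):
    """True when shipping the data and computing remotely beats computing locally.

    Assumes the data must be transferred before any cloud work can start,
    and that result transfer time is negligible.
    """
    cloud_total = (transfer_seconds(data_bytes, link_bits_per_second)
                   + cloud_compute_seconds)
    return cloud_total < local_compute_seconds
```

For instance, moving 10**9 bytes over a 10**8 bit/s link takes 80 seconds in this model, so the cloud only wins if it saves more than 80 seconds of processing time.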
The paper reinforces my belief in the cloud, and also my belief that we need open source cloud environments and a larger ecosystem of providers.
Read more on the Above the Clouds blog.
tags: cloud computing, operations, web2.0
Sun
Aug 10
2008
Adhearsion - next killer app for Ruby?
by Artur Bergman | comments: 7
Foo Camp attendee Ben Black alerted me to Adhearsion, a framework for developing applications in the VoIP space. Think of it as a Ruby on Rails for telephony. It was developed by Jay Phillips, who got frustrated by the slow uptake of Asterisk.
Adhearsion is written in Ruby and lets those even without any VoIP experience write applications intuitively and productively or simply download and use a pre-written solution. With the framework extension architecture, VoIP functionality can now be actually traded around - an issue the VoIP industry has always suffered from.
A fresh, standard Adhearsion system out of the box does what many companies spend thousands on. It includes a wide - and growing - set of features that should not have to be rewritten for every business that wants to implement them. And yes, this is open-source.
Considering Microsoft spent around $800M on Tellme, I look forward to seeing what kind of applications this leads to, and what value they generate. We often forget the enormous market for telephone-based services.
tags: open source, web 2.0
Tue
Jul 22
2008
Perl on App Engine?
by Artur Bergman | comments: 8
I am a Perl hacker. I have written parts of the core, created CPAN modules, and written tons of Perl code. In fact, I am addicted to it; or rather, to CPAN. I have been wanting to play around with Google App Engine, but I haven't had time to get up to speed in Python. Today at OSCON I met up with Brad Fitzpatrick, who told me he had permission from Google to talk about and work on a Perl on App Engine project.
As he makes clear:
I'm happy to announce that the Google App Engine team has given me permission to talk about a 20% project inside Google to add Perl support to App Engine. To be clear: I'm not a member of the App Engine team and the App Engine team is not promising to add Perl support. They're just saying that I (along with other Perl hackers here at Google) are now allowed to work on this 20% project of ours out in the open where other Perl hackers can help us out, should you be so inclined.
The plan is to harden Perl (one layer of defense in App Engine's hardened environment) and to implement Protocol Buffers and stubs of the backend services, so people can write App Engine applications on their local servers.
There is more information at Brad's LiveJournal, as well as the Perl-AppEngine project. Capturing the creative spirit here at OSCON, Brad and I hacked together a new module that emulates a protected environment, Sys::Protect (generally a good idea for any web application).
tags: open source, open space, oscon
Thu
Feb 7
2008
OpenID Foundation - Google, IBM, Microsoft, VeriSign and Yahoo
by Artur Bergman | comments: 14
I am very happy to be able to say that Google, IBM, Microsoft, VeriSign and Yahoo are joining the OpenID Foundation (on whose board I sit). It marks the culmination of a lot of hard work by all parties involved and -- at least for me personally -- strengthens the hope that we will be able to get a decentralized, federated single sign-on technology across the internet.
My experience from co-authoring djabberd, as well as working on systems with large numbers of end users, has taught me the value of decentralized federation. Just as I have multiple different jabber ids or email addresses for different contexts, I also want to have different identities that I can use in different contexts across multiple sites.
From the beginning I was captivated by the promises of this system, and at Six Apart I worked to make sure it was available for widespread adoption. I would like to especially thank David Recordon for convincing me and others to continue, and for his tireless evangelism, which earned him a 2007 Google-O'Reilly Open Source Award. It is fitting that he is now back at Six Apart.
I am very grateful to the entire OpenID Community, the rest of the Foundation board and supporting companies who have taken it this far in a little over two and a half years.
Brad Fitzpatrick created OpenID to solve the problem of people commenting between different installations of LiveJournal. Using a URL-based identity for blog commenting made perfect sense, as the identity you are commenting with is your blog. However, the URL-based identity does confuse people, and so at the Social Graph Foo Camp, Brad et al. came up with a proposal to map email addresses to OpenID URLs. Perhaps the idea of just using your email address to log in will be easier to understand.
Another area where we see innovation enabled is that OpenID does not specify how you authenticate to your OpenID provider. Examples of this innovation include putting OpenID in cellphones, connecting it with the Estonian National ID card, older standards like Kerberos, new desktop authentication technologies, and one-time-password tokens, and even new markets being formed around phishing-resistant web authentication.
This kind of layered extensibility is why I find the design of OpenID so important, as I've written before. It is an enabling technology. The basic implementation allows exploration, and I am looking forward to seeing what people can use it for.
Again, thanks to all of you who made it happen.
tags: web 2.0
Fri
Jan 25
2008
Books that make you dumb
by Artur Bergman | comments: 36
Wikiscanner hacker Virgil Griffith told me a while ago about his latest data mining project, to visualise the relationship between books and SAT scores. Today he released his findings at Booksthatmakeyoudumb.
He does this by cross-referencing the 10 most popular books at every college, as given by Facebook, with the college's average SAT score. He then presents it all in this nifty little visualisation.
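The cross-referencing step can be sketched in a few lines of Python. This is an illustration only, not Virgil's code: the function name is hypothetical, and the unweighted mean across colleges is my assumption, since the post does not say how he aggregates the scores.

```python
from collections import defaultdict


def average_sat_by_book(favorite_books, average_sat):
    """Map each book to the mean SAT score of the colleges that list it.

    favorite_books: {college: [book, ...]}, the most popular books per college
    average_sat:    {college: average SAT score}
    """
    scores = defaultdict(list)
    for college, books in favorite_books.items():
        for book in books:
            scores[book].append(average_sat[college])
    # Unweighted mean: every college counts equally (an assumption).
    return {book: sum(values) / len(values) for book, values in scores.items()}
```

Called with made-up placeholder inputs such as `{"College A": ["Book X", "Book Y"], "College B": ["Book X"]}` and `{"College A": 1200, "College B": 1000}`, it returns `{"Book X": 1100.0, "Book Y": 1200.0}`.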
I find it somewhat amusing and surprising that erotica takes top and bottom positions, with Lolita at the top and the author Zane coming in last. (Perhaps it says something that the lowest-scoring book is actually miscategorized.) The book named "I don't read" also comes pretty far down.
In all, the results aren't that surprising, but as Virgil said to me: "But isn't it wonderful to have concrete data to back it up?"
Wed
Dec 5
2007
OpenID 2.0 Final
by Artur Bergman | comments: 5
The next version of OpenID, the open authentication system, is finally a released specification. As well as the technical work on the 2.0 specification, the community has worked to ensure that OpenID is freely implementable, resulting in the execution of a non-assert agreement by the contributing parties.
As a board member of the OpenID Foundation, I am grateful for the careful work by AOL, Cordance, JanRain, Microsoft, NetMesh, Six Apart, Sxip, Sun Microsystems, Symantec, VeriSign and Yahoo!. Last week Google and Microsoft also showed their support of OpenID, respectively through the launch of OpenID support in Blogger and through Microsoft Research. With support from these big vendors and the many shipping open source reference implementations, I have high hopes for the adoption of OpenID as well as the technologies that will be built on top of it.
In late summer 2005, Brad Fitzpatrick at Six Apart came up with OpenID to facilitate authenticating your ownership of a URL to another website. The driving force behind this was to enable commenting across multiple blogging sites without the need for accounts on each of these services.
OpenID should be viewed as a core enabling technology. It allows the authentication and exchange of account data between unrelated websites. Indeed, it does not attempt to solve higher-level problems, such as authorization. Instead you can envision it as the underlying technology for interactions between social networks, like the interaction that Tim talks about in
tags: web 2.0
Thu
Oct 18
2007
Platial acquires Frappr
by Artur Bergman | comments: 0
In Brady's CFP for Where 2.0 he mentions the emerging importance of the Geoindex. In the -- to my knowledge -- first consolidation in the social mapping space, Platial is acquiring Frappr. Platial's community-generated place data and Frappr's personal location data combine into a larger geoindex, allowing for a more personalised mapping environment.
Platial's CEO, Di-Ann Eisnor, told me that they plan to integrate the functionality within the coming months, and allow you to filter the place data based on your peers in Frappr groups. The combined sites have more than 100 million data points and 15 million unique visitors per month, with 4 million maps created.
Advertising against any user-generated content is difficult, and I think that the increased reach and the ability to target the ads to users' locations and interests create a better proposition for advertisers. With an estimated 25% reach of all map widgets, Platial should become a larger player in location-related advertising. Perhaps partly based on the ad network and technology provided by Mappam?
tags: geo
Fri
Aug 24
2007
German and Japanese Wikipedia scanner
by Artur Bergman | comments: 7
Time for more excitement on the Wikipedia front. Last week Virgil released the English Wikiscanner, as I wrote about previously. This week, it is time for German and Japanese edits to be exposed, with the newly released German and Japanese Wikiscanner. Sadly, I read neither German nor Japanese, but I am sure some of you readers do. If you find any nice ones, please post them in the comments. Thanks to a tipster, I know of this Scientology ping pong.
tags: just plain cool, lazyweb, web 2.0
Fri
Aug 17
2007
Opening up the Social Network Graph
by Artur Bergman | comments: 19
LiveJournal founder Brad Fitzpatrick and Open Source Awards winner David Recordon just posted a manifesto titled "Thoughts on the Social Graph". Brad and David presented their work at Foo Camp and have been sharing it with interested parties over the last couple of months.
Their project attempts to solve the problem of multiple overlapping social networks. This overlap makes it harder to establish new sites, as people tire of rebuilding their networks on each social networking site. As a non-profit, open source project, it aims to be vendor-neutral and usable by all vendors.
Brad sums it up:
Users and developers alike are going crazy. There's too many social networks out there to keep track of. Developers want to make more, and users want to join more, but it's all too much work to re-enter your friends and data. We need to lower the amount of pain for both users and developers and let a thousand new social applications bloom.
All I can say is: finally!
tags: foo camp, open source, web 2.0
O'Reilly Home | Privacy Policy ©2005-2009, O'Reilly Media, Inc. | (707) 827-7000 / (800) 998-9938