Jim Stogdill

Jim Stogdill is a group CTO for a large technology consultancy where he advocates the development of open source software in government and defense. He believes, perhaps naively, that open source can help break the proprietary lock-in business model that is the norm in that space. In previous lives he built B2B reverse auction systems, brought heuristic-based optimization and online trading to the corporate treasury, and traveled the world as a Navy officer. Unfortunately, from his vantage point it all looked like the inside of a submarine. He spends his free time hacking silver halides with decidedly low-tech gear. He's on Twitter as @jstogdill.
Fri, Jun 19, 2009
The Web^2: it's Exponential, but is it Contracting or Expanding?
by Jim Stogdill | @jstogdill | comments: 4
The theme for the Web 2.0 Summit this year is Web Squared. It is rooted in the idea that as the web morphs from a hub-and-spoke distribution model into a network of connected people and things, innovation and opportunity on it are growing exponentially. There has been a little bit of discussion on the Radar back channel about exactly what this means, or should mean, and Nat started things off with a thoughtful response that probably should be blogged as well. In particular he introduced feedback loops into the discussion, and with Nat's prodding, I decided to share my response to his email here. I've edited it to make it a *bit* more cohesive, and while it isn't as structured as I would like, these are my thoughts about the exponential future of the web and a little bit about how that future might also impinge on the future of government...
I agree with Nat that feedback loops are a great mental filter through which to view the world. I read a little bit of Wiener and now I see feedback loops everywhere. Furthermore, what I like about them as a mental model is that they help me understand the web at the ecosystem level rather than at the level of a specific technology. Wiener defined a cybernetic system the way engineers define a thermodynamic system. In thermodynamics, a system is closed if no energy crosses its boundary. A cybernetic system is closed when no messages or information cross. Since messages are the lifeblood of feedback, these boundaries are important. As an example, the open government stuff is so exciting to me because once computing systems connect the web and government, the boundaries of previously isolated cybernetic systems (e.g. the people and their government) begin to be permeable. And once they are permeable to computing messages they will also be permeable to cultural signals that can create cultural feedback loops. That will cause state to change on both sides of the boundary. Two small isolated cybernetic social systems become one larger integrated one with new feedback loops in place.
Regarding the exponential theme, I'm not sure that innovation is progressing as an exponential over time - although, in fairness, I'm still working on my unabashed optimism credentials. But... In the 1920's automobile companies were springing up like crazy in America. It was the era before production methods became the dominant competitive weapon, and anyone with a good idea for a better combustion chamber design or a valve train or a styling cue could still try their hand at building a car company. With access to tools, labor, and know-how, Detroit in the 20's was a very generative environment for automobile innovation. But by 1980 even DeLorean with a trunk full of coke couldn't afford the startup costs - a combination of more sophisticated design requirements and changes in production scale economics made it impossible.
Are Data Centers the Economic Equivalent of Manufacturing Plants?
The interesting parallel with the web (or computing and software more generally) is the rise of the data center as a key piece of competitive know-how and, perhaps more importantly, capital cost. The question in my mind is whether utility computing enhances generativity or, by making it contingent on powerful interests, effectively stifles generativity in the long term despite the generative potential of the technology (I'm shamelessly borrowing the idea of contingent generativity from Jonathan Zittrain). And a related question: does the introduction of capital cost as a major factor in the ecosystem eventually make the web feel more like Detroit in 1980? Will it fundamentally change the web by tying it firmly to those who can access sufficient capital? (Google spent over $800m on data center capital improvements last year. That's a number that even makes the Defense Department wistfully declare "we just can't afford to do what Google does.")
Or, put another way: the electric utilities made innovation with electrical devices more possible, but it doesn't necessarily follow that utility computing will always do the same. After all, electrical utilities ship their power to us, where we use it in situ for whatever purpose we want, but utility computing requires us to send our "loads" to them, where it is much easier to implement perfect mechanisms of surveillance and enforcement. Homeowners associations used association charters to turn neighborhoods into little fascist fiefs, and data centers have the potential to do the same with EULA's.
Scale and Concentration (or, is the Universe Expanding or Contracting?)
As scale on the web increases there are competing concentrating and generative factors at work (any of which might be exponential). The concentrating factors (need for capital, sophisticated expertise, ...) tend, like gravity, to collapse the system down on itself in a variety of ways. I don't mean that it becomes less relevant or makes less money; I mean that it ends up feeling more like AT&T in the 60's, with centralized control, vested interests, and strict contingencies on generativity - just like Apple's oversight of the App Store. On the other hand, the factors that tend toward expansion are feedback loops that span organizational boundaries, ready access to seed funding, standards for cloud computing that encourage true commodity availability of non-contingent generative environments, etc.
Figuring out which force will dominate is like trying to figure out whether the universe will expand forever or eventually contract. The balance between the factors is quite subtle, depends on minute variations in initial conditions, and is very difficult to predict. But, we can still ask ourselves, "how can we influence the broader cybernetic ecosystem of the web to encourage policy, practices, cultural values, etc. that will promote generative expansion rather than scale-driven contraction?"
Exponential Effects and Social Structures
Shifting gears for just a moment, complexity science is the other idea I tend to come back to as a frame for viewing the web that, while not directly related to the exponential theme, is at least peripheral. The web is fascinating in the way it has become the cybernetic substrate on which both technical and social patterns are emerging. Stripes form on a zebra because "black" and "white" chemical messengers from adjoining cells interact with each other differently over distance. Out of that simple mechanism complex patterns emerge. The web is transport for human messages that don't decay with geo-spatial distance. This geo-and-time-independent messaging is enabling human "striping" that is no longer geo-ethnic dependent.
Within a geography the existing striping can become more severe as the web enables self-selected and self-reinforcing pockets of auto-propaganda that combine with social graph clusters; clusters that only infrequently span value systems. The situation is reminiscent of 1930's era Spanish political parties and their newspapers, but operating at photo-multiplier tube speed. We consume the stuff that reinforces our world view and segregate ourselves into more and more thoroughly strident neighborhoods of belief. We remain physically in our geo-defined country, but in our chosen echo chamber we each live a very different intellectual and emotional experience in a whirlpool of exponentially hardening world view. Perhaps someday we'll live in "nation states" that are stripes of psychographic and value alignment instead of stripes in geography.
Of course, it's true that as long as we are physical beings we will continue to stripe locally in our physical world. The cybernetic overlay in human relationships provided by the web doesn't replace that reality, but by augmenting it and letting us stripe along lines of affinity and value system without regard to geography, it contributes to fissures in our geo loyalties. These fissures are important because States exist to govern the physical world (trade, law, taxes, defense...) but they depend on shared values and culture to function effectively. Just look at Iran today to see the effect of incongruent value systems on co-located peoples.
tags: web 2.0
Tue, Apr 28, 2009
Forge.mil Update and DISA Hacks Public Domain
by Jim Stogdill | @jstogdill | comments: 0
On Monday DISA's forge.mil got another mention on Slashdot. Not exactly new news, but I think it has been getting press again because DISA is also open sourcing its Corporate Management Information System (CMIS). CMIS is a suite of HR and related projects, and DISA has signed an agreement with OSSI to open source them.
I had been meaning to touch base with Rob Vietmeyer at DISA anyway and the Slashdot mention (plus a subtle kick from Sara Winge) got me off the dime. We are working on a project that we want to share across DoD and since Rob is the force behind forge.mil, I had been meaning to ask him about its uptake. I thought I'd share his answers here.
Since forge.mil was launched it has grown to about 1400 registered users and approximately 10% of them are active on any given day. There are approximately 70 active projects right now in a variety of categories. There are system utilities, geospatial systems, a control system for UAV's, an embedded engine control module, command and control system components, some SOA piece-parts for Net Enabled Command and Control (NECC) and Consolidated Afloat Networks and Enterprise Services (CANES), and sundry others. Project-related traffic (commits, downloads, etc.) is growing and there is a backlog of new projects being on-boarded (including at least one really high profile system that is looking to get broad participation).
What interested me about that list was that the code ranges across domains and from small niche items to components of large scale programs.
At this point most of the code seems to be licensed as "DoD Community Source" and a few projects are under Apache and BSD-style licenses. DoD Community Source basically means that the code is "unlimited rights" or "government purpose license rights" under the Defense Federal Acquisition Regulation Supplement (DFARS). While not "open source" in the OSI sense of the term, hosting code licensed this way on forge.mil should make collaboration across DoD the default rather than the exception. Basically these aren't copyright-based licenses, but they are designed to operate in practice as though they were - the goal is to do open source-like development within the DoD garden walls.
The source released under Apache / BSD-style licenses is, in fact, copyrighted and genuinely licensed material, but at the moment it is still difficult for non-DoD community members to participate because of forge.mil access limitations. DISA is looking into ways to mirror these open source materials to SourceForge instances outside the DoD garden walls and to extend community participation across those boundaries as well. I think mirroring the code will be a lot easier than figuring out how to do boundary-spanning community.
Projects wanting to be hosted on forge.mil go through a "project adjudication" process that screens out the people just looking for a repository but who don't understand open (or, understand it but don't want it). Projects that don't want to provide open access to other DoD participants have been turned away.
I think there is something interesting hidden in plain view in that CMIS news as well.
One of the oddities of code written by government employees is that it carries no copyright. In an ironic twist, this means the government can't directly release such code under open source licenses, since those licenses rely on copyright law to enforce their terms.
CMIS was written by government employees, so DISA and OSSI had to figure out a hack to license it under a copyright-based license. Under the terms of their agreement, DISA releases the code to OSSI into the public domain, and OSSI then re-releases a "derivative work" under OSL/ASL licenses.
I understand what DISA / OSSI is doing here, but I wonder how much they've changed the code to justify the "derivative" distinction. It's probably moot though because, assuming a community forms around the stuff, it shouldn't take too long before a chain of real derivations is in place that would make the OSL/ASL license terms defensible.
tags: government, opensource
Mon, Feb 16, 2009
Google's PowerMeter. It's Cool, but don't Bogart My Meter Data
by Jim Stogdill | @jstogdill | comments: 17
Last week I read this piece in the New York Times about Google's PowerMeter, their entry into the smart meter game. The story was picked up in quite a few places, but neither the NYT piece nor related articles from other outlets expanded much on Google's underlying press release. Google's FAQ isn't very satisfying either; it has no depth, so I didn't really know what to make of it. When I finished reading it I was left with an inchoate, unsettled feeling and then I forgot about it. But on Friday evening I had a random conversation about it with a colleague who works in the meter data management (MDM) space. By the time we were through talking about what Google might be doing I had arrived at a position of love / hate. I'll explain the love first.
In terms of the attention this brings to energy consumption at the household level, I really love what Google is doing with this initiative. As they put it:
"But smart meters need to be coupled with a strategy to provide customers with easy access to near real-time data on their energy usage. We're working on a prototype product that would give people this information in an iGoogle gadget."
I agree completely. It's not exactly the same thing, but I've been amazed by how much my behavior behind the wheel changed once I started leaving average mpg permanently visible on my car's dashboard display. In short order I went from speed racer wannabe to one of those guys that gets harassed by co-workers for driving too slow. "Hey, can you hypermile on the way back from lunch? I'm starving."
While I am not sure that a gadget on the web will have the same right-there-in-front-of-my-eyes impact that my car's LCD display has, I'm convinced that Google has hit on something important. After all, today most of us have no idea how many kilowatts we use, what we use them for, or how much we're paying per kilowatt. We use power in our homes the way I used to drive my car.
Unfortunately, Google's FAQ doesn't really answer any questions about how the service works. But from statements like "Google is counting on others to build devices to feed data into PowerMeter technology" we can deduce that Google is proposing to correlate the total power reported by your smart meter with the data collected from individual loads inside the home. This is really cool, because not only does it make the information more generally accessible to you (in an easily accessible gadget), it proposes to tell you what it is in your house that is using that power, and when.
Google can do this because many national and state governments have begun to mandate smart meter programs. Most of us will probably have one on the side of our house pretty soon (especially if the stimulus bill speeds things up). Smart meters improve on their predecessors by automating meter reading, reporting consumption in intervals (typically 15 minutes), and sending "last gasp" failure notifications in the event of power outages.
But, just like their dumb ancestors, they will be owned by the utility. This means that the data generated will ultimately be under the control of the utility and hosted in their systems. The meter will talk to a utility data collector and from there its data will enter the utility's MDM system. The MDM will do a bunch of stuff with the data. However, from the point of view of you, the consumer, it will primarily send it to the billing system, which will now be able to account for time-of-day pricing. Also, it will send those last gasp signals to the outage management system so that outage reporting will be automatic, which will make analysis and response faster and more accurate. Google appears to be leveraging their position and market power to make deals with the utilities to access that data on our behalf.
The biggest reason for smart meter initiatives is demand management. The utilities have to carry expensive excess capacity so that they can meet peak loads. If they can use interval metering coupled with better pricing and feedback systems, they may be able to change our usage patterns and smooth that load which will reduce the necessary peak capacity overhang. Also, as alternative energy sources with less predictable availability like wind power come on line the utilities will need more "load shaping" options. Ultimately they might be able to reach directly out to your smart appliances and turn them off remotely if they need to.
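To make the "lumpy demand" problem concrete, here's a minimal sketch in Python with made-up numbers. Capacity has to be sized for the peak while revenue roughly tracks the average, so the utility's pain shows up as a low load factor (average divided by peak).

    # Hypothetical day of 15-minute interval readings (kWh per interval):
    # flat overnight usage with a sharp early-evening peak.
    readings_kwh = [0.25] * 68 + [1.50] * 8 + [0.25] * 20

    interval_hours = 0.25
    loads_kw = [kwh / interval_hours for kwh in readings_kwh]  # energy -> average power

    average_kw = sum(loads_kw) / len(loads_kw)
    peak_kw = max(loads_kw)
    load_factor = average_kw / peak_kw  # 1.0 would mean perfectly smooth demand

    print(f"average load: {average_kw:.2f} kW")
    print(f"peak load:    {peak_kw:.2f} kW")
    print(f"load factor:  {load_factor:.2f}")

Shift some of that evening block into the overnight hours and the peak drops while the average stays put - that's the whole game the utilities are playing with time-of-day pricing and load shaping.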
The laws that are mandating smart metering are focused on this demand-side management. Practically speaking, most utilities will close the consumer feedback loop by offering a simple portal on the utility's web site that will let you monitor your usage in the context of your bill. However, this isn't the part of the system the utilities are excited about. The hardware and the meters are the sexy part. The contracts to build the consumer portals are probably going to go to low-cost bidders who will build them to the bare minimum of the requirements. In some cases there may be provisions for customers to download historical data into a spreadsheet if they want to. A few enterprising customers will probably take advantage of this feature, but it's the hard way to do the kinds of correlations Google has in mind.
What should be apparent by now is that the government is mandating a good idea, but mandating it from a utility-centric rather than customer-centric point of view. There is naturally some overlap between utility and customer interests, but they are not identical. The utility is concerned about managing capital costs. They look at the interval data and the customer portal as a way to influence your time-of-use behaviors. They really don't care how much power you use, they just don't want your demand to be lumpy. On the other hand, we just want our bills to be low.
So, Google's initiative offers to take your data from the utility, combine it with data coming from devices in your home, and visualize it much more you-centrically. Their offering will do a better job than the utility's portal of illuminating structural efficiency problems in the home, as well as usage pattern problems once utilities start implementing variable pricing. In short, while the utility is attempting to influence your "when I use it" decision making, Google is offering to help you make better "what I plug in" decisions along with the stuff the utility cares about.
So, what's not to like?
Google needs two distinct sources of data to make this initiative work. They need access to your data via the utility that owns your smart meter. Plus they need data from equipment manufacturers that are going to make your appliances smart or provide your home automation gadgets. It doesn't bother me at all that they get this data, as long as the utility makes it available for anyone else that might be able to innovate with it too, including me. You never know, I might want to use it for a home made gadget that sets up an electric shock on my thermostat any time my last eight averaged readings are above some arbitrary threshold, you know, just to make me think twice before turning it up.
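For what it's worth, the "shock my thermostat" gadget is only a few lines of code once the readings are accessible at all. This is just a toy sketch - the reading feed and the zap are hypothetical placeholders - but it shows how little stands between open meter data and useful (or at least amusing) feedback loops.

    from collections import deque

    WINDOW = 8            # average the last eight interval readings
    THRESHOLD_KWH = 1.0   # arbitrary per-interval limit

    recent = deque(maxlen=WINDOW)

    def on_new_reading(kwh):
        """Called whenever a new interval reading arrives from the (hypothetical) meter feed."""
        recent.append(kwh)
        if len(recent) == WINDOW and sum(recent) / WINDOW > THRESHOLD_KWH:
            zap_thermostat()

    def zap_thermostat():
        # Placeholder for whatever feedback you'd actually wire up.
        print("Average usage over the last 8 intervals is too high - think twice!")

    # Simulate a run of readings trending upward; the alert fires on the last one.
    for reading in [0.4, 0.5, 0.6, 0.9, 1.1, 1.2, 1.3, 1.4, 1.5]:
        on_new_reading(reading)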
The little bit of info that Google provides on this initiative is at their .org domain, but there is virtually no information about how to participate in data standards making, API specification, device development, or that kind of thing. If you want to participate, you pick whether you are a utility, device manufacturer, or government, fill out a form, and wait for Google to get back to you. Imagine, the government fills out a form to participate in Google's initiative. Google has out-governmented the government.
As I described already, governments are insisting on demand-side management, but there don't appear to be any requirements to provide generic API's for meter readings or meter events. It's enterprise thinking rather than web platform thinking, and we run the risk of your data being treated like utility "content." "In other news today HBO struck an exclusive deal with XYZ Electric for all of their meter content, meanwhile Cinemax surprised industry watchers by locking up ABC Electric. As was reported last night, all of the remaining utilities signed with Google last week."
I'm guessing that Google is probably following the same pattern that they are using in the transit space and making (exclusive?) deals with the utilities to consume your data. You'll have to log into the utility portal to approve their access (or check a box on your bill). But Google, or other big players that can afford to buy in, will probably be the only choice(s) you have. There is no evidence on Google.org that they are trying to create an ecosystem or generalized approach that would let you, the owner of the data, share it with other value-added service providers. If the utilities implement this under government mandate it will suck. If they install smart meters with stimulus package money and still don't provide ecosystem API's it will be worse than suck.
Any thoughts on how this plays out on the smart appliance / home automation side? Are there healthy open standards developing or is there danger of large scale exclusivity on that side of the equation too?
Google will be more innovative with this data than the electric utilities, I have no doubt about that. But I can easily imagine other companies doing interesting, innovative things with my meter data as well, especially as Google achieves utility scale themselves. If my electric utility is going to create a mechanism to share my data with companies like Google, I want them to build a generalized set of API's that will let me share it with anyone.
A quick note to policy makers in states that haven't yet finalized their programs. When you think about what to mandate, consider a more consumer-centric model (if it's easier, think of it as a voter-centric model). You should be shooting for a highly innovative and generative space where contributions and innovations can come from large and small firms alike, and where no one is structurally locked out from participation. Don't lock us into a techno-oligarchy where two or three giant firms own our data and the possibility of innovation. If you insist on widely implemented, consumer-controlled API's and a less enterprise-centric model, you will not only encourage broader innovation at the consumer end, but you can use it to enhance competition on the generation side too.
Well, Google isn't really saying what they are doing, so maybe I got it wrong. Maybe they are about to get all "spectrum should be free" and roll out all kinds of draft API specifications for comment. If you think I got it wrong, don't hesitate to let me know in the comments.
Update (2/17): Asa pointed out in the comments that Google does provide more about their intent in their comments to the California Public Utilities Commission. I missed that link before and it gives some useful hints.
Most interesting is the repeated reference to Home Area Networks (HAN). In the original post I assumed Google was taking current smart meters as a given and obtaining data from the utility MDM after it went through their data collectors. That looks like it was incorrect. Instead, Google probably wants your meter to talk to your HAN via wireless(?) and then on to them from there.
If Google can use their market position to make that data accessible off the HAN rather than from the utility MDM, I think that's a good thing, mostly because it makes possible the direct consumption and analysis of the data on my side of my home network's NAT / firewall. I didn't really touch on privacy considerations in the original post, but given that PowerMeter appears trivial from a computational point of view, I'd much rather run it locally than share my every light switch click with Google. If I want to know how I'm doing relative to peers I can share that data then, in appropriately summarized form.
The other point in the CPUC comments is this statement: "PowerMeter... we plan to release the technical specifications (application programming interfaces or API) so anyone can build applications from it."
This is great, but I would love to see the API's sooner rather than later. They aren't really PowerMeter API's after all; if I'm reading the situation correctly, these are proposed API's and data specifications for smart meters and smart devices - the API's that Google (and others) will be consuming, not the ones they are offering. If a whole ecosystem is going to be enabled through those API's, then the ecosystem should have a hand in developing them.
In summary, if Google manages to create a level playing field for the development of an ecosystem based on this data, I'll applaud them. Some people will use their service and, like they do with other Google services, trade privacy for targeted ads. Others will choose other approaches to using the data that provide those functions without exporting as much (or any) data.
tags: energy, google, utilities
Mon, Feb 9, 2009
The Kindle and the End of the End of History
by Jim Stogdill | @jstogdill | comments: 24
This morning I was absentmindedly checking out the New York Times' bits blog coverage of the Kindle 2 launch and saw this:
“Our vision is every book, ever printed, in any language, all available in less than 60 seconds.”
It wasn't the main story for sure. It was buried in the piece like an afterthought, but it was the big news to me. It certainly falls into the category of big hairy audacious goal, and I think it's a lot more interesting than the device Bezos was there to launch (which still can't flatten a colorful maple leaf). I mean, he didn't say "every book in our inventory" or "every book in the catalogues of the major publishers that we work with." Or even, "every book that has already been digitized." He said "every book ever printed."
When I'm working I tend to write random notes to myself on 3x5 cards. Sometimes they get transcribed into Evernote, but all too often they just end up in piles. I read that quote and immediately started digging into the closest pile looking for a card I had just scribbled about an hour earlier.
I had been doing some research this morning and was reading a book published in 1915. It's long out of print, and may have only had one printing, but I know from contemporary news clippings found tucked in its pages that the author had been well known and somewhat controversial back in his day. Yet, Google had barely a hint that he ever existed. I fared even worse looking for other people referenced in the text. Frustrated, I grabbed a 3x5 card and scribbled:
"Google and the end of history... History is no longer a continuum. The pre-digital past doesn't exist, at least not unless I walk away from this computer, get all old school, and find an actual library."
My house is filled with books, it's ridiculous really. They are piled up everywhere. I buy a lot of old used books because I like to see how people lived and how they thought in other eras, and I guess I figure someday I'll find time to read them all. For me, it's often less about the facts they contain and more about peeking into alternative world views. Which is how I originally came upon the book I mentioned a moment ago.
The problem is that old books reference people and other stuff that a contemporary reader would have known immediately, but that are a mystery to me today - a mystery that needs solving if I want to understand what the author is trying to say, and to get that sense of how they saw the world. If you want to see what I mean, try reading Winston Churchill's Second World War series.
Churchill speaks conversationally about people, events, and publications that a London resident in 1950 would have been familiar with. However, without a ready reference to all that minutiae you'll have no idea what he's talking about. Unfortunately, a lot of the stuff he references is really obscure today, and today's search engines are hit and miss with it - they only know what a modern Wikipedia editor or some other recent writer thinks is relevant today. Google is brilliant for things that have been invented or written about in the digital age, or that made enough of a splash in their day to still get digitized now, but the rest of it just doesn't exist. It's B.G. (before Google) or P.D. (pre-digital) or something like that.
To cut to the chase, if you read old books you get a sense for how thin the searchable veneer of the web is on our world. The web's view of our world is temporally compressed, biased toward the recent, and even when it does look back through time to events memorable enough to have been digitally remembered, it sees them through our digital-age lens. They are being digitally remembered with our world view overlaid on top.
I posted some of these thoughts to the Radar backchannel list and Nat responded with his usual insight. He pointed out that cultural artifacts have always been divided into popular culture (on the tips of our tongues), cached culture (readily available in an encyclopedia or at the local library) and archived culture (gotta put on your researcher hat and dig, but you can find it in a research library somewhere). The implication is that it's no worse now because of the web.
I like that trichotomy, and of course Nat's right. It's not like the web is burying the archive any deeper. It's right there in the research library where it has always been. Besides, history never really operates as a continuum anyway. It's always been lumpy for a bunch of reasons. But as habit and convenience make us more and more reliant on the web, the off-the-web archive doesn't just seem hard to find, it becomes effectively invisible. In the A.G. era, the deep archive is looking more and more like those charts used by early explorers, with whole blank regions labeled "there be dragons".
So, back to Bezos' big goal... I'd love it to come true, because a comprehensive archive that is accessible in 60 seconds is an archive that is still part of history.
tags: big hairy audacious goals, emerging tech, publishing
Tue, Jan 27, 2009
The Army, the Web, and the Case for Intentional Emergence
by Jim Stogdill | @jstogdill | comments: 19
Lt. Gen. Sorenson, Army CIO, at Web 2.0 Summit
I didn't make it to the Web 2.0 Summit in San Francisco in November last year, so I didn't get to see Army CIO Gen. Sorenson present this Higher Order Bit talk in person. However, I thought it was cool that the Army made the agenda, and luckily someone posted the video. I finally got a chance to go through it. If you didn't see the talk, or don't have the 20-ish minutes to watch it now, here's a rough summary:
- Because of security and related concerns, it takes a very long time for the Army to take advantage of new generations of technology. We tend to deploy it widely about the time it's becoming obsolete.
- However, we are now beginning to take some advantage of Web 2.0 technologies in, for example, Stryker Brigade collaboration, battle command information sharing, and command and control.
I don't think that slow technology adoption is caused by fundamental first principles, so I don't think it has to remain true. But that's a long discussion for another time. In this post I'd like to focus on Army Battle Command, Web 2.0 and Gen Sorenson's connecting the two. Specifically I'd like to talk about lost opportunity and how the same technologies can constitute a generative platform in one setting and window dressing on a temple to determinism in another.
The lost opportunity I'm thinking of isn't whether Army Battle Command is Web 2.0 enough or not. It's that enterprises tend to see web technologies as an add-on to whatever they already have. Plus, they tend to focus on specific technologies rather than the combination of technology, process, and policy that makes a collection of technologies viable as a generative platform. "Let's add some Web 2.0 to this system; we'll use REST instead of SOAP." But the fundamental question that the web answers isn't whether REST is better than SOAP, but whether emergence is more likely to create innovation than enterprise planning, and the answer to that question is yes.
General Sorenson says in the video that "CPOF brings in Web 2.0 capability, chat, video, etc..." and then comments on "graphics, chat, use of tools..." and stuff like that to reinforce the idea that Command Post of the Future (CPOF) and the Battle Command suite it is part of has Web 2.0 attributes. Like many enterprise technologists, General Sorenson appears to be focusing on rich user experience and collaboration as the attributes that give CPOF a Web 2.0 imprimatur. While that's not unexpected, I think it leaves most of the benefits on the table and untapped.
Putting aside for the moment that CPOF isn't primarily delivered through a browser, a first step toward webness, the reality is that CPOF and other systems like it neither leverage accessible platforms nor contribute to them. It is a standalone (though distributed) computing system with gee whiz collaboration and VoIP. And while it offers some enterprise-style data services, it has none of the features of a generative platform. If I'm in the field I can't readily extend it or build on it to solve different problems, modify its proprietary underpinnings to suit my local needs, or quickly incorporate its information into other applications. If an important aspect of Web 2.0 is enabling the long tail, then this isn't Web 2.0.
I should say, this isn't a post about web 2.0 semantics. However, it's important to understand that the web's power derives from its evolution as a platform. Otherwise it's hard to see what is being missed by the military's IT enterprise (and many other large enterprises).
From the beginning the web has been generative. It wasn't CompuServe. With some basic skills you could add to it, change it, extend it, etc. Jonathan Zittrain, in his excellent book The Future of the Internet - and How to Stop It, reflects on why the Internet has experienced such explosive innovation. He argues that it's the powerful combination of user-programmable personal computers, ubiquitous networking with the IP protocol, and open platforms. Today, the emergence of open source infrastructure, ubiquitous and cheap hosting for LAMP-based sites, open API's, and the intentional harnessing of crowd wisdom has ushered in the web 2.0 era. It's an era of high-velocity low-cost idea trying that leverages the web itself as the platform for building world changing ideas and businesses.
The Internet hosts innovation like it does because it is an unconstrained complex system where complex patterns can grow out of easy to assemble simple things. Simple things are not only permitted, but they are encouraged, facilitated, and often can be funded with a credit card.
I've subscribed to the notion of Gall's Law for longer than I knew it was a law:
"A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system."
tags: defense, emergence, enterprise 2.0, enterpriseIT, web2.0, web2summit
Mon, Dec 15, 2008
The State of Transit Routing
by Jim Stogdill | @jstogdill | comments: 12
My brother called me a week ago and during the course of our conversation mentioned that he made the trek to the Miami Auto Show. He was complaining that he really wanted to take Tri-Rail (the commuter rail that runs along Florida's southeast coast) but it was just too hard to figure out the rest of the trip once he got off the train. "One web site for train schedules, another for buses, and another for a city map to tie it all together. It was just too much trouble to figure out, so I drove. I just want to go online and get directions just like I do for driving, but that tells me which train, which bus, etc."
Coincidentally, later in the day I downloaded the iPhone 2.2 upgrade with the new walking and public transit directions. So far, at least where I live, it's useless. The little bus icon just sits there grayed out, taunting me. I guess because SEPTA (our local transit authority for bus and regional rail) isn't giving data to Google?
My brother hadn't heard of Google Transit, but it turns out to have some coverage in Miami. Their coverage at this point seems to be transit-authority-centric and doesn't seem to have great support for mixed-mode trips or trips that cross transit system boundaries. I am curious though, is it being used? Let me know in the comments if you are using it to good effect.
Anyway, my brother's call on the same day as the iPhone update piqued my interest in the current state of the art for mixed-mode transit routing. After some mostly fruitless web searches, I reached out to Andrew Turner. I knew he'd know what was going on. This is what he had to say:
Routing is definitely one of the emergent areas of technology in the next generation of applications. So far, we've done a great job getting digital maps on the web, mobile locative devices, and comfortable users. One problem for a while has been the lack of data. You can have a great algorithm or concept, but without data it's useless. Gathering this data has been prohibitively expensive - companies like NAVTEQ drive many of the roads they map for verification and additional data. Therefore, if you wanted to buy road data from one of the vendors you had to have a large sum of money in the bank and know how you were going to monetize it. This stifled experimentation and the creation of niche applications.
Now that the data is becoming widely, and often freely, available innovation is happening at an increased pace.
For one example, consider typical road navigation. The global OpenStreetMap project has always had topology (road connectivity), but the community is now adding attribute data to ways, such as number of lanes, stop lights, turn restrictions, speeds, and directionality. Anyone can download this data to use with a variety of tools such as pgRouting. As a result people are rethinking standard routing mechanisms that assume travel from A to B via the fastest, or shortest, route. What if a user wants to take the "greenest" route as determined by lowest total fuel consumption, or the most scenic route based on community feedback?
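As a rough illustration of what "a different cost function" means in practice, here's a minimal Python sketch over a made-up four-edge road graph (pgRouting does the equivalent in SQL against real OpenStreetMap data): the graph and the shortest-path algorithm stay the same, and only the edge weight changes.

    import networkx as nx

    # A tiny hypothetical road graph with the kind of attributes OpenStreetMap ways carry.
    G = nx.Graph()
    G.add_edge("A", "B", length_km=2.0, speed_kph=50, scenic=0.2)
    G.add_edge("B", "D", length_km=2.5, speed_kph=50, scenic=0.3)
    G.add_edge("A", "C", length_km=3.5, speed_kph=90, scenic=0.9)
    G.add_edge("C", "D", length_km=3.0, speed_kph=90, scenic=0.8)

    # Derive alternative edge weights from the attributes.
    for u, v, d in G.edges(data=True):
        d["hours"] = d["length_km"] / d["speed_kph"]          # "fastest"
        d["unscenic"] = d["length_km"] * (1.0 - d["scenic"])  # "most scenic" = least unscenic

    print(nx.shortest_path(G, "A", "D", weight="length_km"))  # shortest: A-B-D
    print(nx.shortest_path(G, "A", "D", weight="hours"))      # fastest: A-C-D
    print(nx.shortest_path(G, "A", "D", weight="unscenic"))   # most scenic: A-C-D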
An area that has strongly utilized this idea has been disaster response. Agencies and organizations deploy to areas with little on-the-ground data, or with data that is now obsolete because of the disaster they're responding to. Destroyed bridges, flooded roads, and new or temporary infrastructure are just some of the things that are lost with typical navigation systems. Giving responders the capability to correct the data and instantly get new routes is vital. And these routes may need to be based on attributes different from those in typical engines - it's not about which route is fastest, but which roads will handle a 5-ton water truck.
This scheme was deployed in the recent hurricane response in Haiti in conjunction with the UNJLC, CartOng, OpenStreetMap and OpenRouteService.
Beyond simple automotive routing, we can now incorporate multi-modal transit. With 50% of the world's population now living in urban areas, the assumption that everyone is in a car is not valid. Instead people will be using a mixture of cars, buses, subways, walking, and bicycling. This data is also being added to OpenStreetMap, as well as to other projects such as Bikely and EveryTrail. GraphServer is one routing engine that will incorporate these various modes and provide routes.
And we're interfacing with all these engines using a variety of devices: laptop, PND (Personal Navigation Device), GPS units, mobile phones, and waymarking signs. PointAbout recently won an award in the Apps For Democracy for their DC Location Aware Realtime Alerts mobile application that displays the route to the nearest arriving metro.
What's also interesting is the potential of these routing tools beyond specific individual routes. Taken in aggregate, the routing distances form a new topography of the space. Given a point in the city, how far can I travel in 20 minutes? In 40 minutes? For less than $1.75? This type of map is known as an isochrone. Tom Carden and MySociety developed London Travel Time Maps that let users highlight the spots in London that fall within a given range of house prices and travel times.
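A sketch of the same idea in code, assuming a toy graph whose edge weights are minutes of travel: an isochrone is just "everything reachable within a budget," which amounts to a shortest-path search with a cutoff.

    import networkx as nx

    # Hypothetical graph: nodes are places, edge weights are minutes of travel between them.
    G = nx.Graph()
    G.add_weighted_edges_from([
        ("home", "bus_stop", 5), ("bus_stop", "downtown", 12),
        ("downtown", "museum", 8), ("home", "park", 18),
        ("downtown", "stadium", 25),
    ], weight="minutes")

    # Everything reachable from "home" within a 20-minute budget - one band of an isochrone map.
    reachable = nx.single_source_dijkstra_path_length(G, "home", cutoff=20, weight="minutes")
    print(reachable)  # {'home': 0, 'bus_stop': 5, 'downtown': 17, 'park': 18}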
Despite these apparent benefits, there is a large hurdle. Like road data, there has been a lack of openly available transit data to power applications and services. Providers like NAVTEQ and open projects like OpenStreetMap are possible because public roads are observable and measurable by anyone. By contrast, the many and varied local transit agencies own and protect their routing data and are reluctant to share it. Google Transit has made great strides in working with transit authorities to expose their information in the Google Transit Feed Specification - at least to Google. That does not mean the data has to be publicly shared, and in many cases it hasn't been.
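For the curious, a Google Transit Feed Specification feed is nothing exotic - just a zip of CSV text files. This sketch reads the basic fields out of the stops.txt from a feed you've already downloaded and unzipped (the path is hypothetical), which is a hint of how easy it would be for agencies to publish the data openly.

    import csv

    # stops.txt is one of the required files in a GTFS feed; these are its core columns.
    with open("feed/stops.txt", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            print(row["stop_id"], row["stop_name"], row["stop_lat"], row["stop_lon"])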
However, not even the allure of the widely admired Google Transit has been enough to induce some transit authorities to share their prized data. The Director of Customer Service of the Washington Metro Area Transit Authority (WMATA) plainly states that working with Google is "not in our best interest from a business perspective."
Hopefully, this situation will change, first through forceful FOIA requests, but later through cooperation. One step in this direction has been TransitCamps. And Portland's TriMet is a shining example, with a Developer Resources page detailing data feeds and API's.
These experiments are just the beginning of what is being pushed in this space. Routing is one of those features that users may not realize they need until they have it, and then they'll find it indispensable. The ability for a person to customize their valuation of distance to assist in making complex decisions and searches is very powerful.
For more projects and tools, check out the OpenStreetMap routing page, Ideas in transit and the OGC's OpenLS standards.
tags: emerging tech, geo
Tue, Nov 25, 2008
My Web Doesn't Like Your Enterprise, at Least While it's More Fun
by Jim Stogdill | @jstogdill | comments: 20
The other day Jesse posted a call for participation for the next Velocity Web Operations Conference. My background is in the enterprise space, so, despite Velocity's web focus, I wondered whether there might be interest in a bit of enterprise participation. After all, enterprise data centers deal with the same "Fast, Scalable, Efficient, and Available" imperatives. I figured there might be some room for the two communities to learn from each other. So, I posted to the internal Radar authors' list to see what everyone else thought.
Mostly silence. Until Artur replied with this quote from one of his friends employed at a large enterprise: "What took us a weekend to do, has taken 18 months here." That concise statement seems to sum up the view of the enterprise, and I'm not surprised. For nearly six years I've been swimming in the spirit-sapping molasses that is the Department of Defense IT Enterprise so I'm quite familiar with the sentiment. I often express it myself.
We've had some of this conversation before at Radar. In his post on Enterprise Rules, Nat used contrasting frames of reference to describe the web as your loving, dear old API-provisioning Dad, while the enterprise is the belt-wielding, standing-in-the-front-door-when-you-come-home-after-curfew stepfather.
While I agree that the enterprise is about control and the web is about emergence (I've made the same argument here at Radar), I don't think this negative characterization of the enterprise is all that useful. It seems to imply that the enterprise's orientation toward control springs fully formed from the minds of an army of petty controlling middle managers. I don't think that's the case.
I suspect it's more likely the result of large scale system dynamics, where the culture of control follows from other constraints. If multiverse advocates are right and there are infinite parallel universes, I bet most of them have IT enterprises just like ours; at least in those shards that have similar corporate IT boundary conditions. Once you have GAAP, Sarbox, domain-specific regulation like HIPAA, quarterly expectations from "The Street," decades of MIS legacy, and the talent acquisition realities that mature companies in mature industries face, the strange attractors in the system will pull most of those shards to roughly the same place. In other words, the IT enterprise is about control because large businesses in mature industries are about control. On the other hand, the web is about emergence because in this time, place, and with this technology discontinuity, emergence is the low energy state.
Also, as Artur acknowledged in a follow up email to the list, no matter what business you're in, it's always more fun to be delivering the product than to be tucked away in a cost center. On the web, bits are the product. In the enterprise bits are squirreled away in a supporting cost center that always needs to be ten percent smaller next year.
tags: operations, web2.0
Tue, Nov 18, 2008
DIY Appliances on the Web?
by Jim Stogdill | @jstogdill | comments: 9
Or, My Enterprise is Appliancized, Why Isn't Your Web?
I wrote a couple of posts a while back that covered task-optimized hardware. This one was about a system that combined Field Programmable Gate Arrays (FPGA's) with a commodity CPU platform to provide the sheer number-crunching performance needed to break GSM encryption. This one looked at using task-appropriate, efficient processors to reduce power consumption in a weather-predicting supercomputer. In these two posts I sort of accidentally highlighted two of the three key selling points of task-specific appliances, sheer performance and energy efficiency (the third is security). The posts also heightened my awareness of the possibilities for specialized hardware, and some of my more recent explorations, focused on the appliance market in particular, got me wondering if there might be a growing trend toward specialized appliances.
Of course, specialized devices have been working their way into the enterprise ever since the first router left its commodity Unix host for the task-specific richness of specialized hardware. Load balancers followed soon after, and then devices from companies like Layer 7 and DataPower (now IBM) took the next logical step and pushed the appliance up the stack to XML processing. These appliances aren't just conveniently packaging intellectual property inside commodity 1U blister packs; they are specialized devices that process XML on purpose-built Application Specific Integrated Circuits (ASICs), accelerate encryption / decryption in hardware, and encapsulate most of an ESB inside a single tamper-proof box whose entire OS is in firmware. They are fast, use a lot less power than an equivalent set of commodity boxes, and are secure.
Specialization is also showing up in the realm of commodity database management systems. At last year's Money:Tech, Michael Stonebraker described a column-oriented database designed to speed access to pricing history for back testing and other financial applications. In this case the database is still implemented on commodity hardware. However, I think it's interesting in the context of this conversation on specialized computing because it speaks to the inadequacy of commodity solutions for highly specific requirements.
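The intuition behind the column store is easy to demonstrate. A back-testing query like "average closing price" touches one field across every row, so storing that field contiguously turns the query into a single tight scan instead of a walk through whole records. Here's a rough, made-up illustration in Python (not Stonebraker's actual system):

    import numpy as np

    n = 100_000
    # Row-oriented: one record per tick, every field carried along.
    rows = [{"symbol": "XYZ", "open": 10.0, "high": 10.5, "low": 9.8, "close": 10.2}
            for _ in range(n)]

    # Column-oriented: the close prices alone, packed into one contiguous array.
    close_column = np.full(n, 10.2)

    row_avg = sum(r["close"] for r in rows) / n   # hops record to record, field by field
    col_avg = close_column.mean()                 # one cache-friendly scan
    print(row_avg, col_avg)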
A device from Netezza is also targeted at the shortcomings of the commodity DBMS. In this case the focus is on data warehousing, but it takes the concept further with an aggressive hardware design that is delivered as an appliance. It has PostgreSQL at its core, but it takes the rather radical step of coupling FPGA's directly to the storage devices. The result, for at least a certain class of query, is a multiple-order-of-magnitude boost in performance. I think this device is noteworthy because it puts the appliance way up the stack and is perhaps a harbinger of further penetration of the appliance into application-layer territory.
While appliances are expanding their footprint in the enterprise, it seems like the exact opposite might be happening on the web. Maybe the idea of a closed appliance is anathema to the open source zeitgeist of the web, but in any case, the LAMP stack is still king. Even traditional appliance-like tasks such as load balancing seem to be trending toward open source software on commodity hardware (e.g. Perlbal).
I can't help but wonder though, at the sheer scale that some web properties operate (and at the scale of the energy cost required to power them), can the performance and cost efficiency of specialized hardware appliances be ignored? Might there be a way to get the benefits of the appliance that is in keeping with the open source ethos of the web?
If you've ever uploaded a video to YouTube and waited for it to be processed, you have an idea of how processor hungry video processing is on commodity hardware. I don't know what Google's hardware and energy costs are for that task, but they must be significant. Same goes for Flickr's image processing server farm, and I would guess for Google's voice processing now that its new speech services have launched. If the combined hardware and electricity costs are high enough, maybe this is a good place to introduce specialized appliances to the web?
But how to do that in a way that is consistent with the prevailing open source ethos and that still lets a firm continue to innovate? I think an answer might be sort of DIY writ large; a confluence of open source and open hardware that works like an undocumented joint venture based on the right to fork. Think Yahoo and the Hadoop community, or JP Morgan and friends with AMQP, but with hardware, and you get the idea. Such a community could collaborate on the design of the ASICs and the appliance(s) that hosted them, and even coordinate production runs in order to manage unit costs. Perhaps more importantly, specifying the components openly would support cost sharing across these companies while still preserving flexibility in how they were deployed and, ultimately, generativity and innovation for future uses.
There are probably a bunch of reasons why this is just silly speculation, but Google's efforts with power supply efficiency might be seen as at least a bit of precedent for web firms dabbling in hardware and hardware specifications. In fact, Google's entire stack, from its unique approach to commodity hardware to software infrastructure like GFS, might be thought of as a specialized appliance that suits the specific needs of search. It's just a really, really big one that "ships" in a hundred-thousand-square-foot data center form factor.
tags: emerging tech, energy, open hardware, open source
Thu, Nov 13, 2008
Apps for Democracy
by Jim Stogdill | @jstogdill | comments: 3
Vivek Kundra, the District of Columbia's CTO, isn't just talking about transparent government and participative democracy, he's working hard to make DC's massive data stores transparent and inviting your participation.
I first heard about Vivek's push for transparency when he spoke at an Intelligence Community Conference in September (I just happened to be speaking on a panel thanks to a twitter-induced serendipitous introduction to one of the conference organizers - @immunity). He was there in a sort of "one government entity to another" role to demonstrate that data could be shared and that it is valuable to do so.
I was impressed with the risks he was taking to push hard for the "democratization of data" and for what he was calling The Digital Public Square. What came through really clearly was that he didn't just view this as a technology exercise, but as a way for citizens to participate in real ways and to increase government accountability. It was an engaging and refreshing talk.
It's not exactly news at this point that he came up with $20,000 to offer prizes for the best applications to be built on top of the district's data. After all, the submissions have been long closed and the winners have already been announced. However, I thought it might be worth pointing out that you still have until tomorrow to vote for the two people's choice award winners.
I thought it was kind of fun to just poke around in the list of submissions and see what people came up with. As you can imagine, many of them are geo-spatial displays of the district's data stores, but there are some other ideas in there as well. Take a look, see what you think, and get your people's choice vote in.
And just because the contest is over doesn't mean it's too late to build something. Take a look at the catalog of data and see what comes to mind. This is just the beginning (Mayor Nutter of Philly, I'm looking at you...).
tags: emerging tech
Mon, Nov 10, 2008
My Apple Holiday Wish
by Jim Stogdill | @jstogdill | comments: 20
I've been searching for a personal backup solution that doesn't suck for, well, pretty much since I got my first computer in the 80's, and I'm still looking.
A few years ago I was cleaning out old crap and ran across boxes and boxes of 800kb floppies labeled "1988 backup disk x." The trash / recycling picker uppers got those along with a pile of zip disks, various CD's, DVD's, a USB drive or two, and a couple of bare SATA drives that I was too cheap to buy housings for. Oh, and there was even a pile of tapes in some long forgotten format in there.
After a few years of manually copying stuff to multiple USB drives, last year I was completely seduced by the "it's like RAID but you don't need identical drives" beauty of the Drobo. Three failures later (including one with smoke), plus a nasty virtual tinnitus that comes and goes as its disks transition through a perfect cabinet-resonating frequency, incompatibility problems with Time Machine and Airport Extreme, and access speeds that are too slow to serve Final Cut... screw it. Now it mostly just sits there powered down, making a Drobo-shaped dust-free spot on my desk. It's too buzzy to listen to but too expensive to Freecycle.
Next up, Time Capsule. Still (even more) useless for Final Cut and that sort of thing, but it's doing an ok job with backups - at least of the straight Time Machine variety. There are still a few issues though...
First off, I don't really trust that single spinning platter. It will die some day. Plus, it's in my house about ten feet from where my laptop is usually parked so my eggs are all in a single fire / theft / flood basket.
Apple's Mobile Me and the Backup program that comes with it theoretically provide a solution to this issue, but unfortunately it sucks. It's slow, much slower than a local Time Capsule backup, because it relies on an Internet connection. Also, it effectively requires my machine to be running all the time so that it can conduct its backups in the middle of the night when I won't be competing for bandwidth or CPU cycles.
Even worse, it fails all the time. I don't know why, but it's finicky. A brief connectivity hiccup (or whatever) and I wake up the next day to find that my multi-hour backup died. Finally, it's too small to be useful for more than a few key critical files. I have a few hundred gigabytes of data I'd like to secure and my Mobile Me account is limited to twenty.
So Apple, I don't usually resort to begging, but here's your chance to fix backup for me once and for all. Just update the firmware in my Time Capsule so that my fast Wi-Fi-based local backups can be incrementally streamed to either an expanded Mobile Me account or to a separate S3 account (or whatever) whenever it's sitting at home with my network connection to itself.
I can't leave my laptop connected for the days it would take to stream all those hundreds of gigabytes, but Time Capsule is just sitting there with my Internet connection doing nothing while I'm at work anyway, so give it something to do. This way I'll have the best of both worlds: fast, reasonably secure backups to my local Wi-Fi-connected Time Capsule when I'm home, and don't-need-to-think-about-it remote storage that can take its time when I'm not. At the risk of way overreaching, it could even work in both directions, so that if I'm on the road for an extended period, Time Machine could back up critical changes directly to Mobile Me, which could then in turn incrementally stream them back to my Time Capsule.
Ok, that's it. A simple idea I think. Can I have it by Christmas?
By the way, if the thought of all those gigabytes in your Mobile Me data centers makes you blanch (and the idea of using S3 is anathema to Apple's do-it-all culture), how about a Time Capsule-based distributed hash overlay network? If every Time Capsule shipped with the option of turning on a separate partition representing about 1/3 of the disk, you could put a PlanetLab-like distributed file system in there. My files would be split into chunks, encrypted, and distributed around to other people's Time Capsules while some of their stuff was on mine. Sort of an inverted BitTorrent for backups, no data center required.
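Here's a rough sketch, under plenty of assumptions, of what the client side of that overlay might look like: split the file into chunks, encrypt each one so peers only ever hold ciphertext, and address chunks by content hash so a manifest is all you need to reassemble the file later. (The cipher choice and the send_to_peer call are stand-ins, not a real protocol.)

    import hashlib
    from cryptography.fernet import Fernet  # third-party library; one possible cipher choice

    CHUNK_SIZE = 1 << 20            # 1 MB chunks
    key = Fernet.generate_key()     # stays with the owner; peers never see plaintext
    cipher = Fernet(key)

    def send_to_peer(chunk_id, blob):
        # Stand-in for the distributed hash table / overlay network.
        print(f"would store {len(blob)} bytes under {chunk_id[:12]}... on some peer")

    def chunk_and_encrypt(path):
        manifest = []               # ordered chunk IDs needed to rebuild the file
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                encrypted = cipher.encrypt(chunk)
                chunk_id = hashlib.sha256(encrypted).hexdigest()
                manifest.append(chunk_id)
                send_to_peer(chunk_id, encrypted)
        return manifest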
That would be cool but I know you won't do it. And, from the category of "things you are even less likely to do," if you opened up the Time Capsule firmware to third parties someone else probably would.
tags: emerging tech