Australian national body Standards Australia held an industry forum today on Open XML. The agenda and invitation for this are up at Tom Worthington’s website.
The Invitation
Here is the most interesting part of the invitation:
This forum is being conducted by Standards Australia as a courtesy to stakeholders. It is an extraordinary meeting that we are not required to hold, but do so to provide an open process. We appreciate your attendance and expect that you appreciate our effort in making this opportunity available to you.
Standards Australia values its vote as a participating member of all international committees, and does not exercise it injudiciously. We provide considered Australian viewpoints that are beneficial to Australian stakeholders, including industry, government, academia and the general community, through the facilitation of trade and the inclusion of clear Australian requirements in international standards.
The JTC1 process has established that the ECMA-376 document is not contradictory to existing standards and ECMA has responded to a number of technical considerations raised in the initial consultation period. This forum is not to debate the merits of the JTC1 decision making process or the validity of the ECMA response.
While technical comments are welcomed, it would be entirely counter productive to use this forum to reiterate technical comments that have already been raised and are likely to be debated in every JTC1 member body in some form.
We are looking for creative, positive contributions that emphasise our commitment to representing truly Australian views to the international community.
More on that later.
The Speakers
The meeting had 30 to 35 attendees (I didn’t count, oops), drawn from the membership of existing technical committees and from people who had sent in comments to the process so far. It was not a voting meeting at all, just a meeting to help build consensus and to give more information to higher committee members. (However, participants can submit comments by Aug 21 for consideration by the Australian CITeC Standards Sector Board.)
The meeting was a three-hour affair, with the first half given over to invited speakers and the second half to questions, answers and commentary.
The first half started with an introduction by Standards Australia’s Alistair Tegart, who provided good strong chairmanship that left most people frustrated that they had not had a chance to say more, but which gave everyone a chance to make their most important comments in the allotted time. The interesting thing was that discussion of technical minutiae was strongly discouraged (wrong meeting for that), which was a nice break for me. Discussion was civil, everyone friendly in the coffee break, and frank in the meeting.
I had been invited to speak on the subject of General overview of the standards process because of my involvement as Australian delegate to (what is now) SC34 in the 1990s and my continuing involvement with standards. A nice comment afterwards (by a law professor!) was that mine was the only talk with new content. I tried to present an SC34-based perspective on standards: what SC34 standards are, how the preference for enabling standards rather than applications has been overtaken by the fast-track process, the basic standards posture for Australia in SC34 in the mid-90s (the need for simplicity to suit our small development teams; I didn’t mention support for regional neighbours, though it was important) and how each country has different requirements. (For example, some countries have a requirement that they not be blocked out of international contracts because of the lack of standards.)
Then a quick mention of some of the issues that I have raised on this blog: that ISO standards for documents are voluntary, that standards form a library of choices, that the mere existence of alternative standards does not prevent any group from choosing one over the other, that standards such as PDF and Torx are not open in the sense of allowing arbitrary change but are nevertheless valuable, and so on. I emphasized again that the ISO process is a win/win system in which attempts by one group to stymie another’s needs do not fit.
That took about 15 minutes. Then there were speakers on the case against the adoption of Open XML (the scheduled IBM speaker was hospitalized, so we were treated to an emergency podcast from Rob Weir, basically the same content as his Technical Case against OOXML) and on the case for adoption: a quick tag team with an MS representative, then the local CompTIA representative, then a CEO. The CEO, Richard White from CargoWise EDI, was particularly forceful about how it would help his business.
The Discussion
Then after coffee we had over an hour of moderated discussion. By and large it went as expected: people from local industry welcomed it as solving a real problem, people from business rivals of MS didn’t like it, people who identified themselves with Open Source didn’t like it, and people from academia or standards bodies seemed to think that having it as a standard would somehow force them to use it (I didn’t get this).
I had to gag myself a few times. The local Google Maps operation was represented, but I was quite surprised to hear Lars Rasmussen say how difficult it would be to implement Open XML…surprised because he had told me last year how Google Maps used VML to deliver to IE and how the format was simply not a problem. Here are the first lines sent for a Google map, for confirmation; note the namespace declaration and the stylesheet reference:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="https://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml">
<head><meta http-equiv="content-type" content="text/html; charset=UTF-8"/><noscript>
<meta http-equiv="refresh" content="0; URL=https://maps.google.com.au/?output=html"/></noscript>
<title>Google Maps</title>
<link rel="stylesheet" type="text/css" href="https://www.google.com/intl/en_au/mapfiles/86/maps2.css" />
<style type="text/css">body{margin-top: 3px;margin-bottom: 0;margin-left: 8px;}
#vp {position: absolute;top: -10px;left: -10px;width: 1px;height: 1px;visibility: hidden;}
#homestate {display: none;}
v:* {behavior: url(#default#VML);} ...
It seemed strange to be saying that a technology was too big to implement when you were in fact using that technology successfully. Maybe the Google speaker didn’t realize that VML is part, though an obsolescent one, of DIS 29500. I think what happens is that “implement” gets stretched to mean “implement all the parts of a specification”: so “It is too big to implement” means “It is too big to implement it all” in Googlespeak. But Australians aren’t going to implement a full new office suite. It is too big, even if you just used ODF; and Open Source people will more naturally join the existing Open Source and Free projects rather than set up new ones, it seems to me. For Australian requirements, “full implementation from scratch” is an imaginary and spurious requirement.
What has maybe slipped Lars’ mind is that most integrators will use DIS 29500 in the same way that Google Maps would be using it: just cherry-picking the parts that are needed (in their case, the subset of VML). And, in particular, when you are using it as an end-format, you only need to “implement” (i.e. generate) the elements that correspond to your input. Not the whole thing.
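To make that concrete, here is a sketch of the sort of minimal, cherry-picked output an end-format generator might produce: a WordprocessingML main document part containing nothing but plain paragraphs. The element names are from the WordprocessingML main namespace; the text is invented for illustration, and a real package would of course also need the OPC content-type and relationship parts wrapped around it:
<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <!-- one w:p per input record or paragraph; nothing else is generated -->
    <w:p><w:r><w:t>First paragraph from the source data.</w:t></w:r></w:p>
    <w:p><w:r><w:t>Second paragraph from the source data.</w:t></w:r></w:p>
  </w:body>
</w:document>
Generating that from a database report or a web form requires no knowledge of the thousands of other elements in the specification.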
When I was talking to the local Google people last year, they told me that Google doesn’t actually have any full-time people allocated to standards work in general. I gathered that was a little pedestrian for them, because they made their money by innovating, not by following the pack: sounds like a recipe for a QA disaster to me. I don’t know whether their foray into web-based applications will make them a little more savvy with standards.
Another Google guy (who turned out to be a ring-in: Georg Greve, initiator and president of the Free Software Foundation Europe, who gfim says was flown in from Switzerland by Google especially for the occasion) stood up and recommended that we track the Indian standards body’s concerns about binary mappings. Again I had to gag myself (actually, Alistair did it for me) because I believe I was actually present at the meeting in Delhi where that issue was raised: by the Indian representatives of Sun and IBM. I am afraid I couldn’t help thinking this was the classic Colbert “Echo Chamber” effect, similar to Wikipedia astroturfing: one member of a collective puts something up in one forum, then other members of the same collective bring up the first as independent evidence. (In this case with the added twist that Sun and IBM were not mentioned: a lay listener could easily have had the impression that this was some kind of position adopted by the BIS, whereas, as far as I know, it has not been so adopted. I hope Georg will be a little more careful with attributions in the future, because people can so easily get the wrong impression.)
An interesting comment from the local IT29? committee head, Jamie (surname illegible, sorry), was that educational users needed to guarantee interoperability and could not force students to purchase particular programs. I didn’t quite see the logic of how this meant that DIS 29500 should not be adopted at ISO. Marcus Carr from Allette Systems (whom I consult for and teach standards seminars for) sits on a local IT committee too and responded that students would be better served by PDF if guaranteed interoperability were the issue.
A few comments later, I got a chance to mention that none of the XML formats today provide guaranteed interoperability in the sense of visual fidelity, for reasons that readers of this blog will be familiar with: every application supports different feature sets, and has different fonts, hyphenation dictionaries, kerning tables, line-break algorithms, and so on. Plus the formats are extensible, so documents can contain all sorts of strange media types. Standard documents may perhaps be a necessary condition, but they certainly are not sufficient. What is needed are profiles, which restrict the features, require certain application behaviours, require certain fonts, and call for uncramped page designs that reduce the chance of page overflow on different systems.
Several speakers raised the issue of IP rights and worries about the MS Covenant Not to Sue and the Open Specification Promise. MS gave the usual response: we have run it past lots of external lawyers who say it is fine, and since the OSP is so similar to Sun’s equivalent, why don’t you have the same concerns about ODF? I think Standards Australia has been playing a little coy here, because they are trying to be scrupulous not to be seen to take sides, I guess. I had asked them in email to have a clear position on standards and IP from the legal perspective, and they ended up saying, in effect, that for Standards Australia it is JTC1’s responsibility and competency to evaluate the IP issues of drafts submitted for fast-tracking, and not a technical issue for voting.
I think a fuller answer would have been better. People who are interested in this area should first read the excellent webpage by OASIS lawyer and anti-Open XML conduit Andy Updegrove, especially on the Allied Tubemakers and Dell cases. Standards don’t exist in a vacuum, and MS’s participation in the standards process, together with its very strong and constant statements, would be considered by any court.
Another aspect of the IP discussions is that typesetting and desktop applications are hardly new things now. With a 20-year limit, patents from before 1987 have expired, and 1987 is well after the invention of all of the basic ideas in office suite software. (Last week in Thailand, MS’s Oliver Bell was asked a question on this issue, and IIRC he said that MS actually only uses its patent portfolio defensively and has never sued on IP. Does anyone have a list on this?)
Jamie ? pointed out that participation in a standards body does not nullify the IP; however, the real issue is the scant chance that submarine patents would be enforceable. So add up the two covenants, the external legal opinion, the vetting by Ecma and ISO/IEC JTC1, the age of likely patents, the recent stronger court awareness of junk patents, the difficulty of enforcing submarine patents, the multiple statements made by MS executives and staff at the highest level, and the basic fact that a document standard is concerned more with schemas and general description than with methods or algorithms, and I don’t know how much more it would be possible to do to satisfy someone.
One comment mentioned the idea that the MS Covenant Not to Sue etc. does not cover external technologies. Of course not, but that is no different from ODF and HTML.
Other speakers brought up a few of the usual suspects. autoSpaceLikeWord95 made its scheduled appearance, along with statements that made it clear that the speaker had never read the spec and was parroting. There is an easy way to tell a parrot in this area: they will say something like “The standard is full of compatibility elements like autoSpaceLikeWord95 which are undocumented and prevent implementation”. In fact, there are 64 compatibility elements, and IIRC all but two are adequately documented with explanations of their general functionality. autoSpaceLikeWord95 is optional and is clearly marked as deprecated: it seems to be a warning flag that a document was originally created by Word 95 with this bug and that it has never been corrected. The bug relates to the treatment of the full-width characters used in East Asian typesetting (zenkaku): certainly for Australian users it is utterly extraneous to our national requirements.
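For the record, here is roughly what such a flag looks like in practice: a sketch of a word/settings.xml fragment, with the deprecated compatibility element sitting as an optional marker that a consumer can simply ignore (a real settings part has many other elements, omitted here):
<w:settings xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:compat>
    <!-- deprecated legacy flag: present only if Word 95 autospacing behaviour should be reproduced -->
    <w:autoSpaceLikeWord95/>
  </w:compat>
</w:settings>
A consumer that does not care about reproducing Word 95 line layout loses nothing by skipping it.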
The issue of the definition of various functions in SpreadsheetML came up too, from a localization perspective. (Again, irrelevant to Australia.) If the moderator hadn’t been so tough, I would have liked to ask whether the speaker wanted to remove or rename the existing functions (and break the spreadsheets of everyone who uses them) or merely to add better localized functions (which belongs in a maintenance phase).
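The reason renaming would be so destructive is that the English function names are serialized into the stored formulas themselves. Here is an invented worksheet fragment of the kind that sits in existing files (the cell references and cached value are made up for illustration):
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <!-- the formula text, English function name and all, is part of the saved file -->
      <c r="A1"><f>SUM(B1:B10)</f><v>42</v></c>
    </row>
  </sheetData>
</worksheet>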
On the issue of maintenance, I did get another opportunity to spout. I said that it is too early to tell what effective systems for maintenance will emerge. When OASIS and ECMA submitted their standards, they also submitted information about how maintenance would occur: there would be collaboration with JTC1 SC34, for example. I said I thought this was only practicable for fast corrigenda (which don’t add functionality, just fix the text and clear up mistakes) and that the approach OASIS seems to be taking, which would involve resubmitting ODF 1.1 and ODF 1.2 etc. for fast-tracking each time, was probably the more realistic thing to expect. However, I noted that DIS 29500 has quite a strong extension regime, indeed a whole part (Part 5), and starts from a much more complete position than ODF: so one would expect complete updates to be rare events, perhaps aligned to the three-year product cycle.
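As an aside on that extension regime: Part 5 lets a document carry markup from a newer or third-party namespace alongside a fallback for consumers that don’t understand it. A rough sketch, using an invented extension namespace and element purely for illustration:
<mc:AlternateContent xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:ext="http://example.com/2007/fancy-extension">
  <mc:Choice Requires="ext">
    <!-- processed only by applications that understand the ext namespace -->
    <ext:sparkline data="B1:B10"/>
  </mc:Choice>
  <mc:Fallback>
    <!-- what every other consumer processes instead -->
  </mc:Fallback>
</mc:AlternateContent>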
The main Google guy at some stage made a good point about overlapping standards, along the lines that having multiple standards for programming languages was OK because the differences could be justified, but he had not heard arguments why Open XML was so different from ODF that it could be justified.
A few others also had comments that could be fairly reduced to “We don’t need it, therefore we do not support it becoming an ISO standard, therefore we oppose it becoming an ISO standard”, which is a non sequitur.
Baseline formats and downstream formats
A very interesting point was made by the National Archives representative. They don’t have the resources to cope with both Open XML and ODF, he said, so they would adopt ODF for their future format and didn’t support Open XML becoming a standard. Again, I don’t see how the standardization of Open XML forces their policy in any way. Standards Australia is not even a government agency and has no legal clout over the National Archives: moving to ODF where possible seems a reasonable choice (well, ODF 1.3).
Marcus Carr objected to this. He spoke from the perspective of document processing in the early 90s and the practical difficulties of dealing with Word documents (with the various hijinks: converting .DOC to the Rainbow DTD, converting .DOC to RTF and then processing that, etc.), and brought up the key processing issue that I think almost all the commentators on Open XML miss: the need for a full-fidelity baseline schema to allow the most flexibility in downstream processing.
Now this is a pipeline approach that has proven itself over the 15 years we have been using it. Elephantine readers may remember a blog of mine from a year ago:
A typical strategy when converting from XML into some structured text format is to have three transformations:
* first, convert the XML into ideal XML: resolve links as needed, remove extraneous elements and attributes, convert cases, generate headings and other things that need to be generated
* second, convert that ideal XML into an XML-ized version of the output format
* third, convert the output XML into the text format, delimiting and indenting as needed
If the input data is non-XML, then we have an additional stage where we first convert the data into a “baseline” XML format that maintains all the information from the data source (it could be a database, another format, a binary, no matter). You never know what information you will need, and you don’t want to trust someone else’s abstraction but to work with as unmediated a form of the data as possible.
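To give a flavour of what the first transformation looks like in practice, here is a minimal sketch of the kind of XSLT stylesheet involved: an identity transform with a couple of overriding templates that strip extraneous attributes and generate content that the downstream stages expect. The element and attribute names (section, title, font, color, heading) are invented for illustration; a real stylesheet over a baseline format like WordprocessingML would have many more templates:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- identity transform: copy everything through by default -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- strip extraneous presentational attributes -->
  <xsl:template match="@font|@color"/>
  <!-- generate the heading that later stages expect as element content -->
  <xsl:template match="section">
    <section>
      <heading><xsl:value-of select="@title"/></heading>
      <xsl:apply-templates select="node()"/>
    </section>
  </xsl:template>
</xsl:stylesheet>
The later stages are similar small stylesheets chained together, each one easy to test on its own.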
So Marcus’ comment was that we (the system integration and document processing community) need Open XML as an ISO standard, because it alone provides an adequate baseline format for subsequent transformations. So the Australian National Archives could well decide to archive data using ODF, but they may well decide to implement their conversion to ODF by going through Open XML. One standard would be useful for one purpose (saving future archives), the other standard for a different purpose (opening existing files in the archive). If the Australian National Archive is moving to ODF 1.0 fast, I hope they don’t throw away the original binaries…
Breadcrumbs
So all in all, I think the day was a worthwhile exercise, and a good opportunity to help us all escape groupthink.
I suspect, from the tone of the invitation and comments made at the meeting and elsewhere, that when Standards Australia looks at the comments that people send in (deadline August 21), they will be quite uninterested in comments that question JTC1 decisions and in comments on issues that have no local relevance. I gather they may not be much impressed by arguments that can be refuted by precedent: for example, that there should be no overlapping standards. However, we shall see, and I don’t know anything about the CITeC Standards Sector Board.
The most enlightened part of their approach, and I think this is pretty novel, is that Standards Australia seem very aware that a multi-national campaign to discredit a standard (on the one hand) and to promote it (on the other) changes the role of an individual standards body in vetting it, and so changes the requirements for a review. In the case of a normal standard review, you raise as many (sensible) flaws as you can, because you don’t know whether an issue will be addressed by anyone else. In the case of a global campaign, it is clear that almost every National Body has been mail-bombed with the same spiel, and that those are therefore issues we can actually ignore unless they have a clear national significance, because we know that other national bodies will be examining them. I think that is what may be behind the last line of the invitation:
We are looking for creative, positive contributions that emphasise our commitment to representing truly Australian views to the international community.
They want to husband their resources for what is important for local industry and local requirements. They don’t want to succumb to a denial-of-service attack whereby, by concentrating on sorting out edge cases and typos, they miss the big picture of national interest.
I’m preparing my comments to the CITeC Standards Sector Board at the moment, and I will put them online here too, if anyone is interested.