So anyway, Sylvain arrived here in the U.S. of A. on the 13th, and the two of us have been heads down ever since preparing for the release. The day before his arrival I paid a visit to my local Apple Store to pick up his development machine, so he too is now discovering just how wonderful development life on a Mac can be.
Of course, as wonderful as life can be, there is one thing that has really irked me: the lack of a good tabbed shell/terminal client. What surprised me is that no matter how hard I looked, nobody seemed to have noticed or cared enough to do anything about it. But as Sylvain just discovered, apparently I was wrong:
iTerm is a full featured terminal emulation program written for OS X using Cocoa. We are aiming at providing users with best command line experience under OS X. The letter i represents a native Apple look and feel of the program interface, and an emphasis on complete international support. iTerm was merged from two projects, CTerminal and TerminalX, both of which were based on JTerminal project. The current version is still in beta stage. It is however very much functional and usable.
Since publishing my recent article on Next Generation Grid Enabled SOA and taking this topic out into the world, I have been asked to clarify and frame the discussion around why state management in what is supposed to be “stateless” SOA is such an important issue. Steve Jones of CapGemini bluntly stated “No they ruddy well shouldn’t be” when he wrote his opinion on stateful vs stateless services in a SOA.
My observation has been that the need for state management is a continuum, ranging from completely stateless to fully stateful services as the complexity of the business logic and the longevity of the service instance increase.
Since joining Oracle I have been working across the various product teams in the Fusion Middleware Group, to create a vision for what I’m currently calling “Next Generation Grid Enabled SOA”. I recently published an article on the subject in SOA Magazine.
In my blog post Converting Content Models to Schematron I outlined some code ideas. Recently we (Topologi) have been working on an actual implementation for a client: a series of XSLT 2 scripts that we want to release as open source in a few months’ time.
Why would you want to convert XSD to Schematron?
The prime reason is to get better diagnostics: grammar-based diagnostics basically don’t work, as the last two decades of SGML/XML DTD and XSD experience make plain. People find them difficult to interpret, and they give the response in terms of the grammar, not the information domain. Error messages are also reported in terms of where the error was detected, not where the error actually is. For example, given a content model (a, (b, c)?, c, d) and a document <a/><c/><c/><d/>, you will get an error “Expected a d” at the location of the second c element; however, the problem really is that the b is missing.
Schematron converted from a grammar still does not have much information to go on. Of course, the Schematron scripts should be easier to customize with tailored assertions and diagnostics. But the phase mechanism is also very useful: we can implement multiple different ways of checking the grammar and let the user decide which one provides the best information.
A secondary reason is that Schematron only needs an XSLT implementation. There is still quite a suspicion that XML Schema implementations are partial or broken: Japan Industrial Standards’ comments on Open XML noted that they could not in fact even get the schemas to run under Xerces and another major implementation. XSLT is much more common. However, we have decided to use XSLT 2, and SAXON in particular, because it offers us some shortcuts.
One shortcut that is quite fun is this possibility (I am not sure whether we will implement this method this round; it is outside our initial brief): by converting the child element names of an element into a string, such as “H1 p div div div table ht p”, and then converting a grammar such as ( H1 | H2 | H3 | P | div | table )* into an equivalent regular expression, we can actually use the built-in regex recogniser of the XPath 2 functions to validate the document, just using vanilla XSLT 2. And this even copes with the minOccurs/maxOccurs cardinality constraints, too.
This is rather exciting as these things go because it means that we can have a fallback validator that completely covers all the constraints of a grammar system, without leaving Schematron or the world of assertions. The downside? If implemented in a simple way, you only get the same kinds of diagnostics as a conventionally implemented XSD system will give you. But the advantage of having a complete Plan B means that we can concentrate on useful messages for the Plan A.
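To make the trick concrete, here is a rough Python sketch of the idea (illustrative only; our actual code is XSLT 2 and the function names here are invented for this post): serialize an element’s children as a space-separated string, compile the content model to a regular expression, and let the regex engine do the recognition.

```python
import re

def children_signature(child_names):
    # e.g. ["H1", "p", "div"] -> "H1 p div"
    return " ".join(child_names)

def choice_star(names):
    # ( a | b | c )*  ->  a regex over space-separated name tokens;
    # minOccurs/maxOccurs constraints would map onto {m,n} quantifiers
    # instead of the trailing *
    alt = "(?:%s)" % "|".join(re.escape(n) for n in names)
    return r"(?:%s(?: %s)*)?" % (alt, alt)

def validates(child_names, model_regex):
    # the regex engine is the content-model recogniser
    return re.fullmatch(model_regex, children_signature(child_names)) is not None

model = choice_star(["H1", "H2", "H3", "P", "div", "table"])
print(validates(["H1", "P", "div", "div", "table"], model))  # True
print(validates(["H1", "li"], model))                        # False
```

In the XSLT 2 version the same thing falls out of string-join() on the child names plus fn:matches() with a generated pattern.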
I’ll blog on how we implemented it over the next few weeks. Basically, we have a two-stage architecture: the first stage (3 XSLTs) takes all the XSD schema files and does a big series of macro processes on them, to make a single document that contains all the top-level schemas for each namespace, with all references resolved by substitution (except for simple types, which we keep). This single big file gets rid of almost all the complications of XSD, which in turn makes it much simpler to then generate the Schematron assertions.
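The substitution idea behind that first stage can be sketched in a toy Python fragment (the miniature schema and function names are invented for illustration; the real preprocessor is XSLT 2 and handles far more of XSD, including groups, imports and anonymous types):

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title" type="xs:string"/>
  <xs:element name="doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def flatten(root):
    # index the top-level element declarations by name
    top = {el.get("name"): el for el in root.findall(XS + "element")}
    # replace each <xs:element ref="x"/> with an inline copy of the
    # top-level declaration it points at (real code would also copy
    # nested anonymous types, handle groups, attributes, etc.)
    for parent in list(root.iter()):
        for i, child in enumerate(list(parent)):
            if child.tag == XS + "element" and child.get("ref") in top:
                decl = top[child.get("ref")]
                inline = ET.Element(XS + "element", {"name": decl.get("name")})
                if decl.get("type") is not None:
                    inline.set("type", decl.get("type"))
                parent.remove(child)
                parent.insert(i, inline)
    return root

root = flatten(ET.fromstring(SCHEMA))
seq_el = root.find(XS + "element[@name='doc']/" + XS + "complexType/"
                   + XS + "sequence/" + XS + "element")
print(seq_el.get("name"), seq_el.get("type"))  # title xs:string
```

Once every reference has been inlined like this, each element declaration carries everything needed to emit its Schematron assertions in one place.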
We have so far built the preprocessor, implemented simple type checking (including derivation by restriction) and the basic exceptional content models (empty, ALL, mixed content), with content models under way at the moment. I think the preprocessor stage might be useful for other projects involving XML Schemas.
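To give a flavour of what checking a simple type derived by restriction involves (a hypothetical Python sketch; in our scripts these checks become generated XPath 2 assertions), a restriction is just the base type’s check plus the new facets:

```python
import re

def make_checker(base=None, pattern=None, enumeration=None, max_length=None):
    # A simple type derived by restriction must satisfy its base type
    # first, then its own facets (only a few facets sketched here).
    def check(value):
        if base is not None and not base(value):
            return False
        if pattern is not None and re.fullmatch(pattern, value) is None:
            return False
        if enumeration is not None and value not in enumeration:
            return False
        if max_length is not None and len(value) > max_length:
            return False
        return True
    return check

# a token-like base type, then a restriction of it by enumeration
token = make_checker(pattern=r"\S+( \S+)*")
size = make_checker(base=token, enumeration={"S", "M", "L"})
print(size("M"), size("XXL"))  # True False
```

The chains of derivation are where XSD gets hairy, but flattened out like this each step is mechanical.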
Actually, the difficulty has been in an unexpected direction. XML Schemas is so unpleasant to work with that one programmer asked to be taken off the project because it was simply too much to cope with, and another has left the company (to take up an overseas appointment), but not before also getting frustrated, boggled and bogged down by XSD! Things like a complex type with simple content derived by extension from a simple type with simple content, etc., become a maze or rat’s nest. (Hopefully we have that under control and we’ll be able to attend to our backlog of other work ASAP: we have been pretty slow.)
It is interesting that in almost eight years of Schematron, I don’t recall anyone complaining it was too difficult. Instead, I am regularly surprised to hear of quite important projects where it has been quietly used without fuss or drama, and just chugs away doing its thing, with everyone involved feeling (and being) in control. This week, for example, I heard about the UK taxation office’s use of Schematron for checking incoming documents being lodged. I think some of the reason for the success might be that because Schematron is small, it can be kept under control and understood, and that because there is zero support from the large software players, it is never used as part of an attempt to up-sell big hardware or message buses or protocols or enterprise systems, etc.: it gets used for POX (Plain Old XML) sites.
The bumpy ride to ISO standardisation of Microsoft Office Open XML is receiving a lot of attention here on XML.com, and drawing out a lot of strong opinions on both sides of the issue. Frankly, the intense coverage given to every minor detail in the specification bores me to tears, even though I see the need for it, and I think that there is a larger story behind these events that is not receiving enough attention.
A while back, I named Gazzag.com my enemy for spamming people to join their social network. Now they have a fellow enemy: Quechup.com.
Quechup allows users to import contacts from their email accounts (such as Gmail). The unsuspecting user provides their login, and Quechup retrieves their contacts. It then sends an email, in the user’s name, to every one of those contacts telling them that the user has invited them to join Quechup.
This is annoying and just plain wrong.
Some people will say that users should be more careful about sharing this information with a website, or that they should read the fine print more carefully. However, I reiterate my claim that no reasonable person will want a site to email everyone they know. We all have professional or personal contacts that we would not choose to invite to a social network. I don’t believe that even a giant, dedicated, flashing warning page that alerts users to the fact that all of their contacts are about to be automatically invited in their name is sufficient. This practice, as I said before, is just evil.
I assume the people who run these websites believe that they will automatically get lots of new members by spamming users’ contacts. Instead, they create a lot of angry users who go out of their way to email their friends and actively discourage them from signing up for a network.
Users develop trust in a website for many reasons. Some are simple - appearance, comfort with the community of people there, nice features, etc. Some are more technical - a good privacy policy, parental controls, and the like. Many users come into websites with a base level of trust. This sort of mass invitation violates that trust, and is a sure way to spread negative publicity about your site (such as this post). Users should have explicit, obvious, and protective control over who is invited in their name. By default, no one should be invited. The user should have to knowingly undertake a process to select every person they want invited.
I hope other networks learn from the mistakes of Gazzag and Quechup. The spamming techniques used by these sites are a worst practice of the social networking world.
One of the most startling aspects of the last year, to me, really shows the disruptive potential of standards: bitter enemies are hard at work making systems that also benefit their enemies in pursuit of a higher goal. A world turned upside down!
Examples include:
MS opening up their formats and taking them to Ecma and ISO
ODF and Sun making converters for Word
Microsoft paying for an open source converter between OOXML and ODF
IBM paying someone to review the draft specification
Open source activists reviewing the draft
An ISO ODF editor making a big contribution to reviewing the draft
Microsoft blogs publicizing Gnumeric, Apple software and any applications that get any kind of OOXML support
All this competition and bile channeled productively! No wonder people are freaked out. :-)
But the paradoxes don’t just mean that enemies act like friends, it seems. Friends can also get accused of being enemies. There is a very interesting post, ODF vs OOX: Asking the wrong questions (hat tip to Doug), on the blog Spreadsheet Proctologist, which I like very much because it brings out that ease of implementation is just as much (and perhaps mostly?) a question of what your starting base is (i.e. your native data structures and functions) as it is a question of what information and forms the external format provides.
But the readers’ comments include statements like “Your self-annihilating devotion to Microsoft is too evident.” and “Just by touching MS OOXML, you are playing their pawn in the only purpose for this exercise. To kill ODF adaption and therefore the threat of Open Office and others as a replacement for Microsoft Office products.” Is to laugh! Now GNU developers are pawns and devotees of Microsoft! That GNU software, ooooooh, just another Bill Gates plot!
As a side note, but related to the theme of finding strategies to make the acts of people’s enemies as productive as the acts of their friends, I think that Stephane Rodriguez’s comments (on that blog and, just as circumspectly, elsewhere) on the calculation chain deserve more attention. (Sometime I will look up whether they made it into any national body comments for the BRM; I hope so.) The calc chain needs to be reviewed with the question “Is the base case a little too complicated still?” It is a mild and productive question: I suspect programmers would be happy if some more leeway were provided. Now whether the issue is an Office one or a DIS 29500 one, I don’t know; but the issue should not be dismissed just because it was deposited by an ostensibly rabid whirlwind! Quite the reverse.
In a recent interview with Rohit Khare, Director of CommerceNet Labs, Jon Udell may have been responsible for introducing a new meme into the noosphere that will be as important in its time as AJAX was in 2004. Rohit Khare gave an influential presentation describing ALIT, which utilized SOAP messages for transferring events between systems, but in the intervening years, his thinking has shifted to a new system based not upon SOAP but upon RESTful RSS and Atom feeds, for which he has coined the term Syndication Oriented Architecture, or SynOA.
I’ve just glanced over the 3549 or so comments put in by various national bodies for the recent ballot on DIS 29500. I’ve made a table listing the countries that commented, together with their votes and whether I think most of their issues could be resolved during the upcoming Ballot Resolution Meeting next year.
The bottom line: there are a few touchstone issues that may be tricky, but it is difficult to see from the comments why DIS 29500 would not be successfully fixed and approved as an ISO standard. The particular touchstone issues I see are that spreadsheet dates need to be able to go before 1900, that DEVMODE issues need to be worked through more, that the retirement of VML needs to be handled now, and that there needs to be a better story for MathML.
Apart from these, there is a sea of details that are eminently fixable: typos, clarifications, fixing schemas against closed lists, the use of more standard notations for fields, encryption, conformance language, refactoring the spec: editorial and syntactic changes rather than data-model or wholesale semantic changes. At the other extreme, there are various non-starters which I expect have little hope, since they run counter to the rationale for the spec: adopting SVG, or adding various frustrating little things in the name of compatibility with ODF. (Some NBs even call for ODF’s blink element, even though blink has been removed from HTML since it can cause epileptic fits!)
You can find a full list of national votes from the SC34 website. I was pleased to see that all the issues I raised ended up in Standards Australia’s comments (it abstained on the vote, but its comments still go in the mix.)
What is in the table
The thing that interested me in this table was whether I thought each National Body’s comments could be resolved well enough to change its No vote to a Yes vote. I am assuming there is no point to a standard that Ecma and Microsoft could not buy into. One of the most interesting documents in the collection of comments from different bodies is Ecma’s own contribution: basically they accept almost all of Japan’s technical issues (which have a lot of overlap with other bodies’ issues), which augurs well for many of the other changes.
So I provide a rating as to whether I expect that a National Body’s vote will definitely stay No, probably stay No, or probably change to Yes as a result of a successful BRM. Caveat: the NB comments do provide a much clearer indication of each National Body’s thinking than the raw Yes/No/Abstain votes (which are utterly useless in predicting a final outcome); however, I would be a little more confident in my ratings if SC34 or ISO had released information about which NBs had ticked the normal box that says they might change their mind if their issues were resolved. I guess you would rate me as an optimist about the process in general; still, I am not saying that all these NBs will necessarily vote Yes ultimately, but there is quite a bit of commonality to the comments.
I also have a column marked “Indie”, which has an X if it seems the NB undertook an independent review of the specification, and one marked “Parrot”, where the NB is reproducing (perhaps with some localization, sorting or selection) someone else’s material, turning the standards review process into a form-letter campaign. I have mixed feelings about parrot items: on the one hand, an NB is free to consider whatever issues it likes, and some NBs have procedures that may favour the garrulous; on the other hand, it represents a hijacking of valuable review time to obsess over the same issues rather than bring fresh eyes.
The reviews that seem to me the best are those where an NB focuses on its areas of expertise or national interest: Japan is very interested in schemas, Israel in right-to-left text, Ireland in correct references, Australia in clarity, Canada in assistive technology, Tunisia in the application to mobile devices, Ghana (with a large Arabic influence) in IRIs, and so on. The comments that seem least useful are the parrot comments and the ones with vague recommendations. (I expect that these are the first comments that many of the NB committees or staff have sent in, so it is a good training exercise nonetheless.)
And there are some nice touches in there, where perhaps some cultural values slip through: Switzerland’s comments are a list of problems they have actually rejected, with the details why, and Jordan and Turkey both have dignified documents that explain their positive reasons. Some of the parroted comments are unnecessarily ranty, but only a few were mad: the US comments in one place want to remove OPC because it is not present in the “pre-existing binary format”, but then they want to get rid of the compatibility elements because they are a “museum”. They don’t need to worry about consistency because they are voting Yes anyway: some of the comments are like that; they are only there to allow the cake to be both had and eaten. I expect that several NBs are not really attached to some of their comments.
The second-to-last column is “Off-topic”, which is where the NB’s comments include material that the BRM cannot discuss. These are typically issues concerning IPR. MS needs to spend a bit more effort on this: Switzerland’s comment is really interesting on this point.
The final column marked “radical” is where a National Body’s comments include something that I think will be a challenge for MS or Ecma or ISO to support. I don’t include things like changing minor notations or providing better text explanations for things: I think the Ecma comments show a willingness to have those. However, where some change involves a wholesale alteration of the technology or its implementation, I would be surprised if it were acceptable. This is because for every nation that is voting “No” because they really prefer ODF, etc, there are two who are voting in favour because OOXML is what it is.
| Country | Vote | Rating (Really No? / Probably No? / Probably Yes?) | Indie | Parrot | Off-topic material | Radical |
|---|---|---|---|---|---|---|
| Australia | Abstain | - | X | X | | |
| Austria | Yes | (X) | X | | | |
| Brazil | No | X | X | | | |
| Bulgaria | Yes | (X) | X | | | |
| Canada | No | X | X | X | | Use DrawingML rather than VML |
| Chile | Abstain | - | X | | Field formatting. | Use MathML, Use SMIL, Use SVG, Use ODF |
| China | No | X; Review time (Document 13) | ? | X? | | Remove VML |
| Colombia | Yes | (X) | | | | OPC to separate standard |
| Czech Republic | No | X | X | X | | |
| Denmark | No | X | X | X | | |
| Finland | Abstain | - | | | | Dates before 1900. Remove VML. Use MathML |
| France | No | X | X | X | X | Date prior 1900, remove math pending mathml3 |
| Germany | Yes | (X) | X | X | | Dates prior to 1900 |
| Ghana | Yes | (X) | X | X | | Dates prior to 1900. (replace VML with DrawingML, adopt MathML) |
| Great Britain | No | X | X | X | | Add ODF-isms, (replace VML with DrawingML, adopt MathML) |
| Greece | Yes | (X) | X | | | Dates prior to 1900, (replace VML with DrawingML, adopt MathML) |
| India | No | X | X | X | | Use MathML, pre 1900 dates |
| Iran | No | X | X | X | | Dates before 1900. Add ODF-isms |
| Ireland | No | X | X | | | Dates before 1900 |
| Israel | Abstain | - | X | | | |
| Italy | Abstain | - | | | | Reference implementation, test suite |
| Japan | No | X | X | | | Publish OPC as separate standard |
| Kenya | Yes | (X) | X | X | | Dates before 1900. Remove DrawingML |
| Korea | No | X | X | | | Needs interoperability with ODF. Remove VML and DrawingML |