Tuesday, February 28, 2006

Why is it called a Markup Language?

I have a few side interests which have recently found an odd synchronicity. The first interest is in using XML (or something like it) to describe UI. (I've long been a proponent of the idea, although I think most of the implementations utterly fail to fulfill the potential.) The second interest is in building a 'better' XML, or at least an alternative that might be better suited to how I see people actually using XML. The last abstract angle to this awkward web is my interest in building a better XML API. (Designing a better XML is really just the flip side of designing a better API.) The odd synchronicity is that all of these tend to terrify me with how completely people often fail to understand what the 'Markup Language' in eXtensible Markup Language (XML) really means.

I'm a bit unusual, in that I came at XML from an SGML background, and yet I was a firm believer in XML's future as a data transport language. The SGML background means that I really do have a firm grasp on what it means to be a 'markup language'. SGML came about as a way to annotate or 'tag' some text, to describe higher-level semantics that were not intuitively obvious from the text itself, at least not obvious to a computer. The <p> and <title> tags in HTML come to mind as perfect examples of this. Tags also provide a useful place to attach information that may be useful for best presenting the information to a user. The 'target' attribute on the <a> element is a good example of that. What is important to note, though, is that these tags were just adding layers of meaning to existing text. This is why it was called a Markup Language. Taken to an extreme (and it wasn't that much of an extreme for some users), you started by writing the document in plain language, and then you went back and added these tags; thus you were 'marking up', just like your English teacher in high school would mark up your papers with all those comments. What is rather interesting about SGML and XML is that while they are 'languages' for marking up plain text, they are themselves a framework on which other 'languages' are built (think XHTML, RSS, Atom, etc.).

Why do I bring this up? Load up an Atom document in a text editor. How much of the document is markup and how much is text? I mean this as no criticism. This is where I envisioned XML leading 10 years ago, when I first heard about W3C's efforts to create XML. We are at a point where the real content of the document is in these 'tags', the things which were originally designed as annotations.

Let's switch tracks for a bit, to XML APIs. Working at Microsoft, I spent a number of years working with customers trying to use XML. The way XML was sold inside Microsoft meant that this shift from the primary data being the text to the primary data being the tags happened very early there. The team that was building the initial XML support libraries mostly came from a data-oriented background (probably because Adam Bosworth was the man who had put the team together... I just happened to have the dumb luck to have wandered across their path at the right time). Even before shipping Microsoft's first real XML library (MSXML), we were struggling with problems stemming from the fact that XML was designed for marking up text more than it was designed for serializing data.

One of the most frustrating problems for XML API designers boils down to what is called 'mixed content'. Mixed content is what XML was designed for, where there is some text and various parts of that text have tags which layer on some further meaning; think <a>/<p>/<b>/<i>/etc. in HTML. The <p> tag is the container, and the content of the <p> is mixed content, meaning it is a mix of text and tags. I also lump the problem of Processing Instructions and Comments into this general problem for API designers. The reason these are such a problem is that they don't map to traditional programming models very well. Consider this simple problem, given the XML: "<p>This is some <b>bold</b> text.</p>". What is the content of the <p> tag? Most APIs expose the content as a sequence of a text node, a <b> tag, and another text node, where that <b> tag itself has a text node as content. If I am looking at this from a marked-up-text perspective, the problem with that description is that all the text inside <p> is equally part of that paragraph, so why is some of it more isolated from the <p> tag than other parts? Now compare that to this XHTML snippet: "<ul><li>Item 1</li><li>Item 2</li></ul>". There we clearly want to keep the 'Item 1' text distinct from 'Item 2'. So what is the 'value' of <ul>? Obviously, one can't use the same logic to report the value of <ul> as we did for <p>.
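To make that concrete, here is roughly what the standard org.w3c.dom API (via JAXP) reports for the <p> example; a throwaway sketch, not tied to any particular loader:

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.*;
    import org.xml.sax.InputSource;

    public class MixedContentDemo {
        public static void main(String[] args) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(
                    "<p>This is some <b>bold</b> text.</p>")));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            // Prints three children: a text node, the <b> element, another text node.
            for (int i = 0; i < kids.getLength(); i++) {
                Node n = kids.item(i);
                System.out.println(n.getNodeType() + "\t" + n.getNodeName());
            }
        }
    }

Run the same loop over the <ul> snippet and you get two <li> elements and no text at all, so any notion of a simple 'value' evaporates.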

One could definitely argue that part of XML's power is that it so easily expresses both concepts in such a simple syntax. The problem is that the burden has been moved onto the API and the API user. When talking about applications, such as HTML, where the majority of the effort is spent in authoring XML content to be processed by a limited number of tools, this trade-off makes absolute sense. But what about all these uses of XML where there is no 'author'? Where the 'user' is effectively the application developer? Now the tables have turned. This complexity has become an impediment to the usability of XML. This is part of why there is such a proliferation of XML binding tools. Most XML APIs mimic the abstract design of XML, which means they expose this complexity. The other part of the reason that there are so many XML binding libraries is that XML is so flexible that there are many ways to say the same thing.

One way of summarizing the current situation is to say that the majority of XML users are not actually using XML as a 'markup language' at all. They are using it as a data serialization language.

So the problem I've struggled with for the last few years is how to design an XML API that reflects the fact that people use XML as a serialization language, not a real markup language. That would be easy, except for the fact that while most data is just data, some of it is markup... like XHTML. Worse, RSS and Atom are perfect examples of formats that mix both uses in one file. It seems to me that the saving grace is that most applications actually know which parts of the data are data serialization and which are markup. What if there were a way to let the user switch back and forth to the appropriate API for the task at hand? The problem is that I've never quite figured out how to do this. The trick is that you sometimes need to peek ahead and then switch your view based on what you saw.
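I don't have a design I'm happy with, but to give a flavor of the idea, here is a purely hypothetical sketch (none of these names exist in any library); the caller, who knows the schema, picks the view per element:

    import org.w3c.dom.Element;

    // Purely hypothetical sketch of a reader that lets the caller pick the
    // view appropriate to the content it knows is coming next.
    interface DualModeReader {
        // Advance to the next child element and return its local name, so the
        // caller can decide which view to apply.
        String nextElement() throws Exception;

        // Data-serialization view: treat the current element as a simple value,
        // coalescing text and ignoring comments and processing instructions.
        String readValue() throws Exception;

        // Markup view: hand back the whole subtree (here as a DOM Element) so
        // that mixed content, like embedded XHTML, stays intact.
        Element readMarkup() throws Exception;
    }

Reading an Atom entry would then mean calling readValue() on <id> or <updated> but readMarkup() on <content>; the peek-ahead problem shows up when the right view depends on what is inside the element rather than on its name or attributes.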

Ultimately, I feel that one of the weights that may eventually lead to something replacing XML is this lack of distinction between markup and data serialization.

Friday, February 24, 2006

SAX vs StAX a.k.a. Push vs Pull

I've been writing some code to load up XML configuration files and populate some Java objects with the contents. None of the XML binding libraries work, because both the schema for the config file and the Java objects are already predefined, and while the mapping between the two is trivial, it isn't a brain-dead one-to-one mapping. I like to think I know a thing or two about programming with XML, and so I set about implementing the code to load the config files. If I want to run on a clean Java 1.5 install, there are three options: SAX, DOM, and StAX. If I want to run on a clean 1.4 install, StAX drops off that list. I realize there are other APIs out there, mostly DOM alternatives, but they aren't part of the standard install, so I wanted to avoid them.

The format was pretty trivial, and I knocked together a DOM implementation in an hour or two. Which quickly led me into my coworker's office to ponder whether any of the designers of the DOM API ever tried to use the beast they were building. I guess I'm still spoiled from years of working on Microsoft platforms. I wrote script code for this kind of thing all the time, and would use node.SelectSingleNode() and node.SelectNodes() as my primary way of navigating the DOM tree. In Java, I had to write dramatically more code. Worse, if I wanted code with half-decent error checking, the code rapidly bloated up with null checks. So I ended up writing a set of helper methods that allowed me to write SelectNode()-like code. Why isn't this just part of the APIs? I understand the controversy over query languages, etc... but come on, people! Pick something reasonable and move on. More importantly, let developers move on to the real task at hand, rather than struggling to code basic tree-walks!
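For the curious, the helpers amounted to something like this; a rough sketch with hypothetical names, sticking to 1.4-era APIs (hence StringBuffer and no getTextContent()), and glossing over namespaces and real error reporting:

    import org.w3c.dom.*;

    // Rough sketch of the kind of SelectNode()-style helpers I mean.
    final class DomHelper {
        // Return the first child element with the given name, or null.
        static Element selectChild(Node parent, String name) {
            for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                    return (Element) n;
                }
            }
            return null;
        }

        // Return the text of the named child element, or a fallback if it is missing.
        static String selectText(Node parent, String name, String fallback) {
            Element child = selectChild(parent, name);
            if (child == null) {
                return fallback;
            }
            StringBuffer sb = new StringBuffer();
            for (Node n = child.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.TEXT_NODE
                        || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                    sb.append(n.getNodeValue());
                }
            }
            return sb.toString();
        }
    }

With those in hand, the loader reads like selectText(server, "port", "8080") instead of a screenful of getChildNodes() loops and null checks.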

These configs were small, so the memory footprint of the DOM isn't that big a deal, but I'd like to make it faster, which means using a parsing API rather than an in-memory-tree API. Since I want to support Java 1.4, I'm forced to use SAX. You will notice some reluctance in my voice. I've implemented a few XML loaders in my day. Not as many as many of you, but enough. Writing business logic on top of SAX is right up there with visiting the doctor among my least favorite things. Why? Because the SAX API was designed to make the parser developer's life easier, not yours. Sure, SAX parsers are usually faster than StAX or the like, but only 1 in 1000 apps really benefits from that 5%, because hooking into a SAX parser is so damn much more work than using a StAX parser. When I'm writing something like my config loader, I want my code to be reasonably fast and as simple as possible. The config loader is not a high priority on my todo list, but it must be done. When using SAX, I have to turn the problem inside out, because the parser is in control, not my code. A full example would take more time than I want to devote to this write-up, but even the rough sketch below (reading a single, hypothetical <port> value) shows how the control flow flips.
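In the SAX version the parser calls me, so even one value means flags and buffers smeared across callbacks; in the StAX version I call the parser and pull just the events I care about. (The class and method names here are made up, and the naive StAX loop still glosses over the comment/entity issue I get to below.)

    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    // SAX: the parser is in control, so state lives across three callbacks.
    class PortHandler extends DefaultHandler {
        private boolean inPort;
        private StringBuffer text = new StringBuffer();
        int port;

        public void startElement(String uri, String local, String qName, Attributes atts) {
            if ("port".equals(qName)) { inPort = true; text.setLength(0); }
        }
        public void characters(char[] ch, int start, int length) {
            if (inPort) text.append(ch, start, length);
        }
        public void endElement(String uri, String local, String qName) {
            if ("port".equals(qName)) { inPort = false; port = Integer.parseInt(text.toString().trim()); }
        }
    }

    // StAX: my code is in control and just pulls events until it finds <port>.
    class PortReader {
        static int readPort(XMLStreamReader reader) throws XMLStreamException {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "port".equals(reader.getLocalName())) {
                    StringBuffer text = new StringBuffer();
                    // naive: assumes a single uninterrupted run of character data
                    while (reader.next() == XMLStreamConstants.CHARACTERS) {
                        text.append(reader.getText());
                    }
                    return Integer.parseInt(text.toString().trim());
                }
            }
            throw new XMLStreamException("no <port> element found");
        }
    }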

In no way do I mean to chastise the authors of SAX the way I wish I could the DOM API authors. When SAX was first defined, this was the normal way to hook up to parsers. Admittedly, James Clark's parsers have long had pull-model-like APIs, but most parsers had APIs like SAX. The problem is that the world has moved on. StAX-like APIs are just better. Part of why so many people need XML binding in Java is because the parsing APIs are so difficult to use. Worse, I think SAX actually encourages people to write fragile XML loaders that can't handle valid uses of XML. One example I've seen many times is that people assume an element will have a single text-node child. What if I want to add a comment? Or maybe use an entity to avoid retyping some common text? Sure, it is possible to implement this correctly in SAX, but since it is so damn hard to do the simpler parts, many people never get around to worrying about this... Actually, most developers who code up this kind of thing don't know enough about XML to even know that this is a danger.

StAX isn't a total solution. It is definitely easier to use, but it suffers from some of the same problems. Where are the helper methods that read an element's content and return a String? Why would I need that, you ask? The example I used above, of an entity or a comment landing in the middle of some text, causes similar problems for StAX programmers as it does for SAX programmers. The difference is that in StAX all you need is a single helper method to abstract it away and you are golden, while in SAX you need a much more complicated solution.
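Concretely, the helper I keep rewriting is only a dozen lines; a sketch that assumes the reader is sitting on the element's start tag and that the content is text-only:

    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    final class StaxHelper {
        // Coalesce an element's text up to its end tag, gluing together text,
        // CDATA, and resolved entity references, and skipping comments and PIs.
        static String elementText(XMLStreamReader reader) throws XMLStreamException {
            StringBuffer sb = new StringBuffer();
            for (int event = reader.next();
                 event != XMLStreamConstants.END_ELEMENT;
                 event = reader.next()) {
                switch (event) {
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                    case XMLStreamConstants.SPACE:
                    case XMLStreamConstants.ENTITY_REFERENCE:
                        sb.append(reader.getText());
                        break;
                    case XMLStreamConstants.COMMENT:
                    case XMLStreamConstants.PROCESSING_INSTRUCTION:
                        break; // not part of the value
                    case XMLStreamConstants.START_ELEMENT:
                        throw new XMLStreamException("expected text-only content");
                }
            }
            return sb.toString();
        }
    }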

Ultimately, I'm suggesting that Sun (or whoever decides these things... I'm still figuring out how the Java 'standardization' process works) start looking at how to make XML easier to use. You don't need XML language integration (although that might not hurt). You really just need better APIs.

Thursday, February 16, 2006

Windows weakness

I mentioned before that I was planning on getting a new laptop and debating between getting one of the new MacBooks or an MSWindows machine. For various reasons, I broke down and got a Dell Latitude 610. Aside from not being a Mac, it is a great machine. Fast (2.13GHz Pentium M and 7200rpm drive!), especially where my old PowerBook was slow: Java development. Also, it is smallish and light-ish, yet still has a 1400x1050 screen. And the nail in the coffin? I got it off DellOutlet for a steal. Almost $1k less than a similarly equipped MacBook!


So now that I have succumbed, and am using it a good bit, I'm really starting to find the lack of some of OS-X's niceties annoying. If I knew people on the Vista shell team, I'd be pestering them, but from what I know of the schedule, all the features were baked long ago and noth'n that ain't a critical bugfix is open for consideration now.

So what do I miss? Besides the slick elegance of OS-X...

  • F11 – Hide everything to let you get at the Desktop. Yea… Windows has the desktop thingy in the QuickLaunch toolbar, but it never seems to work as I would like. F11 doesn’t ‘hide’ windows like normal, so there is no way to end up with all your windows left hidden, which is what happens to me with the QuickLaunch icon. I tend to keep at least a half-dozen windows open and another half-dozen minimized 99% of the time. For example, I’ll often have 3-4 CMD windows open, so that I can cycle through compile/test/run with just a flick of the alt-tab. Now if I hit the show-desktop icon, they all hide, but if I do anything other than just stare at the desktop, they all _stay_ minimized and I have to manually un-minimize each window! On my Mac, I would hit F11, and if whatever I did on the desktop launched a new app, all the windows returned to their places, with the app on top. Perfect! So how do I get that on a PC?
  • Application/Window dichotomy. OS-X and Windows have very different abstractions of documents, windows, and apps. Windows UI was obviously designed when MDI was king. (Can I inject that I have _never_ liked MDI? Tabs make MDI acceptable, even desirable for some apps, but the older style was abysmal.) I say this because the window-manager/shell basically assumes that each top-level window is its own application. Think about it… alt-tab, the TaskBar… everything about their design makes much more sense if you think of each window as an individual application. The problem is that many modern applications are not written that way. Word, Excel, Outlook, InternetExplorer, etc… I really got to like how, with OS-X, I could just use Cmd-W to close the current window. If the app used tabs, it would close that tab, otherwise it would close the current window. To close the app, you need to actively quit the app. This is apparently a large source of confusion for new users, but is a huge boon for people like me. On Windows, I find myself always having to stop and think whether to type Alt-F4 or Ctrl-F4. What if I want to leave an app around, but not have a window open? No such option on Windows. I’m not sure what Gnome/etc. do, but I suspect that it is more similar to Windows, since under X11 the window manager doesn’t have the concept of an app, it only knows about windows. I never would have thought of this, but after using it a while, I strongly prefer the OS-X way.
  • Copy/Paste keyboard shortcuts in CMD. I tend to use CMD more than most people. It is _really_ frustrating that I can’t use the normal ctrl-c/ctrl-x/ctrl-v shortcuts. I understand why (What if the app wants access to those keys? Or I want to use ctrl-c to break a command?), but it is still damn annoying.
  • I tend to use iTerm on my Mac, and really like being able to group my shells into a single window with tabs for each shell. A few years back, I tried some hacks to try and host a CMD window inside another app, and never got it to work.

That is just one week of regular use. Windows has so many great applications, and is a great development platform… but I find that it lacks the subtlety of OS-X. With some luck, in a year and a half or so, when I’m looking for a new laptop again, Apple will have its act together better.

Wednesday, February 15, 2006

Oracle on the move

For my last few years at Microsoft, the XML team reported under SQL Server. That meant that at division all-hands meetings and such, I got to hear all sorts of hoopla about SQL Server vs. Oracle, etc. It also led to some interesting hallway discussions about the threat of LAMP, MySQL, etc.

So first Oracle goes and buys InnoDB, the provider of the transaction-safe storage engine for MySQL. That sure puts MySQL at a disadvantage. InnoDB was key to their 'I'm a grownup' story. Sure, they still have access to the source-code, but with Oracle pulling the strings, I doubt they can expect much investment in the public release.

Oracle has acquired Sleepycat! I can't count how many places I used BerkeleyDB back before I joined Microsoft, when I was working on web apps. Now, Sleepycat has been working on a number of things beyond the basic library of yore, and it may be that these higher layers are why Oracle bought them. Or it may be that BerkeleyDB is at the kernel of so many things, including apparently other parts of MySQL!! (Sorry, I can't seem to find the link that indicated that.)

The final part that interests me the most about this is that by weakening MySQL, Oracle is also giving Microsoft a leg up. SQL Server grew mostly by building a strong small-business base, where Oracle was almost non-existent. MySQL was a serious threat to that business. With SQL-Express in the market and MySQL on less stable footing, Microsoft is in a pretty good position. Now all we need are some good persistence layers for .Net... Java has them falling out of its ears. Are there some that I'm missing? I don't follow this space much, as most of my programming is lower-level than that.

Monday, February 13, 2006

Duct Tape My Heart

Pandora just made my evening. I'm fighting a frustrating bit of code that just isn't working as I would like. I'm disheartened and frustrated. So Pandora comes along and plays an absolutely ridiculous song that is quintessential electro-pop: Duct Tape My Heart, by Freezepop. This is why I love radio and these new services like Pandora: unexpected gems that redefine my mood. I guess I should consider myself lucky, then, that my attempt at a VNV Nation / Assemblage 23 station doesn't seem to quite play the mix that I would expect.

Monday, February 06, 2006

Why I love the SuperBowl

First, I'll make an admission. I don't really care much about sports. I have actually watched maybe 2 SuperBowl games in my lifetime, but I've never been that interested in watching others play sports, unless it is a sport I also play. Then I'm trying to learn how to improve my game by watching their technique.

So why then do I love the SuperBowl? Is it the ads?

Nope. While entertaining, SuperBowl ads are overhyped, especially for someone like me who rarely watches television. Ads in general fascinate me, as a bizarre commentary on my culture. The SuperBowl is just more of the same, only bigger.

Why then do I love the SuperBowl?

I love the SuperBowl because everyone else is at home watching it! The roads were empty! The grocery store (normally a chaotic morass of people) was almost empty. I parked right by the entrance! I went clothes shopping downtown, and that was similarly stark. Since my goal, in both the clothes and grocery shopping, was to go in, find specific items, and get out, it was lovely.

It almost makes up for all the hellish traffic on game days.

Friday, February 03, 2006

Music to my ears

Ars Technica has an article about the cost of music and digital music downloads. Having been a confirmed music addict since late high school (I own ~1000 CDs and probably 150 records), I am acutely aware of the cost of music. I probably purchase on average 2+ albums a month. I have yet to download anything online because of DRM and quality concerns. The music industry's current behaviour appalls me, although having read some business books, I understand why they are doing what they are doing. The problem is that the current path seems to be one that actually hurts the industry in the long run. Not that any of the current management cares; they will have cashed out with their millions first.

I guess what hurts is to see an industry with such amazing potential for customer involvement aim to milk that same customer dry. How else can you explain CD prices? CDs are dramatically cheaper to produce than they were 15 years ago, yet the price of a CD has risen faster than inflation (I think... let's see, $12 ten years ago and $18 now; yup, that beats inflation). Being in the tech industry, I know plenty of music addicts who make tidy salaries and can afford these prices. But I also know, from years working at my college radio station and hanging out at indie music shops, that some of the greatest potential lies in the starving young music lovers. Ever seen High Fidelity? Those guys are the guys who get people to buy music. They play the Beta Band while you are looking for something else (or just browsing, looking for some hidden gems in the used section) and now you are on the hook for 2 CDs.

So why is the industry scorning these jewels of potential? Probably because those kinds of guys will mostly keep doing that regardless... almost. There is a point, when pushed too far by rising prices and craptacular major releases, where they will turn away and do something else. The other problem is that they tend to promote random independent bands and local bands, rather than the latest ex-Mouseketeer starlet that the industry is promoting.

In the meanwhile, there are strong veins of "The Innovator's Dilemma" in the current situation. The actions of the larger corporations are actually opening new doors for the smaller shops. Sure, the independents can't get on the Walmart shelves, but while the corporations are trying to make online purchasing as awkward and expensive as possible, the independents are gladly taking cash from an entire new class of customer. MySpace.com and various other online venues are providing new ways for smaller bands to get their name out. Services like Last.fm and Pandora provide an amazing opportunity to introduce people to new music that they may not hear on the radio or otherwise be exposed to.

In the meantime, I watch the RIAA and MPAA dinosaurs with disgust, just waiting for the inevitable fall.