Tuesday, December 28, 2004

XML is like Perl

While visiting my parents for the holidays, I caught up with some old college friends. One of them, a close friend who had a huge impact on me, is one of the principals at a small startup. They build a network appliance, and he and I were chatting over geek details. He was telling me how they use XML. Basically, as part of the services that their box provides, it allows you to do some simple aggregation and data integration. One of the more common mechanisms they leverage for this is simple XML web services, either SOAP or REST, although it sounds like they tend to use SOAP. They can do some really cool stuff. It was interesting to chat about why they like XML. I can easily understand why they would use XML: if you use SOAP, you are forced to use XML. Why people like XML is far more interesting to me.

Part of my job is to help design/evolve the APIs we expose for using XML. I love talking with programmers who use XML just as part of getting their primary job done. It helps remind me what is important to our customers. As an implementer it is too easy to get caught up in the details, the really ugly parts of the spec, or the most interesting aspects to implement. It is very easy to forget that most users never touch any of that. They use the 20% of the spec that lets them get their job done. (note: This is why I dislike the XML-Namespaces spec. It is a sneaky, nasty surprise lurking around the corner to trip up the casual XML user.)
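To make the namespace surprise concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The two documents and the `http://example.com/ns` URI are made up for illustration. The only difference between them is a default namespace declaration on the root, yet the lookup a casual user would naturally write silently stops matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: identical except for a default namespace declaration.
PLAIN = "<order><item>widget</item></order>"
NAMESPACED = '<order xmlns="http://example.com/ns"><item>widget</item></order>'


def find_item(xml_text, path):
    """Return the text of the first child matching path, or None if nothing matches."""
    node = ET.fromstring(xml_text).find(path)
    return None if node is None else node.text


# Works as expected on the plain document.
plain_hit = find_item(PLAIN, "item")

# Same query, same-looking document: no match, and no error either.
naive_miss = find_item(NAMESPACED, "item")

# The element's real name includes the namespace URI, so the query must too.
qualified_hit = find_item(NAMESPACED, "{http://example.com/ns}item")

print(plain_hit, naive_miss, qualified_hit)
```

The second lookup does not fail loudly; it just returns nothing, which is exactly the kind of trap that catches someone who only wanted to pull one value out of a document.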

My friend uses XPath, SOAP, and a few other XML-isms because it really does make his life easier. XML isn't the most efficient way to get the job done, but every platform and its mutant offspring have some support for XML. For pushing 'data' around a network, XML is a supremely useful glue. Just like people use VB, JavaScript, Perl, Python, Bash, etc. to quickly connect up modules/apps, people use XML to connect services. Any app written in any scripting language could be written in C (or even assembly) such that it would be faster, probably dramatically faster. On the flip side, it would take you 10 times as long, and be harder to update/tweak/etc. XML plays a similar role. You can use it to package almost any kind of data. Both ends get to leverage a large set of tools to implement their functionality. Once it is up and running, start profiling. Where are you spending time? Too much data on the wire? Compress the XML, use a binary encoding, etc. For most platforms you can just plug in a custom XML reader/writer and away you go. More likely, the bottleneck is your processing, so replace your XSLT with custom code.
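As a rough illustration of the "package it as XML first, shrink it later" idea, here is a small Python sketch using only the standard library. The `readings` payload, its element names, and the `build_payload` helper are all invented for this example. It builds a verbose XML document, gzips it for the wire, and then reads it back with a simple path query on the other end.

```python
import gzip
import xml.etree.ElementTree as ET


def build_payload(count):
    """Build a deliberately verbose, made-up XML payload and return it as bytes."""
    root = ET.Element("readings")
    for i in range(count):
        reading = ET.SubElement(root, "reading", {"id": str(i)})
        ET.SubElement(reading, "sensor").text = "temperature"
        ET.SubElement(reading, "value").text = str(20 + i % 5)
    return ET.tostring(root, encoding="utf-8")


raw = build_payload(200)

# Sender side: compress the serialized XML before it goes on the wire.
packed = gzip.compress(raw)

# Receiver side: decompress, parse, and query with the usual XML tooling.
restored = ET.fromstring(gzip.decompress(packed))
values = [v.text for v in restored.findall("./reading/value")]

print(len(raw), "bytes raw;", len(packed), "bytes gzipped;", len(values), "values")
```

Nothing about the producer or consumer logic changes when compression is added; it is a layer wrapped around the same XML, which is why it can wait until profiling says the wire size actually matters.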

I am a firm believer in building something simple that works, and then evolving it to work well. That doesn't mean you don't do back-of-the-envelope calculations before you start, and it doesn't mean you don't architect a clean solution. It means start simple and leverage the tools at hand. Once it works, you can see how it actually behaves in action and start to adjust it based on actual usage. That does mean understanding that V0.1 is really little more than a prototype and thus there is a lot more work left, despite what your marketing guy may say. Just because it looks like a Ferrari doesn't mean it can corner like one.

A lot of people complain about XML being too verbose and wildly inefficient, and they are right. If we went back to the drawing board and redesigned XML today, it would probably be different. If we redesigned it for their specific scenario, it would be different still. But the fact is that it was designed 7+ years ago, which means that there are mature libraries to support it. More interestingly, nothing else has shown up which is so much better as to have replaced it. When you want a generic data format, especially when data interchange is involved, XML is your man. Worry about the verbosity and inefficiency after your prototype is done and you have sold your idea to your boss/VP/VC.
