Monday, October 30, 2006

The OS is almost a commodity

I just got back from the pub, where we managed to drive one poor friend to pull out her Economist just to break up the computer-geek talk, but one of the points of discussion resonates well with Scott Granneman's article "Surprises inside Microsoft Vista's EULA". Microsoft really is trying painfully hard to prove that it is the new '90s IBM. (You know... before IBM figured out Linux, Java, etc...)

My take on all this is that it proves that the lawyers and marketing folk run the show now. I can't see any other reason. The company can't make a profit by making better products, so it will squeeze every last ounce out of the customers it has, thereby making them even more likely to just jump ship. Even when it does have better products (MS Office vs Open Office, C# vs Java) it seems determined to dig its own grave.

The operating system is becoming a commodity. Look at Google, or at how anyone between the ages of 10 and 30 uses a computer. Microsoft needs to stop leaning on Windows and get back to competing! Look at iTunes. If Microsoft could ever build a decent mp3/audio app, they would then kill it by making it Windows specific. Part of the success of both MS Word and Excel depended on their Mac ports (at least in my experience). I know that the world runs Windows on its desktops, but it doesn't run Windows on its phones, or its iPods, or its TiVos. As I have said before, social computing is based on things like the iPod and YouTube. If Microsoft keeps limiting its plays with Windows tie-downs (kinda like tie-ins...) it will continue to fail. I pray that J Allard and co get this with the Zune.

Every day, Linux gets closer to being a viable daily OS for me (again... it has been years, but I lived by a Linux laptop 10 years ago, hard to believe). I'd be writing this on an Apple laptop, except for my lack of trust in any V1 hardware platform. Windows no longer implies the tie-in that it once did.

Thursday, October 26, 2006

My long road there: Binary XML

First an admission: I've been 'doing XML' for a long time, and was not an easy convert to 'Binary-XML'.  XML has many uses, and a curious history that is often at odds with its actual usage.  XML serves many masters.   I've been using XML for markup, configuration, and quick'n'dirty data-stores for longer than I like to admit.  For most of those purposes, Binary XML really just isn't necessary.  In fact, I still don't think Binary XML is particularly advantageous for them.  The advantage of being able to open the data in your text editor of choice far outweighs any of the benefits of a Binary XML encoding.

Yet... I have come to understand that Binary XML is a necessity. While I was still at Microsoft I was tangentially involved with SQL Server's XML support.  One of the things I remember clearly was the motivation for a Binary-XML serialization: float to string conversion.  Conversion from an IEEE float/double to an accurate text representation is horrendously expensive.  While there are queries where you need to do this in the server, there is a huge advantage to offloading that to the client.
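To make the float-to-string point concrete, here is a minimal sketch that writes the same doubles once as decimal text and once as raw IEEE 754 bytes. It is not SQL Server's serialization (or anyone else's); the class and method names are made up for illustration, and it only shows the shape of the trade-off: the text path has to run a conversion per value, while the binary path copies 8 bytes per value and leaves any formatting to whoever reads them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical names; a sketch, not a real Binary-XML encoder.
final class DoubleEncodingSketch {
    // Text form: each value converted to a decimal string that parses back
    // to the same double, one value per line.
    static byte[] asText(double[] values) {
        StringBuilder sb = new StringBuilder();
        for (double v : values) {
            sb.append(Double.toString(v)).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    // Binary form: the 8 raw IEEE 754 bytes of each value, no conversion.
    static byte[] asBinary(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // Reads the raw bytes back; the receiver decides if and when to format.
    static double[] fromBinary(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        double[] out = new double[bytes.length / Double.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getDouble();
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = { 0.1, 1.0 / 3.0, Math.PI, 6.02214076e23 };
        System.out.println("text bytes:   " + asText(values).length);
        System.out.println("binary bytes: " + asBinary(values).length);
    }
}
```

The binary form is always 8 bytes per value, so a server that ships it never pays for the decimal conversion; a client that wants to display the numbers does that work itself.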

Since coming to work at AgileDelta I've been encountering another use-case that didn't really come up when I was at Microsoft: Constrained Bandwidth.  At the extreme, when all you have got is a 300 baud modem, XML's verbosity becomes a real problem.  This issue arises even in more common situations, very chatty web services for example.  A few quick web searches show that a number of people are struggling with the issue.
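A rough back-of-the-envelope sketch of what verbosity costs on such a link. The assumptions are loud ones: 300 baud is treated as 300 bits per second, each byte is assumed to cost 10 bits on the wire (start bit, 8 data bits, stop bit), and the sample message and the class name are invented for the example. It compares a small XML message against the bare values it carries.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example; the message and the framing are assumptions.
final class LinkTimeSketch {
    // Assumed async serial framing: 1 start + 8 data + 1 stop bit per byte.
    static final int BITS_PER_BYTE_ON_WIRE = 10;

    // Seconds needed to push payloadBytes through a link of bitsPerSecond.
    static double secondsToSend(int payloadBytes, int bitsPerSecond) {
        return payloadBytes * (double) BITS_PER_BYTE_ON_WIRE / bitsPerSecond;
    }

    public static void main(String[] args) {
        // Made-up message, purely for illustration.
        String xml = "<reading><sensor>42</sensor><celsius>21.5</celsius></reading>";
        int xmlBytes = xml.getBytes(StandardCharsets.UTF_8).length;
        // The same information as one int and one double, with no markup.
        int packedBytes = Integer.BYTES + Double.BYTES;

        System.out.printf("XML:    %d bytes, %.2f s%n",
                xmlBytes, secondsToSend(xmlBytes, 300));
        System.out.printf("packed: %d bytes, %.2f s%n",
                packedBytes, secondsToSend(packedBytes, 300));
    }
}
```

The packed figure is the floor a hand-built binary protocol could reach for this message; a binary XML encoding aims to get close to it without giving up the XML tooling.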

It is worth stopping to review why people want to use XML.  It isn't just because it is the cool thing to do.  XML has an array of tools and frameworks that make it easier to use XML than building your own custom language or custom format.  One of the big motivations to use XML is that you can depend on all these pre-built, pre-tested components to keep your focus on your primary task.  Just as using Java/C# makes development faster/cheaper because the developers don't need to worry about memory allocation woes, XML provides faster/easier custom data formats.

What happens when you want all that faster/easier-ness but some part of your pipeline is bandwidth constrained?  Or maybe you are struggling with a bottleneck such as the one I mentioned about SQL Server, where a significant amount of your precious CPU is being lost to XML serialization?  This is where Binary-XML can save your bacon.  I've watched some of our current customers implement solutions that leverage our Efficient XML to integrate limited bandwidth systems into an existing system with a minimum of work, where the mere idea was unthinkable before.

Binary-XML isn't for everyone.  Ant should never use Binary-XML.  Nor would it really help XHTML.  But when bandwidth is a limiting factor (say... a smart-phone on a metered data-link?) it may just be the thing to save you a few bytes without the huge implementation/test cost of a custom binary protocol.

I've heard the argument: just wait a year or two, the bandwidth will come.  Rubbish.  Bandwidth does not obey Moore's law.  Take cell-phones as an example.  We in the US are just now getting 3G data services!  It gets worse.  Cell bandwidth is like a cable modem's.  With cable, everyone on the block shares a fixed available connection; with cell-phones, it is shared per tower.  As more people start using data services on their phones, the effective bandwidth will drop because the available bandwidth to the towers is limited.  Limited bandwidth is already a problem for some.  Nothing I've read indicates that this is likely to change anytime soon.

The W3C's Efficient XML Interchange group (of which my employer is a member) is evaluating a number of proposals (including one from my employer).  I have personally seen how our format can produce messages that compete in size with custom-designed formats with a fraction of the effort, while preserving XML's flexibility and extensibility.  It will take a while, but an efficient Binary-XML will remove the need for many custom formats.  Having a standard with implementations integrated into the major platforms will open many new doors.

Tuesday, October 17, 2006

Gillmor and not using Microsoft

The latest Gillmor Gang includes Steve Gillmor stating that people choose not to use Microsoft Windows because of lock-in. That may be true amongst the technical elite, but I've watched a whole series of people choose Apple not because they don't like Windows, but because the Mac looks better and 'feels' better. Thanks to the Internet, computers and their operating systems are now a commodity. Macs running OS X look more friendly. Many people just use a computer for a web-browser and music/photo management, and anyone can do that with either a Mac or Windows. (Note: I don't include Linux in that category... yet)

10+ years ago Microsoft worked so hard to build Internet Explorer to compete with Netscape because Microsoft management was worried that the internet would kill the Windows platform. It has finally started to happen, at least for the casual-user home PC. Microsoft should be very scared. Not because people aren't falling for the lock-in, as Steve implied, but because there is no lock-in for a growing class of users.

Tuesday, October 03, 2006

We Shipped!

AgileDelta, the little company I happen to work for, just shipped our first commercial release of Efficient XML! If you are running into problems with XML's verbosity, and compressing your data is either too slow or not performant enough, our Efficient XML product may be just what you need.

Monday, October 02, 2006

java.util.zip bugs

I've spent much of this morning trying to track down what appears to be a bug in java.util.zip.Deflater/Inflater. The repro is relatively easy:

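import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

public class DeflaterRepro {
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[] {
            (byte)0xCF, (byte)0xCF, (byte)0xCF, (byte)0xCF, (byte)0xCF
        };
        // Compress as raw DEFLATE (nowrap = true: no zlib header or checksum).
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION, true);
        DeflaterOutputStream dos = new DeflaterOutputStream(baos, deflater);
        dos.write(data);
        dos.close();

        // Decompress and count the bytes; a correct round trip reports 5.
        ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
        Inflater inflater = new Inflater(true);
        InflaterInputStream iis = new InflaterInputStream(bais, inflater);
        int count = 0;
        while (iis.read() != -1)
            count++;
        System.out.println("count: " + count);
    }
}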

The values in the data array matter. Other values also demonstrate the error, but most values work without issue. You can also avoid it by not selecting the 'nowrap' option, but that dramatically increases the output size for small inputs.
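One possible explanation, offered as a guess rather than a diagnosis: the Javadoc for the Inflater(boolean nowrap) constructor notes that when 'nowrap' is used, an extra "dummy" byte must be supplied as input. The sketch below pads the compressed data with one such byte before inflating, and also measures what the zlib wrapper costs when 'nowrap' is off (a 2-byte header plus a 4-byte Adler-32 checksum). The class and method names are hypothetical, and whether the padding cures the failure above on a given JVM is an assumption to verify.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class; a sketch of a workaround, not a confirmed fix.
final class NowrapSketch {
    // Compresses data, with or without the zlib header and checksum.
    static byte[] deflate(byte[] data, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION, nowrap);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(baos, deflater)) {
            dos.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // we supplied the Deflater, so we release it
        }
        return baos.toByteArray();
    }

    // Inflates raw DEFLATE data and returns how many bytes came back.
    static int inflateCount(byte[] raw) {
        // Append one zero "dummy" byte, as the Inflater docs ask for with nowrap.
        byte[] padded = Arrays.copyOf(raw, raw.length + 1);
        Inflater inflater = new Inflater(true);
        int count = 0;
        try (InflaterInputStream iis =
                 new InflaterInputStream(new ByteArrayInputStream(padded), inflater)) {
            while (iis.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            inflater.end();
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] data = {
            (byte) 0xCF, (byte) 0xCF, (byte) 0xCF, (byte) 0xCF, (byte) 0xCF
        };
        byte[] raw = deflate(data, true);
        byte[] wrapped = deflate(data, false);
        System.out.println("count: " + inflateCount(raw));
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("wrapped bytes: " + wrapped.length);
    }
}
```

For a five-byte input the six bytes of zlib framing are a large share of the total, which is why dropping 'nowrap' is an unattractive fix for small messages.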

I'm not sure how to report the issue to Sun, so I'm blogging it here.