Reducing the Length of XML Data Returned from a Web Service

I’m working on a project for a large consulting customer. The application we’re developing has a service-oriented architecture (SOA) that uses WCF services to return data. A requirement is that the data be communicated as XML for maximum interoperability between potentially heterogeneous (Microsoft and non-Microsoft) environments.

The problem I’ve run into is that some of the data sets can be quite large. We’re using the built-in functionality of the ADO.NET DataSet object to serialize the data and table schema information to XML. This is very fast and convenient to code, but the generated XML is very verbose and adds a lot of overhead. As a result, clients of the web service failed with the message “The maximum message size quota for incoming messages (n) has been exceeded”. I increased the MaxReceivedMessageSize property on the binding in the web.config file to 50000000 (yes, that’s 50 MB), but I still ran into the same error. I really didn’t want to keep increasing the maximum message size and send 50 MB, or even more, of XML across the wire.

The solution I implemented to reduce the size of the XML, while still returning a string value from the service, has two steps. First, we compress the XML using one of the built-in .NET compression classes. I chose the System.IO.Compression.DeflateStream class. (The documentation for the GZipStream class states that it uses the Deflate algorithm, so I decided to use DeflateStream directly.) Once the data has been compressed, it is in a binary format, so we move to step two of the process: base64 encoding the binary data. This results in a text string that represents the compressed binary data.

On the server, the code looks like this:

// Create a compressed stream in memory to hold the XML schema and data.
System.IO.MemoryStream memStream = new System.IO.MemoryStream();
System.IO.Compression.DeflateStream compressedStream =
    new System.IO.Compression.DeflateStream(memStream,
            System.IO.Compression.CompressionMode.Compress, true);
// Let the System.Data.DataSet serialize the XML directly to the compressed stream.
ds.WriteXml(compressedStream, XmlWriteMode.WriteSchema);
compressedStream.Close();
// Return a base64 encoded string of only the bytes actually written.
// (GetBuffer() would return the whole internal buffer, including unused trailing bytes.)
return Convert.ToBase64String(memStream.ToArray(), Base64FormattingOptions.None);

Pretty simple.

Now, on the client that calls the WCF service to request the data, we have to decode the base64 encoded string back into the deflated binary data and decompress that back into our original XML.

// Convert the base64 encoded data back into the binary compressed format.
System.IO.MemoryStream memStream =
    new System.IO.MemoryStream(Convert.FromBase64String(base64EncodedData));
System.IO.Compression.DeflateStream compressedStream =
    new System.IO.Compression.DeflateStream(memStream,
            System.IO.Compression.CompressionMode.Decompress, true);
// Read the compressed XML data back into an ADO.NET DataSet.
DataSet ds = new DataSet();
ds.ReadXml(compressedStream);
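
To try both halves together outside of a WCF service, the two snippets above can be wrapped into a pair of helper methods. This is only a sketch: the class name DataSetCodec, the method names Encode and Decode, and the sample "Order" table are made up for illustration and are not part of the project described here. It also uses using blocks so the streams are disposed even if serialization throws.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class; the names are illustrative only.
public static class DataSetCodec
{
    // Server side: DataSet -> deflate-compressed XML -> base64 string.
    public static string Encode(DataSet ds)
    {
        using (var memStream = new MemoryStream())
        {
            // leaveOpen: true keeps memStream usable after the DeflateStream is disposed.
            using (var deflate = new DeflateStream(memStream, CompressionMode.Compress, true))
            {
                ds.WriteXml(deflate, XmlWriteMode.WriteSchema);
            } // Disposing the DeflateStream flushes the final compressed block.

            // ToArray() copies only the bytes written, not the unused buffer capacity.
            return Convert.ToBase64String(memStream.ToArray());
        }
    }

    // Client side: base64 string -> compressed bytes -> DataSet.
    public static DataSet Decode(string base64EncodedData)
    {
        using (var memStream = new MemoryStream(Convert.FromBase64String(base64EncodedData)))
        using (var deflate = new DeflateStream(memStream, CompressionMode.Decompress))
        {
            var ds = new DataSet();
            ds.ReadXml(deflate);
            return ds;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data standing in for a real result set.
        var ds = new DataSet("Orders");
        DataTable table = ds.Tables.Add("Order");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Customer", typeof(string));
        for (int i = 0; i < 1000; i++)
        {
            table.Rows.Add(i, "Customer " + i);
        }

        string encoded = DataSetCodec.Encode(ds);
        DataSet roundTripped = DataSetCodec.Decode(encoded);

        Console.WriteLine("Rows before: {0}", table.Rows.Count);
        Console.WriteLine("Rows after:  {0}", roundTripped.Tables["Order"].Rows.Count);
    }
}
```

Because the schema is written inline with XmlWriteMode.WriteSchema, the client gets the column types back along with the rows.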

This resulted in pretty big savings. One of our XML string representations of the data went from 52,011,436 bytes to 1,646,592 bytes. Another data set went from approximately 21 MB down to 732 KB. In both cases, the compressed and base64 encoded data was about 1/30th the size of the uncompressed data. That’s significantly less data to stream across the wire. The amount of processing overhead is minimal.
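
If you want to check the savings on your own data, a rough measurement can look like the sketch below. The SizeComparison class, the Measure method, and the generated sample rows are assumptions for illustration. The ratio you get depends heavily on how repetitive your data is, so don't expect to see the same numbers as above.

```csharp
using System;
using System.Data;
using System.IO;
using System.IO.Compression;

// Hypothetical measurement helper; the names are illustrative only.
public static class SizeComparison
{
    // Returns { raw XML bytes, compressed bytes, base64 characters } for a DataSet.
    public static long[] Measure(DataSet ds)
    {
        long xmlBytes;
        using (var plain = new MemoryStream())
        {
            ds.WriteXml(plain, XmlWriteMode.WriteSchema);
            xmlBytes = plain.Length;
        }

        using (var compressed = new MemoryStream())
        {
            using (var deflate = new DeflateStream(compressed, CompressionMode.Compress, true))
            {
                ds.WriteXml(deflate, XmlWriteMode.WriteSchema);
            }
            string encoded = Convert.ToBase64String(compressed.ToArray());
            return new long[] { xmlBytes, compressed.Length, encoded.Length };
        }
    }

    public static void Main()
    {
        // Made-up sample data; substitute a real DataSet to get meaningful numbers.
        var ds = new DataSet("Sample");
        DataTable table = ds.Tables.Add("Row");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i, "Name " + i);
        }

        long[] sizes = Measure(ds);
        Console.WriteLine("XML bytes:        {0}", sizes[0]);
        Console.WriteLine("Compressed bytes: {0}", sizes[1]);
        Console.WriteLine("Base64 chars:     {0}", sizes[2]);
        Console.WriteLine("Reduction:        {0:F1}x", (double)sizes[0] / sizes[2]);
    }
}
```

Keep in mind that base64 produces four characters for every three bytes of input, so the encoded string is about a third larger than the compressed binary. That is the price of still returning a plain string from the service.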

The approach I took also minimizes the number of copies of the data stored in memory.  Before compressing the data, one of the larger data sets we had would cause an out-of-memory exception within the WCF service when I tried to serialize it from the DataSet to XML.  Since I made the change to the code to use compression, this hasn’t happened.

So, with just a few extra lines of code leveraging built-in .NET classes for compression and base64 encoding, I was able to reduce the size of the XML that the WCF data service sends over the wire. I hope you find this tip helpful in reducing the latency of streaming long XML strings out of your WCF or web data service.
