Saturday, 02 December, 2000

Advantages of Text Protocols

I recently had occasion to peek under the hood at some Internet communications protocols.  I knew that SMTP, for example, was very simple; it had to be in order to work for so long (almost 20 years, since RFC 821 in 1982) on so many different types of hardware.  Like many of the other protocols, SMTP is entirely text based, and its commands are simple enough that you could use a terminal-mode program (telnet, for example) to send mail directly through an SMTP server without using a mail client.  Not that you'd necessarily want to, but you could.  One benefit of a simple text-based protocol is that it's easy to test by hand.  Complicated binary protocols that support compression and encryption may be more efficient, but they require similarly complicated test programs, and it's often difficult to determine where an error lies: in the server you're testing or in the test program itself.
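To give a feel for how little there is to the SMTP conversation, here is a minimal sketch that builds the lines a person would type into a terminal session to send one message.  The function name `smtp_commands`, the host name, and the example.org addresses are hypothetical, made up for illustration; the commands themselves (HELO, MAIL FROM, RCPT TO, DATA, QUIT) are the basic ones from the SMTP specification.  The sketch only produces the client's side of the exchange and does not talk to a server or check the numeric replies a real server sends back after each command.

```python
def smtp_commands(client_host, sender, recipient, subject, body_lines):
    """Return the client-side lines of a minimal SMTP session, in order."""
    lines = [
        f"HELO {client_host}",        # identify ourselves to the server
        f"MAIL FROM:<{sender}>",      # envelope sender
        f"RCPT TO:<{recipient}>",     # envelope recipient
        "DATA",                       # everything after this is the message
        f"From: {sender}",
        f"To: {recipient}",
        f"Subject: {subject}",
        "",                           # blank line separates headers from body
    ]
    for line in body_lines:
        # A line consisting of a single "." ends the message, so any body
        # line that starts with "." is sent with an extra leading ".".
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")                 # end of message
    lines.append("QUIT")
    return lines


if __name__ == "__main__":
    # Hypothetical host and addresses, for illustration only.
    for line in smtp_commands("client.example.org", "alice@example.org",
                              "bob@example.org", "Testing by hand",
                              ["Typed straight into a terminal session."]):
        print(line)
```

Every line it prints is something you could type yourself, which is exactly what makes the protocol easy to test by hand.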

I've run into similar problems with binary file formats.  Text has its disadvantages:  it's usually larger than the equivalent binary file, parsing code is tedious to write, and being human-readable encourages some people to "try" things by editing the files by hand.  But being human-readable also makes it easy for me to verify a program's I/O, text parsing code isn't all that hard to write, and the additional space required by text is easily offset by the ease of debugging and verification.  If the text format becomes too much of a resource problem, it's very easy to convert to a binary format after the rest of the program is working.  I use text for file formats and communication protocols whenever it makes sense, which is a lot more often than you might think.
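As a rough illustration of both points (the parser isn't much code, and switching to binary later is a small step), here is a sketch using a made-up format of `name = value` lines with `#` comments and integer values.  The format, the function names `parse_records` and `to_binary`, and the binary layout are all assumptions invented for this example, not any particular program's format.

```python
import struct


def parse_records(text):
    """Parse 'name = value' lines into a dict; '#' starts a comment."""
    records = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue                          # skip blank and comment-only lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'name = value', got: {raw!r}")
        records[name.strip()] = int(value.strip())
    return records


def to_binary(records):
    """Pack each record as: 1-byte name length, UTF-8 name, 4-byte
    little-endian signed integer.  (Hypothetical layout; names longer
    than 255 bytes would raise struct.error.)"""
    out = bytearray()
    for name, value in records.items():
        encoded = name.encode("utf-8")
        out += struct.pack("<B", len(encoded)) + encoded
        out += struct.pack("<i", value)
    return bytes(out)


if __name__ == "__main__":
    text = "# window settings\nwidth = 640\nheight = 480  # pixels\n"
    records = parse_records(text)
    print(records)
    print(len(text), "bytes as text,", len(to_binary(records)), "bytes as binary")
```

The text version can be checked with any editor while the program is still being debugged; the binary packing can be bolted on afterward if size ever becomes a problem.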