m4: An underappreciated tool

Sep 8th
Posted by Michael Trausch as computing, cool stuff, tips & tricks

In 40+ years of computing, there are bound to be programs and tools that people are simply not aware of, have forgotten about, or have watched move from the forefront to behind the scenes.  There are actually lots of these.  One of the reasons that I watch Freshmeat for software releases is to keep looking for things that I might be able to use or that are just plain nifty.  After all, there is a lot of software out there.

One piece of software that is almost ubiquitous (at least, outside of the not-so-wonderful world of Microsoft Windows) is a small program known as m4.  You may or may not have heard of it, and you have probably never used it outside of the GNU build system, if you’ve used that at all.  m4 is fantastic, if hard to learn, and it is a more or less standard component of most UNIX-like systems.  It is a macro processor that dates back to V7 UNIX; it has been standardized as part of POSIX, and GNU has its own version with extensions.

Essentially, what you do with m4 is define macros that expand into text, and then use those macros to simplify things.  In other words, m4 makes it possible to follow the DRY ("don't repeat yourself") principle when writing text documents, configuration files, and other text-based files.  For example, I’ve written some macros to make it easier to work with djbdns configuration files, which are available on my software page.  Instead of writing a line that looks like this:

+alltray.trausch.us:173.15.213.185:600

I can write a line that looks like this:

DD_A_FULL(alltray, 173.15.213.185)
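As a rough idea of what could sit behind a call like that, here is a minimal sketch of such a macro.  It is not the definition from the software page: the helper names DD_DOMAIN_NAME and DD_TTL_VALUE are invented for illustration, and the real macros may be organized quite differently.

```m4
dnl Hypothetical definitions; the real macros may differ.
define(`DD_DOMAIN_NAME', `trausch.us')dnl
define(`DD_TTL_VALUE', `600')dnl
dnl DD_A_FULL(host, ip) emits one tinydns address line for host.
define(`DD_A_FULL', `+$1.DD_DOMAIN_NAME:$2:DD_TTL_VALUE')dnl
DD_A_FULL(alltray, 173.15.213.185)
```

Fed to m4, the last line should come out as the hand-written +alltray.trausch.us:173.15.213.185:600 record shown earlier.  The domain and TTL live in exactly one place, which is the whole point.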

Or, for multiple hosts that share the same address, I can (actually, I do; this is from my configuration file) do this:

DD_A_IP(173.15.213.185)
DD_A(allspice)
DD_A(alltray)
DD_A(mike)
DD_A(morganne)
DD_A(phone)
DD_A(projects)
DD_A(sip)
DD_A(vcs)
DD_A(www)
DD_A(gallery)
DD_A(wiki)
DD_AAAA_IP(2001:470:1f11:3f::1)
DD_AAAA(allspice)
DD_AAAA(alltray)
DD_AAAA(mike)
DD_AAAA(morganne)
DD_AAAA(phone)
DD_AAAA(projects)
DD_AAAA(sip)
DD_AAAA(vcs)
DD_AAAA(www)
DD_AAAA(gallery)
DD_AAAA(wiki)
DD_AAAA(spicerack)
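The "set the address once, then list the hosts" pattern can be sketched with a macro that defines another macro.  This is only an assumption about how such a pair might be built: DD_CURRENT_IP is an invented name, and the domain and TTL are hard-coded here to keep the example short.

```m4
dnl Hypothetical definitions; the domain and TTL are hard-coded for brevity.
dnl DD_A_IP(ip) remembers an address; DD_A(host) reuses it.
define(`DD_A_IP', `define(`DD_CURRENT_IP', `$1')')dnl
define(`DD_A', `+$1.trausch.us:DD_CURRENT_IP:600')dnl
DD_A_IP(173.15.213.185)dnl
DD_A(allspice)
DD_A(alltray)
```

Each DD_A call should expand to a full record for that host at the remembered address, and a later DD_A_IP call simply replaces the remembered value for the lines that follow it.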

The nifty part is that the AAAA lines actually expand to:

:allspice.trausch.us:28:\040\001\004\160\037\021\000\077\000\000\000\000\000\000\000\001:600

That, as you can imagine, is quite a pain to manage manually.  However, with m4 and some helper programs written in C (one for formatting IPv6 addresses for AAAA records, another for formatting SRV records for things like XMPP services), managing my configuration has become so easy that I don’t really have to think very hard to do it.  I don’t have to seek out any online generators for SPF records, AAAA records, or SRV records.  I don’t have to worry about repeating a TTL in every line, either, because I set up a domain like so:

DD_DOMAIN(trausch.us)
DD_TTL(600)
DD_SOA(spicerack,mbt.zest.trausch.us,1800,600,21600,600)
DD_NS(173.15.213.185,spicerack)
DD_NS(,primary.staffasap.com)
DD_PTR(173.15.213.185,spicerack)

For a section that should have a longer TTL, I say DD_TTL(longer_ttl) and then write the new lines.  I didn’t try to turn the configuration file format into something BIND-like (after all, part of the reason I left BIND was that I hated managing its configuration files), but I did make the configuration easier for me to manage.  Now, when I need a new address in the file, or need to change anything about my domain, it’s as simple as adding or deleting macro calls, and the rest is handled for me.  (I also modified the Makefile that generates the djbdns "data.cdb" file so that m4 is called automatically when I update the data.m4 file.)
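To make the domain-wide settings and the AAAA expansion a little more concrete, here is a sketch under some loud assumptions.  My real setup uses C helper programs for the octal escaping; this version swaps in GNU m4's eval instead, and it only handles an address with all eight groups written out (no "::" shorthand).  DD_AAAA_FULL and every DD_CURRENT_*, DD_OCTET, and DD_GROUP* name is invented for the example, as is the 86400 TTL.

```m4
dnl Sketch only: not the real macros, and no support for "::" shorthand.
define(`DD_DOMAIN', `define(`DD_CURRENT_DOMAIN', `$1')')dnl
define(`DD_TTL', `define(`DD_CURRENT_TTL', `$1')')dnl
dnl One byte as a backslash followed by three octal digits.
define(`DD_OCTET', `\eval(`$1', 8, 3)')dnl
dnl One 16-bit hex group as two escaped bytes, high byte first.
define(`DD_GROUP', `DD_OCTET(`(0x$1 >> 8) & 255')DD_OCTET(`0x$1 & 255')')dnl
dnl Apply DD_GROUP to each argument in turn until none are left.
define(`DD_GROUPS', `ifelse(`$1', `', `', `DD_GROUP(`$1')DD_GROUPS(shift($@))')')dnl
dnl DD_AAAA_FULL(host, g1, ..., g8) emits a generic type-28 record.
define(`DD_AAAA_FULL', `:$1.DD_CURRENT_DOMAIN:28:DD_GROUPS(shift($@)):DD_CURRENT_TTL')dnl
DD_DOMAIN(trausch.us)dnl
DD_TTL(600)dnl
DD_AAAA_FULL(allspice, 2001, 470, 1f11, 3f, 0, 0, 0, 1)
DD_TTL(86400)dnl
DD_AAAA_FULL(spicerack, 2001, 470, 1f11, 3f, 0, 0, 0, 1)
```

The first record should match the :allspice.trausch.us:28:... line shown earlier, and the second should differ only in its host name and in ending with the longer TTL.  Redefining a macro partway through the file is all it takes to change a setting for everything that follows.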

m4 can be a major pain to use until you learn it.  And while I do run into the occasional pitfall with it, I’ve used it to add preprocessing capability to C# and Vala code to make my life easier, and I’ve used it for various other things in the past outside of the GNU build system.  I absolutely love it.  It is not for everyone (indeed, not everyone has a use for a macro processor), but if the need ever arises, it is well worth your time to learn m4.  It sure beats having to learn macro processors that are specific to a particular environment, and thanks to Cygwin, you can use GNU m4 on Windows, too, if you happen to be so unlucky as to have to live life with 20% of your CPU cycles (and who knows how much RAM) paid as a tax to fend off viruses and malware.  ;-)

And no, m4 isn’t just for programming, nor is it just for programmers.  Anyone who works with a great deal of text is likely to have a good use case for it.  People have been known to avoid things like PHP, and to relieve server-side stress, by writing a set of macros and generating all of their pages statically (without having to manage each page by hand).  This can be good if you have to work with a server that is extremely light on resources and shouldn’t be running server-side scripting languages or fetching data from databases.  Instead, you can do all of the database lookups and page formulations ahead of time, meaning that the so-called "slashdot effect" is unlikely to cause you trouble on nearly any type of server, since it is only serving static pages.  Another use is writing documents that require things like legal boilerplate, though the days of text processing and typesetting for the masses have generally gone away, sadly.  There is a potentially unlimited number of applications that m4 could be used for, and while m4 is somewhat difficult to learn and can be frustrating at times, it’s standard and it has decades of history, like UNIX in general.
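The static-page idea can be sketched in a few lines.  This is a toy template, not anyone's real site generator: PAGE and SITE_NAME are names invented here, and a real site would want more than two arguments.

```m4
dnl Hypothetical template: PAGE(title, body) wraps body in shared boilerplate.
define(`SITE_NAME', `Example Site')dnl
define(`PAGE', `<html>
<head><title>$1 - SITE_NAME</title></head>
<body>
<h1>$1</h1>
$2
<p>Copyright SITE_NAME</p>
</body>
</html>')dnl
PAGE(`About', `<p>This page was generated ahead of time.</p>')
```

Running m4 over a file like this and redirecting the output to an .html file gives a complete static page.  The arguments are quoted so that commas in the page body are not mistaken for argument separators, and changing the shared header or footer means editing one definition rather than every page.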
