Arnold (Aharon) Robbins has just received the USENIX Flame for his contributions to the USENIX community over the last several decades. Although Arnold and I have never met until now, I found exchanging emails for this interview easy to do, as if we had known each other for years. I guess some of that is the Unix effect.
Arnold has written or revised over a dozen books, including the documentation for gawk, GNU awk. He is also the maintainer of gawk.
Rik Farrow: What was your first Unix system?
Arnold Robbins: I was first exposed to Unix in 1980 on a PDP-11/70 running a V6 variant (IS/1 from Interactive Systems). I was exposed to C at the same time.
I read the K&R C book (first edition), and found that my head was swimming with all the details; it was one of the densest books I'd ever read. I then re-read the book, and everything clicked. I fell in love with C.
At the same time, I was also exposed to Kernighan & Plauger's Software Tools. That book changed my life! It turned on all the lights for what the various Unix tools were and how to really use Unix to the fullest. In particular, regular expressions and grep and ed, but also sort as a key way to organize data for further processing.
My first personal Unix system was an AT&T 3B1, the slightly bigger brother of the "Unix PC". Both had an MC 68010 with virtual memory, a System V-based kernel, and a kludgy shared library implementation which was mainly for use by the C library. The final userland for it had some bits from System V Release 2.
The 3B1 had room for a larger disk; eventually I had a 67 Meg disk and I seem to recall 2 megabytes of memory. It was amazing how much you could do with what today is considered minuscule quantities of memory and storage.
The 3B1 had a funky windowing system with a three-button mouse, a great keyboard, and a built-in 1200 baud modem. I was part of a group of unixpc/3B1 owners who set up our own UUCP-based network. Initially this was with the built-in modem; later we used Telebit TrailBlazers and WorldBlazers. (Remember those? :-)
I spent many happy hours on that system working on gawk and its documentation. (I was single. :-)
Phil Pemberton has a lovely 3B1 emulator available at https://github.com/philpem/freebee for anyone who wishes to journey into the past. There are links there to a drive image I made for it a while back.
Later on, I donated that machine to a charity and got a small Sun SPARCstation, and from there I moved on to GNU/Linux.
RF: How did you get involved in doing service to the community, like working on gawk?
AR: I had always had something of a (small) flair for languages, doing well in French in high school, and getting higher scores on the English SAT than on the Math one.
When I started studying Computer Science, I found programming languages, compilers and interpreters to be of particular interest.
I had read the original V7 paper on awk, but found the language described to be too hard to wrap my head around, so I largely ignored it.
However, in October of 1987, I came across The AWK Programming Language in a bookstore. I bought the book and read it and finally "grokked" awk. The book described "new" awk, which wasn't so easily available at the time, and (a) knowing about the GNU Project, (b) having an interest in languages, and (c) being single with lots of spare time, I decided to see if there was a GNU version of awk out there to play with.
There was, but it was a buggy clone of the original awk. I got in touch with the GNU project about updating it to match "new" awk and was told that someone else — a nice guy named David Trueman — had already volunteered to do that, but that I could join in if I wanted. I got in touch with David, we set up UUCP links, and got to work. For a long time we exchanged diffs via email. We met in person a time or two at USENIX conferences.
David did most of the heavy lifting converting gawk to match new awk, although I contributed bug fixes and some features, and I started to work on improving the manual. The existing manual was bare-bones, at about 90 pages. Today it's about 600 or so and has been published, first by Specialized Systems Consultants (SSC), later by O'Reilly.
In the fall of 1993 David had to drop out, and I took over as sole maintainer. Even then, there were contributors who helped in the porting. If I recall correctly, MS-DOS was the first port and Vax/VMS was the second. Even today, OpenVMS is still supported!
RF: Was there an inflection point of some sort during gawk's development?
AR: Initially, people used gawk alongside Unix awk and nawk (new awk). Even in the early days, though, the GNU "no arbitrary limits" principle was in place, and sometimes this made a difference. For example, in the mid-90s sometime, Rick Adams at UUNET was in touch with me. He used gawk to do his accounting, because Unix awk just fell over and died on the amounts of data he had to process.
The "inflection point", though, came with the creation of GNU/Linux distributions. It was one thing when gawk was used alongside Unix awk. It was a whole 'nother ball game when gawk became the awk on a user's system. During Linux's initial few years, gawk really stabilized and became much more "production" quality.
I would also say that this is the point when what I was doing became a real service to the community.
RF: You've been the maintainer of GNU awk for many years. Maintaining open source software tools seems to me like a source for receiving criticism from people who have ideas they aren't willing or able to implement themselves. How do you handle that?
AR: Sometimes I handle it better than at other times. This is an issue that plagues other Free Software maintainers; Chet Ramey (of Bash fame) and I discuss this stuff a lot. The attitude of "it's Free Software, so you work for me" is not uncommon, and ranges from annoying to distressing to making you wonder why you continue to bother. In other words, being a BDFL (Benevolent Dictator For Life, a term coined by Guido van Rossum) isn't always everything it might be cracked up to be.
Now is also an opportunity to mention https://xkcd.com/2347/; sometimes I feel like I'm the guy he's talking about.
In 2012, Chet and I co-wrote An Open Letter To Those of You Who Are Unhappy, available at https://www.skeeve.com/fork-my-code.html. Not that it helped a whole lot...
More recently, I have simply stopped posting in comp.lang.awk, as the level of discourse there is stunningly low. There is no "awk community" like there is for Perl, Python, Ruby, Go or Rust. In some ways this is sad. In others, it's OK, as I don't have the cycles for more than what I'm doing now.
As another example, there was (and still is) an extremely annoying user. I was almost at the point of giving up on maintainership due to him. Instead, I repurposed the "bug reports for gawk" list to be bug reports only, and started a "help for gawk" mailing list, which I don't read. I then pointed him to it. He sends his requests there, and other people deal with him. And I am considerably happier.
Even more recently, another user was so obnoxious that I had the GNU people block his access to the gawk lists.
Fortunately, these people are fairly few, and every once in a while, out of the blue, I get a "gawk is wonderful, I couldn't manage without it" kind of email from a random user out in the world. Those emails make my day, and help me keep going.
RF: What lessons have you learned maintaining gawk that you wish you had understood earlier on?
AR: I'm glad you asked me that.
1. Just because someone asks for a feature, it doesn't mean that you have to add it. It took me a while to figure out that often, when someone says "gawk needs XXX", it means "I want gawk to do XXX". But XXX may not be worth doing.
2. Once a feature goes in, it won't come out. So be careful about adding features. In the early days, adding features is easy, and provides an outlet for one's creativity. This makes it fun.
But over time, features accrue, and they start to interact with one another in unexpected ways (C++ syndrome), and the accumulated baggage of code (and documentation!) makes the program too large for one person to maintain well. This also raises the barrier of entry for new contributors.
I have to say that this is one of the areas where the original Bell Labs people stood out: they knew when to stop. It seems that almost nobody else in the software world does.
3. Be careful what you let other people contribute, particularly if it causes significant upheaval in the code. You are still the maintainer, and have final responsibility for the code. That contributor might leave the project, leaving you holding the bag. This happened. I survived, but it wasn't pleasant for the first several years.
4. You can't please everyone, so don't try to. We talked about this some earlier.
5. Be sure you like the project; you may end up as the BDFL! If you'd told me 36 years ago that I'd still be working on gawk, I'd've been pretty surprised, and probably would have said, "nah! that's unlikely". On the other hand, gawk led me to my writing career, for which I'm grateful.
RF: Writing books is more a labor-of-love most of the time than a profitable enterprise. Yet you have written or revised over a dozen books. Tell us about that, please.
AR: My writing indeed started out as a labor-of-love: working on the gawk manual. One of the goals of the GNU Project was always to have good free documentation along with the code.
At some point, I acquired a copy of O'Reilly's sed & awk, written by Dale Dougherty. I think I was thinking I might learn something, or I wanted to see what it said. As I read, I marked up things that were incorrect. When I was done, I got in touch with Professor Eugene Spafford at Purdue to ask who at O'Reilly I could send the book to. Gene and I had been in grad school together at Georgia Tech, and he had co-written a popular O'Reilly book on Unix security. In any case, I sent the marked-up book on to O'Reilly and then more or less forgot about it.
Sometime later, they contacted me and asked if I would be interested in revising the book, as Dale didn't have the cycles. I said "sure, please find the marked up copy and send it back to me." They did, and I revised the book. Although I signed a contract for royalties, I wasn't expecting it to make much of a financial difference in my life.
The first royalty check was a big surprise. It was much larger than anything I expected. This was early in 1997, before everyone just searched on the Internet, and the book business was quite profitable!
At that point, I said, "Gee, do you have any more books that need updating?" I was moving out of the US and knew I'd have some free time. So I signed contracts to update their Learning vi book, and then to revise UNIX In A Nutshell.
After I moved to Israel, I found that start-up companies were where the interesting jobs were, but they also expected 60-80 hour weeks. I was making enough money writing that I just decided to do that full time instead. I was able to support myself and my family for several years, until a little after the first Internet bubble burst. I had to go to work at a day job early in 2005.
I've kept my hand in the writing, but at a much reduced level. The most recent was a revision of the Learning the vi and Vim editors published a few years ago. And right now I'm working on a new edition of another of my books.
Writing is enjoyable, but it is definitely work. I think that there is still a lot of value in a well-organized, well-written technical book that you can't get out of searching the Internet. Sadly, it seems that the younger generation of developers don't necessarily realize this.
RF: Do you want to tell me about the gawk manual, which sounds like a very long book in itself? I think that started before your book revising projects.
AR: Yes indeed. I had to go back to my bookshelf to see what the FSF had done with the manual. They published six editions: October of 1989, October of 1991, October of 1992, August of 1993, January of 1996 and January of 1998. It goes without saying that they didn't pay me any royalties, as their publications help them raise money, and I was fine with that. It's interesting that the first one they published came out about a year after I first found The AWK Programming Language. It was a productive time for me.
Sometime in the late 80s or early 90s I'd met Phil Hughes of SSC, I think at a USENIX conference. SSC had a booming business in vi, sh, and other fold out reference cards, and Unix instruction and consulting. At some point, I offered to turn their sh card into one for ksh88 (and later also for ksh93). This worked out, and I started getting some money from them. This was really my first venture into getting paid for writing.
In the mid-90s, "real" publishers wouldn't touch a book with a "free to copy" license like the gawk manual had (I had tried with O'Reilly and Pearson). However, late in 1995, I got Phil to read the book and he agreed that it was worth publishing. It came out in January of 1996 as Effective AWK Programming. We published a second edition, this time with an included AWK reference card, in 1997. I continue to update the troff source for the card; it's part of the gawk distribution.
It also helped the relationship with SSC that they were in Seattle, where my wife is from; I was able to visit them in person a few times when visiting family there. Phil even lent me a Linux laptop one time so that I could do some book work while in Seattle.
For comparison, my work with O'Reilly started around 1996; the update of sed & awk came out in the spring of 1997.
SSC then started to switch its focus to Linux Journal, and my relationship with O'Reilly continued to ramp up. In May of 2001 O'Reilly published the third edition of Effective AWK Programming; by then they had a few other free (as in freedom) books and were more open to the idea. A percentage of the royalties went to the FSF, as well.
As I mentioned, in 2005 I went back into the working world, but I continued to maintain gawk and its manual. In March of 2015, O'Reilly published the fourth edition of Effective AWK Programming.
As gawk has acquired features, so too the book has grown. I think that both gawk and the book are as feature-complete as I can make them, although I do eventually want to publish one more edition of the book, as some of it could use reworking.
RF: Do you think that Unix exists any longer? There are the BSDs, the true descendants of V7, and MacOSX still has FreeBSD embedded in there. Linux, well, Linux is Unix-like, but like GNU, it's not Unix. And at over 400 system calls and about 25 million LOC, Linux seems a far ways away from the roots of Unix and the simplicity of the early versions of the OS.
AR: This isn't a simple question. Today's computers aren't the PDP-11s and Vaxen of yesteryear, and Linux runs on everything from cell phones to home routers to laptops to desktops to servers to supercomputers with hundreds of cores.
A guy I know, Dan Forsyth, long ago stated to me: "Elegance is power cloaked in simplicity".
If we define "Unix" as: "An elegant system that can solve current problems yet be easily understood in its totality", then I'd have to point at Plan 9.
I'm not familiar with any of the BSDs, so I don't know how they stack up in terms of number of system calls and code size.
I'll also point out that, practically speaking, nobody really wants a V7 PDP-11 as their daily driver. I like having networks, windowing systems, streaming video, and yes, command-line editing in my shell.
RF: Speaking of Unix philosophy, of keeping things simple and elegant, systemd is in the news again. Not just for wanting to replace sudo with its own builtin run0, but with a command that when run deletes your home directory. There's a very long thread about this on The Unix Heritage Society's mailing list.
AR: It's clear that current developers aren't being taught "the Unix philosophy" and thus don't appreciate it. It's also clear that modern systems have to deal with many more complicated things than the early Unix systems did. I'd like to think that there's a happy medium that will be reached one day, but I don't know how to make that happen.
Personally, I don't like systemd. But (as I wrote on TUHS), I also don't care enough to go out and find another system. I've been using Ubuntu at (different) day jobs since Ubuntu 10.04, and at home since at least Ubuntu 12.04, maybe even 10.04. I've been using Ubuntu Mate for at least eight years. It "just works", and that's what's most important to me. I still have a day job, kids living at home, Free Software to maintain, books to write, etc. I don't want to waste time creating "Ubuntu Mate minus systemd", since I don't have that time available in the first place.
If I were running a bunch of big servers, would I feel differently? Probably. Fortunately, I'm not in that situation.
RF: Anything else?
AR: Just "thank you again" to USENIX for the Flame award, and thanks to you for the interview. I enjoyed it.