I had run across a blog post by Matt Welsh about using Rust at a startup [1], and thought perhaps I could get Matt to expound on using Rust. I like the idea of Rust a lot, as its memory safety features do away with about two-thirds of the vulnerabilities found in software written in C and C++. I filed the thought away until Matt caused another stir by writing an editorial for CACM [2].
I had met Matt during a HotOS workshop in 2013, and I later tried to get him to write about a favorite experience finding some obscure bug. Matt politely declined. Now Matt is the CEO of the startup Fixie (fixie.ai), and the CACM article suggested that he would be more interested in talking this time around.
Rik Farrow: You wrote a blog post [1] in October 2022 saying that Rust is not the best choice for a startup. I was sad to read that Rust, which offers performance equivalent to C/C++ while eliminating about 67% of the typical vulnerabilities caused by using C (according to Microsoft's work on CHERI [3]), is just too hard to use.
Matt Welsh: As I wrote in my blog post, at a startup it's very important to move fast, and often that means using less-good, but familiar and popular, tools. Choosing esoteric tools or programming languages is a big risk in a startup setting, as you'll have difficulty hiring people who already know those tools. Rust is still new and unfamiliar enough that finding developers on the job market with Rust experience is pretty rare. Now, in the case of a language like, say, Kotlin or Swift, this is not that big of a deal, because those languages are so similar to existing languages that the learning curve isn't steep and people can get up to speed quickly.
Rust, though, is a different beast. Its core tenet is that it should be impossible to write type- or memory-unsafe code. To achieve that goal, Rust makes use of some fairly novel language constructs, such as lifetimes and affine types, which are often hard for programmers to get their heads around if they have not seen them before. The good news is that if you can get a Rust program to compile, it's probably correct (at least, there should not be lurking memory or type errors that you might encounter at runtime). But getting a Rust program to compile is quite a chore, given how strict the language is. That, coupled with the immaturity of the Rust ecosystem, means that developers pay a heavy tax when using Rust. The question is whether it's worth it.
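For readers who have not run into these constructs, here is a minimal, illustrative sketch (not taken from Matt's post) of the two ideas he mentions. The lines the compiler rejects are left as comments, along with the errors they would trigger, so the snippet still compiles:

    // Affine types: a non-Copy value may be consumed (moved) at most once.
    fn main() {
        let v = vec![1, 2, 3];
        let w = v;                // ownership of the vector moves from `v` to `w`
        // println!("{:?}", v);   // error[E0382]: borrow of moved value: `v`
        println!("{:?}", w);
    }

    // Lifetimes: a reference cannot outlive the value it points to, so the
    // compiler refuses to let this function hand back a dangling reference.
    //
    // fn dangling() -> &String {   // error[E0106]: missing lifetime specifier
    //     let s = String::from("hello");
    //     &s                       // `s` is dropped when the function returns
    // }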
Based on my experience with Rust, I don't think the productivity hit to a startup's engineering team is generally worth the benefits that Rust provides. Unless you're writing mission-critical software that runs on a space station or a surgical robot — which few startups are — it doesn't seem to me that the real-world impact of the occasional type safety bug or memory leak is worth spending so much overhead up front to avoid.
RF: Speaking of hiring programmers, I recently wrote an opinion column [4] suggesting that very few people can be good programmers: good programmers are rare, mediocre programmers are common, and bad programmers are perhaps rife. What have you noticed over your career?
MW: Well, I think it's important to keep in mind that someone like you or me likely has a pretty skewed view of the quality of programmers out there — in large part because we've been fortunate to work with some extremely talented people in our careers. Most of my exposure to the job market has been at Google and recruiting for well-funded startups in Seattle, which I doubt is representative of most of the world. Most programmers do not work at places like Google; they work in fields like finance, industrial automation, and IT outsourcing. Most programmers do not live in the US; they live in India and China. Given the vast differences in demographics, I think it is very difficult to extrapolate much from my own experience!
One challenge here, I think, is defining what we mean by "good programmer". So much of what we think of as being a good programmer has little to do with programming and more to do with skills like communication, design, and managing complexity. Unfortunately, I think both academia and industry place so much emphasis on teaching people how to "write code" that we don't explicitly focus on teaching problem-solving, communication, and other skills that are absolutely essential in the real world. If you look at the curricula for most university CS programs, for example, it's rare to see courses explicitly teaching things like system design, debugging, working as a team, writing design docs, or testing. These things are far, far more important than, say, being able to implement a red-black tree in Java. So I'm not sure how we're going to increase the proportion of "good programmers" until we make some changes there!
RF: I agree that real change will begin with education. But I also believe that not everyone has the type of intelligence needed to be a good programmer. You mentioned problem solving as something that needs to be taught, and I think it may be something that requires a knack.
Besides, today it seems that most programmers are of the 'cut-and-paste' school of programming. When left-pad, a tiny function for padding the left side of a string with spaces, was removed from npm, thousands of programs stopped working [5].
Copilot appears to work along those lines: you tell it what you need to do, and Copilot provides you with a code snippet that does that. Or is supposed to do that, as Copilot has been trained on code, but not necessarily correct or high-quality code. Your article in CACM suggests that a user provides a 'training model' and your product handles the programming. I don't know how much you can say about this at this point, but am I understanding that correctly?
MW: Right, my point is generally that it can be difficult to decouple the question of whether someone has the right problem-solving ability to be a good programmer from, for example, what educational opportunities they have had in their lives. I tend to believe there are just as many brilliant people living in Lagos as in London, but there is a huge gap in the opportunity available to people in each city.
Copilot works incredibly well, but operates on the principle that an expert programmer is prompting it with other code (either extant code or code that they are in the process of writing) that it is expected to complete. We are quite far, today, from being able to map purely English (or Thai or Yoruba) descriptions of an end goal into computer code. However, the broader point in my CACM article is that the idea of "code" is likely going to become obsolete when the AI models become powerful enough to directly solve computational problems without having to translate them into computer code first.
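To make the "prompting with code" point concrete, here is a hand-written illustration (not actual Copilot output) of the kind of context a completion model is given: code the programmer has already written, plus a signature and doc comment, followed by one plausible body. The names and types are invented for the example.

    /// A record already defined elsewhere in the programmer's codebase.
    struct User {
        name: String,
        active: bool,
    }

    /// Return the names of all active users, sorted alphabetically.
    fn active_user_names(users: &[User]) -> Vec<String> {
        // Everything above is the "prompt"; a completion model is asked to
        // produce a body along these lines.
        let mut names: Vec<String> = users
            .iter()
            .filter(|u| u.active)
            .map(|u| u.name.clone())
            .collect();
        names.sort();
        names
    }

    fn main() {
        let users = vec![
            User { name: "Bea".into(), active: true },
            User { name: "Al".into(), active: false },
        ];
        println!("{:?}", active_user_names(&users)); // prints ["Bea"]
    }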
Computer programs are, in many ways, an artificial abstraction that came about because early computing machines could only perform one basic instruction at a time. Hence, it was necessary to develop a science around mapping problems onto the limited vocabulary of early computing machines, much in the same way that electronics engineers map problems onto circuits constructed from basic components like transistors, resistors, and capacitors.
However, the transistors and such are not fundamental: they represent a point in time in terms of technological capabilities. While Turing, Church, and others showed that these basic abstractions are, in fact, universal, that does not mean they are the only or the best way to model the problem-solving ability of modern machines. I doubt very much that anyone would want to write a modern 3D graphics rendering pipeline as a Turing machine using a tape with only 0s and 1s!
Modern software is long past the point of being amenable to formal static analysis (though plenty of people are still trying to crack that nut). By the same token, biological systems "work" even though we cannot hope to rigorously analyze or deconstruct their behavior down to the molecular level. My claim is that our relationship to computing is about to go through a similar shift, whereby the building blocks are not programs but models. And much like one could never hope to understand the workings of the human brain by looking at it under a microscope, we can't reason about the operation of a suitably sophisticated AI by viewing it through a classical lens.
By the way, while my company is working in this general area, I was not trying to hint in that article at any particular technical capabilities we plan to bring to market. Much of the above is premised on an evolution of large language models over, say, the next decade. What we or anyone else will be able to do in the next couple of years will be far more limited, but that does not mean we should not start figuring out what it will mean!
RF: That's an interesting concept: that instead of programs we may be using models as our abstractions for solving problems. That still leaves me wondering: if you use AI to create your software from models, do you also set it up to create the integration tests? That seems dangerous to me.
MW: Right, good question.
The goal of integration tests is to ensure that the software under test behaves according to some specification. Often that specification is written separately, as a document, or is known only to the person who wrote the code and/or tests. So the question is: if an AI is writing the code, what specification was it given, and can you generate both the code and the tests from that specification? Alternatively, do you have humans write the tests and use those tests as the specification itself, for which the AI then generates the implementation?
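One way to picture that second option is the following sketch, in which the tests are the only artifact a human writes and any implementation that passes them is acceptable. The `median` function and its behavior are invented for the illustration and are not something from Fixie.

    // In a library crate (src/lib.rs); `cargo test` runs the specification.

    /// Median of a non-empty slice; `None` for an empty one.
    /// The tests below are the "specification"; this body is just one
    /// implementation (human- or AI-written) that satisfies it.
    pub fn median(xs: &[f64]) -> Option<f64> {
        if xs.is_empty() {
            return None;
        }
        let mut sorted = xs.to_vec();
        sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
        let mid = sorted.len() / 2;
        Some(if sorted.len() % 2 == 1 {
            sorted[mid]
        } else {
            (sorted[mid - 1] + sorted[mid]) / 2.0
        })
    }

    #[cfg(test)]
    mod spec {
        use super::*;

        #[test]
        fn empty_input_has_no_median() {
            assert_eq!(median(&[]), None);
        }

        #[test]
        fn odd_length_returns_middle_element() {
            assert_eq!(median(&[3.0, 1.0, 2.0]), Some(2.0));
        }

        #[test]
        fn even_length_returns_mean_of_middle_pair() {
            assert_eq!(median(&[4.0, 1.0, 3.0, 2.0]), Some(2.5));
        }
    }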
Of course all of this presumes a conventional model for software development in which there is some set of software modules that need to be independently tested. I posit that this concept only makes sense when it's humans who are writing and maintaining the code. If an AI is writing the whole shebang, there's no need for the code to be clean, modular, and reusable, since (apart from things like binary size issues) it doesn't really matter how well-structured it is. The AI can generate a godawful mess of spaghetti code and it should be perfectly fine, as long as it works.
So I suppose I would turn your question on its head and ask — if the goal of integration tests is to allow humans to ensure that individual software components work independently, is that the right approach to testing AI-generated code that need not follow those conventions?
RF: You're right: integration tests have no place in the future you describe for AI-driven programming. Unit tests and building code as modules are mechanisms we've developed to make it possible for groups of people to create and maintain large and complex systems.
What you've described is a system where an AI creates a 'blob' that is supposed to perform the same tasks that the distributed, modular system created by humans would have performed. That means losing the ability to isolate problems that show up, to tune system performance by increasing or decreasing the number of instantiations of modules, and to test out incremental changes. People will still want to test the AI-developed code to check whether it performs as specified, and I wonder if allowing the same AI model that developed the code to test itself is a good idea.
As we are talking about an imagined future, I think that there's a place for both realities: AI-developed code and human-developed code. Since the number of highly skilled human programmers is limited, just as there are only so many Olympic athletes, there is certainly a place for AIs that write software. Most programmers write bad code, so having an AI that creates "a godawful mess of spaghetti code" may not be any worse, and could even be a lot better.