How I Got Started Hacking Rustc, and How You Can Too!

This is the first part of a planned series about rustc, the Rust compiler.

I remember first hearing about Rust during the summer of 2011. In fact, I remember the exact moment. I was at MIT, doing their Junction program. It was during a seminar about semiconductors. I remember browsing through the source on GitHub, getting lost, and going home.

Fast forward to two months ago. A Slashdot post appears, bringing Rust back to the forefront of my consciousness. By this point I’d actually gained some programming chops, gotten a job, etc. I read through the Wikipedia article, thought “hey, this looks like it has potential,” and forgot about it.

Fast forward a week or two. The Matasano crypto challenges were linked on HN. “Our friend Maciej says these challenges are a good way to learn a new language, so maybe now’s the time to pick up Clojure or Rust.” And pick up Rust I did. With my predominantly Python, C, and Lua background, Rust was a pretty easy language to get started with, especially for the crypto challenges, which start off fairly basic.

First, some warnings:

Rust is pre-alpha software. Backwards incompatible changes happen weekly, either in the libraries, or in the language. It’s probably best to not write any “serious” code in Rust right now, unless you plan on fixing it every few weeks to keep up with the language. The nice part about contributing code to the compiler is that when someone changes the language or a library, it is their job to fix the code that uses it.

Make sure to use the master branch, and use the doc links under “Trunk”. It will save you pain. Nothing is worse than accidentally using the 0.6 documentation and finding that a method has been renamed or removed, and getting confused when the build fails halfway through.

The Rust compiler is poorly written. This is an artifact of being written in Rust, which, as stated, changes rapidly. Some code is very old, and uses very old idioms, or doesn’t use newer language features that would be cleaner and easier to read. If you notice this, try and fix it! If you notice it, that means you already more-or-less know what needs to be done to clean it up a bit. If the change is very invasive, it’s probably best to open an issue and let an experienced dev deal with it. An example of a cleanup is pull request 7315, which cleaned up indentation and replaced some ifs with matches.
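To give a feel for what “replacing some ifs with matches” looks like, here is a minimal sketch. It is not taken from that pull request or from rustc: the Token enum and both function names are made up for illustration, and the code is written in current stable Rust syntax, which differs from the Rust of this post.

```rust
/// Hypothetical token kind, invented for this illustration (not a rustc type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Plus,
    Minus,
    Star,
    Slash,
}

/// Older style: a chain of ifs. Nothing forces every variant to be handled,
/// so the trailing else silently swallows anything added later.
pub fn precedence_if(tok: Token) -> u32 {
    if tok == Token::Plus {
        1
    } else if tok == Token::Minus {
        1
    } else if tok == Token::Star {
        2
    } else {
        2
    }
}

/// Cleaned-up style: one match. The compiler checks that every variant is
/// covered, and related cases can share an arm.
pub fn precedence_match(tok: Token) -> u32 {
    match tok {
        Token::Plus | Token::Minus => 1,
        Token::Star | Token::Slash => 2,
    }
}
```

Both functions return the same values. The match version is shorter, and adding a new Token variant turns a forgotten case into a compile error instead of a silent fall-through.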

Do not, repeat, not, use the rustc code as a source of “how to write Rust.” Almost all of it is bad code. I don’t even know where to tell you to look to find consistently good code. The upside is that generally reviewers will catch suboptimal code, and suggest improvements. This pull request, for example, used some old Rust idioms, which the reviewers suggested fixes for. So feel free to get elbow-deep in code without worrying too much about whether the code you’re writing is good or bad. General guidelines: avoid @ always, avoid ~ usually, use Option and Result, handle errors. That will guide you straight most of the time, and by the time you know when to ignore those, you probably already know what good Rust code is.
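As a rough illustration of the “use Option and Result, handle errors” guideline, here is a small hex decoder in the spirit of the crypto challenges. It is only a sketch: HexError, hex_value, and decode_hex are names invented for this example, and it uses current stable Rust syntax rather than the syntax of the Rust this post describes.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum HexError {
    /// The input had an odd number of characters.
    OddLength,
    /// The input contained a character that is not a hex digit.
    BadDigit(char),
}

/// Returns the value of one hex digit, or None if `c` is not a hex digit.
/// Option fits here because "no value" is the only way this can fail.
pub fn hex_value(c: char) -> Option<u8> {
    c.to_digit(16).map(|d| d as u8)
}

/// Decodes a hex string into bytes. Result fits here because the caller
/// should learn why decoding failed, not just that it did.
pub fn decode_hex(s: &str) -> Result<Vec<u8>, HexError> {
    let chars: Vec<char> = s.chars().collect();
    if chars.len() % 2 != 0 {
        return Err(HexError::OddLength);
    }
    let mut out = Vec::with_capacity(chars.len() / 2);
    for pair in chars.chunks(2) {
        // Turn each Option into a Result and propagate the error with `?`
        // instead of crashing on bad input.
        let hi = hex_value(pair[0]).ok_or(HexError::BadDigit(pair[0]))?;
        let lo = hex_value(pair[1]).ok_or(HexError::BadDigit(pair[1]))?;
        out.push((hi << 4) | lo);
    }
    Ok(out)
}
```

A caller can match on the returned Result and decide what to do with each error, rather than having the program fail somewhere deep inside the decoder.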

Getting started

The first thing I did was, of course, go to the home page. I read the feature summary, which seemed mostly unchanged from when I first saw it (indeed, according to the Wayback Machine, it is mostly unchanged). I read the example, and clicked “tutorial” over on the left. I built the compiler while doing this. There are instructions for building Rust over at the wiki. It’s a lot easier to get started if you’re using Linux or Mac, though not impossible on Windows (just a bit more setup and waiting).

The tutorial left me confused and alone, and I’m sure it did the same to you. But it gave me enough information that I could write a base64 encoder and decoder, although I constantly referenced the tutorial. By this point I had moved on to the second Matasano challenge, and I found my first compiler bug: really poor error messages. Of course I had to fix this! Error messages are easy, right?

Yes and no. With a codebase as large and complex as a compiler, there are many layers of stuff you need to pick apart to figure out the cause and fix of an issue. In my case, finding the cause was easy: just grep for the error message. The fix, however, was more complex. I had to figure out how to turn a “span” (the compiler’s way of matching up an AST node with a chunk of source code) into a string. Often you’ll need to go digging through other files to figure out what you can do, what data structures there are, etc.
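To make the idea of a span concrete, here is a deliberately simplified model. The real span type in rustc carries more information and has its own helpers; the Span struct and span_to_snippet function below are hypothetical names for this sketch only, written in current stable Rust syntax.

```rust
/// Simplified, hypothetical span: a half-open range of byte offsets
/// [lo, hi) into the source text. This is not rustc's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub lo: usize,
    pub hi: usize,
}

/// Returns the chunk of `source` that `span` covers, or None if the span is
/// out of bounds, reversed, or does not fall on character boundaries.
pub fn span_to_snippet(source: &str, span: Span) -> Option<&str> {
    source.get(span.lo..span.hi)
}
```

With the source text "let x = 1 + true;", a span of bytes 8 to 16 yields "1 + true", which is the kind of string an error message can quote back to the user.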

Rust makes this easy! There are no IDEs or any fancy tools, but Rust source is insanely greppable. When you see a method call like parser.parse_ident(...), you just need to grep for fn parse_ident. Of course, actually understanding what the method does is a whole new can of worms…

Picking an issue to fix

I think the best way to pick an issue to fix is to fix a bug you encounter yourself. Ask about it in IRC; often someone will be online who either knows about it and can point you in the right direction, or can at the very least help reproduce, debug, and sift through the issue tracker.

There is the E-easy label on certain issues. These are issues that shouldn’t take too much trickery to get done, though it might take some time to get “acclimated” to the codebase. E-easy doesn’t mean fast, it means easy. An issue might be tedious or take a non-trivial amount of effort, but it shouldn’t require overarching design decisions or a lot of knowledge about Rust or Rust internals.

Documentation always needs writing. Open a random file from libstd or libextra and look for functions, structs, enums, and traits that aren’t documented. You’ll get to see a bunch of Rust code, probably using features you wouldn’t otherwise see when writing “normal” code.

After you fix it

Once you fix the issue, open a pull request. See GitHub’s help for how to do this. If you get stuck or need additional help, jump onto IRC (webchat) and ask. Someone will have to review your changes; ask “r? $link_to_pull_request” in IRC to expedite the process.

Feel free to ping me (cmr) on IRC if you have any questions or problems.