Monday, October 19, 2015
VerySillyMUD: Unit Tests

This post is part of my "VerySillyMUD" series, chronicling an attempt to refactor an old MUD into working condition[1].

In our last episode, we encountered a function we needed to refactor:

Only trouble is: we don't have unit tests in place, and I don't really have any idea what this function is supposed to be doing. Looks like we have to get some tests! Now, I am used to the wonderful JUnit family of unit testing frameworks, but it's unclear which C unit testing framework to use. I decided to opt for the Criterion library, as it was xUnit-styled and seemed pretty straightforward.

The first step is to figure out how to run a basic test. In the short term, I'll have to disable the -Werror flag that treats compiler warnings as errors; in order to write and run unit tests against the current code, I'm going to need to that code to compile first! Recall that we've cleared true compilation errors already, so this should work.

Now, the way Criterion unit tests work is that you compile your code under test along with your unit tests and then link in the Criterion library into a single executable; when you run that executable it runs your unit tests. So let's try to get a basic unit test written against the dsearch function:

There's a few things to note here: first, I explicitly included the prototype for dsearch. The project currently puts all the function prototypes into a single protos.h file and includes that everywhere, but I ran into some conflicts trying to do that here. At some point once I'm in a cleaner project it will be worth going back through to move each source file's prototypes out into a separate .h file so that they can be included exactly where appropriate, and so that incremental changes don't require rebuilding everything (right now, if there is a change to protos.h we have to recompile everything). Second, Criterion's Test macro takes two arguments; the first is the name of a test suite and the second is the name for the test. I used the name of the C source file act.wizard.c as the basis for the suite name and just chose an initial test name for now. I will probably go back and rename it to something that reflects the property this test is checking once I understand dsearch a little better.

Now, let's get a make test target implemented so that it's easy for us to run the unit tests. My initial attempt at creating a test executable tried to just link act.wizard.o (the code under test) with test.act.wizard.o and -lcriterion, but it turns out that the act.wizard.c code refers to external symbols in other source code files, so linking failed. Rather than sort out exactly which object files I need, I decided to just link them all in together into one fat unit test executable. Unfortunately, -lcriterion contains a main() function in it (so it can run the test suite), so the rest of the linked object code needs not to have one in it. Right now, comm.c has the main function for the MUD, so first what we'll do is rename that function and then create a main.c file that has a main for the MUD that calls the one in comm.c; then we simply won't include main.o in the test executable.

Next, we can set up some new Makefile variables for TEST_SRCS and TEST_OBJS and then create two targets: one to build the test executable tests and one to actually run it. Finally, we need to run makedepend to update all the dependencies. I note that when I did this I get a lot more detailed dependencies than were in the Makefile originally. The way to do this nowadays would be through automake and autoconf, but I won't tackle that right now; I'll just create an issue on the Github repo.

Wow, ok. Running unit tests! The next step will actually involve diving into the guts of the dsearch function to figure out what it currently does and to document its behavior with tests, which we'll do in the next episode.

[1] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.

Sunday, October 11, 2015
VerySillyMUD: Cleaning Up the Warnings

This post is part of my "VerySillyMUD" series, chronicling an attempt to refactor an old MUD into working condition[1].

In our last episode, we discovered a bug that the compiler had actually warned us about, but which we ignored because we were trying to just fix errors and get on with running the game. Lesson learned here; time to go clean up all those warnings. As there are a lot of them, and they are likely going to be somewhat repetitive, I will just highlight the particularly interesting ones here. However, you can see the full set of cleanups here.

The first thing we want to do is to recompile all the object files to see what the warnings are. I was about to make clean except that the Makefile doesn't seem to have a "clean" target! Let's fix that. Then, let's make sure we don't make this mistake again by adding the -Werror flag to the CFLAGS variable to treat compiler warnings as errors. Now, the first type of error we encounter is:

Basically, we have not explicitly declared a prototype for the atoi library function, nor have we #included one. This is simply a matter of adding the right system include file in the case of C library functions. Next:

When you are not explicit about your braces, and you have nested if statements, you can have unintentional paths through your code when an else gets associated with the wrong if. We can correct these by adding explicit braces, taking indentation in the source code as a hint as to what was probably intended. Next we find that adding in some missing #include files created some conflicts:

A global variable reboot is used to track whether the game is supposed to restart itself after an intentional, non-crash shutdown. This conflicts with the reboot system call that actually reboots the whole machine! We can handle this by renaming the variable, taking care to find all locations of it in other files.

We also fixed some cases where functions were declared with a non-void return type yet there was no explicit return statement. As we saw in a previous episode, we can fix these by changing the declaration to return void if no callers ever check for a return value. Our next error involves the type of one of the arguments passed to the bind() system call:

A glance at the code shows that sa is declared as a struct sockaddr_in (an Internet socket address), but bind wants a pointer to a struct sockaddr. Here's one of those cases where C expects you to cast one struct to another with the same prefix, for as much confusion as that might cause. This is a standard C TCP/IP socket programming idiom, however. The next error complains of an implicit function definition, but is a little more involved:

With a little judicious grepping, we find that CAN_SEE_OBJ is declared in utility.c, where we see:

Ugh. This apparently was a macro at one point but was redefined as a function, although someone conveniently left the old macro definition in here for us, but compiled out with "#if 0". At any rate, we can clean that out and then add the missing prototype. For the next error, the compiler helpfully gives us options about how to fix it:

We can fix this by explicitly comparing the value of the assignment to a null character ('\0') instead. I also encountered some type errors showing up mismatches between a format string and an argument:

Again, the compiler happily provided a useful suggestion in these cases to update the format string. I think this is an area where the C compilers have improved since I was doing a lot of C programming; I'm not sure they would always catch these for me in the past. But I might also be misremembering!

Next, we have a couple spots where the developers wanted to be extra sure about order of operations:

In this case, a quick perusal of the code suggests that this was indeed meant to be a comparison and not an assignment, so we can just get rid of the extra parens.

Now, at one point I ran into some type errors surrounding function pointers:

Ugh, function pointers. I can never get the typing right on these and always have to look up how to do it. In this case, it looks like the programmers just got lazy, calling all the function pointers void (*)(), but we can propagate the correct types through the code instead.

And then I ran into this error:

Hmm, taking a look at the code shows:

Ok, I can see the problem, which is essentially some kind of nested sprintf calling. This needs to be refactored, but there's no way this function is clear enough to me to refactor it without unit tests. Guess we'll have to get that going in the next post.

[1] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.

Saturday, October 10, 2015
VerySillyMUD: A Lesson About Compiler Warnings

This post is part of my "VerySillyMUD" series, chronicling an attempt to refactor an old MUD into working condition[1].

In our last post, we managed to get the MUD executable to build, and attempted to run it, with somewhat lackluster results:

Fortunately, we get a file and line number with this error, so let's take a look:

Hmm, we seem to not be able to change into whatever directory is represented by dir. Scrolling up through the source we see that this can be set via a command-line argument:

...but we didn't provide one, so it must have a default value. A little further up and we see it gets set to DFLT_DIR; a quick grep gives us:

Aha! It's looking for the "lib" directory, which doesn't exist in the src directory but does exist as a top-level directory in the project. Let's try pointing it there:

Great! There are a lot of error messages in here about "unknown auxiliary codes" but the MUD seems to have started. Let's try to connect as a client:

Ok! Let's proceed and create a character to play:

A quick look over at the terminal where the server was running shows this in the log:

Oh, joy. Looks like we'll have to run this in a debugger to see where it's barfing.

Uh, this is segfaulting in a C library routine? What's going on here? The source code line where this is happening is:

A little more poking around in the debugger shows that all the pointers here (d->pwd, arg, and d->character-> are all well-formated, null-terminated strings. What's going on?

I will spare you, good reader, quite a bit of head scratching and Googling, but here's ultimately what happened: there is no function prototype for crypt() defined or included in this source code file, and the C compiler is making the assumption that it returns an int, which is a 32-bit quantity. In actuality, crypt returns a char *, which on a 64-bit machine like I have, is a 64-bit value; the return value is getting truncated, and when it is likewise passed to strncpy that expects a pointer, we're off again, and the pointer passed in is essentially garbage. Now that I've figured out what the issue is, I would not be surprised if we had a compiler warning about this. Sure enough:

Remember when I said previously that I was generally a fan of "treat compiler warnings as errors"? Looks like I should have taken my own advice. In the next episode, I'll go ahead and clean all of those up.

[1] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.

Saturday, October 3, 2015
VerySillyMUD: Remaining compilation errors

This post is part of my "VerySillyMUD" series, chronicling an attempt to refactor an old MUD into working condition[1].

In this edition, we'll try to fix all the remaining compilation errors. The next error we have is in handler.c:

As before, we'll fix this by converting it into a comment. The code clearly isn't used, but this is an unfamiliar codebase so I don't want to throw code away (yet). Easy enough. Next error:

Now this is truly frightening. Someone is calling malloc with the wrong number of arguments; this is almost certainly the source of some kind of nefarious memory bug. What's going on here? A glance at the source code enlightens us:

The parallelism in the code here now makes this clear once we can see the associated realloc() call, which takes a pointer to the old allocation (script_data[top_of_scripts].script). This pointer is not needed for the initial malloc call, so we can remove it. Next up we have:

Ok, that's a straightforward error, but how do we fix it? Let's look at this function:

The error occurs at the first return statement. This function appears to reference special procedures; as I recall, certain areas or creatures ("mobs") in the MUD had unusual behaviors or triggers. Looking at the rest of the body of this function, it seems to be looking for special procedures and firing them, returning a 1 if one was found. Since the first return statement doesn't seem to fire any, it seems that returning 0 there is probably the correct return value. Next error:

Here the code is using a deprecated header file (malloc.h) and we can just include stdlib.h instead. Fixing that let the compilation of utility.c get further:

Ah, our good friend "#ifdef 0" again. Let's turn you into a comment instead. Next:

Hmm, we'll have to take a look at this shop_keeper function, which presumably operates the many places in the MUD where your character could buy items.

Ok, what seems to be happening here is that the function returns TRUE when the shopkeeper has performed some special behavior, and FALSE otherwise. It seems like the three return statements causing the errors are the ones up at the top of the function doing early exits from the function; it seems reasonable to guess that we should return FALSE in these cases, but we really don't have any way to tell without a much closer understanding of the codebase, which we're not going to attempt to get yet. Next error:

Another missing return value; we'll guess we should supply another zero here for this early-exit case.

More missing return values! This time, it turns out all the return statements in the DamageMessages() function don't have return values; this leads me to suspect the return type of this function is wrong. A quick grep confirms that none of the callers capture or inspect a return value, so we'll just change the return type to void. Next:

As with the shop_keeper() function we saw above, the missing return value is on an early exit from the function, which manages characters interacting with the bank and returns TRUE if such an interaction occurs. The early exit happens if IS_NPC() is true for the character. An NPC is a non-player character, i.e. one played by the program and not by a human; presumably NPCs can't interact with the bank, so it seems returning FALSE in this case would be right. Next:

We saw log_msg() in our previous post. It's defined this way:

Note the commentary left by a developer on this one-liner, which is near the top of its file and hence looked like a function prototype. At any rate, we see that log_msg() is really shorthand for calling a different logging function, log_sev(), with a priority of 1. It seems like our error in board.c was either meant to call log_sev() directly or just accidentally added an argument:

Since the other error message here just invokes log_msg() for what would seem to be an equivalently serious error, we'll change the problematic invocation to match. Next:

More missing return values. We'll default them to returning FALSE. Next:

If we supply one more FALSE then the whole thing compiles into an executable called dmserver. Let's try to run it!

Well, not so hot, but we'll leave fixing it to the next episode.

[1] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.

VerySillyMUD: Compilation

This post is part of my "VerySillyMUD" series, chronicling an attempt to refactor an old MUD into working condition[1].

The first goal is just to get the thing to compile. The source code is early 1990s C, and I have this C compiler on my computer at home:

The compiler doesn't quite like the source code here; it generates 21 warnings and 20 errors on the first file when we run make! I haven't included the full output here for readability's sake, but you can peruse the full gist if you like. Our goal here is to get compilation to succeed while making only minor edits that we think are safe and do not change intended functionality.

The first error is here:

The error in question points out a conflict between the prototype included here and the system-level include file string.h. A quick grep of the source code shows that strdup() is not implemented in the source code here, so this should just be a matter of removing the prototypes where we find them in the source code and instead just including them via the system level library, as seen in this commit. The next error/warning is:

If we look at the source code in question, we see:

This could either be a mistake in the function definition (i.e. it should return void instead of int) or this is an oversight and the function ought to be returning some kind of status code as an integer. A quick grep shows this is actually only called from one place (elsewhere in comm.c) where the return value is not captured or inspected. Therefore we can take care of this in the short term by correcting the function declaration, as captured in this commit. What's next?

Ok, from looking at the error message, the compiler seems to be confusing a locally-defined log() function that outputs some kind of logging message with the one from the math.h library for computing algorithms. The easiest way to fix this is to rename the local function for clarity to something like log_msg(), as captured in this commit. Ah, rats, now we see that this function didn't actually have a prototype anywhere, or at least not in enough places:

Adding the prototype to the protos.h include file seems to fix that up. Now, I just noticed at this point the compilation of comm.c still generates 25 warnings but only 2 errors! It turns out the very first fix we did for strdup cleaned most of them up. As I am running out of time for this session, let's press on to try to clean up the last two errors and at least get one file compiling. I'm generally a proponent of the "treat compiler warnings as errors" rule of thumb, but I think we're not quite ready to add that rule. I'd really like to get to the point where we can get some unit tests in place before doing too many more modifications. The first remaining error is:

Hmm, this is a new one for me. The bind() function connects a socket to a particular address so that we can start accepting connections from clients there. The system library's bind() on Mac OS X only takes 3 arguments, as does Linux's. A look at the source code shows there's an extra 0 argument being passed. Maybe this was for an older version of bind? According to running man 2 bind on my Mac OS X computer, in the HISTORY section it says "The bind() function call appeared in 4.2BSD." It turns out if we actually look at the man page for bind() from 4.2BSD it also only takes 3 arguments! What the heck? Without really knowing what the compilation target was for the codebase, it's hard to say what to do here. I'm going to assume one of the compilation targets had a bind() call that maybe took some optional flags as a fourth argument, but the source code didn't opt into any of them (hence the zero). According to this, extra arguments to functions were probably safely ignored, so I think we can just get rid of the extra argument here and see if that works. It'll certainly clean up the error at least. Ok, the last error, then is:

I think this dates me as a C programmer, because I thought that was still a valid way to #ifdef-out a section of code. I'll just comment the code out instead. Update: when I mentioned this on Twitter, Steve Vinoski pointed out:

This was interesting because Steve was right in that I had forgotten the idiom I was used to (#if 0). But it was also interesting that this code, which apparently worked at one point, used this incorrect idiom (#ifdef 0). A little experiment shows that this is not valid in C89 ("ANSI C"):

If I attempt to compile this using C89 compatibility, I get:

So this was written for a compiler with non-ANSI C support. It doesn't even appear to be valid K&R C, according to the list of gcc incompatibilities between it and ANSI C (the version of gcc I have installed is based on the clang compiler, which doesn't support the -traditional flag, and I'm not motivated to go install an older version to try it out). There are hints in the source code that this was targeted towards Sun computers; perhaps their early-1990s-vintage proprietary system C compilers accepted this syntax. Anyway, it's an easy enough fix, so let's not dwell further on it.

(Partial) success! At this point, the make run actually builds 10 object files (including comm.o) before erroring out (full gist). Next time out we'll continue fixing compilation errors in the hopes of getting at least a clean compilation run.

[1] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.

VerySillyMUD: An Exercise In Refactoring: Intro

When I was in college, I played a MUD called SillyMUD, which was in turn a derivative of DikuMud, which was written by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe. I happened to mention it to one of my kids, who expressed interest in seeing/playing it.

I found an old source code archive of it, and it turns out it was distributed under an open source-style license. Unfortunately, it doesn't even compile as-is, and as I recall there were several scalability and memory bugs present in the codebase even when I played it.

Therefore, I decided that I would undertake gradually refactoring the thing into shape and perhaps modernizing some of the technology choices in the process, while chronicling it here on the blog to capture it for educational purposes as an example.

The posts in this series so far are:

  1. Compilation
  2. Remaining Compilation Errors
  3. A Lesson About Compiler Warnings
  4. Cleaning Up the Warnings
  5. Unit Tests
  6. Understanding a Function With Tests
  7. Fixing the Rest of the Compiler Warnings
  8. Adding autotools
  9. Continuous Integration

More to come!