As a computer science tutor, and as a developer myself, I’ve often run across some interesting and frustrating bugs. Fortunately, I can count myself as a pretty decent debugger, but now and then I get a real head-scratcher that takes some persistence.
One issue that I’ve seen a lot with new students has to do with the concept of “pass by value” versus “pass by reference.” For whatever reason, a lot of my intro programming students, even in AP Computer Science classes, don’t seem to have a good grasp of this idea. If you’re not aware of it, it can cause some pretty tricky bugs.
The basic idea is that when you pass arguments to a function or method, you either send a copy of the argument’s value (passing by value), or you send the argument’s actual address in memory (passing by reference). In the second case, any changes the function makes to the argument overwrite the original value in memory, so the caller sees those changes too.
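Here is a minimal sketch of the two behaviors in Python. One caveat: Python doesn’t literally let you choose between the two modes. It hands the function a reference to the same object, so a mutable list behaves like the “by reference” case, and I’m simulating the “by value” case by copying the list inside the function. The function names and the sample data are made up for illustration.

```python
def edit_copy(document):
    # Simulates pass by value: work on a private copy of the caller's list.
    draft = list(document)
    draft.append("reviewer comment")
    return draft


def edit_original(document):
    # Behaves like pass by reference: the caller's own list is modified.
    document.append("reviewer comment")
    return document


original = ["intro", "conclusion"]

copy_result = edit_copy(original)
after_copy_edit = list(original)   # snapshot: original is still unchanged here

shared_result = edit_original(original)

print(after_copy_edit)  # the list as it was after edit_copy
print(original)         # the list after edit_original
```

After the first call the caller’s list is untouched; after the second call it has gained an element, even though the caller never modified it directly.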
Think of it in terms of this real-world example. Imagine that you’re in a team meeting for school or work, and your team is working on a report. Your task for the day is to edit the document you’ve written and complete a final revision. Which of these approaches would you normally choose?
A. Bring a copy of the document for each person, have everyone write their comments on their own copy, and then pass them all back to you to process the information they’ve provided.
B. Bring the original document, and pass it around so that everyone can write their input on it.
If we compare this process to a software program, we can say that your editing team’s process is like the method you’re calling in a program, and the document is the argument you’re passing.
If you chose A, then you understand that after the editing session, your original copy is still the same. Any work that the team did was on the copies, and you can do with those copies whatever you feel is necessary. You might decide to incorporate their changes into your original document, thereby replacing it, or you could just opt to leave it as is, but your original did not actually change. This approach is analogous to passing by value in programming. In this case, the argument is a copy of the original document.
If you went with choice B, once the editing process (our function) is done, you’ve made changes directly to the original document (our argument). So the argument has actually changed once we get back to the function caller, which is you, in this case. This is equivalent to passing by reference in a program.
So why can this lead to sneaky bugs? Well, let’s think about a game, maybe chess, that uses artificial intelligence. This will be an imperfect example, but I want it to be high level so you get the basic idea. If we’re playing against the computer, our opponent needs to look at all the different possible moves, and predict what the results would be for each one, in order to pick the best next course of action. Let’s say that the possible moves are stored in an array or list, and we have a function that goes through and evaluates each move one at a time, removing the next move from the top and then adding new possible moves based on the results of the previous move. The function argument is that initial list of possible moves. We’ll also say that the function will keep a separate list of each best possible move, and that’s what will be returned at the end of the function. Sounds reasonable, right?
It might be reasonable, if the programming language we’re using passes the initial move list by value. In that case, the game can make changes to its copy of the move list within the function, and come up with the next move, without changing the actual state of the game board. Once the function is done, the function’s copy is simply thrown away, and we’re back in the calling function with the original move list, which has not been modified.
But if the move list was passed by reference, any changes made to it within the function will still be there once the function has returned. It’s like our editing example again: instead of being left with our possible edits on separate papers, with our original copy still clean, we have the results of everyone else’s brainstorming on our master copy, and we’ve lost the original content. It’s the same idea for our move list.
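To make the move list scenario concrete, here is a rough sketch in Python, where a list argument behaves like the by-reference case. Everything in it is hypothetical: the function name, the move names, and the scores are invented, and I’ve left out the step that adds follow-up moves to keep it short.

```python
def choose_best_move(possible_moves):
    # Hypothetical evaluator: it consumes the list it is given.
    best = None
    while possible_moves:
        move = possible_moves.pop(0)   # this also removes it from the caller's list
        if best is None or move[1] > best[1]:
            best = move
    return best


# Each move is a (name, score) pair; the scores are made up.
moves = [("e4", 3), ("d4", 5), ("Nf3", 4)]
best = choose_best_move(moves)

print(best)
print(len(moves))   # how many moves the caller has left
```

The function returns a perfectly good answer, which is what makes this kind of bug sneaky. The damage is to the caller’s list, which is now empty.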
A situation like this can get really tricky when we have a list and we’re removing elements from it. Imagine that you send a list off to a function, assuming that it will still be the same size once the function finishes. Then you try to do some additional work on the list, only to find that it is now smaller or even empty, and you get an index or null pointer error because you’ve tried to access a list element that is no longer there. I’ve seen that happen in some programs I’ve debugged with students, and it was really hard to figure out. We weren’t looking at the list contents throughout the program, so we had no idea what was happening, and we couldn’t make any predictions or find patterns as to when the error occurred. It wasn’t until I started printing out all variables and data that could possibly be related to the bug that I realized the list was shrinking and why.
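A small Python sketch of that failure, and of the print-based check that exposed it, might look like this. The helper function and the data are hypothetical stand-ins, not the students’ actual code.

```python
def consume(items):
    # Hypothetical helper that empties the list it receives.
    while items:
        items.pop()


scores = [10, 20, 30]

size_before = len(scores)
consume(scores)
size_after = len(scores)

# Printing the size on both sides of the call makes the shrinking visible.
print(f"before: {size_before}, after: {size_after}")

try:
    first = scores[0]          # the caller still assumes three elements
    error_message = None
except IndexError as exc:
    error_message = str(exc)   # the list is empty, so indexing fails

print(error_message)
```

Comparing the size before and after each suspicious call is usually quicker than printing everything, once you suspect that a function is modifying its argument.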
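So, the takeaway lesson here is to make sure you know the specific rules for your programming language about when arguments are passed by value or by reference. Pay particular attention to lists, arrays, and other objects: in languages such as Java and Python, a function receives a reference to the same object the caller holds, so changes it makes to that object are visible to the caller. If you’re not sure, you can look at the documentation, do a quick Google search, or ask on a forum. I would say to use this lesson as a preventative measure, and save yourself the trouble of having to track down a weird error later.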
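One preventative habit, sketched here in Python under the assumption that you can’t change the function itself, is to hand it a copy whenever you need your original to survive. The function name and data are hypothetical. Note that list() makes a shallow copy, which protects the outer list only, while copy.deepcopy from the standard library also copies any nested lists.

```python
import copy


def drain(moves):
    # Hypothetical function that empties whatever list it is handed.
    while moves:
        moves.pop()


board_moves = [["e4", "e5"], ["d4", "d5"]]

drain(list(board_moves))            # shallow copy: the outer list is protected
drain(copy.deepcopy(board_moves))   # deep copy: nested lists are protected too

print(board_moves)   # the caller's list after both calls
```

Copying costs some time and memory, so it’s a trade-off, but for small lists it is cheap insurance.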