As a software developer, at some point, you probably will have to write code that gets some kind of input from the user. Most of the Java students I’ve tutored have been using a standard keyboard to get the user input. There are several different ways to accomplish this task in Java, but the common approach my students learn in class is to use the Scanner class. In a lot of ways, I find the class simpler to use than what I was taught when I first learned Java (we used the BufferedReader and InputStreamReader classes). However, there’s a very subtle mistake that I see a lot of new students making with the Scanner class.
Let’s take a very simple math example. Suppose that you are writing a program that asks the user to enter a list of, let’s say, 10 numbers from the keyboard. Your program should do two tasks with the list: calculate and print the average, and print the smallest number in the list. If you’re rusty on your math skills, the average of a list of numbers is the sum of all the numbers, divided by the length of the list. Let’s give this problem a shot. You can find the whole code at https://codepad.co/snippet/bAGpUdAb, so head over, copy the code, and try to run it. What happens? Well, here’s what it looks like when I run it from the command line on my computer:
What’s this all about? Why is it letting me enter so many numbers? I should be able to enter 10 numbers, press Enter, and then have the calculations done, but that’s obviously not the case here. Let’s look at some lines of code. I’ve shown lines 16 – 44 in the image below:
On line 16, we have a while loop that checks whether there’s another number to be read, which is fine. You can see that we have a counter called i, which breaks the loop once we’ve read 10 numbers (lines 38 – 41). So theoretically, we shouldn’t be able to enter all those numbers like I did in my screenshot. Let’s look elsewhere.
Lines 21, 26, and 32 are getting the current number in the list, so we can update our sum and check to see if it’s the lowest number, making changes accordingly.
Oh, but wait a minute…are we really getting the number we think we’re getting? Look at the documentation for the nextDouble() method: https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html#nextDouble. Here’s the relevant line: “Scans the next token of the input as a double. This method will throw InputMismatchException if the next token cannot be translated into a valid double value. If the translation is successful, the scanner advances past the input that matched.” Now read very carefully here. It says if we successfully find a double value (meaning there’s another number to read), “the scanner advances past the input that matched.” That line might be a bit tricky, but what it means is that once we’ve read the next number, the Scanner object will update its position and get ready to read the next number.
Here’s the thing about Scanner objects. We’re assuming here that it will take a line or token of input and read it. But does it read the whole line all at once? Well, that depends on whether we use a method that has it read the whole next line, or just the next number, or the next word, whatever it is we need. As you can see from the documentation, there are lots of methods that basically allow us to get as little or as much of the next piece of input as we need. So the Scanner will read however much we tell it to, and then stop and wait for further instructions. If we want to read a line at a time, we tell it to get the next line, do whatever we want with that line, and then tell it to move on to the next line, and so on.
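To make the “as little or as much as we need” idea concrete, here’s a small sketch. It’s not part of the program we’re debugging: the class name ScannerReadSizes and the helper describe are made up for illustration, and I’m feeding the Scanner a String instead of the keyboard (and pinning the locale to US) so the result is the same every time.

```java
import java.util.Locale;
import java.util.Scanner;

// Hypothetical demo class, not part of the program being debugged.
class ScannerReadSizes {

    // Reads one number, then one word, then whatever is left on the line.
    static String describe(String input) {
        // Assumption: Locale.US so that "2.5" is always a valid double.
        Scanner in = new Scanner(input).useLocale(Locale.US);
        double number = in.nextDouble(); // consumes only the first token
        String word = in.next();         // consumes only the next token
        String rest = in.nextLine();     // consumes the rest of the current line
        return number + "|" + word + "|" + rest;
    }

    public static void main(String[] args) {
        // Prints: 2.5|apples| and some pears
        System.out.println(describe("2.5 apples and some pears\nsecond line"));
    }
}
```

Notice that each call picks up exactly where the previous one stopped, and that the second line of the input is never touched because we never asked for it.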
But that implies that the Scanner has some way of “bookmarking” its progress through the text that’s been input. When we ask it to get the next line, it knows its position in the text so far, so it can move on to the next spot. So when that line of documentation says that “the scanner advances,” it means the Scanner is moving on to the next token, which in this case is the next number. And that’s where the problem is in our code. Lines 21, 26, and 32 are not just getting the current number so we can add it to the sum and check whether it’s the lowest. Each of those calls actually moves the Scanner’s “bookmark” to the next number in the input! We were supposed to add 1 to the variable i every time we processed a new number, and stop once i got to 10. But those three lines each advance the Scanner, so a single pass through the loop can read up to 3 numbers while i only goes up by 1. That’s why, in the screenshot I posted, I’m able to enter so many numbers before the program finally puts out a result.
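If you’d like to see the bookmark move without typing at the keyboard, here’s a rough reconstruction of the buggy pattern. I’m assuming the three calls look something like this; it is not a copy of the code at the link, and the class name BuggyScannerLoop, the method runBuggyLoop, and the sample input are all made up for the demonstration. The Scanner reads from a String so the run is repeatable.

```java
import java.util.Scanner;

// Hypothetical reconstruction of the bug, for illustration only.
class BuggyScannerLoop {

    // Tries to process `wanted` numbers, but calls nextDouble() three times per pass.
    static String runBuggyLoop(String input, int wanted) {
        Scanner in = new Scanner(input);
        double sum = 0;
        double lowest = Double.MAX_VALUE;
        int i = 0;
        while (in.hasNextDouble()) {
            sum += in.nextDouble();          // 1st call: reads a number, bookmark moves
            if (in.nextDouble() < lowest) {  // 2nd call: reads a DIFFERENT number
                lowest = in.nextDouble();    // 3rd call: reads yet another number
            }
            i++;                             // but we only count one number
            if (i == wanted) {
                break;
            }
        }
        // Count whatever the loop left unread.
        int leftover = 0;
        while (in.hasNext()) {
            in.next();
            leftover++;
        }
        return "counted=" + i + " sum=" + sum + " lowest=" + lowest + " leftover=" + leftover;
    }

    public static void main(String[] args) {
        // Prints: counted=2 sum=7.0 lowest=0.0 leftover=1
        System.out.println(runBuggyLoop("5 4 3 2 1 0 7", 2));
    }
}
```

We asked for only 2 numbers, yet the loop swallowed 6 of the 7 we supplied, and both answers are wrong: the first two numbers are 5 and 4, so the sum should be 9 and the lowest should be 4.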
The moral of this story is, if you’re using a Scanner object, and you need to do something with the current line, number, or whatever kind of token, don’t keep calling for the next token from the Scanner. Read it once, and then store it in a variable that you can use without having to call the Scanner again until you’re ready to move to the next. Here’s a better screenshot of what our code should look like instead:
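In case the screenshot isn’t handy, here’s a minimal sketch of the read-once pattern. It’s my own stripped-down version rather than the exact code from the link: the class name ReadOnceDemo and the method averageAndLowest are invented for this example, and the Scanner reads from a String so you can run it without typing anything.

```java
import java.util.Scanner;

// Hypothetical sketch of the "read once, store in a variable" fix.
class ReadOnceDemo {

    // Returns {average, lowest} of the first `wanted` numbers the Scanner provides.
    static double[] averageAndLowest(Scanner in, int wanted) {
        double sum = 0;
        double lowest = Double.MAX_VALUE;
        int i = 0;
        while (i < wanted && in.hasNextDouble()) {
            double current = in.nextDouble(); // the ONLY call that moves the bookmark
            sum += current;                   // reuse the stored value...
            if (current < lowest) {           // ...as many times as we like
                lowest = current;
            }
            i++;
        }
        if (i == 0) {
            throw new IllegalArgumentException("no numbers were entered");
        }
        return new double[] { sum / i, lowest };
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("5 4 3 2 1 0 7");
        double[] result = averageAndLowest(in, 2);
        // Prints: average=4.5 lowest=4.0
        System.out.println("average=" + result[0] + " lowest=" + result[1]);
    }
}
```

To read from the keyboard instead, you would pass new Scanner(System.in) and 10. Because the loop checks i < wanted first, it stops as soon as the tenth number has been read instead of waiting for more input.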
You can get the code at https://codepad.co/snippet/Ex7zEYNE. And now here’s the result when I run it:
It’s a very subtle bug that a lot of people miss, but prevention is the best protection, in my opinion. This also goes to show that learning how to read documentation is a valuable skill, even if it can be daunting.