MacDoctor January 7, 2009

Zune Doom

GEEK WARNING: The following post contains computer code. Non-geeks are invited to pass on to the next post. You have been warned!

On the 31st December every 30G Zune in the world died a horrible death. Fortunately, it was only a temporary problem due to faulty code, which Microsoft has now fixed. Yesterday, they released the code where the fault lay. I’m going to get a bit geeky here and talk about the programming, because it says a couple of things about Microsoft that concern me. This is not a code-monkey blog so bear with me – I will explain the fault in understandable terms. Here is the code fragment:

    year = ORIGINYEAR;

    while (days > 365)
    {
        if (IsLeapYear(year))
        {
            if (days > 366)
            {
                days -= 366;
                year += 1;
            }
        }
        else
        {
            days -= 365;
            year += 1;
        }
    }

The Zune, like most electronic devices, stores the date as a number. Here the number is the count of days from an arbitrary starting date, 1st January 1980. This fragment of code is supposed to work out the year. It does this by subtracting 365 from the stored date number, adding one to the year, and going back to the start again until the number is 365 or less (no more whole years). If the number was around 3650, it would do this about ten times and the resulting year would be around 1990. Of course, some of the years are leap years, so the code takes off 366 days for a leap year. Eventually the number drops to 365 or less and other parts of the code work out the month and the day in a similar fashion. All this to get a date! Sounds like some teenagers I know.

The problem comes when it is 31st December in a leap year. Programmers will have already spotted that the while bit (the bit that tells the computer that the number is now 365 or less and it has found the year) will miss this date because, although it is in the current year, it is day number 366. It is more than 365, so the loop checks that it’s a leap year. That’s a yes, so it checks whether the number is larger than 366. That is a no, so it skips the subtraction and goes back to the start.

More than 365? Yes. Leap year? Yes. More than 366? No. Do nothing. Back to the start.

More than 365? Yes. Leap year? Yes. More than 366? No. Do nothing. Back to the start.

More than 365? Yes. Leap year? Yes. More than 366? No. Do nothing. Back to the start.

More than 365? Yes. Leap year? Yes. More than 366? No. Do nothing. Back to the start.

This is called an infinite loop, and it freezes your computer. If it occurs during startup, as with the Zune, the computer won’t start. At least it won’t start on 31st December in a leap year.
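For the geeks who want to poke at this themselves, here is a rough sketch of the stall and of one way it could be avoided. It is not the code that Microsoft or anyone else actually shipped as the fix. The post's fragment does not show IsLeapYear or the value of ORIGINYEAR, so the versions below are my assumptions (the usual Gregorian leap-year rule, and 1980 from the description above). The function names year_original_capped and year_fixed, and the MAX_PASSES safety cap, are made up for illustration.

```c
#include <stdbool.h>

#define ORIGINYEAR 1980   /* assumed from the post: day 1 is 1 January 1980 */
#define MAX_PASSES 1000   /* arbitrary safety cap, for demonstration only */

/* Assumed helper: the post does not show IsLeapYear; this is the usual Gregorian rule. */
static bool IsLeapYear(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* The loop from the post, plus a pass counter so a stall can be detected.
 * Returns the year, or -1 if the loop is still running after MAX_PASSES. */
static int year_original_capped(int days)
{
    int year = ORIGINYEAR;
    int passes = 0;

    while (days > 365)
    {
        if (++passes > MAX_PASSES)
            return -1;              /* stuck: neither days nor year is changing */
        if (IsLeapYear(year))
        {
            if (days > 366)
            {
                days -= 366;
                year += 1;
            }
        }
        else
        {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* One possible repair: stop as soon as the remaining days fit inside the current year. */
static int year_fixed(int days)
{
    int year = ORIGINYEAR;

    for (;;)
    {
        int days_in_year = IsLeapYear(year) ? 366 : 365;
        if (days <= days_in_year)
            break;                  /* the date falls within `year` */
        days -= days_in_year;
        year += 1;
    }
    return year;
}
```

Counting 1st January 1980 as day 1, 31st December 2008 works out to day 10593 (28 years containing 7 leap years make 10227 days, plus 366). Under those assumptions, year_original_capped(10593) gives up and returns -1, while year_fixed(10593) returns 2008. The day before (10592) and the day after (10594) come out fine with the original loop, which is why the Zunes recovered on their own on 1st January.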

OK – the reason why I got all geeky on you is that this is an elementary mistake. The sort of mistake that first year computer science students or hobbyists like myself make. This is NOT the sort of error you want to see in a piece of professional programming. But this is not the worst of it. The worst is that this is one of the commonest routines in computer programming. This is an error that simply should not have happened.

Now, I know that Microsoft tend to contract out their programming to India and elsewhere in Asia. I also know that India and the rest of Asia have some of the best programmers in the world, so skill is not an issue here. My concern is that quality control is clearly lacking. All computer code has bugs in it, but programs should at least be free of elementary ones. I also wonder if there is not excessive pressure on programmers to churn out a certain number of lines of code every day, regardless of quality. Date routines are usually contained in prepackaged modules, but preparing these for a newish platform like the Zune takes time and money. One too many R&D budget cuts, perhaps?

All of which makes me wonder if Microsoft wrote the software for the Airbus Flight Control computers?

2 Comments

  • Hi MacDoctor, technically Microsoft didn’t write this code; it was supplied to them by the manufacturer of the clock chip (Freescale Semiconductor) used in this particular model of the Zune. Ideally this would have been caught in QA, of course. MS is hardly alone in this problem though; for example, google “Iphone bugs”. I would hope this bug will get manufacturers looking a lot more closely at their hardware driver code.

    Thanks, Dave. In some ways that’s scarier than Microsoft screwing up, as it is less easy to control. All systems have bugs, of course. My gripe here is that this bug is so elementary.

  • “All of which makes me wonder if Microsoft wrote the software for the Airbus Flight Control computers?”

    It wouldn’t surprise me. Though I think the Airbus A300 runs on its own OS.
