Tuesday, July 04, 2006

Is String.IsNullOrEmpty Really Broken ?

The Short Answer

Yes. Use of String.IsNullOrEmpty can cause runtime errors unexpectedly. Bill McCarthy has found and proved this; the details are posted on his blog.

Should We Panic ?

No. The error only occurs when

  • Code is compiled to release mode.
  • Code is compiled with optimizations.
  • The code itself is structured so a particular set of optimizations occur.
  • The program is NOT started from the Visual Studio IDE.

If you haven't found the problem yourself and you have compiled your code for release with optimizations, and you've tested your application from outside the IDE, then you're not likely to encounter the problem anyway.

Oh, and yes, the problem does occur in the sample code Bill provided, even when the 'if' block has code in it. Several people, including myself, have proved this.

So We Don't Need To Fix Anything and We Can Keep Using String.IsNullOrEmpty Right ?

Erm, not exactly - you should stop using the String.IsNullOrEmpty function in any new code, and you probably should refactor old code for safety's sake.

What Is The Workaround and Do Microsoft Know ?

The problem has been reported to Microsoft, and there is a workaround posted on the Visual Studio Feedback Centre site. Microsoft say they don't plan to fix the problem in the short term because the work around is available.

Basically the workaround is to produce your own copy of the function, and to mark the function as not optimizable so the compiler won't play with it.

Note the function provided on the Microsoft site is apparently the most performant way to check whether a String.IsNullOrEmpty (checking for null and then string length, as opposed to comparing the string to null/"" or String.Empty). Using an IsNullOrEmpty function also looks neater than manually checking the two conditions separately all the time, so I recommend using the workaround provided.

So That's That Then ?

Yes and no. There are three problems I see with the current state of things;

  1. Not everyone knows about the String.IsNullOrEmpty problem.
  2. We're now going to have umpteen different versions of the same function, all maintained by different people.

Technically number two could be solved very easily. Somebody could create a library that contained a working function and publish it for free, and the Visual Studio/.NET community could all agree to use that library. In practice though, this could be difficult. Trying to get everyone to use the same library, deploy an additional assembly with their applications, agree to whatever license agreement comes with the library, and trust person X's code is likely to be a bit like herding cats. In practice, it probably won't happen. Even if it did, you'd still have to contend with point #1.

If we're not all using the function, it's probably just an annoyance rather than a big problem. That is until all of today's programs become legacy code. The IsNullOrEmpty function is so trivial it should never be a problem. On the other hand, storing years as two digits once seemed like it was reasonable too, and then we got Y2k. And similarly, the Year 2038 problem. My point is, that if there ever is a change to that function, be it a year from now or 20 years when we're using the @Basket framework 1.0, the upgrade path isn't going to be very neat because everybody's individual code is going to need to be replaced or upgraded appropriately. We can only hope this is never a problem. I'm a pessimist however.

Point #1 is the big problem however. Firstly, there are many books, blogs, forums, websites and people all touting the use of String.IsNullOrEmpty. Indeed, searching for String.IsNullOrEmpty shows a great number of posts by people saying how cool it is this function is in .NET 2.0 and that it should have been in earlier versions. Have these people realized the potential problems with the function ? Secondly, there are new programmers entering the IT world all the time, so even if everyone knows about the problem tomorrow, we're going to have to continue teaching newbies not to use it.

The second problem is that while the sample code for reproducing the problem is fairly specific, and not likely to be a huge problem in a lot of production code, it's possible that other code could be optimized in a similar way. Or at least in a way that causes similar problems. This means that any library, control, or other .NET function that uses String.IsNullOrEmpty may now be suspect. I don't want to be an alarmist - again, if the problem hasn't already been found in a particular piece of code then probably it won't show up at all, ever.

Is that guaranteed though ? What if the compiler, current or future versions, don't always apply the same optimizations (perhaps based on how much memory is available and the performance trade off between faster/large code and smaller/slower code) ? Wouldn't that mean the problem might occur intermittently even with the same code ?

What I don't understand is Microsoft's reluctance to fix the problem. I can understand that changing the compiler at this point is perhaps too hard. I can also understand that releasing any hot fix or service pack is not a minor challenge for products the size of .NET/Visual Studio. I would think, however, that patching the framework so their own IsNullOrEmpty function is marked at not optimizable and releasing this as either a. a hot fix, b. part of a larger service pack (which they are probably planning anyway), or c. both, would be the sensible thing to do.

I guess we can only pray they change their minds. If you agree with me, I suggest you go the problem report at the Microsoft Visual Studio Centre and vote for the problem as being important. If you really feel strongly about it, maybe you could even leave a comment voicing your concerns. Who knows, public opinion might force a fix ?

No comments:

Post a Comment