Tuesday, July 04, 2006

Trap For VB6 Developers in C# - Beware Your Use of Strings

Ok, so I wrote this whole post about how storing binary data in strings was bad, mostly based on an experience I had with C# strings and character zero. I even published it for a few hours. Then I discovered I couldn't reproduce the problem I thought I'd had, which means one of the following:

  1. The problem wasn't what I thought it was (even though I'm certain I proved it).
  2. The problem only occurred in one of the betas.
  3. The problem isn't reproducible in any way I can remember or puzzle out.

In any case, I removed the post since it now appears to be inaccurate. However, this doesn't mean we should all feel free to store binary data in strings (the byte array is your friend). This article in the MSDN security blog talks about several ways binary data can be corrupted by string encodings, and while it talks about encrypted data, the same rules apply to any binary data. I have seen the same problem occur with binary data created by the GZip classes, and it has confused a number of people.
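To make the encoding problem concrete, here is a minimal sketch of the kind of corruption described above. It is not taken from the MSDN article. The class name BinaryInStringDemo, the helper SurvivesRoundTrip and the sample bytes are made up for illustration, and it assumes a .NET version with LINQ available. It pushes a few arbitrary bytes through a string and back using a text encoding, then checks whether the same bytes come out.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryInStringDemo
{
    // Hypothetical helper: decodes raw bytes into a string with the given
    // encoding, encodes the string back to bytes, and reports whether the
    // result matches the input byte for byte.
    public static bool SurvivesRoundTrip(byte[] data, Encoding encoding)
    {
        string asText = encoding.GetString(data);
        byte[] back = encoding.GetBytes(asText);
        return data.SequenceEqual(back);
    }

    public static void Main()
    {
        // Arbitrary sample bytes. 0x8B, 0xFF and 0xFE are not valid UTF-8
        // sequences on their own and are outside the 7-bit ASCII range.
        byte[] data = { 0x00, 0x1F, 0x8B, 0xFF, 0xFE, 0x41 };

        // Invalid UTF-8 bytes are replaced with U+FFFD when decoding.
        Console.WriteLine(SurvivesRoundTrip(data, Encoding.UTF8));   // False

        // Bytes above 0x7F are replaced with '?' when decoding as ASCII.
        Console.WriteLine(SurvivesRoundTrip(data, Encoding.ASCII));  // False
    }
}
```

Note that the zero byte itself survives in both cases. It is the bytes the encoding cannot represent that get silently replaced.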

So in short, don't store binary data in strings. Use a stream where you can, as it's likely more efficient, and where you can't, use a byte array. If you must use a string, make sure you Base64 encode the data first, but beware that any function attempting to use the binary data will need to know to Base64 decode it first.
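If a string really is unavoidable, the Base64 route might look something like the sketch below. The class name Base64Demo, the method names ToText and FromText and the sample bytes are illustrative assumptions. Convert.ToBase64String and Convert.FromBase64String are the standard framework calls.

```csharp
using System;

public static class Base64Demo
{
    // Encodes arbitrary bytes as Base64 text that is safe to keep in a string.
    public static string ToText(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers the original bytes. The caller has to know the string holds
    // Base64 and not the raw data.
    public static byte[] FromText(string text)
    {
        return Convert.FromBase64String(text);
    }

    public static void Main()
    {
        // Arbitrary sample bytes, including a zero byte and bytes that are
        // not valid text in common encodings.
        byte[] data = { 0x00, 0x1F, 0x8B, 0xFF, 0xFE, 0x41 };

        string text = ToText(data);
        byte[] back = FromText(text);

        Console.WriteLine(text);
        Console.WriteLine(BitConverter.ToString(back)); // 00-1F-8B-FF-FE-41
    }
}
```

The cost is size: Base64 text is roughly a third larger than the bytes it represents, which is one more reason to prefer a stream or byte array when you have the choice.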
