Tuesday, September 23, 2008

String constructor considered useless turns out to be useful after all (film at 11)

When you were still a Java neophyte, chances are you wrote some code that looked like this:
String s = new String("test");
Brings back some embarrassing memories, doesn't it? You soon learned that instantiating Strings via the constructor was hardly ever done, and the String constructors seemed to be, well, utterly useless. When do you really ever need to do "new String(oldString)"? Come on, IntelliJ IDEA even flags all occurrences of this as "redundant"!

It turns out that this constructor can actually be useful in at least one circumstance. If you've ever peeked at the String source code, you'll have seen that it doesn't just have fields for the char array value and the count of characters, but also for the offset to the beginning of the String. This is so that Strings can share the char array value with other Strings, which usually happens as a result of calling one of the substring() methods. Java was famously chastised for this in jwz's Java rant from years back:
The only reason for this overhead is so that String.substring() can return strings which share the same value array. Doing this at the cost of adding 8 bytes to each and every String object is not a net savings...
Byte savings aside, if you have some code like this:
// imagine a multi-megabyte string here
String s = "0123456789012345678901234567890123456789";
String s2 = s.substring(0, 1);
s = null;
You'll now have a String s2 which, although it seems to be a one-character string, holds a reference to the gigantic char array created in the String s. This means the array won't be garbage collected, even though we've explicitly nulled out the String s!
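To make the sharing concrete, here is a minimal sketch of the layout described above. SharedString and its retainedChars() method are hypothetical names invented for illustration; this is a simplified model, not the actual JDK source.

```java
// Hypothetical model of a String that shares its backing array.
// Illustrative only; the real java.lang.String source differs.
final class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index in value where this string starts
    final int count;    // number of characters in this string

    SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Returns a view onto the same array instead of copying characters.
    SharedString substring(int begin, int end) {
        return new SharedString(value, offset + begin, end - begin);
    }

    int length() {
        return count;
    }

    // Size of the array this instance keeps reachable.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

In this model, a one-character substring of a ten-character SharedString reports a length() of 1 but a retainedChars() of 10, which is exactly the mismatch that keeps the big array alive.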

The fix for this is to use our previously mentioned "useless" String constructor like this:
String s2 = new String(s.substring(0, 1));
It's not well-known that this constructor actually copies the old contents to a new array if the old array is larger than the count of characters in the string. This means s2 no longer references the gigantic old array, so it can be garbage collected as intended. Happy happy joy joy.
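The copy-if-oversized behavior can be sketched roughly as follows. TrimmedString is a hypothetical stand-in written for this illustration, assuming the offset/count layout described earlier; check the real String source for the actual logic.

```java
// Hypothetical model of the copy constructor's trimming behavior.
// Illustrative only; not the actual JDK implementation.
final class TrimmedString {
    final char[] value; // backing array
    final int offset;   // start of this string within value
    final int count;    // number of characters in this string

    TrimmedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Copy constructor: copies only when the original holds more array than it needs.
    TrimmedString(TrimmedString original) {
        if (original.value.length > original.count) {
            // Keep just the characters that belong to the string.
            int from = original.offset;
            this.value = java.util.Arrays.copyOfRange(
                    original.value, from, from + original.count);
            this.offset = 0;
        } else {
            // The array is exactly the right size, so sharing it is harmless.
            this.value = original.value;
            this.offset = original.offset;
        }
        this.count = original.count;
    }

    // Shares the backing array, as in the substring() behavior described above.
    TrimmedString substring(int begin, int end) {
        return new TrimmedString(value, offset + begin, end - begin);
    }

    // Size of the array this instance keeps reachable.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}
```

Under these assumptions, wrapping a substring in the copy constructor yields an instance whose backing array holds only its own characters, so nothing ties it to the original array any more.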

Sometimes, seemingly useless constructs reveal hidden gems of usefulness. Be sure to check out the source for String to find how this works!