When to Construct a new String Object in Java
The String class in Java represents a sequence of characters. As with all classes, an object of the String class can be created using the new keyword. Yet Java also provides shorthand ways to create a String object, for instance using a quoted literal, or using the methods of the String class to create a modified version of a previously created String. These often remove the need to ever call the String constructor with the new keyword. So when might it be necessary to create a String using the new keyword?
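As a brief illustration of the three ways of creating a String mentioned above, consider the following sketch. The class name StringCreation is only a placeholder for this example.

```java
public class StringCreation {
    public static void main(String[] args) {
        String literal = "Hello World!";            // created from a quoted literal
        String derived = literal.substring(6);      // created by a String method
        String constructed = new String(literal);   // created with the constructor

        System.out.println(derived);                      // World!
        System.out.println(constructed.equals(literal));  // true: same characters
        System.out.println(constructed == literal);       // false: a distinct object
    }
}
```

The constructor always yields a distinct object, even when its contents equal those of an existing String.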
To answer this question, it helps to understand how the Java String class works 'under the hood'. In older JDKs (Java 6 and Java 7 builds before update 6), every String object contains a field for a character array, plus int fields for an offset and a length (the latter named count). When one creates a new String, its offset is set to 0 and its length to that of the character array. However, when one creates a modified version of a String, say through the substring method, the underlying character array is not copied; the returned String shares the parent's array and only its offset and length differ. Because the String class is immutable this sharing is safe, and in the interest of memory and performance it is often a good thing. Later JDK releases removed the offset and count fields, and substring copies the characters there.
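The layout described above can be modelled with a small stand-in class. This is only a sketch of the idea, not the JDK's actual source: the names SharedArrayModel, Slice, and compact are invented for illustration.

```java
import java.util.Arrays;

public class SharedArrayModel {

    /** Hypothetical stand-in for the array/offset/count layout described above. */
    static final class Slice {
        final char[] value;  // backing array, possibly shared with other slices
        final int offset;    // index of the first visible character
        final int count;     // number of visible characters

        Slice(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        Slice(String s) {
            this(s.toCharArray(), 0, s.length());
        }

        /** Shares the backing array; only offset and count change. */
        Slice substring(int begin) {
            return new Slice(value, offset + begin, count - begin);
        }

        /** Copies just the visible characters into a fresh, trimmed array. */
        Slice compact() {
            return new Slice(Arrays.copyOfRange(value, offset, offset + count), 0, count);
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        Slice parent = new Slice("Hello World!");
        Slice child = parent.substring(6);
        Slice copy = child.compact();

        System.out.println(child);                        // World!
        System.out.println(child.value == parent.value);  // true: same array
        System.out.println(copy.value.length);            // 6
    }
}
```

Here compact plays the role that constructing a new String plays in this article: it keeps only the visible characters.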
To demonstrate what is going on we can inspect the fields of the String class via reflection. The Reflection API allows us to modify or inspect the runtime behavior of an application, in this case to look at the values of the character array, offset, and length of String objects created in different ways.
import java.lang.reflect.Field;

/**
 * Class used to demonstrate how the String class works
 * 'underneath the hood'.
 * Note: the "offset" and "count" fields read below exist only in older
 * JDKs (before Java 7 update 6); on later versions getDeclaredField
 * throws NoSuchFieldException.
 */
public class StringTest {

    public static void main(String[] args) throws Exception {
        String val1 = "Hello World!";
        System.out.println(getStringValues(val1));
        System.out.println(getStringValues(val1.substring(6)));
        System.out.println(getStringValues(new String(val1.substring(6))));
    }

    /**
     * Uses reflection to read the private fields value, offset, and count of a String.
     * @param string The String to inspect
     * @return A Wrapper holding the character array, length, and offset
     * @throws SecurityException
     * @throws NoSuchFieldException
     * @throws IllegalArgumentException
     * @throws IllegalAccessException
     */
    public static Wrapper getStringValues(String string) throws SecurityException,
            NoSuchFieldException, IllegalArgumentException, IllegalAccessException {
        // The underlying character array
        Field field = string.getClass().getDeclaredField("value");
        field.setAccessible(true);
        char[] chars = (char[]) field.get(string);
        // The number of characters visible through this String
        field = string.getClass().getDeclaredField("count");
        field.setAccessible(true);
        int length = field.getInt(string);
        // The index in the array at which this String starts
        field = string.getClass().getDeclaredField("offset");
        field.setAccessible(true);
        int offset = field.getInt(string);
        return new Wrapper(chars, length, offset);
    }

    /**
     * Class used to wrap String values obtained by reflection so that the
     * values can be inspected using the toString method.
     * @author Greg Cope
     */
    public static class Wrapper {

        private int length;
        private int offset;
        private char[] value;

        /**
         * Constructs a new Wrapper object with value, length, and offset
         * @param value The character value array
         * @param len The length
         * @param offset The start offset of the String.
         */
        public Wrapper(char[] value, int len, int offset) {
            this.value = value;
            this.length = len;
            this.offset = offset;
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            sb.append("Length: ").append(length).append(" Offset: ").append(offset).append(" Value: ");
            sb.append(value); // appends the whole array, not just the visible part
            return sb.toString();
        }
    }
}
The above code creates three String objects in its main method. The first String is created from the literal "Hello World!", the second is created by calling the substring method on the first, and the third is created like the second, but the result of substring is then passed to the String constructor using the new keyword.
And the result from running the code:
Length: 12 Offset: 0 Value: Hello World!
Length: 6 Offset: 6 Value: Hello World!
Length: 6 Offset: 0 Value: World!
Notice that the second String's underlying character array never changed - the String returned by the substring method holds the same character array as the parent String; only the offset and length differ. The last statement of the main method, however, creates a new String from the returned substring. In this case the character array of the resulting String is different from the first and holds only the six characters it needs, demonstrating that constructing a String with the new keyword creates a new, trimmed character array.
So when might this be useful? Quite often one does not need to worry about constructing a String using the new keyword and the appropriate constructor. However, when parsing large volumes of text one might unwittingly call a method such as substring on a large String object 'A' to return another object 'B', retaining object 'B' and letting 'A' go out of scope. The String object 'B' may still be holding on to the original character array, so 'B' may be using far more memory than necessary, in effect creating a pseudo memory leak. In fields such as bioinformatics, where very long character sequences are typical, this can inflate the memory footprint of an application, and even cause an OutOfMemoryError when enough of these retained arrays accumulate. As the above code demonstrates, being aware that constructing a new String overcomes this problem can help prevent such behavior in these contexts.
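The scenario above might look like the following sketch, which keeps only a short identifier from each long line of input. The class name CompactTokens, the method firstField, and the tab-separated sample data are assumptions made for illustration. On JDKs where substring already copies the characters (Java 7 update 6 and later) the extra constructor call is harmless but no longer needed.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactTokens {

    /**
     * Returns the text before the first tab as an independent String,
     * so that the (possibly very long) line is not kept alive by it.
     */
    static String firstField(String line) {
        int tab = line.indexOf('\t');
        String field = (tab < 0) ? line : line.substring(0, tab);
        // Copy into a String with its own, trimmed character storage.
        return new String(field);
    }

    public static void main(String[] args) {
        // Hypothetical input: a short identifier, a tab, then a long sequence.
        String[] lines = {
            "seq1\tACGTACGTACGTACGT",
            "seq2\tTTGACCATTGACCATT"
        };

        List<String> ids = new ArrayList<>();
        for (String line : lines) {
            ids.add(firstField(line)); // only the short identifier is retained
        }
        System.out.println(ids); // [seq1, seq2]
    }
}
```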