Java 11 Developer Certification - String Operations

October 04, 2020

What we are covering in this lesson

  1. String Concatenation
  2. String Manipulations
  3. Text Search in String
  4. Joining and splitting Strings
  5. String Replacement methods and text transformation
  6. Other methods

String Concatenation

String concatenation is the most commonly used operation on Strings. It can be performed with the '+' operator, which Java overloads for Strings, or with the String concat method.

package maxCode.online.String;

public class StringConcatenation {
	public static void main(String[] args) {
		String s1 = "First";
		String s2 = "Second";
		
        //Using the overloaded '+' operator
		String concatString = s1 + " " + s2;
		System.out.println(concatString);

        String s3 = "Concat";
        String s4 = "String";
        
        //Using the concat method
        String s5 = s3.concat(s4);
        System.out.println(s5);

        //Concatenating object with String
		Object obj = null;
		String s6 = "Test";
		String op1 = obj + " " + s6;
		System.out.println(op1);
		
		//Concatenating int with String
		int i = 100;
		String s7 = "Test";
		String op2 = i + " " + s7;
		System.out.println(op2);

        //Arithmetic operations with String concatenation
        int i1 = 10;
		int i2 = 20;
		System.out.println(i1 + i2);
		System.out.println(i1 + i2 + " " + i1 + i2);
		System.out.println(i1 + i2 + " " + (i1 + i2));
	}
}

The output will be

First Second
ConcatString
null Test
100 Test
30
30 1020
30 30

The last 3 outputs are interesting and worth understanding in detail. The output 30 is a plain addition of the two variables. In the next output, 30 1020, the expression is evaluated left to right: arithmetic addition is performed until a String operand is encountered (in our case " "). From that point on, every + is a String concatenation, so the numbers are appended one after the other to give 1020. If we need an arithmetic operation after a String operand, we have to wrap it in parentheses, because parentheses are evaluated first. This explains the last output, 30 30.
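As a small sketch of the left-to-right rule described above (the class and method names are made up for illustration), the position of the first String operand decides where addition stops and concatenation starts:

```java
class ConcatOrderSketch {
    // String operand comes last: a + b is added first, then concatenated
    static String sumThenText(int a, int b) {
        return a + b + "!";
    }

    // String operand comes first: every following + is a concatenation
    static String textThenDigits(int a, int b) {
        return "" + a + b;
    }

    // Parentheses force the addition even after a String operand
    static String textThenSum(int a, int b) {
        return "" + (a + b);
    }

    public static void main(String[] args) {
        System.out.println(sumThenText(10, 20));    // 30!
        System.out.println(textThenDigits(10, 20)); // 1020
        System.out.println(textThenSum(10, 20));    // 30
    }
}
```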

Note: When you concatenate a String with a non-String value, the value is converted to a String first. For a reference type, the object's toString() method is called, and a null reference becomes the text null (as seen in the output null Test above). A primitive value is converted as if it were boxed to its wrapper type and toString() were called on it.
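The conversion rules in the note can be tried out with a short sketch. The Point class and the describe method below are hypothetical and exist only for this illustration:

```java
class ConversionSketch {
    // Hypothetical type used only to show that toString() is picked up
    static class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + "," + y + ")";
        }
    }

    static String describe(Object value) {
        // A null reference is converted to the text "null"; no exception is thrown
        return "value=" + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2))); // value=(1,2)
        System.out.println(describe(null));            // value=null
        System.out.println(describe(3.5));             // value=3.5
        System.out.println("flag=" + true + 'c' + 7L); // flag=truec7
    }
}
```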

Let's have a look at some more concatenation of non-string components with a String object.

package maxCode.online.String;

public class StringCompoundConcatenation {
	public static void main(String[] args) {
		String compString1 = "Compound";
		int abc = 123;
		String op = compString1 + abc;
		System.out.println(op);
		
		String compString2 = "The value is equal to ";
		compString2 += 10 + 20;
		System.out.println(compString2);
		
		String compString3 = "This value is ";
		compString3 = compString3 + 10 + 20;
		System.out.println(compString3);
		
		String compString4 = "This value is ";
		compString4 = compString4 + (10 + 20);
		System.out.println(compString4);
	}
}

Output is as below

Compound123
The value is equal to 30
This value is 1020
This value is 30

Let's understand the above output. The first output is straightforward. The second output is interesting, as we have a String and ints together with the += operator. Here the right-hand side, 10 + 20, is evaluated first, so the int values are summed and the result is then concatenated to the String. The result differs if we use + and = separately, as the third output clearly shows: the expression is evaluated left to right, so 10 and then 20 are each concatenated. And as seen before, parentheses are evaluated first, so wrapping the ints in them gives int addition followed by String concatenation.
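The difference between += and a plain + can be reduced to two one-liners. This is a minimal sketch with hypothetical method names; the comments show the equivalent fully parenthesized form:

```java
class CompoundAssignSketch {
    static String withCompound(String prefix) {
        String s = prefix;
        s += 10 + 20;      // same as s = s + (10 + 20)
        return s;
    }

    static String withPlainPlus(String prefix) {
        String s = prefix;
        s = s + 10 + 20;   // same as s = (s + 10) + 20
        return s;
    }

    public static void main(String[] args) {
        System.out.println(withCompound("Total: "));  // Total: 30
        System.out.println(withPlainPlus("Total: ")); // Total: 1020
    }
}
```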

String Manipulation

There are a lot of String methods, and it is next to impossible to review them all individually. The most important point to note is that indexes begin at 0 and end at string.length()-1. For methods that take a range, the start index is inclusive, meaning the character at the start index is included, but the end index is exclusive, meaning the character at the end index is NOT included in the result. So, if you need the complete string, the start index should be 0 and the end index should be string.length().
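A quick way to internalize the index rules is to run them on a short string. The sketch below uses made-up helper names:

```java
class IndexBoundsSketch {
    static char lastChar(String s) {
        return s.charAt(s.length() - 1); // highest valid index
    }

    static String firstN(String s, int n) {
        return s.substring(0, n);        // covers indexes 0 to n-1
    }

    public static void main(String[] args) {
        String s = "MaxCode";
        System.out.println(lastChar(s));                    // e
        System.out.println(firstN(s, 3));                   // Max
        System.out.println(s.substring(0, s.length()));     // MaxCode
        // When begin and end are equal, the range is empty
        System.out.println(s.substring(3, 3).isEmpty());    // true
    }
}
```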

Comparison Operations

  • equals
  • equalsIgnoreCase
  • contentEquals
  • compareTo
  • isEmpty
  • isBlank

Text Searches

  • contains
  • equalsIgnoreCase
  • endsWith
  • indexOf
  • lastIndexOf
  • matches
  • startsWith

Text Manipulation

  • concat
  • join (introduced in Java 8)
  • replace
  • replaceAll
  • replaceFirst
  • split
  • substring
  • subSequence

Text Transformation

  • chars (introduced in Java 9)
  • codePoints (introduced in Java 9)
  • format
  • lines (Java 11)
  • repeat (Java 11)
  • strip (Java 11)
  • stripLeading (Java 11)
  • stripTrailing (Java 11)
  • toCharArray
  • toLowerCase
  • toUpperCase
  • trim
  • valueOf

Below is a simple example of equals and == comparison.

package maxCode.online.String;

public class StringManipulations {
	public static void main(String[] args) {
		String nullString = null;
		String str1 = "MaxCode.Online";
		String str2 = "MaxCode.Online";
		String str3 = "maxCode.ONLINE";
		Object null1 = null;
		
		//==
		//returns true when both objects refer to the same object, not the same value
		System.out.println(" == for " + str1 + " and " + str2 + " is " + (str1 == str2));
		System.out.println(" null == for " + nullString + " and " + str1 + " is " + (nullString == str1));
		System.out.println(" == null for " + str1 + " and " + nullString + " is " + (str1 == nullString));
		
		//Equals
		//returns true if references are the same,
	    //OR if parameter type is String AND both values are the same
		System.out.println("equals for " + str1 + " and " + str2 + " is " + str1.equals(str2));
		System.out.println("equalsNullString for " + str1 + " and " + nullString + " is " + str1.equals(nullString));
		System.out.println("equalsNullObject for " + str1 + " and " + null1 + " is " + str1.equals(null1));
		
		//equalsIgnoreCase
		//returns true if two Strings have same value ignoring case
		System.out.println("EqualsIgnoreCase for " + str1 + " and " + str3 + " is " + str1.equalsIgnoreCase(str3));
		
		//contentEquals
		//returns true if values in String and the second passed parameter are the same
		System.out.println("contentEquals for " + str1 + " is " + str1.contentEquals(str2));
		
		//compareTo
		System.out.println("compareTo for " + str1 + " is " + str1.compareTo(str2));
		
		//compareToIgnoreCase
		System.out.println("compareToIgnoreCase for " + str1 + " is " + str1.compareToIgnoreCase(str2));
		
		//isEmpty
		System.out.println("isEmpty for " + str1 + " is " + str1.isEmpty());
		
		//isBlank (Java 11)
		System.out.println("isBlank for " + str1 + " is " + str1.isBlank());
		
		//Below line will throw NullPointerException
		//System.out.println("Nullequals" + nullString.equals(str1));
	}
}

Output would be as below

 == for MaxCode.Online and MaxCode.Online is true
 null == for null and MaxCode.Online is false
 == null for MaxCode.Online and null is false
equals for MaxCode.Online and MaxCode.Online is true
equalsNullString for MaxCode.Online and null is false
equalsNullObject for MaxCode.Online and null is false
EqualsIgnoreCase for MaxCode.Online and maxCode.ONLINE is true
contentEquals for MaxCode.Online is true
compareTo for MaxCode.Online is 0
compareToIgnoreCase for MaxCode.Online is 0
isEmpty for MaxCode.Online is false
isBlank for MaxCode.Online is false

Note the last commented line. It will throw a NullPointerException if uncommented, because the String variable on the left-hand side of equals is null. So call equals on a String that is known to be non-null, and pass any String that may be null as the argument on the right-hand side.
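Two common ways to apply this advice are sketched below: calling equals on a literal, and using java.util.Objects.equals when either side may be null. The method names isOnline and same are hypothetical.

```java
import java.util.Objects;

class NullSafeEqualsSketch {
    static boolean isOnline(String status) {
        // The literal can never be null, so this is safe for a null argument
        return "online".equals(status);
    }

    static boolean same(String a, String b) {
        // Objects.equals handles null on either side
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(isOnline(null));     // false
        System.out.println(isOnline("online")); // true
        System.out.println(same(null, null));   // true
        System.out.println(same(null, "x"));    // false
    }
}
```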

Similarly, try out the other operations individually to familiarize yourself with the String manipulation methods.

Text Search in String

We will be covering the below methods for String text search understanding.

  • indexOf
  • lastIndexOf
  • regionMatches

An important thing to note is that the start offset parameter is inclusive and the end offset parameter is exclusive.

package maxCode.online.String;

public class TextSearch {
	public static void main(String[] args) {
		String text = "abcdefghijklmnopdefqrstuvwxyzdef";
		String searchString = "def";
		
		System.out.println(text.indexOf(searchString));
		System.out.println(text.lastIndexOf(searchString));
		//search starts from offset (index 20) and looks for def backwards
		System.out.println(text.lastIndexOf(searchString, 20));	
	}
}

Output will be as below

3
29
16

When lastIndexOf is used with an offset, it starts from the offset index and searches backwards for the search string. So the last output is 16.
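indexOf has a matching overload that takes a starting index and searches forwards. As a sketch (the helper allIndexesOf is hypothetical, not a String method), it can be used to collect every occurrence in the text from the example above:

```java
import java.util.ArrayList;
import java.util.List;

class FindAllSketch {
    static List<Integer> allIndexesOf(String text, String target) {
        if (target.isEmpty()) {
            // An empty target matches at every position; rejected to keep the loop finite
            throw new IllegalArgumentException("target must not be empty");
        }
        List<Integer> hits = new ArrayList<>();
        // indexOf(target, fromIndex) searches forward, starting at fromIndex (inclusive)
        for (int i = text.indexOf(target); i >= 0; i = text.indexOf(target, i + 1)) {
            hits.add(i);
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "abcdefghijklmnopdefqrstuvwxyzdef";
        System.out.println(allIndexesOf(text, "def")); // [3, 16, 29]
    }
}
```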

matches, on the other hand, returns a boolean value (true or false). It returns true only if the entire string matches the given regular expression. We usually pass a regular expression to matches; passing the whole string literally as the parameter would be pointless, since equals does the same job.

Now let's have a look at some examples for matches.

package maxCode.online.String;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Matches {
	public static void main(String[] args) {
        // Matches uses regex Pattern matching, so will only match whole string
        String mississippi = "mississippi";
        boolean matches = mississippi.matches("miss");
        boolean indexMatch = (mississippi.indexOf("miss") > -1);
        System.out.println("mississippi.matches(\"miss\") = " + matches);
        System.out.println("mississippi.indexOf(\"miss\") > -1 = " + indexMatch);

        // Let's try the whole string now...
        matches = mississippi.matches("mississippi");
        System.out.println("mississippi.matches(\"mississippi\") = " + matches);
        matches = mississippi.matches("missi(.*)");
        System.out.println("mississippi.matches(\"missi(.*)\") = " + matches);

        // You can also use Pattern & Matcher to do the same thing
        Pattern p = Pattern.compile("(.*)miss(.*)");
        Matcher m = p.matcher("mississippi");
        System.out.println("m.matches() = " + m.matches());

        // Region Matches, exact case, substring match
        System.out.println(mississippi.regionMatches(0, "miss", 0, 4));

        // Region Matches with ignore case
        System.out.println(mississippi.regionMatches(true, 0, "MISS", 0, 4));

        // Look for "miss" starting at index 1 in mississippi
        System.out.println(mississippi.regionMatches(1, "miss", 0, 4));

        // Look for "iss" starting at index 1 in mississippi
        System.out.println(mississippi.regionMatches(1, "miss", 1, 3));

        // Look for "iss" starting at index 4 in mississippi
        System.out.println(mississippi.regionMatches(4, "miss", 1, 3));
    }
}

Output

mississippi.matches("miss") = false
mississippi.indexOf("miss") > -1 = true
mississippi.matches("mississippi") = true
mississippi.matches("missi(.*)") = true
m.matches() = true
true
true
false
true
true

So as we can see in the above code, regionMatches does not check whether a substring occurs somewhere in the full string. Instead, it evaluates whether the region of the source string specified by its offset and length matches a region of the same length in the other string.
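One practical use of regionMatches is a case-insensitive prefix check. The helper below is a hypothetical sketch, not part of the String API:

```java
class RegionMatchSketch {
    // Hypothetical helper: a case-insensitive version of startsWith
    static boolean startsWithIgnoreCase(String text, String prefix) {
        // Compare text[0, prefix.length()) with the whole prefix, ignoring case
        return text.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(startsWithIgnoreCase("mississippi", "MISS")); // true
        System.out.println(startsWithIgnoreCase("mississippi", "sip"));  // false
        // The region would run past the end of "mi", so the result is false
        System.out.println(startsWithIgnoreCase("mi", "miss"));          // false
    }
}
```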

Joining and splitting Strings

Let's look at some concepts on joining and splitting strings.

  • String.join()
  • StringJoiner
  • split

Let's jump directly into the code to understand these concepts. The code below first creates a sentence using a for loop and concatenation. Then we use join() (introduced in Java 8), which does the same concatenation without any loop. We can also use join() with an array of strings or a list of strings.

Then we have StringJoiner, which was also introduced in Java 8. It is in the java.util package and is used to construct a sequence of characters separated by a delimiter, optionally starting with a supplied prefix and ending with a supplied suffix.
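The prefix and suffix are passed to the three-argument StringJoiner constructor. A minimal sketch, with a hypothetical helper named bracketed:

```java
import java.util.StringJoiner;

class JoinerPrefixSuffixSketch {
    static String bracketed(String... items) {
        // delimiter, prefix, suffix
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String item : items) {
            joiner.add(item);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketed("a", "b", "c")); // [a, b, c]
        // With nothing added, only the prefix and suffix are returned
        System.out.println(bracketed());              // []
    }
}
```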

Now let's look at split, which takes a string and splits it into an array of strings based on a separator. The separator is a regular expression; we have used one that is simply the space character (Unicode character 0020). We can also use a regex for any whitespace or any other separator we want. Also, we can limit the number of resulting elements by passing a number as the second parameter, as shown in the last output.

package maxCode.online.String;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.StringJoiner;

public class JoinSplitString {
	public static void main(String[] args) {
		// Data init
        String[] wordArray = new String[]{"Hello", "World", "How", "are", "you"};
        ArrayList<String> wordList = new ArrayList<>(Arrays.asList(wordArray));
        String sentence = "";

        for (String word : wordArray) {
            sentence += word + " ";
        }
        System.out.println("concat sentence using loop: " + sentence);

        // join using a variable list of CharSequence elements
        sentence = String.join(" ", "Hello", "World", "How", "are", "you");
        System.out.println("sentence using join (no loop needed): " + sentence);

        // join with an array (passed as varargs) or an Iterable such as a List of String
        sentence = String.join(" ", wordArray);
        System.out.println("sentence using join, Array of String: " + sentence);

        sentence = String.join(" ", wordList);
        System.out.println("sentence using join, ArrayList of String: " + sentence);

        // StringJoiner with delimiter
        StringJoiner joiner = new StringJoiner(" ");
        for (String word : wordArray) {
        	joiner.add(word);
        }
        sentence = joiner.toString();
        System.out.println("sentence using StringJoiner: " + sentence);

        // split examples
        // First make sure sentence is delimited by a space to test
        sentence = String.join(" ", wordList);
        // split using space
        String[] splitWords = sentence.split(" ");
        System.out.println("Split only on spaces: " + Arrays.toString(splitWords));

        //create a sentence with multiple white space
        sentence = String.join("\u0020\t\u0020\n", wordList);
        System.out.println("sentence with spaces and tabs and carriage returns: " + sentence);

        // regular expression matches any white space
        splitWords = sentence.split("\\s+");
        System.out.println("Split on any whitespace: " + Arrays.toString(splitWords));

        sentence = String.join(" ", wordList);  // resetting sentence
        // The second parameter limits the number of elements in the result
        splitWords = sentence.split(" ", 2);
        System.out.println("Split with a limit 2: " + Arrays.toString(splitWords));
	}
}

Output

concat sentence using loop: Hello World How are you 
sentence using join (no loop needed): Hello World How are you
sentence using join, Array of String: Hello World How are you
sentence using join, ArrayList of String: Hello World How are you
sentence using StringJoiner: Hello World How are you
Split only on spaces: [Hello, World, How, are, you]
sentence with spaces and tabs and carriage returns: Hello 	 
World 	 
How 	 
are 	 
you
Split on any whitespace: [Hello, World, How, are, you]
Split with a limit 2: [Hello, World How are you]
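The limit parameter also controls what happens to trailing empty strings. The sketch below goes slightly beyond the example above by trying a negative limit on a made-up comma-separated input; the helper parts is hypothetical:

```java
import java.util.Arrays;

class SplitLimitSketch {
    static String[] parts(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String csv = "a,b,,c,,";
        // No limit (same as limit 0): trailing empty strings are dropped
        System.out.println(Arrays.toString(csv.split(",")));   // [a, b, , c]
        // Negative limit: trailing empty strings are kept
        System.out.println(Arrays.toString(parts(csv, -1)));   // [a, b, , c, , ]
        // Positive limit: at most that many elements, the last one holds the rest
        System.out.println(Arrays.toString(parts(csv, 3)));    // [a, b, ,c,,]
    }
}
```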

String Replacement methods and text transformation

We have the replace and substring methods, which are frequently used. The most important thing to remember is that these methods DO NOT change the actual String value (as we already know, String is immutable). They simply return a new string holding either the substring or the replaced value. The variable's value changes only if the returned value is assigned back to it. Let's try to understand this using some examples.

package maxCode.online.String;

public class StringReplaceAndTransform {
	public static void main(String[] args) {
		String srcString = "MaxCode.Online";
		
		//Replacing a single character
		System.out.println("Replacing 'd' with 'D' -- " + srcString.replace('d', 'D'));
		
		//Replacing a charSequence
		System.out.println("Replacing 'Code' with 'Java' -- " + srcString.replace("Code", "Java"));
		
		//ReplaceAll with regex, nl or ne replace with AA
		System.out.println("ReplaceAll 'nl' or 'ne' with 'AA' -- " + srcString.replaceAll("(n(l|e))", "AA"));
	
		//ReplaceFirst to replace only first occurrence of nl or ne
		System.out.println("Replace only first instance of 'nl' or 'ne' with 'AA' -- " 
				+ srcString.replaceFirst("(n(l|e))", "AA"));
		
		//SubString
		System.out.println("Substring after index 5 -- " + srcString.substring(5));
		//note that end index char at 10 is not included
		System.out.println("Substring after index 5 till index 10 -- " + srcString.substring(5, 10));
		
		//valueOf
		char[] charArr = {'a', 'b', 'c', 'd', 'e'};
		//str holds the same characters as charArr, to compare substring with valueOf
		String str = String.valueOf(charArr);
		System.out.println("ValueOf from index 2 and length 2 -- " + String.valueOf(charArr, 2, 2));
		System.out.println("subString -- " + str.substring(1, 3) 
			+ " VS valueOf -- " + String.valueOf(charArr, 1, 3) + 
			" for params 1 and 3");

		//SubSequence, same as substring with begin and end index
		System.out.println("Subsequence after index 5 till index 10 -- " + srcString.subSequence(5, 10));
		
		//Re-assigning the output to the same string
		System.out.println("Original String -- " + srcString);
		srcString = srcString.substring(8);
		System.out.println("After assigning output to same variable -- " + srcString);
	}
}

Output will be as per below

Replacing 'd' with 'D' -- MaxCoDe.Online
Replacing 'Code' with 'Java' -- MaxJava.Online
ReplaceAll 'nl' or 'ne' with 'AA' -- MaxCode.OAAiAA
Replace only first instance of 'nl' or 'ne' with 'AA' -- MaxCode.OAAine
Substring after index 5 -- de.Online
Substring after index 5 till index 10 -- de.On
ValueOf from index 2 and length 2 -- cd
subString -- bc VS valueOf -- bcd for params 1 and 3
Subsequence after index 5 till index 10 -- de.On
Original String -- MaxCode.Online
After assigning output to same variable -- Online

The thing to note in the above output is the last pair of lines. Unless the result is assigned back to the original variable, the value of that variable remains the same. The rest of the output is self-explanatory. We also have the valueOf method, which is similar to substring() but is a static method that takes a character array as its first parameter, followed by the start index and a length. Also note the difference in the output subString -- bc VS valueOf -- bcd for params 1 and 3: for the same characters, substring() treats 3 as an end index, whereas valueOf treats it as a length.

There is a new method repeat(n) introduced in Java 11, which repeats the string n times. So if srcString still holds MaxCode.Online and we use srcString = srcString.repeat(2), the srcString value will be MaxCode.OnlineMaxCode.Online. If n is 0, repeat returns an empty string.
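A short sketch of repeat in use; the underline helper is a hypothetical example of where it comes in handy:

```java
class RepeatSketch {
    // Hypothetical helper: underline a title with as many dashes as it has characters
    static String underline(String title) {
        return title + "\n" + "-".repeat(title.length());
    }

    public static void main(String[] args) {
        System.out.println("ab".repeat(3));           // ababab
        System.out.println("ab".repeat(0).isEmpty()); // true
        System.out.println(underline("Java"));
        // A negative count throws IllegalArgumentException
    }
}
```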

We also have another new method, strip(), introduced in Java 11, which is now the preferred method going forward. It is quite similar to trim(); the difference is in how whitespace is recognized. trim() removes leading and trailing characters whose code point is <= U+0020, whereas strip() removes those for which Character.isWhitespace() returns true. Hence strip() also handles Unicode whitespace characters above U+0020 that trim() leaves in place, and it comes with the companion methods stripLeading() and stripTrailing().
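The difference shows up with a Unicode space such as U+2003 (EM SPACE). The sample string below is made up for this sketch:

```java
class StripVsTrimSketch {
    // U+2003 (EM SPACE) is above U+0020, so trim() leaves it in place,
    // but Character.isWhitespace() reports true for it, so strip() removes it
    static final String PADDED = "\u2003 hello \u2003";

    public static void main(String[] args) {
        System.out.println(PADDED.length());                 // 9
        System.out.println(PADDED.trim().length());          // 9
        System.out.println(PADDED.strip().length());         // 5
        System.out.println(PADDED.stripLeading().length());  // 7
        System.out.println(PADDED.stripTrailing().length()); // 7
    }
}
```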

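There are 3 more methods that return streams: chars and codePoints (listed above as Java 9, when String gained its own versions; both were already available in Java 8 through the CharSequence interface) and lines (Java 11). We will not be covering these in detail here, but you can go ahead and practice.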
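As a starting point for that practice, here is a minimal sketch of the three methods. The helper names and sample inputs are assumptions made for illustration:

```java
import java.util.List;
import java.util.stream.Collectors;

class StringStreamsSketch {
    static long countUpperCase(String s) {
        // chars() yields an IntStream of UTF-16 code units
        return s.chars().filter(c -> Character.isUpperCase(c)).count();
    }

    static long codePointCount(String s) {
        // codePoints() counts a surrogate pair as a single code point
        return s.codePoints().count();
    }

    static List<String> nonBlankLines(String text) {
        // lines() splits on \n, \r or \r\n
        return text.lines().filter(line -> !line.isBlank()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("MaxCode.Online"));         // 3
        System.out.println("a\uD83D\uDE00".length());                 // 3
        System.out.println(codePointCount("a\uD83D\uDE00"));          // 2
        System.out.println(nonBlankLines("one\n\ntwo\r\nthree"));     // [one, two, three]
    }
}
```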

So to summarize the important points:

  • Strings are immutable, any method applied on string does not change its value
  • String constructors and valueOf methods use a start offset (inclusive) and a count of characters
  • Most other range-based methods, such as substring and subSequence, use a start offset (inclusive) and an end offset (exclusive)
  • Methods matches(), split(), replaceAll(), replaceFirst() use regular expression
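The last point matters most when the pattern contains a regex metacharacter such as the dot. The sketch below contrasts the literal replace with the regex-based methods; the version string and the partCount helper are made up for illustration:

```java
import java.util.regex.Pattern;

class RegexVsLiteralSketch {
    static int partCount(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        String version = "11.0.2";
        // replace treats its arguments literally
        System.out.println(version.replace(".", "-"));            // 11-0-2
        // replaceAll treats "." as a regex that matches every character
        System.out.println(version.replaceAll(".", "-"));         // ------
        // An unescaped dot splits on every character, leaving nothing
        System.out.println(partCount(version, "."));              // 0
        // Escape the dot, or let Pattern.quote do it
        System.out.println(partCount(version, "\\."));            // 3
        System.out.println(partCount(version, Pattern.quote("."))); // 3
    }
}
```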

Other methods

compareTo and compareToIgnoreCase

We have some special string methods, compareTo() and compareToIgnoreCase(), which return an int. The result is derived from the conditions below

  • If both strings are exactly the same, the return value is 0
  • If we do str1.compareTo(str2) and str1 is a prefix of str2 (it matches str2 from index 0), the return value is str1.length() - str2.length()
  • If we do str2.compareTo(str1) and str2 starts with str1 from index 0, the return value is str2.length() - str1.length()
  • Otherwise, the return value is str1.charAt(indexWhereStringsDiffer) - str2.charAt(indexWhereStringsDiffer), using the first index at which the strings differ

Let's look at some simple examples

package maxCode.online.String;

public class StringCompareTo {
	public static void main(String[] args) {
		String str1 = "MaxCode.Online";
		String str2 = "Max";
		
		System.out.println("MaxCode.Online compareTo Max = " + str1.compareTo(str2));	//str1.length() - str2.length()
		System.out.println("Max compareTo MaxCode.Online = " + str2.compareTo(str1));	//str2.length() - str1.length()
		
		String str3 = "max";
		
		System.out.println("Max compareTo max = " + str2.compareTo(str3));
		System.out.println("max compareTo Max = " + str3.compareTo(str2));
		System.out.println("Max compareTo max ignoring case = " + str2.compareToIgnoreCase(str3));
		
		String str4 = "Max";
		System.out.println("Max compareTo Max = " + str4.compareTo(str2));
	}
}

Output

MaxCode.Online compareTo Max = 11
Max compareTo MaxCode.Online = -11
Max compareTo max = -32
max compareTo Max = 32
Max compareTo max ignoring case = 0
Max compareTo Max = 0
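The rules listed before the example can be written out as a small method. This is a simplified re-implementation for illustration only, not the JDK source; in real code, call String.compareTo directly.

```java
class CompareToSketch {
    // Simplified sketch of the comparison rules; the name compare is hypothetical
    static int compare(String a, String b) {
        int shared = Math.min(a.length(), b.length());
        for (int i = 0; i < shared; i++) {
            int diff = a.charAt(i) - b.charAt(i);
            if (diff != 0) {
                return diff;                // the first differing character decides
            }
        }
        return a.length() - b.length();     // equal strings, or one is a prefix of the other
    }

    public static void main(String[] args) {
        System.out.println(compare("MaxCode.Online", "Max")); // 11
        System.out.println(compare("Max", "max"));            // -32
        System.out.println(compare("Max", "Max"));            // 0
    }
}
```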

valueOf, substring and String constructor

package maxCode.online.String;

public class StringValueOf {
	public static void main(String[] args) {
		String str = "MaxCode.Online";
		
		String strCons = new String(str.toCharArray(), 3, 4);	//offset, length
		System.out.println("Using String Constructor -- " + strCons);
		String valOf = String.valueOf(str.toCharArray(), 3, 4); //offset, length
		System.out.println("Using ValueOf -- " + valOf);
		String subStr = str.substring(3, 4);	//offset, endIndex
		System.out.println("Using subString -- " + subStr);
		String subStr1 = str.substring(3, 7);
		System.out.println("Using subString -- " + subStr1);
	}
}

Output

Using String Constructor -- Code
Using ValueOf -- Code
Using subString -- C
Using subString -- Code

As you can see, the String constructor and the valueOf method take an offset and a length as parameters, whereas substring takes a begin index and an end index.
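Converting between the two conventions is a matter of adding the offset to the count. The helper below is a hypothetical sketch:

```java
class OffsetCountSketch {
    // Hypothetical helper: substring expressed with an offset and a count
    static String substringByCount(String s, int offset, int count) {
        return s.substring(offset, offset + count); // end index = offset + count
    }

    public static void main(String[] args) {
        String str = "MaxCode.Online";
        System.out.println(substringByCount(str, 3, 4));             // Code
        System.out.println(String.valueOf(str.toCharArray(), 3, 4)); // Code
    }
}
```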

replace and replaceAll

The important thing to remember about replace and replaceAll is that if nothing in the string matches what is to be replaced, the method does not create a new string and instead returns a reference to the old string.

String s1 = "aabbccdd";
String s2 = s1.replace('Z', 'A');
System.out.println("s1 == s2 = " + (s1 == s2));

The above code prints s1 == s2 = true, since there is nothing to replace and so s2 refers to the same object as s1. The same is true for replaceAll and replaceFirst.

If both parameters have the same character value, as in s1.replace('a', 'a'), no new string object is created. Note that this is only true for char parameters. If the parameters are strings, as in s1.replace("a", "a"), the method creates a new string when it replaces and does not return the original one. So if we do String s2 = s1.replace("a", "a") and then System.out.println("s1 == s2 = " + (s1 == s2)), the output would be s1 == s2 = false.

substring

Creating a substring with start index 0 and end index equal to the length of the string does not create a new string. So if String str2 = str1.substring(0, str1.length()), then str1 == str2 is true.

concatenation

// compiler calculates this expression to the constant "abc:def" so only one
// string is created at runtime on this line
String s1 = "abc" + ":" + "def";

// some more string setup
String s2 = "s2";
String s3 = "s3";
String s4 = "s4";

// The following statement conceptually creates 3 string objects
// object1 = s4 + s3
//         object2 = object1 + s2
//                object3 = object2 + s1
// If you rewrite this as s5 = (((s4 + s3) + s2) + s1)
// it might be easier to count the number of objects created
// by counting left parentheses groupings
String s5 = s4 + s3 + s2 + s1;

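The above code creates a total of 4 string objects: 1 for the first line and the remaining 3 for the last line. The easier way to understand this is (((s4 + s3) + s2) + s1): one object containing s4 + s3, then another object containing (s4 + s3) + s2, and the final object containing the complete concatenation ((s4 + s3) + s2) + s1. This is the conceptual count; the Java Language Specification allows an implementation to perform the concatenation in one step and skip the intermediate String objects.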
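When many pieces are concatenated, especially in a loop, a StringBuilder makes the single resulting String explicit. This is a minimal sketch with a hypothetical helper name, not a statement about what the compiler generates:

```java
class BuilderConcatSketch {
    static String joinWithBuilder(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);    // appends into the builder's buffer, no new String yet
        }
        return sb.toString();   // the resulting String is created here
    }

    public static void main(String[] args) {
        String s1 = "abc" + ":" + "def";
        System.out.println(joinWithBuilder("s4", "s3", "s2", s1)); // s4s3s2abc:def
    }
}
```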