Asked  7 Months ago    Answers:  5   Viewed   31 times

While working in a Java app, I recently needed to assemble a comma-delimited list of values to pass to another web service without knowing how many elements there would be in advance. The best I could come up with off the top of my head was something like this:

public String appendWithDelimiter( String original, String addition, String delimiter ) {
    if ( original.equals( "" ) ) {
        return addition;
    } else {
        return original + delimiter + addition;
    }
}

String parameterString = "";
if ( condition ) parameterString = appendWithDelimiter( parameterString, "elementName", "," );
if ( anotherCondition ) parameterString = appendWithDelimiter( parameterString, "anotherElementName", "," );

I realize this isn't particularly efficient, since there are strings being created all over the place, but I was going for clarity more than optimization.

In Ruby, I can do something like this instead, which feels much more elegant:

parameterArray = [];
parameterArray << "elementName" if condition;
parameterArray << "anotherElementName" if anotherCondition;
parameterString = parameterArray.join(",");

But since Java lacks a join command, I couldn't figure out anything equivalent.

So, what's the best way to do this in Java?

 Answers

57

Pre Java 8:

Apache's commons lang is your friend here - it provides a join method very similar to the one you refer to in Ruby:

StringUtils.join(java.lang.Iterable,char)


Java 8:

Java 8 provides joining out of the box via StringJoiner and String.join(). The snippets below show how you can use them:

StringJoiner

StringJoiner joiner = new StringJoiner(",");
joiner.add("01").add("02").add("03");
String joinedString = joiner.toString(); // "01,02,03"

String.join(CharSequence delimiter, CharSequence... elements))

String joinedString = String.join(" - ", "04", "05", "06"); // "04 - 05 - 06"

String.join(CharSequence delimiter, Iterable<? extends CharSequence> elements)

List<String> strings = new LinkedList<>();
strings.add("Java");strings.add("is");
strings.add("cool");
String message = String.join(" ", strings);
//message returned is: "Java is cool"
Tuesday, June 1, 2021
 
rypskar
answered 7 Months ago
41

like this:

var foo = 45;
var bar = '' + foo;

Actually, even though I typically do it like this for simple convenience, over 1,000s of iterations it appears for raw speed there is an advantage for .toString()

See Performance tests here (not by me, but found when I went to write my own): http://jsben.ch/#/ghQYR

Fastest based on the JSPerf test above: str = num.toString();

It should be noted that the difference in speed is not overly significant when you consider that it can do the conversion any way 1 Million times in 0.1 seconds.

Update: The speed seems to differ greatly by browser. In Chrome num + '' seems to be fastest based on this test http://jsben.ch/#/ghQYR

Update 2: Again based on my test above it should be noted that Firefox 20.0.1 executes the .toString() about 100 times slower than the '' + num sample.

Tuesday, June 1, 2021
 
Jubair
answered 7 Months ago
64

I answered this question earlier and Elliot pointed out I was just plain wrong. I apologise to the community.

There is nothing magical about the String.indexOf code. It is not natively optimised or anything like that. You can copy the indexOf method from the String source code and it runs just as quickly.

What we have here is the difference between O() efficiency and actual efficiency. Rabin-Karp for a String of length N and a pattern of length M, Rabin-Karp is O(N+M) and a worst case of O(NM). When you look into it, String.indexOf() also has a best case of O(N+M) and a worst case of O(NM).

If the text contains many partial matches to the start of the pattern Rabin-Karp will stay close to its best-case performance, whilst String.indexOf will not. For example I tested the above code (properly this time :-)) on a million '0's followed by a single '1', and the searched for 1000 '0's followed by a single '1'. This forced the String.indexOf to its worst case performance. For this highly degenerate test, the Rabin-Karp algorithm was about 15 times faster than indexOf.

For natural language text, Rabin-Karp will remain close to best-case and indexOf will only deteriorate slightly. The deciding factor is therefore the complexity of operations performed on each step.

In it's innermost loop, indexOf scans for a matching first character. At each iteration is has to:

  • increment the loop counter
  • perform two logical tests
  • do one array access

In Rabin-Karp each iteration has to:

  • increment the loop counter
  • perform two logical tests
  • do two array accesses (actually two method invocations)
  • update a hash, which above requires 9 numerical operations

Therefore at each iteration Rabin-Karp will fall further and further behind. I tried simplifying the hash algorithm to just XOR characters, but I still had an extra array access and two extra numerical operations so it was still slower.

Furthermore, when a match is find, Rabin-Karp only knows the hashes match and must therefore test every character, whereas indexOf already knows the first character matches and therefore has one less test to do.

Having read on Wikipedia that Rabin-Karp is used to detect plagiarism, I took the Bible's Book of Ruth, removed all punctuation and made everything lower case which left just under 10000 characters. I then searched for "andthewomenherneighboursgaveitaname" which occurs near the very end of the text. String.indexOf was still faster, even with just the XOR hash. However, if I removed String.indexOfs advantage of being able to access String's private internal character array and forced it to copy the character array, then, finally, Rabin-Karp was genuinely faster.

However, I deliberately chose that text as there are 213 "and"s in the Book of Ruth and 28 "andthe"s. If instead I searched just for the last characters "ursgaveitaname", well there are only 3 "urs"s in the text so indexOf returns closer to its best-case and wins the race again.

As a fairer test I chose random 20 character strings from the 2nd half of the text and timed them. Rabin-Karp was about 20% slower than the indexOf algorithm run outside of the String class, and 70% slower than the actual indexOf algorithm. Thus even in the use case it is supposedly appropriate for, it was still not the best choice.

So what good is Rabin-Karp? No matter the length or nature of the text to be searched, at every character compared it will be slower. No matter what hash function we choose we are surely required to make an additional array access and at least two numerical operations. A more complex hash function will give us less false matches, but require more numerical operators. There is simply no way Rabin-Karp can ever keep up.

As demonstrated above, if we need to find a match prefixed by an often repeated block of text, indexOf can be slower, but if we know we are doing that it would look like we would still be better off using indexOf to search for the text without the prefix and then check to see if the prefix was present.

Based on my investigations today, I cannot see any time when the additional complexity of Rabin Karp will pay off.

Friday, August 6, 2021
 
sunshinejr
answered 4 Months ago
91

The other alternative is to use AspectJ and @Configurable. Spring seems to be going towards these days (favoring).

I would look into it if you are using Spring 3 as it is faster (performance) and more flexible than proxy based aop.

Tuesday, November 16, 2021
 
jsuggs
answered 3 Weeks ago
10

Have a look at http://en.wikipedia.org/wiki/Suffix_array. It is possible that what you really want to do is to create a suffix array for each document, and them merge all the suffix arrays, with pointers back to the original versions, so that you can search the collection as one for a string by looking for it as a suffix in the array.

Wednesday, November 17, 2021
 
stupakov
answered 2 Weeks ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :  
Share