Asked  7 Months ago    Answers:  5   Viewed   34 times

For a poor man's implementation of near-collation-correct sorting on the client side I need a JavaScript function that does efficient single character replacement in a string.

Here is what I mean (note that this applies to German text, other languages sort differently):

native sorting gets it wrong: a b c o u z ä ö ü
collation-correct would be:   a ä b c o ö u ü z

Basically, I need all occurrences of "ä" of a given string replaced with "a" (and so on). This way the result of native sorting would be very close to what a user would expect (or what a database would return).

Other languages have facilities to do just that: Python supplies str.translate(), in Perl there is tr/…/…/, XPath has a function translate(), ColdFusion has ReplaceList(). But what about JavaScript?

Here is what I have right now.

// s would be a rather short string (something like 
// 200 characters at max, most of the time much less)
function makeSortString(s) {
  var translate = {
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U"   // probably more to come
  };
  var translate_re = /[öäüÖÄÜ]/g;
  return ( s.replace(translate_re, function(match) { 
    return translate[match]; 
  }) );
}

For starters, I don't like the fact that the regex is rebuilt every time I call the function. I guess a closure can help in this regard, but I don't seem to get the hang of it for some reason.

Can someone think of something more efficient?


Answers below fall in two categories:

  1. String replacement functions of varying degrees of completeness and efficiency (what I was originally asking about)
  2. A late mention of String#localeCompare, which is now widely supported among JS engines (not so much at the time of the question) and could solve this category of problem much more elegantly.

 Answers

51

I can't speak to what you are trying to do specifically with the function itself, but if you don't like the regex being built every time, here are two solutions and some caveats about each.

Here is one way to do this:

function makeSortString(s) {
  if(!makeSortString.translate_re) makeSortString.translate_re = /[öäüÖÄÜ]/g;
  var translate = {
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U"   // probably more to come
  };
  return ( s.replace(makeSortString.translate_re, function(match) { 
    return translate[match]; 
  }) );
}

This will obviously make the regex a property of the function itself. The only thing you may not like about this (or you may, I guess it depends) is that the regex can now be modified outside of the function's body. So, someone could do this to modify the interally-used regex:

makeSortString.translate_re = /[a-z]/g;

So, there is that option.

One way to get a closure, and thus prevent someone from modifying the regex, would be to define this as an anonymous function assignment like this:

var makeSortString = (function() {
  var translate_re = /[öäüÖÄÜ]/g;
  return function(s) {
    var translate = {
      "ä": "a", "ö": "o", "ü": "u",
      "Ä": "A", "Ö": "O", "Ü": "U"   // probably more to come
    };
    return ( s.replace(translate_re, function(match) { 
      return translate[match]; 
    }) );
  }
})();

Hopefully this is useful to you.


UPDATE: It's early and I don't know why I didn't see the obvious before, but it might also be useful to put you translate object in a closure as well:

var makeSortString = (function() {
  var translate_re = /[öäüÖÄÜ]/g;
  var translate = {
    "ä": "a", "ö": "o", "ü": "u",
    "Ä": "A", "Ö": "O", "Ü": "U"   // probably more to come
  };
  return function(s) {
    return ( s.replace(translate_re, function(match) { 
      return translate[match]; 
    }) );
  }
})();
Tuesday, June 1, 2021
 
Jeff
answered 7 Months ago
52

A somewhat inefficient way of doing this:

NSString *s = @"foo/bar:baz.foo";
NSCharacterSet *doNotWant = [NSCharacterSet characterSetWithCharactersInString:@"/:."];
s = [[s componentsSeparatedByCharactersInSet: doNotWant] componentsJoinedByString: @""];
NSLog(@"%@", s); // => foobarbazfoo

Look at NSScanner and -[NSString rangeOfCharacterFromSet: ...] if you want to do this a bit more efficiently.

Wednesday, June 2, 2021
 
Besnik
answered 6 Months ago
33

str.replace is the wrong function for what you want to do (apart from it being used incorrectly). You want to replace any character of a set with a space, not the whole set with a single space (the latter is what replace does). You can use translate like this:

removeSpecialChars = z.translate ({ord(c): " " for c in "!@#$%^&*()[]{};:,./<>?|`~-=_+"})

This creates a mapping which maps every character in your list of special characters to a space, then calls translate() on the string, replacing every single character in the set of special characters with a space.

Monday, June 28, 2021
 
jcubic
answered 6 Months ago
26

If you consider

obj.sort().reverse();

VS

obj.sort((a, b) => (a > b ? -1 : 1))

VS

obj.sort((a, b) => b.localeCompare(a) )

The performance winner is : obj.sort().reverse().

Testing with an array of 10.000 elements, obj.sort().reverse() is faster than obj.sort( function ) (except on chrome), and obj.sort( function ) (using localCompare).

Performance test here :

var results = [[],[],[]]

for(let i = 0; i < 100; i++){
  const randomArrayGen = () => Array.from({length: 10000}, () => Math.random().toString(30));
  const randomArray = randomArrayGen();
  const copyArray = x => x.slice();

  obj = copyArray(randomArray);
  let t0 = performance.now();
  obj.sort().reverse();
  let t1 = performance.now();

  obj = copyArray(randomArray);
  let t2 = performance.now();
  obj.sort((a, b) => (a > b ? -1 : 1))
  let t3 = performance.now();

  obj = copyArray(randomArray);
  let t4 = performance.now();
  obj.sort((a, b) => b.localeCompare(a))
  let t5 = performance.now();  

  results[0].push(t1 - t0);
  results[1].push(t3 - t2);
  results[2].push(t5 - t4);  
}

const calculateAverage = x => x.reduce((a,b) => a + b) / x.length ;

console.log("obj.sort().reverse():                   " + calculateAverage(results[0]));
console.log("obj.sort((a, b) => (a > b ? -1 : 1)):   " + calculateAverage(results[1]));
console.log("obj.sort((a, b) => b.localeCompare(a)): " + calculateAverage(results[2]));
Thursday, July 15, 2021
 
Besnik
answered 5 Months ago
38
int x =  0;
int i= 0;
int j = 0;
while(i != list1.length && j != list2.length){
  int v = list1[i].compareTo(list2[j]);
  if (v == 0){
    x++;i++;j++;
  }else if (v < 0){
    i++;
  } else {
    j++;
  }
}
return x;

This makes the straightforward assumption that the strings are sorted using String.compareTo which would make sense.

Saturday, August 7, 2021
 
Matt Bullock
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :  
Share