Asked  7 Months ago    Answers:  5   Viewed   29 times

I'm looking for a simple function that removes emoji characters from Instagram comments. Here is what I have so far (pieced together from examples I found on SO and other websites):

// PHP class
public static function removeEmoji($string)
{
    // split the string into UTF8 char array
    // for loop inside char array
        // if char is emoji, remove it
    // endfor
    // return newstring
}

Any help would be appreciated

 Answers

68

I think the preg_replace function is the simplest solution.

As EaterOfCode suggests, I read the wiki page and wrote new regexes, since none of the answers on SO (or other websites) seemed to work for Instagram photo captions (in the format the API returns). Note: the /u modifier is mandatory for matching \x{...} Unicode characters.

public static function removeEmoji($text) {

    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    return $clean_text;
}

This function does not remove every emoji, since many more exist outside these five blocks, but it shows the approach.

Please refer to unicode.org - full emoji list (thanks Epoc)
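
The same idea can be collapsed into a single pass with one character class. The sketch below is written in JavaScript (the language of a later answer on this page) rather than PHP, reuses the name removeEmoji purely for illustration, and covers only the five blocks listed above, so emoji from other blocks pass through unchanged.

```javascript
// Sketch: strip the same five Unicode blocks in one pass.
// The `u` flag makes the regex operate on code points, which is required
// for \u{...} escapes above U+FFFF; `g` replaces every match.
const EMOJI_BLOCKS =
  /[\u{1F600}-\u{1F64F}\u{1F300}-\u{1F5FF}\u{1F680}-\u{1F6FF}\u{2600}-\u{26FF}\u{2700}-\u{27BF}]/gu;

function removeEmoji(text) {
  return text.replace(EMOJI_BLOCKS, '');
}

// U+1F600 (Emoticons) and U+2600 (Miscellaneous Symbols) are both removed.
console.log(removeEmoji('Nice shot \u{1F600}\u{2600}!')); // prints "Nice shot !"
```

A single character class avoids scanning the string five times, at the cost of one longer pattern.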

Wednesday, March 31, 2021
 
Octopus
answered 7 Months ago
38
if(preg_match('/\xEE[\x80-\xBF][\x80-\xBF]|\xEF[\x81-\x83][\x80-\xBF]/', $value)

You really want to match Unicode at a character level, rather than trying to keep track of UTF-8 byte sequences. Use the u modifier to treat your UTF-8 string on a character basis.

The emoji are encoded in the block U+1F300–U+1F5FF. However:

  • many characters from Japanese carriers' ‘emoji’ sets are actually mapped to existing Unicode symbols, e.g. the card suits, zodiac signs and some arrows. Do you count these symbols as ‘emoji’ now?

  • there are still systems that don't use the newly standardised Unicode emoji code points and instead use ad-hoc ranges in the Private Use Area. Each carrier had its own encodings; iOS 4 used the Softbank set. You may wish to block the entire Private Use Area.

eg:

function unichr($i) {
    return iconv('UCS-4LE', 'UTF-8', pack('V', $i));
}

if (preg_match('/['.
    unichr(0x1F300).'-'.unichr(0x1F5FF).
    unichr(0xE000).'-'.unichr(0xF8FF).
']/u', $value)) {
    ...
}
Wednesday, March 31, 2021
 
Jauco
answered 7 Months ago
85

After some debugging, I found that Instagram sends the call twice if the callback file does not finish executing fast enough.

Based on the documentation:

Also, you should acknowledge the POST within a 2 second timeout--if you need to do more processing of the received information, you can do so in an asynchronous task.

They send a second request if they don't receive a response to the first one within 2 seconds.

In the end, I tested with a callback.php file containing nothing but a "sleep" call, and it was called twice every time.
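
The pattern the documentation describes (acknowledge first, process later) might look roughly like the sketch below. It is written in JavaScript for brevity rather than in the PHP of callback.php, and every name in it (handleCallback, respond, processQueued, the payload shape) is hypothetical and not part of Instagram's API.

```javascript
// Hypothetical in-memory queue; a real setup would more likely use a
// job queue or a database table so that work survives a restart.
const queue = [];

// `respond` stands in for whatever sends the HTTP response (an assumption).
function handleCallback(payload, respond) {
  queue.push(payload); // cheap bookkeeping only, well inside the 2 second window
  respond(200);        // acknowledge before any slow processing starts
}

// Meant to run later, outside the request (for example from a worker or timer).
function processQueued(work) {
  let handled = 0;
  while (queue.length > 0) {
    work(queue.shift());
    handled += 1;
  }
  return handled;
}

// Usage: the acknowledgement is sent before the payload is processed.
const log = [];
handleCallback({ id: 1 }, (status) => log.push('ack ' + status));
processQueued((payload) => log.push('processed ' + payload.id));
console.log(log.join(', ')); // prints "ack 200, processed 1"
```

Because the response no longer waits on the slow work, the sender has no reason to retry, which is what the duplicate calls above were.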

Wednesday, March 31, 2021
 
Nil
answered 7 Months ago
34

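It is not impossible: give the returned function a valueOf() method, which JavaScript calls when the function is coerced to a primitive.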

function add(initNum) {
    var sum = initNum;
    var callback = function (num) {
        sum += num;
        return callback;
    };
    callback.valueOf = function () {
        return sum;
    };
    return callback;
};
console.log(add(1)(2)==3);            //true
console.log(add(1)(1)+1);             //3
console.log(add(1)(2)(3).valueOf());  //6
Tuesday, August 17, 2021
 
Gili
answered 3 Months ago
63

One solution is to templatize your traverse function to take a function object. Then instead of specifying the parameters in the traverse function, move those parameters to the function object and let the function object's operator() handle the details when called:

template <typename T>
template <typename func>
void LinkedBST<T>::traverse(Node *x, func fn)
{
     if(x == nullptr)
          return;

     traverse(x->left, fn);
     fn(x->val);
     traverse(x->right, fn);
}

struct some_func
{
   int param1;
   int param2;
   int param3;

   some_func(int p1, int p2, int p3) : param1(p1), param2(p2), param3(p3) {}
   void operator()(int node_value) 
   {
      std::cout << "The node value is " << node_value << "\n";
      // the parameters are param1, param2, param3
   }
};

When the operator() is invoked (the function is called), you now have the node value, plus all the parameters you set inside the object.

Then something like this can be done:

Node *node_ptr;
//...
LinkedBST<int> the_list;
//...
some_func f(1,2,3);  // we want to use 1,2,3 as the parameters to the custom function
the_list.traverse(node_ptr, f);

Here is a simplified version showing the basics by using a dummy class.


You could also use a lambda with this technique:

Node *node_ptr;
//...
LinkedBST<int> the_list;
//...
int arg1=1, arg2=2, arg3=3;
the_list.traverse(node_ptr, 
                  [&](int node_value){std::cout << "The node value is " << 
                                      node_value << "\n" << arg1 << " " << 
                                      arg2 << " " << arg3;});
Friday, August 20, 2021
 
Sam Adamsh
answered 2 Months ago