Asked  7 Months ago    Answers:  5   Viewed   64 times

I have a code snippet written in PHP that pulls a block of text from a database and sends it out to a widget on a webpage. The original block of text can be a lengthy article or a short sentence or two; but for this widget I can't display more than, say, 200 characters. I could use substr() to chop off the text at 200 chars, but the result would be cutting off in the middle of words-- what I really want is to chop the text at the end of the last word before 200 chars.

 Answers

51

By using the wordwrap function. It splits the texts in multiple lines such that the maximum width is the one you specified, breaking at word boundaries. After splitting, you simply take the first line:

substr($string, 0, strpos(wordwrap($string, $your_desired_width), "n"));

One thing this oneliner doesn't handle is the case when the text itself is shorter than the desired width. To handle this edge-case, one should do something like:

if (strlen($string) > $your_desired_width) 
{
    $string = wordwrap($string, $your_desired_width);
    $string = substr($string, 0, strpos($string, "n"));
}

The above solution has the problem of prematurely cutting the text if it contains a newline before the actual cutpoint. Here a version which solves this problem:

function tokenTruncate($string, $your_desired_width) {
  $parts = preg_split('/([snr]+)/', $string, null, PREG_SPLIT_DELIM_CAPTURE);
  $parts_count = count($parts);

  $length = 0;
  $last_part = 0;
  for (; $last_part < $parts_count; ++$last_part) {
    $length += strlen($parts[$last_part]);
    if ($length > $your_desired_width) { break; }
  }

  return implode(array_slice($parts, 0, $last_part));
}

Also, here is the PHPUnit testclass used to test the implementation:

class TokenTruncateTest extends PHPUnit_Framework_TestCase {
  public function testBasic() {
    $this->assertEquals("1 3 5 7 9 ",
      tokenTruncate("1 3 5 7 9 11 14", 10));
  }

  public function testEmptyString() {
    $this->assertEquals("",
      tokenTruncate("", 10));
  }

  public function testShortString() {
    $this->assertEquals("1 3",
      tokenTruncate("1 3", 10));
  }

  public function testStringTooLong() {
    $this->assertEquals("",
      tokenTruncate("toooooooooooolooooong", 10));
  }

  public function testContainingNewline() {
    $this->assertEquals("1 3n5 7 9 ",
      tokenTruncate("1 3n5 7 9 11 14", 10));
  }
}

EDIT :

Special UTF8 characters like 'à' are not handled. Add 'u' at the end of the REGEX to handle it:

$parts = preg_split('/([snr]+)/u', $string, null, PREG_SPLIT_DELIM_CAPTURE);

Tuesday, June 1, 2021
 
innovation
answered 7 Months ago
17

This is what I came up with... you should check if the sentence is longer than the len you are looking for.. among other things like what g13n said. It might be better if the sentence is too short/long to chopping it off and putting "...". Plus, you would have to check/convert whitespace since strrpos will only look for what is given.

$maxlen = 150;
$file = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer malesuada eleifend orci, eget dignissim ligula porttitor cursus. Praesent in blandit enim. Maecenas vitae eleifend est. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Maecenas pulvinar gravida tempor.";
if ( strlen($file) > $maxlen ){
    $file = substr($file,0,strrpos($file,". ",$maxlen-strlen($file))+1);
}

if you want to use the same function you have, you can try this:

function shortenString($string, $your_desired_width) {
  $parts = preg_split('/([snr]+)/', $string, null, PREG_SPLIT_DELIM_CAPTURE);
  $parts_count = count($parts);

  $length = 0;
  $last_part = 0;
  $last_taken = 0;
  foreach($parts as $part){
    $length += strlen($part);
    if ( $length > $your_desired_width ){
        break;
    }
    ++$last_part;
    if ( $part[strlen($part)-1] == '.' ){
        $last_taken = $last_part;
    }
  }
  return implode(array_slice($parts, 0, $last_taken));
}
Wednesday, March 31, 2021
 
KingCrunch
answered 9 Months ago
72

No. Since bytearrays are also strings in PHP, a simple replacement of the 8-bit string functions with their mb_* counterparts will cause nothing but trouble. Functions like strlen() and substr() are probably more frequently used with bytes than actual text strings.

At the place I last worked, we managed to build a multilingual web-site (Arabic, Hindi, among other languages) in PHP without using the mbstring library at all. Text string manipulation actually doesn't happen that often. When it does, it would require far more care than just changing a function name. Most of the challenges, I've found, lie on the HTML side. Getting a page layout to work with a RTL language is the non-trivial part.

I don't know if you're just using Arabic as an example. The difficulty of internationalization can vary quite substantially depending on whether "international" means European languages only (plus Russian), or if it's inclusive of Middle-Eastern, South-Asian, and Far-East languages.

Wednesday, March 31, 2021
 
Smandoli
answered 9 Months ago
100

It could be:

function split2($string, $needle, $nth) {
    $max = strlen($string);
    $n = 0;
    for ($i=0; $i<$max; $i++) {
        if ($string[$i] == $needle) {
            $n++;
            if ($n >= $nth) {
                break;
            }
        }
    }
    $arr[] = substr($string, 0, $i);
    $arr[] = substr($string, $i+1, $max);

    return $arr;
}
Monday, June 14, 2021
 
Laimoncijus
answered 6 Months ago
66

You should simply store the return value in a variable:

$deliveryPrice = getDeliveryPrice(12);
echo $deliveryPrice; // will print 20

The $deliveryPrice variable above is a different variable than the $deliveryPrice inside the function. The latter is not visible outside the function because of variable scope.

Thursday, August 5, 2021
 
dimitarvp
answered 5 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share