Asked  7 Months ago    Answers:  5   Viewed   48 times

How can I create a preg_match_all regex pattern in PHP to extract text from this code?

<td class="class2">&nbsp;</td>
<td class="class2" align="right"><span class="DarkText">I WANT THIS TEXT</span></td>

I want to get the text inside the span with class "DarkText". Thanks!

 Answers

65

You can use:

preg_match_all("!<span[^>]+>(.*?)</span>!", $str, $matches);

Your text will then be in the first capture group, i.e. in $matches[1] (as seen on rubular).

With that out of the way, note that regex shouldn't be used to parse HTML. You will be better off using an XML parser, unless it's something really, really simple.
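
As a rough illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension. The wrapping table markup and the variable names are assumptions added for the example; only the two td cells come from the question.

```php
<?php
// Sample cells from the question, wrapped in a table/row (an assumption)
// so the HTML parser keeps the <td> elements in place.
$html = '<table><tr>'
      . '<td class="class2">&nbsp;</td>'
      . '<td class="class2" align="right"><span class="DarkText">I WANT THIS TEXT</span></td>'
      . '</tr></table>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <span> whose class attribute is exactly "DarkText".
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//span[@class="DarkText"]') as $span) {
    $texts[] = trim($span->textContent);
}

print_r($texts);
```

Unlike the regex, this keeps working if the span gains extra attributes or contains nested tags, at the cost of loading the whole document.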

Saturday, May 29, 2021
 
freeMagee
answered 7 Months ago
72

If you do want to match umlauts, then add the regex /u modifier, or use \pL in place of \w. That will allow the regex to match letters outside of the ASCII range.
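
A small sketch of the \pL approach, assuming the script and the input string are UTF-8 encoded (the sample sentence is made up for illustration):

```php
<?php
// Hypothetical sample text containing umlauts (UTF-8 encoded).
$text = 'Müller fährt nach Köln';

// \pL matches any Unicode letter; the /u modifier tells PCRE to treat
// both the pattern and the subject as UTF-8 instead of raw bytes.
preg_match_all('/\pL+/u', $text, $unicode);

print_r($unicode[0]);
```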

Reference: http://www.regular-expressions.info/unicode.html
and http://php.net/manual/en/regexp.reference.unicode.php

Saturday, May 29, 2021
 
keyBeatz
answered 7 Months ago
60

If the browser doesn't have the file in cache, "skipping" causes the browser to send a new request, with a "range" header. Your PHP file needs to handle this header.

This means:

  1. get and parse the range header
  2. respond to the request with status code 206 and corresponding range headers
  3. output only necessary bytes
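
The three steps above could be sketched roughly as follows. The function names parseRangeHeader and serveFile are hypothetical, and the sketch makes simplifying assumptions: it handles only a single byte range, omits the Content-Type header, and falls back to sending the whole file when the range is missing or unusable (a stricter server would answer an unsatisfiable range with 416).

```php
<?php
/**
 * Step 1: parse a single-range header such as "bytes=500-999",
 * "bytes=500-" or "bytes=-200". Returns [start, end] (inclusive),
 * or null if the header is absent, malformed, or out of bounds.
 */
function parseRangeHeader(?string $header, int $size): ?array
{
    if ($header === null || !preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)) {
        return null;
    }
    [, $first, $last] = $m;
    if ($first === '' && $last === '') {
        return null;
    }
    if ($first === '') {
        // Suffix form: the last N bytes of the file.
        $length = (int) $last;
        if ($length === 0) {
            return null;
        }
        $start = max(0, $size - $length);
        $end = $size - 1;
    } else {
        $start = (int) $first;
        $end = ($last === '') ? $size - 1 : min((int) $last, $size - 1);
    }
    return ($start > $end || $start >= $size) ? null : [$start, $end];
}

function serveFile(string $path): void
{
    $size = filesize($path);
    $range = parseRangeHeader($_SERVER['HTTP_RANGE'] ?? null, $size);
    [$start, $end] = $range ?? [0, $size - 1];

    // Step 2: answer a valid range request with 206 and Content-Range.
    if ($range !== null) {
        http_response_code(206);
        header("Content-Range: bytes $start-$end/$size");
    }
    header('Accept-Ranges: bytes');
    header('Content-Length: ' . ($end - $start + 1));

    // Step 3: output only the requested bytes, in small chunks.
    $fh = fopen($path, 'rb');
    fseek($fh, $start);
    $remaining = $end - $start + 1;
    while ($remaining > 0 && !feof($fh)) {
        $chunk = fread($fh, min(8192, $remaining));
        if ($chunk === false || $chunk === '') {
            break;
        }
        echo $chunk;
        $remaining -= strlen($chunk);
    }
    fclose($fh);
}
```

Keeping the header parsing in its own function makes it possible to check the range arithmetic without sending any output.
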
Saturday, May 29, 2021
 
bumperbox
answered 7 Months ago
87

This regex will do the trick:

(\d+)d (\d+)h (\d+)m (\d+)s

Each value (day, hour, minute, second) will be captured in a group.

About your regex: I don't know what you mean by "isn't correct", but I guess it's probably failing because your regex is greedy instead of lazy (more info). Try using lazy quantifiers, or more specific matches (\d instead of ., for example).

EDIT:

I need them to be separate variables

After matching, they will be put in different locations in the resulting array. Just assign them to variables. Check out an example here.

If you have trouble understanding the resulting array structure, you may want to use the PREG_SET_ORDER flag when calling preg_match_all (more information here).
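
As a hedged sketch of how PREG_SET_ORDER groups the results and how the captures can be assigned to separate variables: the input string below is an assumption about the format being matched, since the original question is not shown here.

```php
<?php
// Hypothetical input containing two durations in "Nd Nh Nm Ns" form.
$input = '2d 5h 30m 12s and 0d 1h 2m 3s';

// With PREG_SET_ORDER, $matches holds one array per match instead of
// one array per capture group.
preg_match_all('/(\d+)d (\d+)h (\d+)m (\d+)s/', $input, $matches, PREG_SET_ORDER);

$durations = [];
foreach ($matches as $match) {
    // $match[0] is the full match; $match[1] to $match[4] are the groups.
    [, $days, $hours, $minutes, $seconds] = $match;
    $durations[] = [
        'days'    => (int) $days,
        'hours'   => (int) $hours,
        'minutes' => (int) $minutes,
        'seconds' => (int) $seconds,
    ];
}

print_r($durations);
```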

Saturday, May 29, 2021
 
relyt
answered 7 Months ago
50

OK, after a lot of research I've come up with the final option, which seems to be just what I needed.

I've used the HTMLPurifier and filtered my content using the following:

require_once('HTMLPurifier/HTMLPurifier.auto.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$objPurifier = new HTMLPurifier($config);
return $objPurifier->purify($string);

I hope someone else will find it useful.

Monday, November 8, 2021
 
Jeroen
answered 3 Weeks ago