Asked  7 Months ago    Answers:  5   Viewed   30 times

I need to parse an HTML document and find all occurrences of the string asdf in it.

I currently have the HTML loaded into a string variable. I just want the character positions, so I can loop through them and return some data after each occurrence.

The strpos function only returns the first occurrence. How can I get all of them?

 Answers

14

Without using regex, something like this should work for returning the string positions:

$html = "dddasdfdddasdffff";
$needle = "asdf";
$lastPos = 0;
$positions = array();

while (($lastPos = strpos($html, $needle, $lastPos))!== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}

// Displays 3 and 10
foreach ($positions as $value) {
    echo $value ."<br />";
}
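
Note that the loop above advances by strlen($needle), so it skips overlapping matches. As a rough sketch, the same idea can be wrapped in a reusable function with an optional overlap mode. The name findAllPositions and the $overlapping flag are hypothetical (not built-ins), and the type declarations assume PHP 7.1 or later:

```php
<?php
// Hypothetical helper, not a PHP built-in: collects every offset of $needle in $haystack.
function findAllPositions(string $haystack, string $needle, bool $overlapping = false): array
{
    $positions = [];
    if ($needle === '') {
        return $positions; // guard: an empty needle would never advance the offset
    }
    // Step by one character to allow overlaps, or by the needle length to skip them.
    $step = $overlapping ? 1 : strlen($needle);
    $offset = 0;
    while (($offset = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $offset;
        $offset += $step;
    }
    return $positions;
}

$html = "dddasdfdddasdffff";
$needle = "asdf";

// Read the three characters that follow each occurrence.
foreach (findAllPositions($html, $needle) as $pos) {
    echo $pos . ": " . substr($html, $pos + strlen($needle), 3) . "\n";
}
// 3: ddd
// 10: fff
```

With the flag set, findAllPositions("aaaa", "aa", true) returns 0, 1 and 2 instead of 0 and 2.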
Wednesday, March 31, 2021
 
inieto
answered 7 Months ago
46

A regex would be simplest:

$input = 'foo_left.jpg';
if (preg_match('/_(left|right|center)/', $input, $matches)) {
    // $matches is only populated when the pattern matched
    $pos = $matches[0]; // "_left", "_right" or "_center"
} else {
    // no match
}


Update:

For a more defensive approach (if there might be multiple instances of "_left" and friends in the filename), consider tightening the regex.

This will match only if the l/r/c is followed by a dot:

preg_match('/(_(left|right|center))\./', $input, $matches);

This will match only if the l/r/c is followed by the last dot in the filename (which practically means that the base name ends with the l/r/c specification):

preg_match('/(_(left|right|center))\.[^\.]*$/', $input, $matches);

And so on.

If using these regexes, you will find the result in $matches[1] instead of $matches[0].
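
To illustrate how the two stricter patterns differ, here is a small sketch. The helper alignmentSuffix and the sample filenames are made up for this example, and the first pattern is written with an escaped dot so that it only matches a literal ".":

```php
<?php
// Hypothetical helper: returns capture group 1 ("_left" etc.), or null when nothing matches.
function alignmentSuffix(string $pattern, string $filename): ?string
{
    return preg_match($pattern, $filename, $m) === 1 ? $m[1] : null;
}

$beforeDot     = '/(_(left|right|center))\./';        // followed by any literal dot
$beforeLastDot = '/(_(left|right|center))\.[^\.]*$/'; // followed by the last dot

// "_left" is followed by "_", so only "_right" qualifies.
var_dump(alignmentSuffix($beforeDot, 'my_left_photo_right.jpg'));   // string(6) "_right"

// "_left" is followed by a dot, but not by the last dot in the name.
var_dump(alignmentSuffix($beforeDot, 'photo_left.backup.jpg'));     // string(5) "_left"
var_dump(alignmentSuffix($beforeLastDot, 'photo_left.backup.jpg')); // NULL
```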

Wednesday, March 31, 2021
 
braindamage
answered 7 Months ago
38

Try the following (chartDataString is the std::string being modified):

const std::string s = "*A";
const std::string t = "*An";

std::string::size_type n = 0;
while ( ( n = chartDataString.find( s, n ) ) != std::string::npos )
{
    chartDataString.replace( n, s.size(), t );
    n += t.size();
}
Thursday, July 29, 2021
 
williamcarswell
answered 3 Months ago
61

This can't work properly. Unicode contains many more characters than ANSI, so if you "convert" to ANSI, you will lose a lot of characters.

http://php.net/manual/en/function.htmlentities.php

You can use Unicode (UTF-8) charset with htmlentities:

string htmlentities ( string $string [, int $flags = ENT_COMPAT [, string $charset [, bool $double_encode = true ]]] )

htmlentities($myString, ENT_COMPAT, "UTF-8"); should work.
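
As a quick illustration of that call, the sketch below encodes a sample string and decodes it again. The sample text is made up, and the source file is assumed to be saved as UTF-8:

```php
<?php
// Sample input (hypothetical); this file is assumed to be saved as UTF-8.
$myString = "Café & <b>crème</b>";

// Accented letters and HTML special characters become named entities.
$encoded = htmlentities($myString, ENT_COMPAT, "UTF-8");
echo $encoded . "\n";
// Caf&eacute; &amp; &lt;b&gt;cr&egrave;me&lt;/b&gt;

// Decoding with the same charset restores the original string.
$decoded = html_entity_decode($encoded, ENT_COMPAT, "UTF-8");
var_dump($decoded === $myString); // bool(true)
```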

Thursday, August 5, 2021
 
CoderGuy123
answered 3 Months ago
100

One way to do this is to find the indices using a list comprehension:

currentWord = "hello"

guess = "l"

occurrences = currentWord.count(guess)

indices = [i for i, a in enumerate(currentWord) if a == guess]

print(indices)

output:

[2, 3]
Friday, August 20, 2021
 
Bruce
answered 2 Months ago