Asked  7 Months ago    Answers:  5   Viewed   36 times

In RegEx, I want to find the tag and everything between two XML tags, like the following:

<primaryAddress>
    <addressLine>280 Flinders Mall</addressLine>
    <geoCodeGranularity>PROPERTY</geoCodeGranularity>
    <latitude>-19.261365</latitude>
    <longitude>146.815585</longitude>
    <postcode>4810</postcode>
    <state>QLD</state>
    <suburb>Townsville</suburb>
    <type>PHYSICAL</type>
</primaryAddress>

I want to find the tag and everything between primaryAddress, and erase that.

Everything between the primaryAddress tag is a variable, but I want to remove the entire tag and sub-tags whenever I get primaryAddress.

Anyone have any idea how to do that?

 Answers

88

It is not a good idea to use regex for HTML/XML parsing...
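For comparison, here is a minimal sketch of the parser-based route, written in Python with the standard library's xml.etree.ElementTree. The question does not name a language, so Python is an assumption, and the `<customer>` wrapper and the `remove_elements` helper are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; only <primaryAddress> comes from the question.
SAMPLE = """<customer>
    <name>Example</name>
    <primaryAddress>
        <addressLine>280 Flinders Mall</addressLine>
        <postcode>4810</postcode>
    </primaryAddress>
</customer>"""


def remove_elements(xml_text: str, tag: str) -> str:
    """Return xml_text with every element named `tag`, and its children, removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so the tree is not mutated while iterating it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


cleaned = remove_elements(SAMPLE, "primaryAddress")
print(cleaned)
```

This only removes elements that have a parent, so it would not apply if primaryAddress were the document root.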

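However, if you want to do it anyway, search for the regex pattern

<primaryAddress>[\s\S]*?</primaryAddress>

and replace it with an empty string. The [\s\S] class matches any character, including newlines, and the lazy *? stops at the first closing tag.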
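As a rough illustration, the search-and-replace could look like this in Python, with the character class written out as `[\s\S]` (whitespace or non-whitespace, i.e. any character). Python is an assumption here, since the question does not say which regex engine is in use, and `strip_primary_address` is a hypothetical helper name.

```python
import re

# [\s\S] matches any character including newlines; *? is lazy, so each match
# ends at the first closing tag rather than the last one in the input.
PATTERN = re.compile(r"<primaryAddress>[\s\S]*?</primaryAddress>")


def strip_primary_address(text: str) -> str:
    """Delete every <primaryAddress>...</primaryAddress> block from text."""
    return PATTERN.sub("", text)


sample = "<a>1</a><primaryAddress>\n  <state>QLD</state>\n</primaryAddress><b>2</b>"
result = strip_primary_address(sample)
print(result)  # <a>1</a><b>2</b>
```

The pattern assumes the opening tag has no attributes and that primaryAddress elements are never nested inside each other.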

Wednesday, March 31, 2021
 
jerrygarciuh
answered 7 Months ago
60

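Use the following working code:

// Match every <td>...</td> pair; "i" = case-insensitive, "s" = dot also matches newlines.
$mpmatch = "!<td>(.*?)</td>!is";
$str = "<td>sdfdfdfdsfds</td><td>333333333</td>";
preg_match_all($mpmatch, $str, $result);
// $result[0] holds the full matches, $result[1] the captured cell contents.
foreach ($result as $val) {
    echo "<pre>";
    print_r($val);
    echo "</pre>";
}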

Hope this will help you.

Wednesday, March 31, 2021
 
dirigibleplum
answered 7 Months ago
71

I suggest you create your own modified Java library. Simply copy the java.util.regex source into your own package.

The Sun JDK 1.6 Pattern.java class offers these default flags:

static final int GREEDY     = 0;

static final int LAZY       = 1;

static final int POSSESSIVE = 2;

You'll notice that these flags are only used a couple of times, so the code would be trivial to modify. Take the following example:

    case '*':
        ch = next();
        if (ch == '?') {
            next();
            return new Curly(prev, 0, MAX_REPS, LAZY);
        } else if (ch == '+') {
            next();
            return new Curly(prev, 0, MAX_REPS, POSSESSIVE);
        }
        return new Curly(prev, 0, MAX_REPS, GREEDY);

Simply change the last line to use the LAZY flag instead of the GREEDY flag. Since you want a regex library that behaves like the PHP one, this might be the best way to go.
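To see what swapping the default from greedy to lazy would change, here is a small sketch of the two behaviours. It uses Python's re module rather than the patched Java library described above (an assumption made for brevity); the quantifier semantics shown are the same in both engines.

```python
import re

text = "<td>first</td><td>second</td>"

# Greedy: .* grabs as much as possible, so one match spans both cells.
greedy = re.findall(r"<td>(.*)</td>", text)

# Lazy: .*? grabs as little as possible, so each cell is matched separately.
lazy = re.findall(r"<td>(.*?)</td>", text)

print(greedy)  # ['first</td><td>second']
print(lazy)    # ['first', 'second']
```

With the last line of the Java snippet changed, a bare `*` would behave like the lazy case here.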

Friday, May 28, 2021
 
TuomasR
answered 5 Months ago
52

The standard disclaimer applies: Parsing HTML with regular expressions is not ideal. Success depends on the well-formedness of the input on a character-by-character level. If you cannot guarantee this, the regex will fail to do the Right Thing at some point.

Having said that:

<a\b[^>]*>(.*?)</a>   // match group one will contain the link text
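
A minimal sketch of how that pattern might be applied, assuming Python's re module (the answer does not name a language) and with the word boundary written out as `\b` so that tags such as `<abbr>` are not mistaken for links. The sample HTML is made up for illustration.

```python
import re

# \b keeps <a ...> from also matching <abbr>; DOTALL lets link text span lines.
LINK = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)

html = (
    '<p><a href="/one">First</a> and <abbr>x</abbr> '
    '<a class="x" href="/two">Second\nlink</a></p>'
)

texts = LINK.findall(html)
print(texts)  # ['First', 'Second\nlink']
```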
Saturday, May 29, 2021
 
lewiguez
answered 5 Months ago
46

You said you were sending form encoded data.

request.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8');

Then you sent:

temp = JSON.stringify(data);

JSON is application/json, not application/x-www-form-urlencoded (and isn't natively supported by PHP anyway).

Either encode your data as application/x-www-form-urlencoded or correct your content-type and parse it manually in PHP.
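
To make the mismatch concrete, here is a short sketch of what the two body formats look like for the same data. It uses Python's standard library purely for illustration (the original code is JavaScript talking to PHP), and the field names are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode

data = {"name": "Ada", "city": "Townsville QLD"}  # hypothetical form fields

# Body that matches Content-Type: application/x-www-form-urlencoded
form_body = urlencode(data)

# Body that matches Content-Type: application/json
json_body = json.dumps(data)

print(form_body)  # name=Ada&city=Townsville+QLD
print(json_body)  # {"name": "Ada", "city": "Townsville QLD"}

# A form decoder recovers the fields from the form-encoded body.
fields = parse_qs(form_body)
```

A server that expects form encoding looks for key=value pairs, so it will not find the individual fields in a JSON body sent under the form content type.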

Saturday, May 29, 2021
 
tika
answered 5 Months ago