"php - processing invalid xml" Answer’s

0

What you need is something that will use libxml's internal errors to locate invalid characters and escape them accordingly. Here's a mockup of how I'd write it. Take a look at the result of libxml_get_errors() for error info.

function load_invalid_xml($xml)
{
    $use_internal_errors = libxml_use_internal_errors(true);
    libxml_clear_errors(true);

    $sxe = simplexml_load_string($xml);

    if ($sxe)
    {
        return $sxe;
    }

    $fixed_xml = '';
    $last_pos  = 0;

    foreach (libxml_get_errors() as $error)
    {
        // $pos is the position of the faulty character,
        // you have to compute it yourself
        $pos = compute_position($error->line, $error->column);
        $fixed_xml .= substr($xml, $last_pos, $pos - $last_pos) . htmlspecialchars($xml[$pos]);
        $last_pos = $pos + 1;
    }
    $fixed_xml .= substr($xml, $last_pos);

    libxml_use_internal_errors($use_internal_errors);

    return simplexml_load_string($fixed_xml);
}
Wednesday, March 31, 2021
 
SpiderLinked
answered 11 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :