
PHP DOMNode objects have textContent and nodeValue properties, which both seem to be the innerHTML of the node.

nodeValue: The value of this node, depending on its type

textContent: This attribute returns the text content of this node and its descendants.

What is the difference between these two properties? When is it proper to use one instead of the other?

 Answers

83

I finally wanted to know the difference as well, so I dug into the source and found the answer; in most cases there will be no discernible difference, but there are a bunch of edge cases you should be aware of.

Both ->nodeValue and ->textContent are identical for the following classes (node types):

  • DOMAttr
  • DOMText
  • DOMElement
  • DOMComment
  • DOMCharacterData
  • DOMProcessingInstruction

The ->nodeValue property yields NULL for the following classes (node types):

  • DOMDocumentFragment
  • DOMDocument
  • DOMNotation
  • DOMEntity
  • DOMEntityReference

The ->textContent property is non-existent for the following classes:

  • DOMNameSpaceNode (not documented, but can be found with the //namespace::* XPath selector)

The ->nodeValue property is non-existent for the following classes:

  • DOMDocumentType

See also: dom_node_node_value_read() and dom_node_text_content_read()
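
To make the lists above concrete, here is a minimal sketch that compares the two properties on a DOMElement and on the DOMDocument itself. The XML string and the variable names are made up for illustration, and the behaviour shown assumes the classic DOMDocument classes described in this answer.

```php
<?php
// Illustrative document; the markup and variable names are assumptions for this sketch.
$doc = new DOMDocument();
$doc->loadXML('<root>Hello <b>world</b></root>');

$root = $doc->documentElement;

// DOMElement: both properties return the concatenated text of all descendants.
var_dump($root->nodeValue === $root->textContent); // bool(true)

// DOMDocument: nodeValue is NULL, while textContent can still be read.
var_dump($doc->nodeValue); // NULL
var_dump($doc->textContent);
```

So if the code may be handed a document or fragment rather than an element, textContent is the safer property to read.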

Wednesday, March 31, 2021
 
Silfverstrom
answered 7 Months ago
66

Hope this will help you out..

Try this code snippet:

<?php
ini_set('display_errors', 1);

// Attribute values use double quotes so they do not end the single-quoted PHP string.
$string = '<html><body><b>AMZN 466.00 ( 15743 ) ( <span class="red"> -1 </span>) 
MSFT 290.00 ( 37296 ) ( <span class="red"> -2 </span>)
TWTR 4,000.00 ( 20 ) ( <span class=""> 0 </span>)</b></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($string);

$xpath = new DOMXPath($dom);
$result = $xpath->query("//b/span"); // find every span that is a direct child of a b element

// Collect the nodes we want to remove first, then remove them in a second pass.
$nodesToRemove = array();
foreach ($result as $node) {
    $nodesToRemove[] = $node;
}
foreach ($nodesToRemove as $node) {
    $node->parentNode->removeChild($node); // detach the node from its parent
}

// Display the text content of the b element after the spans have been removed.
echo $dom->getElementsByTagName("b")->item(0)->textContent;

Output:

AMZN 466.00 ( 15743 ) ( ) 
MSFT 290.00 ( 37296 ) ( )
TWTR 4,000.00 ( 20 ) ( )
Wednesday, March 31, 2021
 
Slinky
answered 7 Months ago
46

No, there is no way of specifying a particular doctype to use, or to modify the requirements of the existing one.

Your best workable solution is going to be to suppress the warnings with libxml_use_internal_errors, which collects the errors internally instead of reporting them:

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML('...');
libxml_clear_errors();
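
If you would rather inspect the collected errors than discard them, something along these lines should work. This is only a sketch: the HTML string is a made-up example, and whether the parser reports anything for it depends on the libxml version in use.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings,
// remembering the previous setting so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
// Made-up markup for illustration; older parsers may complain about <section>.
$dom->loadHTML('<html><body><section><p>Hi</p></section></body></html>');

// libxml_get_errors() returns an array of LibXMLError objects (possibly empty).
$errors = libxml_get_errors();
foreach ($errors as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), "\n";
}

// Empty the error buffer and restore the previous error handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```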
Wednesday, June 2, 2021
 
ajreal
answered 5 Months ago
24

This is mentioned in a couple of comments on the DOMNode::removeChild documentation. The issue is apparently that the foreach iterator cannot cope with nodes being removed from the parent while it is looping through the list of children.

The recommended fix is to loop through the main node first and push the child nodes you want to delete onto a separate array, then loop through that "to-be-deleted" array and delete those children from their parent. Example:

$dom = new DOMDocument();
@$dom->loadHTML($description);
$pTag = $dom->getElementsByTagName('p');

$spotid_children = array();

foreach ($pTag as $value) {
    /** @var DOMElement $value */
    $id = $value->getAttribute('data-spotid');
    if ($id) {
        $spotid_children[] = $value; 
    }
}

foreach ($spotid_children as $spotid_child) {
    $spotid_child->parentNode->removeChild($spotid_child); 
}
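
A shorter variant of the same idea, as a sketch: iterator_to_array() copies the live node list into a plain PHP array, so removing nodes no longer disturbs the loop. The XML string below is invented for the example; the data-spotid attribute is the one used above.

```php
<?php
$dom = new DOMDocument();
// Made-up markup for illustration: three tagged paragraphs and one untagged.
$dom->loadXML('<div><p data-spotid="1">a</p><p data-spotid="2">b</p><p>c</p><p data-spotid="3">d</p></div>');

// Snapshot the live DOMNodeList into an ordinary array before modifying the tree.
$snapshot = iterator_to_array($dom->getElementsByTagName('p'));

foreach ($snapshot as $p) {
    if ($p->getAttribute('data-spotid')) {
        $p->parentNode->removeChild($p); // safe: we iterate the copy, not the live list
    }
}

$remaining = $dom->getElementsByTagName('p')->length;
echo $remaining, "\n"; // prints 1
```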
Monday, August 16, 2021
 
Litty
answered 2 Months ago
46

loadXML() takes an options argument, and one of the options is LIBXML_NOENT, which enables substituting entities with their replacement text, so by default loadXML() shouldn't do so. However, there appears to be a bug in libxml that causes it to happen all the time, according to this bug report.
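
To check how your own PHP and libxml build behaves, a small comparison like the one below may help. It is a sketch only: the document and the entity name e are made up, and it uses an internal entity because LIBXML_NOENT also allows external entities to be expanded, which is risky with untrusted input.

```php
<?php
// Made-up document with one internal entity, for illustration only.
$xml = '<!DOCTYPE r [<!ENTITY e "x">]><r>&e;</r>';

// Parse once with default options and once with entity substitution enabled.
$plain = new DOMDocument();
$plain->loadXML($xml);

$substituted = new DOMDocument();
$substituted->loadXML($xml, LIBXML_NOENT);

// Compare what kind of node each parse produced inside <r>.
echo get_class($plain->documentElement->firstChild), "\n";
echo get_class($substituted->documentElement->firstChild), "\n";

// With LIBXML_NOENT the entity has been replaced by its text.
echo $substituted->documentElement->textContent, "\n";
```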

Thursday, August 19, 2021
 
buymypies
answered 2 Months ago