Asked  7 Months ago    Answers:  5   Viewed   33 times

Here is what I am trying to achieve : retrieve all products on a page and put them into an array. Here is the code I am using :

$page2 = curl_exec($ch);
$doc = new DOMDocument();
@$doc->loadHTML($page2);
$nodes = $doc->getElementsByTagName('title');
$noders = $doc->getElementsByClassName('productImage');
$title = $nodes->item(0)->nodeValue;
$product = $noders->item(0)->imageObject.src;

It works for the $title but not for the product. For info, in the HTML code the img tag looks like this :

<img alt="" class="productImage" data-altimages="" src="xxxx">

I have been looking at this (PHP DOMDocument how to get element?) but I still don't understand how to make it work.

PS : I get this error :

Call to undefined method DOMDocument::getElementsByclassName()

 Answers

27

I finally used the following solution :

    $classname="blockProduct";
    $finder = new DomXPath($doc);
    $spaner = $finder->query("//*[contains(@class, '$classname')]");
Wednesday, March 31, 2021
 
kmunky
answered 7 Months ago
88

You should only call curl_close() when you know you're done with that particular handle, or if switching from its current state to a new one (ie: changing a ton of options via curl_setopt() would be faster by going from a clean new handle than your current "dirty" one.

The cookiejar/file options are only strictly necessary for maintaining cookies between seperate curl handles/invokations. Each one's independent of the others, so the cookie files are the only way to share between them.

Wednesday, March 31, 2021
 
huhushow
answered 7 Months ago
23

You need to add the Curl libraries to the command line PHP.ini.

You can probably just copy the file C:wampbinapacheApache2.2.xbinphp.ini to c:wampbinphpphp5.3.10php.ini (adjust for the actual directories on your system).

Wednesday, March 31, 2021
 
Fanda
answered 7 Months ago
62

The problem is that the NodeList returned to you is "live" - it changes as you alter the class name. That is, when you change the class on the first element, the list is immediately one element shorter than it was.

Try this:

  while (elementArray.length) {
    elementArray[0].className = "exampleClassComplete";
  }

(There's no need to use setAttribute() to set the "class" value - just update the "className" property. Using setAttribute() in old versions of IE wouldn't work anyway.)

Alternatively, convert your NodeList to a plain array, and then use your indexed iteration:

  elementArray = [].slice.call(elementArray, 0);
  for (var i = 0; i < elementArray.length; ++i)
    elementArray[i].className = "whatever";

As pointed out in a comment, this has the advantage of not relying on the semantics of NodeList objects. (Note also, thanks again to a comment, that if you need this to work in older versions of Internet Explorer, you'd have to write an explicit loop to copy element references from the NodeList to an array.)

Monday, June 14, 2021
 
Hat
answered 4 Months ago
Hat
85

It is not good to use custom attributes in the HTML. If any, you should use HTML5's data attributes.

Nevertheless you can write your own function that traverses the tree, but that will be quite slow compared to getElementById because you cannot make use of any index:

function getElementByAttribute(attr, value, root) {
    root = root || document.body;
    if(root.hasAttribute(attr) && root.getAttribute(attr) == value) {
        return root;
    }
    var children = root.children, 
        element;
    for(var i = children.length; i--; ) {
        element = getElementByAttribute(attr, value, children[i]);
        if(element) {
            return element;
        }
    }
    return null;
}

In the worst case, this will traverse the whole tree. Think about how to change your concept so that you can make use browser functions as much as possible.

In newer browsers you use of the querySelector method, where it would just be:

var element = document.querySelector('[tokenid="14"]');

This will be much faster too.


Update: Please note @Andy E's comment below. It might be that you run into problems with IE (as always ;)). If you do a lot of element retrieval of this kind, you really should consider using a JavaScript library such as jQuery, as the others mentioned. It hides all these browser differences.

Thursday, June 17, 2021
 
Xavio
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :