Asked  7 Months ago    Answers:  5   Viewed   31 times

I am using the PHP Simple HTML DOM Parser library to parse a page. Here's the code:

function getSemanticRelevantKeywords($keyword){
    $results = array();
    $html = file_get_html("http://www.semager.de/api/keyword.php?q=". urlencode($keyword) ."&lang=de&out=html&count=2&threshold=");
    foreach($html->find('span') as $e){
        $results[] = $e->plaintext;
    }
    return $results;
}

But I get this error when I output the results:

Fatal error: Call to a member function find() on a non-object in /var/www/vhosts/efamous.de/subdomains/sandbox/httpdocs/getNewTrusts.php on line 25

Line 25 is the foreach loop. The odd thing is that it outputs everything correctly (at least seemingly), but I still get that error and can't figure out why.

 Answers

77

This error usually means that $html isn't an object.

It's odd that you say it seems to work. What happens if you output $html? My guess is that the URL isn't available and that $html is null or false.

Edit: It looks like this may be a bug in the parser. Someone has submitted a bug report and added a check to their own code as a workaround.
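
A minimal sketch of such a check, assuming file_get_html() hands back a non-object (such as false or null) when the page can't be loaded. The helper name collectSpanText is made up for illustration and is not part of the library:

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the text of
// every <span>, or an empty array when $html is not a usable parser object.
function collectSpanText($html): array
{
    // Guard first: calling find() on false or null is what raises the fatal error.
    if (!is_object($html) || !method_exists($html, 'find')) {
        return [];
    }

    $results = [];
    foreach ($html->find('span') as $e) {
        $results[] = $e->plaintext;
    }
    return $results;
}

// Assumed usage with the library loaded:
//   $html = file_get_html($url);
//   $keywords = collectSpanText($html);
```

With a guard like this, a failed request gives you an empty result instead of a fatal error, and you can log the URL that failed.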

Wednesday, March 31, 2021
 
Kwadz
answered 7 Months ago
57

This is what PHP Tidy is for. For example:

<?php
ob_start();
?>
<html>a html document</html>
<?php
$html = ob_get_clean();

// Specify configuration
$config = array(
    'indent'       => true,
    'output-xhtml' => true,
    'wrap'         => 200,
);

// Tidy
$tidy = new tidy;
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();

// Output
echo $tidy;
?>

See HTML Tidy Configuration Options.

Saturday, May 29, 2021
 
e_i_pi
answered 5 Months ago
34

There you go:

$str = '<div>
    <h1>Hello</h1>
    <figure id="XXX">
        <div class="abc">ABC</div>
        <div class="qwe">QWE</div>
        <div class="zxc">ZXC</div>
    </figure>
</div>';

$html = str_get_html($str);

$str = $html->find('figure[id=XXX]', 0)->children(1)->plaintext;

echo $str;

find() returns an array of elements by default. You need a single element, so pass index 0 (the first match) as the second parameter to find(). children(1) then picks the second child, since indexes start at 0.
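
If the Simple HTML DOM library isn't available, roughly the same selection can be sketched with PHP's built-in DOM extension. This is a different technique from the one above (DOMDocument and DOMXPath instead of find()), and it assumes the dom extension is enabled:

```php
<?php
$str = '<div>
    <h1>Hello</h1>
    <figure id="XXX">
        <div class="abc">ABC</div>
        <div class="qwe">QWE</div>
        <div class="zxc">ZXC</div>
    </figure>
</div>';

// Collect libxml warnings (for example about HTML5 tags such as <figure>)
// instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($str);

$xpath = new DOMXPath($doc);
// XPath positions are 1-based, so *[2] is the second child element of <figure>.
$nodes = $xpath->query('//figure[@id="XXX"]/*[2]');

// Fall back to an empty string when nothing matched.
$text = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : '';
echo $text;
```

Note the off-by-one difference: children(1) in Simple HTML DOM and *[2] in XPath both refer to the second child.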

Saturday, May 29, 2021
 
Ultimater
answered 5 Months ago
76

I suggest using the right tool for this job: SimpleXML. Plus, it's built in :)

$xml = simplexml_load_file('http://www.bing.com/search?q=ipod&count=50&first=0&format=rss');
$parsed_results_array = array();
foreach($xml as $entry) {
    foreach($entry->item as $item) {
        // $parsed_results_array[] = json_decode(json_encode($item), true);
        $items['title'] = (string) $item->title;
        $items['description'] = (string) $item->description;
        $items['link'] = (string) $item->link;
        $parsed_results_array[] = $items;
    }
}

echo '<pre>';
print_r($parsed_results_array);

Should yield something like:

Array
(
    [0] => Array
        (
            [title] => Apple - iPod
            [description] => Learn about iPod, Apple TV, and more. Download iTunes for free and purchase iTunes Gift Cards. Check out the most popular TV shows, movies, and music.
            [link] => http://www.apple.com/ipod/
        )

    [1] => Array
        (
            [title] => iPod - Wikipedia, the free encyclopedia
            [description] => The iPod is a line of portable media players designed and marketed by Apple Inc. The first line was released on October 23, 2001, about 8½ months after ...
            [link] => http://en.wikipedia.org/wiki/IPod
        )
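
To try the same loop without hitting the network, here is a small sketch that parses an inline feed with simplexml_load_string(). The sample feed and its example.com links are made up for illustration, and the check for false is an addition that the code above doesn't have:

```php
<?php
// Made-up sample feed standing in for the remote RSS response.
$rss = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample results</title>
    <item>
      <title>First result</title>
      <description>First description</description>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second result</title>
      <description>Second description</description>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($rss);
$parsed = [];

// simplexml_load_string() returns false when the input is not well-formed XML.
if ($xml !== false) {
    foreach ($xml->channel->item as $item) {
        // Cast to string, otherwise the values stay SimpleXMLElement objects.
        $parsed[] = [
            'title'       => (string) $item->title,
            'description' => (string) $item->description,
            'link'        => (string) $item->link,
        ];
    }
}

print_r($parsed);
```

The same false check applies to simplexml_load_file() when the remote feed is unreachable or returns something that isn't XML.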
Thursday, August 19, 2021
 
fhonics
answered 2 Months ago
59

The table cells you are looking for are not part of that HTML document. First of all, you need to understand the basics of web scraping. I suggest you borrow some books on the topic and read through them.

Time for the library ;)


It seems to vary: sometimes the table cells are in the document, sometimes they are not. For the case where they are, the following example outputs them and also demonstrates how to iterate over a DOMNodeList:

$doc = new DOMDocument();

libxml_use_internal_errors(true);
$doc->loadHTMLFile('Catawba County Legacy Map Server.html');

$tds = $doc->getElementsByTagName('td');
foreach ($tds as $td) {
    printf(" * %s\n", $td->textContent);
}

Example output:

php "test.php" (in directory: /home/hakre/php/test)
 *
 * Real Estate Search - Legacy
 *
 *
 *
 *
 *
 *
 *
 *
 *
 * Map Layers
 * visible
 *
 *
 * Parcels
 *
 * Parcel Annotation
 *
 * Address Points
 *
 * Misc. Lines
 *
 * Structures
 *
 * Contour Lines
 *
 * Soils
 *
 * Townships
 *
 * Water Features
 *
 * Tiles
 *
 * Flood Zone
 *
 * Agricultural District
 *
 * Aerial 2009
 *
 * Aerial 2005
 *
 * Aerial 2002
 *
 * Cities
 *
 * Print the Map  
 * Print Map and Parcel Report  
 * Print the Parcel Report  
 * Assessment Report  
 * List all Owners  
 * Deed History Report
 * Parcel Information:
 * Owner Information:
 * Parcel ID: 372215634301
 * Name: PENLEY TREASURE B
 * Parcel Address: 3152 7TH AV SE 
 * Name2:  
 * City: CONOVER 28613
 * Address: 5508 SWINGING BRIDGE RD
 * LRK(REID): 57186
 * Address2:  
 * Deed Book/Page: 1906/0741 Deed Image
 * City: CONOVER
 * Subdivision: FOREST HGTS
 * State/Zip: NC 28613-7415
 * Lots: 1-4
 *
 * Block: C
 *
 * Last Sale:
 * School Information:
 * Plat Book/Page: 8/119 Plat Image
 * School District: COUNTY
 * Calculated Acreage: 0.31
 * Elementary School: WEBB A MURRAY
 * Tax Map: 167H  04006A
 * Middle School: ARNDT
 * State Road:  
 * High School: ST STEPHENS
 * Township: HICKORY
 * School Map
 *  
 *  
 * Tax/Value Information:  Tax Rates(pdf)
 * Zoning Information:
 * Municipal Tax District:  
 * Zoning District: HICKORY
 * Fire District: HICKORY RURAL
 * Zoning1: OI
 * Tax Account Number:  
 * Zoning2:  
 * Market Building(s) Value: $55,400
 * Zoning3:  
 * Market Land Value: $20,300
 * Zoning Overlay:  
 * Market Total Value: $75,700
 * Small Area:  
 * Year Built/Remodeled: 1959  
 * Split Zoning District 1/2: 0/0
 * Current Tax Bill
 * Zoning Agency Phone Numbers
 * Miscellaneous:
 *  
 * Voter Precinct:P35
 * Firm Panel Date: 9/5/2007
 * Building Permits for this parcel
 * Firm Panel #: 3710372200J
 * WaterShed:  
 * 2010 Census Tract: 011000
 * WaterShed Split:  
 * 2010 Census Block: 3035
 * Parcel Report Data Descriptions
 * Agricultural District:  
 * FAQ's
 * Help
 * GIS Home
Compilation finished successfully.
Tuesday, August 31, 2021
 
alko989
answered 2 Months ago