Asked  7 Months ago    Answers:  5   Viewed   38 times

I am trying to display XML content in a table. Everything works, but one tag contains content that I don't want to display. I want only the image, not this text:

November 2012 calendar from 5.10 The Test

In the XML it looks like this:
 <content:encoded><![CDATA[<p>November 2012 calendar from 5.10 The Test</p>
    <p><a class="shutterset_" href='http://trance-gemini.com/wordpress/wp-content/gallery/calendars/laura-bertram-trance-gemini-145-1080.jpg' title='&lt;br&gt;November 2012 calendar from 5.10 The Test&lt;br&gt; &lt;a href=&quot;</a></p>]]>
</content:encoded> 

I want to display the image, but not the text "November 2012 calendar from 5.10 The Test". Here is my code:
<?php
// load the feed with SimpleXML (third argument true: the first argument is a path, not XML)
$item = new SimpleXMLElement('test1.xml', 0, true);

echo <<<EOF
<table border="1px">
        <tr cl>

        </tr>       
EOF;
foreach ($item->channel->item as $boo) // loop through the feed items
{
        echo <<<EOF

         <tr>
            <td rowspan="3">{$boo->children('content', true)->encoded}</td>
            <td>{$boo->title}</td>   
        </tr>

        <tr>
           <td>{$boo->description}</td>
        </tr>

        <tr>
           <td>{$boo->comments}</td>
        </tr>
EOF;
}
echo '</table>';
?>

 Answers

29

I answered this once before, but I can't find that answer any longer.

If you take a look at the string (simplified/beautified):

<content:encoded><![CDATA[
    <p>Lorem Ipsom</p>
    <p>
      <a href='laura-bertram-trance-gemini-145-1080.jpg' 
         title='&lt;br&gt;November 2012 calendar from 5.10 The Test&lt;br&gt; &lt;a href=&quot;</a>
    </p>]]>
</content:encoded> 

You can see that you have HTML encoded inside the node-value of the <content:encoded> element. So first you need to obtain the HTML value, which you already do:

$html = $boo->children('content', true)->encoded;

Then you need to parse the HTML inside $html. Which libraries can parse HTML in PHP is outlined in:

  • How to parse and process HTML/XML with PHP?

If you decide to use the more or less recommended DOMDocument for the job, you only need to get the attribute value of a certain element:

  • PHP DOMDocument getting Attribute of Tag

Or, for its sister library SimpleXML, which you already use (so this is the better fit; see the next section as well):

  • How to get an attribute with SimpleXML?

In the context of your question, here is a tip:

You're using SimpleXML. DOMDocument is a sister library, meaning you can switch between the two, so you don't need to learn a whole new library.

For example, you can use only the HTML parsing feature of DOMDocument and then import the result into SimpleXML. This is useful because SimpleXML does not support HTML parsing.

That works via simplexml_import_dom().

A simplified step-by-step example:

// get the HTML string out of the feed:
$htmlString = $boo->children('content', true)->encoded;

// create DOMDocument for HTML parsing:
$htmlParser = new DOMDocument();

// load the HTML:
$htmlParser->loadHTML($htmlString);

// import it into simplexml:
$html = simplexml_import_dom($htmlParser);

Now you can use $html as a new SimpleXMLElement that represents the HTML document. Because your HTML chunk has no <body> tag, the parser places its elements inside a <body> element, as the HTML specification prescribes. This allows you, for example, to access the href attribute of the first <a> inside the second <p> element in your example:

// access the element you're looking for:
$href = $html->body->p[1]->a['href'];

Here is the complete example from above (Online Demo):

// get the HTML string out of the feed:
$htmlString = $boo->children('content', true)->encoded;

// create DOMDocument for HTML parsing:
$htmlParser = new DOMDocument();

// your HTML gives parser warnings, keep them internal:
libxml_use_internal_errors(true);

// load the HTML:
$htmlParser->loadHTML($htmlString);

// import it into simplexml:
$html = simplexml_import_dom($htmlParser);

// access the element you're looking for:
$href = $html->body->p[1]->a['href'];

// output it
echo $href, "\n";

And what it outputs:

laura-bertram-trance-gemini-145-1080.jpg
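
If you only need the link target and want to turn it into an image, a shorter route is to query the parsed HTML with DOMXPath instead of importing it into SimpleXML. The following is a minimal sketch under assumptions, not the approach shown above: the function name firstLinkHref() is hypothetical, and the sample string is a simplified, well-formed stand-in for the decoded <content:encoded> value.

```php
<?php
// Simplified stand-in for the HTML taken from <content:encoded> (assumption).
$htmlString = '<p>November 2012 calendar from 5.10 The Test</p>'
    . '<p><a class="shutterset_" href="laura-bertram-trance-gemini-145-1080.jpg">link</a></p>';

/**
 * Return the href of the first <a> in an HTML fragment, or null if there is none.
 * (Hypothetical helper, named here for illustration only.)
 */
function firstLinkHref(string $htmlString): ?string
{
    if ($htmlString === '') {
        return null; // loadHTML() rejects empty input
    }

    // Keep parser warnings internal, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($htmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select the href attribute of every <a>, in document order.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//a/@href');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->nodeValue;
}

$href = firstLinkHref($htmlString);
if ($href !== null) {
    // Print only the image; the paragraph text is never echoed.
    echo '<img src="', htmlspecialchars($href, ENT_QUOTES), '" alt="">', "\n";
}
// prints: <img src="laura-bertram-trance-gemini-145-1080.jpg" alt="">
```

Inside the table loop from the question, the same call could replace the raw {$boo->children('content', true)->encoded} output, with the element cast to a string before it is passed in.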
Wednesday, March 31, 2021
 
ManojGeek
answered 7 Months ago
81

You have only one <Pages> element, so the body of that foreach runs only once. Loop over the <Url> elements instead.

$xml = "<?xml version='1.0'?>
<AdXML>

<Response>
    <Campaign>
        <Overview>
            <Name>strip</Name>
                <Description>category</Description>
                <Status>L</Status>
        </Overview>
        <Pages>
            <Url>page01</Url>
            <Url>page02</Url>
            <Url>page03</Url>
        </Pages>
    </Campaign>
</Response>
</AdXML>";
$xmlparsed = new SimpleXMLElement($xml);
foreach ($xmlparsed->Response->Campaign->Pages->Url as $url) {
    echo $url, PHP_EOL;
}

Output:

page01
page02
page03
Wednesday, March 31, 2021
 
PHPWDev
answered 7 Months ago
59

To obtain these media elements, you first need to find their parent element. How to access such non-namespaced elements is covered in depth in the basic SimpleXML usage examples, so I'll spare you that highly redundant code here.

After storing the parent in a variable (let's call it $item this time), it works as described in the duplicate Q&A material already linked, shown here for your specific example XML:

$media = $item->children('media', true);
                            |
                          __|__
 <rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" 
                    xmlns:flow="http://www.flownetworks.com/schemas/media/0.1.0">

When you pass the namespace prefix, it corresponds to the highlighted prefix part. This requires passing true as the second parameter.

Alternatively, you can pass the namespace URI, in which case you can omit the second parameter (it defaults to false):

$media = $item->children('http://search.yahoo.com/mrss/');
                                        |
                                 _______|_____________________
 <rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" 
                    xmlns:flow="http://www.flownetworks.com/schemas/media/0.1.0">

Regardless of which variant you prefer, telling SimpleXML explicitly that you want the children in the media XML namespace lets you access the parts inside that media group in the way you already know:

$group = $media->group;
echo $group->category, "\n"; # US

I hope this is helpful and shows explicitly how it works for namespaced elements. You need to use the SimpleXMLElement::children() method and specify the namespace whose child elements you want.
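
Both variants can be tried side by side on a small feed. This is a minimal sketch: the feed below is a made-up sample, and only the media namespace URI and the group/category element names are taken from the answer.

```php
<?php
// Made-up sample feed (assumption); only the namespace URI comes from the answer.
$xml = <<<'XML'
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example</title>
      <media:group>
        <media:category>US</media:category>
      </media:group>
    </item>
  </channel>
</rss>
XML;

$rss  = new SimpleXMLElement($xml);
$item = $rss->channel->item[0]; // non-namespaced access, as usual

// Variant 1: namespace prefix, so the second argument must be true.
$byPrefix = $item->children('media', true);

// Variant 2: namespace URI, so the second argument can be omitted.
$byUri = $item->children('http://search.yahoo.com/mrss/');

echo $byPrefix->group->category, "\n";
echo $byUri->group->category, "\n";
```

The URI variant keeps working if a feed declares the same namespace under a different prefix, which is one reason to prefer it when you don't control the feed.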

Saturday, May 29, 2021
 
footy
answered 5 Months ago
27
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(f);
Element root = doc.getDocumentElement();
NodeList nodeList = doc.getElementsByTagName("player");
for (int i = 0; i < nodeList.getLength(); i++) {
  Node node = nodeList.item(i);
  // do your stuff
}

However, I'd suggest using XPath instead:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(<uri_as_string>);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("/GameWorld/player");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
Tuesday, August 31, 2021
 
David Miani
answered 2 Months ago
18

As @Tomalak mentioned, outputting CDATA is not supported.

You can probably write ![CDATA[ as the XML tag name and afterwards replace the opening and closing tags in the resulting XML. Would this work for you? It is probably not the cheapest approach, but it is the easiest. You can of course replace the MarshalIndent call with a plain Marshal call in the example below.

http://play.golang.org/p/2-u7H85-wn

package main

import (
    "bytes"
    "encoding/xml"
    "fmt"
)

type XMLProduct struct {
    XMLName          xml.Name `xml:"row"`
    ProductId        string   `xml:"product_id"`
    ProductName      string   `xml:"![CDATA["`
    OriginalPrice    string   `xml:"original_price"`
    BargainPrice     string   `xml:"bargain_price"`
    TotalReviewCount int      `xml:"total_review_count"`
    AverageScore     float64  `xml:"average_score"`
}

func main() {
    prod := XMLProduct{
        ProductId:        "ProductId",
        ProductName:      "ProductName",
        OriginalPrice:    "OriginalPrice",
        BargainPrice:     "BargainPrice",
        TotalReviewCount: 20,
        AverageScore:     2.1}

    out, err := xml.MarshalIndent(prod, " ", "  ")
    if err != nil {
        fmt.Printf("error: %v", err)
        return
    }

    out = bytes.Replace(out, []byte("<![CDATA[>"), []byte("<![CDATA["), -1)
    out = bytes.Replace(out, []byte("</![CDATA[>"), []byte("]]>"), -1)
    fmt.Println(string(out))
}
Friday, October 8, 2021
 
Puneet
answered 3 Weeks ago