Asked  7 Months ago    Answers:  5   Viewed   26 times

I need to be able to parse XML using JavaScript. The XML will be in a variable. I would prefer not to use jQuery or other frameworks.

I have looked at this, XML > jQuery reading.

 Answers

22

I'm guessing from your last question, asked 20 minutes before this one, that you are trying to parse (read and convert) the XML found through using GeoNames' FindNearestAddress.

If your XML is in a string variable called txt and looks like this:

<address>
  <street>Roble Ave</street>
  <mtfcc>S1400</mtfcc>
  <streetNumber>649</streetNumber>
  <lat>37.45127</lat>
  <lng>-122.18032</lng>
  <distance>0.04</distance>
  <postalcode>94025</postalcode>
  <placename>Menlo Park</placename>
  <adminCode2>081</adminCode2>
  <adminName2>San Mateo</adminName2>
  <adminCode1>CA</adminCode1>
  <adminName1>California</adminName1>
  <countryCode>US</countryCode>
</address>

Then you can parse the XML with the JavaScript DOM like this:

// Declare the variables so they are not created as implicit globals
var parser, xmlDoc;
if (window.DOMParser)
{
    parser = new DOMParser();
    xmlDoc = parser.parseFromString(txt, "text/xml");
}
else // Internet Explorer
{
    xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.loadXML(txt);
}
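
One thing the parsing step does not report on its own is malformed input: DOMParser generally does not throw on bad XML but returns a document that contains a parsererror element. Here is a minimal sketch of a check, assuming a hypothetical helper name hasParseError (it is not part of the original answer, and it does not cover the ActiveXObject branch):

```javascript
// Hypothetical helper: reports whether a document returned by DOMParser
// contains a <parsererror> element, which signals malformed XML.
function hasParseError(xmlDoc) {
    return xmlDoc.getElementsByTagName("parsererror").length > 0;
}

// Browser-only usage; skipped where DOMParser is unavailable (e.g. Node.js).
if (typeof DOMParser !== "undefined") {
    var badDoc = new DOMParser().parseFromString("<address><street>", "text/xml");
    console.log(hasParseError(badDoc));
}
```

Calling it right after parseFromString lets you stop before reading values from a broken document.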

And get specific values from the nodes like this:

//Gets house address number
xmlDoc.getElementsByTagName("streetNumber")[0].childNodes[0].nodeValue;

//Gets Street name
xmlDoc.getElementsByTagName("street")[0].childNodes[0].nodeValue;

//Gets Postal Code
xmlDoc.getElementsByTagName("postalcode")[0].childNodes[0].nodeValue;
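
Note that these direct lookups throw a TypeError when an element is missing or empty, because [0] or childNodes[0] is then undefined. A small sketch of a guarded lookup, assuming a hypothetical helper named getNodeText:

```javascript
// Hypothetical helper: returns the text of the first element with the given
// tag name, or null when the element is missing or has no child nodes.
function getNodeText(xmlDoc, tagName) {
    var element = xmlDoc.getElementsByTagName(tagName)[0];
    if (!element || !element.childNodes[0]) {
        return null;
    }
    return element.childNodes[0].nodeValue;
}
```

With the sample XML, getNodeText(xmlDoc, "street") gives the same value as the direct lookup, while a tag that is absent gives null instead of throwing.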


Feb. 2019 edit:

In response to @gaugeinvariante's concerns about XML with namespace prefixes: should you need to parse XML with namespace prefixes, everything should work almost identically.

NOTE: this will only work in browsers that support XML namespace prefixes, such as Microsoft Edge.

// XML with namespace prefixes 's', 'sn', and 'p' in a variable called txt
var txt = `
<address xmlns:p='example.com/postal' xmlns:s='example.com/street' xmlns:sn='example.com/streetNum'>
  <s:street>Roble Ave</s:street>
  <sn:streetNumber>649</sn:streetNumber>
  <p:postalcode>94025</p:postalcode>
</address>`;

//Everything else the same
var parser, xmlDoc;
if (window.DOMParser)
{
    parser = new DOMParser();
    xmlDoc = parser.parseFromString(txt, "text/xml");
}
else // Internet Explorer
{
    xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.loadXML(txt);
}

//The prefix should not be included when you request an element by tag name
//Gets "streetNumber" (note there is no "sn" prefix)
console.log(xmlDoc.getElementsByTagName("streetNumber")[0].childNodes[0].nodeValue);

//Gets Street name
console.log(xmlDoc.getElementsByTagName("street")[0].childNodes[0].nodeValue);

//Gets Postal Code
console.log(xmlDoc.getElementsByTagName("postalcode")[0].childNodes[0].nodeValue);
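
In browsers where getElementsByTagName matches on the full qualified name (prefix included), the unprefixed lookups may find nothing. A sketch of a namespace-aware alternative, assuming the standard DOM method getElementsByTagNameNS with the "*" wildcard for the namespace, wrapped in a hypothetical helper named getTextByLocalName:

```javascript
// Hypothetical helper: finds the first element with the given local name in
// any namespace and returns its text, or null when nothing matches.
function getTextByLocalName(xmlDoc, localName) {
    var nodes = xmlDoc.getElementsByTagNameNS("*", localName);
    return nodes.length > 0 ? nodes[0].textContent : null;
}

// Browser-only usage; skipped where DOMParser is unavailable (e.g. Node.js).
if (typeof DOMParser !== "undefined") {
    var nsDoc = new DOMParser().parseFromString(
        "<address xmlns:sn='example.com/streetNum'>" +
        "<sn:streetNumber>649</sn:streetNumber></address>", "text/xml");
    console.log(getTextByLocalName(nsDoc, "streetNumber"));
}
```

Passing a specific namespace URI instead of "*" narrows the match to elements from that namespace only.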
Tuesday, June 1, 2021
 
SubniC
answered 7 Months ago
56

You can use the DOM or XMLReader extensions

  • DOMDocument::schemaValidate — Validates a document based on a schema
  • XMLReader::setSchema — Validate document against XSD

to validate documents against a schema.

Wednesday, March 31, 2021
 
danjah
answered 9 Months ago
69

Your current method fails, because HTML properties are not defined for the given XML document. If you supply the text/html MIME-type, the method should work.

var string = '<!DOCTYPE html><html><head></head><body>content</body></html>';
var doc = new DOMParser().parseFromString(string, 'text/html');
doc.body.innerHTML; // or doc.querySelector('body').innerHTML
// ^ Returns "content"

The code below enables the text/html MIME-type for browsers which do not natively support it yet. It is taken from the Mozilla Developer Network:

/* 
 * DOMParser HTML extension 
 * 2012-02-02 
 * 
 * By Eli Grey, http://eligrey.com 
 * Public domain. 
 * NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK. 
 */  

/*! @source https://gist.github.com/1129031 */  
/*global document, DOMParser*/  

(function(DOMParser) {  
    "use strict";  
    var DOMParser_proto = DOMParser.prototype  
      , real_parseFromString = DOMParser_proto.parseFromString;

    // Firefox/Opera/IE throw errors on unsupported types  
    try {  
        // WebKit returns null on unsupported types  
        if ((new DOMParser).parseFromString("", "text/html")) {  
            // text/html parsing is natively supported  
            return;  
        }  
    } catch (ex) {}  

    DOMParser_proto.parseFromString = function(markup, type) {  
        if (/^\s*text\/html\s*(?:;|$)/i.test(type)) {
            var doc = document.implementation.createHTMLDocument("")
              , doc_elt = doc.documentElement
              , first_elt;

            doc_elt.innerHTML = markup;
            first_elt = doc_elt.firstElementChild;

            if (doc_elt.childElementCount === 1
                && first_elt.localName.toLowerCase() === "html") {  
                doc.replaceChild(first_elt, doc_elt);  
            }  

            return doc;  
        } else {  
            return real_parseFromString.apply(this, arguments);  
        }  
    };  
}(DOMParser));
Wednesday, June 9, 2021
 
van_folmert
answered 6 Months ago
27

After much trial and error and stumbling through the dark, I found the solution I was looking for. This will get the value of an individual xml element and set it to a variable:

var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();

So my final code looks like this. I hope it helps you in your endeavors.

//get the data from boardgamegeek
  var url = 'http://www.boardgamegeek.com/xmlapi/boardgame/' + bggCode;
  var bggXml = UrlFetchApp.fetch(url).getContentText();

  var document = XmlService.parse(bggXml);
  var root = document.getRootElement();

  //set variables to data from bgg
  var yearpublished = root.getChild('boardgame').getChild('yearpublished').getText();
  var minplayers = root.getChild('boardgame').getChild('minplayers').getText();
  var maxplayers = root.getChild('boardgame').getChild('maxplayers').getText();
  var playingtime = root.getChild('boardgame').getChild('playingtime').getText();
  var name = root.getChild('boardgame').getChild('name').getText();

  //populate sheet with variable data
  SpreadsheetApp.getActiveSheet().getRange(i+1,1).setValue(name);
  SpreadsheetApp.getActiveSheet().getRange(i+1,4).setValue(minplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,5).setValue(maxplayers);
  SpreadsheetApp.getActiveSheet().getRange(i+1,6).setValue(playingtime); // column 6, so maxplayers in column 5 is not overwritten
  SpreadsheetApp.getActiveSheet().getRange(i+1,7).setValue(yearpublished);

In case you happen to also be querying BGG, there are multiple name elements. I want the one with the primary attribute set to "true". Iterating through those elements to find the correct one will be my next challenge.
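
For the multiple name elements mentioned above, one possible approach is to loop over them and check the primary attribute. This is only a sketch, assuming the XmlService element methods getChildren, getAttribute, getValue and getText, with a hypothetical helper name getPrimaryName:

```javascript
// Hypothetical helper: returns the text of the <name> child whose "primary"
// attribute is "true". Falls back to the first <name>, or null if none exist.
function getPrimaryName(boardgame) {
    var names = boardgame.getChildren('name');
    for (var j = 0; j < names.length; j++) {
        var primary = names[j].getAttribute('primary');
        if (primary && primary.getValue() === 'true') {
            return names[j].getText();
        }
    }
    return names.length > 0 ? names[0].getText() : null;
}
```

It could then replace the direct lookup: var name = getPrimaryName(root.getChild('boardgame'));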

Wednesday, September 8, 2021
 
Seán McCabe
answered 3 Months ago
14

Straight away, it looks like the problem is with the encoding of the XML in your response.

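URL url = new URL("http://myurl.com");
InputSource is = new InputSource(url.openStream());
is.setEncoding("ISO-8859-1"); // Also try UTF-8 or UTF-16
// Pass the encoding to the reader; otherwise the platform default charset is used
BufferedReader br = new BufferedReader(new InputStreamReader(is.getByteStream(), is.getEncoding()));
String line;
StringBuilder str = new StringBuilder(); // must be initialized before appending
while ((line = br.readLine()) != null)
{
      str.append(line);
}
Log.i(TAG, str.toString());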
Monday, October 4, 2021
 
DMTintner
answered 2 Months ago