Asked  7 Months ago    Answers:  5   Viewed   38 times

I can convert JSON to HTML using the JsontoHtml library. Now I need to convert existing HTML to JSON, as shown on this site. When I looked into the code, I found the following script:

<script>
$(function(){

    //HTML to JSON
    $('#btn-render-json').click(function() {

        //Set html output
        $('#html-output').html( $('#html-input').val() );

        //Process to JSON and format it for consumption
        $('#html-json').html( FormatJSON(toTransform($('#html-output').children())) );
    });

});

//Convert obj or array to transform
function toTransform(obj) {

    var json;

    if( obj.length > 1 )
    {
        json = [];

        for(var i = 0; i < obj.length; i++)
            json[json.length++] = ObjToTransform(obj[i]);
    } else
        json = ObjToTransform(obj);

    return(json);
}

//Convert obj to transform
function ObjToTransform(obj)
{
    //Get the DOM element
    var el = $(obj).get(0);

    //Add the tag element
    var json = {'tag':el.nodeName.toLowerCase()};

    for (var attr, i=0, attrs=el.attributes, l=attrs.length; i<l; i++){
        attr = attrs[i];
        json[attr.nodeName] = attr.value;
    }

    var children = $(obj).children();

    if( children.length > 0 ) json['children'] = [];
    else json['html'] = $(obj).text();

    //Add the children
    for(var c = 0; c < children.length; c++)
        json['children'][json['children'].length++] = toTransform(children[c]);

    return(json);
}

//Format JSON (with indents)
function FormatJSON(oData, sIndent) {
    if (arguments.length < 2) {
        var sIndent = "";
    }
    var sIndentStyle = "  ";
    var sDataType = RealTypeOf(oData);

    // open object
    if (sDataType == "array") {
        if (oData.length == 0) {
            return "[]";
        }
        var sHTML = "[";
    } else {
        var iCount = 0;
        $.each(oData, function() {
            iCount++;
            return;
        });
        if (iCount == 0) { // object is empty
            return "{}";
        }
        var sHTML = "{";
    }

    // loop through items
    var iCount = 0;
    $.each(oData, function(sKey, vValue) {
        if (iCount > 0) {
            sHTML += ",";
        }
        if (sDataType == "array") {
            sHTML += ("\n" + sIndent + sIndentStyle);
        } else {
            sHTML += ("\"" + sKey + "\"" + ":");
        }

        // display relevant data type
        switch (RealTypeOf(vValue)) {
            case "array":
            case "object":
                sHTML += FormatJSON(vValue, (sIndent + sIndentStyle));
                break;
            case "boolean":
            case "number":
                sHTML += vValue.toString();
                break;
            case "null":
                sHTML += "null";
                break;
            case "string":
                sHTML += ("\"" + vValue + "\"");
                break;
            default:
                sHTML += ("TYPEOF: " + typeof(vValue));
        }

        // loop
        iCount++;
    });

    // close object
    if (sDataType == "array") {
        sHTML += ("\n" + sIndent + "]");
    } else {
        sHTML += ("}");
    }

    // return
    return sHTML;
}

//Get the type of the obj (can replace by jquery type)
function RealTypeOf(v) {
  if (typeof(v) == "object") {
    if (v === null) return "null";
    if (v.constructor == (new Array).constructor) return "array";
    if (v.constructor == (new Date).constructor) return "date";
    if (v.constructor == (new RegExp).constructor) return "regex";
    return "object";
  }
  return typeof(v);
}
</script>
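
As a side note, current JavaScript engines ship JSON.stringify, whose third argument controls indentation, so a hand-rolled formatter such as FormatJSON may not be needed. A minimal sketch, assuming the object below (a made-up sample shaped like the toTransform() output) stands in for the real result; the realTypeOf name is likewise only illustrative:

```javascript
// Hypothetical sample shaped like the output of toTransform(); not taken from the site.
const sample = {
  tag: "ul",
  children: [
    { tag: "li", html: "Yes" },
    { tag: "li", html: "No" }
  ]
};

// The third argument is the indent width; string values are escaped automatically.
const pretty = JSON.stringify(sample, null, 2);
console.log(pretty);

// Array.isArray covers the "array" branch of RealTypeOf without constructor checks.
// Unlike the original, Date and RegExp values fall through to "object" here.
function realTypeOf(v) {
  if (v === null) return "null";
  if (Array.isArray(v)) return "array";
  return typeof v;
}
```

Because JSON.stringify escapes quotes and backslashes inside strings, the result is always valid JSON, which the string concatenation in FormatJSON does not guarantee.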


Now I need to use the above functions in PHP. I can already get the HTML data; all I need now is to convert the JavaScript functions to PHP. Is this possible? My main doubts are as follows:

  • The primary input for the JavaScript function toTransform() is an object. Is it possible to convert HTML to an object in PHP?

  • Are all the functions present in this particular JavaScript available in PHP?

Please suggest an approach.

When I tried to convert a script tag to JSON as per the answer given, I got errors. When I tried the same input on the json2html site, it produced output (originally shown as an image). How can I achieve the same result?

 Answers

57

If you are able to obtain a DOMDocument object representing your HTML, then you just need to traverse it recursively and construct the data structure that you want.

Converting your HTML document into a DOMDocument should be as simple as this:

function html_to_obj($html) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    return element_to_obj($dom->documentElement);
}

Then, a simple traversal of $dom->documentElement which gives the kind of structure you described could look like this:

function element_to_obj($element) {
    $obj = array( "tag" => $element->tagName );
    foreach ($element->attributes as $attribute) {
        $obj[$attribute->name] = $attribute->value;
    }
    foreach ($element->childNodes as $subElement) {
        if ($subElement->nodeType == XML_TEXT_NODE) {
            $obj["html"] = $subElement->wholeText;
        }
        else {
            $obj["children"][] = element_to_obj($subElement);
        }
    }
    return $obj;
}

Test case

$html = <<<EOF
<!DOCTYPE html>
<html lang="en">
    <head>
        <title> This is a test </title>
    </head>
    <body>
        <h1> Is this working? </h1>  
        <ul>
            <li> Yes </li>
            <li> No </li>
        </ul>
    </body>
</html>

EOF;

header("Content-Type: text/plain");
echo json_encode(html_to_obj($html), JSON_PRETTY_PRINT);

Output

{
    "tag": "html",
    "lang": "en",
    "children": [
        {
            "tag": "head",
            "children": [
                {
                    "tag": "title",
                    "html": " This is a test "
                }
            ]
        },
        {
            "tag": "body",
            "html": "  \n        ",
            "children": [
                {
                    "tag": "h1",
                    "html": " Is this working? "
                },
                {
                    "tag": "ul",
                    "children": [
                        {
                            "tag": "li",
                            "html": " Yes "
                        },
                        {
                            "tag": "li",
                            "html": " No "
                        }
                    ],
                    "html": "\n        "
                }
            ]
        }
    ]
}

Answer to updated question

The solution proposed above does not work with the <script> element, because it is parsed not as a DOMText, but as a DOMCharacterData object. This is because the DOM extension in PHP is based on libxml2, which parses your HTML as HTML 4.0, and in HTML 4.0 the content of <script> is of type CDATA and not #PCDATA.

You have two solutions for this problem.

  1. The simple but not very robust solution would be to add the LIBXML_NOCDATA flag to DOMDocument::loadHTML. (I am not actually 100% sure whether this works for the HTML parser.)

  2. The more difficult but, in my opinion, better solution is to add an additional test when checking $subElement->nodeType before the recursion. The recursive function would become:

function element_to_obj($element) {
    $obj = array( "tag" => $element->tagName );
    foreach ($element->attributes as $attribute) {
        $obj[$attribute->name] = $attribute->value;
    }
    foreach ($element->childNodes as $subElement) {
        if ($subElement->nodeType == XML_TEXT_NODE) {
            $obj["html"] = $subElement->wholeText;
        }
        elseif ($subElement->nodeType == XML_CDATA_SECTION_NODE) {
            $obj["html"] = $subElement->data;
        }
        else {
            $obj["children"][] = element_to_obj($subElement);
        }
    }
    return $obj;
}

If you hit another bug of this type, the first thing to do is check what type of node $subElement is, because there are many other possibilities that my short example function does not handle.
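
One way to make the traversal tolerant of node types it does not know is to switch on nodeType explicitly and skip everything else. The sketch below is written in JavaScript over plain objects that mimic DOM nodes; the node shape and the nodeToObj name are assumptions for illustration, not part of any library. The numeric node-type values are the ones defined by the DOM standard, which PHP's XML_ELEMENT_NODE, XML_TEXT_NODE and XML_CDATA_SECTION_NODE constants share.

```javascript
// Node type values from the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;

// nodeToObj is a hypothetical helper. `node` is a plain object mimicking a DOM
// element: { nodeType, tagName, attributes: [{name, value}], childNodes: [...] }.
function nodeToObj(node) {
  const obj = { tag: node.tagName };
  for (const attr of node.attributes || []) {
    obj[attr.name] = attr.value;
  }
  for (const child of node.childNodes || []) {
    switch (child.nodeType) {
      case TEXT_NODE:
      case CDATA_SECTION_NODE:
        // Append rather than overwrite, so several text runs are all kept.
        obj.html = (obj.html || "") + child.data;
        break;
      case ELEMENT_NODE:
        (obj.children = obj.children || []).push(nodeToObj(child));
        break;
      default:
        // Comments, processing instructions, etc. are skipped in this sketch.
        break;
    }
  }
  return obj;
}

// Usage with a hand-built tree (made-up data); nodeType 8 is a comment node.
const tree = {
  nodeType: ELEMENT_NODE,
  tagName: "script",
  attributes: [{ name: "type", value: "text/javascript" }],
  childNodes: [
    { nodeType: CDATA_SECTION_NODE, data: "alert('hi');" },
    { nodeType: 8, data: " a comment " }
  ]
};
console.log(JSON.stringify(nodeToObj(tree)));
```

Note that this sketch concatenates text runs instead of keeping only the last one, which is a deliberate difference from the PHP function above; whether that is desirable depends on the output format you need.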

Additionally, you will notice that libxml2 has to fix mistakes in your HTML in order to build a DOM for it. This is why <html> and <head> elements appear even if you don't specify them. You can avoid this by using the LIBXML_HTML_NOIMPLIED flag.

Test case with script

$html = <<<EOF
        <script type="text/javascript">
            alert('hi');
        </script>
EOF;

header("Content-Type: text/plain");
echo json_encode(html_to_obj($html), JSON_PRETTY_PRINT);

Output

{
    "tag": "html",
    "children": [
        {
            "tag": "head",
            "children": [
                {
                    "tag": "script",
                    "type": "text/javascript",
                    "html": "\n            alert('hi');\n        "
                }
            ]
        }
    ]
}
Wednesday, March 31, 2021
 
employeegts
answered 7 Months ago
38

Sorry guys, the issue was a super small one. After inspecting the DOM in Chrome, I found that a syntax error was preventing the JavaScript from loading. I was getting an "Uncaught ReferenceError: $ is not defined" error.

Anyway, this is the faulty markup:

<link rel="stylesheet" type="text/css" href="/css/structure.css">
<link rel="stylesheet" type="text/css" href="/css/pure-min.css"
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/jquery.form.min.js"></script>
<script type="text/javascript" src="js/uploader.js"></script>

Should be this:

<link rel="stylesheet" type="text/css" href="/css/structure.css">
<link rel="stylesheet" type="text/css" href="/css/pure-min.css"> <-- FORGOT TO CLOSE! :| 
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/jquery.form.min.js"></script>
<script type="text/javascript" src="js/uploader.js"></script>

Sorry for wasting your time, that was my bad.

Wednesday, March 31, 2021
 
jab
answered 7 Months ago
83

JSON.parse() is not working because the JSON string is not valid. File_path is img\gallerij\Afbeelding-61.png but it should be img/gallerij/demo-image.jpg. Please change File_path everywhere in the JSON string.


File_path is generated by $outp .= '"File_path":"'. $rs["Foto_File"] . '",'; so replace \ with / there.

Benefit: JSON.parse() then gives you a path like img/gallerij/demo-image.jpg directly on the client, which can be used as the src of a DOM element.
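
To see why a path with backslash separators breaks parsing, here is a small sketch. It assumes the path arrived with Windows-style backslashes; the tryParse helper is a hypothetical name, and the JSON text is reduced to the one field discussed above. In JSON a backslash starts an escape sequence, and \g and \A are not valid escapes, so the parser rejects the whole string.

```javascript
// tryParse is a hypothetical helper: returns the parsed value, or null on a syntax error.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return null;
  }
}

// String.raw keeps the backslashes literally, as they would arrive from the server.
const bad = String.raw`{"File_path":"img\gallerij\Afbeelding-61.png"}`;

// Replace every backslash with a forward slash before parsing.
const good = bad.replace(/\\/g, "/");

console.log(tryParse(bad));            // null, because \g and \A are not valid JSON escapes
console.log(tryParse(good).File_path); // img/gallerij/Afbeelding-61.png
```

Doing the replacement on the server, where the string is built, is usually cleaner than patching it on the client; the client-side replace here only demonstrates the effect.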

Saturday, May 29, 2021
 
Bere
answered 5 Months ago
92
using System;
using System.Linq;
using System.Web.Script.Serialization;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        var xml = 
        @"<Columns>
          <Column Name=""key1"" DataType=""Boolean"">True</Column>
          <Column Name=""key2"" DataType=""String"">Hello World</Column>
          <Column Name=""key3"" DataType=""Integer"">999</Column>
        </Columns>";
        var dic = XDocument
            .Parse(xml)
            .Descendants("Column")
            .ToDictionary(
                c => c.Attribute("Name").Value, 
                c => c.Value
            );
        var json = new JavaScriptSerializer().Serialize(dic);
        Console.WriteLine(json);
    }
}

produces:

{"key1":"True","key2":"Hello World","key3":"999"}

Obviously this treats all the values as strings. If you want to keep the underlying type semantics you could do the following:

using System;
using System.Linq;
using System.Web.Script.Serialization;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        var xml = 
        @"<Columns>
          <Column Name=""key1"" DataType=""System.Boolean"">True</Column>
          <Column Name=""key2"" DataType=""System.String"">Hello World</Column>
          <Column Name=""key3"" DataType=""System.Int32"">999</Column>
        </Columns>";
        var dic = XDocument
            .Parse(xml)
            .Descendants("Column")
            .ToDictionary(
                c => c.Attribute("Name").Value, 
                c => Convert.ChangeType(
                    c.Value,
                    typeof(string).Assembly.GetType(c.Attribute("DataType").Value, true)
                )
            );
        var json = new JavaScriptSerializer().Serialize(dic);
        Console.WriteLine(json);
    }
}

produces:

{"key1":true,"key2":"Hello World","key3":999}

And if you cannot modify the underlying XML structure you will need a custom function that will convert between your custom types and the underlying .NET type:

using System;
using System.Linq;
using System.Web.Script.Serialization;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        var xml = 
        @"<Columns>
          <Column Name=""key1"" DataType=""Boolean"">True</Column>
          <Column Name=""key2"" DataType=""String"">Hello World</Column>
          <Column Name=""key3"" DataType=""Integer"">999</Column>
        </Columns>";
        var dic = XDocument
            .Parse(xml)
            .Descendants("Column")
            .ToDictionary(
                c => c.Attribute("Name").Value, 
                c => Convert.ChangeType(
                    c.Value, 
                    GetType(c.Attribute("DataType").Value)
                )
            );
        var json = new JavaScriptSerializer().Serialize(dic);
        Console.WriteLine(json);
    }

    private static Type GetType(string type)
    {
        switch (type)
        {
            case "Integer":
                return typeof(int);
            case "String":
                return typeof(string);
            case "Boolean":
                return typeof(bool);
            // TODO: add any other types that you want to support
            default:
                throw new NotSupportedException(
                    string.Format("The type {0} is not supported", type)
                );
        }
    }
}
Thursday, June 17, 2021
 
avon_verma
answered 4 Months ago
23

Try the following. I've tested it in FF 3.6 and Chrome 6, and it works.

$.get('data/eurofxref-daily.xml', function(xml) {
      var jsonObj = $.xml2json(xml);
      alert(jsonObj.Cube.Cube.Cube[0]["rate"]);
}); 
Saturday, June 26, 2021
 
NaeiKinDus
answered 4 Months ago