How to convert HTML to JSON

Is there any way to request an HTML file (or part of it) from a server and convert it into JSON?

Like this:

fetch('https://www.example.com/index.html').then(res => res.json()).then(json => console.log(json));

HTML and JSON are two completely different data formats.

If you’re looking to retrieve the HTML data, the following snippet will work.
[ You need to use res.text() instead of res.json(). ]

fetch('https://www.example.com/index.html')
  .then(res => res.text())
  .then(html => console.log(html));
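
If you don't know in advance whether a URL returns JSON or HTML, one option is to check the Content-Type header before choosing how to read the body. This is only a minimal sketch; readBody is a made-up helper name, not part of the fetch API.

```javascript
// Pick the body reader based on the Content-Type header.
// `res` is a fetch Response (or anything with headers.get, json, and text).
async function readBody(res) {
  const contentType = res.headers.get('content-type') || '';
  if (contentType.includes('application/json')) {
    return res.json(); // parsed object
  }
  return res.text(); // raw string, e.g. an HTML page
}

// Usage sketch (same-origin URL assumed):
// fetch('/index.html').then(readBody).then((body) => console.log(body));
```

Calling res.json() on an HTML response rejects because the body is not valid JSON, which is why the original snippet in the question fails.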

@husseyexplores When we enter a URL in our browser, it downloads the necessary files. Is there any way to put the same URL in fetch('url') and somehow turn the HTML file into JSON without getting a CORS error?

Can you explain what you mean by converting an HTML file to JSON? I don’t follow, since they are different data formats.
Perhaps you can show me the desired output that you want?

The method I wrote above gives you the whole HTML page as a string. You can extract anything from it: references to images, JS files, or CSS, or just the raw HTML. It is all there.
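
As a rough illustration of pulling those references out, here is a sketch that collects image, script, and stylesheet URLs into a plain object you can pass to JSON.stringify. The function name extractResourceUrls and the shape of the result are my own choices, and the parsing step assumes a browser, where DOMParser is available.

```javascript
// Collect the URLs of images, scripts, and stylesheets from a parsed document.
// `doc` is anything with querySelectorAll, e.g. the result of DOMParser.
function extractResourceUrls(doc) {
  const pick = (selector, attr) =>
    Array.from(doc.querySelectorAll(selector), (el) => el.getAttribute(attr));

  return {
    images: pick('img[src]', 'src'),
    scripts: pick('script[src]', 'src'),
    stylesheets: pick('link[rel="stylesheet"][href]', 'href'),
  };
}

// Browser-only usage (DOMParser does not exist in plain Node.js):
// fetch('/index.html')
//   .then((res) => res.text())
//   .then((html) => {
//     const doc = new DOMParser().parseFromString(html, 'text/html');
//     console.log(JSON.stringify(extractResourceUrls(doc), null, 2));
//   });
```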

As for the CORS issue, you can’t solve it from client-side code. It is there for security reasons. Browsers will prevent you from making this kind of cross-origin request, and even when the request does go out, your script cannot read the response unless the server explicitly allows it with CORS headers.

The above method will only work when the page you fetch is on the same site (origin) as the page your code runs on.

For example, you will not be able to fetch the HTML of example.com/index.html from any site other than the example.com domain.

Edit: You can only make this kind of request freely in a server-side environment, such as Node.js, PHP, or Python. Search for “web scraping” for more.
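
For instance, a server-side version in Node.js could look roughly like this. It assumes Node.js 18 or newer, which ships a global fetch; fetchHtml and its fetchImpl parameter are hypothetical names used only for this sketch.

```javascript
// CORS is a browser policy, so a server-side script is not restricted by it.
// `fetchImpl` defaults to the global fetch (Node.js 18+) and can be swapped for testing.
async function fetchHtml(url, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.text(); // the page source as a string
}

// Usage sketch:
// fetchHtml('https://www.example.com/index.html').then((html) => console.log(html.length));
```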


We can access this file in our browser without any CORS issue:

https://www.example.com/index.html

Why can’t we do the same thing in our JS code?

No. Browsers have prevented this from day one!

If you really need to do this, you will need a proxy, either your own or an external one, that fetches the data from the URL on the server side and then sends the response back to you.
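
If you go the "your own proxy" route, a bare-bones version might look like the sketch below. It assumes Node.js 18+ (for the global fetch); the port, the ?url= query parameter, the allowlist, and the function names are all made up for illustration. The allowlist is there so the proxy does not become an open relay.

```javascript
const http = require('http');

// Hypothetical allowlist: only proxy hosts you trust.
const ALLOWED_HOSTS = new Set(['www.example.com']);

// Returns a URL object if the target is an http(s) URL on the allowlist, else null.
function parseAllowedTarget(raw, allowedHosts = ALLOWED_HOSTS) {
  try {
    const target = new URL(raw);
    const okProtocol = target.protocol === 'http:' || target.protocol === 'https:';
    return okProtocol && allowedHosts.has(target.hostname) ? target : null;
  } catch {
    return null; // not a valid absolute URL
  }
}

// Call as: GET http://localhost:8080/?url=https://www.example.com/index.html
function createProxyServer() {
  return http.createServer(async (req, res) => {
    const raw = new URL(req.url, 'http://localhost').searchParams.get('url');
    const target = parseAllowedTarget(raw);
    if (!target) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Missing or disallowed "url" parameter');
      return;
    }
    try {
      const upstream = await fetch(target); // global fetch, Node.js 18+
      const body = await upstream.text();
      res.writeHead(upstream.status, {
        'Content-Type': 'text/html; charset=utf-8',
        // This header is what lets browser code on another origin read the response.
        'Access-Control-Allow-Origin': '*',
      });
      res.end(body);
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end('Upstream request failed');
    }
  });
}

// To run it: createProxyServer().listen(8080);
```

Your browser code would then fetch the proxy URL instead of the target URL directly.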

There is a service/open-source project called CORS Anywhere. You can look it up on Google.
To fetch the above URL using CORS Anywhere, you’d do this
(jQuery example):

$.ajax({
  method: 'GET',
  url: 'https://cors-anywhere.herokuapp.com/https://www.example.com/index.html',
  success: function(html) {
    console.log(html);
  }
});

Thanks, hope it works.

Hi there. Regarding this conversation, I’d recommend www.coolutils.com. They provide a wide range of converters; I’m sharing the link to their HTML converter so you can check whether it meets your needs.

All the best and good luck 🙂
https://www.coolutils.com/TotalHTMLConverter

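// Recursively convert a DOM element into a plain object that JSON.stringify can serialize.
function htmlToJson(element) {
  var tag = {};
  tag.tagName = element.tagName;
  tag.children = [];
  // Convert each child element (text nodes are not part of .children).
  for (var i = 0; i < element.children.length; i++) {
    tag.children.push(htmlToJson(element.children[i]));
  }
  // Store each attribute under an '@'-prefixed key, e.g. '@class'.
  for (var j = 0; j < element.attributes.length; j++) {
    var attr = element.attributes[j];
    tag['@' + attr.name] = attr.value;
  }
  return tag;
}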
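
One thing to note about the function above: it walks .children, so the text inside elements is dropped. If you need the text too, a variant along these lines might help. This is only a sketch; nodeToJson and the output shape (tagName, attributes, children, text) are my own choices, not a standard format.

```javascript
// Variant of the converter above that also keeps text nodes.
// Works on any DOM-like node exposing nodeType, tagName, attributes, childNodes.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

function nodeToJson(node) {
  if (node.nodeType === TEXT_NODE) {
    const text = node.nodeValue.trim();
    return text ? { text } : null; // skip whitespace-only text
  }
  if (node.nodeType !== ELEMENT_NODE) return null; // comments etc. are ignored

  const result = { tagName: node.tagName.toLowerCase(), attributes: {}, children: [] };
  for (const attr of Array.from(node.attributes)) {
    result.attributes[attr.name] = attr.value;
  }
  for (const child of Array.from(node.childNodes)) {
    const converted = nodeToJson(child);
    if (converted) result.children.push(converted);
  }
  return result;
}

// Browser-only usage, with `html` being the string returned by res.text():
// const doc = new DOMParser().parseFromString(html, 'text/html');
// const json = JSON.stringify(nodeToJson(doc.body), null, 2);
```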