Friday, 22 July 2011

JavaScript: get data from an xml file (like a Blogger backup file) and display it (or print it) - Part 1


I've been circling around this post for quite a while. I know it could contain valuable information for everyone, and for Bloggers specifically, but there's so much in it that I don't know where to start.
Anyway... The point here is:
1) take the Blogger backup file (xml) - or any other well-formed xml file (an Atom feed works as well);
2) clean it up - if needed;
3) build an HTML page that gets the data from the xml file (using JavaScript);
4) properly style the page.

As you can see the task is not that easy... or is it? Read on, and judge for yourself.

The backup file
If you back up your blog, you end up with an xml file that contains everything (posts, comments and pages). We are talking about Blogger here, but any xml file will do (RSS feeds or anything else). As you may already know, an xml file has a defined structure. It is basically made of nodes, each with an opening and a closing tag. Those nodes are normally nested, so that we have parent nodes and child nodes.
When we open a Blogger backup file, we can see that what we are looking for is contained in a parent node called <entry>, while its relevant children are:
  • <category>
  • <title>
  • <published>
  • <content>
Those are the nodes we will focus on.
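To make the structure concrete, here is a small sketch of how those four nodes could be read from one <entry> element. The helper names textOf and readEntry are made up for this example, and entry is assumed to be a DOM element such as the ones returned by getElementsByTagName("entry"). Only the first child of each node is read, which is a simplification.

```javascript
// Return the text of the first <tagName> node inside an entry,
// or "" if the node is missing or empty.
function textOf(entry, tagName) {
  var nodes = entry.getElementsByTagName(tagName);
  if (nodes.length === 0 || !nodes[0].firstChild) {
    return "";
  }
  return nodes[0].firstChild.nodeValue;
}

// Collect the four fields this post relies on into a plain object.
function readEntry(entry) {
  var categories = entry.getElementsByTagName("category");
  return {
    // <category> carries its value in the "term" attribute, not as text
    term: categories.length > 0 ? categories[0].getAttribute("term") : "",
    title: textOf(entry, "title"),
    published: textOf(entry, "published"),
    content: textOf(entry, "content")
  };
}
```

Reading everything through one guarded helper means an entry with an empty <title> gives an empty string instead of a script error.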
The first part of the xml file is full of rubbish, while at the end you can find comments and pages. I decided to make a copy of the backup file and call it "thought.xml". If you are not using a Blogger backup file, you might want to manually clean your xml file - it's up to you.
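For a Blogger backup, part of the clean-up can also be done in code instead of by hand, by looking at the term attribute of each entry's <category>. The sketch below assumes that ordinary posts carry a term ending in "kind#post" (for example http://schemas.google.com/blogger/2008/kind#post); check your own file before relying on it. The name isPostTerm is made up for this example.

```javascript
// Assumed ending of the category term that marks an ordinary post.
var POST_KIND_SUFFIX = "kind#post";

// True only when the term ends with the assumed post marker,
// so comments, pages and anything else are left out.
function isPostTerm(term) {
  if (typeof term !== "string") {
    return false;
  }
  return term.slice(-POST_KIND_SUFFIX.length) === POST_KIND_SUFFIX;
}
```

Matching on what a post looks like, rather than listing what to exclude, also skips the entries at the top of the file without naming them one by one.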

The HTML and CSS
Now that we have the xml file ready, we need to work on the HTML page. Just create a new document with your favourite editor. The document should have a:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
in the head. Just to be sure.
As for the CSS, we can style the nodes the way we prefer. I just want to include my styles, so that you can play with them (not that there's much in them anyway).
<style type="text/css">
body {
    font-family: Verdana, Arial, Helvetica, sans-serif;
    font-size: 12px;
    color: #FFFFFF;
    text-decoration: none;
    background-color: #000000;
    margin: 50px 250px;
}
/* The selector of this rule is missing from the listing; "code" is an assumption. */
code {
    font-family: "Courier New", Courier, monospace;
    background-color: #666666;
    border: thin solid;
    border-color: #333333 #ffffff #ffffff #333333;
    display: block;
    padding-left: 5px;
}
h2 {
    font-family: Verdana, Arial, Helvetica, sans-serif;
    text-transform: uppercase;
    color: #FFFFFF;
    text-decoration: none;
}
/* The selectors of the last two rules are also missing; "a" and "a:hover" are assumptions. */
a {
    text-decoration: none;
    color: #888888;
}
a:hover {
    text-decoration: underline;
    color: #cccccc;
}
</style>
I just styled the body and a few other things. The <h2> tag will be used for the post title (the <title> node).

Ok! Now we are almost ready for the magic.

Get the data!
We will use JavaScript to get the data from the xml file. From now on, we will work on the body of our HTML page.
For the moment I will just post the code. I will explain it in the next post.
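<div align="center"><img src="" alt="The Web Thought"><br>
  A place where I can share my thoughts on web development and programming. The web is such a big place...<br>
</div>
<script type="text/javascript">
var xmlhttp;
if (window.XMLHttpRequest)
  {// code for IE7+, Firefox, Chrome, Opera, Safari
  xmlhttp=new XMLHttpRequest();
  }
else
  {// code for IE6, IE5
  xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
  }
// these three lines are missing from the listing: a synchronous GET of thought.xml is assumed
xmlhttp.open("GET","thought.xml",false);
xmlhttp.send();
var xmlDoc=xmlhttp.responseXML;
var x=xmlDoc.getElementsByTagName("entry");
for (var i=x.length-1;i>=1;i--)
  {
  var sea = x[i].getElementsByTagName("category")[0].getAttribute("term");
  if (sea.indexOf("comment")==-1 && sea.indexOf("page")==-1)
    {// the entry is a post; the title line below is assumed, it is missing from the listing
    document.write("<h2>"+x[i].getElementsByTagName("title")[0].childNodes[0].nodeValue+"</h2>");
    var pubdatetime=(x[i].getElementsByTagName("published")[0].childNodes[0].nodeValue);
    var pubdate = pubdatetime.substr(0,10);
    var pubtime = pubdatetime.substr(11,8);
    document.write("Published on "+pubdate);
    document.write(" at "+pubtime);
    document.write("<br><br><div style='position: relative;'>");
    var content = x[i].getElementsByTagName("content")[0];
    var kids = content.childNodes.length;
    for (var j=0;j<kids;j++)
      {// loop body assumed: write out each child node of <content>
      document.write(content.childNodes[j].nodeValue);
      }
    document.write("</div>");
    }
  }
</script>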
Now you have three whole days to think about the above code. On Monday I will post an in-depth explanation.
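Until then, one piece can already be tried in isolation: the two substr calls that cut the <published> value into a date and a time. The sketch below assumes the timestamp looks like 2011-07-22T10:30:00.000+02:00; the function name splitPublished and the sample value are made up for illustration. It uses slice, which takes a start and an end position, so slice(11, 19) picks the same 8 characters as substr(11,8).

```javascript
// Split a timestamp such as "2011-07-22T10:30:00.000+02:00" (assumed format)
// into its date and time parts, using fixed character positions.
function splitPublished(pubdatetime) {
  return {
    date: pubdatetime.slice(0, 10),  // first 10 characters: YYYY-MM-DD
    time: pubdatetime.slice(11, 19)  // 8 characters after the "T": HH:MM:SS
  };
}

var parts = splitPublished("2011-07-22T10:30:00.000+02:00");
console.log("Published on " + parts.date + " at " + parts.time);
// prints "Published on 2011-07-22 at 10:30:00"
```

Because the positions are fixed, this only works while the timestamp keeps that exact layout.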
Just an important note: because we are using XMLHttpRequest, the source xml file and the html page we are building should be on the same domain, otherwise you will get an Access Denied error from your browser. However, that is NOT always true: in my experience Firefox and Safari work perfectly with local files, while Chrome and Internet Explorer need the files to be served by a web server such as IIS on the same domain.
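The same-domain rule can be made concrete with a small check. This is only a sketch: sameOrigin is a made-up helper, the URLs are examples, and it relies on the standard URL object, which older browsers do not provide. Pages opened straight from disk are assumed to report the origin "null"; since browsers differ in what they allow there, the sketch treats them as not same-origin.

```javascript
// Compare the origin (scheme + host + port) of the page and of the xml file.
// xmlUrl may be relative, e.g. "thought.xml"; it is resolved against pageUrl.
function sameOrigin(pageUrl, xmlUrl) {
  var page = new URL(pageUrl);
  var xml = new URL(xmlUrl, pageUrl);
  // file: URLs are assumed to report the origin "null": treat them as unsafe
  if (page.origin === "null" || xml.origin === "null") {
    return false;
  }
  return page.origin === xml.origin;
}

console.log(sameOrigin("http://localhost/blog/index.html", "thought.xml"));
console.log(sameOrigin("http://localhost/blog/index.html", "http://example.com/thought.xml"));
```

The first call logs true and the second logs false, because only the first pair shares scheme, host and port.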
In the meantime, you can try the whole thing!
Happy coding.
