with Lorelle and Brent VanFossen

Putting Our Site into WordPress

Once I had a grip on the language of PHP and an understanding of MySQL, I needed to start thinking about how my site would convert into database material. I needed to figure out not only how to get this information into the database, but also how to meet WordPress’ needs for generating the content. Nothing is harder than trying to stick a square peg into a round hole, and I needed to get my website content into a form that WordPress would not only accept but like to work with.

I decided to begin slowly, with only 65 web pages at first, so any damage I did would be limited to a small number of my 500+ web pages. I knew that this would get me through the learning curve, and that figuring out the process would take less time than if I were smashing through all 500+ files. Once I figured out the process, I knew it would go much faster across the rest of the files.

Luckily, WordPress has put a lot of effort into making it as easy as possible for users of other blog software to import their data. WordPress supplies a wide range of import scripts for popular blog software, but little help for non-blog sources, like strictly old-fashioned, do-it-yourself, hand-coded HTML blogs or websites like mine.

What they haven’t done is come up with a simple way to CONVERT the data, only to import it. So it’s up to me to make my site data conform to one of their import systems. Unfortunately, WordPress imports are designed to move data from one database system to another, and not from a static website INTO a database. This means I have to convert my data into something that one of the import scripts will accept, digest, and spit into the database.

Of course, I wanted the easy way out. I wanted to find software that would convert HTML into database material – easy to import. What I was asking for was for someone to create software that would read through all my HTML and CSS coding, pick out what was important to little old me, and strip away the gunk and leave something nice and pretty, ready for import to the database. WRONG!

There is no clean and fast way. As good as software is, we are still not in the Star Trek world, and there is nothing that can read my mind, or my HTML pages, and give me database material. I looked everywhere. Some programs hint at it, but the best they could do was strip the code out so the HTML converted to plain text. Taking the HTML out would destroy my layout, so that wasn’t an option. There was no nice, clean, fast way to do this. I was stuck.

Converting Static HTML to Import Material

Au contraire, my friends! Lorelle found a way to do it, and actually quite easily. Shooting in the dark with only a little help from the forums (until I used capital letters and started pleading), I figured it out. Come along for the long ride.

I began by thoroughly studying the data format WordPress requires in order to “easily” import the information. I decided that the simplest format I could convert my HTML pages to was the import-mt format, used for importing Movable Type blogs into the WordPress database. I liked it because converting my site’s web pages to something similar looked simple, and because it allows HTML/XHTML tags in the imported content, so I could keep my formatting within each page, such as tip boxes, photographs, and graphics. All I had to do was sort my HTML information into the import-mt format.

I found a tutorial on How to Import MovableType Entries into your WordPress Blog and the Movable Type instructions for importing data, and began to memorize them, tearing each apart and putting it back together until I understood each element.
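To make the target concrete, here is a minimal Python sketch of what one page might look like once it is sorted into the Movable Type import layout. The field names and dash separators follow the Movable Type import format as I understand it; the function name `make_mt_entry` and all of the sample values are made up for illustration, and this is not the exact process used on this site.

```python
from datetime import datetime

SECTION_END = "-----"     # five dashes close a section within an entry
ENTRY_END = "--------"    # eight dashes close the whole entry


def make_mt_entry(title, author, date, category, body_html):
    """Return one page as text in the Movable Type import layout.

    The body is passed through untouched, so XHTML tags survive.
    """
    lines = [
        f"TITLE: {title}",
        f"AUTHOR: {author}",
        # Movable Type expects MM/DD/YYYY hh:mm:ss with optional AM/PM
        f"DATE: {date.strftime('%m/%d/%Y %I:%M:%S %p')}",
        f"CATEGORY: {category}",
        SECTION_END,
        "BODY:",
        body_html.strip(),
        SECTION_END,
        ENTRY_END,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical sample page, invented for this example
entry = make_mt_entry(
    title="Photographing Wildflowers",
    author="Lorelle",
    date=datetime(2005, 6, 15, 9, 30, 0),
    category="Nature Photography",
    body_html='<p>Get low.</p>\n<div class="tip">Bring a tarp.</div>',
)
print(entry)
```

Writing one such block per page, one after another into a single text file, gives the kind of file the import-mt script is meant to read.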

Work From a Copy

The next step was to make sure I had the most current version of my site to work from, and that it was validated to death. I copied my entire website from the server to my computer’s hard drive into a new folder, ready to destroy and rearrange. The originals are still protected and backed up, thank goodness, because if I screw up along the way, I need more than one backup to fall back on.

I also spent some time double-checking and validating a good number of pages to make sure that what I was starting with was in good condition. I’m glad I did, because I found a lot of little errors, not life-shattering but capable of causing me grief later on after the conversion from HTML to XHTML. XHTML requires every tag to either have a closing tag or be self-closing, which made it very important to find or add every closing </p> and </li> tag.

One of the little things that also caught me off guard was that XHTML tags must all be in lowercase. I’d made that a “rule” during my last major revision of the website, in order to be compliant for when I finally made the move to XHTML, but some still slipped through, left over from a prior HTML editor that capitalized HTML tags by default. I had to go through and check for those stray capitalized tags, and thankfully, I only found a few in my test pages.

The next step is probably the most tedious. I thought it would be easy, but it may not turn out that way. I first have to convert the HTML into XHTML in order to validate and meet the requirements of WordPress.
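Both checks described above, stray capitalized tags and missing closing </p> and </li> tags, can be roughed out in a few lines of Python. This is only a sketch using the standard library: it simply counts opening and closing tags, so it is no substitute for a real validator, and the function names and sample markup are invented for the example.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Matches the name of any opening or closing tag, e.g. "<P" or "</UL"
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def uppercase_tags(html):
    """Return every tag name that contains a capital letter."""
    return [name for name in TAG_NAME.findall(html) if name != name.lower()]


class TagCounter(HTMLParser):
    """Count opening and closing tags so unbalanced pairs stand out."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1   # the parser reports tag names in lowercase

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unclosed(html, tags=("p", "li")):
    """Return {tag: number of missing closing tags} for the given tags."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return {
        tag: counter.opened[tag] - counter.closed[tag]
        for tag in tags
        if counter.opened[tag] != counter.closed[tag]
    }


# Made-up snippet with two capitalized tags and two missing closing tags
sample = "<UL><li>One<li>Two</li></UL><P>Hello"
print(uppercase_tags(sample))
print(unclosed(sample))
```

Running it over each file in the working copy gives a quick list of pages that still need attention before the conversion to XHTML.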

After a bunch of research, it turns out that HTML Tidy is the best way to go, but using this program is like going back to Windows before they had numbers after the title. I’m telling you, it’s like working with Pre-Windows 3.11. It harkens back to DOS 3.3 and earlier. Does anyone remember those “good old days” where dreams of having 640K of RAM were still fantasy?

Anyway, Tidy is powerful and archaic, to say the least. There are a few Tidy GUI programs out there that turn Tidy into a Windows program, but they are few in number. If you can handle the old method, 106-IBM has good instructions for helping you through the process, but only as a starting point. After looking at my few choices and failing to make the old-fashioned DOS versions work, I settled on Hab Utilities HABTidy, a very simple, free Windows GUI which processes multiple files, but no more than 11 at a time.

I set the custom options to include “output xml” and chose “Process Files as a Group”. It works really fast, so I didn’t cry too hard at being limited to only 11 files. Select the files and set the selector at the bottom to read “Format with custom options” and you are good to go. HABTidy saves a backup of the file as ~filename.html and changes the original. This conversion isn’t perfect if you have unvalidated HTML, but if you have seriously validated your code prior to beginning, the conversion works like a charm.

Once I had the converted files, it was time to convert those into something WordPress and MySQL would recognize for import via the Movable Type import for WordPress.

It seems that Movable Type uses the same format for import and export, so by following the Movable Type instructions for importing data, I had a format to follow.
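For anyone braver than me about the command line, the same batch job can be sketched in Python by calling Tidy directly. This assumes a command-line `tidy` is installed and on the path, which is not the HABTidy setup described above. The `-asxhtml`, `-q`, and `-o` options are standard Tidy switches; the function names are invented, and the ~filename backup simply imitates what HABTidy does.

```python
import shutil
import subprocess
from pathlib import Path


def tidy_command(source, target):
    """Build the command line asking Tidy to write XHTML to target."""
    return ["tidy", "-asxhtml", "-q", "-o", str(target), str(source)]


def convert_folder(folder):
    """Convert every .html file in folder, keeping a ~name backup first.

    Returns {file name: Tidy exit code}, where 0 means clean,
    1 means warnings, and 2 means errors.
    """
    results = {}
    for page in sorted(Path(folder).glob("*.html")):
        if page.name.startswith("~"):
            continue  # skip backups left by an earlier run
        backup = page.with_name("~" + page.name)
        shutil.copy2(page, backup)  # keep the untouched original
        done = subprocess.run(tidy_command(backup, page))
        results[page.name] = done.returncode
    return results
```

Calling `convert_folder` on the working copy of the site, never the originals, returns the exit code for each page, so any file Tidy reports errors on can be fixed by hand and run again. There is no 11-file limit this way.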

It was time to begin the Search and Replace.

4 Comments

  • shanmukh
    Posted August 18, 2006 at 1:31 | Permalink

    I don’t know how to use blog software. I am using WordPress. Can anybody please suggest how to use blog software? I need all the detailed steps. Even though I went to the WordPress site, it is a little bit confusing. Help me… I am a Java programmer.

  • Posted August 18, 2006 at 9:53 | Permalink

    WordPress Lessons, New To WordPress – Where to Start, and Lorelle on WordPress would be my starting points. If you can understand Java, then blogging with WordPress is a breeze.

    Do not start by ripping it apart under the hood. Blog the basics and then poke at it slowly. It’s not hard. Young kids are using it.

    As for what is confusing on the WordPress site, please let me know specifically how it can be improved as I have some input there. Thank you.

  • Posted January 22, 2008 at 18:35 | Permalink

    I know this is a very old topic but I need to resurrect it please. In October when Yahoo 360 announced that they would no longer support our blogs, we all began scrambling to find a replacement blogging home. After checking many other platforms out, I decided on WordPress as the most comfortable. I now have two blogs at wordpress.com, but have not yet been able to transfer my Yahoo 360 blog there.

    I have written both Yahoo and WordPress about a way to transfer my posts some way other than cutting and pasting over 100 entries. Yahoo has never responded and WordPress says that Yahoo needs an export tool.

    The social network Multiply wrote an import tool for Yahoo refugees within weeks after Yahoo announced its closing, so I am feeling that writing an import tool must not be that difficult for an experienced programmer … but what do I know!

    I have been reading your WordPress-related blogs and find you very knowledgeable, so I thought that if anyone could tell me how to do this – you could. BTW there are many others who would like this issue resolved as well.

    Thanks for reading.

  • Posted January 23, 2008 at 9:31 | Permalink

    It looks like you have two choices. If the old blog service has feeds, you can use the XML feed import feature. If it doesn’t, and you still have access to the blog content, it’s copy and paste time. Sure, it’s time consuming, but how much time have you been waiting already?

    WordPress offers over a dozen methods for importing content, and one of those may work. Check the WordPress Codex for information on all the different types of importing. Good luck.
