I’ve spent hours hunting all over the net for software that will convert my HTML pages to something that will fit in a database. The reason? I want to convert my web site to PHP and mySQL, and that requires taking my current pages and converting them to a form that can be imported by a program such as Excel, Access, or mySQL. With more than 500 pages on our web site, stripping the codes and putting in tabs, commas, or other delimiters that a database can understand on import is an incredibly time-consuming effort when done one page at a time. Ugh.
The first few hours were spent trying to find the terminology for the process of converting, importing, changing, migrating, or just fixing HTML to make it recognizable by a database program. I tried “convert HTML to php” and “convert HTML to mysql” with little success. I dug through hundreds of pages that gave me more than I wanted to know about how to generate HTML pages with PHP and mySQL, but nothing on getting the HTML into mySQL.
Finally I stumbled upon the phrase “convert HTML to database”. That brought me more possibilities, but unfortunately, as with a lot of the Internet, the suggestions were more appropriate for those using Windows 3.11 or Windows 95 than for newer software, and the links were dead.
I did stumble upon one site that specializes in file conversion software, called Intelligent Converters, but they can only help turn database information into something else, like PDF, HTML, or another database program's format.
I found an amazing site called GetaFreelancer.com. I stumbled on it because a company was looking for someone to convert a web site’s 200 pages to a database setup. They had dozens of people and companies willing to bid on the job. The account was closed, so I assume they hired someone, but this is really worth a further look…later, when I have time.
Continuing to plug away at this – more determined to spend time hunting for a quick solution than actually spending hours on end copying and pasting from more than 500 web pages – I finally found some possibilities. I will give them a try over the next few days and report back.
FileChicken.com, a funny name but an interesting site. It listed a bunch of HTML conversion programs available for downloading, including programs for converting HTML to and from other things. But as soon as I found that page, the rest of the site stopped working. PHP errors everywhere. Luckily I was able to get to the home page of one of the software developers and download a program from there.
Here are some others:
- Nirsoft’s HTML as Text
- Jafsoft’s Detagger
- Hixus’s HTML to PHP converter
- Arigola HTML2PHP Converter: very overblown ad page. In order to download, you have to send your email address to get the link and “subscribe” to emails from them.
- Tipsntutorials has a PHP script for stripping HTML code from documents
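While I test those, here is a rough sketch of what "stripping the codes" and producing something a database can import might look like as a script. It is only an illustration under my own assumptions, not how any of the tools above work: it is written in Python using nothing but the standard library, the names `TextExtractor`, `extract`, and `pages_to_csv` are made up for this example, and it assumes the pages are UTF-8 files ending in `.html`. It pulls out each page's title and visible text and writes one CSV row per page.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text of one HTML page."""

    SKIP = {"script", "style"}  # tags whose contents are not page text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif not self._skip_depth:
            self.text_parts.append(data)


def extract(html):
    """Return (title, text) with tags removed and whitespace collapsed.

    Text chunks are joined with a space, so a word split by an inline
    tag (for example <b>Hel</b>lo) would come out as two pieces.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return title, text


def pages_to_csv(site_dir, csv_path):
    """Write one CSV row (filename, title, text) per .html file under site_dir."""
    pages = sorted(Path(site_dir).rglob("*.html"))
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "title", "text"])
        for page in pages:
            html = page.read_text(encoding="utf-8", errors="replace")
            writer.writerow([page.name, *extract(html)])
    return len(pages)
```

Calling `pages_to_csv("my_site", "pages.csv")` (both names are placeholders) would produce a file that Excel, Access, or a mySQL import should be able to read. It throws away all formatting, so it only suits a site where plain text per page is enough.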
According to several sites, “converting your current html to php is easier done than said”. They recommend several things.
- Slip PHP in Where Needed
  - Change your page name extensions to php (and where is a batch program that will do that, huh?) and then slap in PHP code around your current HTML code and you have PHP. I assume you will then add PHP content and database information as your site grows, or you slowly change things over. This idea is a nice one but doesn’t answer my specific needs. It is more of a “get by” process.
- Change Your .htaccess File to Recognize HTML as PHP
  - Once you start changing your file extensions, broken and dead link hell appears. Links from inside and outside your web site become lost and broken, crippling your site. Yet it seems that the server can treat HTML files as PHP if it is told to do so. SpiderPro offers a step-by-step explanation of how to change your .htaccess file to recognize all HTML extensions as PHP, as do a Webmasterworld Forum discussion on PHP Server Side Scripting, the Virtualvenus WIKI information on converting to PHP, an article about understanding Apache Servers and Redirects, and the explanations for Apache on addtype.
Basically, it means adding the following two lines to your .htaccess file:
AddType application/x-httpd-php .php .php3 .phtml .html
AddHandler x-httpd-php .html
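Do this at your own risk. The server must be Apache and must allow these directives in .htaccess files. AddType and AddHandler belong to Apache's mod_mime module rather than mod_rewrite, and the exact handler name can vary with how PHP is installed on your host.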
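As for the batch program I grumbled about under the rename-your-extensions suggestion, a few lines of script could do it. This is just a sketch under my own assumptions, in Python with only the standard library. The function name `rename_html_to_php` and its `dry_run` switch are invented for this example. It defaults to a dry run that only reports what it would rename, and it never overwrites an existing `.php` file.

```python
from pathlib import Path


def rename_html_to_php(site_dir, dry_run=True):
    """Rename every .html/.htm file under site_dir to .php.

    Returns a list of (old_path, new_path) pairs. With dry_run=True
    nothing on disk is changed; the list only shows what would happen.
    """
    changes = []
    planned = set()  # targets already claimed in this run
    for old in sorted(Path(site_dir).rglob("*")):
        if not old.is_file() or old.suffix.lower() not in {".html", ".htm"}:
            continue
        new = old.with_suffix(".php")
        if new.exists() or new in planned:
            continue  # never overwrite or double-book a target name
        planned.add(new)
        changes.append((old, new))
        if not dry_run:
            old.rename(new)
    return changes
```

Run it first as `rename_html_to_php("my_site")` (a placeholder folder name) and read the list, then again with `dry_run=False` on a copy of the site. It does nothing about the links inside the pages or from other sites, so the broken link hell described above still applies.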
So I’ll keep working on all of this and let you know if I survive the transformation from HTML to PHP.