Wordpress to Drupal

There are now two programs that convert Wordpress blogs to Drupal.

The first relates to the conversion of Wordpress 2.1/2.2 and b2Evolution 1.9 to Drupal 5.x.

The latest accounts for the conversion of Wordpress 2.5 to Drupal 5.x and 6.x (as of version 1.03).

Select the appropriate manual.

Wordpress 2.1 - 2.2

I hope this guide contains enough information for you to use my conversion program, currently version 5-07, without a hitch.

First things first, BACKUP

We have heard this term many times and we all know what it means. For those who have lost data through mishandling, theft, fire etc. and didn't have a backup, I'm sure stress levels went through the roof. If, on the other hand, you haven't had such a catastrophe, imagine you did last night. How would you feel right now? It happens.

Do not use the conversion program on your one and only Drupal 5.x (target) or Wordpress/b2evolution (source) databases. How much more explicit can I be?

At the end of this entire process you should still have your old source and target databases intact. You should also have a new Drupal 5.x database which contains all the imported information.

General Requirements

  1. An already created and installed Drupal 5.x database.
    • This can either be a freshly installed database which effectively is empty other than User ID 1, the 'super' or 'creator' user of the Drupal database installation; or
    • An already populated Drupal 5.x database. This basically means that posts, users, comments, categories etc have been created, most likely through Drupal itself.

      What I mean by 'most likely' is that you may have started with a clean, fresh, unpopulated Drupal 5.x database. Using this program it is possible to import a Wordpress or b2evolution database numerous times. Obviously, after a first blog is imported, the Drupal 5.x database is no longer a fresh unpopulated database.

  2. A source database, at this point in time, either Wordpress 2.x or b2evolution 1.8.
  3. All databases must be MySQL.

Program - Converter

The latest conversion program (yet to be given a proper name and logo) can be downloaded from here.

Currently, only the Windows version has been tested, though I have compiled a Linux version. A Mac version will be available, when I sort out a few Mac issues.

(As of version 5-04, the Mac version has been tested and found to be in good working order)

Simply select the latest version, which will be at the top of the list (at the download page).

The file is contained within a zip file. You will need to extract that file and save it to a folder of your choosing.

Once saved to a folder of your choosing, simply double-click the file in order to run it. If all goes well you should see something similar to the screen shot above.

Import

The following sections explain the information required in order to successfully import records into a Drupal 5.x database.

Import Form - details required

To access the actual importing form or window click on the 'Continue to Importer Form' button. Another window will appear, as per the screen shot to the left. It is this form that accepts all the information.

Information required

Database type
'wp' - Wordpress (default)
'b2' - b2evolution

This should be obvious: if you want to convert a Wordpress blog then 'wp' should be selected.

Host
If accessing the MySql database remotely, that is, on a hosted server somewhere other than your own computer, you will need to enter the server's IP address. If you are unsure of the IP address, it can be found in your CPanel; failing that, your hosting company can provide it.

If you are accessing the MySql database locally, that is, on your home computer, then 'localhost' should suffice.

Port
The default is '3306'. Regardless of whether you are accessing your MySql database remotely (across the internet) or locally (on your computer) the number will probably be '3306'. If not, your hosting company can provide the correct figure.

Database Name

The name of the MySql database file of either the Wordpress or b2evolution database. If unsure, simply check the relevant 'config' or 'settings' php file from either system.

Username

Enter the 'Username' that has access rights to the database. Again, if unsure, check the 'config' or 'settings' php file from either system for confirmation.

Password

Self-explanatory. If unsure check the 'config' or 'settings' php file from either system for confirmation.

Table Prefix

If your database (Drupal, Wordpress or b2evolution) was set up with table prefixes, then the correct table prefix must be entered. If there are no prefixes then no entry is required.

All the information explained above from 'Host' through to 'Password' is also applicable to the Drupal 5.x database setup settings.

Test Connection

All the database information above must be entered. Clicking on the 'Test Connection' button merely confirms that a connection can be made. If a connection can't be made the user is advised. Unfortunately you will have to work out which piece of information is incorrect.

Preview Permalinks

Clicking this button will open a new window displaying the permalinks, if the source is a Wordpress database. You can't do much with it; it's just for display purposes.

Source Summary

Clicking on this button will merely display the number of records for Posts, Comments, Users, Categories and Post/Category links. You can't do much with it; it's just for display purposes.

If available, import Permalinks

If importing a Wordpress blog and permalinks are active, this is where you choose whether or not to import them. The only choices available are "Yes" or "No". The default is "Yes".

Records to get (per Grab)

This figure sets how many records (posts) are retrieved in one hit. The default is 200 and the allowed range is 10 to 1000. When converting a very large database, pulling everything down in one go can overload the connection, which is probably what causes the program to hang. This setting tells the program to grab a small batch at a time and keep going back for the next batch until the whole lot has been retrieved. For very large databases we don't want 10,000 records coming down the line in one go; it simply doesn't work.
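As a rough illustration of what the 'grab' setting does, the sketch below pages through a table in fixed-size batches. It is not the converter's actual code: it uses Python's built-in sqlite3 module in place of MySQL so that it runs anywhere, and the table name 'posts' and the function name 'fetch_in_batches' are made up for the example.

```python
import sqlite3

def fetch_in_batches(conn, batch_size=200):
    """Yield lists of (id, title) rows, at most batch_size rows per database access."""
    if not 10 <= batch_size <= 1000:
        raise ValueError("batch size must be between 10 and 1000")
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # continue after the last record already seen

# Demo with an in-memory stand-in for the source database (450 posts).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (title) VALUES (?)",
                 [(f"post {n}",) for n in range(450)])

batch_sizes = [len(batch) for batch in fetch_in_batches(conn, 200)]
print(batch_sizes)  # [200, 200, 50]
```

Each pass through the loop is one database access, so memory use stays bounded by the batch size no matter how large the source table is.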

Post Types to be:

Prior to version 5-05 the converter transferred posts and created them as Blog entries in Drupal. Now the user has the option of selecting whether the posts to be transferred are either:
  1. Blog
  2. Page
  3. Story

Start Importing Process

Clicking this button will start the importing process. The process will only start if the correct database connection settings have been input. Hence the reason for the 'Test Connection' buttons.

Upon successful commencement a pop-up window appears which reports the progress of the Users, Categories and Posts being imported.

Importing Progress

If all the correct information has been entered and the 'Start Importing Process' button has been clicked, the whole process starts. A pop-up window appears which merely indicates the progress of the entire operation.

It should be self-explanatory what is being indicated though I'll explain anyway.

You will notice there are two components to the progress indicator. One is the 'bar' indicating percentage done and the other is a figure to the right of the bars. This figure reports the number of records processed per bar.

There are four progress bars:-

  • Categories
  • Users
  • Pages (Record Grab)
  • Posts

Categories and Users will most likely complete fairly quickly, before Pages and Posts start. Most of the time will probably be spent processing Pages and Posts.

As previously explained, 'Pages' merely represents how many times the program is accessing the database in order to retrieve 'x' number of records in one hit. The 'x' number of records by default is 200. So 1 page equals 200 records received.
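The arithmetic behind the 'Pages' figure can be sketched as follows. The function name 'grabs_needed' is made up for this example, and it assumes the final, partly filled grab still counts as one page.

```python
import math

def grabs_needed(total_records, records_per_grab=200):
    """Number of database accesses needed to retrieve every record."""
    return math.ceil(total_records / records_per_grab)

print(grabs_needed(10_000))  # 50
print(grabs_needed(450))     # 3 (200 + 200 + 50)
```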

As records are retrieved from the database they are processed and converted into Drupal. The actual process of converting into Drupal should tick over at a steady speed. Where delays will occur is in the retrieving of the records. Retrieving 200 records from a 10,000 record database will take some time, so be patient.

At the conclusion, and barring any problems, the program merely reports 'Done'.

That's it, you should have successfully converted records into a Drupal 5.x database. All you have to do is check from your browser that everything worked.

There are other areas of concern, such as how images and links were handled in the originating system (blog). This can be addressed by the use of the 'Search and Replace' module.

Search and Replace

To be completed.

Versions

In the following pages I'll attempt to explain the reasons and features of any updates.

5-05

Post Types

Previously, all posts transferred from Wordpress or b2evolution to Drupal were created as blog entries within Drupal. This might not be the preferred option of some users.

It is now possible to select either, Blog, Story or Page as post types that are created within Drupal.

To further clarify, any posts tagged as 'static' within Wordpress are transferred to Drupal and created as a 'Page' type post. This was the case in prior versions but thought I should mention it in the context of this paragraph.

Teasers

Previously the teaser was set to the first 600 characters of the post entry. This could create problems in that closing HTML tags could be missing. This would play havoc on the main page.

This is what happens now. Within Wordpress, if there is any text in the 'post_excerpt' field then that text is transferred to the Drupal 'teaser' field. Wordpress has already stripped out all HTML, so the excerpt will not cause a problem for our Drupal main page.

If there is no text in the 'post_excerpt' field then the entire body of the post is transferred to the teaser field. This is in essence, as if the option regarding teasers had been set to unlimited. This is the safest option.

There is no option for b2evolution. The entire post is transferred to the teaser field.
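The teaser rule described above can be sketched like this. It is only an illustration of the logic, not the converter's code; the function name 'choose_teaser' is made up for the example.

```python
def choose_teaser(body, excerpt=""):
    """Use the excerpt when there is one, otherwise fall back to the whole body."""
    excerpt = excerpt.strip()
    return excerpt if excerpt else body

body = "<p>" + "x" * 700 + "</p>"

# The old approach: cutting at 600 characters loses the closing tag.
old_teaser = body[:600]
print("</p>" in old_teaser)  # False

# The current approach never cuts the HTML.
print(choose_teaser(body) == body)            # True
print(choose_teaser(body, "Short summary"))   # Short summary
```

For b2evolution, which has no excerpt to draw on, the same function applies with the excerpt left empty.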

5-06

Drupal Table Prefix

This version fixes a little problem where, if the target database (Drupal) included a table prefix, the comments data was not transferred across. An error message was displayed advising the user accordingly and, after clicking OK, the conversion continued.

This particular problem caused the effect of not being able to add a node, category, comment or user.

This has been rectified.

Character Encodings

I'm still working on this and hopefully by next release it should be resolved, touch wood.

5-07

Character Encodings

This version addresses the problem of weird characters ending up in the Drupal database, which was the result of incorrect character encodings. For what it's worth, the Wordpress MySql database is encoded as 'Latin1' and the Drupal MySql database is encoded as 'utf8' (unicode).

I conducted some tests and specifically tested against this particular sentence:-

“Mares eat oats–and does eat oats—and…little lambs eat ‘ivy’.”

I created a post in Wordpress, including the above sentence. Previously, after importing, the above sentence contained gibberish. Doing the same again, this time with the current version, the above sentence remained intact.

I've also tested against a few other Wordpress files that I've accumulated from users over the past few months and they all appear to be converting correctly.

Connection and Table Prefix

Previously it was possible to test both the source and target database connections though it didn't account for correct table names (table prefixes). What could occur was that a successful connection could be indicated, then when the system began to process records, it'd crash.

What occurs now is that when testing the connection it also accounts for correct Table Name prefixes. The user is advised one of three result conditions:-

  1. Successful connection.
  2. Successful connection though an incorrect database, check the table prefix.
  3. Unsuccessful connection.
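A minimal sketch of how the three results could be told apart is shown below. How the converter actually performs the check is not documented here, so the function name 'classify_connection' and the list of expected table names are assumptions made for the example.

```python
def classify_connection(connected, table_names, prefix, expected_tables):
    """Return one of the three result messages for a connection test."""
    if not connected:
        return "Unsuccessful connection."
    wanted = {prefix + name for name in expected_tables}
    if wanted.issubset(table_names):
        return "Successful connection."
    return "Successful connection though an incorrect database, check the table prefix."

# Hypothetical table listing from a Wordpress database set up with the prefix 'wp_'.
tables = {"wp_posts", "wp_comments", "wp_users"}
expected = ("posts", "comments", "users")

print(classify_connection(True, tables, "wp_", expected))
print(classify_connection(True, tables, "", expected))
print(classify_connection(False, set(), "wp_", expected))
```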

Future

b2evolution will be removed from the converter as there seems to be no demand for it. So, it gets the chop.

With the removal of b2evolution I'll also incorporate some house cleaning and clean things up a bit.

Problems, Issues and/or Bugs

This section details most of the problems regarding the converter. Why some things simply don't work. Actually, it won't be known initially why some things don't work but at least here I'm highlighting that I'm in fact addressing it, somehow.

Character Encodings

An encoding problem has reared its ugly head and is giving me no end of grief.
At the Wordpress end, type in (or more commonly paste in) this line of text:-

“Mares eat oats–and does eat oats—and…little lambs eat ‘ivy’.”

I'll point out that the double quotes starting and ending the line of text are not double quotes, they are in fact curly quotes. Try and find them on your keyboard. Chucked in for good measure are a few other non-standard type characters.

Within Wordpress it is correctly displayed as above, but at the MySql database level it is stored like so:-

“Mares eat oats–and does eat oats—and…little lambs eat ‘ivy’.”

You will notice that the awkward characters have been replaced by gibberish (technical term for crap characters).

The converter program merely copies data from the Wordpress MySql database and thus regarding the above, copies the line of text containing the gibberish (translated characters of the awkward characters) into Drupal.
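One likely explanation for the gibberish, offered here as an assumption rather than a confirmed diagnosis, is that the text was written as UTF-8 bytes and then read back as Windows-1252 (the character set MySQL calls 'latin1'). The sketch below reproduces the effect for part of the test sentence and shows that reversing the two steps restores it. The closing curly quote is left out because one of its bytes has no character assigned in Python's Windows-1252 codec.

```python
# Part of the test sentence, written with escapes so the characters are unambiguous:
# en dash, em dash, ellipsis and left single quote.
original = "oats\u2013and does eat oats\u2014and\u2026little lambs eat \u2018ivy"

# UTF-8 bytes read back as Windows-1252: each special character becomes three.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Reversing the two steps recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```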

I'm currently in the process of coming up to speed with character encodings and hopefully this will be addressed.

If anybody has any solutions or advice in relation to this topic I'd be more than happy to take it on board.

For those interested I've started a thread in relation to the above over at the Real Basic Forums

b2evolution

Only Posts and Comments at this time are converted. Categories, Users and Permalinks are ignored.

Wordpress

The following information explains what happens when importing a Wordpress 2.0.2 blog.

Users

The steps that take place when importing users.

  1. Prior to import, Usernames are scanned in Wordpress.
  2. The Username about to be imported into Drupal is first checked against Usernames already in Drupal. If the username already exists in the Drupal database then the Wordpress username is prefixed with "u_" and a number.

    • eg. if 'Steve' is a Wordpress username and it is also a username within Drupal then it may become 'u_1_Steve'. If 'u_1_Steve' was already taken (as a result of a previous import) then the Drupal username could be 'u_2_Steve' etc. Within Drupal, no matter how many times Wordpress blogs are imported, usernames will be unique.
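The renaming rule can be sketched as follows. The exact numbering scheme the converter uses is not spelled out above, so this example assumes it simply takes the lowest unused number; the function name 'unique_username' is made up.

```python
def unique_username(name, existing):
    """Return name unchanged if it is free, otherwise the first free 'u_<n>_<name>'."""
    if name not in existing:
        return name
    n = 1
    while f"u_{n}_{name}" in existing:
        n += 1
    return f"u_{n}_{name}"

drupal_users = {"admin", "Steve"}

first = unique_username("Steve", drupal_users)
drupal_users.add(first)
second = unique_username("Steve", drupal_users)  # a later import of the same blog
print(first, second)                             # u_1_Steve u_2_Steve
print(unique_username("Anna", drupal_users))     # Anna
```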

I suppose in the normal course of transferring a Wordpress Blog to Drupal, it's most likely going to be to a clean fresh Drupal installation. If that is the case, one would half assume that all the Wordpress usernames are unique and accordingly they will be transferred across, as is.

Categories

All Wordpress categories are imported into Drupal 5.x and they retain their hierarchical structure.

Within Wordpress it is possible to create sub-categories of categories. I'm not sure how deep they can be nested but that is not a limiting factor.

The converter scans the Wordpress categories and identifies the 'top level' categories. It may be the case that all the Wordpress categories are 'top level', with no sub-categories. If that is the case see the tip at the end of this page.

As a 'top level' category is identified, a Vocabulary is created within Drupal by the same name prepended with a prefix, "v_" and a number. This is to ensure a unique vocabulary name.

A term (category) is then created in Drupal using the same Wordpress category name (no prefixes). This is required as it is possible in Wordpress to link posts to a 'top level' category. Within Drupal, 'top level' categories are Vocabularies and it is not possible to link posts directly to a vocabulary.

If the converter identifies any Wordpress sub-categories, those sub-categories are imported into Drupal, as terms (categories) associated to the correct Vocabulary.
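The mapping described above can be sketched roughly as follows. This is an illustration only: it assumes categories arrive as (id, name, parent id) with a parent id of 0 meaning 'top level', it numbers vocabularies in the order they are found, and for brevity it lists the terms flat rather than keeping their nesting. The function name 'plan_taxonomy' is made up.

```python
def plan_taxonomy(categories):
    """Map Wordpress categories to {vocabulary name: [term names]}."""
    parent_of = {cid: parent for cid, _name, parent in categories}

    def root_of(cid):
        # Walk up the tree until a top-level category is reached.
        while parent_of[cid] != 0:
            cid = parent_of[cid]
        return cid

    vocab_names = {}
    plan = {}
    counter = 0
    for cid, name, parent in categories:
        if parent == 0:
            counter += 1
            vocab_names[cid] = f"v_{counter}_{name}"
            plan[vocab_names[cid]] = [name]  # term with the same name, no prefix
    for cid, name, parent in categories:
        if parent != 0:
            plan[vocab_names[root_of(cid)]].append(name)
    return plan

categories = [(1, "Sports", 0), (2, "Football", 1), (3, "Golf", 1), (4, "News", 0)]
plan = plan_taxonomy(categories)
print(plan)
# {'v_1_Sports': ['Sports', 'Football', 'Golf'], 'v_2_News': ['News']}
```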

Tip

Prior to importing categories into Drupal it may be beneficial to re-visit your existing Wordpress category structure. If there are 15 categories, all 'top level' categories and no sub-categories, then 15 vocabularies are created. Each vocabulary will contain a term (category) of the same name as in Wordpress. This may not be your desired result.

As an example, if you have a sporting theme and created categories such as Football, Squash, Running, Golf etc it may be wise to create another category called 'Sports'. Edit all your other categories to be sub-categories of 'Sports'. Do not link any posts to 'Sports', only link to the sub-categories. During the conversion process, only one vocabulary will be created called "v_1_Sports" and all the other categories will be associated to "v_1_Sports". After conversion simply edit the vocabulary to a name of your choosing.

Permalinks

If Permalinks for Wordpress are enabled the Converter will correctly import them.

The user has a choice here. It may be the case that permalinks are enabled but the importing of them is not required. There is an option that can turn off permalink importing. The default is 'YES' import.

How permalinks work.

At the Wordpress end the permalink structure may be something similar to "/index.php/2007/02/my_post". The actual URL therefore would be something similar to

"www.yoursite.com/index.php/2007/02/my_post".

Two options are available

(1) Install Drupal to the same sub-directory (as per Wordpress) and no other fiddling required.

(2) If Drupal is installed to a higher level directory, i.e. the root, then it's simply a matter of changing your options at the Wordpress end. At Wordpress navigate to 'Options/Permalinks'. You will notice that the 'custom' permalink is already populated with something similar to:-

"/index.php/%year%/%monthnum%/%postname%/".

Simply edit that line by prefixing it with the sub-directory name. Your edited custom value may be similar to this:

"/blog/index.php/%year%/%monthnum%/%postname%/"

Be advised, though, that this permalink will not work for Wordpress; it is only a means to copy the sub-directory name across as part of the permalink. Wordpress will accept the saving of this permalink even though Wordpress will not recognise it from a browser.
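The difference between the two options can be sketched as below. This is an assumption about the resulting paths rather than the converter's code: it treats the imported permalink as a path relative to the Drupal installation, without a leading slash, and the function name 'drupal_alias' is made up.

```python
def drupal_alias(permalink_path, subdirectory=""):
    """Join an optional sub-directory and a Wordpress permalink path into one relative path."""
    parts = [p for p in (subdirectory.strip("/"), permalink_path.strip("/")) if p]
    return "/".join(parts)

# Option 1: Drupal lives in the same sub-directory as Wordpress did.
print(drupal_alias("/index.php/2007/02/my_post"))
# index.php/2007/02/my_post

# Option 2: Drupal lives in the root, so the sub-directory must be part of the path.
print(drupal_alias("/index.php/2007/02/my_post", "blog"))
# blog/index.php/2007/02/my_post
```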

Clean URLs

Make sure to turn on 'Clean URLs' in Drupal. If not, "?q=" will be included in the permalink.

Path Module

By default, upon a new or fresh installation of Drupal, the Path Module will not be enabled. This is not a problem. Permalinks will be recognised even with this module disabled.

With this module disabled, it merely means that it's not possible to edit the permalink. Enable it and editing will be possible.

Posts and Comments

The whole reason why this converter was written.

After each post is imported from Wordpress, its comments, categories (post/category links) and permalinks are imported. The cycle continues until all posts are imported.

If the user elected not to import permalinks then permalinks will be ignored.

If permalinks are to be imported, only non-draft posts will have permalinks.

If, for some unknown reason, there is a duplicate permalink, only the first will be imported. The duplicates will be ignored. At this point the user is not notified of this.

Teasers

Prior to version 5-05 the teaser was set at the first 600 characters of the body of the post. It became apparent that in some cases the teaser was, in a sense, left hanging, in that there were no closing HTML tags. This creates havoc for the main page.

What now occurs is this process. If there is any entry in a Wordpress 'post_excerpt' field then the contents of this field are added to the teaser field. Wordpress, at this point, has already stripped out any HTML. If there is no entry in the excerpt field then the entire body of the post is added to the teaser field. In effect, as if the option setting for teasers had been set to 'unlimited'. At least with this approach there will be no issues of any unclosed HTML tags.

Wordpress 2.5

I hope this guide contains enough information for you to use the conversion program.

The program is a desktop application which converts a Wordpress 2.5 database to an existing Drupal 5.x database and, as of version 1.03, to a Drupal 6.x database.

First things first, BACKUP

We have heard this term many times and we all know what it means. For those who have lost data through mishandling, theft, fire etc. and didn't have a backup, I'm sure stress levels went through the roof. If, on the other hand, you haven't had such a catastrophe, imagine you did last night. How would you feel right now? It happens.

Do not use the conversion program on your one and only Drupal 5.x (target) or Wordpress (source) databases. How much more explicit can I be?

At the end of the entire process you should still have your old source and target databases intact. You should also have a new Drupal 5.x database which contains all the imported information.

For backup procedures check this article. It's a page from the book Changing Hosts - a Dummies Guide.

General Requirements

  1. An already created and installed Drupal 5.x/6.x database.
    • This can either be a freshly installed database which effectively is empty other than User ID 1, the 'super' or 'creator' user of the Drupal database installation; or
    • An already populated Drupal 5.x/6.x database. This basically means that posts, users, comments, categories etc have been created, most likely through Drupal itself.

      What I mean by 'most likely' is that you may have started with a clean, fresh, unpopulated Drupal database. Using this program it is possible to import a Wordpress database numerous times. Obviously, after a first Wordpress database is imported, the Drupal database is no longer a fresh unpopulated database.

  2. A source database, Wordpress 2.5
  3. All databases must be MySQL.

Program - Converter

The latest conversion program (yet to be given a proper name and logo) can be downloaded from here.

Image - Opening Screen Shot

Currently, only the Windows version has been tested, though I have compiled Linux and Mac versions.

The file is contained within a zip file. You will need to extract that file and save it to a folder of your choosing.

Once saved to a folder of your choosing, simply double-click the file in order to run it. If all goes well you should see something similar to the screen shot above.

For Linux installs you will need to set the program's permissions to be 'executable'.

Connect to MySql Databases

Before any importing can be done a connection must first be made to both the Wordpress and Drupal databases. From the main opening window, the connection form is opened either by clicking the 'Continue to Importer Form' button (if a connection hasn't already been established) or by clicking the 'Connect to Databases' button.

Image - Wordpress Connection Screen

All connection details must be entered correctly and connection is established via clicking the 'Connect' button. When successful connections have been established to both databases only then will the 'OK' button become active. Clicking the 'OK' button will progress the user to the 'Importing' screen.

Please be aware that upon successful connection the connection settings are saved in a text file (connect.txt) in the same directory where this program resides. This 'connect.txt' file is insecure in that the passwords are easily determined. If this file is present upon loading then the connection settings are automatically filled in.

The following is a brief description of the information required.

Version

Select the appropriate Wordpress and Drupal versions. As of this writing only Wordpress version 2.5 is available, and for Drupal you have a choice between 5.x and 6.x.

Image - Drupal Connection Screenshot

Host

If accessing the MySql database remotely, that is, on a hosted server somewhere other than your own computer, you will need to enter the server's IP address. If you are unsure of the IP address, it can be found at your hosting administration site (CPanel); failing that, your hosting company can provide it.

If you are accessing the MySql database locally, that is, on your home computer, then 'localhost' should suffice. If 'localhost' doesn't work then try '127.0.0.1'.

Port

The default is '3306'. Regardless of whether you are accessing your MySql database remotely (across the internet) or locally (on your computer) the number will probably be '3306'. If not, your hosting company can provide the correct figure.

Database Name

The name of the MySql database file of either the Wordpress or Drupal database (depending upon whether you're at the 'source' or 'target' connection setting tab). If unsure, simply check the relevant 'config' or 'settings' php file from either system.

Username

Enter the 'Username' that has access rights to the database. Again, if unsure, check the 'config' or 'settings' php file from either system for confirmation. If you are accessing the MySql database files locally, that is, on your computer, and have installed MySql with all default settings, then the odds are that the user name will be 'root' with no password.

Password

Self-explanatory. If unsure check the 'config' or 'settings' php file from either system for confirmation.

Table Prefix

If your database (Drupal, Wordpress) was set up with table prefixes, then the correct table prefix must be entered. If there are no prefixes then no entry is required.

Logging into a Remote Database

Okay, so you know all the connection details in order to connect to remote MySql databases and yet the converter claims it can't make a connection.

In order to connect to your remote MySql database(s) you need to activate the 'allow remote connections' setting at your server. If 'allow remote connections' is switched off, a connection will never be made.

Image - Manage Remote Access

Using CPanel as a guide, navigate to your database section and select "Manage Remote MySql Access". At the following screen, enter your home computer's IP address as an 'Access Host'. Alternatively, entering % as the access host will allow access to your MySql database from any computer, which is simpler but less restrictive.

If you can't find such a setting your hosting provider should be able to help.

Import

In order to arrive at this form a successful connection must have been established. It is from this window that the importing process begins. As of version 1.03 the user now has more options to control how the import should proceed.

Image - Importer Screen Shot

The following briefly describes the various options.

Records to get (per Grab)

This figure sets how many records (posts) are retrieved in one hit. The default is 200 and the allowed range is 10 to 1000. When converting a very large database, pulling everything down in one go can overload the connection, which is probably what causes the program to hang. This setting tells the program to grab a small batch at a time and keep going back for the next batch until the whole lot has been retrieved. For very large databases we don't want 10,000 records coming down the line in one go; it simply doesn't work.

If available, import Permalinks

If permalinks are active, this is where you choose whether or not to import them. The only choices available are "Yes" or "No". The default is "Yes".

In order to preview the permalinks click the 'Preview Permalinks' button. A separate window appears listing all the permalinks per post. Only published posts will have permalinks.

Filter Format

The values stored in these (node & comment) dropdown controls are retrieved from your Drupal database. By default, the options available will probably be 'Filtered HTML' and 'Full HTML'. If there are other filter format types they will be displayed and available to be selected.

If your posts and comments contain numerous HTML tags, styles and attributes then most likely you will want to select 'Full HTML' as under that mode Drupal will correctly render the pages. That is, the HTML tags are adhered to. If choosing 'Filtered HTML' then some of the HTML tags, styles and attributes might not be recognised and the page not displayed as intended.

Empty Drupal Tables

Clicking this button wipes out all posts, comments, categories and users (except user 1) from your Drupal database. This is useful if you wish to start over again and again.

Import

When all configuration items have been dealt with the 'Import' button will become active and clickable. This button starts the importing routine. Upon successful commencement a pop-up window appears which reports the progress of the Users, Categories and Posts being imported.

Post Type to Content Type

Image - Post Type Screen Shot

This is where we associate Drupal content types with Wordpress post types. The left column is populated from your Wordpress database and lists all the post types used and available. Columns two and three relate to Drupal content types and 'Promote to Front Page' settings. Initially, these columns will be empty.

To assign values to a Wordpress post type, click on it in the list. The post type is then displayed in a separate field below the post type column. This field is not editable, but the drop-down boxes to its right are; make your selections there and click the 'Assign' button to assign them to the Wordpress post type.

Image - Post Type Screen Shot

Image - Post Type Screen Shot

Upon assigning values corresponding to the Wordpress post types, they will be displayed in the list. All post types must be accounted for.

Image - Post Type Screen Shot

The Drupal content types are retrieved from your Drupal database. If a particular content type is not present, ie Blog, then it's just a matter of adding the blog module to your system, via Drupal. Likewise, if you prefer other content types, simply create them within Drupal, then return to the importer and they will be available to be selected.

Post Status to Drupal Published

Image - Post Status - Screen Shot

Wordpress uses a number of status types for posts and those status types are retrieved and displayed in the left column of the above two column list. The process of assigning values is as per post types to content types.

These Wordpress status types correspond to the Drupal published setting. Your only choice for each status type is whether it is published or not.

Image - Post Status - Screen Shot

If all the content type, promote to front page and published information has been input then the 'Import' button will become active. At this point the conversion can take place.
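The rule that every post type and status must be accounted for before importing can be sketched like this. The post types, statuses and content type names are example values only, and the function name 'unassigned' is made up; the real lists come from your own Wordpress and Drupal databases.

```python
def unassigned(source_values, assignments):
    """Return the Wordpress values that have not been given a Drupal setting yet."""
    return [value for value in source_values if value not in assignments]

# Hypothetical values as they might be read from a Wordpress database.
post_types = ["post", "page"]
post_statuses = ["publish", "draft", "private"]

# post type -> (Drupal content type, promote to front page)
type_map = {"post": ("blog", True), "page": ("page", False)}
# post status -> published in Drupal?
status_map = {"publish": True, "draft": False}

missing = unassigned(post_types, type_map) + unassigned(post_statuses, status_map)
import_enabled = not missing
print(missing, import_enabled)  # ['private'] False
```

Once 'private' is given a published setting, nothing is missing and the import could go ahead.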

Analyse Wordpress

This particular screen is more for informational purposes and should be self-explanatory.

Image - Analyse Wordpress

Importing Progress

If all the correct information has been entered and the 'Import' button has been clicked, the whole process starts. A pop-up window appears which merely indicates the progress of the entire operation.

Image - Progress Screen Shot

It should be self-explanatory what is being indicated though I'll explain anyway.

You will notice there are two components to the progress indicator. One is the 'bar' indicating percentage done and the other is a figure to the right of the bars. This figure reports the number of records processed per bar.

There are six progress bars:-

  • Categories
  • Users
  • DB Accesses (Record Grab)
  • Posts (Imported)
  • Comments (Imported)
  • Analysing Categories

Categories and Users will most likely complete fairly quickly, before DB Accesses and Posts start. Most of the time will probably be spent processing DB Accesses and Posts and their associated comments and categories.

As previously explained, 'DB Accesses' merely represents how many times the program is accessing the database in order to retrieve 'x' number of records in one hit. The 'x' number of records by default is 200. So 1 DB Access equals 200 records received.

As records are retrieved from the database they are processed and converted into Drupal. The actual process of converting into Drupal should tick over at a steady speed. The delays occur when retrieving the records. Retrieving 200 records at a time from a 10,000 record database means 50 DB Accesses, which will take some time, so be patient.
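To illustrate how 'DB Accesses' relate to the number of records, here is a minimal sketch in Python. It is not the converter's actual code: the names `db_accesses`, `fetch_batches` and `BATCH_SIZE` are made up for illustration, and a plain list stands in for the Wordpress database.

```python
import math

BATCH_SIZE = 200  # default number of records per DB access, as stated above


def db_accesses(total_records, batch_size=BATCH_SIZE):
    """Number of DB accesses needed to retrieve every record."""
    return math.ceil(total_records / batch_size)


def fetch_batches(records, batch_size=BATCH_SIZE):
    """Yield successive batches; each yielded batch stands for one DB access."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


posts = list(range(10_000))  # stand-in for a 10,000 record database
processed = 0
access_number = 0
for access_number, batch in enumerate(fetch_batches(posts), start=1):
    processed += len(batch)  # the conversion into Drupal would happen here

print(access_number, processed)
```

The last batch may hold fewer than 200 records, which is why the count is rounded up.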

At the conclusion and barring any problems the program merely reports 'Done'.

That's it, you should have successfully converted records into a Drupal 5.x/6.x database. All you have to do is check from your browser that everything worked.

There are other areas of concern, such as how images and links should be handled. This can be addressed by the use of the 'Search and Replace' module.

Search and Replace

Caution

If you don't know what you are doing this module can be very dangerous. You have the ability to effectively change all the content in one fell swoop. If you do make a mistake, that shouldn't be such a big problem, more an inconvenience, as you will have a backup (remember - backups). This aspect of the utility is intended for post import procedures, though it could also be used without having imported anything.

Connection

In order to access this window a connection must first be established. Clicking the 'Search and Replace' button from the main opening window will activate this window. If a connection hasn't been established the Connection window will appear first. Upon successful connection (only to the Drupal database) the user is taken to the Search and Replace window.

Image - Search and Replace

Criteria

It should be fairly self-explanatory what happens here. Enter text to be replaced in the 'Find What...' text box. Enter text that is to replace the searched text in the 'Replace with' text box.

Find (Preview First)

Click the 'Find (Preview First)' button and the first matching record will be displayed in both the bottom panels. The left panel will display the original post and the right panel will indicate the result of the replace operation. At this point no changes have occurred, it's merely a preview. Also note, it only shows the first matching record. There could be hundreds of matching records; a message box will advise you of this.

At this point you have the ability to change the criteria, click the 'Find...' button again and peruse the results.

Replace (satisfied - commit)

If, and only if, you are satisfied with the replace operation, click the 'Replace (satisfied - commit)' button. This will make the necessary replacements throughout the entire Drupal database (teasers and full entries).

Possible Use

You may have transferred all your images from an old server to a new server in the whole process of updating to Drupal. The links to your images within your Wordpress entries will be unchanged and will still point to the old location. Such images may be linked like so

<a href="http://yoursite.com/wordpress/wp-content/image-001.jpg"...</a>

If your images have been transferred to 'drupal/files/images/' then the correct link should be

<a href="http://yoursite.com/drupal/files/images/image-001.jpg"...</a>

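To make this change simply enter

href="http://yoursite.com/wordpress/wp-content/

in the 'Find What...' text box, and

href="http://yoursite.com/drupal/files/images/

in the 'Replace with' text box. Note that neither entry has a closing quote, since the file name follows directly after the directory.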

Every matching link in every post will be replaced with the new link information.
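The preview-then-commit idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the utility is implemented: `preview` and `commit` are hypothetical names, a list of strings stands in for the Drupal posts, and the search is assumed to be a plain literal text match. The search strings are the directory prefixes without a closing quote so that they match links that continue with a file name.

```python
OLD = 'href="http://yoursite.com/wordpress/wp-content/'
NEW = 'href="http://yoursite.com/drupal/files/images/'


def preview(posts, find, replace):
    """Return (match_count, first_original, first_replaced); posts are not changed."""
    matches = [body for body in posts if find in body]
    if not matches:
        return 0, None, None
    first = matches[0]
    return len(matches), first, first.replace(find, replace)


def commit(posts, find, replace):
    """Return a new list with the replacement applied to every post."""
    return [body.replace(find, replace) for body in posts]


posts = [
    '<a href="http://yoursite.com/wordpress/wp-content/image-001.jpg">Image 1</a>',
    "A post without any image links.",
    '<a href="http://yoursite.com/wordpress/wp-content/image-002.jpg">Image 2</a>',
]

count, before, after = preview(posts, OLD, NEW)  # nothing is changed yet
print(count)
print(after)

updated = commit(posts, OLD, NEW)  # only once you are satisfied with the preview
```

Keeping the preview separate from the commit mirrors the two buttons: the first only shows what would happen, the second actually changes the content.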

Explanation of Import

The following sections explain the reasons for, and the methods used in, the import process.

Users

The following steps take place when importing users.

Prior to import, Usernames are scanned in Wordpress.

The Username about to be imported into Drupal is first checked against Usernames already in Drupal. If the username already exists in the Drupal database then the Wordpress username is prefixed with "u_" and a number.

For example, if 'Steve' is a Wordpress username and it is also a username within Drupal then it may become 'u_1_Steve'. If 'u_1_Steve' was already taken (as a result of a previous import) then the Drupal username could be 'u_2_Steve' etc. Within Drupal, no matter how many times Wordpress blogs are imported, usernames will be unique.

I suppose in the normal course of transferring a Wordpress Blog to Drupal, it's most likely going to be to a clean fresh Drupal installation. If that is the case, one would half assume that all the Wordpress usernames are unique and accordingly they will be transferred across, as is.
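The renaming rule for clashing usernames can be sketched as follows. This is a minimal Python illustration, not the converter's code: `unique_username` is a hypothetical name, a set stands in for the Drupal users table, and names are assumed to be compared exactly (the real database may compare them without regard to case).

```python
def unique_username(name, existing):
    """Return name unchanged if it is free, else the first free 'u_<n>_<name>'."""
    if name not in existing:
        return name
    n = 1
    while f"u_{n}_{name}" in existing:
        n += 1
    return f"u_{n}_{name}"


drupal_users = {"admin", "Steve"}  # stand-in for usernames already in Drupal

first = unique_username("Steve", drupal_users)   # clashes with 'Steve'
drupal_users.add(first)
second = unique_username("Steve", drupal_users)  # a later import clashes again
drupal_users.add(second)

print(first, second, unique_username("Anna", drupal_users))
```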

Categories

Changes to Wordpress categories are the main reason for the upgrade from Wordpress 2.2 to 2.3, and Wordpress 2.5 is based upon version 2.3. As a result, the actual importing of Wordpress categories is now simpler.

Wordpress 2.3/2.5 introduced vocabularies and terms, similar to the Drupal way of doing things.

By default, Wordpress 2.5 records three vocabularies and within each vocabulary terms (categories) are recorded.

The vocabularies are:

  • category;
  • link_category and
  • post_tag

The 'category' vocabulary is where formal categories (terms) are recorded beforehand. That is, when creating a post the category must already exist; it cannot be created at that time.

The 'post_tag' vocabulary is where 'tags' are freely created at post creation. In effect these are 'free form' categories. In the old days these were referred to as keywords.

The 'link_category' vocabulary relates to URL links that are created and reside in the sidebar of a Wordpress 2.5 page. The terms created within this vocabulary are ignored by the converter. They are simply not relevant to the posts.

All three vocabularies are imported into Drupal, including 'link_category', even though its terms will not be imported. Each vocabulary name is prefixed with "v_" and a number, followed by the Wordpress vocabulary name, e.g. v_1_category or v_2_post_tag. This is to ensure a unique vocabulary name.

All Wordpress categories are imported into Drupal 5.x and they retain their hierarchical structure within their associated vocabulary.

Permalinks

If Permalinks for Wordpress are enabled the Converter will correctly import them.

The user has a choice here. It may be the case that permalinks are enabled but the importing of them is not required. There is an option that can turn off permalink importing. The default is 'YES' import.

How permalinks work.

At the Wordpress end the permalink structure may be something similar to "/index.php/2007/02/my_post". The actual URL therefore would be something similar to

"www.yoursite.com/index.php/2007/02/my_post".

Two options are available

(1) Install Drupal to the same sub-directory (as per Wordpress) and no other fiddling required.

(2) If Drupal is installed to a higher level directory, i.e. root, then it's simply a matter of changing your options at the Wordpress end. In Wordpress, navigate to 'Options/Permalinks'. You will notice that the 'custom' permalink is already populated with something similar to:-

"/index.php/%year%/%monthnum%/%post_name%/".

Simply edit that line by prefixing it with the sub-directory name. Your edited custom value may be similar to this

"/blog/index.php/%year%/%monthnum%/%post_name%/"

Be advised, though, that this permalink will not work for Wordpress; it's only a means of copying across the sub-directory name included in the permalink. Wordpress will accept the saving of this permalink even though Wordpress will not recognise it from a browser.
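To make the effect of option (2) concrete, here is a small Python sketch. It is only an illustration: `prefix_structure` and `expand` are hypothetical helpers, the placeholder names are copied as they appear in the text above, and how the converter itself turns the structure into a Drupal path is not shown here.

```python
def prefix_structure(structure, subdir):
    """Prefix a Wordpress custom permalink structure with a sub-directory name."""
    return "/" + subdir.strip("/") + "/" + structure.lstrip("/")


def expand(structure, post):
    """Fill the %...% placeholders of a structure from a post's fields."""
    path = structure
    for key, value in post.items():
        path = path.replace(f"%{key}%", value)
    return path


structure = "/index.php/%year%/%monthnum%/%post_name%/"
prefixed = prefix_structure(structure, "blog")

# Hypothetical post fields, matching the example URL used earlier.
post = {"year": "2007", "monthnum": "02", "post_name": "my_post"}

print(prefixed)
print(expand(prefixed, post))
```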

Clean URLs

Make sure to turn on 'Clean URLs' in Drupal. If not, "?q=" will be included in the permalink.

Path Module

By default, upon a new or fresh installation of Drupal, the Path Module will not be enabled. This is not a problem. Permalinks will be recognised even with this module disabled.

With this module disabled, it merely means that it's not possible to edit the permalink. Enable it and editing will be possible.

Posts and Comments

The whole reason why this converter was written.

After each post is imported from Wordpress, its comments, categories (post/category links) and permalinks are imported. The cycle continues until all posts are imported.

If the user elected not to import permalinks then permalinks will be ignored.

If permalinks are to be imported, only non-draft posts will have permalinks.

If, for some unknown reason, there is a duplicate permalink, only the first will be imported. The duplicates will be ignored. At this point the user is not notified of this.
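The permalink rules above (optional import, non-draft posts only, first of any duplicates wins) can be sketched like this. It is a minimal Python illustration with made-up names and field names (`permalinks_to_import`, `status`, `permalink`, `id`), not the converter's own code.

```python
def permalinks_to_import(posts, import_permalinks=True):
    """Return {permalink: post_id}, skipping drafts and keeping the first duplicate."""
    aliases = {}
    if not import_permalinks:
        return aliases  # the user elected not to import permalinks
    for post in posts:
        if post["status"] == "draft":
            continue  # draft posts get no permalink
        # setdefault keeps the first post seen for a permalink; later ones are ignored
        aliases.setdefault(post["permalink"], post["id"])
    return aliases


posts = [
    {"id": 1, "status": "publish", "permalink": "/2007/02/my_post"},
    {"id": 2, "status": "draft", "permalink": "/2007/02/unfinished"},
    {"id": 3, "status": "publish", "permalink": "/2007/02/my_post"},  # duplicate
]

print(permalinks_to_import(posts))
```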

Teasers

Teasers are created under two conditions.

If the Wordpress post entry contains the tag <!--More--> a teaser will be created. At the same time the tag <!--More--> is replaced with the Drupal version, <!--break-->.

If there is any entry in a Wordpress 'post_excerpt' field then the contents of this field are added to the teaser field. Wordpress, at this point, has already stripped out any HTML.

If there is no reference to either <!--More--> or the post excerpt field then the entire body of text is added to the teaser field.
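The three teaser cases can be sketched as below. This is a minimal Python sketch under some assumptions of mine, not the converter's actual logic: the More tag is checked before the excerpt, the teaser is taken to be the text before the tag, the tag is matched regardless of case, and `make_teaser` is a hypothetical name.

```python
import re

MORE_TAG = re.compile(r"<!--\s*more\s*-->", re.IGNORECASE)
DRUPAL_BREAK = "<!--break-->"  # Drupal's teaser break marker


def make_teaser(body, excerpt=""):
    """Return (teaser, body) for one Wordpress post."""
    match = MORE_TAG.search(body)
    if match:
        # Case 1: the text before the More tag becomes the teaser,
        # and the tag itself is swapped for the Drupal break.
        teaser = body[:match.start()]
        return teaser, MORE_TAG.sub(DRUPAL_BREAK, body, count=1)
    if excerpt.strip():
        # Case 2: the post_excerpt field supplies the teaser.
        return excerpt, body
    # Case 3: neither is present, so the entire body is the teaser.
    return body, body


print(make_teaser("Intro<!--More-->Rest"))
print(make_teaser("Full text", "Short excerpt"))
print(make_teaser("Full text"))
```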

Comments

If the 'Comment Status' of a Wordpress post is 'open' then the corresponding Drupal comment moderation is assigned as 'Read/Write'. If 'Comment Status' is 'closed' then Drupal comment moderation is assigned as 'Read Only'.
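As a small illustration, the mapping can be written as a lookup table. The sketch below is Python, not the converter's code. The numeric values are those Drupal 5.x/6.x use for a node's comment setting; the function name is hypothetical. Any other Wordpress status raises an error here, since the text above only covers 'open' and 'closed'.

```python
# Drupal 5.x/6.x comment settings for a node.
COMMENT_NODE_DISABLED = 0
COMMENT_NODE_READ_ONLY = 1
COMMENT_NODE_READ_WRITE = 2

STATUS_MAP = {
    "open": COMMENT_NODE_READ_WRITE,
    "closed": COMMENT_NODE_READ_ONLY,
}


def drupal_comment_setting(wp_comment_status):
    """Map a Wordpress 'Comment Status' to a Drupal node comment setting."""
    try:
        return STATUS_MAP[wp_comment_status]
    except KeyError:
        # Other statuses are not covered by the description above.
        raise ValueError(f"unhandled comment status: {wp_comment_status!r}")


print(drupal_comment_setting("open"), drupal_comment_setting("closed"))
```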

Versions

The pages of this chapter merely indicate what changes occurred for each version.

1.03

As promised the Wordpress to Drupal 6.x converter (Version 1.03) is now available for download.

What's new.

  • Conversion to 5.x and 6.x
  • Filter Formats specifically retrieved from Drupal
  • More flexibility re - Content types, Promote to Front page and published
  • Teasers created re - <!--More--> tag
  • More thorough snapshot of the Wordpress database

For full details see the manual.

1.04

Bug fixes and enhancements for Version 1.04.

Item  Issue  Description
1     1      Node Title and Comment Name - now correctly display symbols where before in certain cases HTML code was displayed. A secondary text file is included with the program. It is to reside in the same directory as the program. This file contains the relevant codes to be translated.
2     2      Comment headings now created (now enables comment links).
3     3      Comment thread field now correctly populated with Base 36 number system, previously was Base 10 (decimal). I doubt this was a cause of any dramas, it's just that I recently became aware of it.
4     4      Filter format option for comments. Same as for nodes, which was introduced in 1.03.
5     5      Sub-categories now correctly associated to their parent categories.
6     7      Progress Window. Two additional bars are now included, relating to the import of comments and categories per post. Previously, if a post had many, many comments, the post import seemed to halt, where in fact the converter was cycling through hundreds of comments for the post. Now, at least the user can see that something is happening.
7     8      Recent posts were not showing correctly - they do now.
8     9      Fixed permalink error relating to category based permalinks.
9     10     Account users are now correctly associated to their posts.
10    11     Drupal history table updated.
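For anyone curious about item 3, the difference between Base 10 and Base 36 can be shown with a short Python sketch. It only illustrates the number system: `to_base36` is a hypothetical helper, and the full format of Drupal's comment thread field has more structure than a bare Base 36 number, which is not reproduced here.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(number):
    """Convert a non-negative integer to its Base 36 representation."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return "0"
    out = []
    while number:
        number, remainder = divmod(number, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


for n in (9, 10, 35, 36):
    print(n, to_base36(n))
```

Up to 9 the two systems look the same, which may be why a Base 10 thread value would go unnoticed on posts with few comments.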