Restoring a WordPress Blog

So, as mentioned, this blog has moved and I was able to restore everything except what had been lost on/by the previous web host.  There’s actually a lot of information and mechanisms around for doing this.  These are just a few notes on my process and experience.  Most of this is not for non-coders.  I used a bunch of direct database accessing that’s actually super simple but not for the SQL-faint-of-heart.  For those in the latter category, there are a bunch of WordPress plugins and other tools to do this kind of thing more accessibly.  In particular, I think WordPress itself now has some mechanisms to export and import the blog.  I had already lost/thrown away access to the old blog though and was working from backups, so none of that was accessible.

What You Need

For the most part, you need two things: Your images/uploads, and your database.

Images and other uploads are kept in the wp-content/uploads/ folder under your blog directory.  This pretty much just needs to be tarballed or bzipped up.

You database has a dozen or so tables per blog.  Wordpress has notes on exporting it here.  I had previously used mysqldump pretty similarly to what they describe there to export the entire database into an SQL log/script/backup.  It’ll look something like this:

mysqldump --add-drop-table -h dbserver -u dbuser
          -p dbname | bzip2 -c > database.sql.bz2

This will prompt for your password and then dump everything into a compressed log file.  The database server, user, name, and password can all be found in your wp-config.php file in your blog directory.

Restoring

Bringing back the images is no big deal, you just extract the archive into wp-content/uploads/.  I’ve seen a few notes that WordPress has options for how this folder is organized that you might have to make sure match.  However, I did not see any such options in the current version on a very cursory examination.  The default is to organize into sub-folders by year and month.  Unless you changed that on either the old or new blog, you should be able to just unzip the archive in place and be all set.

The database backup may or may not be just slightly trickier.  If you’re using roughly similar versions of WordPress—that have not had any database schema updates, which are fairly rare—and your blog is set to use the same prefix, then you should be able to simply replay the SQL log.  The command will look something like this:

bunzip2 -c database.sql.bz2 | mysql -u dbuser -h dbserver -D dbname -p

This will again ask for your database password.  Note that if you’re using the same database prefix, this may or may not have problems with duplicate primary keys and such if you already have database entries; you probably do, even just with the default install.

In my case, I was switching database prefixes as the old one didn’t make as much sense in the new context with several blogs operating in the same database.  I also didn’t want to keep any of the old blog themes, widgets, links, options, etc.  Just the posts and comments.

Fortunately, none of the new blog installs used the old prefix.  So, I simply ingested the log script as in the command above.  Then I manually ran a query for each table I wanted to restore, inserting into the new tables.  Now I can drop the old tables again.

At the time of this writing, WordPress is on stable version 3.4.1 and uses 11 tables.  They’re described in detail here.  You can also learn alot snooping around in phpMyAdmin or MySQL, e.g., using the ‘show tables;’ and ‘describe table;’ commands.  Those tables are:

wp_prefix_commentmeta
wp_prefix_comments
wp_prefix_links
wp_prefix_options
wp_prefix_postmeta
wp_prefix_posts
wp_prefix_term_relationships
wp_prefix_term_taxonomy
wp_prefix_terms
wp_prefix_usermeta
wp_prefix_users

My old database did not have a commentmeta table. Hopefully that doesn’t cause problems later, but I don’t think it should matter much. After all, if the old blog didn’t have any metadata for comments, it’s not like there’s anything missing in the new blog…

The links table is just the links in your sidebar blogroll, not links in posts themselves.

The term* tables capture information about both tags and categories.

In my case I only had one user who’d ever written anything and it matched the new single user, so usernames were no work. More complicated situations would require a script to map both the table and every entry.

I specifically only wanted the comments and posts, so I transferred over the comments, postmeta, posts, term_relationships, term_taxonomy, and terms tables. Again, some of these will have entries, even in a new install. If you don’t want to trash the old one, you’ll have to adjust all of the primary keys on one or the other table. In my case I didn’t care about the new install, so I just deleted each table in the new prefix. Something like this command will do so:

delete from wp_newprefix_comments;

At that point you should carefully check that the tables have not changed between WordPress versions. You can examine the schema using phyMyAdmin or the ‘describe table;’ command. If they match, you can just copy them over with a simple command like:

insert into wp_newprefix_comments select * from wp_oldprefix_comments;

If they don’t match, you’ll need to manually map the fields. Note that if you get an error like ‘Column count doesn’t match value count at row 1.’ then you have a schema mismatch; a field was either added or dropped between the relevant WordPress versions. At that point you need to look at the schemas and/or codex and figure out what’s going on.

In my case, I got a mismatch on the posts table.  My old database had a post_category field not present in the new database.  I haven’t looked into this more, but I gather this was a deprecated table in my old database and then actually removed, as even after removing the column my old posts still had categories.

Fields can be mapped over by explicitly enumerating the insert and select columns, e.g., like so:

insert into wp_newprefix_posts (ID,post_author,post_date,post_date_gmt,
post_content,post_title,post_excerpt,post_status,post_password,post_name,
to_ping,pinged,post_modified,post_modified_gmt,post_content_filtered,
post_parent,guid,menu_order,post_type,post_mime_type,comment_count)
select ID,post_author,post_date,post_date_gmt,post_content,post_title,
post_excerpt,post_status,post_password,post_name,to_ping,pinged,
post_modified,post_modified_gmt,post_content_filtered,post_parent,
guid,menu_order,post_type,post_mime_type,comment_count
from wp_oldprefix_posts;

Fixing

Once everything was basically in place, I realized that WordPress (somewhat unfortunately) uses absolute URLs for all of its images and uploads. I didn’t realize it at first because a lot of them were still in my cache, but I just happened to casually look at earlier posts I hadn’t loaded in a long time and saw the problem. Fortunately, this is a fairly simple thing to fix with a command like this:

update wp_newprefix_posts set post_content =
   replace(post_content, 'old.com/blog/',
                         'new.com/fancy-new-blog-location/');

A similar command got rid of some funny character codes that had somehow crept into the posts in a long ago attempt to restore from DB and fix problems on the previous host.

Success!

I had assumed this would be a huge ordeal, but it was actually really straightforward. On a very techy note, I think a lot of that is attributable to a declarative, database oriented approach. If WordPress stored everything in files and I had to remove that post_category field, it would be a much larger hassle.

Onward and upward!