Old November 29th, 2007, 8:24 PM   #1 (permalink)
H
after g, before i
Resident.
 
Joined in Jul 2004
Hosted on Gojira
8,027 posts
Gave thanks: 48
Thanked 129 times
URL Thoughts

I've been in the process of rewriting my blog software for about 18 months. It's been started about 6-7 times in two languages (PHP and Ruby). In my current go, I have a nice portion of the frontend complete after about 12 hours of work.

Here's my question. I'm going to be migrating my prior content over, but a change in the database schema is going to merge 4 tables into one. The problem is that previously I was specifying the type and ID of the resource in the URL. The type would determine the table and the ID would determine the resource to fetch. In my new schema, type is irrelevant.

So my concern is that when I migrate to the new schema, all of my URLs will break. IDs won't be consistent with the previous schema.

Here's what I'm considering to solve the URL breakage (and by considering, I mean these would work, but not necessarily be nice).

1. Add two columns and populate corresponding old ID and type. (This is bad because it's adding weight to new entries)
2. Add a look-up table with fields for the old type, old id, and the new ID. (This is super simple to do)
3. Write a big ass mod_rewrite to redirect to the new article (Very clean, but time-consuming and lacking fun... I might be able to automate it)
4. Say screw it and have all old URLs broken. (Pretty much not an option)
5. Change my new model to conform with the old schema. (Ugly and unfavourable -- especially considering I want to remove the ID from the URLs and go with a URL friendly title)
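For illustration, option 2 could be sketched roughly like this. It is only a sketch in Python with SQLite, not the actual blog code: the table names (`articles`, `legacy_ids`), the column names, and the old type names used in the usage note are all made up, since the real schema isn't shown here.

```python
import sqlite3

def migrate(conn, old_rows):
    """Copy old rows into one merged table and record old (type, id) -> new id.

    old_rows is an iterable of (old_type, old_id, title) tuples; these are
    hypothetical stand-ins for rows read from the four old tables.
    """
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE legacy_ids ("
        " old_type TEXT NOT NULL,"
        " old_id INTEGER NOT NULL,"
        " new_id INTEGER NOT NULL REFERENCES articles(id),"
        " PRIMARY KEY (old_type, old_id))"
    )
    for old_type, old_id, title in old_rows:
        # The merged table hands out a fresh ID for every migrated row.
        cur = conn.execute("INSERT INTO articles (title) VALUES (?)", (title,))
        conn.execute(
            "INSERT INTO legacy_ids (old_type, old_id, new_id) VALUES (?, ?, ?)",
            (old_type, old_id, cur.lastrowid),
        )
    conn.commit()

def lookup_new_id(conn, old_type, old_id):
    """Return the new article ID for an old (type, id) pair, or None."""
    row = conn.execute(
        "SELECT new_id FROM legacy_ids WHERE old_type = ? AND old_id = ?",
        (old_type, old_id),
    ).fetchone()
    return row[0] if row else None
```

Calling `migrate(conn, [("post", 1, "A"), ("link", 1, "B")])` and then `lookup_new_id(conn, "link", 1)` would give back the new ID of the second row. The new `articles` table carries no legacy columns, which is the point of option 2 over option 1.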

My concerns are mostly to do with PageRank and incoming links. I'm not overly bothered by losing a bit of PageRank for individual articles and stuff, but I really don't want incoming links from other sites to break. It's going to be tricky.

Anyone have any suggestions that I haven't covered? Suggestions on which method I should take? Or have you done a migration before that changed URLs? What happened and how did you handle broken links?
Old November 29th, 2007, 8:28 PM   #2 (permalink)
ceo
\(^o^)/
Super #1
 
Joined in Jan 2005
Lives in Albany, NY
Hosted on SH134
1,522 posts
Gave thanks: 69
Thanked 33 times
I did this type of migration a few years ago and went with the mod_rewrite method. It was clunky to write but in the end I found it to be the fastest method and the smoothest transition to keep the old and use the new.

And then, of course, it all became a moot point anyway because I changed domains about a year after. :p
Old November 29th, 2007, 10:48 PM   #3 (permalink)
H
Approximately how many rewrites did you have to make? I'm looking at around 600+. I'm kind of curious as to how server intensive mod_rewrite is at that volume. I think if I do it that way, I'll order them by views logged.

Ultimately I think mod_rewrite is going to be the best as I can at least redirect with a 301 status code. That way search engines will update more gracefully, and it's only up to other linking sites to update.
Old November 30th, 2007, 12:49 AM   #4 (permalink)
ceo
Hmm, I honestly can't remember. I think I might have the htaccess from back then somewhere; I think the backup of that site managed to survive my computer crashing a few years ago. (Couldn't have been the 2,000 photos or my music, no...) I'll have to hunt for it. I know my htaccess was pretty hefty back then, but the mod_rewrite was maybe 1/4 of it. I think it handled about 250 posts and it definitely wasn't 250 lines of code.

I'll try and search it up tomorrow -- now is time for bed.

Last edited by ceo; November 30th, 2007 at 12:50 AM.
Old November 30th, 2007, 1:06 AM   #5 (permalink)
H
Muchly appreciated. I did some Googling and it seems mod_alias might actually be better (I'm doing straight URL redirects without regex). So pending server support I'll probably take that route. But even if I was using mod_rewrite, I saw cases of people using over 2,000 rules with minimal impact on performance.
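As a rough idea of how the mod_alias route could be automated, here is a minimal Python sketch that turns a list of old-to-new path pairs into `Redirect 301` lines, most-viewed first. The URL shapes (`/post/1`, `/articles/a`), the `example.com` host, and the view counts are invented for illustration. It also assumes the old URLs carry the type and ID in the path: mod_alias matches on the URL path and not the query string, so query-string URLs would need mod_rewrite instead.

```python
def redirect_lines(mapping, site="http://example.com"):
    """Build one mod_alias `Redirect 301` line per old URL path.

    mapping is a list of (old_path, new_path, views) tuples; the paths
    and the site name are hypothetical placeholders.
    """
    # Sort so the most-viewed old URLs come first in the generated config.
    ordered = sorted(mapping, key=lambda m: m[2], reverse=True)
    return [f"Redirect 301 {old} {site}{new}" for old, new, _views in ordered]


if __name__ == "__main__":
    sample = [
        ("/post/1", "/articles/a", 5),
        ("/link/1", "/articles/b", 50),
    ]
    # Paste the printed lines into .htaccess or the vhost config.
    print("\n".join(redirect_lines(sample)))
```

The mapping itself could come straight out of the migration script, so the 600+ lines never have to be written by hand.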
Old November 30th, 2007, 9:28 AM   #6 (permalink)
FredFredrickson
Senior Member
Super #1
Joined in Nov 2003
Lives in New Hampshire
1,182 posts
Gave thanks: 3
Thanked 22 times
I'd personally go the extra field direction. That's just me.
__________________
The Coding Blog - Follow along as we discover and discuss everything it takes to code an entire website, start to finish! [Latest Entry: 4/4/08 - Starting a Website]
Old November 30th, 2007, 5:00 PM   #7 (permalink)
H
Quote:
Originally Posted by FredFredrickson View Post
I'd personally go the extra field direction. That's just me.
Any reason for that? It's going to require four controllers to handle the older URL structures. And for every hit, it's querying the database. It'll still be able to 301 redirect (on the plus side, that'd be dynamic), but it seems more taxing on the server. I think if I was to go any route with the database, I'd do the look-up table. It's just a really, really bad idea to have fields you know will always be populated with 0's after the initial migration.

I'm currently only seeing options 2 and 3 (with mod_alias instead of mod_rewrite) as the ones worth doing. It's just a matter of balancing system resource usage, portability and time to create, etc... 3 is probably the best if I can automate it.
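For comparison, the database-backed version of option 2 might look something like the following Python sketch. It assumes a hypothetical `legacy_ids` look-up table and an `articles` table with a `slug` column (the URL friendly title); none of these names come from the real schema, and the `/articles/` prefix is invented.

```python
import sqlite3

def legacy_redirect(conn, old_type, old_id):
    """Return (status, headers) for a request to an old-style URL.

    Costs one query per hit against the hypothetical legacy_ids table.
    """
    row = conn.execute(
        "SELECT a.slug FROM legacy_ids AS l"
        " JOIN articles AS a ON a.id = l.new_id"
        " WHERE l.old_type = ? AND l.old_id = ?",
        (old_type, old_id),
    ).fetchone()
    if row is None:
        # Unknown old URL: nothing to redirect to.
        return 404, {}
    # Permanent redirect, so search engines can update their index.
    return 301, {"Location": "/articles/" + row[0]}
```

Because the old type is just a parameter here, a single handler could in principle cover all four old URL structures, at the cost of the per-hit query mentioned above.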
Old November 30th, 2007, 8:19 PM   #8 (permalink)
FredFredrickson
Just preference, I don't have any real good reasons.