We have to update TFK soon and part of that involves updating the PostgreSQL version. I’m anticipating problems because this is an imported database and there may be duplicate rows. Does anyone (@TreatyStones, or whoever else talked about this stuff before, I forget who) have any background in this area?
I’m confident I can do most of this, but it’d be handy to have someone who could help in case of downtime.
And I’d rather avoid it, but the longer I avoid it, the further we risk falling behind support cycles etc., which is just too painful.
Thanks, it will just be if I’m stuck. I think getting to the new version will be OK, but then search and other things might not work if the database has integrity issues, which I expect, because it’s an import with various migrations and imports over the years that have probably been imperfect.
OK. When you say imported, do you mean migrated? Like from a different database to Postgres?
It should be fine if the version you are upgrading to is backwards compatible with the version you are on. Any integrity issues should have manifested already, unless the new version changes the structure of queries.
What may happen is that you can’t take advantage of the new features of the new version.
Hard to know without specifics. I’d PM them. Even putting it out there that this place is backed by Postgres wouldn’t be the best idea, mate.
Ah it’s obvious what it runs on for anyone with any interest.
Yeah, it has been all over the place. I’d say we’re on the 7th platform for TFK (but we’re stable here). Mostly MySQL in the past, but with migrations from one MySQL to another MySQL to fit different applications. And then the big lift and shift to Postgres.
I’d have thought it would be fine, but most sites with a large volume of posts have encountered issues. And we have a large volume and a complex history, so I’ll be doing well to avoid integrity issues.
But hopefully I’ll be able to get it across and then worry about reindexing afterwards if there are any integrity issues. That’s fine. My main concern is that there are bigger integrity issues that cause a failure, and that I try to write queries to drop duplicate rows and make a mess of them.
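For what it’s worth, the usual pattern for dropping exact duplicates is to keep one row per duplicate group (say, the lowest id) and delete the rest. A minimal sketch, using SQLite to stand in for Postgres so it can be tried locally; the `posts` table and its columns are made up, and in Postgres you’d key on an `id` column or `ctid` rather than SQLite’s `rowid`:

```python
import sqlite3

# SQLite standing in for Postgres; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author_id INTEGER, body TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, "hello", "2020-01-01"),
        (1, "hello", "2020-01-01"),   # exact duplicate
        (2, "unique post", "2021-06-15"),
    ],
)

# Keep the lowest rowid in each duplicate group, delete the rest.
conn.execute(
    """
    DELETE FROM posts
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM posts
        GROUP BY author_id, body, created_at
    )
    """
)
remaining = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(remaining)  # 2
```

Worth dry-running the inner `SELECT` on its own first to confirm it matches what you expect to keep, before letting the `DELETE` loose.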
Here’s what I would do (given no time and resource constraints):
I’d take a snapshot of the current DB, lift and shift to a test server, leave this one running.
Upgrade on the test server and point TFK at the upgraded version to find the bugs. If there are showstoppers, revert to the old version, fix the bugs on the new version, and you can then point TFK at the new version again.
Worst case you’d have two db servers active for a day or two.
Minimal downtime and a fallback
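The snapshot step above usually boils down to a `pg_dump` from live and a `pg_restore` onto the test box. A rough sketch that just builds the command lines rather than executing them; the hostnames, database name, and dump path are placeholders, and the flags assume the standard client tools (`-Fc` for the custom archive format, `-j` for parallel restore jobs):

```python
# Build (not run) the dump/restore commands for the lift-and-shift step.
# All hosts, names, and paths below are placeholders.

def build_snapshot_commands(src_host: str, test_host: str, dbname: str,
                            dump_file: str = "/tmp/tfk.dump",
                            jobs: int = 4) -> list[list[str]]:
    """Return the pg_dump and pg_restore command lines as argument lists."""
    dump = ["pg_dump", "-h", src_host, "-Fc", "-f", dump_file, dbname]
    restore = ["pg_restore", "-h", test_host, "-d", dbname,
               "-j", str(jobs), dump_file]
    return [dump, restore]

cmds = build_snapshot_commands("live-db.example", "test-db.example", "tfk")
for cmd in cmds:
    print(" ".join(cmd))
```

You’d feed each list to `subprocess.run` (or just paste the printed lines into a shell) once the test server exists; building them first makes it easy to eyeball the flags before touching anything.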
That’s broadly the approach. I will certainly take the backup. But I’ll try a quick and easy upgrade on live first to see if it works, and if not, revert.
But you’re talking hours for these transfers, the best part of a day. TFK is not a small DB, so we get time gaps if we need to revert to the backup. That’s alright as a fail-safe, though. But trying to load the DB risks timeouts and CPU overloads. It’s a monster of a task.
Anyway, that’s what’s required. Just hoping the quick upgrade without resorting to backup works.
Christ it makes my skin crawl to hear anything being tried on live
Would you not even do a test with a subset of the database first, old posts and new? There’s a big chance you’ll pull an integrity issue. Spin up the app on a test server and chance the upgrade? Do a quick sanity check of the features.
You get the confidence of having done it before you pull the trigger
There is a fallback, which is that I create the app on a new server which has an already-upgraded DB, and I restore the DB contents to that. That will work.
But the last big server move I did, about 9 months ago, took over 24 hours to run, and I didn’t know it would work. It’s a horrific process, and I’d rather take my chances on the live version.
(I’m happy I can keep live working by reverting if I need to.)
Drop me a PM if you need any help. I’ve no experience with that technology, but @EstebanSexface’s approach is the safest one from a quality perspective.
If duplicate rows exist, they are a bitch to get rid of or ignore, but queries could maybe be updated with a specific date range? That could slow performance, though.
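If deleting turns out to be too risky, duplicates can be ignored at query time instead: collapse each duplicate group with `GROUP BY` and restrict to a date range, as suggested. Another sketch with SQLite standing in for Postgres (schema hypothetical again; in Postgres `SELECT DISTINCT ON (...)` is another option):

```python
import sqlite3

# SQLite standing in for Postgres; "posts" and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author_id INTEGER, body TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, "old dupe", "2015-03-01"),
        (1, "old dupe", "2015-03-01"),
        (2, "recent post", "2024-02-10"),
        (3, "recent dupe", "2024-05-01"),
        (3, "recent dupe", "2024-05-01"),
    ],
)

# One row per duplicate group, restricted to the date range of interest.
rows = conn.execute(
    """
    SELECT author_id, body, created_at
    FROM posts
    WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
    GROUP BY author_id, body, created_at
    ORDER BY created_at
    """
).fetchall()
print(len(rows))  # 2
```

The date filter also keeps the scan small, which matters on a big table; the performance worry above is about exactly this kind of extra grouping work.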