My blog goes back to MT 2.661. My character encoding is set to iso-8859-1, which I think was the default at the time.
In MT 3.2, or thereabouts, the default changed to utf-8. When I made that upgrade, all sorts of characters were wrong and I added "PublishCharset iso-8859-1" to fix it. So far so good.
Now, I'm having an issue with the blip.fm Action Stream where I'm getting goofy character displays because the stream is UTF-8 and my blog is iso-8859-1.
I want to change over my blog's encoding to utf-8, but simply changing the config directive isn't going to work. I'll be back to odd characters again. How can I make this switch?
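To make the symptom concrete, here is a small Python sketch of both directions of the mismatch. The sample word is made up for illustration and is not taken from the actual blip.fm stream or the blog's entries.

```python
# UTF-8 text (as the stream sends it) shown on a page declared as ISO-8859-1.
stream_text = "café"                                # hypothetical sample text
utf8_bytes = stream_text.encode("utf-8")            # é becomes two bytes: C3 A9
shown_as_latin1 = utf8_bytes.decode("iso-8859-1")   # each byte is read as one character
print(shown_as_latin1)

# The reverse: an old ISO-8859-1 entry served on a page declared as UTF-8.
latin1_bytes = stream_text.encode("iso-8859-1")     # é is the single byte E9
shown_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
print(shown_as_utf8)
```

The first print shows the familiar two-character garbage in place of the accented letter, and the second shows a replacement character, which is roughly what happens to old entries if only the config directive is flipped.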
Reported on Movable Type 4.2
BTW - You can see my blip.fm stream oddities on my Sandbox blog here.
I have this near the top of my header:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Hi Doug -
The issue is going to be one of data. Namely, do you have funky iso-8859-1 data in your database? I made the switch a while back (3.2 maybe) and didn't really have a problem, because I don't typically cut and paste into MT.
But I realize that some do - specifically from Word or something - and they have those fancy curly quotes, and those will cause your problem.
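A short Python sketch of why the curly quotes are the troublemakers, assuming the pasted text arrived as Windows-1252 bytes (the usual case for text copied from Word on Windows, though that is an assumption here). The sample string is hypothetical.

```python
# Bytes 0x93 and 0x94 are the curly double quotes in Windows-1252.
pasted = b"\x93quoted\x94"                  # hypothetical pasted text

as_cp1252 = pasted.decode("cp1252")         # read as Windows-1252: real curly quotes
as_latin1 = pasted.decode("iso-8859-1")     # read as ISO-8859-1: invisible control characters
print(repr(as_cp1252))
print(repr(as_latin1))

# Curly quotes have no slot in ISO-8859-1 at all, so they cannot be stored as true Latin-1.
try:
    as_cp1252.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
print(fits_latin1)
```

So a database that is nominally iso-8859-1 but contains curly quotes is really holding Windows-1252 bytes, which matters when picking the source encoding for a conversion.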
There are some resources online that you can find for converting database content from one format to another, but it ultimately is going to depend on whether or not the data itself looks good. For instance, if you look at your blip data in the database, does it look okay? Is it only on the output that it gets messed up? If so, then you are probably alright. But if it's actually bad in the database, then you aren't going to be able to do much with it.
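One rough way to answer the "does it look okay in the database" question is to test the raw bytes of a field rather than trusting how a client displays them. This is only a sketch: `classify` is a hypothetical helper, and fetching the raw bytes out of the database is left out because it depends on your setup.

```python
def classify(raw: bytes) -> str:
    """Report whether raw bytes are plain ASCII, valid UTF-8, or neither."""
    try:
        raw.decode("ascii")
        return "ascii"          # safe in either encoding, nothing to convert
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"          # already valid UTF-8
    except UnicodeDecodeError:
        return "not utf-8"      # likely ISO-8859-1 or Windows-1252 data


# Hypothetical sample values standing in for database fields.
print(classify(b"plain text"))
print(classify("café".encode("utf-8")))
print(classify(b"caf\xe9"))
```

Fields that come back as "not utf-8" are the ones that would need converting before the blog is switched over.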
The blip.fm data displayed is from Action Streams. Does AS data get pulled into the database? If so, where do I look at it?
I don't typically create entries outside of MT and copy and paste into the entry creation screen. I may have when I started blogging, but I know better now. I do occasionally copy and paste text from other sources, which I suppose might cause problems.
If I change the encoding in my browser, my blip.fm stream looks fine. However, doing the same thing and looking at this post, the quotes don't work. Looking at the actual entry, I have 'curly' quotes. I must have copied and pasted from the blog that tagged me on that post. :-(
I know that I did have issues when I went to 3.2, so I likely have DB issues. I checked the above post in the DB and the curly quotes are slanted.
So, does that mean my DB is 'bad', or just encoded in one format when I need another? What do I need to do, or where do I go, to 'fix' or convert it? How do I prevent this in the future? I guess I don't quite understand what I did wrong, or if I did anything wrong.
I wrote a few tips on converting a Movable Type blog from ISO-8859-1 to UTF-8 a while back.
You may run into a few problems though:
- MySQL hasn't always had a consistent way of dealing with character encodings, so it depends on what version of the engine and client libraries you're using (which you haven't mentioned).
- You may have a mix of Latin1 and Unicode characters present in the same database, and that's where it's going to be tricky. So my tips cited above may not apply as is.
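For the mixed case, a common heuristic is to convert value by value: leave anything that is already valid UTF-8 alone and treat the rest as Windows-1252 (which covers ISO-8859-1 text plus the curly quotes). The Python sketch below shows the idea only. `to_utf8` is a hypothetical helper, not part of Movable Type or MySQL, and it can guess wrong for Latin-1 text whose bytes happen to form valid UTF-8, so test on a backup first.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw re-encoded as UTF-8, guessing the source encoding."""
    try:
        raw.decode("utf-8")
        return raw                          # already valid UTF-8, keep untouched
    except UnicodeDecodeError:
        pass
    try:
        text = raw.decode("cp1252")         # Latin-1 text plus Word-style curly quotes
    except UnicodeDecodeError:
        text = raw.decode("iso-8859-1")     # fallback: every byte maps to a character
    return text.encode("utf-8")


# Hypothetical field values: one already UTF-8, one Latin-1, one with curly quotes.
for value in ("café".encode("utf-8"), b"caf\xe9", b"\x93hi\x94"):
    print(to_utf8(value).decode("utf-8"))
```

Running the same function twice over a value is harmless, since the output of the first pass is valid UTF-8 and is returned unchanged on the second.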
Another thing I learned recently (the hard way, with lots of non-billable days): do not count on phpMyAdmin to help you with converting encodings in MySQL. It will NOT work, whatever you try.