La Vita è Bella

Friday, January 26, 2007

MySQL collation charset problem and patch for NucleusCMS and WikkaWiki

First of all, I must say MySQL sucks!

As from MySQL 4.1, if the database default charset is UTF8 but the default charset setting in my.cnf is not UTF8, you must do a "SET NAMES UTF8" query after connection, to make sure that in this mysql connection, all query will be handled as UTF8. otherwise seems that it will use latin1 as default.

If the "SET NAMES" isn't queried, the articles and comments won't be malformed, but the data that stored in MySQL database isn't actually in UTF8 encoding, so that for example mysqldump, or phpmyadmin, won't get the right data.

So what we need to do is check database's charset setting, and send a "SET NAME" query to ensure we're using the database's charset setting.

I've found that LifeType did this thing, so I copied its detecting code and made patches for NucleusCMS and Wikka Wiki (and also my custom homepage for LifeType, of course!).

But before downloading the patches, I should tell you that:

First, you MUST convert your database. Cause before you patch NucleusCMS or Wikka Wiki, your articles, comments, etc. that already posted, was stored in the old way in the db.

To do that, follow the following steps (this is in case the my.cnf indicate "latin1" but your database was in "utf8", if you're using other charsets, modify them):

  1. mysqldump -h host -u username -ppassword database --default-character-set=latin1 > dump.sql
  2. Edit dump.sql, on Line 10, replace "latin1" with "utf8"
  3. mysql -h host -u username -ppassword database < dump.sql

Second, you must ensure that in your database, ALL charset settings are set to "utf8". The word "ALL" means databases, tables and fields.

Finally, here're the patches: for nucleus, for wikka, released under GPL. WARNING: Use it on your own risk, and make backup before patching.



tags: , , , , , , , , ,

22:47:17 by fishy - opensource - Permanent Link

Revision: 1.3/1.3, last modified on 2007-01-26 @ 00:21.

Karma: 20 (57.35% out of 136 were positive) [+/-]

You can subscribe to RSS 2.0 feed for comments and trackbacks

Trackbacks:
There are currently no trackbacks for this item.
Use this TrackBack url to ping this item (right-click, copy link target). If your blog does not support Trackbacks you can manually add your trackback by using this form.

No comments yet

Add Comment

 

May the Force be with you. RAmen