MySQL Character Encoding

I ran into some issues with a project where I was developing on one computer, pushing changes to a test server, while the production data was on another server, with a lot of different platforms and different versions of software. When I would do a backup from production to dev the character encoding was coming out wrong.

What I discovered was that running mysqldump and redirecting the output to a file can result in the terminal’s character encoding reinterpreting the output, and that the dump file from one version/platform of MySQL was not creating the new database with same character encoding.

My fix was to do the following:

mysqldump -u username -p -c -e –default-character-set=utf8 –single-transaction –skip-set-charset –add-drop-database -B database -r dump.sql

Then run the following:

sed -e ‘s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/g; s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g’ -i.bak dumpfile.sql

After the export and conversion you can run

mysql -u username -p –default-character-set=utf8 database
mysql> SOURCE dumpfile.sql