Asked  7 Months ago    Answers:  5   Viewed   47 times

One of the responses to a question I asked yesterday suggested that I should make sure my database can handle UTF-8 characters correctly. How I can do this with MySQL?

 Answers

53

Update:

Short answer - You should almost always be using the utf8mb4 charset and utf8mb4_unicode_ci collation.

To alter database:

ALTER DATABASE dbname CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

See:

  • Aaron's comment on this answer How to make MySQL handle UTF-8 properly

  • What's the difference between utf8_general_ci and utf8_unicode_ci

  • Conversion guide: https://dev.mysql.com/doc/refman/5.5/en/charset-unicode-conversion.html

Original Answer:

MySQL 4.1 and above has a default character set of UTF-8. You can verify this in your my.cnf file, remember to set both client and server (default-character-set and character-set-server).

If you have existing data that you wish to convert to UTF-8, dump your database, and import it back as UTF-8 making sure:

  • use SET NAMES utf8 before you query/insert into the database
  • use DEFAULT CHARSET=utf8 when creating new tables
  • at this point your MySQL client and server should be in UTF-8 (see my.cnf). remember any languages you use (such as PHP) must be UTF-8 as well. Some versions of PHP will use their own MySQL client library, which may not be UTF-8 aware.

If you do want to migrate existing data remember to backup first! Lots of weird choping of data can happen when things don't go as planned!

Some resources:

  • complete UTF-8 migration (cdbaby.com)
  • article on UTF-8 readiness of php functions (note some of this information is outdated)
Tuesday, June 1, 2021
 
iftheshoefritz
answered 7 Months ago
89

This may be a job for the mb_detect_encoding() function.

In my limited experience with it, it's not 100% reliable when used as a generic "encoding sniffer" - It checks for the presence of certain characters and byte values to make an educated guess - but in this narrow case (it'll need to distinguish just between UTF-8 and ISO-8859-1 ) it should work.

<?php
$text = $entity['Entity']['title'];

echo 'Original : ', $text."<br />";
$enc = mb_detect_encoding($text, "UTF-8,ISO-8859-1");

echo 'Detected encoding '.$enc."<br />";

echo 'Fixed result: '.iconv($enc, "UTF-8", $text)."<br />";

?>

you may get incorrect results for strings that do not contain special characters, but that is not a problem.

Wednesday, March 31, 2021
 
keisar
answered 9 Months ago
22

Alter the table to change the column character set to Latin-1. You will now have singly-encoded UTF-8 strings, but sitting in a field whose collation is supposed to be Latin-1.

What you do then is, change the column character set back to UTF-8 via the binary character set - that way MySQL doesn't convert the characters at any point.

ALTER TABLE MyTable MODIFY MyColumn ... CHARACTER SET latin1
ALTER TABLE MyTable MODIFY MyColumn ... CHARACTER SET binary
ALTER TABLE MyTable MODIFY MyColumn ... CHARACTER SET utf8

(is the correct syntax iirc; put the appropriate column type in where ... is)

Wednesday, March 31, 2021
 
aslum
answered 9 Months ago
72

You have to define the connection to your database as UTF-8:

// Set up your connection
$connection = mysql_connect('localhost', 'user', 'pw');
mysql_select_db('yourdb', $connection);
mysql_query("SET NAMES 'utf8'", $connection);

// Now you get UTF-8 encoded stuff
$query = sprintf('SELECT name FROM place where id = 1');
$result = mysql_query($query, $connection);
$result = mysql_fetch_assoc($result);
Friday, July 23, 2021
 
alioygur
answered 5 Months ago
28

I think it might have something to do with collation as well, but I'm not sure. In my case it certainly did, since I had to support cyrillic.
Try this, worked for me:

  1. Set initial collation while creating the target database to utf8_general_ci

  2. Add SET NAMES 'utf8' COLLATE 'utf8_general_ci'; to the top of your sql file

  3. Run mysql -u root -p --default-character-set=utf8 yourDB < yourSQLfile.sql

One more thing, in order to properly get the UTF-8 data form your database, you'll have to modify your connection string as well. For example:

mysql.url=jdbc:mysql://localhost:3306/nbs?useJvmCharsetConverters=false&useDynamicCharsetInfo=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&useEncoding=true

Additionally, take a look at what my problem was.

Sunday, August 8, 2021
 
Anton Barinov
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :  
Share