Asked  7 Months ago    Answers:  5   Viewed   28 times

Is there a SQL or PHP script that I can run that will change the default collation in all tables and fields in a database?

I can write one myself, but I think this should be something that is readily available at a site like this. If I come up with one before somebody posts one, I will post it myself.

 Answers

21

Be careful! If you actually have UTF-8 data stored under another encoding, you could have a real mess on your hands. Back up first. Then try some of the standard methods:

for instance http://www.cesspit.net/drupal/node/898 http://www.hackszine.com/blog/archive/2007/05/mysql_database_migration_latin.html

I've had to resort to converting all text fields to binary, then back to varchar/text. This has saved my ass.

I had data that was UTF-8, stored as latin1. What I did:

Drop indexes. Convert fields to binary. Convert to utf8_general_ci.
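The binary round trip in those steps can be sketched as a small statement generator. This is only a sketch under assumptions: the table and column names are hypothetical, the helper `binary_round_trip` is made up for illustration, and it only builds SQL strings without touching a database.

```python
# Hypothetical helper: builds the two ALTER statements for one column.
# Each text type is paired with the binary type used as the intermediate step.
BINARY_EQUIVALENT = {
    "CHAR": "BINARY",
    "VARCHAR": "VARBINARY",
    "TINYTEXT": "TINYBLOB",
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}


def binary_round_trip(table, column, col_type, length=None,
                      charset="utf8", collation="utf8_general_ci"):
    """Return [to_binary_sql, to_text_sql] for one column."""
    base = col_type.upper()
    size = f"({length})" if length is not None else ""
    binary_type = BINARY_EQUIVALENT[base] + size
    text_type = base + size
    # Step 1: reinterpret the stored bytes as binary (no conversion happens).
    to_binary = f"ALTER TABLE `{table}` MODIFY `{column}` {binary_type};"
    # Step 2: relabel the same bytes with the intended character set.
    to_text = (f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} "
               f"CHARACTER SET {charset} COLLATE {collation};")
    return [to_binary, to_text]


# "articles" and "title" are placeholder names, not from the original post.
statements = binary_round_trip("articles", "title", "varchar", 255)
for sql in statements:
    print(sql)
```

Note that MODIFY redefines the whole column, so attributes such as NOT NULL or DEFAULT would need to be repeated in real statements.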

If you're on LAMP, don't forget to issue a SET NAMES command before interacting with the database, and make sure you set the character encoding headers.

Wednesday, March 31, 2021
 
lewiguez
answered 7 Months ago
33

You can try to do the exact same thing with the INFORMATION_SCHEMA.COLUMNS table.

EDIT: Something like

SELECT CONCAT('ALTER TABLE ', TABLE_NAME, ' CHANGE `', COLUMN_NAME, '` `',
LOWER(COLUMN_NAME), '` ', COLUMN_TYPE, ';')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = '{your schema name}';
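
For anyone driving this from a script rather than pasting the query output back in, the same string building can be sketched in Python. The rows below are hypothetical stand-ins for what the INFORMATION_SCHEMA.COLUMNS query would return, and `rename_to_lowercase` is a made-up helper name.

```python
# Hypothetical rows, shaped like (TABLE_NAME, COLUMN_NAME, COLUMN_TYPE).
rows = [
    ("Users", "UserName", "varchar(64)"),
    ("Users", "Email", "varchar(255)"),
]


def rename_to_lowercase(table, column, column_type):
    """Mirror the CONCAT above: rename a column to its lowercase form."""
    return (f"ALTER TABLE {table} CHANGE `{column}` "
            f"`{column.lower()}` {column_type};")


alter_statements = [rename_to_lowercase(*row) for row in rows]
print("\n".join(alter_statements))
```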
Tuesday, August 10, 2021
 
JakeGR
answered 3 Months ago
80

The only thing I can think of (without finding a collation that fits your needs) is to change something at the application layer (outside of MySQL) that will take care of the differentiation.

For instance, since you don't care about case, you can do something programmatically to lower the case of all the rows in the database. Then change the collation to utf8_bin.

Then you can, in the application, convert everything to lowercase before it enters the database (I'm guessing this will not affect the diacritic characters). That way you will still get errors if people try to enter the same value in different cases, you only have to change a few lines of code to precondition data entering the table, and you won't have the diacritic problem.
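
A minimal sketch of that preconditioning step, assuming the application is written in Python and using a made-up helper name `normalize_for_storage`. It suggests that lowercasing folds case but leaves diacritics in place, so accented and unaccented values stay distinct under a binary collation. Python's lowercasing rules may not match MySQL's LOWER() for every character, so treat this as an illustration only.

```python
def normalize_for_storage(value):
    """Lowercase a value before it is written to a utf8_bin column."""
    return value.lower()


# "R\u00c9SUM\u00c9" is the word RESUME with acute accents on both E's.
stored = normalize_for_storage("R\u00c9SUM\u00c9")
print(stored)

# Case differences collapse to one value...
same_case = normalize_for_storage("ABC") == normalize_for_storage("abc")
# ...but the accented and unaccented spellings remain different.
same_diacritics = (normalize_for_storage("Resume")
                   == normalize_for_storage("R\u00e9sum\u00e9"))
print(same_case, same_diacritics)
```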

Tuesday, August 17, 2021
 
MAC
answered 3 Months ago
37

The C locale will do. UTF-8 is designed so that byte ordering is also codepoint ordering. This is not trivial but consider how UTF-8 works:

Number range  Byte 1   Byte 2   Byte 3
0000-007F     0xxxxxxx
0080-07FF     110xxxxx 10xxxxxx
0800-FFFF     1110xxxx 10xxxxxx 10xxxxxx

When sorting binary data, as the C locale does, the first non-equal byte determines the ordering. What we need to see is that if two numbers encoded as UTF-8 differ, then the first non-equal byte will be lower for the lower value. If the numbers are in different ranges, the first byte will indeed be lower for the lower number. Within the same range, the order is determined by literally the same bits as without encoding.
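
The claim can be spot-checked in Python, where strings compare by code point and bytes compare byte by byte. The sample values are my own picks around the range boundaries in the table, plus one four-byte character that the table does not list; this is a quick check, not a proof.

```python
# Sample strings chosen around the UTF-8 range boundaries.
words = [
    "z",
    "abc",
    "ab\u00e9",      # two-byte character after an ASCII prefix
    "\u20ac",        # three-byte character
    "A",
    "\u0800",        # first three-byte code point
    "\u07ff",        # last two-byte code point
    "\U0001f600",    # four-byte character (beyond the table above)
    "ab",
]

# Python compares str values code point by code point.
by_codepoint = sorted(words)
# Sorting on the encoded form compares raw bytes, like the C locale.
by_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

print(by_codepoint == by_bytes)
```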

Monday, August 23, 2021
 
Crontab
answered 2 Months ago
83

Not sure about MySQL, but in MSSQL you can change the collation in the query. For example, if you have two tables with different collations and you want to join them or, as in your situation, create a UNION, you can do:

select column1 from tableWithProperCollation
union all
select column1 COLLATE SQL_Latin1_General_CP1_CI_AS from tableWithDifferentCollation

Of course, SQL_Latin1_General_CP1_CI_AS is just an example of the collation you want to "convert" to.

Monday, September 27, 2021
 
RenegadeAndy
answered 1 Month ago