Encoding hell, grep and iconv salvage!
16 Dec 2014
Nowadays we inherit a lot of old databases.
The typical problem is to extract data from badly encoded fields.
This happens when the browser encoding is forced to some other encoding
while MySQL is accepting the default LATIN1 encoding. In this case
the problem does not manifest immediately, since the byte sequence corresponding to
each character remains unchanged during saving and retrieval, but it becomes a problem
when the data is dumped and migrated.
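
To make the mechanism concrete, here is a minimal Python sketch. It assumes the browser was sending UTF-8 and that the dump step reads the stored bytes as LATIN1 and writes them out as UTF-8. Both are assumptions for illustration, and `simulate_dump` is a hypothetical helper, not part of any tool.

```python
def simulate_dump(text: str) -> bytes:
    """Mimic a dump that reads stored UTF-8 bytes as LATIN1 and re-encodes them as UTF-8."""
    stored = text.encode("utf-8")       # what the browser sent; the column keeps it byte for byte
    misread = stored.decode("latin-1")  # the dump trusts the LATIN1 column declaration
    return misread.encode("utf-8")      # and writes the result out as UTF-8


original = "caf\u00e9"  # "cafe" with an acute accent on the final letter
stored = original.encode("utf-8")

# Same application, same settings: the bytes round-trip intact, so nothing looks wrong.
assert stored.decode("utf-8") == original

# After the simulated dump, the single accented character has become two characters.
dumped = simulate_dump(original)
print(repr(dumped))  # b'caf\xc3\x83\xc2\xa9'
```

The two bytes of the accented letter are each treated as a separate LATIN1 character, which is why the damage only shows up once something reinterprets the data.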
Let’s work around this problem. First, find the non-ASCII
characters in the dump file.
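
As a rough Python equivalent of this search step (a sketch, not the original command), the function below reports every line that contains a byte above 0x7F. The name `find_non_ascii` and the sample data are made up for illustration.

```python
def find_non_ascii(lines):
    """Yield (line_number, line) for every bytes line containing a byte above 0x7F."""
    for number, line in enumerate(lines, start=1):
        # Iterating over a bytes object yields integers, one per byte.
        if any(byte > 0x7F for byte in line):
            yield number, line


# Hypothetical dump content; a real dump would be opened in binary mode instead.
sample_dump = [
    b"INSERT INTO t VALUES (1, 'plain');\n",
    b"INSERT INTO t VALUES (2, 'caf\xc3\xa9');\n",
]

hits = list(find_non_ascii(sample_dump))
print([number for number, _ in hits])  # [2]
```

To run it against a file, pass a handle opened with `open(path, "rb")`, since a binary file object also iterates line by line.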
Now let’s work it out with iconv.
If you get an error message at this point,
this is a good sign of a badly encoded character. You may correct it with vim from command mode,
taking into account that you’re working in a
UTF8 locale session in the terminal.
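
The check-and-correct idea can also be sketched in Python. This is an illustration under stated assumptions, not the conversion command used here: `check_utf8` plays the role of a converter that refuses invalid input, and `undo_double_encoding` reverses one common kind of damage (UTF-8 bytes that were misread as LATIN1). Both function names are hypothetical.

```python
def check_utf8(data: bytes):
    """Return None if data is valid UTF-8, else the byte offset of the first bad sequence."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return err.start


def undo_double_encoding(text: str) -> str:
    """Reverse UTF-8 bytes that were misread as LATIN1; return text unchanged if that fails."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside LATIN1 or it was not double-encoded.
        return text


# A lone LATIN1 byte (0xE9) inside otherwise ASCII data is not valid UTF-8.
print(check_utf8(b"caf\xe9 au lait"))  # 3

# "caf" followed by U+00C3 U+00A9 is the double-encoded form of the accented word.
repaired = undo_double_encoding("caf\u00c3\u00a9")
print(repaired == "caf\u00e9")  # True
```

Falling back to the unchanged text keeps the repair conservative: anything that does not fit the assumed damage pattern is left for manual correction.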
After you’re finished, just save the file and import it into the
UTF8 encoded fields of the database!