Databases. Input/output encoding

Hello! Questions along the lines of “Encoding in MySQL” have been coming up on the site a lot lately.
I’ll start with the main rule: the script file, the data in the database table, and the headers sent to the browser that declare the encoding (if any) must all use the same encoding. An example makes this clearer.
In the example, the file containing the code is saved in CP1251 and the data in the database table is stored in UTF8.

The Content-Type header declares the encoding in which the data is output to the visitor. It may differ from the encoding of the file. The main thing to remember is that if it does, the output must also be transcoded into the encoding named in the header. I am not a fan of this. It is better to pick one encoding from the start, so the script does not waste time on transcoding.

mysql_set_charset( 'utf8' );

This line tells the MySQL server which encoding to use for the connection, so data sent to the client is recoded to utf8. In our case this may be unnecessary (the database is already in this encoding, so it only matters if the connection default is something else). But this is an example, and I want to cover as many of the questions users ask as possible. By the way, this function replaces all of these queries:

mysql_query("SET character_set_client='utf8'");
mysql_query("SET character_set_results='utf8'");
mysql_query("SET collation_connection='utf8_general_ci'");
mysql_query("SET NAMES utf8");

And it is faster, since it takes one call instead of several queries. But it only appeared in PHP 5.2.3 and requires MySQL server 5.0.7 or later. If you use this function, you do not need to send those queries. Pass it the encoding named in the header you send (if any) or the encoding of the file itself (if there is no such header). If the file, the table and the connection already use the same encoding, there is no point in calling this function (or in sending the headers above). Also note that when working with the database, the encoding name is written without a hyphen (utf8). Be careful: if you specify the encoding incorrectly, the output will consist of question marks ( ??? ).
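Since the header uses one spelling (utf-8, windows-1251) and MySQL uses another (utf8, cp1251), a small lookup can keep the two in sync. This is only a sketch: the function name mysqlCharsetFor and the short list of encodings are my own assumptions, not part of PHP or MySQL.

```php
<?php
// Hypothetical helper (not a built-in): map the charset name used in the
// Content-Type header to the name MySQL expects for the connection.
function mysqlCharsetFor(string $httpCharset): string
{
    $map = [
        'utf-8'        => 'utf8',   // MySQL spells it without the hyphen
        'windows-1251' => 'cp1251',
        'cp1251'       => 'cp1251',
        'koi8-r'       => 'koi8r',
    ];

    $key = strtolower(trim($httpCharset));
    if (!isset($map[$key])) {
        throw new InvalidArgumentException("Unknown charset: $httpCharset");
    }
    return $map[$key];
}

echo mysqlCharsetFor('UTF-8'), "\n";        // utf8
echo mysqlCharsetFor('windows-1251'), "\n"; // cp1251
```

The result can be passed to mysql_set_charset(). On PHP 7 and later, where the old mysql extension no longer exists, the mysqli equivalent is mysqli_set_charset($link, $charset).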

iconv( 'cp1251', 'utf-8', 'Text!' );

Note that if the header you send names a different encoding than the file uses, all output must be recoded to the encoding named in the header. For data from a database table this is done by calling mysql_set_charset(), and for other text it can be done with iconv(). The first parameter of iconv() is the current encoding (the file's encoding), the second is the target encoding (the header's encoding), and the third is the text to recode.
That seems to be everything. Or, more precisely, everything I wanted to say.
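Here is a minimal sketch of that iconv() call with real Cyrillic text. The sample word and the variable names are mine. The CP1251 text is written as byte escapes so that the result does not depend on how you save the example file.

```php
<?php
// "Привет" as raw CP1251 bytes (one byte per letter).
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Arguments: current encoding, target encoding, text to recode.
$utf8 = iconv('CP1251', 'UTF-8', $cp1251);

// The header names the same encoding the output was recoded to.
header('Content-Type: text/html; charset=utf-8');
echo $utf8;
```

After the call the text takes 12 bytes instead of 6, because each Cyrillic letter occupies two bytes in UTF-8.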

 


If text from the database comes out like this:
РљСЂР°С‚РєР°СЏ
it means that your database connection works in UTF-8 while the browser is displaying the page as cp1251.
Solution:
Either save the pages in UTF-8 without a BOM and set the encoding in .htaccess for the Apache server:

AddDefaultCharset utf-8

 


Instead of .htaccess, you can send the header from the script:

header('Content-Type: text/html; charset=utf-8');

This will allow the site to work in UTF-8 encoding.

Or run this query immediately after connecting to the database:

SET NAMES 'cp1251';

This will allow the site to work in windows-1251 encoding.

If text from the database comes out like this:
�������
then the situation is reversed: the database connection works in cp1251 and the browser is displaying the page as UTF-8.
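Both symptoms can be reproduced without a database, which helps when you are trying to work out which side has the wrong encoding. This is a rough sketch: the sample word is my own choice, and it assumes the iconv and mbstring extensions are enabled.

```php
<?php
// "Краткая" as UTF-8 bytes and as CP1251 bytes (byte escapes keep the
// example independent of this file's own encoding).
$utf8Bytes   = "\xD0\x9A\xD1\x80\xD0\xB0\xD1\x82\xD0\xBA\xD0\xB0\xD1\x8F";
$cp1251Bytes = "\xCA\xF0\xE0\xF2\xEA\xE0\xFF";

// Case 1: UTF-8 bytes misread as CP1251. Every Cyrillic letter turns
// into a pair of characters, the first of which is "Р" or "С".
$misread = iconv('CP1251', 'UTF-8', $utf8Bytes);
echo $misread, "\n";

// Case 2: CP1251 bytes are not a valid UTF-8 sequence, so a browser
// told to use UTF-8 shows replacement characters instead of letters.
$isValidUtf8 = mb_check_encoding($cp1251Bytes, 'UTF-8');
var_dump($isValidUtf8); // bool(false)
```

If your output looks like case 1, the data is UTF-8 and the page is being read as cp1251. If the bytes fail the check as in case 2, it is the other way around.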

It is preferable to work in UTF-8.
For example, you may want to use AJAX, and in some browsers it only works correctly with UTF-8.
The json_encode function also works only with UTF-8.
UTF-8 is also the better choice for XML.
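The json_encode point is easy to check. The sketch below uses a sample word of my own and assumes PHP 7.1 or later, where json_encode() returns false for a string that is not valid UTF-8.

```php
<?php
// "Привет" as raw CP1251 bytes.
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// json_encode() refuses input that is not valid UTF-8.
$bad = json_encode($cp1251);
$err = json_last_error();
var_dump($bad);                     // bool(false)
var_dump($err === JSON_ERROR_UTF8); // bool(true)

// After recoding to UTF-8 the same text encodes normally
// (non-ASCII letters are written as \uXXXX escapes by default).
$good = json_encode(iconv('CP1251', 'UTF-8', $cp1251));
echo $good, "\n";
```

So if the site runs in cp1251, every string has to go through iconv() before json_encode(), which is one more reason to use UTF-8 from the start.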

If you get output like this:
???????????
This most likely means that the Cyrillic text was never stored in the table at all.
This often happens because MySQL uses the latin1 encoding by default (in versions before 8.0).
If you create a table without explicitly specifying another encoding, it is created in latin1, and this encoding cannot hold Cyrillic at all.
Look at what this query outputs:
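You can see the same loss in PHP alone. This is only an analogy, under my own assumptions: ISO-8859-1 stands in for MySQL's latin1, mb_convert_encoding() stands in for what the server does when it stores the text, and mbstring is left at its default settings.

```php
<?php
// "Привет" as UTF-8 bytes.
$utf8 = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// The target encoding has no Cyrillic letters, so each character is
// replaced by the substitute character, which is "?" by default.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
echo $latin1, "\n";
```

Once the letters have been replaced like this, the original text is gone. Changing the encoding later does not bring it back.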

SHOW CREATE TABLE `table`;

where table is the name of the table.
If the output ends with something like DEFAULT CHARSET=latin1, the table is latin1 encoded by default. You can fix that with this query:

ALTER TABLE `table` CONVERT TO CHARACTER SET utf8;
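If you would rather do the SHOW CREATE TABLE check from a script than by eye, something like the following may help. It is a sketch only: the helper name tableDefaultCharset and the sample statement are made up for illustration. In real use the string would come from the second column of the SHOW CREATE TABLE result.

```php
<?php
// Hypothetical helper: extract the default charset from the text of a
// CREATE TABLE statement, or return null if none is declared.
function tableDefaultCharset(string $createTableSql): ?string
{
    if (preg_match('/DEFAULT CHARSET=(\w+)/i', $createTableSql, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Made-up sample statement, for illustration only.
$sample = "CREATE TABLE `news` (\n"
        . "  `id` int(11) NOT NULL,\n"
        . "  `title` varchar(255) NOT NULL\n"
        . ") ENGINE=MyISAM DEFAULT CHARSET=latin1";

echo tableDefaultCharset($sample), "\n"; // latin1
```

If the helper returns latin1 for a table that should hold Cyrillic text, that table is a candidate for the ALTER TABLE query.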
