Asked  7 Months ago    Answers:  5   Viewed   43 times

Possible Duplicate:
UTF-8 all the way through

I'm developing some new features on a website that somebody else already developed.

I'm having a problem with the charset.

I saw that the database had some tables in utf8 and some in latin1.

So I'm trying to convert all the tables to UTF-8.

I did it for one table (the fields of this table are now utf8 as well), but it was not successful.

I'm using the normal mysql connect. Do I have to add any configuration to tell it to connect to the DB with utf8? If yes, which one?

In my html I have:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

It looks like some letters work and others display as a question mark. For example, it is not able to display this ’ which is different from this: '

 Answers

40

Try this:

<?php
// Must run before any output is sent to the browser.
header('Content-Type: text/html; charset=utf-8');
?>

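and then in the connection:

<?php
// Note: the mysql_* extension used here was removed in PHP 7.
$dbLink = mysql_connect($argHost, $argUsername, $argPassword);
// Ask the server to return results as UTF-8.
mysql_query("SET character_set_results=utf8", $dbLink);
// Make the mbstring functions treat strings as UTF-8.
mb_language('uni');
mb_internal_encoding('UTF-8');
mysql_select_db($argDB, $dbLink);
// SET NAMES sets the client, connection and results character sets in one go.
mysql_query("SET NAMES 'utf8'", $dbLink);
?>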
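For anyone on a PHP version where mysql_connect no longer exists, here is a minimal sketch of the same idea with PDO, where the charset goes into the DSN. The helper name buildUtf8Dsn, the host and the database name are made up for illustration, and utf8mb4 is my assumption (use utf8 if that is what your tables really are).

```php
<?php
// Hypothetical helper (not from the answer above): builds a PDO MySQL DSN
// that asks the driver to use the given character set for the connection.
function buildUtf8Dsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// 'localhost' and 'example_db' are placeholder values.
$dsn = buildUtf8Dsn('localhost', 'example_db');
echo $dsn, "\n";

// Connecting would then look like this (commented out, it needs a real server):
// $pdo = new PDO($dsn, $argUsername, $argPassword);
```

With the charset in the DSN there should be no need for a separate SET NAMES query.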
Wednesday, March 31, 2021
 
aurelijusv
answered 7 Months ago
89

This may be a job for the mb_detect_encoding() function.

In my limited experience with it, it's not 100% reliable when used as a generic "encoding sniffer": it checks for the presence of certain characters and byte values to make an educated guess. In this narrow case, though (it only needs to distinguish between UTF-8 and ISO-8859-1), it should work.

<?php
$text = $entity['Entity']['title'];

echo 'Original: ', $text, "<br />";

// Strict mode (third argument) is commonly recommended so that
// invalid UTF-8 is not reported as UTF-8.
$enc = mb_detect_encoding($text, "UTF-8,ISO-8859-1", true);

echo 'Detected encoding: ', $enc, "<br />";

echo 'Fixed result: ', iconv($enc, "UTF-8", $text), "<br />";

?>

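You may get incorrect results for strings that do not contain special characters, but that is not a problem: pure ASCII text is byte-for-byte identical in UTF-8 and ISO-8859-1, so the conversion leaves it unchanged.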
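If you only ever have to choose between those two encodings, a validity check may be enough. A rough sketch, assuming that anything which is not valid UTF-8 is ISO-8859-1 (toUtf8 is a made-up helper name, and the assumption is mine, PHP cannot prove it):

```php
<?php
// Hypothetical helper: return $text as UTF-8, ASSUMING that any input which
// is not already valid UTF-8 is ISO-8859-1.
function toUtf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8 (plain ASCII also ends up here)
    }
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";     // "café" encoded as ISO-8859-1
$utf8   = "caf\xC3\xA9"; // "café" encoded as UTF-8

echo bin2hex(toUtf8($latin1)), "\n"; // 636166c3a9
echo bin2hex(toUtf8($utf8)), "\n";   // 636166c3a9
```

Both inputs come out as the same UTF-8 bytes, and text that is already UTF-8 is not converted a second time.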

Wednesday, March 31, 2021
 
keisar
answered 7 Months ago
40

Try this:

function convert( $str ) {
    return iconv( "Windows-1252", "UTF-8", $str );
}

public function getRow()
{
    if (($row = fgetcsv($this->_handle, 10000, $this->_delimiter)) !== false) {
        $row = array_map( "convert", $row );
        $this->_line++;
        return $this->_headers ? array_combine($this->_headers, $row) : $row;
    } else {
        return false;
    }
}
Wednesday, March 31, 2021
 
sholsinger
answered 7 Months ago
11

â€“ is common mojibake for an en dash (–), which is a different character from a hyphen.

It is the result of taking the UTF-8–encoded form of the dash (0xe2 0x80 0x93) and incorrectly assuming that it is actually encoded using Windows-1252.

Interpreting those three bytes as Windows-1252: 0xe2, 0x80 and 0x93 separately represent â, € and “.

Assuming the offending character is in the blurb field, if you query SELECT HEX(blurb) FROM tpf_parks (with a suitable WHERE clause), you will see the hex encoding of the offending bytes.

If you see E28093 in there, then the database value is correctly encoded as UTF-8 and there will be a character encoding mismatch in your client or server configuration.

If, however, you see C3A2E282ACE2809C, then the character has already been encoded incorrectly in the database — i.e. interpreted incorrectly, then saved as the UTF-8 representation of those 3 characters. If this is the case you'll need to update the data to fix the issue. You could do this using iconv:

$fixedData = iconv("utf-8", "windows-1252", $badData);

This will convert the doubly-converted bytes back to the UTF-8 encoding.
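To see both hex values from above without touching the database, the round trip can be reproduced in plain PHP. This is only a sketch; the variable names are mine.

```php
<?php
// U+2013 EN DASH, correctly encoded as UTF-8.
$enDash = "\xE2\x80\x93";

// Reproduce the bug: treat the three bytes as Windows-1252 characters
// and encode each of those characters as UTF-8.
$mojibake = iconv('Windows-1252', 'UTF-8', $enDash);
echo strtoupper(bin2hex($mojibake)), "\n"; // C3A2E282ACE2809C

// Undo it: map the three characters back to their Windows-1252 bytes,
// which are the original UTF-8 bytes of the dash.
$repaired = iconv('UTF-8', 'Windows-1252', $mojibake);
echo strtoupper(bin2hex($repaired)), "\n"; // E28093
```

The repair step only makes sense for values that really were double-encoded. Running it on data that is already correct UTF-8 would damage it, so check with HEX() first.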

Saturday, May 29, 2021
 
Hilmi
answered 5 Months ago
80

Any query can be injected whether it's read or write, persistent or transient. Injections can be performed by ending one query and running a separate one (possible with mysqli), which renders the intended query irrelevant.

Any input to a query from an external source whether it is from users or even internal should be considered an argument to the query, and a parameter in the context of the query. Any parameter in a query needs to be parameterized. This leads to a properly parameterized query that you can create a prepared statement from and execute with arguments. For example:

SELECT col1 FROM t1 WHERE col2 = ?

? is a placeholder for a parameter. Using mysqli, you can create a prepared statement using prepare, bind a variable (argument) to a parameter using bind_param, and run the query with execute. You don't have to sanitize the argument at all (in fact it's detrimental to do so). mysqli does that for you. The full process would be:

$stmt = $mysqli->prepare("SELECT col1 FROM t1 WHERE col2 = ?");
$stmt->bind_param("s", $col2_arg);
$stmt->execute();

There is also an important distinction between parameterized query and prepared statement. This statement, while prepared, is not parameterized and is thus vulnerable to injection:

$stmt = $mysqli->prepare("INSERT INTO t1 VALUES ($_POST[user_input])");
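A small sketch of why interpolation is the problem, using nothing but strings so it runs without a database. The hostile input is a made-up example.

```php
<?php
// Hypothetical hostile input, e.g. taken from a request parameter.
$userInput = "0 OR 1=1";

// Interpolated: the input becomes part of the SQL text itself,
// so it can change what the query means.
$interpolated = "SELECT col1 FROM t1 WHERE col2 = $userInput";
echo $interpolated, "\n"; // SELECT col1 FROM t1 WHERE col2 = 0 OR 1=1

// Parameterized: the SQL text is fixed; the input travels separately
// as an argument and is never parsed as SQL.
$parameterized = "SELECT col1 FROM t1 WHERE col2 = ?";
$arguments = [$userInput];
echo $parameterized, "\n"; // SELECT col1 FROM t1 WHERE col2 = ?
```

With the placeholder version, $arguments would be handed to bind_param as in the mysqli example earlier in this answer.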

To summarize:

  • All queries should be properly parameterized (unless they have no parameters)
  • All arguments to a query should be treated as hostile as possible no matter their source
Monday, June 7, 2021
 
e_i_pi
answered 5 Months ago