Asked  7 Months ago    Answers:  5   Viewed   70 times

Earlier today a question was asked regarding input validation strategies in web apps.

The top answer, at time of writing, suggests in PHP just using htmlspecialchars and mysql_real_escape_string.

My question is: Is this always enough? Is there more we should know? Where do these functions break down?



When it comes to database queries, always try and use prepared parameterised queries. The mysqli and PDO libraries support this. This is infinitely safer than using escaping functions such as mysql_real_escape_string.

Yes, mysql_real_escape_string is effectively just a string escaping function. It is not a magic bullet. All it will do is escape dangerous characters in order that they can be safe to use in a single query string. However, if you do not sanitise your inputs beforehand, then you will be vulnerable to certain attack vectors.

Imagine the following SQL:

$result = "SELECT fields FROM table WHERE id = ".mysql_real_escape_string($_POST['id']);

You should be able to see that this is vulnerable to exploit.
Imagine the id parameter contained the common attack vector:

1 OR 1=1

There's no risky chars in there to encode, so it will pass straight through the escaping filter. Leaving us:

SELECT fields FROM table WHERE id= 1 OR 1=1

Which is a lovely SQL injection vector and would allow the attacker to return all the rows. Or

1 or is_admin=1 order by id limit 1

which produces

SELECT fields FROM table WHERE id=1 or is_admin=1 order by id limit 1

Which allows the attacker to return the first administrator's details in this completely fictional example.

Whilst these functions are useful, they must be used with care. You need to ensure that all web inputs are validated to some degree. In this case, we see that we can be exploited because we didn't check that a variable we were using as a number, was actually numeric. In PHP you should widely use a set of functions to check that inputs are integers, floats, alphanumeric etc. But when it comes to SQL, heed most the value of the prepared statement. The above code would have been secure if it was a prepared statement as the database functions would have known that 1 OR 1=1 is not a valid literal.

As for htmlspecialchars(). That's a minefield of its own.

There's a real problem in PHP in that it has a whole selection of different html-related escaping functions, and no clear guidance on exactly which functions do what.

Firstly, if you are inside an HTML tag, you are in real trouble. Look at

echo '<img src= "' . htmlspecialchars($_GET['imagesrc']) . '" />';

We're already inside an HTML tag, so we don't need < or > to do anything dangerous. Our attack vector could just be javascript:alert(document.cookie)

Now resultant HTML looks like

<img src= "javascript:alert(document.cookie)" />

The attack gets straight through.

It gets worse. Why? because htmlspecialchars (when called this way) only encodes double quotes and not single. So if we had

echo "<img src= '" . htmlspecialchars($_GET['imagesrc']) . ". />";

Our evil attacker can now inject whole new parameters

pic.png' onclick='location.href=xxx' onmouseover='...

gives us

<img src='pic.png' onclick='location.href=xxx' onmouseover='...' />

In these cases, there is no magic bullet, you just have to santise the input yourself. If you try and filter out bad characters you will surely fail. Take a whitelist approach and only let through the chars which are good. Look at the XSS cheat sheet for examples on how diverse vectors can be

Even if you use htmlspecialchars($string) outside of HTML tags, you are still vulnerable to multi-byte charset attack vectors.

The most effective you can be is to use the a combination of mb_convert_encoding and htmlentities as follows.

$str = mb_convert_encoding($str, 'UTF-8', 'UTF-8');
$str = htmlentities($str, ENT_QUOTES, 'UTF-8');

Even this leaves IE6 vulnerable, because of the way it handles UTF. However, you could fall back to a more limited encoding, such as ISO-8859-1, until IE6 usage drops off.

For a more in-depth study to the multibyte problems, see

Tuesday, June 1, 2021
answered 7 Months ago

The CodeIgniter profiler breaks as soon as you store any non-array non-strings in its session:

foreach ($this->CI->session->all_userdata() as $key => $val)
    if (is_array($val))
        $val = print_r($val, TRUE);

    $output .= "<...>".htmlspecialchars($val)."<...>n";

(from CI_Profiler::_compile_session_data())

This looks like a pretty stupid thing since print_r() works fine with objects - so is_array($val) || is_object($val) would be more appropriate.

Wednesday, March 31, 2021
answered 9 Months ago

Found a solution on the CI forums.

Exchange Email Class Patch

It is initiating TLS after SMTP server has been connected.

Worked for me. Jeff

Saturday, May 29, 2021
answered 7 Months ago

It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).

What you should do, to avoid problems, is quite simple: whenever you embed a a piece of data within a foreign code, you must treat it according to the formatting rules of that code. But you must understand that such rules could be too complicated to try to follow them all manually. For example, in SQL, rules for strings, numbers and identifiers are all different. For your convenience, in most cases there is a dedicated tool for such an embedding. For example, when you need to use a PHP variable in the SQL query, you have to use a prepared statement, that will take care of all the proper formatting/treatment.

Another example is HTML: If you embed strings within HTML markup, you must escape it with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.

A third example could be shell commands: If you are going to embed strings (such as arguments) to external commands, and call them with exec, then you must use escapeshellcmd and escapeshellarg.

Also, a very compelling example is JSON. The rules are so numerous and complicated that you would never be able to follow them all manually. That's why you should never ever create a JSON string manually, but always use a dedicated function, json_encode() that will correctly format every bit of data.

And so on and so forth ...

The only case where you need to actively filter data, is if you're accepting preformatted input. For example, if you let your users post HTML markup, that you plan to display on the site. However, you should be wise to avoid this at all cost, since no matter how well you filter it, it will always be a potential security hole.

Tuesday, June 1, 2021
answered 7 Months ago

The idea of a generic sanitation function is a broken concept.

There is one right sanitation method for every purpose. Running them all indiscriminately on a string will often break it - escaping a piece of HTML code for a SQL query will break it for use in a web page, and vice versa. Sanitation should be applied right before using the data:

  • before running a database query. The right sanitation method depends on the library you use; they are listed in How can I prevent SQL injection in PHP?

  • htmlspecialchars() for safe HTML output

  • preg_quote() for use in a regular expression

  • escapeshellarg() / escapeshellcmd() for use in an external command

  • etc. etc.

Using a "one size fits all" sanitation function is like using five kinds of highly toxic insecticide on a plant that can by definition only contain one kind of bug - only to find out that your plants are infested by a sixth kind, on which none of the insecticides work.

Always use that one right method, ideally straight before passing the data to the function. Never mix methods unless you need to.

Tuesday, June 1, 2021
answered 7 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :