Asked  9 Months ago    Answers:  5   Viewed   59 times

If I try to encode the url

http://herthabsc.de/index.php?id=3631&tx_ttnews[tt_news]=13144&cHash=9ef2e9ee006fb16188ebf764232a0ba9 

with urlencode() or http_build_query() it gives me the result

http%3A%2F%2Fherthabsc.de%2Findex.php%3Fid%3D3631%26%23038%3Btx_ttnews%5Btt_news%5D%3D13144%26%23038%3BcHash%3D9ef2e9ee006fb16188ebf764232a0ba9

But that's not what it should be. Is there a known bug? Or problems in use with wordpress?

 Answers

50

You've double encoded the URL. Running urldecode() on your output string is giving me the following: http://herthabsc.de/index.php?id=3631&tx_ttnews[tt_news]=13144&cHash=9ef2e9ee006fb16188ebf764232a0ba9

EDIT: try the following

urlencode(html_entity_decode('http://herthabsc.de/index.php?id=3631&tx_ttnews[tt_news]=13144&cHash=9ef2e9ee006fb16188ebf764232a0ba9'));
Wednesday, March 31, 2021
 
Kwadz
answered 9 Months ago
73

Try this:

$str = 'http://www.example.com/some_folder/some file [that] needs "to" be (encoded).zip';
$pos = strrpos($str, '/') + 1;
$result = substr($str, 0, $pos) . urlencode(substr($str, $pos));

You're looking for the last occurrence of the slash sign. The part before it is ok so just copy that. And urlencode the rest.

Wednesday, March 31, 2021
 
TheTechnicalPaladin
answered 9 Months ago
91

According to the PHP manual, you must specifically encode a URL if it contains special characters. This means the function itself should do no special encoding. Most likely your URL is being encoded before being passed to the function, so pass it through urldecode first and see what happens.

Edit: You're saying the encoding is being messed up. Again the PHP manual specifically states that you need to encode urls prior to passing them to file_get_contents. Try encoding the URL, then passing it to the function.

$url = urlencode($url);
file_get_contents($url);
Friday, May 28, 2021
 
rorymorris
answered 7 Months ago
43

It will depend on your purpose. If interoperability with other systems is important then it seems rawurlencode is the way to go. The one exception is legacy systems which expect the query string to follow form-encoding style of spaces encoded as + instead of %20 (in which case you need urlencode).

rawurlencode follows RFC 1738 prior to PHP 5.3.0 and RFC 3986 afterwards (see http://us2.php.net/manual/en/function.rawurlencode.php)

Returns a string in which all non-alphanumeric characters except -_.~ have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in » RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).

Note on RFC 3986 vs 1738. rawurlencode prior to php 5.3 encoded the tilde character (~) according to RFC 1738. As of PHP 5.3, however, rawurlencode follows RFC 3986 which does not require encoding tilde characters.

urlencode encodes spaces as plus signs (not as %20 as done in rawurlencode)(see http://us2.php.net/manual/en/function.urlencode.php)

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the » RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.

This corresponds to the definition for application/x-www-form-urlencoded in RFC 1866.

Additional Reading:

You may also want to see the discussion at http://bytes.com/groups/php/5624-urlencode-vs-rawurlencode.

Also, RFC 2396 is worth a look. RFC 2396 defines valid URI syntax. The main part we're interested in is from 3.4 Query Component:

Within a query component, the characters ";", "/", "?", ":", "@",
"&", "=", "+", ",", and "$"
are reserved.

As you can see, the + is a reserved character in the query string and thus would need to be encoded as per RFC 3986 (as in rawurlencode).

Friday, June 4, 2021
 
williamcarswell
answered 6 Months ago
39

It is okay to have a * in a URL, (but it is also okay to have it in its encoded form).

RFC1738: Uniform Resource Locators (URL) states the following:

Reserved:

[...]

Usually a URL has the same interpretation when an octet is represented by a character and when it encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

On the other hand, characters that are not required to be encoded (including alphanumerics) may be encoded within the scheme-specific part of a URL, as long as they are not being used for a reserved purpose.

Thursday, August 12, 2021
 
mgraph
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share