Asked  9 Months ago    Answers:  5   Viewed   147 times
http://www.example.com/some_folder/some file [that] needs "to" be (encoded).zip
urlencode($myurl);

The problem is that urlencode will also encode the slashes, which makes the URL unusable. How can I encode just the filename at the end?

 Answers

73

Try this:

$str = 'http://www.example.com/some_folder/some file [that] needs "to" be (encoded).zip';
$pos = strrpos($str, '/') + 1;
$result = substr($str, 0, $pos) . urlencode(substr($str, $pos));

Find the last occurrence of the slash. Everything up to and including it is fine, so copy that part as is and urlencode the rest.
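To see what the snippet above produces, here is a minimal sketch that wraps the same three steps in a function. The name encode_last_segment is made up for illustration; the body is the code from this answer.

```php
<?php
// Hypothetical helper: encodes only the part after the last slash.
function encode_last_segment(string $url): string
{
    // Position just past the last '/', so the slash itself is kept.
    $pos = strrpos($url, '/') + 1;
    // Copy everything up to and including the slash, urlencode the rest.
    return substr($url, 0, $pos) . urlencode(substr($url, $pos));
}

$result = encode_last_segment('http://www.example.com/some_folder/some file [that] needs "to" be (encoded).zip');
echo $result, "\n";
// http://www.example.com/some_folder/some+file+%5Bthat%5D+needs+%22to%22+be+%28encoded%29.zip
```

Spaces come out as + because urlencode uses form encoding; swapping in rawurlencode would give %20 instead.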

Wednesday, March 31, 2021
 
TheTechnicalPaladin
answered 9 Months ago
57

@deceze definitely got me going down the right path, so go upvote his answer. But here is exactly what worked:

    $encoded_url = preg_replace_callback('#://([^/]+)/([^?]+)#', function ($match) {
        return '://' . $match[1] . '/' . join('/', array_map('rawurlencode', explode('/', $match[2])));
    }, $unencoded_url);

There are a few things to note:

  • http_build_url requires a PECL install, so if you are distributing your code to others (as I am in this case), you might want to avoid it and stick with regex parsing like I did here (stealing heavily from @deceze's answer; again, go upvote that thing).

  • urlencode() is not the way to go! You need rawurlencode() for the path so that spaces get encoded as %20 and not +. Encoding spaces as + is fine for query strings, but not so hot for paths.

  • This won't work for URLs that need a username/password encoded. For my use case, I don't think I care about those, so I'm not worried. But if your use case is different in that regard, you'll need to take care of that.
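As a rough illustration of the points above, the regex can be wrapped in a function and run against a sample URL. The function name encode_url_path and the sample URL are assumptions added here, not part of the original answer.

```php
<?php
// Hypothetical wrapper around the regex from this answer.
function encode_url_path(string $url): string
{
    return preg_replace_callback('#://([^/]+)/([^?]+)#', function ($match) {
        // $match[1] is everything between '://' and the first '/',
        // $match[2] is the path up to the first '?'.
        $segments = array_map('rawurlencode', explode('/', $match[2]));
        return '://' . $match[1] . '/' . implode('/', $segments);
    }, $url);
}

$encoded = encode_url_path('http://www.example.com/some folder/some file.zip?v=1');
echo $encoded, "\n";
// http://www.example.com/some%20folder/some%20file.zip?v=1
```

Each path segment is encoded separately, so the slashes survive, and the query string after the ? is left alone. Anything between :// and the first slash, including a user:password@ part, is copied through unchanged, which is the limitation noted in the last bullet.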

Wednesday, March 31, 2021
 
Zach
answered 9 Months ago
50

You've double encoded the URL. Running urldecode() on your output string is giving me the following: http://herthabsc.de/index.php?id=3631&tx_ttnews[tt_news]=13144&cHash=9ef2e9ee006fb16188ebf764232a0ba9

EDIT: try the following

urlencode(html_entity_decode('http://herthabsc.de/index.php?id=3631&tx_ttnews[tt_news]=13144&cHash=9ef2e9ee006fb16188ebf764232a0ba9'));
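A short sketch of how double encoding shows up, and why decoding HTML entities first matters. The sample string is invented for illustration; it is not the URL from the question.

```php
<?php
// Illustrative value only: a query string as it might appear inside HTML.
$from_html = 'id=3631&amp;page=a b';

// Encoding the HTML-escaped text bakes the entity into the result.
$wrong = urlencode($from_html);
// Decoding entities first means the real '&' is what gets encoded.
$right = urlencode(html_entity_decode($from_html));

// Encoding an already encoded string escapes the '%' signs again.
$double = urlencode($right);

echo $wrong, "\n";  // id%3D3631%26amp%3Bpage%3Da+b
echo $right, "\n";  // id%3D3631%26page%3Da+b
echo $double, "\n"; // id%253D3631%2526page%253Da%2Bb
```

A run of %25 in the output is usually the sign that a string has been encoded twice.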
Wednesday, March 31, 2021
 
Kwadz
answered 9 Months ago
43

It will depend on your purpose. If interoperability with other systems is important then it seems rawurlencode is the way to go. The one exception is legacy systems which expect the query string to follow form-encoding style of spaces encoded as + instead of %20 (in which case you need urlencode).

rawurlencode follows RFC 1738 prior to PHP 5.3.0 and RFC 3986 afterwards (see http://us2.php.net/manual/en/function.rawurlencode.php)

Returns a string in which all non-alphanumeric characters except -_.~ have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).

Note on RFC 3986 vs 1738: prior to PHP 5.3, rawurlencode encoded the tilde character (~) according to RFC 1738. As of PHP 5.3, however, rawurlencode follows RFC 3986, which does not require encoding tilde characters.

urlencode encodes spaces as plus signs, not as %20 as rawurlencode does (see http://us2.php.net/manual/en/function.urlencode.php)

Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.

This corresponds to the definition for application/x-www-form-urlencoded in RFC 1866.
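The difference between the two functions is easiest to see side by side. A minimal sketch, assuming PHP 5.3 or later; the input string is arbitrary:

```php
<?php
$input = 'a b~c+d';

$raw  = rawurlencode($input); // RFC 3986 style: space -> %20, '~' kept
$form = urlencode($input);    // form style: space -> '+', '~' -> %7E

echo $raw, "\n";  // a%20b~c%2Bd
echo $form, "\n"; // a+b%7Ec%2Bd
```

Both functions encode a literal + as %2B; they differ only in how they treat the space and the tilde.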

Additional Reading:

You may also want to see the discussion at http://bytes.com/groups/php/5624-urlencode-vs-rawurlencode.

Also, RFC 2396 is worth a look. RFC 2396 defines valid URI syntax. The main part we're interested in is from 3.4 Query Component:

Within a query component, the characters ";", "/", "?", ":", "@",
"&", "=", "+", ",", and "$"
are reserved.

As you can see, the + is a reserved character in the query string and thus would need to be encoded as per RFC 3986 (as in rawurlencode).
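For query strings specifically, http_build_query can produce either style through its encoding type argument. The sketch below uses a made-up parameter to show both, along with how a stray + is read back by each decoder:

```php
<?php
$params = ['q' => 'a b+c'];

// Default encoding type is PHP_QUERY_RFC1738 (form style, spaces as '+').
$form = http_build_query($params);
// PHP_QUERY_RFC3986 percent-encodes spaces as %20, like rawurlencode().
$rfc3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo $form, "\n";    // q=a+b%2Bc
echo $rfc3986, "\n"; // q=a%20b%2Bc

// A literal '+' left unencoded is read as a space by form-style decoding.
$form_decoded = urldecode('a+b');    // "a b"
$raw_decoded  = rawurldecode('a+b'); // "a+b"
```

Which style to emit depends on what the receiving system decodes with, as the first paragraph of this answer says.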

Friday, June 4, 2021
 
williamcarswell
answered 6 Months ago
54

java.net.URLEncoder should work for you - though you would have to extend it to accept the hashmap - but that is not very difficult.

Friday, October 22, 2021
 
Renon Stewart
answered 1 Month ago