Asked  7 Months ago    Answers:  5   Viewed   59 times

I am attempting to create a PHP function which will check if the passed URL is a short URL. Something like this:

/**
 * Check if a URL is a short URL
 *
 * @param string $url
 * @return bool
 */
function _is_short_url($url){
    // Code goes here
}

I know that a simpler and surefire way would be to check for a 301 redirect, but this function aims to save an external request made just for checking. Nor should the function check against a list of URL shorteners, as that would be a less scalable approach.

So here are a few possible checks I was thinking of:

  1. Overall URL length - Maybe a max of 30 characters
  2. URL length after last '/' - Maybe a max of 10 characters
  3. Number of '/' after protocol (http://) - Max 2
  4. Max length of host

Any thoughts on a possible approach or a more exhaustive checklist for this?

EDIT: This function is just an attempt to save an external request, so it's OK for it to return true for a URL that is not actually shortened, as long as it does not return false for a real short one. After passing URLs through this function, I would expand all the candidate short URLs by checking 301 redirects anyway. This is just to eliminate the obvious ones.

 Answers

79

I would not recommend using a regex, as it would be too complex and difficult to understand. Here is some PHP code that checks all your constraints:

function _is_short_url($url){
        // 1. Overall URL length - max of 30 characters
        if (strlen($url) > 30) return false;

        $parts = parse_url($url);

        // parse_url() returns false for seriously malformed URLs; a URL without a host cannot be a short URL
        if ($parts === false || empty($parts["host"])) return false;

        // No query string & no fragment (isset() avoids undefined-index notices)
        if (isset($parts["query"]) || isset($parts["fragment"])) return false;

        // A bare host such as http://bit.ly has no "path" key
        $path = isset($parts["path"]) ? $parts["path"] : "";
        $pathParts = explode("/", $path);

        // 3. Number of '/' after protocol (http://) - Max 2
        if (count($pathParts) > 2) return false;

        // 2. URL length after last '/' - max of 10 characters
        $lastPath = array_pop($pathParts);
        if (strlen($lastPath) > 10) return false;

        // 4. Max length of host
        if (strlen($parts["host"]) > 10) return false;

        return true;
}
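
For illustration, the same checks can also be written as a self-contained sketch with adjustable limits. The name `looks_like_short_url` and the `$limits` array are hypothetical, and the default thresholds are simply the guesses from the question, not measured values:

```php
<?php
/**
 * Heuristic pre-filter: could this URL plausibly be a shortened URL?
 * Hypothetical helper; the default limits are the guesses from the question.
 */
function looks_like_short_url(string $url, array $limits = []): bool
{
    // Caller-supplied limits win; missing keys fall back to the defaults.
    $limits += ['url' => 30, 'host' => 10, 'slug' => 10, 'segments' => 1];

    if (strlen($url) > $limits['url']) {
        return false;
    }

    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return false; // malformed or relative URL
    }

    // Assumption: shortened links carry no query string or fragment.
    if (isset($parts['query']) || isset($parts['fragment'])) {
        return false;
    }

    if (strlen($parts['host']) > $limits['host']) {
        return false;
    }

    // Split the path into its non-empty segments, e.g. "/abc123" gives ["abc123"].
    $segments = array_values(array_filter(explode('/', $parts['path'] ?? ''), 'strlen'));
    if (count($segments) > $limits['segments']) {
        return false;
    }

    $slug = $segments ? end($segments) : '';
    return strlen($slug) <= $limits['slug'];
}

var_dump(looks_like_short_url('http://bit.ly/abc123'));
var_dump(looks_like_short_url('http://www.example.com/articles/2021/a-long-title'));
```

Because this is only a pre-filter, it errs on the side of returning true: a short URL on an ordinary site will still pass and has to be confirmed by the redirect check afterwards.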
Saturday, May 29, 2021
 
BetaRide
answered 7 Months ago
20

For any URL as a string:

if (parse_url($url, PHP_URL_QUERY))

http://php.net/parse_url

If it's for the URL of the current request, simply:

if ($_GET)
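
One caveat with the plain truthiness test: a query string of `0` is falsy in PHP, so `http://example.com/?0` would be reported as having no query. A minimal sketch of a stricter check, using a hypothetical helper name `has_query_string`:

```php
<?php
/**
 * Hypothetical helper: true when the URL carries a non-empty query string.
 * The result is compared explicitly so that a query of "0" still counts.
 */
function has_query_string(string $url): bool
{
    // parse_url() gives null when the component is absent, false if the URL is malformed.
    $query = parse_url($url, PHP_URL_QUERY);

    return is_string($query) && $query !== '';
}

var_dump(has_query_string('http://example.com/page.php?id=5'));
var_dump(has_query_string('http://example.com/page.php'));
var_dump(has_query_string('http://example.com/?0'));
```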
Wednesday, March 31, 2021
 
keyBeatz
answered 9 Months ago
10

PHP has built-in functions for this. Use parse_url() and parse_str() together.

Pieced together from php.net:

$url = 'http://www.example.com/page.php?ProdId=2683322&xpage=2';

// Parse the url into an array
$url_parts = parse_url($url);

// Parse the query portion of the url into an assoc. array
parse_str($url_parts['query'], $path_parts);

echo $path_parts['ProdId']; // 2683322
echo $path_parts['xpage']; // 2
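
Note that the `query` key is missing from the parse_url() result when the URL has no query string. A small sketch that guards against that case, with `query_params` being a hypothetical helper name:

```php
<?php
/**
 * Hypothetical helper: return the query parameters of a URL as an array,
 * or an empty array when the URL has no query string.
 */
function query_params(string $url): array
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return []; // no query string, or a malformed URL
    }

    parse_str($query, $params);
    return $params;
}

$params = query_params('http://www.example.com/page.php?ProdId=2683322&xpage=2');
echo $params['ProdId'], "\n";
echo $params['xpage'], "\n";
```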
Wednesday, March 31, 2021
 
SpiderLinked
answered 9 Months ago
82

I wouldn't use preg_match() for this. I think parse_url() is probably a better choice. You can pass a URL string into it, and it will break it down into all the subcomponents for you.

I don't know what the specific video URLs for those sites you mentioned look like, but I'm sure you could come up with some identifying criteria for each one and check them against the results of parse_url(). As an example, here's what the breakdown of a YouTube link might look like:

$res = parse_url("http://www.youtube.com/watch?v=Sv5iEK-IEzw");
print_r($res);

/* outputs: 
Array (
    [scheme] => http
    [host] => www.youtube.com
    [path] => /watch
    [query] => v=Sv5iEK-IEzw
)
*/

You could probably identify it based on the host name and the path in this case.
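
A rough sketch of that idea follows. The function name `is_youtube_watch_url` is hypothetical, and the list of accepted host names is an assumption based only on the example URL above:

```php
<?php
/**
 * Hypothetical helper: does the URL look like a YouTube watch link?
 * Matches on host and path, then requires a non-empty "v" parameter.
 */
function is_youtube_watch_url(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        return false;
    }

    // Assumed host names, taken from the example URL above.
    $host = strtolower($parts['host']);
    $isYouTubeHost = in_array($host, ['youtube.com', 'www.youtube.com'], true);

    // The video id sits in the "v" query parameter.
    parse_str($parts['query'] ?? '', $query);

    return $isYouTubeHost && $parts['path'] === '/watch' && !empty($query['v']);
}

var_dump(is_youtube_watch_url('http://www.youtube.com/watch?v=Sv5iEK-IEzw'));
var_dump(is_youtube_watch_url('http://www.example.com/watch?v=Sv5iEK-IEzw'));
```

Other sites would need their own host and path rules, so in practice this would probably grow into one small check per site.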

Wednesday, March 31, 2021
 
Pradip
answered 9 Months ago
12

Try this:

preg_match("#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#i", $text, $matches);

You were missing the regex delimiters (usually /, but I'm using # here because it's more convenient for URLs).
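
A short usage sketch, assuming the pattern is meant to carry the usual backslash escapes (`\b`, `\w`, `\s`, `\d`). The sample text is made up for illustration:

```php
<?php
// Single quotes keep the backslashes in the pattern literal.
$pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#i';
$text = 'Details are at http://example.com/page. Thanks!';

if (preg_match($pattern, $text, $matches)) {
    // $matches[0] holds the first URL found in $text.
    echo $matches[0], "\n";
}
```

The last alternative in the pattern requires the match to end on a non-punctuation character or a slash, so the sentence-ending period after the URL is left out of `$matches[0]`.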

Wednesday, March 31, 2021
 
Strae
answered 9 Months ago