Asked  6 Months ago    Answers:  5   Viewed   19 times

I'm trying to write a regular expression that validates a date. The regex needs to match the following:

  • M/D/YYYY
  • MM/DD/YYYY
  • Single digit months can start with a leading zero (eg: 03/12/2008)
  • Single digit days can start with a leading zero (eg: 3/02/2008)
  • CANNOT include February 30 or February 31 (eg: 2/31/2008)

So far I have

^(([1-9]|1[012])[-/.]([1-9]|[12][0-9]|3[01])[-/.](19|20)\d\d)|((1[012]|0[1-9])(3[01]|2\d|1\d|0[1-9])(19|20)\d\d)|((1[012]|0[1-9])[-/.](3[01]|2\d|1\d|0[1-9])[-/.](19|20)\d\d)$

This matches properly EXCEPT it still includes 2/30/2008 & 2/31/2008.

Does anyone have a better suggestion?

Edit: I found the answer on RegExLib

^((((0[13578])|([13578])|(1[02]))[/](([1-9])|([0-2][0-9])|(3[01])))|(((0[469])|([469])|(11))[/](([1-9])|([0-2][0-9])|(30)))|((2|02)[/](([1-9])|([0-2][0-9]))))[/]\d{4}$|^\d{4}$

It matches all valid months that follow the MM/DD/YYYY format.
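For anyone who wants to check what this pattern accepts, here is a small Python sketch that runs it against a few sample strings. The helper name `matches` and the sample values are mine, not part of the RegExLib entry; the pattern itself is copied from above with the `\d` escapes written out.

```python
import re

# The RegExLib pattern quoted above, split across lines for readability.
DATE_RE = re.compile(
    r"^((((0[13578])|([13578])|(1[02]))[/](([1-9])|([0-2][0-9])|(3[01])))"
    r"|(((0[469])|([469])|(11))[/](([1-9])|([0-2][0-9])|(30)))"
    r"|((2|02)[/](([1-9])|([0-2][0-9]))))[/]\d{4}$|^\d{4}$"
)

def matches(text):
    """Hypothetical helper: True if the pattern accepts the string."""
    return DATE_RE.search(text) is not None

samples = ["3/02/2008", "12/31/2008", "2/30/2008", "4/31/2008",
           "2/29/2007", "1/00/2008", "2008"]
for s in samples:
    print(s, matches(s))
```

The pattern rejects 2/30/2008 and 4/31/2008 as intended. It also accepts a few inputs that may be unwanted: 2/29 in any year (it has no leap-year logic), a day of 00, and a bare four-digit year through the final `^\d{4}$` alternative.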

Thanks everyone for the help.

 Answers

77

This is not an appropriate use of regular expressions. You'd be better off using

[0-9]{2}/[0-9]{2}/[0-9]{4}

and then checking ranges in a higher-level language.
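As a rough illustration of that split, here is a Python sketch: a loose regex checks only the shape, and `datetime.date` does the range checking, including leap years. The function name `parse_mdy` is made up for this example, and I have assumed `{1,2}` for month and day (instead of the `{2}` above) so that the question's M/D/YYYY inputs pass the shape check.

```python
import re
from datetime import date

# Shape only: 1-2 digit month, 1-2 digit day, 4-digit year (assumed widths).
SHAPE = re.compile(r"([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})")

def parse_mdy(text):
    """Hypothetical helper: return a date for a valid M/D/YYYY string, else None."""
    m = SHAPE.fullmatch(text)
    if m is None:
        return None
    month, day, year = (int(g) for g in m.groups())
    try:
        # date() raises ValueError for impossible dates such as February 30.
        return date(year, month, day)
    except ValueError:
        return None

print(parse_mdy("2/29/2008"))
print(parse_mdy("2/30/2008"))
```

Letting the date type decide validity keeps the regex short and handles cases such as 2/29/2007 that are awkward to express in a pattern.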

Tuesday, June 1, 2021
 
Gilko
answered 6 Months ago
60

What you need is anchors, specifically ^ and $. The former matches the beginning of the string, the latter matches the end.

The other point I would make is the [] are unnecessary. \d retains its meaning outside of character ranges.

So your regex should look like this: /^\d{4}-\d{2}-\d{2}$/.
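To see what the anchors change, here is a short Python sketch comparing the same pattern with and without them. The names `UNANCHORED`, `ANCHORED` and `is_iso_shaped` are just for illustration.

```python
import re

UNANCHORED = re.compile(r"\d{4}-\d{2}-\d{2}")
ANCHORED = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def is_iso_shaped(text):
    """Hypothetical helper: True only if the whole string is YYYY-MM-DD shaped."""
    return ANCHORED.search(text) is not None

for s in ["2021-05-28", "x2021-05-28y", "12021-05-281"]:
    # Without anchors the pattern finds a date-shaped substring in all three.
    print(s, UNANCHORED.search(s) is not None, is_iso_shaped(s))
```

One caveat in Python specifically: `$` also matches just before a trailing newline, so `re.fullmatch` with the unanchored pattern is the stricter choice if that matters.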

Friday, May 28, 2021
 
toesslab
answered 7 Months ago
66

To match dates wherever they appear, remove the $ and ^ anchors from your original regex.

To match dates at the start of any input remove the $ at the end (leave the ^).

You can also put the remaining pattern inside parentheses for convenience, so that the match is also captured as a whole.

Your suggested improvement has a spurious dot at the end which will match any character; that was the reason for returning matches with three-digit days.
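The regex being discussed is not shown here, so the following Python sketch uses a stand-in YYYY-MM-DD pattern (my assumption) to illustrate the three points: no anchors, a leading `^` only, and the effect of a stray trailing dot.

```python
import re

TEXT = "2021-05-29 release, hotfix 2021-06-011"

# Stand-in patterns; the parentheses capture the whole match.
ANYWHERE = re.compile(r"(\d{4}-\d{2}-\d{2})")       # no anchors
AT_START = re.compile(r"^(\d{4}-\d{2}-\d{2})")      # start of input only
TRAILING_DOT = re.compile(r"(\d{4}-\d{2}-\d{2}.)")  # stray dot at the end

print(ANYWHERE.findall(TEXT))
print(AT_START.findall(TEXT))
print(TRAILING_DOT.findall(TEXT))
```

With the stray dot, the second match is `2021-06-011`: the dot consumes whatever character follows the day, which is how a three-digit day slips through.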

Saturday, May 29, 2021
 
Kenny
answered 7 Months ago
66

You may allow any number of [^.] (any character except a dot) and ([^.])\.[^.] (a dot enclosed by two non-dots) by joining them with an alternation (the pipe symbol |), following the group with * (any number of repetitions), and putting the whole thing between ^ and $ so that the entire string must consist of those pieces. Here's the code:

$s1 = "test.test@test.com";
$s2 = "test..test@test.com";
$pattern = '/^([^.]|([^.])\.[^.])*$/';
echo "$s1: ", preg_match($pattern, $s1),"<p>","$s2: ", preg_match($pattern, $s2);

Yields:

test.test@test.com: 1
test..test@test.com: 0
Saturday, August 14, 2021
 
icehawk
answered 4 Months ago
85

Expanding on this answer, how about using this to find dates (or things that at least look like dates) within the text and then try parsing those:

\b                     # match a word boundary
(?:                    # either...
 (?:                   # match the following one to three times:
  (?:                  # either
   \d+                 # a number,
   (?:\.|st|nd|rd|th)* # followed by a dot, st, nd, rd, or th (optional)
   |                   # or a month name
   (?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)
  )
  [\s./-]*             # followed by a date separator or whitespace (optional)
 ){1,3}                # do this one to three times
|                      # or match a "colloquial" date and capture in backref 1:
(to(?:day|ni(?:te|ght)|morrow)|next\s+(?:week|month|year))
)
\b                     # and end at a word boundary.

So if you have a match, and backref $1 is empty, then a literal date was presumably found; if $1 is not empty, it found a date like "today" or "next week". Of course, this is only going to work with dates in English text, and it's probably not going to be very reliable.

if (preg_match(
    '%\b                   # match a word boundary
    (?:                    # either...
     (?:                   # match the following one to three times:
      (?:                  # either
       \d+                 # a number,
       (?:\.|st|nd|rd|th)* # followed by a dot, st, nd, rd, or th (optional)
       |                   # or a month name
       (?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)
      )
      [\s./-]*             # followed by a date separator or whitespace (optional)
     ){1,3}                # do this one to three times
    |                      # or match a "colloquial" date and capture in group 1:
    (to(?:day|ni(?:te|ght)|morrow)|next\s+(?:week|month|year))
    )
    \b                     # and end at a word boundary.%ix',
    $subject, $regs)) {
    $result = $regs[0];
    // preg_match omits trailing groups that did not participate in the match,
    // so fall back to an empty string when group 1 is absent (PHP 7+).
    $colloq = $regs[1] ?? "";
} else {
    $result = "";
}
Friday, August 20, 2021
 
Thermatix
answered 4 Months ago