Asked  7 Months ago    Answers:  5   Viewed   39 times

I have the following string, for example:

aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv

How can I find all XX.*YY.*ZZ parts in the string? (possibly by using preg_match())

  • XX cc YY eeXX_ ZZ
  • XX _ZZkk YY mmXX_ ZZ
  • XX _ZZnnXXoo YY uuXX_ ZZ
  • XX oo YY uuXX_ ZZ

Plus all longer matches, such as:

  • XX cc YY eeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ ZZ

 Answers

39

Thanks, everybody, for the help.

My solution is based on bobbogo's solution. Thank you.

Regular expression:

(?=(XX.*?YY.*?ZZ))(?=(.*ZZ))

Result (from RegexBuddy):

1 XXccYYeeXX_ZZ         XXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZ
2 XX_ZZkkYYmmXX_ZZ      XX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZ
3 XX_ZZnnXXooYYuuXX_ZZ  XX_ZZnnXXooYYuuXX_ZZ
4 XXooYYuuXX_ZZ         XXooYYuuXX_ZZ

Can it be optimized further? I am not a regex expert.
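For anyone who wants to try the two-lookahead pattern from code, here is a minimal sketch. It is written in Java, and the class and method names are made up for this illustration. The pattern uses only lookaheads and lazy quantifiers, so it should carry over to PHP's preg_match_all() unchanged. Because every match is zero-length, the matcher advances one character at a time, which is what lets the captured groups overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class OverlappingMatches {
    // Group 1: shortest XX..YY..ZZ starting at the current position.
    // Group 2: everything from the current position up to the last ZZ.
    static final Pattern PATTERN =
            Pattern.compile("(?=(XX.*?YY.*?ZZ))(?=(.*ZZ))");

    // Returns one {group 1, group 2} pair per position where the lookaheads succeed.
    static List<String[]> findAll(String input) {
        List<String[]> results = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            results.add(new String[] { m.group(1), m.group(2) });
        }
        return results;
    }

    public static void main(String[] args) {
        String input = "aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv";
        for (String[] pair : findAll(input)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

Note that this reports only the shortest and the longest match for each starting XX. A match that ends at a ZZ in between (for example XXccYYeeXX_ZZkkYYmmXX_ZZ) is not listed.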

Saturday, May 29, 2021
 
McAn
answered 7 Months ago
43

You could always run html_entity_decode before you run htmlentities? Works unless you only want to do ampersands (and even then you can play with the charset parameters).

Much easier and faster than a regex.

Wednesday, March 31, 2021
 
FWH
answered 9 Months ago
48

There are a couple of problems:

  1. Your regex pattern will also match an input of more than 15 characters.
  2. Your regex will also allow other disallowed characters in the middle, such as @ or #, due to the use of \S.

You can fix it by using a negative lookahead to disallow consecutive occurrences of period/hyphen/underscore, and by removing the \S from the middle of the regex, since it allows any non-space character:

^[a-zA-Z0-9](?!.*[_.-]{2})[\w.-]{4,13}[a-zA-Z0-9]$

RegEx Demo
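A quick way to sanity-check the pattern is to run it against a few inputs. The sketch below is in Java, and the class name, method name, and sample inputs are invented for this example. It assumes the rules implied by the pattern: 6 to 15 characters, a letter or digit at each end, and no two of period/hyphen/underscore in a row.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper, named for this example only.
class UsernameCheck {
    // Same pattern as above; backslashes are doubled for the Java string literal.
    static final Pattern USERNAME = Pattern.compile(
            "^[a-zA-Z0-9](?!.*[_.-]{2})[\\w.-]{4,13}[a-zA-Z0-9]$");

    static boolean isValid(String candidate) {
        return USERNAME.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc_def",          // plain valid input
            "a.b-c_d",          // separators allowed when not adjacent
            "ab__cd",           // two separators in a row
            "abcde",            // too short (5 characters)
            "abcdefghijklmnop", // too long (16 characters)
            "_abcdef",          // starts with a separator
            "ab@cdef"           // character outside the allowed set
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Only the first two samples should be accepted.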

Saturday, May 29, 2021
 
SuperString
answered 7 Months ago
70

The order of the alternatives in the regex is important. I'm not sure this fully solves the issue, and the method may be fundamentally flawed, but you can try this:

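$regex = [];

// Build every run of consecutive ascending digits of length 2 or more
// ("12", "123", ..., "89").
for ($i = 0; $i < 10; $i++) {
    $str = "";
    for ($a = 0; $a < 10; $a++) {
        if ($a > $i) {
            $str .= $a;
            if (strlen($str) > 1) {
                $regex[] = $str;
            }
        }
    }
}

// Sort longest first so the alternation tries the longest runs before shorter ones.
usort($regex, function ($a, $b) {
    return strlen($b) <=> strlen($a);
});

// Join the runs into a single alternation pattern.
$myregex = '/'.implode('|', $regex).'/';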

It builds an array of the number sequences, then sorts them by length so that the longest sequences come first. The end result (after matching) is:

array(1) {
  [0]=>
  array(9) {
    [0]=>
    string(3) "234"
    [1]=>
    string(2) "12"
    [2]=>
    string(4) "6789"
    [3]=>
    string(2) "12"
    [4]=>
    string(3) "123"
    [5]=>
    string(5) "45678"
    [6]=>
    string(2) "12"
    [7]=>
    string(2) "12"
    [8]=>
    string(7) "2345678"
  }
}

Also note that the spaceship operator <=> only works in PHP 7+.

Hope it helps.

Sandbox

and not go to the next chars after a match

I don't think this is possible with regex, if you mean that you want to find 23, 234, and 2345 all at once in 2345607, for example. However, if the regex matches a long sequence, it stands to reason that every shorter prefix of that sequence would match too. So you could just trim digits off the right-hand end until the length is 2 and collect the matches.
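The trimming idea can be sketched like this. It is shown in Java rather than PHP, and the class and method names are made up for the example. It takes one matched sequence and returns it together with each shorter prefix, down to a length of 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for this example only.
class SequencePrefixes {
    // Returns the match itself, then each shorter prefix, stopping at length 2.
    static List<String> shorterMatches(String match) {
        List<String> out = new ArrayList<>();
        for (int end = match.length(); end >= 2; end--) {
            out.add(match.substring(0, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [2345, 234, 23]
        System.out.println(shorterMatches("2345"));
    }
}
```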

Saturday, May 29, 2021
 
inieto
answered 7 Months ago
22

Some Java code using recursion.

The basic idea is to try to swap each element with the current position and then recurse on the next position (but we also need startPos here to indicate what the last position that we swapped with was, otherwise we'll get a simple permutation generator). Once we've got enough elements, we print all those and return.

static void subsets(int[] arr, int pos, int depth, int startPos)
{
   if (pos == depth)
   {
      for (int i = 0; i < depth; i++)
         System.out.print(arr[i] + "  ");
      System.out.println();
      return;
   }
   for (int i = startPos; i < arr.length; i++)
   {
      // optimization - not enough elements left
      if (depth - pos + i > arr.length)
         return;

      // swap pos and i
      int temp = arr[pos];
      arr[pos] = arr[i];
      arr[i] = temp;

      subsets(arr, pos+1, depth, i+1);

      // swap pos and i back; otherwise things get messed up
      temp = arr[pos];
      arr[pos] = arr[i];
      arr[i] = temp;
   }
}

public static void main(String[] args)
{
   subsets(new int[]{1,3,7,9}, 0, 3, 0);
}

Prints:

1  3  7  
1  3  9  
1  7  9  
3  7  9  

A more detailed explanation (through example):

First things first - in the above code, an element is kept in the same position by performing a swap with itself - it doesn't do anything, just makes the code a bit simpler.

Also note that at each step we revert all swaps made.

Say we have input 1 2 3 4 5 and we want to find subsets of size 3.

First we just take the first 3 elements - 1 2 3.

Then we swap the 3 with 4 and 5 respectively,
and the first 3 elements give us 1 2 4 and 1 2 5.

Note that we've just finished doing all sets containing 1 and 2 together.

Now we want sets of the form 1 3 X, so we swap 2 and 3 and get 1 3 2 4 5. But we already have sets containing 1 and 2 together, so here we want to skip 2. So we swap 2 with 4 and 5 respectively, and the first 3 elements give us 1 3 4 and 1 3 5.

Now we swap 2 and 4 to get 1 4 3 2 5. But we want to skip 3 and 2, so we start from 5. We swap 3 and 5, and the first 3 elements give us 1 4 5.

And so on.

Skipping elements here is perhaps the most complex part. Note that whenever we skip elements, it just involves continuing from after the position we swapped with (when we swapped 2 and 4, we continued from after where the 4 was). This is correct because there's no way an element can get to the left of the position we're swapping with without having been processed, nor can a processed element get to the right of that position, because we process all the elements from left to right.

Think in terms of the for-loops

It's perhaps simplest to think of the algorithm in terms of for-loops.

for i = 0 to size
  for j = i + 1 to size
    for k = j + 1 to size
      subset[] = {set[i],set[j],set[k]}

Each recursive step would represent a for-loop.

startPos is 0, i+1 and j+1 respectively.

depth is how many for-loops there are.

pos is which for-loop we're currently at.

Since we never go backwards in a deeper loop, it's safe to use the start of the array as storage for our elements, as long as we revert the changes when we're done with an iteration.
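For comparison, here is the for-loop picture written out as runnable Java for the fixed case of subsets of size 3. This is only a sketch with an invented class and method name. It hard-codes the depth that the recursive version takes as a parameter, and it collects the subsets in a list instead of printing them directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class, named for this example only.
class TripleLoops {
    // One loop per chosen element; each inner loop starts just after the
    // index picked by the loop around it, so no subset is produced twice.
    static List<int[]> subsetsOfThree(int[] set) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < set.length; i++) {
            for (int j = i + 1; j < set.length; j++) {
                for (int k = j + 1; k < set.length; k++) {
                    result.add(new int[] { set[i], set[j], set[k] });
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] subset : subsetsOfThree(new int[] { 1, 3, 7, 9 })) {
            System.out.println(Arrays.toString(subset));
        }
    }
}
```

For the input 1, 3, 7, 9 this produces the same four subsets, in the same order, as the recursive version above.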

Wednesday, October 6, 2021
 
700 Software
answered 2 Months ago