Asked  9 Months ago    Answers:  5   Viewed   86 times

I need to split a string on commas and spaces, but ignore any commas and spaces inside double quotes, single quotes, or parentheses.

$str = "Questions, \"Quote\",'single quote','comma,inside' (inside parentheses) space #specialchar";

so that the resulting array will contain

[0]Questions
[1]Quote
[2]single quote
[3]comma,inside
[4]inside parentheses
[5]space
[6]#specialchar

My current regexp is

$tags = preg_split("/[,\s]*[^\w\s]+[\s]*/", $str,0,PREG_SPLIT_NO_EMPTY);

but this drops special characters and still splits on the commas inside quotes. The resulting array is:

[0]Questions
[1]Quote
[2]single quote
[3]comma
[4]inside
[5]inside parentheses
[6]space
[7]specialchar

PS: this is not CSV.

Many Thanks

 Answers

15

This will work only for non-nested parentheses:

    // Heredoc turns each \\\\ into \\, so the regex engine sees [^"\\] and \\.
    $regex = <<<HERE
    /  "  ( (?:[^"\\\\]++|\\\\.)*+ ) "
     | '  ( (?:[^'\\\\]++|\\\\.)*+ ) '
     | \( ( [^)]*                  ) \)
     | [\s,]+
    /x
    HERE;

    $tags = preg_split($regex, $str, -1,
                         PREG_SPLIT_NO_EMPTY
                       | PREG_SPLIT_DELIM_CAPTURE);

The ++ and *+ will consume as much as they can and give nothing back for backtracking. This technique is described in perlre(1) as the most efficient way to do this kind of matching.
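
To see what "give nothing back" means in practice, here is a minimal sketch comparing a greedy and a possessive quantifier. The test string and variable names are made up for illustration and are not part of the answer above.

```php
<?php
// Greedy: \d+ first grabs "123", then backtracks one digit
// so that the trailing \d still has something to match.
$greedy = preg_match('/^\d+\d$/', '123');

// Possessive: \d++ grabs "123" and never releases a digit,
// so the trailing \d has nothing left and the match fails.
$possessive = preg_match('/^\d++\d$/', '123');

var_dump($greedy, $possessive); // int(1) and int(0)
```

In the splitting pattern the possessive forms are safe because the closing quote can never be matched by `[^"\\]` or `\\.`, so there is nothing useful to backtrack into.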

Wednesday, March 31, 2021
 
KingCrunch
answered 9 Months ago
52

The standard disclaimer applies: Parsing HTML with regular expressions is not ideal. Success depends on the well-formedness of the input on a character-by-character level. If you cannot guarantee this, the regex will fail to do the Right Thing at some point.

Having said that:

<a\b[^>]*>(.*?)</a>   // match group one will contain the link text
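
As a rough sketch of how such a pattern might be applied in PHP, assuming well-formed input: the sample markup, the `~` delimiter, and the `i` and `s` flags below are illustrative choices, not part of the answer.

```php
<?php
// Hypothetical sample markup for illustration only.
$html = '<p><a href="/one">First</a> and <a class="x" href="/two">Second</a>, '
      . 'but not <abbr>abbr</abbr>.</p>';

// '~' is the delimiter, so the '/' in '</a>' needs no escaping.
// \b after 'a' stops the pattern from matching tags such as <abbr>.
// 's' lets '.' cross newlines inside the link text; 'i' ignores tag case.
preg_match_all('~<a\b[^>]*>(.*?)</a>~is', $html, $matches);

print_r($matches[1]); // the two link texts: "First" and "Second"
```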
Saturday, May 29, 2021
 
lewiguez
answered 7 Months ago
12

Try

return preg_replace('/(?<!-)\b('.implode('|',$commonWords).')\b(?!-)/i','',$input);

This adds negative lookaround expressions to the start and end of the regex so that a match is only allowed if there is no dash before or after the match.
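
A small sketch of the effect, using a hypothetical word list and input string (neither comes from the original question) and assuming the words contain no regex metacharacters:

```php
<?php
// Hypothetical inputs for illustration only.
$commonWords = ['the', 'a', 'in'];
$input = 'The built-in editor in a browser';

// (?<!-) rejects a word preceded by a hyphen, (?!-) one followed by a hyphen,
// so the "in" of "built-in" survives while the standalone "in" is removed.
$pattern = '/(?<!-)\b(' . implode('|', $commonWords) . ')\b(?!-)/i';
$stripped = preg_replace($pattern, '', $input);

// Collapse the leftover runs of spaces so the result is easier to read.
$clean = trim(preg_replace('/\s+/', ' ', $stripped));

echo $clean; // built-in editor browser
```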

Saturday, May 29, 2021
 
VieStar
answered 7 Months ago
86

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE TEST( str ) AS
          SELECT 'Hello world - test-test! - test' FROM DUAL
UNION ALL SELECT 'Hello world2 - test2 - test-test2' FROM DUAL;

Query 1:

SELECT Str,
       COLUMN_VALUE AS Occurrence,
       REGEXP_SUBSTR( str ,'(.*?)([[:space:]]-[[:space:]]|$)', 1, COLUMN_VALUE, NULL, 1 ) AS split_value
FROM   TEST,
       TABLE(
         CAST(
           MULTISET(
             SELECT LEVEL
             FROM   DUAL
             CONNECT BY LEVEL < REGEXP_COUNT( str ,'(.*?)([[:space:]]-[[:space:]]|$)' )
           )
           AS SYS.ODCINUMBERLIST
         )
       )

Results:

|                               STR | OCCURRENCE |  SPLIT_VALUE |
|-----------------------------------|------------|--------------|
|   Hello world - test-test! - test |          1 |  Hello world |
|   Hello world - test-test! - test |          2 |   test-test! |
|   Hello world - test-test! - test |          3 |         test |
| Hello world2 - test2 - test-test2 |          1 | Hello world2 |
| Hello world2 - test2 - test-test2 |          2 |        test2 |
| Hello world2 - test2 - test-test2 |          3 |   test-test2 |
Monday, July 5, 2021
 
motanelu
answered 5 Months ago
78

You could try the regex below, which uses positive lookahead:

string value = @"apple, orange, ""baboons, cows"", rainbow, ""unicorns, gummy bears""";
string[] lines = Regex.Split(value, @", (?=(?:""[^""]*?(?: [^""]*)*))|, (?=[^"",]+(?:,|$))");

foreach (string line in lines) {
    Console.WriteLine(line);
}

Output:

apple
orange
"baboons, cows"
rainbow
"unicorns, gummy bears"

IDEONE

Saturday, August 28, 2021
 
Jasper
answered 4 Months ago