Asked  9 Months ago    Answers:  5   Viewed   62 times

A column in my spreadsheet contains data like this:

5020203010101/FIS/CASH FUND/SBG091241212

I need to extract the last part of string after forwward slash (/) i.e; SBG091241212

I tried the following regular expression but it does not seem to work:

/.*$

Any Idea?

 Answers

38

You need to specify a matching group using brackets in order to extract content.

preg_match("//([^/]+)$/", "5020203010101/FIS/CASH FUND/SBG091241212", $matches);

echo $matches[1];
Wednesday, March 31, 2021
 
exxed
answered 9 Months ago
15

This will work only for non-nested parentheses:

    $regex = <<<HERE
    /  "  ( (?:[^"\\]++|\\.)*+ ) "
     | '  ( (?:[^'\\]++|\\.)*+ ) '
     | ( ( [^)]*                  ) )
     | [s,]+
    /x
    HERE;

    $tags = preg_split($regex, $str, -1,
                         PREG_SPLIT_NO_EMPTY
                       | PREG_SPLIT_DELIM_CAPTURE);

The ++ and *+ will consume as much as they can and give nothing back for backtracking. This technique is described in perlre(1) as the most efficient way to do this kind of matching.

Wednesday, March 31, 2021
 
KingCrunch
answered 9 Months ago
52

The standard disclaimer applies: Parsing HTML with regular expressions is not ideal. Success depends on the well-formedness of the input on a character-by-character level. If you cannot guarantee this, the regex will fail to do the Right Thing at some point.

Having said that:

<ab[^>]*>(.*?)</a>   // match group one will contain the link text
Saturday, May 29, 2021
 
lewiguez
answered 7 Months ago
44

Using gsub or sub you can do this :

 gsub('.*-([0-9]+).*','\1','Ab_Cd-001234.txt')
"001234"

you can use regexpr with regmatches

m <- gregexpr('[0-9]+','Ab_Cd-001234.txt')
regmatches('Ab_Cd-001234.txt',m)
"001234"

EDIT the 2 methods are vectorized and works for a vector of strings.

x <- c('Ab_Cd-001234.txt','Ab_Cd-001234.txt')
sub('.*-([0-9]+).*','\1',x)
"001234" "001234"

 m <- gregexpr('[0-9]+',x)
> regmatches(x,m)
[[1]]
[1] "001234"

[[2]]
[1] "001234"
Thursday, June 10, 2021
 
MassiveAttack
answered 6 Months ago
23

I do not know whether and how this is possible with functions provided by stringr but you can also use base regexpr and substring:

pattern <- paste0("(?<=", left.border, ")[a-z]+(?=", right.border, ")")
# "(?<=nana)[a-z]+(?=baba)"

rx <- regexpr(pattern, text=my.string, perl=TRUE)
# [1] 5
# attr(,"match.length")
# [1] 6

substring(my.string, rx, rx+attr(rx, "match.length")-1)
# [1] "qwerty"
Tuesday, August 10, 2021
 
user1438082
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share