Asked  7 Months ago    Answers:  5   Viewed   15 times

The \K escape sequence resets the beginning of the match to the current position in the token list (this only affects what is reported as the full match).

Which environments/languages/versions support \K (keep) in their regular expression engines, and what libraries are needed (if any) to use this feature within patterns?



The \K escape sequence is supported by several engines, languages or tools, such as:

  • boost (since ???)
  • grep -P (uses PCRE)
  • Oniguruma (since 5.13.3)
  • PCRE (since 7.2)
  • Perl (since 5.10.0)
  • PHP (since 5.2.4)
  • Ruby (since 2.0.0)
  • Notepad++ (since 6.0)

...and (so far) not supported by:

  • .NET
  • awk
  • bash
  • ICU
  • Java
  • JavaScript
  • Objective-C
  • Python
  • Qt/QRegExp
  • sed
  • Tcl
  • vim (it doesn't have \K, but its \zs is equivalent)
  • XML
  • XPath
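
For engines in the "not supported" list, the effect of \K can often be approximated with a capture group. The sketch below uses Python's standard re module, which the list above reports as lacking \K. The pattern foo\Kbar and the helper names are hypothetical, chosen only for illustration: the intent is to require "foo" but report or replace only "bar".

```python
import re

# Hypothetical target: the intent of `foo\Kbar` is "require foo, report only bar".
# Without \K, capture the part to keep and read it from group 1.
KEEP_PATTERN = re.compile(r"foo(bar)")

def keep_after_prefix(text):
    """Return the text that `foo\\Kbar` would report as the match, or None."""
    m = KEEP_PATTERN.search(text)
    return m.group(1) if m else None

def replace_after_prefix(text, replacement):
    """Emulate s/foo\\Kbar/replacement/g by capturing the prefix and re-emitting it."""
    # A function replacement avoids having to escape backslashes in `replacement`.
    return re.sub(r"(foo)bar", lambda m: m.group(1) + replacement, text)

print(keep_after_prefix("xfoobar"))                  # the kept part only
print(replace_after_prefix("foobar foobaz", "BAZ"))  # prefix survives the substitution
```

The capture-group form is not a drop-in replacement in every case (for example, when the pattern is used for splitting or when overlapping prefixes matter), so treat it as a workaround rather than an equivalent.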
Tuesday, June 1, 2021
answered 7 Months ago

You could use a lookahead assertion, as others have suggested. Or, if you just want to use basic regular expression syntax:


This matches strings that are zero or one character long (^.?$), which therefore cannot be my; strings of two or more characters whose first character is not an m, after which any characters may follow (^[^m].+); and strings whose first character is an m that is not followed by a y (^m[^y]).
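
The three alternatives described above can be combined into a single alternation. The pattern below is a reconstruction assembled from those pieces, not necessarily the exact expression the answer originally showed, and it assumes the goal is to reject strings that start with my. The trailing .* on the last branch and the function name are assumptions added for this sketch.

```python
import re

# Reconstructed from the three branches described in the text (an assumption,
# not the answer's original pattern):
#   .?$       zero or one character
#   [^m].+    two or more characters, first is not "m"
#   m[^y].*   first is "m", second is not "y"
NOT_STARTING_WITH_MY = re.compile(r"^(?:.?$|[^m].+|m[^y].*)")

def does_not_start_with_my(text):
    """True when `text` does not begin with the two characters "my"."""
    return NOT_STARTING_WITH_MY.match(text) is not None

for sample in ["", "m", "mx", "other", "my", "mything"]:
    print(repr(sample), does_not_start_with_my(sample))
```

Only basic constructs (anchors, character classes, alternation) are used, so the same idea carries over to engines without lookahead.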

Sunday, June 27, 2021
answered 6 Months ago

The replacement expression is:

  • \1, \2 are the captures (or $1, $2)
  • \u up-cases the next character (see the Replacement String Syntax section).

See the Regular Expressions chapter (in the TextMate docs) for more information.

There's already a package that does this, and more:

  • Brief blog about CaseConversion
  • CaseConversion package
Sunday, September 5, 2021
answered 3 Months ago

In some regex flavors, you can use a (variable-length) lookbehind:

s/(?<=^ *)  /   /g

In other flavors, you can reverse the string, use a lookahead (which is far more widely supported than variable-length lookbehind) and reverse again:

 s/  (?= *$)/   /g
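
The reverse, substitute, reverse trick can be sketched in Python, whose standard re module rejects the variable-length lookbehind shown above. The replacement widths (two spaces become three) are copied from the substitution as it appears here; the whitespace may have been altered when the page was captured, so treat them as an assumption. The function name is hypothetical.

```python
import re

def widen_leading_indent(line):
    """Replace each pair of leading spaces with three spaces.

    Works on a single line without a trailing newline: the line is reversed
    so that the leading indent becomes a trailing run of spaces, which a
    plain lookahead can recognise.
    """
    reversed_line = line[::-1]
    # Two spaces followed only by spaces up to the end of the reversed line.
    widened = re.sub(r"  (?= *$)", "   ", reversed_line)
    return widened[::-1]

print(repr(widen_leading_indent("    x = 1")))
print(repr(widen_leading_indent("  a  b")))
```

Double spaces in the middle of the line are left alone, because in the reversed string they are followed by a non-space character and the lookahead fails.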
Sunday, September 19, 2021
answered 3 Months ago

Hive's supported notation for regex backreferences (at least in 0.14, and as far as I recall in 0.13.x as well) seems to be $1 for capture group 1, $2 for capture group 2, etc. It looks like it is based upon (and may even be implemented by) the replaceAll method from Java's Matcher class. This is the germane portion of that documentation:

Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

So I think what you want is this:

select regexp_replace('2015-01-01 02:03:04 +0:00', ' ([+-])', ' GMT$1');

For example:

hive> select regexp_replace('2015-01-01 02:03:04 +0:00', ' ([+-])', ' GMT$1');
2015-01-01 02:03:04 GMT+0:00
Time taken: 0.072 seconds, Fetched: 1 row(s) 
hive> select regexp_replace('2015-01-01 02:03:04 -1:00', ' ([+-])', ' GMT$1');
2015-01-01 02:03:04 GMT-1:00
Time taken: 0.144 seconds, Fetched: 1 row(s)
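
To check the pattern outside Hive, the same substitution can be tried in Python. This is only an analogue, not Hive code: Python's re.sub writes the backreference as \1 (or \g<1>) rather than $1, and the function name is hypothetical.

```python
import re

def add_gmt_prefix(timestamp):
    """Insert "GMT" before a space-separated +/- UTC offset."""
    # " ([+-])" captures the sign that follows a space; r" GMT\1" re-emits it.
    return re.sub(r" ([+-])", r" GMT\1", timestamp)

print(add_gmt_prefix("2015-01-01 02:03:04 +0:00"))
print(add_gmt_prefix("2015-01-01 02:03:04 -1:00"))
```

The hyphens inside the date are untouched because none of them is preceded by a space.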
Sunday, September 26, 2021
answered 2 Months ago