Author Topic: regex (regular expressions) - open discussion topic  (Read 28854 times)

Bee-liever

  • Member
  • Sr. Member
  • *****
  • Posts: 3830
  • MB Version: 3.6.8830 P
I also tried the offline tester 'Expresso' and set it to use Visual Basic.
Same result.
If you use Expresso in C# you get the Misa criollaCredo result.
Turning on 'Explicit Capture' gives the "Misa criolla: " result.
MusicBee and my library - Making bee-utiful music together

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
Only MusicBee returns: "Misa criollaCredo"
As Steven said, it's to do with the .NET version of regex and something called implicit and explicit capture groups.
Try adding (?n) to the start of your regex.

Not saying you are wrong, and I will try it, but I believe that should be related to (not) capturing 'named groups'. And here I am not naming groups anyway.

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
I also tried the offline tester 'Expresso' and set it to use Visual Basic.
Same result.
If you use Expresso in C# you get the Misa criollaCredo result.
Turning on 'Explicit Capture' gives the "Misa criolla: " result.

That's weird.
Here I get the expected result both using C# and VB.



Are you referring to the 'explicit capture' button in design mode?

Bee-liever

  • Member
  • Sr. Member
  • *****
  • Posts: 3830
  • MB Version: 3.6.8830 P
Are you referring to the 'explicit capture' button in design mode?
yes

Steven

  • Administrator
  • Sr. Member
  • *****
  • Posts: 34312
there are a number of options with RegEx. The only one MB sets is IgnoreCase

there are some others that look like they might be relevant:
ECMAScript
Enables ECMAScript-compliant behavior for the expression. This value can be used only in conjunction with the IgnoreCase, Multiline, and Compiled values. The use of this value with any other values results in an exception. For more information on the RegexOptions.ECMAScript option, see the "ECMAScript Matching Behavior" section in the Regular Expression Options topic.

ExplicitCapture
Specifies that the only valid captures are explicitly named or numbered groups of the form (?<name>…). This allows unnamed parentheses to act as noncapturing groups without the syntactic clumsiness of the expression (?:…). For more information, see the "Explicit Captures Only" section in the Regular Expression Options topic.

you can force these options in your expressions:
https://msdn.microsoft.com/EN-US/library/yd1hzczs(v=VS.110,d=hv.2).aspx

Bee-liever

  • Member
  • Sr. Member
  • *****
  • Posts: 3830
  • MB Version: 3.6.8830 P
just tried the (?n) at beginning of expression - doesn't work  :(
gets parsed as another capture group

Steven

  • Administrator
  • Sr. Member
  • *****
  • Posts: 34312
i tried running this directly
Misa criolla: Credo
$RxReplace(<Title>,"(^.+?)\:\s","$1")

and always get:
Misa criollaCredo

i tried different initialisation options on regex and none helped

did you want to put a space after $1?
$RxReplace(<Title>,"(^.+?)\:\s","$1 ")
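
For anyone comparing MB's output with a tester, a small sketch may help separate three things that are easy to mix up: the whole match, capture group 1, and the result of a replace. It uses Python's re module as a stand-in, not the .NET engine MB uses, so it is an illustration only; the variable names are made up for this example. Whether a given tester displays the match or the replaced string is an assumption worth checking in that tester.

```python
import re

title = "Misa criolla: Credo"
pattern = r"(^.+?)\:\s"  # same pattern as in the $RxReplace call above

m = re.search(pattern, title)
whole_match = m.group(0)  # everything the pattern consumed, including ": "
group_one = m.group(1)    # only what the parentheses captured

# A replace swaps the matched text for the replacement and keeps the rest
# of the input ("Credo") untouched.
replaced = re.sub(pattern, r"\1", title)

print(repr(whole_match))
print(repr(group_one))
print(repr(replaced))
```

In this sketch the replace result is "Misa criollaCredo", while "Misa criolla: " is the whole match, which would be consistent with a tester showing the match rather than the replaced text.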

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
I tried this .net tester, using both ECMAScript settings you referred to as a possible culprit, and the result is the same:


Steven

  • Administrator
  • Sr. Member
  • *****
  • Posts: 34312
i tried running this directly
Misa criolla: Credo
$RxReplace(<Title>,"(^.+?)\:\s","$1")

and always get:
Misa criollaCredo

i tried different initialisation options on regex and none helped

did you want to put a space after $1?
$RxReplace(<Title>,"(^.+?)\:\s","$1 ")
further to this, directly running the .net regex split function
Split("Misa criolla: Credo")

returns
{Length=3}
    (0): ""
    (1): "Misa criolla"
    (2): "Credo"

so i don't think it's a case of MB parsing the text incorrectly - this is simply how the .net 4.0 regex is processing the example
I have no idea why the first split item is blank
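
A possible explanation for the blank first item, sketched with Python's re.split (an assumption: it is used here only because it follows the same general split convention, it is not the .NET call above). The pattern matches starting at position 0, so the text before the match is the empty string, and the captured group is then included in the result as its own item.

```python
import re

title = "Misa criolla: Credo"

# With a capturing group: text before the match (empty, because the match
# starts at position 0), then the captured group, then the remainder.
with_group = re.split(r"(^.+?)\:\s", title)

# With a non-capturing group the captured text is no longer inserted.
without_group = re.split(r"(?:^.+?)\:\s", title)

print(with_group)
print(without_group)
```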

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
Just to clarify why I brought this up, so as not to waste anybody's time:
It is not that I want to get these specific examples giving these specific results.
So it's not like "please help me get this specific regex formula to work".

It's about trying to improve my regex skills, reading a lot, trying out a lot, and hoping to make use of some regex testers so I can experiment, validate, and fine-tune formulas in those, before copy/pasting them into MusicBee.

Running into this simple regex giving a different outcome in MB than it does in every tester I tried worries me about what will happen when using more complicated regexes.
I thought I either ran into some bug, or I was overlooking something very obvious. That's why I brought it up.

Bee-liever

  • Member
  • Sr. Member
  • *****
  • Posts: 3830
  • MB Version: 3.6.8830 P
@ hiccup
I noticed in your screenshot from RegexPlanet that the field "as a .Net string" has extra backslashes in the regex.
Maybe they need to be entered in the regex-pattern for MB?

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
I noticed in your screenshot from RegexPlanet that the field "as a .Net string" has extra backslashes in the regex.
Maybe they need to be entered in the regex-pattern for MB?

That's well spotted. It does make a difference indeed, but only in the sense that MB then returns Misa criolla: Credo instead of Misa criollaCredo
But still not what the regex testers show: Misa criolla:

B.t.w. I don't understand why you should escape a backslash with a backslash there, but that's surely me.
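
On the doubled backslashes, a rough sketch in Python (an assumption: Python string literals stand in for the ".Net string" field of the tester, and nothing here is MB's actual code). In an ordinary string literal the backslash is itself an escape character, so it has to be doubled in source code to reach the regex engine as a single backslash. If the doubled form is typed where no such unescaping happens, the pattern asks for a literal backslash, finds none, and the title comes back unchanged.

```python
import re

title = "Misa criolla: Credo"

as_typed = r"(^.+?)\:\s"        # the pattern as the regex engine should see it
as_literal = "(^.+?)\\:\\s"     # the same pattern written as a normal string literal
same_pattern = as_typed == as_literal

# Hypothetical case: the doubled backslashes reach the engine as-is.
# "\\:" now means "a literal backslash followed by a colon".
doubled = r"(^.+?)\\:\\s"
unchanged = re.sub(doubled, r"\1", title)

print(same_pattern)
print(repr(unchanged))
```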

theta_wave

  • Sr. Member
  • ****
  • Posts: 680
Guys, I can replicate hiccup's issue in notepad++'s regex parser.  Maybe notepad++ has a .Net dependency.  Anyways, it appears that notepad++ and Musicbee wrap regexes as /regex/g by default, repeating the regex until the end of the line (I noted this a few posts above).

Any regex expression I tested in notepad++ appears to work in Musicbee so far (except for escaping "\" which presents a whole host of issues to MB).  Anyways, I never use those online regex testers since they use Javascript to process the regexes and Javascript does not support all of the lookahead or lookbehind features of regex.
Last Edit: February 26, 2017, 03:06:00 PM by theta_wave

hiccup

  • Sr. Member
  • ****
  • Posts: 7782
Any regex expression I tested in notepad++ appears to work in Musicbee so far (except for escaping "\" which presents a whole host of issues to MB).  Anyways, I never use those online regex testers since they use Javascript to process the regexes and Javascript does not support all of the lookahead or lookbehind features of regex.

I used a regex helper plugin for notepad++, and it behaves the same as the online testers:
Do you use something else for np++?



About online testers, I am sure most use JavaScript indeed, but the ones I tried at regexplanet and derekslager specifically state they are for .net?

theta_wave

  • Sr. Member
  • ****
  • Posts: 680
I used a regex helper plugin for notepad++, and it behaves the same as the online testers:
Do you use something else for np++?


About online testers, I am sure most use JavaScript indeed, but the ones I tried at regexplanet and derekslager specifically state they are for .net?
Don't be wedded too much to online checkers.  Although they are a helpful tool in some respects in explaining what you're doing and learning the syntax, they may not produce the desired results when actually using the regex.  For example:
$ echo "Misa criolla: Credo" | perl -pe 's|(^.+?)\:\s|\1|'
Misa criollaCredo

$ echo "Misa criolla: Credo" | perl -pe 's|(^.+?)\:\s.*$|\1|'
Misa criolla

And yes, I have regex helper for notepad++, but I stopped using it long ago due to some of the reasons you inadvertently showed.  I like to think that regexes with capturing groups should cover the line from beginning to end.
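
The same comparison as the two perl one-liners, sketched in Python for anyone without perl at hand (an assumption: Python's re is close enough to perl for this pattern; the helper name keep_group_one is made up for illustration). Only the text the pattern actually matches is replaced, so extending the pattern to the end of the line is what drops the tail.

```python
import re

line = "Misa criolla: Credo"


def keep_group_one(pattern: str, text: str) -> str:
    """Replace whatever the pattern matches with its first capture group."""
    return re.sub(pattern, r"\1", text)


# Pattern stops after ": ", so "Credo" is outside the match and survives.
partial = keep_group_one(r"(^.+?)\:\s", line)

# Pattern runs to the end of the line, so the whole line is replaced.
full_line = keep_group_one(r"(^.+?)\:\s.*$", line)

print(partial)
print(full_line)
```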
Last Edit: February 27, 2017, 05:01:34 PM by theta_wave