regex (regular expressions)

General => General Discussions => Topic started by: hiccup on February 06, 2017, 08:23:16 AM

Title: regex (regular expressions) - open discussion topic
Post by: hiccup on February 06, 2017, 08:23:16 AM

This topic is for open discussion of RegEx (regular expressions).

This seems like a good idea, considering this is a rather specialized subject that will probably prove a challenge for most MusicBee users. A topic like this might be useful, even if just for scrolling through it to get some ideas and suggestions on the matter.

And with any luck, members with a good knowledge and understanding of regex might subscribe to this topic to get notifications of new posts, so they can jump in and help other members with their regex challenges.

___

update:

Some links that I find to be useful:

Regular Expressions: coding, examples, testing resources (a great RegEx overview/tutorial/cheatsheet tailored towards MusicBee by forum member karbock)
https://getmusicbee.com/forum/index.php?topic=38817.0

Cheatsheets:
http://regexstorm.net/reference
http://www.rexegg.com/regex-quickstart.html/regex-csharp.html#chars

Microsoft's RegEx reference guide:
https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

Online RegEx testers:
http://regexstorm.net/tester
https://regex101.com/
https://regexr.com/
https://www.regexplanet.com/advanced/dotnet/index.html

RegEx Tutorials:
http://www.rexegg.com/regex-quickstart.html/regex-csharp.html
https://www.regular-expressions.info/quickstart.html
https://www.codeproject.com/Articles/9099/The-30-Minute-Regex-Tutorial
https://regexone.com/

A MusicBee forum thread where you are invited to post your RegEx formulas that you believe may be useful to others:
https://getmusicbee.com/forum/index.php?topic=21150.0

A handy little RegEx test utility created by Steven:
download MusicBee RegEx test utility (https://rebrand.ly/MusicBee_Regex_tester)

Title: Re: regex (regular expressions) - free discussion topic
Post by: theta_wave on February 06, 2017, 08:46:31 AM

Subscribed. I'll try to offer explanations with any regex formulas I post. That's how I learned the syntax.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 06, 2017, 09:15:05 AM

That would be great. Even many 'tutorials for beginners' fail to describe in plain language what is happening in a formula or an expression.

Title: Re: regex (regular expressions) - open discussion topic
Post by: Alumni on February 07, 2017, 06:50:06 AM

I would appreciate help with this formula for now playing:

if %contentgroup% has value "classical" then display %composer%:_ otherwise display nothing

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 07, 2017, 05:00:37 PM

Quote from: Alumni on February 07, 2017, 06:50:06 AM

I would appreciate help with this formula for now playing:

if %contentgroup% has value "classical" then display %composer%:_ otherwise display nothing

The virtualtag you want doesn't need regex. I'm just going by what you wrote above.

Code

$if($contains(<grouping>,classical),<composer>":_",)

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 07, 2017, 05:30:56 PM

Quote from: hiccup on February 06, 2017, 09:15:05 AM

That would be great. Even many 'tutorials for beginners' fail to describe in plain language what is happening in a formula or an expression.

Just going off the virtualtags I showed you in the other thread.

First things first: everyone should try http://regexr.com/. Their cheatsheet was extremely helpful for me in learning the regex syntax (I have no affiliation with the site).

General $rxreplace virtualtag formula: $RxReplace(<field>,"regex-pattern","replace-text")

Purpose: Title (Live) -> (Live)
Virtualtag: $RxReplace(<Title>,"(^.*)(\([Ll]ive\)$)","$2")

Explanation (read the regex left to right):

( ) = Capturing parentheses (or capture group). Content enclosed by them is captured into memory and can be recalled. In the above regex pattern, there are two capture groups. The content of the first capture group can be returned using "$1" and the content of the second using "$2". If there were a third, one would use $3, for a fourth $4, and so on.

^ = Beginning of a line
$ = End of the line

.* = . (any character), * (matches zero or more of the preceding token, here "."). .* = any number of characters going rightward. For example, ".*" by itself matches an entire line, since "." selects any character and "*" repeats it any number of times. Other examples: K.* = selects the first K and every character following it to the end of the line; .*K = selects everything from the beginning of the line to the last K. .*? = The "?" makes the search non-greedy, so .*?K = selects everything from the start of the line to the first K encountered.
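
For anyone who wants to see the greedy versus non-greedy difference outside MusicBee, here is a minimal sketch in Python. Python's `re` module is not the .NET engine MusicBee uses, but these basic tokens are assumed to behave the same in both. The sample line is made up.

```python
import re

line = "OK Computer - Karma Police"  # made-up sample text

# Greedy: .* runs to the end of the line, then backs up to the LAST K.
greedy = re.match(r".*K", line).group(0)

# Non-greedy: .*? stops at the FIRST K it can reach.
lazy = re.match(r".*?K", line).group(0)

# K.* selects the first K and everything after it to the end of the line.
from_k = re.search(r"K.*", line).group(0)

print(repr(greedy))
print(repr(lazy))
print(repr(from_k))
```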

^[Ll]ive$ = The brackets here mean that any one of the characters included within them will match a single character. Here, [Ll]ive means that the regex-pattern will find "Live" or "live". Likewise, [FfLl]ive will find Five, five, Live and live. On the flip side, adding a "^" at the beginning of the bracket, such as [^Ll]ive, will find any character other than "L" or "l" followed by "ive" (Five, dive, hive and so on), while excluding "Live" and "live" from the results.
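
The bracket behaviour can be tried in a few lines of Python. This is only a sketch: the word lists are made up, and Python's `re` is assumed to treat these character classes the same way the .NET engine does. Note that MusicBee is said later in this thread to set IgnoreCase, which this sketch does not.

```python
import re

# Without anchors or other borders, [Ll]ive is also found inside longer words.
inside_words = re.findall(r"[Ll]ive", "Livermore alive olive lively")

# [FfLl]ive accepts any one of F, f, L or l in front of "ive".
four_options = re.findall(r"[FfLl]ive", "Five five Live live dive")

# [^Ll]ive accepts any single character EXCEPT L or l in front of "ive".
negated = re.findall(r"[^Ll]ive", "Five five Live live dive")

# With both anchors, the whole line has to be exactly Live or live.
anchored_hit = re.search(r"^[Ll]ive$", "live") is not None
anchored_miss = re.search(r"^[Ll]ive$", "alive") is not None

print(inside_words, four_options, negated, anchored_hit, anchored_miss)
```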

Note, if the two anchors ^ and $ were not there, then [Ll]ive would also select Live or live inside the following words: Livermore, alive, olive, lively (you get the picture). In the regex-pattern at the top, [Ll]ive is bordered by parentheses, so we use those to demarcate it from other words we don't want to match. As for why the parentheses are preceded by \, read the next paragraph.

Punctuation marks = What if someone wants to find parentheses or periods? Use escape characters. Putting a "\" before a character that has a special meaning in regex-patterns (like . ( ) [ ]) makes the regex-pattern search for that literal character. From above, \( means that the regex-pattern will search for a parenthesis rather than opening another capture group. Likewise, \. means that a period will be searched for rather than any character.
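
A short Python sketch of the escaping rule, using the pattern from the top of this post written with the escaped parentheses that the explanation describes. Python's `re` stands in for the .NET engine here, which is an assumption, and Python writes the second capture group as \2 where MusicBee uses $2.

```python
import re

title = "Title (Live)"

# Unescaped: ( ) only open and close a capture group, so the pattern
# finds the bare word and never looks at the literal parentheses.
bare = re.search(r"([Ll]ive)", title).group(0)

# Escaped: \( and \) match the parenthesis characters themselves.
literal = re.search(r"\([Ll]ive\)", title).group(0)

# The full pattern: group 1 is everything before, group 2 is "(Live)".
result = re.sub(r"(^.*)(\([Ll]ive\)$)", r"\2", title)

print(bare, literal, result)
```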

"$2" = This is the replace-text portion of $RxReplace. Here, I'm telling it to replace <title> with whatever content is in the second captured group.

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 17, 2017, 04:00:35 PM

Hi, everyone:

Does anyone know whether it is possible to use the content of a <tag> (as a literal string) in the expression?

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 19, 2017, 04:24:06 AM

Quote from: marlonob on February 17, 2017, 04:00:35 PM

Hi, everyone:

Does anyone know whether it is possible to use the content of a <tag> (as a literal string) in the expression?

I would think so since regex is all about manipulating the text contents in <tag>. Do you have an example of what you wanted to do?

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 19, 2017, 04:10:22 PM

Quote from: theta_wave on February 19, 2017, 04:24:06 AM

Do you have an example of what you wanted to do?

Yes. I have a virtual tag (for classical) that deletes the work part of the title and leaves only the track-specific information, like this:
(http://i.imgur.com/WkcydXL.png)

Where (for the first track):

Work:   Symphony no. 4 in D minor, op. 120
Title:  Symphony no. 4 in D minor, op. 120: I. Ziemlich langsam - Lebhaft
Title-: ⋮I. Ziemlich langsam - Lebhaft

<Title-> is my virtual tag, defined as

$RxReplace($Replace(<Title>,<Work>,⋮),"⋮[ :,]*","⋮")

The problem occurs when the <Work> is used in the part of the title that I want to preserve. Example:
(http://i.imgur.com/9vM0ADY.png)
See track 06, where:

Work:   La noche de los mayas
Title:  La noche de los mayas: I. La noche de los mayas
Title-: ⋮I. ⋮

The expected <Title-> is

⋮I. La noche de los mayas

So, I’d rather use something like

$RxReplace(<Title>,"^"<Work>"[ :,]*(.+)","⋮$1")

for my virtual tag, but I can’t find a way to do so.
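
To make the difference between the two approaches concrete, here is a sketch in Python rather than MusicBee syntax. It does not show a way to do this in a virtual tag; the function names are made up for illustration, and `re.escape` is Python's way of treating the work as a literal string inside a pattern.

```python
import re

def title_via_replace(title, work):
    # Mirrors $RxReplace($Replace(<Title>,<Work>,⋮),"⋮[ :,]*","⋮"):
    # every occurrence of the work is swapped for ⋮, wanted or not.
    return re.sub(r"⋮[ :,]*", "⋮", title.replace(work, "⋮"))

def title_via_anchored_work(title, work):
    # The wished-for form: the work is matched only at the start of the
    # title. re.escape() keeps characters such as "." in the work literal.
    return re.sub("^" + re.escape(work) + r"[ :,]*(.+)", r"⋮\1", title)

work = "La noche de los mayas"
title = "La noche de los mayas: I. La noche de los mayas"

print(title_via_replace(title, work))
print(title_via_anchored_work(title, work))
```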

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 19, 2017, 05:43:52 PM

Quote from: marlonob on February 19, 2017, 04:10:22 PM

Where (for the first track):
Work:   Symphony no. 4 in D minor, op. 120
Title:  Symphony no. 4 in D minor, op. 120: I. Ziemlich langsam - Lebhaft
Title-: ⋮I. Ziemlich langsam - Lebhaft
<Title-> is my virtual tag, defined as
$RxReplace($Replace(<Title>,<Work>,⋮),"⋮[ :,]*","⋮")
See track 06, where:
Work:   La noche de los mayas
Title:  La noche de los mayas: I. La noche de los mayas
Title-: ⋮I. ⋮
The expected <Title-> is
⋮I. La noche de los mayas
So, I’d rather use something like
$RxReplace(<Title>,"^"<Work>"[ :,]*(.+)","⋮$1")
for my virtual tag, but I can’t find a way to do so.

Hmm, I don't know why you need to include <work> in your regex when you can still easily parse <Title> without including that tag field. I don't know the significance of "⋮" in your setup, but I'll include it anyways. You seem to know your way around regex, so you can remove it if you don't need it.

What I would use for <Title->:

Code

$rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","⋮$3")

The "?" just makes the search non-greedy, so it will stop when it encounters the first ": ". "\s" simply denotes whitespace. I didn't need to escape the ":", but I guess that's just old practice from the days I was learning regex and felt it was better to escape a weird character rather than finding out after completing the regex the inoperative expression was due to not escaping said character.

For the following <Title>:
Symphony no. 4 in D minor, op. 120: I. Ziemlich langsam - Lebhaft
to
⋮I. Ziemlich langsam - Lebhaft

La noche de los mayas: I. La noche de los mayas
to
⋮I. La noche de los mayas

Die Entführung aus dem Serail (The Abduction from the Seraglio), K.384: Act III, No.21b. Chorus: "Bassa Selim leve lange!"
to
⋮Act III, No.21b. Chorus: "Bassa Selim leve lange!"

Edit, mini-editorial: The addition of the <Work> field is nice, but I'm lazy and I don't want to retag 20k files with <Work> or even create an advanced search and replace setting. Since all of my classical music follows the "Work: Movement" format, I simply use regex in a virtualtag for the job: $rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","$1")
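
The same pattern can be checked outside MusicBee with a small Python sketch. Python's `re.sub` is used as a stand-in for $rxreplace (an assumption), with \1 and \3 in place of $1 and $3.

```python
import re

# (Work)(: )(Movement), split at the first ": " because .+? is non-greedy.
PATTERN = r"(^.+?)(\:\s)(.*$)"

titles = [
    "Symphony no. 4 in D minor, op. 120: I. Ziemlich langsam - Lebhaft",
    "La noche de los mayas: I. La noche de los mayas",
    'Die Entführung aus dem Serail (The Abduction from the Seraglio), K.384: '
    'Act III, No.21b. Chorus: "Bassa Selim leve lange!"',
]

# Group 3 is the movement, group 1 is the work.
movements = [re.sub(PATTERN, r"⋮\3", t) for t in titles]
works = [re.sub(PATTERN, r"\1", t) for t in titles]

for work, movement in zip(works, movements):
    print(work, "|", movement)
```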

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 19, 2017, 06:03:03 PM

Quote from: theta_wave on February 19, 2017, 05:43:52 PM

$rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","⋮$2")

I think the $2 at the end should be $3 ?

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 19, 2017, 06:06:29 PM

Quote from: hiccup on February 19, 2017, 06:03:03 PM

Quote from: theta_wave on February 19, 2017, 05:43:52 PM
$rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","⋮$2")

I think the $2 at the end should be $3 ?

Noted and edited prior to your post :). Heh, I guess I was typing a little fast and didn't bother checking.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 19, 2017, 06:10:06 PM

Quote from: theta_wave on February 19, 2017, 06:06:29 PM

Noted and edited prior to your post :). Heh, I guess I was typing a little fast and didn't bother checking.

Just wanted to say your contributions and explanations in this thread are really great.
Without it I would probably still be very lost and confused.
And look at me now, already correcting a black-belt.
hahaha

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 19, 2017, 06:31:02 PM

Quote from: hiccup on February 19, 2017, 06:10:06 PM

Just wanted to say your contributions and explanations in this thread are really great.
Without it I would probably still be very lost and confused.
And look at me now, already correcting a black-belt.
hahaha

Thanks for the kind words. I learned regex mostly by googling the problem I wanted to be solved and reading the explanations from users at stackoverflow. Thus, I am nowhere near a black belt. Maybe brown belt ;)

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 19, 2017, 06:31:20 PM

Quote from: theta_wave on February 19, 2017, 05:43:52 PM

Since all of my classical music follows the "Work: Movement" format, I simply use regex in a virtualtag for the job: $rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","$1")

I am guessing you mean $3 at the end here too?

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 19, 2017, 07:32:24 PM

Quote from: hiccup on February 19, 2017, 06:31:20 PM

Quote from: theta_wave on February 19, 2017, 05:43:52 PM
Since all of my classical music follows the "Work: Movement" format, I simply use regex in a virtualtag for the job: $rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","$1")

I am guessing you mean $3 at the end here too?

No, that expression is used to extract "Work" from "Work: Movement". The above regex is broken down to "(Work)(: )(Movement)". Naturally, I use $1 to grab "Work" because the 1st capture group contains the information I want.

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 20, 2017, 01:28:05 AM

Quote from: theta_wave on February 19, 2017, 05:43:52 PM

Hmm, I don't know why you need to include <work> in your regex when you can still easily parse <Title> without including that tag field. I don't know the significance of "⋮" in your setup, but I'll include it anyways. You seem to know your way around regex, so you can remove it if you don't need it.

What I would use for <Title->:
Code
$rxreplace(<Title>,"(^.+?)(\:\s)(.*$)","⋮$3")

Thank you for your reply.
The ⋮ is just an indicator that the title has been abbreviated.
As for your solution: that would not work as well for me. Consider the following cases.

The Hollywood Songbook no. 19: Panzerschlacht
The Hollywood Songbook no. 16: Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: sonate op. 5 no.5): I. Adagio
Star Wars Episode V: The Empire Strikes Back: The Imperial March

In these cases, the expected results would be:

The Hollywood Songbook ⋮no. 19: Panzerschlacht
The Hollywood Songbook ⋮no. 16: Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: sonate op. 5 no.5): ⋮I. Adagio
Star Wars Episode V: The Empire Strikes Back: ⋮The Imperial March

But with that expression, it will be.

The Hollywood Songbook no. 19: ⋮Panzerschlacht
The Hollywood Songbook no. 16: ⋮Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: ⋮sonate op. 5 no.5): I. Adagio
Star Wars Episode V: ⋮The Empire Strikes Back: The Imperial March

I do have a very similar expression in mp3tag to auto-populate the <Work> tag, along with one that copies everything before the last colon, and another that copies everything before "n(o|r|º)\.\s?\d" for these cases, but I have to select which is the right one per case.

The expression you suggested is, of course, the most common, but not the only one I’ll need.

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 20, 2017, 09:02:40 AM

Quote from: marlonob on February 20, 2017, 01:28:05 AM

Thank you for your reply.
The ⋮ is just an indicator that the title has been abbreviated.
As for your solution: that would not work as well for me. Consider the following cases.
The Hollywood Songbook no. 19: Panzerschlacht
The Hollywood Songbook no. 16: Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: sonate op. 5 no.5): I. Adagio
Star Wars Episode V: The Empire Strikes Back: The Imperial March
In these cases, the expected results would be:
The Hollywood Songbook ⋮no. 19: Panzerschlacht
The Hollywood Songbook ⋮no. 16: Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: sonate op. 5 no.5): ⋮I. Adagio
Star Wars Episode V: The Empire Strikes Back: ⋮The Imperial March
But with that expression, it will be.
The Hollywood Songbook no. 19: ⋮Panzerschlacht
The Hollywood Songbook no. 16: ⋮Die letzte Elegie
Concerto grosso no. 5 in G‐major (arr. of Corelli: ⋮sonate op. 5 no.5): I. Adagio
Star Wars Episode V: ⋮The Empire Strikes Back: The Imperial March
I do have a very similar expression in mp3tag to auto-populate the <Work> tag, along with one that copies everything before the last colon, and another that copies everything before "n(o|r|º)\.\s?\d" for these cases, but I have to select which is the right one per case.

The expression you suggested is, of course, the most common, but not the only one I’ll need.

From the examples you gave me, if you want the search to go select everything prior to the last colon as "Work", then remove the "?" from my expression to make it greedy. I'm just going off of the examples you provided.
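
A rough Python sketch of what dropping the "?" changes, using titles from the post above. `work_of` is a made-up helper name and `re.sub` stands in for $rxreplace, which is an assumption.

```python
import re

LAZY = r"(^.+?)(\:\s)(.*$)"    # work ends at the FIRST ": "
GREEDY = r"(^.+)(\:\s)(.*$)"   # work ends at the LAST ": "

def work_of(title, pattern):
    # Keep only the first capture group, i.e. the "Work" part.
    return re.sub(pattern, r"\1", title)

star_wars = "Star Wars Episode V: The Empire Strikes Back: The Imperial March"
songbook = "The Hollywood Songbook no. 19: Panzerschlacht"

print(work_of(star_wars, LAZY))
print(work_of(star_wars, GREEDY))
# With a single colon both variants give the same answer.
print(work_of(songbook, LAZY) == work_of(songbook, GREEDY))
```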

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 20, 2017, 03:36:12 PM

That still wouldn’t work for my case in the first two examples:

The Hollywood Songbook no. 19: Panzerschlacht
The Hollywood Songbook no. 16: Die letzte Elegie

since the main work title ends before any colon. Nor would it work for titles such as

Saeviat tellus inter rigores, HWV 240: Recitativo: Carmelitarum ut confirmet ordinem
Mass no. 17 for Soloists, Chorus & Orchestra in C minor, K. 417a/427 (fragment) “Great”: IIa. Gloria: “Gloria in excelsis”

where the work title ends just before the first colon.

As I mentioned before, I have three substitutions in mp3tag to help me automate the tagging of <Work>:

^(.+):\s.+ →  $1
:\s.+ →
\sn(o|r|º)\.\s?\d.* →

but which one will be needed each time has to be determined manually.
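
For reference, the three substitutions can be sketched in Python as follows. The function names are made up, and Python's `re` is only assumed to match mp3tag's behaviour for these patterns. Each rule suits a different kind of title, which is why the choice has to be made per case.

```python
import re

def before_last_colon(title):
    # ^(.+):\s.+  ->  $1
    return re.sub(r"^(.+):\s.+", r"\1", title)

def before_first_colon(title):
    # :\s.+  ->  (nothing)
    return re.sub(r":\s.+", "", title)

def before_number(title):
    # \sn(o|r|º)\.\s?\d.*  ->  (nothing)
    return re.sub(r"\sn(o|r|º)\.\s?\d.*", "", title)

print(before_last_colon(
    "Star Wars Episode V: The Empire Strikes Back: The Imperial March"))
print(before_first_colon(
    "Saeviat tellus inter rigores, HWV 240: Recitativo: "
    "Carmelitarum ut confirmet ordinem"))
print(before_number("The Hollywood Songbook no. 19: Panzerschlacht"))
```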

So, the <Work> tag contains the result of this task, and it would be useful to have the possibility to use it for other purposes as well.

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 20, 2017, 08:05:44 PM

Then create an if-else function where you $ismatch titles with two or more colons:

Code

(.*?:){2,}

Do one $rxreplace for that and another for titles containing only one colon.
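
A minimal Python sketch of that test. It assumes $ismatch reports whether the pattern matches anywhere in the title, which is what `re.search` does; the helper name is made up.

```python
import re

def has_two_or_more_colons(title):
    # (.*?:){2,} needs the "anything up to a colon" group at least twice.
    return re.search(r"(.*?:){2,}", title) is not None

print(has_two_or_more_colons(
    "Star Wars Episode V: The Empire Strikes Back: The Imperial March"))
print(has_two_or_more_colons("Misa criolla: Credo"))
```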

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 21, 2017, 04:05:10 AM

Quote from: theta_wave on February 20, 2017, 08:05:44 PM

Then create an if-else function where you $ismatch titles with two or more colons:

Code
(.*?:){2,}
Do one $rxreplace for that and another for titles containing only one colon.

That also wouldn’t work, since there are cases where the relevant part is after the first colon (Saeviat tellus inter rigores, HWV 240: Recitativo: Carmelitarum ut confirmet ordinem) and others where it is after the second (Star Wars Episode V: The Empire Strikes Back: The Imperial March)

Your post, however, gave me an idea, and I think I got a solution. For anyone interested, it’s:

Code

$If($IsMatch($Replace(<Title>,<Work>,⋮),"(.*⋮){2}"),
$RxReplace(<Work>:::$Replace(<Title>,<Work>,⋮),"^(.+):::⋮[\s:,]*(.+)⋮(.*)$","⋮$2$1$3"),
$RxReplace($Replace(<Title>,<Work>,⋮),"⋮[\s:,]*","⋮"))

This first checks whether there are two ⋮ (meaning there have been two substitutions). If so, it puts the content of <Work>, followed by a unique separator (“:::”), in front of the $Replace result, so that the regex can capture it and put it back in place of the second ⋮.
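
For anyone who wants to trace the steps, here is the same logic as a Python sketch. It is a stand-in, not MusicBee code: `abbreviated_title` is a made-up name, `re.search`/`re.sub` are assumed to behave like $IsMatch/$RxReplace, and the guard for an empty work is an addition that the formula above does not have.

```python
import re

def abbreviated_title(title, work):
    if not work:
        return title  # assumption: leave tracks without a <Work> untouched

    # $Replace(<Title>,<Work>,⋮)
    replaced = title.replace(work, "⋮")

    # $IsMatch(...,"(.*⋮){2}"): were there two substitutions?
    if re.search(r"(.*⋮){2}", replaced):
        # Prefix "<Work>:::" so the regex can capture the work (group 1)
        # and put it back where the second ⋮ is.
        return re.sub(r"^(.+):::⋮[\s:,]*(.+)⋮(.*)$", r"⋮\2\1\3",
                      work + ":::" + replaced)

    # Only one substitution: just tidy up the separator after ⋮.
    return re.sub(r"⋮[\s:,]*", "⋮", replaced)

print(abbreviated_title(
    "La noche de los mayas: I. La noche de los mayas",
    "La noche de los mayas"))
print(abbreviated_title(
    "Symphony no. 4 in D minor, op. 120: I. Ziemlich langsam - Lebhaft",
    "Symphony no. 4 in D minor, op. 120"))
```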

Thank you very much for your help, @theta_wave, and for your commitment to the community.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 21, 2017, 12:14:40 PM

I have some difficulty in understanding the workings and/or syntax of RxSplit.

Suppose a title being:
Misa criolla: Credo

If I use:
$RxSplit(<Title>,"(: )",1)
To get the complete contents before the colon that works.
(displaying: Misa criolla)

But if I try the same to get the contents after the colon this won't work:
$RxSplit(<Title>,"(: )",2)
(it will only display ":")

Only if I add some arguments it will work:
$RxSplit(<Title>,"(: .*)",2)

Why doesn't the first example need more arguments, and why does the second?

Edit:
A second question came up:

Using the same track title as above (Misa criolla: Credo), and trying to isolate the 'work' (before the colon) using this:
$RxReplace(<Title>,"(^.+?)\:\s","$1")
I would expect it to only show the work, but it will display:
Misa criollaCredo
So it doesn't stop at running into \:\s
Shouldn't it?

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 22, 2017, 03:12:29 AM

Quote from: marlonob on February 21, 2017, 04:05:10 AM

Thank you very much for your help, @theta_wave, and for your commitment to the community.

No problem, I'm no expert at this kind of thing, but I try to do what I can to help. I'm here to learn as well. For example, I'm wondering about your use of [\s:,]*. To me, read left to right, it looks inconsequential because of the use of the "*" rather than "+" since "*" would still select a character even in the absence of the whitespace, colon or comma at that location.

Quote from: hiccup on February 21, 2017, 12:14:40 PM

I have some difficulty in understanding the workings and/or syntax of RxSplit.

Suppose a title being:
Misa criolla: Credo

If I use:
$RxSplit(<Title>,"(: )",1)
To get the complete contents before the colon that works.
(displaying: Misa criolla)

But if I try the same to get the contents after the colon this won't work:
$RxSplit(<Title>,"(: )",2)
(it will only display ":")

Only if I add some arguments it will work:
$RxSplit(<Title>,"(: .*)",2)

Why doesn't the first example need more arguments, and why does the second?

Edit:
A second question came up:

Using the same track title as above (Misa criolla: Credo), and trying to isolate the 'work' (before the colon) using this:
$RxReplace(<Title>,"(^.+?)\:\s","$1")
I would expect it to only show the work, but it will display:
Misa criollaCredo
So it doesn't stop at running into \:\s
Shouldn't it?

1) For the first question, I'm not familiar with $rxsplit, as I have yet to use it since Steven included it in MusicBee. Your guess is as good as mine. For what you are trying to achieve here, couldn't $Split do the job?

2) The reason why is that your search is running twice in that line, try it in notepad++. (^.+?)\:\s starts at the beginning and ends at ": ", but its position in the line is not at the end yet. So, the expression repeats itself and starts again. "(^.+?)\:\s" matches "(Misa criolla): " and when it restarts after ": ", "(^.+?)\:\s" matches "(Credo)" because there's no ": " stop sign, so "(^.+?)\:\s" continues all the way to the end of the line. Now, you have two groups with "$1" (see the groups in the previous sentence enclosed in parentheses).

In this case, you need to have an expression that's good for the whole line. From what you are trying to do, "(^.+?)\:\s.*$" should be good enough. Still, I have a habit of enclosing groups even if I'm not going to use them for a particular virtualtag, because I can simply copy-paste the regex into another virtualtag unchanged and swap $2 for the $1. So, in your case, my ideal regex would be "(^.+?)\:\s(.*$)"

I hope this helps.
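
A quick way to check the whole-line version outside MusicBee is a Python sketch like this one. `re.sub` stands in for $RxReplace, which is an assumption, and \1 and \2 play the role of $1 and $2.

```python
import re

title = "Misa criolla: Credo"

# The pattern covers the whole line, so nothing is left over after the
# replacement: group 1 is the work, group 2 is the movement.
pattern = r"(^.+?)\:\s(.*$)"

work = re.sub(pattern, r"\1", title)
movement = re.sub(pattern, r"\2", title)

print(work)
print(movement)
```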

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 22, 2017, 07:12:26 AM

Quote from: theta_wave on February 22, 2017, 03:12:29 AM

Quote from: hiccup on February 21, 2017, 12:14:40 PM
I have some difficulty in understanding the workings and/or syntax of RxSplit.

1) For the first question, I'm not familiar with $rxsplit, as I have yet to use it since Steven included it in MusicBee. Your guess is as good as mine. For what you are trying to achieve here, couldn't $Split do the job?

The only purpose of this simple example was to achieve understanding of how $RxSplit (should) work.
Not to solve this particular example in other ways.

Hopefully Steven can chip in and give some explanation on the workings (and advantages) of $RxSplit in MusicBee.

Quote from: theta_wave

2) The reason why is that your search is running twice in that line, try it in notepad++. (^.+?)\:\s starts at the beginning and ends at ": ", but its position in the line is not at the end yet.

Thanks, that's clear.
I did run the expression through an online regex tester I found and like:
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

With the same formula and target string it ran fine there. So it's probably behaving slightly differently from MB's regex engine.
B.t.w. what's also nice about this tester, it clearly shows content groups.

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 22, 2017, 07:52:19 AM

Quote from: hiccup on February 22, 2017, 07:12:26 AM

The only purpose of this simple example was to achieve understanding of how $RxSplit (should) work.
Not to solve this particular example in other ways.

Hopefully Steven can chip in and give some explanation on the workings (and advantages) of $RxSplit in MusicBee.

Hah, yeah that too. I haven't come across a situation where I would think I'd use it. However, I'm all ears on some use cases.

Quote from: hiccup on February 22, 2017, 07:12:26 AM

I did run the expression through an online regex tester I found and like:
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

With the same formula and target string it ran fine there. So it's probably behaving slightly differently from MB's regex engine.
B.t.w. what's also nice about this tester, it clearly shows content groups.

Good catch. I'll check it out too. I'm just assuming here, but maybe notepad++ and MB have regex expressions behave as if they were bordered like this /regex/g. This simply calls for the regex to run repeatedly, not just once. Again, this is just a guess and I'm used to this repeating behavior by default in notepad++ w/o the "/" and "/g". The regex above is the type of syntax that is used in programming or sed.

Title: Re: regex (regular expressions) - open discussion topic
Post by: marlonob on February 23, 2017, 02:31:08 PM

Quote from: theta_wave on February 22, 2017, 03:12:29 AM

[…]I'm wondering about your use of [\s:,]*. To me, read left to right, it looks inconsequential because of the use of the "*" rather than "+" since "*" would still select a character even in the absence of the whitespace, colon or comma at that location.

According to my experience, * will select zero or more [\s:,], but I'm having trouble imagining a case with zero occurrences of [\s:,], so + may be more suitable.

Quote from: hiccup on February 21, 2017, 12:14:40 PM

I have some difficulty in understanding the workings and/or syntax of RxSplit.

[…] if I try the same to get the contents after the colon this won't work:
$RxSplit(<Title>,"(: )",2)
(it will only display ":")

$RxSplit(<Title>,"(: )",3) will give you what you need. It seems that, for whatever reason, MB counts the split pattern as part of the split series. This may be a bug, though.
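
This may not be a bug after all. .NET's Regex.Split is documented to include any text captured by parentheses in the resulting array, and Python's `re.split` does the same, so a Python sketch can show the effect. That $RxSplit hands the pattern to Regex.Split and counts parts from 1 is an assumption here, and `part` is a made-up helper.

```python
import re

title = "Misa criolla: Credo"

# With capturing parentheses, the separator itself becomes a list item.
with_group = re.split(r"(: )", title)

# Without them, only the pieces around the separator are returned.
without_group = re.split(r": ", title)

def part(parts, index):
    # 1-based lookup, mirroring how the index argument appears to be counted.
    return parts[index - 1]

print(with_group)
print(without_group)
print(part(with_group, 3))
```

If that reading is right, leaving out the parentheses in the split pattern should make index 2 return the text after the colon.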

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 08:41:03 AM

Quote from: theta_wave on February 22, 2017, 07:52:19 AM

Quote from: hiccup on February 22, 2017, 07:12:26 AM
I did run the expression through an online regex tester I found and like:
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
With the same formula and target string it ran fine there. So it's probably behaving slightly differently from MB's regex engine.
Good catch. I'll check it out too. I'm just assuming here, but maybe notepad++ and MB have regex expressions behave as if they were bordered like this /regex/g. This simply calls for the regex to run repeatedly, not just once. Again, this is just a guess and I'm used to this repeating behavior by default in notepad++ w/o the "/" and "/g". The regex above is the type of syntax that is used in programming or sed.

Something is 'off' with how MusicBee's regex engine handles this.
When I test:

Code

(^.+?)\:\s

with several regex testers on the string: Misa criolla: Credo
All of them return "Misa criolla: "
Only MusicBee returns: "Misa criollaCredo"

And it's probably not the /g switch responsible for this, since many regex testers (such as http://regexr.com/) have that one active by default also.

I am not saying something is wrong (I lack the insight and understanding of regex to state such), but I think it would be good if MusicBee's regex engine would behave more like the ones from such on- and offline regex testers.

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 26, 2017, 08:54:22 AM

keep in mind MB is using regex from .NET
i have read somewhere before that it's not quite standard and you should read the documentation on the microsoft website to determine the expected behavior

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 09:04:52 AM

Yes, I am aware of differences between java, perl, C++ etc. and also tried out some engines that specifically (claim to) use .net

Like http://regexstorm.net/tester
But that will give the same result.

I also tried the offline tester 'Expresso' and set it to use Visual Basic.
Same result.

Do you perhaps have a suggestion for a regex tester that behaves as MusicBee's engine currently does?

Title: Re: regex (regular expressions) - open discussion topic
Post by: Bee-liever on February 26, 2017, 09:06:19 AM

Quote from: hiccup on February 26, 2017, 08:41:03 AM

Only MusicBee returns: "Misa criollaCredo"

As Steven said, it's to do with the .NET version of regex and something called implicit and explicit capture groups (whatever they are??).
Try adding (?n) to the start of your regex.

Title: Re: regex (regular expressions) - open discussion topic
Post by: Bee-liever on February 26, 2017, 09:10:57 AM

Quote from: hiccup on February 26, 2017, 09:04:52 AM

I also tried the offline tester 'Expresso' and set it to use Visual Basic.
Same result.

If you use Expresso in C# you get the Misa criollaCredo result.
Turning on 'Explicit Capture' gives the "Misa criolla: " result.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 09:12:29 AM

Quote from: Bee-liever on February 26, 2017, 09:06:19 AM

Quote from: hiccup on February 26, 2017, 08:41:03 AM
Only MusicBee returns: "Misa criollaCredo"
As Steven said, it's to do with the .NET version of regex and something called implicit and explicit capture groups.
Try adding (?n) to the start of your regex.

Not saying you are wrong, and I will try it, but I believe that should be related to (not) capturing 'named groups'. And here I am not naming groups anyway.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 09:22:32 AM

Quote from: Bee-liever on February 26, 2017, 09:10:57 AM

Quote from: hiccup on February 26, 2017, 09:04:52 AM
I also tried the offline tester 'Expresso' and set it to use Visual Basic.
Same result.
If you use Expresso in C# you get the Misa criollaCredo result.
Turning on 'Explicit Capture' gives the "Misa criolla: " result.

That's weird.
Here I get the expected result both using C# and VB.

(http://i.imgur.com/MTv50Bpl.jpg) (http://i.imgur.com/MTv50Bp.png)

Are you referring to the 'explicit capture' button in design mode?

Title: Re: regex (regular expressions) - open discussion topic
Post by: Bee-liever on February 26, 2017, 09:27:01 AM

Quote from: hiccup on February 26, 2017, 09:22:32 AM

Are you referring to the 'explicit capture' button in design mode?

yes

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 26, 2017, 09:38:28 AM

there are a number of options with RegEx. The only one MB sets is IgnoreCase

there are some others that look like they might be relevant:
ECMAScript
Enables ECMAScript-compliant behavior for the expression. This value can be used only in conjunction with the IgnoreCase, Multiline, and Compiled values. The use of this value with any other values results in an exception. For more information on the RegexOptions.ECMAScript option, see the "ECMAScript Matching Behavior" section in the Regular Expression Options topic.

ExplicitCapture
Specifies that the only valid captures are explicitly named or numbered groups of the form (?<name>…). This allows unnamed parentheses to act as noncapturing groups without the syntactic clumsiness of the expression (?:…). For more information, see the "Explicit Captures Only" section in the Regular Expression Options topic.

you can force these options in your expressions:
https://msdn.microsoft.com/EN-US/library/yd1hzczs(v=VS.110,d=hv.2).aspx
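
A rough way to see what the quoted ExplicitCapture description means, sketched in Python rather than .NET. Python's re module has no ExplicitCapture option, so this is only an analogy: the non-capturing form (?:...) stands in for what the option is described as doing to unnamed parentheses, and the group name "work" is made up for illustration.

```python
import re

# Unnamed parentheses capture by default.
unnamed = re.compile(r"(^.+?):\s")

# (?:...) never captures; per the description quoted above, ExplicitCapture
# makes unnamed parentheses behave like this without the extra syntax.
non_capturing = re.compile(r"(?:^.+?):\s")

# A named group still captures. Python spells it (?P<name>...),
# where .NET uses (?<name>...). The name "work" is hypothetical.
named = re.compile(r"(?P<work>^.+?):\s")

print(unnamed.groups)            # 1
print(non_capturing.groups)      # 0
print(dict(named.groupindex))    # {'work': 1}
```

If the only group in a pattern stops capturing, a replacement that refers to group 1 has nothing left to refer to, which may be why forcing this option does not help here.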

Title: Re: regex (regular expressions) - open discussion topic
Post by: Bee-liever on February 26, 2017, 09:56:06 AM

just tried the (?n) at the beginning of the expression - doesn't work :(
it gets parsed as another capture group

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 26, 2017, 10:46:06 AM

i tried running this directly
Misa criolla: Credo
$RxReplace(<Title>,"(^.+?)\:\s","$1")

and always get:
Misa criollaCredo

i tried different initialisation options on regex and none helped

did you want to put a space after $1?
$RxReplace(<Title>,"(^.+?)\:\s","$1 ")
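
For comparison, here is a minimal sketch of the same replacement in Python's re module. This is not MusicBee's engine; it only assumes that a regex replace swaps the matched text and leaves the rest of the string alone. Python writes the group reference as \1 where the formula above uses $1.

```python
import re

title = "Misa criolla: Credo"
pattern = r"(^.+?):\s"  # same idea as the pattern above; ":" needs no escaping

# Only the matched text ("Misa criolla: ") is replaced by group 1.
# "Credo" was never part of the match, so it stays where it is.
without_space = re.sub(pattern, r"\1", title)

# Adding a space after the group reference puts the separator back.
with_space = re.sub(pattern, r"\1 ", title)

print(without_space)  # Misa criollaCredo
print(with_space)     # Misa criolla Credo
```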

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 10:49:00 AM

I tried this .net tester, using both ECMAScript settings you referred to as a possible culprit, and the result is the same:

(http://i.imgur.com/sbEGb3il.jpg) (http://i.imgur.com/sbEGb3i.png)

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 26, 2017, 11:08:48 AM

Quote from: Steven on February 26, 2017, 10:46:06 AM

i tried running this directly
Misa criolla: Credo
$RxReplace(<Title>,"(^.+?)\:\s","$1")

and always get:
Misa criollaCredo

i tried different initialisation options on regex and none helped

did you want to put a space after $1?
$RxReplace(<Title>,"(^.+?)\:\s","$1 ")

further to this, directly running the .net regex split function
Split("Misa criolla: Credo")

returns
{Length=3}
(0): ""
(1): "Misa criolla"
(2): "Credo"

so i don't think it's a case of MB parsing the text incorrectly - this is simply how the .net 4.0 regex is processing the example
I have no idea why the first split item is blank
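
A possible explanation for the blank first item, sketched with Python's re.split, which returns the same three items for this input. The assumption is that .NET follows the same two rules: captured groups are included in the result, and a match at the very start of the input leaves an empty string in front of it.

```python
import re

parts = re.split(r"(^.+?):\s", "Misa criolla: Credo")

# The match "Misa criolla: " starts at position 0, so:
#   parts[0] is the (empty) text before the match,
#   parts[1] is the text captured by group 1,
#   parts[2] is whatever follows the match.
print(parts)  # ['', 'Misa criolla', 'Credo']
```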

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 11:23:46 AM

Just to make sure why I brought this up, so not to possibly waste anybody's time:
It is not that I want to get these specific examples giving these specific results.
So it's not like "please help me getting this specific regex formula to work".

It's about trying to improve my regex skills, reading a lot, trying out a lot, and hoping to make use of some regex testers so I can experiment, validate, and fine-tune formulas in those, before copy/pasting them into MusicBee.

Running into this simple regex giving a different outcome in MB than it does in every tester I tried makes me worry about what will happen when using more complicated regexes.
I thought I had either run into some bug, or I was overlooking something very obvious. That's why I brought it up.

Title: Re: regex (regular expressions) - open discussion topic
Post by: Bee-liever on February 26, 2017, 01:06:14 PM

@ hiccup
I noticed in your screenshot from RegexPlanet that the field "as a .Net string" has extra backslashes in the regex.
Maybe they need to be entered in the regex-pattern for MB?

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 01:44:40 PM

Quote from: Bee-liever on February 26, 2017, 01:06:14 PM

I noticed in your screenshot from RegexPlanet that the field "as a .Net string" has extra backslashes in the regex.
Maybe they need to be entered in the regex-pattern for MB?

That's well spotted. It does make a difference indeed, but only in the sense that MB then returns Misa criolla: Credo instead of Misa criollaCredo
But still not what the regex testers show: Misa criolla:

B.t.w. I don't understand why you should escape a backslash with a backslash there, but that's surely me.
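
On the doubled backslashes: in program source code, a normal string literal needs two backslashes to produce one, which is presumably what the "as a .Net string" field shows. The Python sketch below illustrates that, and what could happen when the doubled form reaches the regex engine unchanged. That MusicBee passes the typed text through as-is is an assumption here.

```python
import re

title = "Misa criolla: Credo"

# In source code, "\\s" and the raw string r"\s" are the same two characters.
print("\\s" == r"\s")  # True

# What the engine sees when the pattern is typed with single backslashes.
single = r"(^.+?)\:\s"
# What it sees if doubled backslashes are typed in literally:
# "\\" now means "a real backslash character", followed by a plain ":" or "s".
doubled = r"(^.+?)\\:\\s"

result_single = re.sub(single, r"\1", title)
result_doubled = re.sub(doubled, r"\1", title)

print(result_single)   # Misa criollaCredo
print(result_doubled)  # Misa criolla: Credo  (no backslash in the title, so no match)
```

With the doubled form nothing matches, so the title comes back unchanged, which is consistent with the result reported above.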

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 26, 2017, 02:58:13 PM

Guys, I can replicate hiccup's issue in notepad++'s regex parser. Maybe notepad++ has a .Net dependency. Anyways, it appears that notepad++ and Musicbee wrap regexes in /regex/g by default, repeating the regex until the end of the line (I noted this a few posts above).

Any regex expression I tested in notepad++ appears to work in Musicbee so far (except for escaping "\" which presents a whole host of issues to MB). Anyways, I never use those online regex testers since they use Javascript to process the regexes and Javascript does not support all of the lookahead or lookbehind features of regex.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 26, 2017, 03:10:44 PM

Quote from: theta_wave on February 26, 2017, 02:58:13 PM

Any regex expression I tested in notepad++ appears to work in Musicbee so far (except for escaping "\" which presents a whole host of issues to MB). Anyways, I never use those online regex testers since they use Javascript to process the regexes and Javascript does not support all of the lookahead or lookbehind features of regex.

I used a regex helper plugin for notepad++, and it behaves the same as the online testers:
Do you use something else for np++?

(http://i.imgur.com/IWxbfHC.png)

About online testers, I am sure most use JavaScript indeed, but the ones I tried at regexplanet and derekslager specifically state they are for .net?

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on February 27, 2017, 03:02:31 PM

Quote from: hiccup on February 26, 2017, 03:10:44 PM

I used a regex helper plugin for notepad++, and it behaves the same as the online testers:
Do you use something else for np++?

About online testers, I am sure most use JavaScript indeed, but the ones I tried at regexplanet and derekslager specifically state they are for .net?

Don't be wedded too much to online checkers. Although they are a helpful tool in some respects in explaining what you're doing and learning the syntax, they may not produce the desired results when actually using the regex. For example:

$ echo "Misa criolla: Credo" | perl -pe 's|(^.+?)\:\s|\1|'
Misa criollaCredo

$ echo "Misa criolla: Credo" | perl -pe 's|(^.+?)\:\s.*$|\1|'
Misa criolla

And yes, I have regex helper for notepad++, but I stopped using it long ago due to some of the reasons you inadvertently showed. I like to think that regexes with capturing groups should cover the line from beginning to end.
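
The two perl one-liners above can be mirrored in Python to separate "what the pattern matched" from "what the replace returns". That the testers display the matched text rather than the replace result is only a guess.

```python
import re

title = "Misa criolla: Credo"

# What the short pattern matches, and what its group captures.
m = re.search(r"(^.+?):\s", title)
matched_text = m.group(0)
group_one = m.group(1)
print(repr(matched_text))  # 'Misa criolla: '
print(repr(group_one))     # 'Misa criolla'

# Replace with the short pattern: the unmatched tail "Credo" survives.
replaced_partial = re.sub(r"(^.+?):\s", r"\1", title)
print(replaced_partial)    # Misa criollaCredo

# Replace with a pattern that consumes the rest of the line:
# the whole string is matched, so only group 1 remains.
replaced_whole = re.sub(r"(^.+?):\s.*$", r"\1", title)
print(replaced_whole)      # Misa criolla
```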

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 27, 2017, 05:22:18 PM

O.k, I'll let go of the idea that (even .net) regex testers will (should) give the exact same result as MusicBee's regex engine.

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 27, 2017, 06:32:11 PM

i will create a very simple application to enter text and see the result

Title: Re: regex (regular expressions) - open discussion topic
Post by: Steven on February 27, 2017, 09:07:23 PM

http://www.mediafire.com/file/4scm9s52a2z3gub/RegEx.zip

run regex.exe

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on February 28, 2017, 08:44:14 AM

Quote from: Steven on February 27, 2017, 09:07:23 PM

http://www.mediafire.com/file/4scm9s52a2z3gub/RegEx.zip

run regex.exe

Great, that's very nice and useful.
This makes it a lot easier to try out and test regexes.

edit,
For the purpose of making this easier to find for other members too, I copied this link in the start post of this topic.

Title: Re: regex (regular expressions) - open discussion topic
Post by: alec.tron on May 24, 2017, 05:47:53 AM

Heya,
maybe someone who's fluent in regex (and MusicBee's S&R or boroda74's Advanced S&R functionalities) doesn't mind lending a helping hand...?

Based on this suggestion:
https://getmusicbee.com/forum/index.php?topic=20659.msg123990#msg123990
When I tested it, it wasn't quite functional as described for me. I got replacing multi-value strings somewhat working, at least in the preview, but not in the actual Search & Replace operation: the replace seems to re-inject the multi-value separator ";" as a literal string value into the same field rather than as a separator. Here is a practical example of the issue I am looking to solve:

On a file with a multi value genre values of:
Downtempo; Breaks

I want to inject 'Beats' as a second genre value, so, via MB standard Search & Replace tool:
S&R gui input:
"search for:"

Code

(^.*)(Downtempo)(.*$)

"in field": Genre
"replace with":

Code

$1$2; Beats$3

Results in the seemingly correct preview of:
Downtempo; Beats; Breaks

But, IF I apply the change, I then end up with 2 genre values (and one is a literal string injected after the regex search value...), where there should be 3... so the resulting genre values are:
"Downtempo; Beats"
"Breaks"

So I'm looking to understand how one would need to re-inject the semicolon-separated values correctly via regex/S&R, so that each one ends up in its own genre field...?
So I end up with
"Downtempo"
"Beats"
"Breaks"
genre values on all files ?

Cheers.
c.
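
To make the intended outcome concrete, here is a small Python sketch of the two steps involved: the regex replace that produces the preview string, and a split on ";" that turns it into separate values. It uses "Downtempo" as the search term so that it matches the example genre value, and Python's \1 notation where the S&R dialog uses $1. How MusicBee itself stores multiple values is not shown; the split is only an illustration.

```python
import re

genre = "Downtempo; Breaks"

# Step 1: the replace yields a single string, as seen in the preview.
replaced = re.sub(r"(^.*)(Downtempo)(.*$)", r"\1\2; Beats\3", genre)
print(replaced)  # Downtempo; Beats; Breaks

# Step 2: the string only becomes three genre values once something
# splits it on the separator (done by hand here).
values = [v.strip() for v in replaced.split(";")]
print(values)    # ['Downtempo', 'Beats', 'Breaks']
```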

Title: Re: regex (regular expressions) - open discussion topic
Post by: boroda on May 24, 2017, 10:18:12 AM

alec.tron, have you tried the same regexes with ASR? my understanding it should work.

Title: Re: regex (regular expressions) - open discussion topic
Post by: boroda on May 24, 2017, 10:34:24 AM

@Steven, is the 'genres' (not 'genre') tag read-only in the api?

Title: Re: regex (regular expressions) - open discussion topic
Post by: alec.tron on May 24, 2017, 11:07:59 AM

hah!
re:

Quote from: boroda74 on May 24, 2017, 10:18:12 AM

alec.tron, have you tried the same regexes with ASR? my understanding it should work.

I did not, as I tried it earlier today on a laptop without your ASR installed.

I now just did at home where I have your tagging tools installed, and the same syntax does correctly inject the field delimiter in your Advanced Search & Tag Tools for mp3s as well as flac files!
Awesome!
Guess I'll start creating a fair few presets for ASR ;)
Thanks!
c.

Title: Re: regex (regular expressions) - open discussion topic
Post by: boroda on May 24, 2017, 01:29:04 PM

keep in mind that you can easily modify your own asr presets. the only presets you can't modify are the preinstalled ones.

Title: Re: regex (regular expressions) - open discussion topic
Post by: minor_glitch on November 12, 2017, 07:56:15 PM

Hey everyone, sorry if this is a stupid question or I'm posting in the wrong place.
I'm trying to create a filter for Artist does not match Album Artist.
I was hoping it would be as simple as creating a filter and choosing 'Artist', 'is not', and typing in <Album Artist> but obviously no such luck. I assume this is something I'd have to do through 'match RegEx'.
Can anyone suggest something that might work?

Thanks!

Title: Re: regex (regular expressions) - open discussion topic
Post by: theta_wave on November 15, 2017, 05:45:53 AM

Quote from: minor_glitch on November 12, 2017, 07:56:15 PM

Hey everyone, sorry if this is a stupid question or I'm posting in the wrong place.
I'm trying to create a filter for Artist does not match Album Artist.
I was hoping it would be as simple as creating a filter and choosing 'Artist', 'is not', and typing in <Album Artist> but obviously no such luck. I assume this is something I'd have to do through 'match RegEx'.
Can anyone suggest something that might work?

Thanks!

I don't think the solution requires regex. A virtualtag (http://musicbee.wikia.com/wiki/Define_New_Tags#Custom_Virtual_Tags) should work though.

The virtualtag (say "Album artist tag check") formula should look something like this: $If("<Album Artist>"="<Artist>",,blahblah)

Basically, if <Album Artist> equals <Artist>, then the virtualtag will have no value. Then create a filter or autoplaylist. For the latter, select the virtualtag you just made and select "has a value".

Title: Re: regex (regular expressions) - open discussion topic
Post by: minor_glitch on November 18, 2017, 10:05:27 PM

Quote from: theta_wave on November 15, 2017, 05:45:53 AM

I don't think the solution requires regex. A virtualtag (http://musicbee.wikia.com/wiki/Define_New_Tags#Custom_Virtual_Tags) should work though.

The virtualtag (say "Album artist tag check") formula should look something like this: $If("<Album Artist>"="<Artist>",,blahblah)

Basically, if <Album Artist> equals <Artist>, then the virtualtag will have no value. Then create a filter or autoplaylist. For the latter, select the virtualtag you just made and select "has a value".

Thanks! $If(<Album Artist>=<Artist>,Y,N) seems to have worked!
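
The logic of the virtual tag approach, sketched in Python for anyone who wants to see it spelled out. The track data and field names below are made up, and this is not MusicBee code: it only mirrors the idea of a computed value that is empty when the two fields agree, followed by a "has a value" filter.

```python
# Hypothetical sample data for illustration only.
tracks = [
    {"title": "A", "artist": "X", "album_artist": "X"},
    {"title": "B", "artist": "Y feat. Z", "album_artist": "Y"},
]


def mismatch_tag(track):
    """Return '' when Artist equals Album Artist, otherwise a marker value."""
    return "" if track["album_artist"] == track["artist"] else "mismatch"


# Equivalent of filtering on "virtual tag has a value".
flagged = [t["title"] for t in tracks if mismatch_tag(t)]
print(flagged)  # ['B']
```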

Title: Re: regex (regular expressions) - open discussion topic
Post by: Mayibongwe on July 08, 2023, 10:18:56 AM

I might need this for future reference:
https://getmusicbee.com/forum/index.php?topic=39538.0

-------------

@hiccup, would you consider linking the topic below somewhere in your start-post?
https://getmusicbee.com/forum/index.php?topic=38817.0

I've always found this thread of yours to be easily locatable compared to other regex topics.
So if there's a chance that most people are landing here rather than elsewhere, I'm sure they'd appreciate knowing about the existence of karbock's thread above.

Title: Re: regex (regular expressions) - open discussion topic
Post by: hiccup on July 08, 2023, 11:06:14 AM

Quote from: Mayibongwe on July 08, 2023, 10:18:56 AM

@hiccup, would you consider linking the topic below somewhere in your start-post?
https://getmusicbee.com/forum/index.php?topic=38817.0

Good idea, it should have been there already ages ago. Done.

B.T.W.
A while back I played around with creating a custom Google search page that limits itself to searching the MusicBee forum.
I also added some 'keyword magic' that puts the most relevant page(s) at the top.

This is the link:
https://cse.google.com/cse?cx=426af386625354430
If you enter 'regex' as search query, both this topic and karbock's should be at the top.