Messages - crisp

1
Plugins / Re: LyricsReloaded (Updated)
« on: June 18, 2022, 03:11:26 AM »
With Musixmatch, the newline character is missing only after the third line of lyrics.

EDIT: the culprit resides in the webpage's structure, which has 2 distinct paragraphs for the first 3 lines and the rest of the lyrics. I would like to edit the yml config but I don't know how to do that correctly.

You can download the Musixmatch yml from the repo, put it in %APPDATA%\MusicBee\mb_LyricsReloaded\providers, change the name inside the yml (as per usual), and add this line before strip_html in the post-filters:
- [regex, '<div class="inline_video_ad_container_container">', "\n", s]
There's a div for ads after the first 3 lines. It gets stripped out anyway, so you can replace it with a newline.
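
To show why the replacement has to run before strip_html, here is a rough Python sketch. The sample markup and the `strip_tags` helper are made up for illustration. The plugin itself uses .NET regexes and its own strip_html filter, so treat this as an approximation of the idea, not of the plugin's internals.

```python
import re

# Hypothetical fragment imitating the layout described above: one paragraph
# with the first three lines, the ad container, then the remaining lyrics.
AD_DIV = '<div class="inline_video_ad_container_container">'
sample = (
    '<p>line 1\nline 2\nline 3</p>'
    + AD_DIV + '</div>'
    + '<p>line 4\nline 5</p>'
)


def strip_tags(markup: str) -> str:
    """Crude stand-in for the plugin's strip_html filter (assumption)."""
    return re.sub(r'<[^>]+>', '', markup)


# Without the extra filter, lines 3 and 4 are glued together.
without_fix = strip_tags(sample)

# With the ad div turned into a newline first, the break survives stripping.
with_fix = strip_tags(re.sub(re.escape(AD_DIV), '\n', sample))

print(repr(without_fix))
print(repr(with_fix))
```

If the replacement ran after strip_html instead, the div would already be gone and there would be nothing left to match.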

2
Plugins / Re: LyricsReloaded (Updated)
« on: December 01, 2021, 05:43:30 PM »
For anyone who wants to use the update provided by crisp, and is too lazy to make sveakul's changes...
Thanks phred, if this doesn't work for anyone, it's just a missing square bracket at the end of the last regex. And if you wanna get rid of the leftover double newlines, this can go right after that regex:
Code
- [regex, '\n{2,}', "\n\n", 's']
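
For anyone unsure what that filter does, a small Python sketch of the same substitution follows. The sample text is invented, and Python's `re` stands in for the plugin's .NET regex engine, which should behave the same way for a pattern this simple.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Replace every run of two or more newlines with exactly two."""
    return re.sub(r'\n{2,}', '\n\n', text)


# Invented sample: stanzas separated by too many blank lines.
lyrics = "verse one\n\n\n\nverse two\nstill verse two\n\n\nchorus"
cleaned = collapse_blank_lines(lyrics)
print(cleaned)
```

Single newlines inside a stanza are left alone, so only the gaps between stanzas are normalized.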

3
Plugins / Re: LyricsReloaded (Updated)
« on: December 01, 2021, 04:09:18 AM »
Well, this is embarrassing: I wasn't seeing lyrics because at some point during debugging I removed the 's' flag from the config pattern.
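
A minimal Python sketch of why that flag matters: without it, `.` does not match newlines, so a capture that has to span several lines finds nothing. The page fragment and the `xyz` class suffix are invented. Python writes named groups as `(?P<lyrics>...)` and the flag as `re.DOTALL`, whereas the plugin's pattern uses the .NET spelling `(?<lyrics>...)` with 's'.

```python
import re

# Invented multi-line fragment shaped like the markers in the config pattern.
page = (
    '<div id="lyrics-root-pin-spacer">first line<br/>\n'
    'second line<br/>\n'
    '<div class="Lyrics__Footer-sc-xyz">footer</div>'
)

pattern = (
    r'<div id="lyrics-root-pin-spacer">(?P<lyrics>.*)'
    r'<div class="Lyrics__Footer-sc-'
)

# Without the flag, '.*' stops at the first newline and the match fails.
no_flag = re.search(pattern, page)

# With the flag, '.' also matches newlines and the whole block is captured.
with_flag = re.search(pattern, page, re.DOTALL)

print(no_flag)
print(repr(with_flag.group('lyrics')))
```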

@crisp:  I'm guessing that you haven't found a way to restore this yml despite the data I PM'd you?  If not, thanks anyway for making the attempt.  I know it's bound by the limits of the dll itself.
Thanks sveakul, your message pointed me in the right direction. Here's the updated yml:
Code
name: Genius (2021-11-30)

variables:
    artist:
        type: artist
        filters:
        - strip_diacritics
        - lowercase
        - [replace, "!!!", "chk-chik-chick"]
        - [regex, '(?<=\W|\s)+(feat.+|ft[\W\s]+|(f\.\s)).+', ""]
        - [regex, '\.+|,+|(\W+(?=$))|(^\W+)', ""]
        - [regex, "'", ""]
        - [regex, '(?<=[a-z0-9%])[^\sa-z0-9%]+(?=[a-z0-9%]+)', "-"]
        - [regex, '((?<=\s)([^a-z0-9\s-])+(\s|\W)+)|((?<=\w)([^a-z0-9-])+(\s|\W)+)', " "]
        - [strip_nonascii, -]
    title:
        type: title
        filters: artist

config:
    url: "https://genius.com/{artist}-{title}-lyrics"
    pattern: ['<div id="lyrics-root-pin-spacer">(?<lyrics>.*)<div class="Lyrics__Footer-sc-', 's']

post-filters:
- br2nl
- strip_html
- utf8_encode
- entity_decode
- clean_spaces
- [regex, '[\[\{].{1,75}[\]\}]', ""]
- [replace, "\n\n", "\n"]
- trim

4
Plugins / Re: LyricsReloaded (Updated)
« on: November 21, 2021, 02:26:22 AM »
This is an odd one. The way I zeroed in on the solution before was (1) making the config pattern as permissive as possible and turning off all the post-filters to dump the whole HTML in the MusicBee lyrics pane, (2) finding the tags that surrounded the lyrics (i.e., Lyrics__Container and Lyrics__Footer) and modifying the config pattern with them, and (3) iteratively adding post-filters to get rid of HTML tags, bad formatting, etc. In this case, I'm not getting any HTML at all when I use the config pattern '(?<lyrics>.*)' (which I think should just capture the whole page?). I do get a "Lyrics found" printout in the log file though.
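
The three steps can be mimicked offline in Python, roughly as below. Everything here is a sketch under assumptions: the page is an invented string, the `-abc` class suffixes are placeholders, and the Python calls only approximate what the plugin's filters do.

```python
import html
import re

# Invented page standing in for the HTML the plugin would download.
page = (
    '<html><body><nav>menu</nav>\n'
    '<div class="Lyrics__Container-abc">Rock &amp; roll<br/>all night</div>\n'
    '<div class="Lyrics__Footer-abc">footer</div></body></html>'
)

# Step 1: fully permissive pattern, no filters. This should dump the page.
everything = re.search(r'(?P<lyrics>.*)', page, re.DOTALL).group('lyrics')

# Step 2: narrow the pattern to the tags that surround the lyrics.
narrow_pattern = (
    r'<div class="Lyrics__Container.*?">(?P<lyrics>.*)'
    r'<div class="Lyrics__Footer.*?">'
)
raw = re.search(narrow_pattern, page, re.DOTALL).group('lyrics')

# Step 3: add clean-up filters one at a time, checking the output after each.
text = raw.replace('<br/>', '\n')      # line breaks to newlines
text = re.sub(r'<[^>]+>', '', text)    # crude tag stripping
text = html.unescape(text)             # decode entities such as &amp;
text = text.strip()                    # trim leading/trailing whitespace

print(text)
```

Running the same pattern against a saved copy of the page is a quick way to tell whether the pattern or the download is the problem.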

My first instinct was to curl an example URL from the log file, which got me a tiny HTML telling me I got redirected. If I uppercased the first character of the artist's name (so `curl https://genius.com/Artist-title-lyrics` instead of `curl https://genius.com/artist-title-lyrics`), curl returned the whole page as expected, but doing the same thing on the plugin side didn't really help. Also tried building the plugin myself to add some debug prints, but it looks like it depends on an old unavailable (?) version of YamlDotNet, so no luck there either.

5
Plugins / Re: LyricsReloaded (Updated)
« on: October 16, 2021, 04:57:33 PM »
I think I'm getting a yaml for the actual Genius too, but it's really complicated. Just nothing else works anymore, this plugin needs updating.

I think I have a yaml that works for Genius (until they change their formatting again).
Code
name: Genius

variables:
    artist:
        type: artist
        filters:
        - strip_diacritics
        - lowercase
        - [replace, "!!!", "chk-chik-chick"]
        - [regex, '(?<=\W|\s)+(feat.+|ft[\W\s]+|(f\.\s)).+', ""]
        - [regex, '\.+|,+|(\W+(?=$))|(^\W+)', ""]
        - [regex, "'", ""]
        - [regex, '(?<=[a-z0-9%])[^\sa-z0-9%]+(?=[a-z0-9%]+)', "-"]
        - [regex, '((?<=\s)([^a-z0-9\s-])+(\s|\W)+)|((?<=\w)([^a-z0-9-])+(\s|\W)+)', " "]
        - [strip_nonascii, -]
    title:
        type: title
        filters: artist

config:
    url: "https://genius.com/{artist}-{title}-lyrics"
    pattern: ['<div class="Lyrics__Container.*?">(?<lyrics>.*)<div class="Lyrics__Footer.*?">']

post-filters:
- utf8_encode
- entity_decode
- [regex, "<br/>", "\n"]
- strip_html
- clean_spaces

I haven't tested it extensively, but it works well when the URL is generated right. A few test failures I found:
1) Artist "X, the Y" is logged by the plugin as "X". I suspect this happens before the plugin regexes anything (I turned off the filters to check), maybe Musicbee just gives the plugin the segment before the first comma.
2) "X (Y)" is passed to the plugin as "X". Similar issue as before, but with parentheses. In my test case, Y wasn't a featured artist. Curiously, "X (Y) (ft. Z)" was correctly logged as "X (Y)". Is it only the last parenthesized phrase that's removed?
3) "Cygnus....Vismund Cygnus" is regexed to "cygnusvismund-cygnus", while the Genius URL expected "cygnus-vismund-cygnus". Other titles with ellipses remove them altogether, maybe this needs special handling.
