An in-between observation:
I noticed some lyrics missing when using the built-in MusixMatch engine that I believe I was able to retrieve before.
So I copied my stored MusixMatch yml to the providers folder, activated it, and deactivated the built-in one.
And indeed, I immediately got more results.
Some examples of songs the built-in MM doesn't find but the yml MM does:
Ivan Lins - Novo tempo
Margriet Eshuijs Band - Black Pearl
Chico Buarque - Fortaleza
Simone - Geraldinos e Arquibaldos
Ane Brun - Du gråter så store tåra
Nana Caymmi - Doce Presença
Walter Becker - Door Number Two
The yml:
name: Musixmatch (new)
variables:
    artist:
        type: artist
        filters:
        - strip_diacritics
        - lowercase
        - [regex, "'", ""]
        - [regex, "/", " "]
        - [regex, '\s&(?=\s)', " "]
        - [regex, '(?<=\W|\s)+(feat.+|ft[\W\s]+|(f\.\s)).+', ""]
        - [regex, '[^\sa-z0-9]\s*', ""]
        - [strip_nonascii, -]
    title:
        type: title
        filters:
        - strip_diacritics
        - lowercase
        - [regex, " '|' |/", " "]
        - [regex, "'", " "]
        - [regex, '\.+|,+|/+|(\W+(?=$))|(^\W+)', ""]
        - [regex, '\s&(?=\s)', " and"]
        - [strip_nonascii, -]
config:
    url: "http://www.musixmatch.com/lyrics/{artist}/{title}"
    pattern: ['<p class="mxm-lyrics__content.*?">(?<lyrics>.*?)<div [^>]*"lyrics-report".*?>', s]
post-filters:
- [regex, "<script.*?</script>", "", s]
- strip_html
- utf8_encode
- entity_decode
- clean_spaces
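To make it easier to compare against whatever the built-in one requests, here is a rough Python sketch of the URL I would expect these filters to produce. It is not the plugin's code. The function names are made up for illustration, the featured-artist regex is left out because its lookbehind does not port cleanly to Python, and I am assuming that strip_nonascii with "-" turns every run of characters outside a-z and 0-9 into a single hyphen.

```python
import re
import unicodedata


def strip_diacritics(text):
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_nonascii(text, replacement="-"):
    # ASSUMPTION: every run of characters other than letters and digits
    # becomes one replacement string; leading/trailing ones are trimmed.
    collapsed = re.sub(r"[^A-Za-z0-9]+", replacement, text)
    return collapsed.strip(replacement)


def artist_slug(artist):
    # Mirrors the artist filters in order (featured-artist regex omitted).
    text = strip_diacritics(artist).lower()
    text = text.replace("'", "")
    text = text.replace("/", " ")
    text = re.sub(r"\s&(?=\s)", " ", text)
    text = re.sub(r"[^\sa-z0-9]\s*", "", text)
    return strip_nonascii(text)


def title_slug(title):
    # Mirrors the title filters in order.
    text = strip_diacritics(title).lower()
    text = re.sub(r" '|' |/", " ", text)
    text = text.replace("'", " ")
    text = re.sub(r"\.+|,+|/+|(\W+(?=$))|(^\W+)", "", text)
    text = re.sub(r"\s&(?=\s)", " and", text)
    return strip_nonascii(text)


def build_url(artist, title):
    # Same template as the url line in the yml.
    return "http://www.musixmatch.com/lyrics/{}/{}".format(
        artist_slug(artist), title_slug(title)
    )


print(build_url("Nana Caymmi", "Doce Presença"))
# http://www.musixmatch.com/lyrics/nana-caymmi/doce-presenca
```

This only shows how the artist and title are normalised into the URL; it does not fetch anything or check that the page exists.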
Any idea why the built-in one finds fewer lyrics than the yml?