Those Genius stats are insane, and I assume they're from v1.1.15.5, as 15.4 would definitely not have shown 0% for incorrect lyrics.
The downside of the Google search engine is that it is biased towards English results.
Using the song examples that you provided, I tried to make the correction that I'd said I'd make, but ended up not doing so because I rolled back to the first working version of the yml.
Tracks without a comment indicate that there was a correct match.

Ivan Lins - Novo tempo
Margriet Eshuijs Band - Black Pearl
Chico Buarque - Fortaleza
Simone - Geraldinos e Arquibaldos
Ane Brun - Du gråter så store tåra (the English translation was ahead of the original in the Google search results)
Nana Caymmi - Doce Presença
Walter Becker - Door Number Two
Ivan Lins - Dinorah, Dinorah
Jacques Brel - Les Bonbons 67
Carla Bruni - Quelqu'un m'a dit (the version listed at the top of the results has an English translation alongside, and the yml's pattern is not built for that)
Simone - Pétala
Sharon Robinson - Sustenance (correctly returns a no match like the original)
Simone - Fantasia (the original cannot match it because the artist on the site is 'Suzane & Simone', but with this yml the match would occur)
..............................
Overall comments:
- The Musixmatch yml below (which makes use of Google searching) would mostly suit those who have most of their collection in English.
- There would also be a side effect: songs that aren't available on the website would return incorrect lyrics instead of no match.
- Other users are right that this behaviour would not be ideal as the default.

name: Musixmatch_Google
loader: search
variables:
    artist:
        type: artist
        filters:
        - lowercase
    title:
        type: title
        filters: artist
config:
    identity url: "https://www.google.com/search?q=Musixmatch+{title}+{artist}"
    identity pattern: ['(?<identity>https://www.musixmatch.com.*?)%3Futm_|(?<identity>https://www.musixmatch.com.*?)["&]', 's']
    lyrics url: ""
    lyrics pattern: ['<p class="mxm-lyrics__content.*?">(?<lyrics>.*?)<div [^>]*"lyrics-report".*?>', s]
post-filters:
- [regex, "<script.*?</script>", "", s]
- [regex, '<div class="inline_video_ad_container_container">', "\n", s]
- strip_html
- utf8_encode
- entity_decode
- clean_spaces
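
For anyone who wants to see what the identity step does outside the plugin, here is a rough Python sketch of it. It is only an approximation under a few assumptions: that "filters: artist" means the title reuses the artist's lowercase filter, that spaces in the query end up as "+", and that the two regex alternatives can be merged into one (Python's re module does not accept the same group name twice). The function names and the sample HTML snippets are made up for illustration and are not real Google output.

```python
import re
from typing import Optional
from urllib.parse import quote_plus

# Approximation of the yml's identity pattern. The original has two
# alternatives that both capture "identity"; here they are merged into a
# single lazy match that stops at the first "%3Futm_", '"' or "&".
IDENTITY_RE = re.compile(
    r'(?P<identity>https://www\.musixmatch\.com.*?)(?:%3Futm_|["&])',
    re.DOTALL,  # plays the role of the 's' option in the yml
)


def build_search_url(artist: str, title: str) -> str:
    """Fill in the 'identity url' template with lowercased title and artist."""
    query = f"Musixmatch {title.lower()} {artist.lower()}"
    return "https://www.google.com/search?q=" + quote_plus(query)


def extract_identity(html: str) -> Optional[str]:
    """Return the first Musixmatch link found in the results page, if any."""
    match = IDENTITY_RE.search(html)
    return match.group("identity") if match else None


if __name__ == "__main__":
    print(build_search_url("Chico Buarque", "Fortaleza"))
    # Hypothetical result snippet, only to exercise the pattern.
    sample = '<a href="/url?q=https://www.musixmatch.com/lyrics/Chico-Buarque/Fortaleza&amp;sa=U">'
    print(extract_identity(sample))
```

Note that the sketch simply takes the first Musixmatch link on the page, which is why a translation page ranked above the original (as with the Ane Brun track) gets picked up, and why a song missing from the site can still return some other song's lyrics.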