Documentation
================
Regular expressions
This plugin relies heavily on Microsoft-flavored regular expressions for its functionality. Tutorials and tools for writing these expressions are widely available online; RegExr, for example, is a great tool for writing and testing regular expressions.
Regexes can be specified in 2 ways:
1. a single string, which specifies only the regex
2. an array of 2 strings that specifies the regex as its first element and the regex options as its second element. Example: ["regex", is]
There are 2 things to watch out for:
1. backslashes (\) must be escaped when using double quotes ("), because the backslash is a meta character inside double-quoted strings in the YAML format: \s -> \\s
2. named capturing groups are defined like this: (?<lyrics>.*?) with "lyrics" being the name of the group
Regex options
The regex options are specified as a string that contains the characters for the options. A lowercase character enables an option, an uppercase character disables it.
The options:
i: the regex is case insensitive
s: the input string is seen as a single line
m: the input is seen as multiple lines
c: the regex will be compiled (improves execution performance, but slows startup)
x: whitespace in the regex will be ignored (nice for complex regexes)
d: the regex will go from right to left through the string
e: only named capturing groups will be used
j: the regex will be ECMAScript compatible
l: the regex will be culture invariant
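The lowercase/uppercase rule can be sketched in Python. This is only an illustration of how such an option string might be read, not the plugin's implementation: the function names `parse_options` and `to_python_flags` are hypothetical, starting from an empty set of options is an assumption, and only `i`, `s`, `m` and `x` have a direct Python counterpart.

```python
import re

# Options that have a direct equivalent in Python's re module.
# The remaining documented options (c, d, e, j, l) have none and are ignored here.
PYTHON_FLAGS = {
    "i": re.IGNORECASE,  # case insensitive
    "s": re.DOTALL,      # input is seen as a single line
    "m": re.MULTILINE,   # input is seen as multiple lines
    "x": re.VERBOSE,     # whitespace in the regex is ignored
}
KNOWN_OPTIONS = set("ismcxdejl")


def parse_options(option_string):
    """Return the set of enabled option letters (hypothetical helper).

    Assumption: parsing starts with no options enabled.
    """
    enabled = set()
    for char in option_string:
        letter = char.lower()
        if letter not in KNOWN_OPTIONS:
            raise ValueError(f"unknown regex option: {char!r}")
        if char.islower():
            enabled.add(letter)      # lowercase enables the option
        else:
            enabled.discard(letter)  # uppercase disables the option
    return enabled


def to_python_flags(enabled):
    """Combine the options that Python understands into re flags."""
    flags = 0
    for letter in enabled:
        flags |= PYTHON_FLAGS.get(letter, 0)
    return flags
```

With this sketch, `parse_options("isI")` leaves only `s` enabled, because the trailing uppercase `I` switches the case-insensitive option off again.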
==============================================================
Filters
Filters are small functions that modify the given content. Filters can currently be applied to variable values and the lyrics content.
Important: The filters are executed in the specified order, so stripping HTML-tags before converting <br> tags to newlines won't get you far.
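Why the order matters can be shown with a small Python sketch. The two functions below are simplified stand-ins for strip_html and br2ln written for this illustration (their regexes are assumptions, not the plugin's code), and `apply_filters` is a hypothetical helper that runs filters in the given order.

```python
import re


def strip_html(text):
    # Simplified stand-in: drop anything that looks like an HTML tag.
    return re.sub(r"<[^>]+>", "", text)


def br2ln(text):
    # Simplified stand-in: turn <br>, <br/> and <br /> into newlines.
    return re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)


def apply_filters(text, filters):
    # Run the filters one after another, in the order they were specified.
    for filter_func in filters:
        text = filter_func(text)
    return text


raw = "first line<br>second line"

# Stripping tags first removes the <br>, so br2ln has nothing left to convert.
wrong_order = apply_filters(raw, [strip_html, br2ln])

# Converting <br> first preserves the line break.
right_order = apply_filters(raw, [br2ln, strip_html])
```

Here `wrong_order` ends up as a single line with the two parts glued together, while `right_order` keeps the line break.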
strip_html
This filter removes all HTML tags from the content.
entity_decode
This filter decodes HTML entities like &copy; -> ©.
strip_links
This filter removes links from the lyrics.
utf8_encode
This filter converts the content's encoding to UTF-8 (without BOM).
br2ln
This filter converts <br> tags to newlines (\n).
p2break
This filter converts </p> tags to 2 newlines (\n) indicating a new paragraph.
clean_spaces
This filter cleans up the whitespace of the content by normalizing line endings, converting tabs to spaces, vertical tabs to newlines and removing unnecessary newlines and spaces.
trim
This filter removes whitespace from the beginning and the end of the content.
lowercase
This filter converts the whole content to lower case. Optionally you can provide a culture name as the first argument.
By default the conversion is culture unaware.
uppercase
This filter converts the whole content to upper case. Optionally you can provide a culture name as the first argument.
By default the conversion is culture unaware.
diacritics2ascii
This filter removes diacritics from the content, so "äöüß" becomes "aous".
umlauts2ascii
This filter is a specialized version of diacritics2ascii that handles only the German umlauts and replaces them with their two-character representation, so "äöüß" becomes "aeoeuess".
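The difference between the two filters can be approximated in Python as follows. This is a sketch under assumptions, not the plugin's code: the uppercase mappings and the special case that turns "ß" into "s" are guesses chosen to reproduce the documented examples.

```python
import unicodedata

# Two-character replacements for the German umlauts.
# The uppercase entries are an assumption; the text above only shows lowercase.
UMLAUT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def umlauts2ascii(text):
    # Replace only the German umlauts and leave every other character alone.
    return "".join(UMLAUT_MAP.get(char, char) for char in text)


def diacritics2ascii(text):
    # "ß" has no Unicode decomposition, so it is mapped by hand (assumption)
    # to match the documented result "aous".
    text = text.replace("ß", "s")
    # Decompose characters into base letter + combining marks, then drop the marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        char for char in decomposed if unicodedata.category(char) != "Mn"
    )
```

Note that in this sketch umlauts2ascii leaves other accented letters such as "é" untouched, while diacritics2ascii reduces them to their base letter.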
urlencode
This filter URL-encodes the content where necessary, so a space becomes +.
urlencode
This filter URL-encodes the content where necessary, so a space becomes %20.
regex
This filter does a regex replace. The first argument is the regex (which will be cached) and the second argument is the replacement, which may contain backreferences. Optionally, a third argument can be given that specifies the regex options.
Example usage: [regex, '\s+?', " "]
strip_nonascii
This filter removes all non-ASCII characters. The filter has 2 optional arguments: The first is a replacement for the removed character and the second one specifies whether the replacement can be inserted multiple times in a row.
Examples:
1. strip_nonascii -> "test *** test" -> "testtest"
2. [strip_nonascii, -] -> "test *** test" -> "test-test"
3. [strip_nonascii, -, duplicate] -> "test *** test" -> "test-----test"
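One possible reading of the two optional arguments, sketched in Python: without "duplicate", a run of consecutive non-ASCII characters yields a single replacement; with it, every removed character yields its own replacement. This interpretation and the function signature are assumptions made for illustration, not the plugin's actual code, and the sample input with three heart characters is made up.

```python
import re

# A run of one or more non-ASCII characters, and a single non-ASCII character.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7F]+")
NON_ASCII_CHAR = re.compile(r"[^\x00-\x7F]")


def strip_nonascii(text, replacement="", duplicate=False):
    # With duplicate=True each removed character is replaced individually,
    # otherwise a whole run collapses into one replacement.
    pattern = NON_ASCII_CHAR if duplicate else NON_ASCII_RUN
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted literally.
    return pattern.sub(lambda match: replacement, text)


sample = "test\u2665\u2665\u2665test"  # "test" + three heart characters + "test"
removed = strip_nonascii(sample)
collapsed = strip_nonascii(sample, "-")
repeated = strip_nonascii(sample, "-", duplicate=True)
```

For this sample, `removed` is "testtest", `collapsed` is "test-test" and `repeated` is "test---test".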
replace
This filter replaces the given search string (first argument) with the replacement (second argument).
Example usage: [replace, search, replace]
===============================================================
Validators
Validators verify the loaded lyrics. This is necessary, for example, when a website doesn't return an error 404 when lyrics were not found, but instead shows a page in the exact same format with a "not found" message in place of the lyrics. The result of a validator can be inverted by prefixing its name with "not ".
Examples:
- [contains, lyrics]
- [not contains, not found]
contains
This validator checks whether the content contains a given string (first argument).
matches
This validator checks whether the given regex matches something in the content. It takes a regular expression (first argument) and options for it (second argument).
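How the "not " prefix might be evaluated can be sketched in Python. The helper `run_validator` and the `VALIDATORS` table are hypothetical names for this illustration; case-sensitive matching in `contains` and passing Python flags instead of an option string to `matches` are assumptions as well.

```python
import re


def contains(content, needle):
    # Assumption: a plain, case-sensitive substring check.
    return needle in content


def matches(content, pattern, flags=0):
    # True if the regex matches anywhere in the content.
    return re.search(pattern, content, flags) is not None


VALIDATORS = {"contains": contains, "matches": matches}


def run_validator(content, spec):
    """Evaluate a validator given as [name, arg, ...], honoring a "not " prefix."""
    name, *args = spec
    inverted = name.startswith("not ")
    if inverted:
        name = name[len("not "):]
    result = VALIDATORS[name](content, *args)
    return not result if inverted else result
```

For a page that reads "Sorry, lyrics not found", `run_validator(page, ["not contains", "not found"])` evaluates to `False`, so the page would be rejected.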
======================================================
Example Configuration YML File
# the name of the provider. this will be shown in MusicBee's settings
name: 'Example'
# the loader for this provider: static, search, api
loader: static
# prepare the input
variables:
    # filters to apply to the artist
    artist:
        type: artist # the source of the value
        filters:
        - strip_diacritics
        - [stripdown, _]
        - urlencode
    # filters to apply to the album
    # album: skip entry omitted as it isn't needed
    # filters to apply to the title
    title:
        type: title
        filters: artist # reference the filters of artist
post-filters:
- strip_html
- utf8_encode
- trim
validations:
- [not contains, Click here to submit these lyrics]
config:
    # the URL to request. {artist}, {album} and {title} are placeholders for the values from the song.
    url: "http://www.azlyrics.com/lyrics/{artist}/{title}.html"
    # The regular expression to apply to the content of the website. The pattern must contain a named capturing group called "lyrics" like: (?<lyrics>.+?)
    # variables are allowed as well
    pattern: '<!-- start of lyrics -->(?<lyrics>.+?)<!-- end of lyrics -->'
    # The options for the pattern:
    # - i: case insensitive
    #
    # more to come
    pattern-options: 'i'