Author Topic: Enhance duplicate files finding  (Read 4707 times)

zzh1989829

  • Guest
I think it would be good if, when searching for duplicate files, the tag value could be trimmed, stripped of special characters, and compared ignoring case.
e.g.
"R'nB"
"r'nB"
"RnB"
"rnb"
"RNB"
"r   n   b"
should be treated as duplicate values.
This means the search should compare on a key like the following:
     tagValue.trim(special chars).toLowerCase()
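
The pseudocode above could be sketched in Python roughly as follows. This is only an illustration of the idea, not how MusicBee works; `normalize_tag` is a hypothetical helper, and "special chars" is assumed here to mean anything that is not a letter or digit (including whitespace).

```python
import re

def normalize_tag(value: str) -> str:
    """Drop everything except letters and digits, then lowercase."""
    # [\W_] matches whitespace, punctuation, and the underscore.
    # Assumption: all of these count as "special chars".
    return re.sub(r"[\W_]+", "", value).lower()

variants = ["R'nB", "r'nB", "RnB", "rnb", "RNB", "r   n   b"]

# Every variant from the list above collapses to the same key.
keys = {normalize_tag(v) for v in variants}
```

With this definition, `keys` ends up holding the single value `"rnb"`, so all six spellings would be reported as duplicates of each other.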

boroda

  • Sr. Member
  • ****
  • Posts: 4595
+1. And I suppose that ignoring diacritics is fine as well.
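
For what it's worth, ignoring diacritics is usually done with Unicode decomposition. A minimal sketch in Python, where `strip_diacritics` is a made-up name and the approach is just one common technique, not anything taken from MusicBee:

```python
import unicodedata

def strip_diacritics(value: str) -> str:
    """Remove combining accents so that "Björk" compares equal to "Bjork"."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", value)
    # combining() returns 0 for base characters and non-zero for marks.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)

examples = [strip_diacritics(s) for s in ["Beyoncé", "Björk", "Motörhead"]]
```

Letters that have no decomposition, such as "ø", pass through unchanged, so this is not a complete answer either.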


zzh1989829

  • Guest
Sooo... how 'bout "R&B"? ;D
& is also a special char. I was just giving an example :)
I think the user should be able to define which characters are ignored when finding duplicates, and also which tag(s) get this enhanced handling.
Last Edit: September 21, 2010, 05:02:50 AM by zzh1989829
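
A rough sketch of what such user-defined settings might look like, in Python. Everything here (`DupMatchConfig`, `match_key`, the default values) is hypothetical and only illustrates the suggestion; it does not describe any existing MusicBee option.

```python
from dataclasses import dataclass, field

@dataclass
class DupMatchConfig:
    # Hypothetical user settings, chosen only for illustration.
    ignored_chars: str = "'&- "      # characters dropped before comparing
    ignore_case: bool = True
    enhanced_tags: set[str] = field(
        default_factory=lambda: {"artist", "album", "title"}
    )

def match_key(tag_name: str, value: str, config: DupMatchConfig) -> str:
    """Return the string that duplicate detection would compare on."""
    if tag_name not in config.enhanced_tags:
        return value  # other tags keep exact comparison
    # Delete every user-listed character from the value.
    key = value.translate(str.maketrans("", "", config.ignored_chars))
    return key.lower() if config.ignore_case else key

cfg = DupMatchConfig()
artist_key = match_key("artist", "AC-DC", cfg)
genre_key = match_key("genre", "R&B", cfg)
```

Here `artist_key` is `"acdc"`, while `genre_key` stays `"R&B"` because the genre tag is not in the enhanced set.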

Elberet

  • Full Member
  • ***
  • Posts: 167
Yeah, and 'R&B' is an example why matching like that is anything but foolproof. :)

According to your rules (remove "special" chars, collapse whitespace), "R'n'B", "RnB" and "RNB" are all considered equal, while "R&B" differs. But users will naturally expect that "R&B" == "RnB"...
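
To make that concrete, here is a small Python sketch that groups values by a key built from those rules. `group_by_normalized` is a hypothetical helper, and the rule "drop everything that is not an ASCII letter or digit" is an assumption about what "special" means.

```python
import re
from collections import defaultdict

def group_by_normalized(values):
    """Group tag values that share the same normalized key."""
    groups = defaultdict(list)
    for value in values:
        # Lowercase, then drop anything that is not a-z or 0-9.
        key = re.sub(r"[^0-9a-z]+", "", value.lower())
        groups[key].append(value)
    return dict(groups)

groups = group_by_normalized(["R'n'B", "RnB", "RNB", "R&B"])
```

The first three values share the key `"rnb"`, but "R&B" normalizes to `"rb"` and lands in a group of its own.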

Intelligent matching is a race between the programmer and users that the former can't ever win - not because we don't have the technology, but because no two people have the absolutely exact same idea of which two words are identical and which aren't. The best thing the program can do is to give the user the tools they need to get it to do what they want, which MusicBee already does quite well: have you checked out genre categories?

zzh1989829

  • Guest
I was just giving an example. Most of the time the duplicated tags are track names, albums, and artists, not the genre tag.
Anyway, the present function is good enough.