"Similar By Terms: Similarity" outputs invalid values like "1300%", "480%"

Created on 4 January 2011, over 13 years ago
Updated 12 June 2024, 17 days ago

The Views integration of 'Similar By Terms' can output the calculated similarity as a field in a view. In one view I just built, these fields output bizarre similarity values like "1300%", "480%", "400%", "280%", etc. This is quite irritating for users, since nobody understands what these values are supposed to mean.

I'm no math guru, but I'd guess that a complete match of all terms should be "100%" similarity, and if fewer than all terms match, a value below 100% should be calculated. I can't imagine a valid scenario where "1300%" similarity would make any sense, so maybe there is a flaw somewhere in the code.
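The expectation described above can be sketched in a few lines of Python. This is only an illustration of a percentage that stays within 0 to 100; it is not the module's actual calculation (the module is written in PHP), and the function name `similarity_percent` is made up for this example.

```python
def similarity_percent(node_terms, other_terms):
    """Share of the node's distinct terms that the other node also has (0..100).

    Hypothetical helper for illustration only; not taken from the module.
    """
    base = set(node_terms)  # de-duplicate so repeated term IDs count once
    if not base:
        return 0.0  # a node without terms cannot be similar to anything
    common = base & set(other_terms)
    return 100.0 * len(common) / len(base)
```

Because the number of common terms can never exceed the number of distinct terms on the node, a result above 100 is impossible with this definition. Values like "1300%" therefore suggest that the numerator is counting something other than distinct shared terms.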

Configuration of the Views field: "Show as percentage", "Append % when showing percentage".

I also tried switching from "Show as percentage" to "Show count of common terms"; this outputs strange figures as well. For example, I'm getting the number "65" (count of common terms!) for a node which has just 11 terms (of which only 3 should be evaluated according to the argument's settings).

The site is live and I can point out some links if this would help. This is not a major issue, since the nodes selected by similarity seem to be (more or less) correct, but it is really irritating for users, and the more I try to understand it, the more it confuses me.

Thanks & greetings,
-asb

✨ Feature request
Status

Active

Version

2.0

Component

Code

Created by

🇩🇪 Germany asb


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇩🇪 Germany hexabinaer Berlin, Germany

    This is still the case in the current 8.x version. In a second project, SbT delivers quite satisfying results, whereas my current project has deeply nested hierarchical vocabularies. I reckon this might be related to term children being wrongly counted in.
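The hypothesis above can be illustrated with a small Python sketch. It is purely hypothetical: the module's real query has not been checked here, and the row layout and function names are assumptions made up for this example. It only shows how counting one row per (term, hierarchy match) pair, instead of counting distinct terms, would inflate the "common terms" figure in a nested vocabulary.

```python
def count_common_rows(rows):
    """Naive count: every joined row adds one, so repeated terms are counted again."""
    return len(rows)


def count_common_distinct(rows):
    """Count each shared term ID once, however many rows mention it."""
    return len({term_id for term_id, _matched_via in rows})


# Hypothetical join result: (shared term ID, hierarchy term it was matched through).
# Terms 7 and 9 each appear three times because they sit two levels deep.
rows = [(7, 7), (7, 3), (7, 1), (9, 9), (9, 3), (9, 1), (12, 12)]
```

With this sample data, `count_common_rows(rows)` gives 7 while `count_common_distinct(rows)` gives 3. If a percentage is then computed from the inflated row count, it can exceed 100%, which would match the symptoms reported in this issue.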
