Tuesday, 28 August 2018

Similarity test algorithm

I have developed an algorithm that calculates an index proportional to the melodic similarity between samples. It considers melody, rhythm, harmony and some other factors as well. Some of these factors are calculated in a fairly sophisticated way in order to produce a reasonable value.

The melody factor considers same pitch, different pitch (decreasing effect), closely timed notes, intervals, repetitions, ...
The rhythm factor considers note locations and, to some extent, rests as well.
The harmony factor considers chord changes, weighted by how usual or unusual they are.
The "others" factor considers tempo, location (section, phrase, bar) and similarity of instrumentation. More details are available for those who are interested.

The test is still being fine-tuned.
The test results below are preliminary, so the values may change a bit up or down. Keep in mind that the proposed limit is around 8.0, and it has to be handled carefully. Between 7.0 and 9.0 there is a "gray" range, but outside this range the case is more or less black or white.
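
As a rough illustration of how the limit and the gray range can be read, here is a minimal Python sketch. The function names, the labels and the inclusive boundaries are assumptions made for this example; only the limit of about 8.0 and the 7.0 to 9.0 range come from the description above. The three sample values are taken from the results listed below.

```python
LIMIT = 8.0                       # proposed limit (approximate)
GRAY_LOW, GRAY_HIGH = 7.0, 9.0    # "gray" range around the limit


def parse_index(text):
    """Convert a value written with a decimal comma (e.g. "11,96") to a float."""
    return float(text.replace(",", "."))


def classify(index):
    """Label an index; labels and inclusive boundaries are illustrative assumptions."""
    if GRAY_LOW <= index <= GRAY_HIGH:
        return "gray range"
    return "above the limit" if index > LIMIT else "below the limit"


for text in ("11,96", "7,40", "3,91"):
    print(text, "->", classify(parse_index(text)))
```

Since the test is still being fine-tuned, the constants should be treated as placeholders rather than final values.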

1) Stay With Me vs. I Won't Back Down
Similarity index: 11,96

Melody: 9,23
Rhythm: 1,14
Harmony: 1,02
Others: 1,11

Clear case.
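
A quick sanity check on the figures above: the four factors multiply to roughly the reported index. The sketch below assumes such a multiplicative combination only as a reading of the published numbers, not as a description of the full algorithm, and the function name is made up for the example.

```python
import math


def combined_index(melody, rhythm, harmony, others):
    """Multiply the four factors (an assumed combination, inferred from the figures)."""
    return math.prod([melody, rhythm, harmony, others])


# Factors reported for Stay With Me vs. I Won't Back Down
estimate = combined_index(9.23, 1.14, 1.02, 1.11)
print(round(estimate, 2))  # lands near the reported 11,96, not exactly on it
```

Not every case listed here matches this product equally closely, so it is best read as a rule of thumb.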

2) Blurred Lines vs. Got To Give It Up

2a) Blurred Lines vs. Got To Give It Up - "signature phrase"
Similarity index: 2,73

Melody: 2,7
Rhythm: 1,02
Harmony: 0,9
Others: 1,08

This result is maximized by trimming off the non-matching notes. Without the trimming the
entire phrase would result in a negative value because of the many differing notes.
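
The effect of trimming can be illustrated with a toy model. Assume, purely for illustration, that each matching note adds 1 and each non-matching note subtracts 1 (the real scoring is more involved), and that trimming means keeping the contiguous run of notes with the highest total. The function name and the note scores are hypothetical.

```python
def best_trimmed_span(note_scores):
    """Return (total, start, end) of the best contiguous span; end is exclusive.

    Expects a non-empty list. Toy model only: +1 for a match, -1 for a mismatch.
    """
    best_total, best_start, best_end = note_scores[0], 0, 1
    run_total, run_start = 0, 0
    for i, score in enumerate(note_scores):
        if run_total <= 0:
            # A run that has dropped to zero or below cannot help: start over here.
            run_total, run_start = score, i
        else:
            run_total += score
        if run_total > best_total:
            best_total, best_start, best_end = run_total, run_start, i + 1
    return best_total, best_start, best_end


# Hypothetical phrase: a few matching notes surrounded by differing ones
scores = [-1, -1, 1, 1, -1, 1, -1, -1, -1]
print(sum(scores))                # whole phrase: negative total
print(best_trimmed_span(scores))  # trimmed span: positive total
```

In this toy phrase the untrimmed total is negative while the trimmed span is positive, which is the same pattern as in the result above.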

2b) Blurred Lines vs. Got To Give It Up - "hook"
Similarity index: 3,91
Melody: 3,45
Rhythm: 1,09
Harmony: 1,06
Others: 0,98

By far the highest result in the case. Only one perfect match, plus three close ones.

2c) Blurred Lines vs. Got To Give It Up - bass
Similarity index: 1,48

Melody: 1,23
Rhythm: 1,01
Harmony: 0,95
Others: 1,25

This result, too, is maximized by trimming off the non-matching notes. Without the trimming the
entire phrase would result in a negative value because of the many differing notes.

2d) Blurred Lines vs. Got To Give It Up - 5 to 1 bass motif
Similarity index: 1,30

Melody: 1,72
Rhythm: 0,80
Harmony: 0,79
Others: 1,19


2e) Blurred Lines vs. Got To Give It Up - hey-hey-hey
Similarity index: 0,45

Melody: 0,70
Rhythm: 0,80
Harmony: 0,79
Others: 1,02

The lowest index. This motif was also pointed out as similar by musicologists,
and it was later testified to be substantially similar (along with all the other points).

2f) Blurred Lines vs. Got To Give It Up - "keep on dancin"
Similarity index: 0,93

Melody: 1,10
Rhythm: 1,00
Harmony: 0,91
Others: 0,93

Summary of the six Blurred Lines vs Got To Give It Up samples:
a:2,76
b:4,31
c:1,48
d:1,37
e:0,45
f:0,93

Samples c to f range from 0,55 to 1,43. We could just say "no comment", but this cries out
for a comment. These are ridiculously low values to label as substantially similar,
or even just similar. The Gaye party's expert claimed in her testimony that each
of these was substantially similar, in the musicological meaning of the word.

Also note that none of these patterns occur simultaneously or consecutively.
Now think over what percentage of randomly chosen (pop) songs contain
melodic coincidences at least as strong as 4,17 and 2,76.


3) Blurred Lines vs Another One Bites The Dust
Similarity index: 3,46
It's just a melismatic motif with nine (!) consecutive matching notes that follow a commonplace pattern. The algorithm effectively compensates for repeated commonplace motifs.


Melody: 4,6
Rhythm: 1,11
Harmony: 0,8
Others: 0,9


4) Sweet Child O' Mine vs. Unpublished Critics
Similarity index: 5,72

Melody: 4,1
Rhythm: 1,03
Harmony: 1,1
Others: 1,28

This refers only to the verse melodies. As in 2a), the result would be much lower (a negative value) if the comparison considered the entire phrase. To get a higher result, the non-matching notes were trimmed from the melodic comparison. In this case there were other similar details as well.


5) Creep vs. Air That I Breathe
Similarity index: 9,14

Melody: 7,01
Rhythm: 1,13
Harmony: 1,15
Others: 1,00

The compared pattern in Creep is the falsetto sung melody after the "solo".


6) Get Free vs. Creep
Similarity index: 9,64 (it depends!)

Note that in this case the melodies at issue in Creep are different from those that are similar to Air That I Breathe. The two cases are melody-wise independent of each other.
The melodies in this case are only partly similar. Some phrases are rather different. The rough placement of the phrases is similar in both songs: they start 2-3 beats before the downbeat of the actual harmonic phrase (where the chords change).
There are two different verses in both songs: slightly different in Get Free, more different in Creep (phrases 3 and 4). To maximize the matching notes I had to take the closer variant of the verses, which is the first verse in Creep.
The highest result was obtained by considering phrases 3-4 of the verse through phrases 1-2 of the chorus. This is a "cheat" in favour of Creep, since these phrases are not adjacent to the chorus phrases. Without this cheat the index would not reach the proposed limit of 8.0!

Melody: 8,5
Rhythm: 0,87
Harmony: 1,18
Others: 1,10

7) Photograph vs. Amazing
The highest score comes from the first ABB sequence, which shows a similarity index of 10,90 according to the algorithm.

Melody: 9,03
Rhythm: 1,06
Harmony: 1,09
Others: 1,04


8) Come As You Are vs. Eighties
Similarity index: 12,64

Melody: 9,13
Rhythm: 1,03
Harmony: 1,08
Others: 1,24

13,81 considering the repetitions.

Clear case? Not quite! Just to mess things up:

Eighties (1985) vs. Life Goes On (1982)
Similarity index: 12,06 or 16,88 considering the repetitions.

Come As You Are vs. Life Goes On
Similarity index: 10,52
11,19 considering the repetitions.


9) Love Is A Wonderful Thing (Isley Brothers) vs. Da Doo Ron Ron
Similarity index: 7,40
Under the limit.

Melody: 6,62
Rhythm: 1,06 (the shuffle beat difference is accounted for in the "others" factor with 0,9)
Harmony: 1,02
Others: 1,04

10) Love Is A Wonderful Thing (Michael Bolton) vs. Love Is A Wonderful Thing (Isley Brothers)
Similarity index: 6,78

Melody: 3,90
Rhythm: 1,14
Harmony: 0,98
Others: 1,56 (the four identical words alone contribute a gain of 1,2)

This best result was obtained by choosing the title phrase variant that occurs only once in Bolton's song, next to the sax solo. The most frequently occurring Bolton variants resulted in an index of 3,82.

11) Thinking Out Loud vs. Let's Get It On

11a) The basic bass loop.
Similarity index: 7,37

This is a surprisingly high index for a four-note melody. It accounts for the looping with a "gain" of 1,4. Since it is a commonplace motif even in prior art, it does not matter much.

Melody: 6,30
Rhythm: 1,10 (if the 3+5 pattern were not commonplace, this factor would be higher)
Harmony: 1,04
Others: 1,80

11b)
TOL verse 1st phrase vs. LGIO chorus 3rd phrase
The opening notes, the title phrase in LGIO, are a traditional fanfare motif. The compared fragment is a melismatic motif in LGIO: only 3-4 notes, since the rest is rather different.
Similarity index: 3,03

Melody: 2,9
Rhythm: 1,01
Harmony: 1,07
Others: 0,97

11c)
TOL verse 2nd phrase vs. LGIO chorus 4th phrase
(the 3 5 6 5 3 motif)
Similarity index: 2,81

Melody: 2,9
Rhythm: 1,13
Harmony: 0,93
Others: 0,92

11d)
TOL verse vs. LGIO verse
Very different melodies. There is a two note fragment that is "similar".
Similarity index: 1,52

Melody: 1,49
Rhythm: 0,9
Harmony: 1,06
Others: 1,08

To be continued:
Fireworks, How Deep Is Your Love,...