This repository has been archived by the owner on Mar 28, 2022. It is now read-only.

actual hash rate is about half what the Ellis-Whitman-Porter-11 paper states #82

Open
adrianomitre opened this issue Sep 16, 2015 · 1 comment

Comments

@adrianomitre

In section 2 of the paper ECHOPRINT - AN OPEN MUSIC IDENTIFICATION SERVICE, it is stated that "the overall hash rate is approximately 8 (bands) × 1 (onset per second) × 6 (hashes per onset) ≈ 48 hashes/sec". However, every song I have run echoprint-codegen on has produced a much lower figure: always in the 23-28 hashes per second range, with an average slightly above 25. I am computing the hash rate as H/L, where H is the total number of hashes produced for a song and L is the song length in seconds (which can be estimated as the maximum hash frame divided by the frame rate, 11025/256 ≈ 43.07 frames per second).

  • Is anyone having similar issues? Has anyone ever measured their code rates?
  • Could the lower code rate hurt accuracy?
  • Which parameters should one tweak to increase the code rate? I would assume it is related to onset detection, and thus to the Fingerprint::adaptiveOnsets() method...

My fork of codegen, which prints the codes unhashed in [frame, band, delta1, delta2] JSON format, is public, and the following Ruby code computes the "hash" rate of each file given as an argument:

#!/usr/bin/env ruby

require 'json'

def get_code(filename)
  JSON.parse(JSON.parse(File.read(filename))[0]["code"])
end

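# Mean code rate in codes per second.
#
# TimeQuantum is the number of frames per second (11025 Hz / 256 samples per frame),
# so max_frame / TimeQuantum estimates the song length in seconds.
TimeQuantum = 11_025 / 256.0
def mean_code_rate(code)
  max_frame = code.map {|fr, b, d1, d2| fr }.max
  code.size / (max_frame / TimeQuantum)
end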

ARGV.each do |filename|
  r = mean_code_rate(get_code(filename))
  puts "#{"%.2f" % r} ; #{filename}"
end
@adrianomitre
Author

Section 2 of the paper states that "the overall hash rate is approximately [...] 48 hashes/sec". Section 3 then states that "A 30 second query has about 800 hash keys." (800/30 ≈ 26.7 hashes/sec). Only one of these statements can be correct, and given the results detailed in the previous comment, I would say it is the second.
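
To make the mismatch between the two figures explicit, here is a minimal Ruby sketch of the arithmetic. The numeric values are the ones quoted from the paper; the constant and method names are made up for illustration and do not come from echoprint-codegen.

```ruby
#!/usr/bin/env ruby

# Figures quoted from sections 2 and 3 of the paper.
# The names below are illustrative only, not part of echoprint-codegen.
BANDS = 8
ONSETS_PER_SECOND = 1
HASHES_PER_ONSET = 6
QUERY_SECONDS = 30
QUERY_HASHES = 800

# Hash rate implied by the section 2 product (hashes per second).
def section2_rate
  BANDS * ONSETS_PER_SECOND * HASHES_PER_ONSET
end

# Hash rate implied by the section 3 query size (hashes per second).
def section3_rate
  QUERY_HASHES / QUERY_SECONDS.to_f
end

puts "section 2: #{section2_rate} hashes/sec, i.e. #{section2_rate * QUERY_SECONDS} hashes per #{QUERY_SECONDS} s query"
puts format("section 3: %.2f hashes/sec", section3_rate)
puts format("section 3 / section 2: %.2f", section3_rate / section2_rate)
```

At the section 2 rate, a 30 second query would contain about 1440 hashes rather than 800, so the section 3 figure is a bit more than half of the section 2 figure and falls inside the 23-28 hashes per second range measured above.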
