A Highly Robust Audio Fingerprinting System
Eindhoven, The Netherlands
ABSTRACT
Imagine the following situation. You’re in your car, listening to the radio and suddenly you hear a song that catches your attention. It’s the best new song you have heard for a long time, but you missed the announcement and don’t recognize the artist. Still, you would like to know more about this music. What should you do? You could call the radio station, but that’s too cumbersome. Wouldn’t it be nice if you could push a few buttons on your mobile phone and a few seconds later the phone would respond with the name of the artist and the title of the music you’re listening to? Perhaps even sending an email to your default email address with some supplemental information. In this paper we present an audio fingerprinting system, which makes the above scenario possible. By using the fingerprint of an unknown audio clip as a query on a fingerprint database, which contains the fingerprints of a large library of songs, the audio clip can be identified. At the core of the presented system are a highly robust fingerprint extraction method and a very efficient fingerprint search strategy, which enables searching a large fingerprint database with only limited computing resources.
1. INTRODUCTION
Fingerprint systems are over one hundred years old. In 1893 Sir Francis Galton was the first to “prove” that no two fingerprints of human beings were alike. Approximately 10 years later Scotland Yard accepted a system designed by Sir Edward Henry for identifying fingerprints of people. This system relies on the pattern of dermal ridges on the fingertips and still forms the basis of all “human” fingerprinting techniques of today. This type of forensic “human” fingerprinting system has however existed for longer than a century, as 2000 years ago Chinese emperors were already using thumbprints to sign important documents. The implication is that already those emperors (or at least their administrative servants) realized that every fingerprint was unique. Conceptually a fingerprint can be seen as a “human” summary or signature that is unique for every human being. It is important to note that a human fingerprint differs from a textual summary in that it does not allow the reconstruction of other aspects of the original. For example, a human fingerprint does not convey any information about the color of the person’s hair or eyes.
Recent years have seen a growing scientific and industrial interest in computing fingerprints of multimedia objects [1][2][3][4][5][6]. The growing industrial interest is shown among others by a large number of (startup) companies [7][8][9][10][11][12][13] and the recent request for information on audio fingerprinting technologies by the International Federation of the Phonographic Industry (IFPI) and the Recording Industry Association of America (RIAA) [14].
The prime objective of multimedia fingerprinting is an efficient mechanism to establish the perceptual equality of two multimedia objects: not by comparing the (typically large) objects themselves, but by comparing the associated fingerprints (small by design). In most systems using fingerprinting technology, the fingerprints of a large number of multimedia objects, along with their associated meta-data (e.g. name of artist, title and album) are stored in a database. The fingerprints serve as an index to the meta-data. The meta-data of unidentified multimedia content are then retrieved by computing a fingerprint and using this as a query in the fingerprint/meta-data database. The advantage of using fingerprints instead of the multimedia content itself is three-fold:
1. Reduced memory/storage requirements as fingerprints are relatively small;
2. Efficient comparison as perceptual irrelevancies have already been removed from fingerprints;
3. Efficient searching as the dataset to be searched is smaller.
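The idea of using fingerprints as an index to the meta-data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the 32-bit integer fingerprints, the names `DATABASE`, `hamming_distance` and `lookup`, the Hamming-distance comparison, and the threshold `max_distance` are hypothetical and are not taken from the system presented in this paper. The linear scan is also only a placeholder for an efficient search strategy.

```python
from typing import Optional

# Hypothetical toy database: each key is a 32-bit fingerprint (assumed format),
# each value is the associated meta-data record.
DATABASE = {
    0b10110010111100001010101011001100: {"artist": "Artist A", "title": "Song A"},
    0b01001101000011110101010100110011: {"artist": "Artist B", "title": "Song B"},
}


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def lookup(query: int, max_distance: int = 4) -> Optional[dict]:
    """Return the meta-data of the closest stored fingerprint.

    A match is accepted only if it differs from the query in at most
    `max_distance` bits (an assumed threshold); otherwise None is returned.
    """
    best = min(DATABASE, key=lambda fp: hamming_distance(fp, query))
    if hamming_distance(best, query) <= max_distance:
        return DATABASE[best]
    return None


if __name__ == "__main__":
    # A query that differs from the first stored fingerprint in two bits,
    # standing in for a slightly degraded copy of the same audio.
    degraded = 0b10110010111100001010101011001100 ^ 0b101
    print(lookup(degraded))
```

Only the small fingerprints are compared here, never the audio itself, which is the point of the three advantages listed above.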
As can be concluded from above, a fingerprint system generally consists of two components: a method to extr