Napster to Use "Fingerprinting" Technology to help it filter out copyrighted songs.
April 21, 2001 7:56 AM
Napster to Use "Fingerprinting" Technology to help it filter out copyrighted songs. "There are many technological challenges." That's putting it lightly. How badly would this slow down their system if they could even get it to work?
Napster is going to download 3 gigs from my 56k modem to see whether the songs are copyrighted or not? I don't think this fingerprinting technology is really possible on a large-scale service like Napster, considering there's around 3000 gigs of music online at any point in time. Next.
posted by Mark at 9:51 AM on April 21, 2001
jpoulos - I don't think they will make their service appreciably less usable. Perhaps in the few days before some enterprising programmer looking to get some cred writes an extension to this or that Napster client which effectively circumvents their scheme. This isn't to say that they aren't being made to slowly kill themselves, but I don't believe this is their death knell by a long shot. The record industry may be very sly after all for dragging it out like this, rather than pushing for defeat in one fell swoop. If Napster were suddenly over, people would more likely migrate en masse to the Next Best Thing. But by slowly driving people away, they foster fragmentation of the peer-to-peer/file sharing 'scene'.
Mark - They won't need to download your entire collection; they'll simply do the fingerprinting locally and send up the results, which should be quite compact. Those results will then be compared to their (c) db; matches will be unavailable for sharing.
I'm not saying they'll do it right. The Napster client is beta quality and it shows. Layering this filtering on top gracefully could be too much to ask.
posted by brantstrand at 11:28 AM on April 21, 2001
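For a sense of scale, the difference between uploading a whole collection and uploading only per-song fingerprints can be estimated with rough arithmetic. This is a back-of-the-envelope sketch, not a measurement: the 3 GB collection comes from Mark's comment, while the ~4 MB per song and ~1 kB per fingerprint figures are the ones quoted elsewhere in this thread, and the modem is assumed to run at its nominal 56 kbit/s in the upload direction, which is optimistic.

```python
# Rough comparison: upload the whole collection vs. one small fingerprint per song.
# All figures below are assumptions taken from numbers quoted in the thread.

MODEM_BITS_PER_SEC = 56_000      # nominal 56k line rate (best case)
COLLECTION_BYTES = 3 * 10**9     # "3 gigs"
AVG_SONG_BYTES = 4 * 10**6       # ~4 MB per MP3
FINGERPRINT_BYTES = 1_000        # ~1 kB per fingerprint


def transfer_seconds(num_bytes, bits_per_sec=MODEM_BITS_PER_SEC):
    """Time to push num_bytes through the link, ignoring protocol overhead."""
    return num_bytes * 8 / bits_per_sec


song_count = COLLECTION_BYTES // AVG_SONG_BYTES
full_upload_days = transfer_seconds(COLLECTION_BYTES) / 86_400
fingerprint_upload_minutes = transfer_seconds(song_count * FINGERPRINT_BYTES) / 60

print(f"songs in collection: {song_count}")
print(f"whole collection:    {full_upload_days:.1f} days")
print(f"fingerprints only:   {fingerprint_upload_minutes:.1f} minutes")
```

Under these assumptions the full upload takes days while the fingerprints take a couple of minutes, which is why doing the analysis on the client matters.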
I worked at a .com that did exactly this (audio fingerprinting), and had dealings with Napster (our people had root on their servers, with permission). I can speak to how this could be implemented.
Napster is all about keeping as much work as possible happening on the client side. Count on the fingerprint-generating tech to appear in a future version of the client. Future clients may require that the fingerprint (tiny, 1 kB or less) be sent to Napster's servers along with each filename, either when a peer sends Napster a list of available songs or when a download of a publicized song is requested.
The transmitted fingerprint would be checked against a database of fingerprints from songs that record companies have asserted copyright over. If the user-submitted fingerprint does not match any in the database, then Napster's server connects the two peers so they can make the transfer.
Song identification based on tiny audio fingerprints does work, though there will be problems encountered in massive implementations. I can imagine reasonably solid ways of ensuring that the fingerprint submitted actually comes from the file offered for transfer without requiring proprietary alterations to the mp3 format, but I don't feel like helping anyone do so.
posted by NortonDC at 2:52 PM on April 21, 2001
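The server-side check NortonDC describes can be sketched in a few lines. This is only an illustration of the flow in the comment above, not Napster's design: the class and method names are hypothetical, and exact-match lookup in a set is a simplification, since a real acoustic fingerprint would presumably need a "close enough" comparison rather than byte equality.

```python
class FingerprintFilter:
    """Hypothetical index-server filter: block transfers whose fingerprint is claimed."""

    def __init__(self):
        # Fingerprints of songs that rights holders have asserted copyright over.
        self.blocked = set()

    def register_claim(self, fingerprint: bytes) -> None:
        """Add a rights holder's fingerprint to the block list."""
        self.blocked.add(fingerprint)

    def may_connect(self, fingerprint: bytes) -> bool:
        """Return True if the server should connect the two peers."""
        return fingerprint not in self.blocked


flt = FingerprintFilter()
flt.register_claim(b"\x01\x02\x03")        # stand-in for a ~1 kB fingerprint

print(flt.may_connect(b"\x01\x02\x03"))    # claimed song: transfer refused
print(flt.may_connect(b"\x0a\x0b\x0c"))    # unknown song: peers get connected
```

Note that this trusts whatever fingerprint the client submits, which is exactly the weakness discussed further down the thread.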
This latest feature is thanks in large part to Gigabeat, which Napster acquired recently. Their technology is pretty amazing. It was developed at Stanford (if I remember correctly), and really had no direct application to music in particular. After it was ready, they looked for something to apply it to which would catch on with the market. Music was a natural choice, but don't be surprised if Napster begins using this technology for things OTHER than music...
(Note: I was under an NDA with Gigabeat before the acquisition, and I'm fairly sure that it still applies, hence I can't say anything that's not public knowledge. For that reason, the information below is accompanied by reference links to material that is publicly available on the web. To be clear, I know nothing of Napster's plans for actual integration of the Gigabeat technology. I only know what Gigabeat can do, and that knowledge allows me to make some educated guesses as to the eventual implementation. I repeat: I know nothing about Napster's integration plans for Gigabeat technology...)
As for fingerprinting, it can be done in two different ways. The more traditional way uses a hash function (MD5 or SHA, for instance; for more info about hashing algorithms, look here (novice) and here (advanced)). The output of the hash function is a much smaller string (128 bits for MD5, 160 bits for SHA-1) that identifies the file uniquely with an extremely high degree of probability. If two hashes match, the files are almost certainly identical.
The second fingerprinting method is soundwave analysis. Acoustic analysis will come to play an increasingly large part in the RIAA's fight against digital music piracy, because hash functions don't work for digital music (among other things): even the tiniest, most acoustically unnoticeable change in the file creates a completely different hash. In fact, if you were to rip the exact same song twice from the same copy of a CD, using the same software, bitrate and file format, the resulting files would still be slightly different due to random anomalies. The hashes of two files that differ by even one bit (for reference, a 4 MB MP3 file, if I'm not mistaken, has around 32 million bits) are completely different, and hence cannot be compared directly for similarity.
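The one-bit point is easy to check with Python's standard `hashlib`. The sketch below uses a 16 KiB block of made-up bytes as a stand-in for a file (an assumption for illustration, not real MP3 data), flips a single bit, and counts how many of the 128 MD5 digest bits change.

```python
import hashlib


def md5_as_int(data: bytes) -> int:
    """MD5 digest of data, as a 128-bit integer for easy bit comparison."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


original = bytes(range(256)) * 64      # 16 KiB of stand-in "file" bytes
altered = bytearray(original)
altered[1000] ^= 0x01                  # flip exactly one bit of the input
altered = bytes(altered)

# XOR the two digests and count the 1 bits to see how many digest bits differ.
differing = bin(md5_as_int(original) ^ md5_as_int(altered)).count("1")

print(hashlib.md5(original).hexdigest())
print(hashlib.md5(altered).hexdigest())
print(f"{differing} of 128 digest bits differ")
```

The two digests share no useful structure, so a hash can only answer "identical or not", never "how similar".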
That's where sound analysis comes in. Sound analysis and recognition tools have been around for a long time and are getting play everywhere, from military and medical applications to my cell phone ("Call Mom") and even a fairly new web category called voice portals. Now the record labels are hot on it. Through digital signal processing (DSP) and waveform characteristic analysis, any sound can be represented mathematically. This mathematical representation can be filtered to strip away everything but the signal. Using this method, one could identify two songs as being the same even if one is live and one is a studio version!
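To make the contrast with hashing concrete, here is a deliberately tiny toy, not Gigabeat's method or any real product's: it fingerprints a signal by recording whether the energy rises or falls from one frame to the next. Real systems would use far richer spectral features. The frame size, the test tone, and the noise level are all arbitrary assumptions, and the example only shows tolerance to small sample-level noise, not the live-versus-studio case.

```python
import hashlib
import math
import random
import struct

FRAME = 256  # samples per analysis frame (arbitrary choice for this toy)


def synth(levels, seed=None, noise=0.0):
    """Build a test tone whose loudness changes frame by frame, plus optional noise."""
    rng = random.Random(seed)
    samples = []
    for level in levels:
        for n in range(FRAME):
            s = level * math.sin(2 * math.pi * 8 * n / FRAME)
            samples.append(s + rng.uniform(-noise, noise))
    return samples


def fingerprint(samples):
    """One bit per adjacent frame pair: 1 if frame energy rises, else 0."""
    energies = [
        sum(x * x for x in samples[i:i + FRAME])
        for i in range(0, len(samples) - FRAME + 1, FRAME)
    ]
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def hamming(a, b):
    """Number of positions where two equal-length bit lists disagree."""
    return sum(x != y for x, y in zip(a, b))


def md5_of(samples):
    """MD5 over the raw sample values, standing in for a file hash."""
    return hashlib.md5(struct.pack(f"{len(samples)}d", *samples)).hexdigest()


levels = [0.2, 0.9, 0.4, 1.0, 0.3, 0.8, 0.1, 0.7, 0.5]
clean = synth(levels)
noisy = synth(levels, seed=1, noise=0.01)  # same "song", slightly different samples

print("hashes equal:", md5_of(clean) == md5_of(noisy))
print("fingerprint bits differing:", hamming(fingerprint(clean), fingerprint(noisy)))
```

The byte-level hashes disagree while the coarse energy-trend bits survive the noise, which is the property an acoustic fingerprint is after.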
There is a host of opportunistic companies that have thrown their hats into the digital-music-piracy-war ring to offer the record labels a solution, including TunePrint, Cantametrix and eTantrum.
I bring all of this up because Gigabeat also does DSP and sound analysis:
From Gigabeat's former site:
To streamline the process of finding your song, Gigabeat compiles the results categorically by song and quality of file. For example, if you search under "Sarah McLachlan," Gigabeat lists her songs and directs you to download the highest quality file available for each song, based on harmonic quality, song length and connection speed.
From a Newsbytes article on Gigabeat:
Individual music files are also analyzed to compare songs and artists to one another based on various aspects of the music itself, such as genre, composition and style. This results in a list of related music for the listener.
I suspect that this was a big reason Napster bought Gigabeat. The record companies have them by the balls for copyrighted material and users complain of bad quality or truncated songs. Ta dum. Both problems solved.
posted by fooljay at 5:12 PM on April 21, 2001
Good post, NortonDC.
I can imagine reasonably solid ways of ensuring that the fingerprint submitted actually comes from the file offered for transfer without requiring proprietary alterations to the mp3 format, but I don't feel like helping anyone do so.
This could be a big problem. I can't imagine that they would do the fingerprinting each and every time a file is sent, since it would put a major strain on the clients. It's not exactly CPU-light... So if they don't do it every time, what's to stop me from doing a "cup o' urine bait-n-switch"?
I also am not particularly interested in helping. ;-)
posted by fooljay at 5:18 PM on April 21, 2001
fooljay, it requires some thought, but the solution is apparent to me. Not that I'm motivated to "solve" their problem for them.
Not all of those companies you listed position their tech as anti-piracy measures. At least one designed their infrastructure to prevent just such a use.
Also, if Relatable's tech works the way I have reason to believe it works, it is vulnerable to attacks that losslessly shuffle and unshuffle frames of mp3 data - lossless audio pig-latinizing of mp3 files.
posted by NortonDC at 6:00 PM on April 21, 2001
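The "lossless shuffle and unshuffle" idea in the comment above can be illustrated without saying anything about how Relatable's technology works, which this sketch does not claim to know. It reorders fixed-size chunks of a byte string with a keyed permutation and then restores them exactly. The fixed chunk size is a stand-in assumption: real MP3 frames vary in length and would have to be located by parsing frame headers.

```python
import random

CHUNK = 417  # stand-in chunk size; real MP3 frames must be found by parsing headers


def _permutation(n, key):
    """Deterministic permutation of range(n) derived from an integer key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def _split(data):
    """Split into full chunks plus a leftover tail that is never moved."""
    full = len(data) // CHUNK
    chunks = [data[i * CHUNK:(i + 1) * CHUNK] for i in range(full)]
    return chunks, data[full * CHUNK:]


def shuffle_chunks(data, key):
    """Reorder the full chunks according to the keyed permutation."""
    chunks, tail = _split(data)
    order = _permutation(len(chunks), key)
    return b"".join(chunks[i] for i in order) + tail


def unshuffle_chunks(data, key):
    """Invert shuffle_chunks: put every chunk back at its original index."""
    chunks, tail = _split(data)
    order = _permutation(len(chunks), key)
    restored = [b""] * len(chunks)
    for pos, src in enumerate(order):
        restored[src] = chunks[pos]
    return b"".join(restored) + tail


data = bytes(i % 251 for i in range(5000))   # made-up bytes, not real audio
scrambled = shuffle_chunks(data, key=42)

print("same length:", len(scrambled) == len(data))
print("round trip exact:", unshuffle_chunks(scrambled, key=42) == data)
```

Because only the order of the bytes changes, nothing is lost, but anything that analyzes the scrambled stream in order sees different content until it is unshuffled.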
Either way, it really doesn't matter. People will always find a way to get around technological roadblocks, unless the bar for the desired method of access is set low enough so as to make circumvention undesirable or unnecessary.
The record companies will figure this out sooner or later. It doesn't have to mean free either...
posted by fooljay at 6:44 PM on April 21, 2001
People will always find a way to get around technological roadblocks, unless the bar for the desired method of access is set low enough so as to make circumvention undesirable or unnecessary.
When Napster creates its catalog-o-songs, it will scan each one in its entirety to verify size (I know this only because I tried to share 74 GB over the network, which took forever). I would assume that fingerprinting would happen at that level. Otherwise it could happen during the transmission. But you're right, someone will hack it so that it will send Lawrence Welk fingerprints.
The key point will be how easy Napster makes this to crack. If it's simple, then they will keep the userbase of "illegal" mp3 trading while essentially getting out of the liability of hosting it. Get it? It's the users doing the criminal acts from this point, since the software has been altered to circumvent the filter.
posted by samsara at 9:46 AM on April 22, 2001
posted by jpoulos at 8:13 AM on April 21, 2001