RE: Searching: msg#00009 audio.freedb.devel
> >> I'd say link automatically if the match is "good enough" and
> >> the track offsets are at least "fuzzy matching".
> >
> > That's fine as long as there is a way for the user to get the linked
> > albums that weren't returned. For instance, if the album returned by
> > the search contains a field that lists the albums that link to it.
>
> The idea is to have the data fields only saved once! I.e.
> such a linked entry would have the track offsets and then
> just a link field, that tells to which other entry it's
> linked to - but none of the data fields. This way we can get
> exact matches for all the variations of the entry, but will
> always have one (hopefully) correct set of data fields in the
> master entry. Therefore there are not really entries that
> weren't returned any longer. The idea is good, I think - only
> question is, if it is possible to put this into code and who
> will do it. This probably requires rewrite of major parts of
> the server software.
>
> - Joerg

Yes, I agree with everything you said. When I said I wanted to be able to get the other, non-master, entries I meant the offsets. Any other data that might still be contained in these non-master entries would be redundant.

I can't personally offer to write the code - I've never used C, Linux, or any database - but perhaps I could find someone who would do a code swap with me? Otherwise I can write (C++, Win32) client code, and I could prototype an algorithm that determines which albums should be linked and what data the master should have. I could also do manual 'eye-ball' work to verify that the master album's data is correct for all the albums I own.

But I've never let the lack of a skill stop me from having and voicing an opinion, so here are my thoughts on the implementation of this. (I refer to the link as a field here, based on the existing flat-file database.)

An initial, partial implementation wouldn't be too hard, but it wouldn't save any space - it would take a little more.
It would involve:

(1) Adding the link field to the database. This means changing the server code to parse and return this new field. (If the field doesn't exist then the album is its own master.) I think Yuri was talking about adding some new fields, so this new field could be added at the same time.

(2) Writing code to create the link field for all existing albums, based on similar songs and offsets. It would also pick the best data for the master album (e.g. the longest extended data field). The first part of this code would also be used for maintenance: it would run as part of the index update process, and would determine links for all new albums.

(3) In the search, only adding master albums to the index.

A fuller implementation would involve removing everything but the offsets from all non-master albums, and altering large amounts of the server code so that it automatically uses the links whenever it encounters a non-master album. Logically, the master album becomes a whole new datatype, distinct from the album, which just contains a discID and offsets. The master albums could even get their own sequential 32-bit ID number.

These are not conflicting implementations, but stages of the same implementation.

Tom.
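
As a rough illustration of the three stages above, here is a minimal Python sketch. It is not the server's code: the `Album` class, the function names, and the `OFFSET_TOLERANCE` value are all assumptions made up for this example, and "similar songs" is simplified to a case-insensitive match on the album title.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical tolerance (in frames) for "fuzzy" offset matching.
# The thread does not specify a value.
OFFSET_TOLERANCE = 150


@dataclass
class Album:
    disc_id: str
    offsets: List[int]
    title: str = ""
    extd: str = ""               # extended data field
    link: Optional[str] = None   # disc_id of the master; None = own master


def offsets_match(a: List[int], b: List[int], tol: int = OFFSET_TOLERANCE) -> bool:
    """Same track count and every offset within `tol` frames."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def assign_links(albums: List[Album]) -> None:
    """Stage 2: group similar albums and link each group to one master."""
    groups: List[List[Album]] = []
    for album in albums:
        for group in groups:
            head = group[0]
            # Simplification: titles must match exactly, ignoring case.
            if (album.title.lower() == head.title.lower()
                    and offsets_match(album.offsets, head.offsets)):
                group.append(album)
                break
        else:
            groups.append([album])
    for group in groups:
        # "Best data" is taken here to mean the longest extended data field.
        master = max(group, key=lambda a: len(a.extd))
        for album in group:
            album.link = None if album is master else master.disc_id


def resolve(db: Dict[str, Album], disc_id: str) -> Album:
    """Stage 1: follow the link field; a missing link means own master."""
    album = db[disc_id]
    return album if album.link is None else db[album.link]


def build_search_index(albums: List[Album]) -> Dict[str, str]:
    """Stage 3: only master albums are added to the search index."""
    return {a.title.lower(): a.disc_id for a in albums if a.link is None}
```

A lookup by any variant's discID would go through `resolve`, so the caller always gets the master's data fields while each variant keeps its own offsets. In the fuller implementation the non-master records would drop everything except `disc_id`, `offsets`, and `link`.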