Version 18 (modified by kiesel, 16 years ago)
In progress
MP3Importer (works for me)
ExpertRecommender
Things to do
- Adaptation for music domain
- finish skiptrax ontology (see Music Ontology) - 1-3 weeks
- handle remixes/live/unplugged (song variants)
- discern song and file - model file ownership
- Proper GUI with the following functions:
- metadata importers/exporters (ID3, LastFM, etc.) - 1-2 weeks
- (adapter for own audio collection)
- simplify entering song/band information - 1 week
- (generate/play playlists, possibly with changing style over time (AutoDJ) - 2-6 weeks)
- Metadata-based functionality
- different sharing rights for contacts (e.g. A only ontologies, B also metadata, C everything) - <1 week
- expert recommender (ask friends about people having knowledge about X/interests in Y) - 1-3 weeks
- trust visualization
- scalable algorithms - scalable = void algo(Graph oldFacts, Graph additionalFacts, Datastructure internalData, Notifications interestingThings)
- computation of feature summaries - 1 week
- feature/item similarity measure - 2 weeks
- structured search (index)
- implicit trust (= peer competence) metric - 2-5 weeks
- explicit representation of trust (see Skippies ontology) - 2-5 weeks
- user interface changes for that
- trust metric changes for that
- by combining computed and explicit trust ratings, identifying shared interests/fields of competence should be possible.
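The "scalable" signature in the list above reads like an incremental algorithm: it receives the facts already processed, the newly added facts, its own internal state, and a sink for notifications, so it can update the state without recomputing everything. A minimal Java sketch of that shape follows. The names Fact, IncrementalAlgorithm and FeatureCountSummary are hypothetical, and facts are reduced to (user, item, feature) triples instead of RDF graphs.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one statement in a fact graph.
final class Fact {
    final String user, item, feature;
    Fact(String user, String item, String feature) {
        this.user = user; this.item = item; this.feature = feature;
    }
}

// Hypothetical counterpart of:
// void algo(Graph oldFacts, Graph additionalFacts, Datastructure internalData, Notifications interestingThings)
interface IncrementalAlgorithm<S> {
    void update(List<Fact> oldFacts, List<Fact> additionalFacts,
                S internalData, List<String> interestingThings);
}

// Example algorithm: per-feature usage counts, updated from the new facts only.
final class FeatureCountSummary implements IncrementalAlgorithm<Map<String, Integer>> {
    @Override
    public void update(List<Fact> oldFacts, List<Fact> additionalFacts,
                       Map<String, Integer> counts, List<String> interestingThings) {
        for (Fact fact : additionalFacts) {
            int updated = counts.merge(fact.feature, 1, Integer::sum);
            if (updated == 1) {
                // First time this feature is seen: report it to the caller.
                interestingThings.add("new feature: " + fact.feature);
            }
        }
    }
}
```

The old facts are passed in but not touched here; an algorithm that needs context (for example a similarity measure) could read them without having to reprocess them.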
Ideas
Skipforward is intended to provide content-based recommendations, but there might be some cool variants that combine it with collaborative filtering/recommendation:
Collaborative filtering per feature
Instead of the boring "people who like this also like that" recommendations, this could be adapted in the following way:
People who like A also like B => People who think A has feature F also think that B has feature F
People who like the same items as you also like this => People who annotated items the same way as you did annotated this one that way
The latter strategy could be used to rate items that the user has not reviewed but their friends have.
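The second variant could be sketched as follows: predict the user's opinion on one item/feature pair from friends' annotations, weighting each friend by how closely their other annotations match the user's. This is only an illustration under assumptions not stated on this page: annotation values are taken to lie in [-1, 1], annotations are keyed by a string such as "item|feature", and the class name PerFeatureCollaborativeFilter is made up.

```java
import java.util.Map;

final class PerFeatureCollaborativeFilter {

    // Agreement in [0, 1] over the item/feature pairs both users annotated; 0 if none overlap.
    static double agreement(Map<String, Double> mine, Map<String, Double> theirs) {
        double sum = 0.0;
        int shared = 0;
        for (Map.Entry<String, Double> entry : mine.entrySet()) {
            Double other = theirs.get(entry.getKey());
            if (other != null) {
                // Values are assumed to be in [-1, 1], so the distance is at most 2.
                sum += 1.0 - Math.abs(entry.getValue() - other) / 2.0;
                shared++;
            }
        }
        return shared == 0 ? 0.0 : sum / shared;
    }

    // Agreement-weighted mean of the friends' values for one item/feature key.
    // Returns NaN if no friend with non-zero agreement annotated that key.
    static double predict(String key, Map<String, Double> mine,
                          Map<String, Map<String, Double>> friends) {
        double weighted = 0.0;
        double total = 0.0;
        for (Map<String, Double> friend : friends.values()) {
            Double value = friend.get(key);
            if (value == null) {
                continue;
            }
            double weight = agreement(mine, friend);
            weighted += weight * value;
            total += weight;
        }
        return total == 0.0 ? Double.NaN : weighted / total;
    }
}
```

A friend who annotated the shared items exactly like the user counts fully, a friend who annotated them in the opposite way is ignored.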
Music Ontology
Artists
Artist collaboration
At first glance, a single class Artist might seem sufficient, but it quickly reveals its limits. A few examples:
- Chicane feat. Tom Jones - Stoned in Love (at discogs.com) (at musicbrainz.org)
- Cerf, Mitiska & Jaren - Light The Skies (at discogs.com)
Let's forget for a moment that these are in fact releases and not songs, and concentrate on the artists. In the first case there are two artists, Chicane and Tom Jones. Or is there one artist called Chicane feat. Tom Jones? MusicBrainz has a special FeaturingArtistStyle, whereas Discogs just has multiple artists.
The second case is a bit more difficult: there is an artist on Discogs called "Cerf, Mitiska & Jaren", but it consists of three members, Jaren, Shawn Mitiska and Matt Cerf.
The Discogs guidelines for artists say that "Artists which commonly collaborate together should be listed as one artist" and "Do NOT attempt to split artists who regularly collaborate. (Regular collaboration consists of 3 or more collaborations (different releases), excluding remix EPs)".
Artist Aliases
A different problem is that the same artist can have different names. This is rather easy with music groups, because it can be modelled accurately as two different music groups having the same members.
With solo artists (persons), however, one has to distinguish name variations (DJ Tiësto, Tiësto, Tiesto, ...) from different names (aliases) under which the artist releases music (Drumfire is an alias of DJ Tiësto).
To make this mess even worse, there are fictitious artists, e.g. Bernd das Brot.
http://musicbrainz.org/doc/ArtistAlias http://musicbrainz.org/doc/FeaturingArtistStyle
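To make the cases above concrete, here is one possible way to model them in Java. It is only a sketch for discussion, not the Skiptrax ontology: all class and field names (Artist, SoloArtist, Group, Credit, aliasOf, ...) are hypothetical. Name variations are stored on the artist itself, aliases are separate artists that point back to an identity, groups list their members, and a credit keeps main and featured artists apart.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A name under which music is credited; spelling variants stay on the same object.
abstract class Artist {
    final String name;
    final Set<String> nameVariations = new LinkedHashSet<>();

    Artist(String name) {
        this.name = name;
    }

    boolean isKnownAs(String candidate) {
        return name.equals(candidate) || nameVariations.contains(candidate);
    }
}

// A person (real or fictitious). An alias is a separate SoloArtist pointing to its identity.
final class SoloArtist extends Artist {
    final SoloArtist aliasOf;   // null for the primary identity
    final boolean fictitious;

    SoloArtist(String name, SoloArtist aliasOf, boolean fictitious) {
        super(name);
        this.aliasOf = aliasOf;
        this.fictitious = fictitious;
    }

    // Follows the alias chain to the primary identity.
    SoloArtist identity() {
        return aliasOf == null ? this : aliasOf.identity();
    }
}

// A group or regular collaboration that is credited under one name.
final class Group extends Artist {
    final List<Artist> members = new ArrayList<>();

    Group(String name) {
        super(name);
    }
}

// Who is credited on a song: "A feat. B" becomes main = [A], featured = [B].
final class Credit {
    final List<Artist> mainArtists = new ArrayList<>();
    final List<Artist> featuredArtists = new ArrayList<>();
}
```

With this shape, "Cerf, Mitiska & Jaren" would be a Group with three members, and Drumfire would be a SoloArtist whose identity() is the DJ Tiësto object.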
Conclusion
Even though discogs.com and musicbrainz.org have put much thought into it, they could not cover all cases, and there are always exceptions. We have to come up with a solution that is specific enough to fit our cause, but simple enough to be used by non-nerds.
more to follow
Meetings
2008-10-15/17
- start thesis tex (2008-10-24)
- contents (2008-10-24)
- terminology
- function signatures (mathematical) (2008-10-24)
- usage of these functions (what combinations for what purposes) (2008-10-24)
- design music domain evaluation (2008-10-24, start)
- test with min() as co-rated weight (2008-10-24)
- commit code (2008-10-24)
- RDF<->FactsDb<->Item/Feature/User matrix<->ExpRecAlg<->API (no inference: 2008-10-24)
- Matrix is used for "inference": Push positive features up, negative features down, end/start with abstract classes
- speed/memory benchmarking logs (2008-10-24)
- talk with Raffael/Darko (done)
- @Malte: Proper diffs (done)
- @Malte: portforwarding/new group on skipforward.net?
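The "inference" note above (push positive features up, negative features down) can be read as follows: a positive opinion on a feature also holds for its more abstract parent classes, and a negative opinion on a feature also holds for its more specific subclasses. The Java sketch below shows that reading on a plain map instead of the Item/Feature/User matrix. The class FeatureHierarchyInference, the single-parent hierarchy and the value range [-1, 1] are assumptions; explicit opinions are never overwritten, and where an inferred positive and an inferred negative meet, the negative wins because it is applied last.

```java
import java.util.HashMap;
import java.util.Map;

final class FeatureHierarchyInference {
    // child feature -> parent feature (assumed to be acyclic, one parent per feature)
    private final Map<String, String> parentOf = new HashMap<>();

    void addSubclass(String child, String parent) {
        parentOf.put(child, parent);
    }

    // Returns the explicit opinions plus the inferred ones; the input map is not modified.
    Map<String, Double> infer(Map<String, Double> opinions) {
        Map<String, Double> result = new HashMap<>(opinions);

        // Positive opinions are pushed up to all ancestors without an explicit opinion.
        for (Map.Entry<String, Double> entry : opinions.entrySet()) {
            if (entry.getValue() <= 0) {
                continue;
            }
            for (String p = parentOf.get(entry.getKey()); p != null; p = parentOf.get(p)) {
                if (!opinions.containsKey(p)) {
                    result.merge(p, entry.getValue(), Double::max);
                }
            }
        }

        // Negative opinions are pushed down to all descendants without an explicit opinion.
        for (String feature : parentOf.keySet()) {
            if (opinions.containsKey(feature)) {
                continue;
            }
            for (String a = parentOf.get(feature); a != null; a = parentOf.get(a)) {
                Double value = opinions.get(a);
                if (value != null && value < 0) {
                    result.put(feature, Math.min(value, result.getOrDefault(feature, value)));
                }
            }
        }
        return result;
    }
}
```

For example, with Trance as a subclass of Electronic, a positive Trance annotation yields a positive Electronic entry, while a negative Electronic annotation yields a negative Trance entry.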
2008-09-05
- @Malte: Syncing between own nodes (done)
- Expert Recommender/Item Opinion distance metric
- have a look at Dempster-Shafer
- do not forget confidence values (use them as weights when aggregating)
- start with .tex
- look at evaluations in related papers
- test algorithm (and simplifications) with testing data
2008-08-29
- (minor) fix bugs of importer
- add possibility in GUI to annotate multiple items at a time (web frontend or Swing GUI)
- possibility to copy all annotations from one item to another (maybe copy/paste)
- write "competence calculator" (expert recommender)
- public float calculateCompetence(String jid, String featureNamespace)
- incremental
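A rough idea of how an incremental competence calculator with the signature above could look is sketched here. The competence measure itself is an assumption made for illustration: a peer counts as competent in a feature namespace to the degree that their opinions agree with the local user's on co-rated items, and each co-rated pair is weighted by the smaller of the two confidence values. The method addCoRating and the value ranges (opinions in [-1, 1], confidences in [0, 1]) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class CompetenceCalculator {
    // Running sums per "jid|featureNamespace": [0] = weighted agreement, [1] = total weight.
    private final Map<String, double[]> sums = new HashMap<>();

    // Incremental step: feed one co-rated opinion pair, no recomputation of earlier pairs.
    public void addCoRating(String jid, String featureNamespace,
                            double myValue, double myConfidence,
                            double peerValue, double peerConfidence) {
        // Assumption: the less confident of the two opinions limits the weight.
        double weight = Math.min(myConfidence, peerConfidence);
        // Opinions are assumed to be in [-1, 1], so the distance is at most 2.
        double agreement = 1.0 - Math.abs(myValue - peerValue) / 2.0;
        double[] s = sums.computeIfAbsent(jid + "|" + featureNamespace, k -> new double[2]);
        s[0] += weight * agreement;
        s[1] += weight;
    }

    // Competence in [0, 1]; 0 if there is no (weighted) overlap with this peer yet.
    public float calculateCompetence(String jid, String featureNamespace) {
        double[] s = sums.get(jid + "|" + featureNamespace);
        return (s == null || s[1] == 0.0) ? 0f : (float) (s[0] / s[1]);
    }
}
```

Because only two running sums are kept per peer and namespace, new facts can be folded in as they arrive, which is what the "incremental" item asks for.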
2008-08-22
- Keep Trac pages and tickets updated
- Evaluation...
- Song annotation: Let users annotate one set of fixed songs (to get good overlap) and other songs of their choice (to test recommendations)
- Same for feature sets to annotate: Create one certificate that must be completed and let users complete other features freely
- Evaluating recall will be a problem and probably less relevant than evaluating precision (see here). Still, do not omit recall completely; best collect user comments ("I'd have expected to get X here").
2008-08-18
- Create paper prototype for importer/annotator GUI
- Write/complete wiki page for importer
2008-07-30
- Assemble MP3 file test collection
- Manually create baseline
- Create unit tests for importer
2008-07-11
- Put global file id data in skip:// namespace (in normal store) - including md5sum, size, bitrate, file basename (normal OS info; perhaps also specialized "core md5sum/core mp3 size")
- ItemName is basename/Skipmedia:File
- Global file id data is useful for fast lookup of file (without "importing" them) - lurker use case
- Put local file id data in file:// namespace (again, in normal store) - won't be shared. Include localpath, modifieddate, etc. Use own properties/ontology for this.
- Importer should keep "open questions" list (and persist this in RDF store, again as local non-shared info - proper import might take several weeks)
- Idea: Annotate certificates and recommendations with "first issued" date
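The split between shareable (global) and non-shared (local) file id data could be collected like this. It is a minimal sketch using only the Java standard library; the class FileIdData and its field names are hypothetical, and bitrate is left out because it would require parsing the MP3 itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileIdData {
    // Global id data: independent of where the file is stored, so it can be shared.
    final String md5sum;
    final long size;
    final String basename;
    // Local id data: only meaningful on this machine, not meant to be shared.
    final String localPath;
    final long modifiedMillis;

    private FileIdData(String md5sum, long size, String basename,
                       String localPath, long modifiedMillis) {
        this.md5sum = md5sum;
        this.size = size;
        this.basename = basename;
        this.localPath = localPath;
        this.modifiedMillis = modifiedMillis;
    }

    static FileIdData of(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        // Stream the file so that large audio files need not fit into memory.
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md5.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return new FileIdData(hex.toString(), Files.size(file),
                file.getFileName().toString(),
                file.toAbsolutePath().toString(),
                Files.getLastModifiedTime(file).toMillis());
    }
}
```

A lookup for the lurker use case would then only need md5sum, size and basename, without importing the file.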
2008-07-04
- Skipmedia: Thing->MediaInstance->File->(MP3,OGG,FLAC) - bitrate,md5sum attributes (not Features!) of File.
- Some XMPP/bot discussions. Music importer should import into user's namespace.
2008-06-27
- Skiptrax (Album, Record, etc. part) looks fine.
- TODO
- Importer: Have a look at GNAT and MusicBrainz services.
- Create Skipmedia ontology (File, md5sum, etc.).
- Implement Jabber chat bot as preliminary frontend?
- Think about GUI.