[[PageOutline]]

= In progress =
[wiki:MP3Importer] (works for me) [[BR]]
[wiki:ExpertRecommender]

= Things to do =
 * '''Adaptation for music domain'''
   * finish skiptrax ontology (see [http://musicontology.com/ Music Ontology]) - 1-3 weeks
     * handle remixes/live/unplugged (song variants)
     * discern song and file - model file ownership
   * Proper GUI with the following functions:
     * metadata importers/exporters (ID3, LastFM, etc.) - 1-2 weeks
     * (adapter for own audio collection)
     * simplify entering song/band information - 1 week
     * (generate/play playlists, possibly with changing style over time (AutoDJ) - 2-6 weeks)
 * '''Metadata-based functionality'''
   * different sharing rights for contacts (e.g. A only ontologies, B also metadata, C everything) - <1 week
   * expert recommender (ask friends about people having knowledge about X/interests in Y) - 1-3 weeks
   * trust visualization
   * scalable algorithms - ''scalable'' = void algo(Graph oldFacts, Graph additionalFacts, Datastructure internalData, Notifications interestingThings)
     * computation of feature summaries - 1 week
     * feature/item similarity measure - 2 weeks
     * structured search (index)
     * implicit trust (= peer competence) metric - 2-5 weeks
   * explicit representation of trust (see Skippies ontology) - 2-5 weeks
     * user interface changes for that
     * trust metric changes for that
   * by combining computed and explicit trust ratings, identifying shared interests/fields of competence should be possible.
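
The ''scalable'' signature in the list above can be read as an incremental update: only the additional facts are visited, and the result is folded into state that is kept between calls. The following is a minimal Java sketch for the feature-summary case, assuming a summary is simply a per-feature mean. `IncrementalFeatureMean`, `Fact` and the `threshold` parameter are hypothetical names chosen for illustration, not existing Skipforward classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IncrementalFeatureMean {
    /** Hypothetical fact type: one rating of one feature. */
    static final class Fact {
        final String feature;
        final double value;
        Fact(String feature, double value) {
            this.feature = feature;
            this.value = value;
        }
    }

    // Plays the role of "internalData": per-feature {count, sum}, kept between calls.
    private final Map<String, double[]> state = new HashMap<>();

    /**
     * Folds only the additional facts into the stored state.
     * Returns the features that are new or whose mean moved by more than
     * threshold (the "interesting things" of the signature).
     */
    List<String> update(List<Fact> additionalFacts, double threshold) {
        Map<String, Double> meanBefore = new HashMap<>();
        for (Fact f : additionalFacts) {
            double[] s = state.computeIfAbsent(f.feature, k -> new double[2]);
            // Remember the mean as it was before this batch (NaN if the feature is new).
            meanBefore.putIfAbsent(f.feature, s[0] == 0 ? Double.NaN : s[1] / s[0]);
            s[0] += 1;
            s[1] += f.value;
        }
        List<String> interesting = new ArrayList<>();
        for (Map.Entry<String, Double> e : meanBefore.entrySet()) {
            double old = e.getValue();
            if (Double.isNaN(old) || Math.abs(mean(e.getKey()) - old) > threshold) {
                interesting.add(e.getKey());
            }
        }
        return interesting;
    }

    /** Current mean of a feature, or NaN if no fact has been seen for it. */
    double mean(String feature) {
        double[] s = state.get(feature);
        return (s == null || s[0] == 0) ? Double.NaN : s[1] / s[0];
    }
}
```

The cost of one call depends on the size of the new batch only, not on the number of facts already processed, which is the property the to-do item asks for.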
[RelatedWork]

= Ideas =
Skipforward is intended to provide content-based recommendations, but there might be some cool variants of combining it with collaborative filtering/recommendation:

=== Collaborative filtering per feature ===
Instead of the boring "people who like this also like that" recommendations, this could be adapted in the following way:

People who like A also like B => People who think A has feature F also think that B has feature F[[BR]]
People who like the same items as you also like this => People who annotated items the same way as you annotated this that way

The last strategy could be used to rate items that the user has not reviewed but their friends have.

= Music Ontology =
== Artists ==
=== Artist collaboration ===
At first glance, one might say that a single class Artist is sufficient, but it quickly reveals its limits. A few examples:
 1. '''Chicane feat. Tom Jones - Stoned in Love''' ([http://www.discogs.com/release/650389 at discogs.com]) ([http://musicbrainz.org/release/f23e55ac-b7d5-4993-a310-17648b6c318e.html at musicbrainz.org])
 2. '''Cerf, Mitiska & Jaren - Light The Skies''' ([http://www.discogs.com/release/944142 at discogs.com])

Let's forget for a moment that these are in fact ''release''s and not ''song''s and concentrate on the ''artists'':

In the first case there are two artists, '''Chicane''' and '''Tom Jones'''. Or is there one artist called '''Chicane feat. Tom Jones'''? Musicbrainz has a special [http://musicbrainz.org/doc/FeaturingArtistStyle FeaturingArtistStyle], whereas discogs just has multiple artists.[[BR]]
The second case is a bit more difficult, as there is an artist on discogs called [http://www.discogs.com/artist/Cerf%2C+Mitiska+%26+Jaren '''Cerf, Mitiska & Jaren'''], but it consists of three members, Jaren, Shawn Mitiska and Matt Cerf.
Discogs has the following [http://www.discogs.com/help/submission-guidelines-release-artist.html guidelines for artists], which say that "''Artists which commonly collaborate together should be listed as one artist''" and "''Do NOT attempt to split artists who regularly collaborate. (Regular collaboration consists of 3 or more collaborations (different releases), excluding remix EPs)''".

=== Artist Aliases ===
A different problem is that of the [http://musicbrainz.org/doc/SameArtistWithDifferentNames same artist having different names]. The problem is rather easy with music groups, because it can be modelled accurately by two different music groups having the same members.[[BR]]
But with solo artists (persons) one has to distinguish name variations (DJ Tiësto, Tiësto, Tiesto, ...) from different names (aliases) under which the artist releases music ([http://www.discogs.com/artist/Drumfire Drumfire] is an alias of [http://www.discogs.com/artist/DJ+Tiësto DJ Tiësto]).

To make this mess even worse, there are [http://musicbrainz.org/doc/FictitiousArtist fictitious artists], e.g. [http://musicbrainz.org/artist/3c6754ff-0e14-49f9-8e84-c323f3f3a8b3.html Bernd das Brot].

 * http://musicbrainz.org/doc/ArtistAlias
 * http://musicbrainz.org/doc/FeaturingArtistStyle

=== Conclusion ===
Even though discogs.com and musicbrainz.org have put much thought into it, they couldn't cover all cases and there are always exceptions. We have to come up with a solution that is specific enough to fit our purpose, but simple enough to be used by non-nerds.
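
To make the cases above concrete, here is one possible shape for such a model, written as a Java sketch rather than as an ontology. It is only an assumption for discussion, not a decision: it separates the people behind the music (`Person`) from the names credited on a release (`Act`), treats two acts with the same members as aliases (the music-group idea from above, applied to solo artists as one-member acts), and models "feat." as a role on a credit. All class and method names are hypothetical, and the example names are written in ASCII to keep the source file encoding-neutral.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class ArtistModel {
    /** A real or fictitious individual; an identity, not a credited name. */
    static final class Person {
        final String id;
        Person(String id) { this.id = id; }
    }

    /** A name credited on releases: a solo alias or a group. */
    static final class Act {
        final String name;
        // Spelling variations of the same name, e.g. with or without "DJ".
        final Set<String> nameVariations = new LinkedHashSet<>();
        final Set<Person> members = new LinkedHashSet<>();

        Act(String name, Person... members) {
            this.name = name;
            this.members.addAll(Arrays.asList(members));
        }

        /** Two distinct acts count as aliases when the same people stand behind both. */
        boolean isAliasOf(Act other) {
            return this != other && !members.isEmpty() && members.equals(other.members);
        }

        /** Case-insensitive match against the main name and all variations. */
        boolean matchesName(String query) {
            if (name.equalsIgnoreCase(query)) return true;
            for (String v : nameVariations) {
                if (v.equalsIgnoreCase(query)) return true;
            }
            return false;
        }
    }

    enum Role { MAIN, FEATURING }

    /** One artist entry on a release, e.g. "Chicane" (MAIN) or "Tom Jones" (FEATURING). */
    static final class Credit {
        final Act act;
        final Role role;
        Credit(Act act, Role role) { this.act = act; this.role = role; }
    }
}
```

With this shape, '''Cerf, Mitiska & Jaren''' is a single `Act` with three members, '''Chicane feat. Tom Jones''' is two credits with different roles, and Drumfire becomes a second `Act` sharing its only member with DJ Tiësto. Fictitious artists would need a flag or subclass on `Person`, which the sketch leaves out.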
more to follow

= Meetings =
== 2008-09-05 ==
 * @Malte: Syncing between own nodes
 * Expert Recommender/Item Opinion distance metric
   * have a look at [http://en.wikipedia.org/wiki/Dempster-Shafer_theory Dempster-Shafer]
   * do not forget confidence values (use them as weights when aggregating)
 * start with .tex
   * look at evaluations in related papers
 * test algorithm (and simplifications) with testing data

== 2008-08-29 ==
 * (minor) fix bugs of importer
 * add possibility in GUI to annotate multiple items at a time (web frontend or Swing GUI)
   * possibility to copy all annotations from one item to another (maybe copy/paste)
 * write "competence calculator" (expert recommender)
   * public float calculateCompetence(String jid, String featureNamespace)
   * incremental

== 2008-08-22 ==
 * Keep Trac pages and tickets updated
 * Evaluation...
   * Song annotation: Let users annotate one set of fixed songs (to get good overlap) and other songs of their choice (to test recommendations)
   * Same for feature sets to annotate: Create one certificate that must be completed and let users complete other features freely
   * Evaluating recall will be a problem and probably less relevant than evaluating precision (see [http://en.wikipedia.org/wiki/Precision_and_recall here]). Still, do not omit recall completely; best collect user comments ("I'd have expected to get X here").
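
For the evaluation discussed on 2008-08-22, precision and recall can be computed from two sets per user: the items that were recommended and the items the user considers relevant. The sketch below assumes that the relevant set is approximated by the fixed songs a user rated positively plus the "I'd have expected X" comments, which is exactly why recall is the weaker of the two numbers. `RetrievalMetrics` is a hypothetical helper, not an existing class.

```java
import java.util.HashSet;
import java.util.Set;

final class RetrievalMetrics {
    /** Fraction of the recommended items that are relevant; 0 if nothing was recommended. */
    static double precision(Set<String> recommended, Set<String> relevant) {
        if (recommended.isEmpty()) return 0.0;
        return (double) hits(recommended, relevant) / recommended.size();
    }

    /** Fraction of the relevant items that were recommended; 0 if nothing is known to be relevant. */
    static double recall(Set<String> recommended, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;
        return (double) hits(recommended, relevant) / relevant.size();
    }

    // Size of the intersection, computed on a copy so the inputs stay untouched.
    private static int hits(Set<String> recommended, Set<String> relevant) {
        Set<String> common = new HashSet<>(recommended);
        common.retainAll(relevant);
        return common.size();
    }
}
```

Because the relevant set is incomplete by construction, the recall value is best read as a bound on what users noticed was missing rather than as a true recall.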

== 2008-08-18 ==
 * Create paper prototype for importer/annotator GUI
 * Write/complete wiki page for importer

== 2008-07-30 ==
 * Assemble MP3 file test collection
 * Manually create baseline
 * Create unit tests for importer

== 2008-07-11 ==
 * Put global file id data in skip:// namespace (in normal store) - including md5sum, size, bitrate, file basename (normal OS info; perhaps also specialized "core md5sum/core mp3 size")
   * !ItemName is basename/Skipmedia:File
   * Global file id data is useful for fast lookup of file (without "importing" them) - lurker use case
 * Put local file id data in file:// namespace (again, in normal store) - won't be shared. Include localpath, modifieddate, etc. Use own properties/ontology for this.
 * Importer should keep "open questions" list (and persist this in RDF store, again as local non-shared info - proper import might take several weeks)
 * Idea: Annotate certificates and recommendations with "first issued" date

== 2008-07-04 ==
 * Skipmedia: Thing->!MediaInstance->File->(MP3,OGG,FLAC) - bitrate,md5sum attributes (not Features!) of File.
 * Some XMPP/bot discussions. Music importer should import into user's namespace.

== 2008-06-27 ==
 * Skiptrax (Album, Record, etc. part) looks fine.
 * TODO
   * Importer: Have a look at [http://blog.dbtune.org/post/2007/08/30/GNAT-01-released GNAT] and !MusicBrainz services.
   * Create Skipmedia ontology (File, md5sum, etc.).
   * Implement Jabber chat bot as preliminary frontend?
   * Think about GUI.
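
The global file id data from the 2008-07-11 notes (md5sum, size, file basename) can be collected with the Java standard library alone. The following sketch shows one way to do it, assuming the md5sum is taken over the whole file; bitrate and the "core md5sum" variant need an MP3 parser and are left out. `GlobalFileId` is a hypothetical class name, and how the values are written into the skip:// namespace is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class GlobalFileId {
    final String basename;
    final long size;
    final String md5sum;

    private GlobalFileId(String basename, long size, String md5sum) {
        this.basename = basename;
        this.size = size;
        this.md5sum = md5sum;
    }

    /** Reads the file once and returns basename, size in bytes and lowercase hex MD5. */
    static GlobalFileId of(Path file) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
        // Stream the file through the digest so large files are not loaded into memory.
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md5)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // reading is enough; the stream updates the digest
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return new GlobalFileId(file.getFileName().toString(), Files.size(file), hex.toString());
    }
}
```

Since none of the three values depends on the local path, the same file yields the same id on every node, which is what makes the lookup without "importing" possible.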