Changes between Version 3 and Version 4 of ExpertRecommender


Timestamp: 10/17/08 10:51:56 (16 years ago)
Author: fmittag
Comment: raw description of the expert recommendation process (Pearson-correlation)

Legend:

  +  Added
  -  Removed
     Unmodified
  • ExpertRecommender


- Example 3:
+ == Idea ==

-  * Player_order
-    * All_players_simultaneously
-    * Player_order_changes
-      * Bidding_on_player_order
+ We postpone the problem of calculating an overall similarity between users by looking at each feature class individually. For this, we take advantage of a method widely used^[citation needed]^ in collaborative filtering/recommendation: the [http://en.wikipedia.org/wiki/Correlation#Pearson.27s_product-moment_coefficient Pearson product-moment correlation coefficient] (or Pearson correlation for short), which "indicates the strength and direction of a linear relationship between two random variables".
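
For reference, over the items i rated by both users X and Y (with ratings x_i and y_i), the standard unweighted Pearson correlation is:
{{{
              sum_i (x_i - mean(x)) * (y_i - mean(y))
corr(X,Y) = ---------------------------------------------------------------
            sqrt( sum_i (x_i - mean(x))^2 ) * sqrt( sum_i (y_i - mean(y))^2 )
}}}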

- Let there be the following opinions:
+ In Collaborative Filtering (CF), the ratings of user X and user Y play the role of these two random variables, and a (positive) linear relationship between them means that both users rate items in the same way (though not necessarily identically). The Pearson correlation centers both random variables on their respective means, so two variables are considered positively correlated if they differ from their respective means in the same direction. (TODO: example needed)
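
A minimal illustration with hypothetical ratings (not taken from the page): suppose X rates three items 1, 3, 5 and Y rates the same items 2, 4, 6. The ratings are not identical, but both users deviate from their own mean in the same direction on every item, so the correlation is +1:
{{{
X = (1, 3, 5),  mean(X) = 3,  deviations = (-2, 0, +2)
Y = (2, 4, 6),  mean(Y) = 4,  deviations = (-2, 0, +2)
corr(X,Y) = (4 + 0 + 4) / (sqrt(8) * sqrt(8)) = 8 / 8 = +1
}}}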

-  * User A says that game G has the features: +All_players_simultaneously
-  * User B says that game G has the features: +Player_order_changes
-  * User C says that game G has the features: -Bidding_on_player_order
-  * User D says that game G has the features: +Player_order_changes, -Bidding_on_player_order
+ Our approach is to look at one feature class at a time and to interpret the opinion of a user as a rating. The applicability represents the rating value, whereas the confidence acts as a weight on this rating; this requires a generalized form of the Pearson correlation that allows weighted ratings. (TODO: include formula)
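
One common weighted generalization, sketched here under the assumption that each pair of opinions i carries a non-negative weight w_i derived from the two confidences (e.g. their product or minimum; the exact choice is not specified in the text), replaces the plain sums and means with weighted ones:
{{{
mean_w(x) = sum_i w_i * x_i / sum_i w_i

                           sum_i w_i * (x_i - mean_w(x)) * (y_i - mean_w(y))
corr_w(X,Y) = -------------------------------------------------------------------------------
              sqrt( sum_i w_i * (x_i - mean_w(x))^2 ) * sqrt( sum_i w_i * (y_i - mean_w(y))^2 )
}}}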

- Naturally, one would say that users B and D have similar opinions, as do users C and D. But what about B and C? One might be tempted to say that the opinions of users A and B exclude each other, but this cannot be known for sure, because there might be some game that has different phases, one with all players playing simultaneously and one with a changing player order. Stating that the opinions of A and B are equal because both state that there is some player order would also be wrong, because here the feature Player_order is merely for grouping purposes and has no meaning of its own.

- == Conclusion ==

- I looked into the !SkipTrax and the Ludopinions ontologies and found only two cases:

-  * a feature with sub-features is for grouping purposes only, and it would be meaningless to state something about this feature itself
-  * the sub-features of a feature are specialized cases of the super-feature, so the super-feature should have at least the maximum applicability value of all of its sub-features
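
A brief illustration of the second case, with a hypothetical applicability value:
{{{
applicability(Bidding_on_player_order) = 0.8
=> applicability(Player_order_changes) >= 0.8   (super-feature of Bidding_on_player_order)
}}}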

- == Suggestion ==

- === Comparing features ===

- Define a similarity metric that compares two features x and y of the same (direct) type. For now, the value of a feature is simply its applicability value.
- {{{
- sim(x,y) = 1 - dist(x,y)/2
- }}}
- The distance between two features x and y is defined as follows:
- {{{
- dist(x,y) = |x-y|
- }}}
- This means that two features with the same applicability have distance 0 and thus similarity 1 - 0/2 = 1. Two features with applicability -1 and 1 would have distance 2 and similarity 1 - 2/2 = 0. (TODO: prove the properties of a metric)
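
A minimal Python sketch of these two definitions, assuming applicability values in [-1;+1] (the function names are illustrative, not from the page):
{{{
def dist(x, y):
    """Distance between two applicability values in [-1, +1]."""
    return abs(x - y)

def sim(x, y):
    """Similarity in [0, 1] derived from the distance."""
    return 1.0 - dist(x, y) / 2.0

assert sim(0.3, 0.3) == 1.0   # equal applicability -> distance 0, similarity 1
assert sim(-1.0, 1.0) == 0.0  # opposite extremes  -> distance 2, similarity 0
}}}
Regarding the TODO: dist(x,y) = |x-y| is the usual absolute-difference metric on [-1;+1], so non-negativity, symmetry, identity of indiscernibles and the triangle inequality all follow directly from the properties of the absolute value.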

- === Comparing items ===

- The similarity of two items is defined through the similarity of their features. The outline of a potential algorithm looks like this (a minimal sketch follows the list):

-  * The similarity of two items is the arithmetic mean of the similarities of all features
-  * Features that are not annotated will be ignored
-  * If a feature type is only annotated in one item, feature values need to be inferred until they can be compared
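
A minimal Python sketch of the first two points, assuming an item is represented as a mapping from feature name to applicability in [-1;+1]; inference of missing values (third point) is left to the interval mechanism described below:
{{{
def item_similarity(item1, item2):
    # compare only feature types annotated in both items; inference of
    # missing values is not handled in this sketch
    common = set(item1) & set(item2)
    if not common:
        return None
    sims = [1.0 - abs(item1[f] - item2[f]) / 2.0 for f in common]
    return sum(sims) / len(sims)   # arithmetic mean of the feature similarities

# second example below: A=1,B=-1 vs. A=1,B=1  ->  (1 + 0) / 2 = 0.5
print(item_similarity({"A": 1.0, "B": -1.0}, {"A": 1.0, "B": 1.0}))
}}}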

- Example:

- Let there be a simple feature hierarchy as follows:
- {{{
-   A
-  / \
- B   C
- }}}

- Example similarities would be: ("-" means: not annotated)
- {{{
-   1      -1
-  / \  ;  / \      => similarity = 0
- -   -   -   -
- }}}
- {{{
-    1       1
-   / \  ;  / \     => similarity = (1 + 0) / 2 = 0.5
- -1   -   1   -
- }}}

- Some non-trivial cases:
- {{{
-    -      -1                               -       -
-   / \  ;  / \     => similarity = ?       / \  ;  / \     => similarity = ?
- -1   -   -   -                          +1   -   -  -1
- }}}

- Suggestion: Propagate possible values as intervals up or down the hierarchy.

- We extend the distance metric to intervals, where x1 and x2 denote the bounds of the interval x = [x1;x2] (if x1 = x2, we just write [x1], which is equal to the value x1):
- {{{
- dist(x,y) = (|x1-y1| + |x2-y2|) / 2
- }}}

- The first of the non-trivial cases above can then be compared:
- {{{
-    -      -1        [-1;+1]       -1
-   / \  ;  / \   =>   /   \   ;    / \    => similarity = (sim(-1,[-1]) + sim([-1;+1],-1)) / 2 = (1 + 0.5) / 2 = 0.75
- -1   -   -   -     -1     -    [-1]  -
- }}}
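
A minimal Python sketch of the interval-based comparison, assuming every value is stored as a pair of bounds (a plain value x corresponds to the degenerate interval [x;x]); it reproduces the 0.75 from the example above:
{{{
def dist(x, y):
    # x and y are intervals given as (lower, upper) pairs
    (x1, x2), (y1, y2) = x, y
    return (abs(x1 - y1) + abs(x2 - y2)) / 2.0

def sim(x, y):
    return 1.0 - dist(x, y) / 2.0

sim_B = sim((-1.0, -1.0), (-1.0, -1.0))   # sim(-1, [-1])    = 1.0
sim_A = sim((-1.0, +1.0), (-1.0, -1.0))   # sim([-1;+1], -1) = 0.5
print((sim_B + sim_A) / 2.0)              # 0.75
}}}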
+ The resulting correlation coefficient always lies in the range [-1;+1] and indicates how similar two users' opinions are with regard to this specific feature class only. Users with a high correlation are then called experts regarding this feature class. The correlation can also be used as an additional weight on such a user's opinion about an item. (TODO: example)
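
A small numeric sketch of such a weighting, with hypothetical numbers and one possible scheme (a correlation-weighted average; not necessarily the scheme intended here): if user X has correlation 0.9 with the active user and user Y has 0.3, and they state applicabilities +1.0 and -0.5 for the same feature, the weighted estimate would be:
{{{
estimate = (0.9 * (+1.0) + 0.3 * (-0.5)) / (0.9 + 0.3)
         = 0.75 / 1.2
         = 0.625
}}}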