Naturally, one would say that users B and D have similar opinions, as do users C and D. But what about B and C? One might be tempted to say that the opinions of users A and B exclude each other, but this can't be known for sure, because there might be some game with different phases, one with all players playing simultaneously and one with a changing player order. Stating that the opinions of A and B are equal because both state that there is some playing order would also be wrong, because here the feature Player_order is merely for grouping purposes and has no meaning of its own.

== Conclusion ==

I looked into the !SkipTrax and the Ludopinions ontology and only found two cases:

* a feature with sub-features serves grouping purposes only, and it would have no meaning to state something about this feature itself
* the sub-features of a feature are specialized cases of the super-feature, so the super-feature should have at least the maximum applicability value of all of its sub-features

== Suggestion ==

=== Comparing features ===

Define a similarity metric that compares two features x and y of the same (direct) type. Until now, the value of a feature is only its applicability value.
{{{
sim(x,y) = 1 - dist(x,y)/2
}}}
The distance between two features x and y is defined as follows:
{{{
dist(x,y) = |x-y|
}}}
This means that two features with the same applicability have distance 0 and thus similarity 1 - 0/2 = 1. Two features with applicability -1 and 1 would have distance 2 and thus similarity 1 - 2/2 = 0. (TODO: prove the properties of a metric)
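The two definitions above can be sketched in Python (a hypothetical helper, assuming applicability values lie in [-1;+1]):

```python
# Sketch of the feature similarity metric defined above,
# assuming applicability values lie in [-1;+1].

def dist(x, y):
    # Distance between two applicability values.
    return abs(x - y)

def sim(x, y):
    # Similarity in [0;1] derived from the distance.
    return 1 - dist(x, y) / 2

print(sim(1, 1))   # identical applicability -> 1.0
print(sim(-1, 1))  # opposite applicability -> 0.0
```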

=== Comparing items ===

The similarity of two items is defined through the similarity of their features. The outline of a potential algorithm looks like this:

* The similarity of two items is the arithmetic mean of the similarities of all features
* Features that are annotated in neither item are ignored
* If a feature type is annotated in only one item, feature values need to be inferred until they can be compared
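The outline above could be sketched as follows. The item representation (a dict from feature name to applicability value) and the function names are assumptions, and the inference step for one-sidedly annotated features is left out:

```python
# Sketch of the item comparison outline; items are assumed to be
# dicts mapping feature names to applicability values, with
# unannotated features simply absent.

def sim(x, y):
    return 1 - abs(x - y) / 2

def item_similarity(a, b):
    # Arithmetic mean of feature similarities over features
    # annotated in both items. Inferring values for features
    # annotated in only one item is not handled here.
    shared = set(a) & set(b)
    if not shared:
        return None
    return sum(sim(a[f], b[f]) for f in shared) / len(shared)

# First example below: only A is annotated, with opposite values.
print(item_similarity({"A": 1}, {"A": -1}))                  # -> 0.0
# Second example: A agrees, B disagrees, C is ignored.
print(item_similarity({"A": 1, "B": -1}, {"A": 1, "B": 1}))  # -> 0.5
```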

Example:

Let there be a simple feature hierarchy as follows:
{{{
  A
 / \
B   C
}}}

Example similarities would be ("-" means: not annotated):
{{{
   1           -1
  / \    ;     / \    => similarity = 0
 -   -        -   -
}}}
{{{
   1            1
  / \    ;     / \    => similarity = (1 + 0) / 2 = 0.5
-1   -        1   -
}}}

Some non-trivial cases:
{{{
   -          -1                            -            -
  / \    ;    / \    => similarity = ?     / \    ;     / \    => similarity = ?
-1   -       -   -                       +1   -        -  -1
}}}

Suggestion: propagate possible values as intervals up or down the hierarchy.

We extend the distance metric to intervals, where x1 and x2 denote the interval bounds of x = [x1;x2] (if x1 = x2, we just write [x1], which is equal to the value x1):
{{{
dist(x,y) = (|x1-y1| + |x2-y2|) / 2
}}}
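A sketch of the extended metric, representing an interval as a (lo, hi) pair, with a plain value v written as the degenerate interval (v, v):

```python
# Sketch of the interval distance defined above; a plain value v
# corresponds to the degenerate interval (v, v).

def dist(x, y):
    (x1, x2), (y1, y2) = x, y
    return (abs(x1 - y1) + abs(x2 - y2)) / 2

def sim(x, y):
    return 1 - dist(x, y) / 2

# Degenerate intervals reduce to the plain metric:
print(sim((1, 1), (-1, -1)))   # -> 0.0
# The interval [-1;+1] against the value -1:
print(sim((-1, 1), (-1, -1)))  # -> 0.5
```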

The above example can then be compared:
{{{
   -          -1            [-1;+1]        -1
  / \    ;    / \    =>       / \     ;    / \    => similarity = (sim(-1,[-1]) + sim([-1;+1],-1)) / 2 = (1 + 0.5) / 2 = 0.75
-1   -       -   -         -1     -     [-1]  -
}}}
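The whole comparison could be sketched as follows. The propagation rules are assumptions derived from the conclusion above: an unannotated super-feature gets the interval [max of its sub-features; +1], and an unannotated sub-feature gets [-1; super-feature value]:

```python
# Sketch of the worked example above, with hypothetical interval
# propagation rules (assumptions, not a fixed design).

def dist(x, y):
    (x1, x2), (y1, y2) = x, y
    return (abs(x1 - y1) + abs(x2 - y2)) / 2

def sim(x, y):
    return 1 - dist(x, y) / 2

def propagate_up(sub_values):
    # Unannotated super-feature: at least the max of its sub-features.
    return (max(sub_values), 1)

def propagate_down(super_value):
    # Unannotated sub-feature: at most the super-feature's value.
    return (-1, super_value)

# Left item: B = -1, A unannotated -> A = [-1;+1]
left_A = propagate_up([-1])   # (-1, 1)
left_B = (-1, -1)
# Right item: A = -1, B unannotated -> B = [-1] (i.e. (-1, -1))
right_A = (-1, -1)
right_B = propagate_down(-1)  # (-1, -1)

similarity = (sim(left_B, right_B) + sim(left_A, right_A)) / 2
print(similarity)  # -> 0.75
```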