CS520
Question about Pearson Correlation Coefficient

Author	Message
hsankav Posts: 18	Posted 15:32 Mar 30, 2010 \| Consider, for example: if A is highly correlated with B, and B is highly correlated with C, can we say A is highly correlated with C?
cysun Posts: 2935	Posted 15:56 Mar 30, 2010 \| In the context of the rating prediction problem, the answer is no, because the items co-rated by each pair (A and B, B and C, A and C) could be different.
niteenborge Posts: 7	Posted 16:07 Mar 30, 2010 \| Also, correlation is not transitive in general.
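To make the non-transitivity point concrete, here is a minimal sketch in Python. The three vectors are made up for illustration (they are not from the lecture), and `pearson` is a hypothetical helper written out by hand. B is built as A + C, so it correlates fairly strongly with both, while A and C are uncorrelated with each other.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical data, chosen so that A and C are uncorrelated.
A = [1, -1, 1, -1]
C = [1, 1, -1, -1]
B = [a + c for a, c in zip(A, C)]  # B = A + C = [2, 0, 0, -2]

r_ab = pearson(A, B)  # about 0.71
r_bc = pearson(B, C)  # about 0.71
r_ac = pearson(A, C)  # 0: A and C are uncorrelated

print(f"r(A,B)={r_ab:.2f}  r(B,C)={r_bc:.2f}  r(A,C)={r_ac:.2f}")
```

In the rating setting the situation is even looser, since each pair's coefficient may be computed over a different set of co-rated items.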
Phanit Posts: 10	Posted 10:14 Mar 31, 2010 \| A question regarding the predicted rating / Pearson correlation coefficient: in the video lecture, the example calculates the r average of Meg as (2+4)/2, not (2+4+3)/3. My question is: for the r average of Ken, when calculating the similarity (W) of Ken and Meg, do we use (1+5)/2 = 3 or (1+5+2+4)/4 = 3? Second question, regarding predicting the rating of item i by user x: to calculate the predicted rating for user x on item i, we need to add the average of user x. For that average, do we take the average of all the items that user x has rated, even the ones that other users have not rated? Otherwise user x might have a different average against each user, and the formula might be different. Thanks -Phanit
alomo Posts: 70	Posted 10:51 Mar 31, 2010 \| At first I had the same kind of doubts about that formula. However, the Pearson correlation coefficients are the similarity weights between the active user and the other users who share rating history with them. For each pair of users, we calculate the coefficient W based on the specific (and possibly different) set of items that they both rated. So the formula is correct. Last edited by alomo at 10:52 Mar 31, 2010.
Phanit Posts: 10	Posted 11:06 Mar 31, 2010 \| alomo wrote: "At first I was having same kind of doubts on that formula. However, the Pearson correlation coefficients are the similarity weights between the active user and other users sharing the rating history. For each pair of users we calculate coefficient W based on specific (different) set of items that they both rated. So the formula is correct." Yes, so my first question was: in the video lecture, the example calculates the r average of Meg as (2+4)/2, not (2+4+3)/3. For the r average of Ken, when calculating the similarity (W) of Ken and Meg, do we use r average of Ken = (1+5)/2 = 3, since Meg only rated the first 2 items, or r average of Ken = (1+5+2+4)/4 = 3? Second question, regarding predicting the rating of item i by user x: to calculate the predicted rating for user x on item i, we need to add the average of user x. For that average, do we take the average of all the items that user x has rated, even the ones that other users have not rated? An example would be Ken rating the 3rd item although Lee and Meg did not rate it. Otherwise user x might have a different average against each user, and the formula might be different. For example, suppose Ken's ratings were such that his r average with respect to Meg is (2+5)/2 = 3.5, but his r average with respect to Lee is (2+5+1+4)/4 = 3, since Lee rated the same 4 items. Sorry if the question is confusing. -Phanit
alomo Posts: 70	Posted 12:02 Mar 31, 2010 \| For the 1st question: to calculate the similarity (W) of Ken and Meg, we take r average of Ken = (1+5)/2 = 3, since Ken and Meg share a rating history of only the first 2 items. For the 2nd question: we use the r_x average over all items rated by user x. As Dr. Sun mentioned in the video, to calculate the prediction p_x,i we use the behavior of the current user x (which is the r_x average), modified by the weighted behavior of other (similar) users. Last edited by alomo at 12:08 Mar 31, 2010.
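The two answers above can be sketched roughly in Python as follows. This is only an illustration under several assumptions: the item ids (`i1` to `i5`) and the placement of Meg's rating of 3 on `i5` are guesses, since the thread only gives the rating values; the function names are hypothetical; and using the neighbor's mean over the co-rated items in the prediction step is one possible reading that the thread does not settle.

```python
from math import sqrt

# Assumed layout of the lecture example: only the rating values
# (Ken: 1, 5, 2, 4; Meg: 2, 4, 3) come from the thread.
ratings = {
    "ken": {"i1": 1, "i2": 5, "i3": 2, "i4": 4},
    "meg": {"i1": 2, "i2": 4, "i5": 3},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def similarity(x, u):
    """Pearson weight W between users x and u over co-rated items only."""
    common = sorted(set(ratings[x]) & set(ratings[u]))
    if len(common) < 2:
        return 0.0
    mean_x = mean(ratings[x][i] for i in common)  # Ken: (1+5)/2 = 3
    mean_u = mean(ratings[u][i] for i in common)  # Meg: (2+4)/2 = 3
    num = sum((ratings[x][i] - mean_x) * (ratings[u][i] - mean_u) for i in common)
    den = sqrt(sum((ratings[x][i] - mean_x) ** 2 for i in common)) * \
          sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    return num / den if den else 0.0

def predict(x, item):
    """Predicted rating p_x,i: x's overall mean plus weighted neighbor deviations."""
    base = mean(ratings[x].values())  # average over ALL items rated by x
    num = den = 0.0
    for u in ratings:
        if u == x or item not in ratings[u]:
            continue
        common = set(ratings[x]) & set(ratings[u])
        if not common:
            continue
        w = similarity(x, u)
        # Assumption: neighbor's mean is taken over the co-rated items.
        mean_u = mean(ratings[u][i] for i in common)
        num += w * (ratings[u][item] - mean_u)
        den += abs(w)
    return base + num / den if den else base

w_ken_meg = similarity("ken", "meg")
p_ken_i5 = predict("ken", "i5")
print(w_ken_meg, p_ken_i5)
```

Note that with only two co-rated items the Pearson coefficient is always +1 or -1 whenever it is defined, so a weight based on so little shared history should be treated with caution.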
Phanit Posts: 10	Posted 12:04 Mar 31, 2010 \| Thanks.
cysun Posts: 2935	Posted 12:28 Apr 01, 2010 \| alomo is correct.