**Property 1**: Suppose sample 1 has size *n*_{1} and rank sum *R*_{1} and sample 2 has size *n*_{2} and rank sum *R*_{2}, then *R*_{1} + *R*_{2} = *n*(*n*+1)/2 where *n* = *n*_{1} + *n*_{2}.

Proof: This is simply a consequence of the fact that the sum of the first *n* positive integers is *n*(*n*+1)/2. This can be proven by induction. For *n* = 1, we see that *n*(*n*+1)/2 = 1(2)/2 = 1 = *n*. Assume the result is true for *n*, then for *n* + 1 we have

$$1 + 2 + \cdots + n + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{n(n+1) + 2(n+1)}{2} = \frac{(n+1)(n+2)}{2}$$
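As a quick numerical check of Property 1, the following minimal sketch ranks two small samples jointly and compares the two rank sums with *n*(*n*+1)/2. The function name `rank_sums` and the data are hypothetical, chosen only for illustration. The sketch also assumes that tied values share the average of the ranks they occupy, a convention that leaves the total rank sum unchanged.

```python
from itertools import chain


def rank_sums(sample1, sample2):
    """Return (R1, R2), the rank sums of two samples ranked jointly.

    Assumption: tied values receive the average of the ranks they occupy.
    """
    combined = sorted(chain(sample1, sample2))
    # Collect the 1-based positions occupied by each distinct value.
    positions = {}
    for pos, value in enumerate(combined, start=1):
        positions.setdefault(value, []).append(pos)
    # The rank of a value is the average of its positions.
    avg_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    r1 = sum(avg_rank[v] for v in sample1)
    r2 = sum(avg_rank[v] for v in sample2)
    return r1, r2


# Illustrative data only (not taken from the text).
sample1 = [12, 15, 9, 20]
sample2 = [11, 15, 22, 30, 7, 18]

r1, r2 = rank_sums(sample1, sample2)
n = len(sample1) + len(sample2)
print(r1, r2, r1 + r2, n * (n + 1) / 2)
```

The last two printed values should agree, whatever data is supplied.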

**Property 2**: When the two samples are sufficiently large (say of size > 10, although some say 20), then the *W* statistic is approximately normal *N*(*μ, σ*) where

$$\mu = \frac{n_1(n_1+n_2+1)}{2} \qquad \sigma = \sqrt{\frac{n_1 n_2 (n_1+n_2+1)}{12}}$$
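To illustrate how Property 2 might be used in practice, here is a minimal sketch that converts a rank sum into a z-score and a two-sided p-value, taking the mean as *n*_{1}(*n*+1)/2 and the standard deviation as the square root of *n*_{1}*n*_{2}(*n*+1)/12, with *n* = *n*_{1} + *n*_{2}. The function name `rank_sum_normal_approx` and the input values are hypothetical. The sketch assumes no ties and applies no continuity correction, so it is a simplification rather than a complete test procedure.

```python
import math


def rank_sum_normal_approx(w, n1, n2):
    """Return (mu, sigma, z, p) for the rank sum w of the sample of size n1.

    Assumptions: no ties, no continuity correction, two-sided test.
    """
    n = n1 + n2
    mu = n1 * (n + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (w - mu) / sigma
    # Two-sided standard normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return mu, sigma, z, p


# Hypothetical inputs: rank sum 117 for the first of two samples of size 12.
mu, sigma, z, p = rank_sum_normal_approx(w=117, n1=12, n2=12)
print(f"mu={mu:.1f}, sigma={sigma:.3f}, z={z:.3f}, p={p:.4f}")
```

The complementary error function from the standard library is used here so that no external statistics package is needed.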

Proof: We prove that the mean and variance of *W* = *R*_{1} are as described above. The normal approximation was proven in Mann & Whitney (1947), listed in the Bibliography, and we won’t repeat the proof here.

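Let *x*_{i} = the rank of the *i*th data element in the smaller sample, so that *W* is the sum of the *x*_{i} for *i* = 1, …, *n*_{1}. Under the null hypothesis, each *x*_{i} is equally likely to take any of the ranks 1, …, *n*. Thus, under the assumption of the null hypothesis, by Property 1

$$E[x_i] = \frac{1}{n}\sum_{k=1}^{n} k = \frac{1}{n}\cdot\frac{n(n+1)}{2} = \frac{n+1}{2}$$

By Property 4a of Expectation

$$E[W] = E\left[\sum_{i=1}^{n_1} x_i\right] = \sum_{i=1}^{n_1} E[x_i] = \frac{n_1(n+1)}{2}$$

As we did in the proof of Property 1, we can show by induction on *n* that

$$\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$$

From these it follows that

$$\sum_{k \ne l} kl = \left(\sum_{k=1}^{n} k\right)^2 - \sum_{k=1}^{n} k^2 = \frac{n^2(n+1)^2}{4} - \frac{n(n+1)(2n+1)}{6} = \frac{n(n+1)(n-1)(3n+2)}{12}$$

We can now calculate the following expectations:

$$E[x_i^2] = \frac{1}{n}\sum_{k=1}^{n} k^2 = \frac{(n+1)(2n+1)}{6}$$

Also where *i* ≠ *j*

$$E[x_i x_j] = \frac{1}{n(n-1)}\sum_{k \ne l} kl = \frac{(n+1)(3n+2)}{12}$$

By Property 2 of Expectation (case where *i = j*)

$$\mathrm{var}(x_i) = E[x_i^2] - (E[x_i])^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{(n+1)(n-1)}{12}$$

By Property 3 of Basic Concepts of Correlation when *i* ≠ *j*

$$\mathrm{cov}(x_i, x_j) = E[x_i x_j] - E[x_i]E[x_j] = \frac{(n+1)(3n+2)}{12} - \frac{(n+1)^2}{4} = -\frac{n+1}{12}$$

By an extended version of Property 5 of Basic Concepts of Correlation

$$\mathrm{var}(W) = \sum_{i=1}^{n_1} \mathrm{var}(x_i) + 2\sum_{i<j} \mathrm{cov}(x_i, x_j) = \frac{n_1(n+1)(n-1)}{12} - \frac{n_1(n_1-1)(n+1)}{12} = \frac{n_1(n+1)(n-n_1)}{12} = \frac{n_1 n_2 (n+1)}{12}$$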
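The mean *n*_{1}(*n*+1)/2 and variance *n*_{1}*n*_{2}(*n*+1)/12 of *W* can also be checked by brute force for small samples. The sketch below assumes no ties, so that under the null hypothesis every set of *n*_{1} ranks drawn from 1, …, *n* is equally likely. It enumerates all such sets and compares the exact mean and variance of their sums with the two formulas. The function name `exact_mean_and_variance` and the sample sizes are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def exact_mean_and_variance(n1, n2):
    """Return (E[W], var(W)) by enumerating every set of n1 ranks from 1..n.

    Assumption: no ties, so each set of ranks is equally likely under the
    null hypothesis.
    """
    n = n1 + n2
    sums = [sum(c) for c in combinations(range(1, n + 1), n1)]
    count = len(sums)
    # Exact rational arithmetic avoids floating-point rounding.
    mean = Fraction(sum(sums), count)
    var = Fraction(sum(s * s for s in sums), count) - mean ** 2
    return mean, var


# Small hypothetical sample sizes keep the enumeration cheap.
n1, n2 = 3, 5
n = n1 + n2
mean, var = exact_mean_and_variance(n1, n2)
print(mean == Fraction(n1 * (n + 1), 2))       # compare with the mean formula
print(var == Fraction(n1 * n2 * (n + 1), 12))  # compare with the variance formula
```

The enumeration grows combinatorially with the sample sizes, so this is only practical as a sanity check for small *n*.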

thanx! really helpful!

thanks 😀

Thanks. Good one

I’m grateful for this better understanding

Thanks a lot, you really put it out on paper

Your site is very good!

Have you thought about translating your work into other languages?

Marcel,

I hadn’t. What languages did you have in mind?

Charles

I was thinking of German and Polish. I could help you.

Marcel,

Thanks for the offer, but so far people haven’t been asking for translations. In any case, I’ll think about it.

Charles

I have one question about this proof. You calculate the expectation E(rirj) for all j not equal to i. I don’t understand why we can treat this expectation as equivalent to E(rirj) for all i, j. In the covariance we have to use E(rirj) over all ranks, but you use the expectation only for j not equal to i. Why is this correct? Can you explain this to me?

Thanks for your answer!

Marcel,

I show E[ri rj] both where i = j (i.e. var(ri)) and where i is not equal to j.

Charles

I am grateful to you for your answer. But I wanted to say that you use the expectation E(rirj) with i not equal to j in the covariance cov(ri, rj). I don’t understand why we can do this. I thought we needed the expectation over all i and j (also in the double sum over i and j).

I made a print-screen with both places in your proof: https://image.prntscr.com/image/7lBOfbg1RribBEjyIbN1Kw.png

Marcel,

Thanks for clarifying things. There are two cases: (1) where i = j and (2) where i is not equal to j. In case (1) cov(ri, rj) = var(ri), which is the one described in your print-screen. In case (2) the formula is the one shown in your print-screen. I have just updated the referenced webpage to try to make this a bit clearer. Does it help?

Charles

Thank you very much!

I understand it now.

Marcel,

Good to hear. Glad I could help.

Charles

It’s 2Σ i≠j to n1 [-(n+1)/12]. I wrote it wrong. I think there shouldn’t be a ‘2’ in there.

Jack,

You need to sum all the terms cov(x_i,x_j) where i is not equal to j. Note that each such covariance is repeated twice, once for cov(x_i,x_j) and once for cov(x_j,x_i). Thus, if you assume that the sum is over i < j, then you need to double the result. Another way to look at this is to determine how many pairs there are for the indices 1 to n1 where the indices are not equal. The answer is n1(n1-1), which is the value used in the proof. This is the same as 2 times n1(n1-1)/2, the latter being the number of pairs where the first index is less than the second index. To make this much clearer and more accurate, I have now replaced the lower limit of the summation symbol by i < j (instead of i not equal to j). Thanks for bringing this issue to my attention.

Charles

Oh, now I understand it! Thanks for your response.