How to perform a χ-square test on a cross tabulation?

You can perform a [latex]\chi^2[/latex] test on a cross tabulation with the formula below. For each cell, subtract the expected value (E) from the observed value (O), square the difference, and divide it by the expected value. Then add up the results over all cells.

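[latex]\displaystyle\chi^2(df)=\sum\frac{(O-E)^2}{E}[/latex]

df: degrees of freedom. For a table with r rows and c columns, df = (r - 1)(c - 1), so a 2 × 2 table has df = 1.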

The [latex]\chi^2[/latex] statistic follows a [latex]\chi^2[/latex] distribution. With 1 degree of freedom, the upper-tail critical values are 3.841 for p = 0.05, 6.635 for p = 0.01 and 10.828 for p = 0.001: a statistic larger than the critical value gives a p smaller than that level. The values 5.024 and 7.879 are the upper-tail points for p = 0.025 and p = 0.005, that is, the thresholds you get when 0.05 or 0.01 is split equally between two tails.
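The critical values quoted above can be checked with a short Python sketch. It assumes 1 degree of freedom only, where the upper-tail probability equals erfc(sqrt(x / 2)) because the statistic is then the square of a standard normal variable. The function names `chi_square_statistic` and `chi_square_p_df1` are illustrative and do not come from any library.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all cells, given flat sequences of equal length."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_df1(chi2):
    """Upper-tail probability of a chi-square value; valid only for df = 1."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Print the upper-tail probability for each critical value mentioned in the text.
for critical in (3.841, 5.024, 6.635, 7.879, 10.828):
    print(f"chi2 = {critical:6.3f}  ->  p = {chi_square_p_df1(critical):.4f}")
```

For more than 1 degree of freedom this shortcut does not apply, and a statistics library would be the safer choice.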

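| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | a | b | a + b |
| NEGATIVE | c | d | c + d |
| Marginal total | a + c | b + d | N |

For this 2 × 2 table, the formula reduces to the expressions below. Yates' continuity correction subtracts N/2 from |ad - bc| before squaring.

[latex]\displaystyle \begin{array}{rcl}\chi^2&=&(ad-bc)^2\times\frac{N}{(a+b)(c+d)(a+c)(b+d)}\vspace{0.2in}\\\chi^2(Yates)&=&\left(|ad-bc|-\frac{N}{2}\right)^2\times\frac{N}{(a+b)(c+d)(a+c)(b+d)}\end{array}[/latex]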
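
As a minimal sketch of the 2 × 2 shortcut, the function below computes the statistic from the four cell counts a, b, c and d, optionally with Yates' continuity correction (subtracting N/2 from |ad - bc|). The name `chi_square_2x2` is hypothetical, and clamping the corrected difference at 0 is an assumption of this sketch so that the correction never overshoots.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic of a 2 x 2 table from its four cell counts."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every marginal total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by N/2, never below 0
        # (the clamp is an assumption of this sketch).
        diff = max(diff - n / 2.0, 0.0)
    return diff ** 2 * n / denominator


print(chi_square_2x2(10, 20, 30, 40))              # uncorrected
print(chi_square_2x2(10, 20, 30, 40, yates=True))  # with Yates' correction
```

The corrected statistic is never larger than the uncorrected one, which makes the test more conservative.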

When should you use Fisher's exact test instead of the chi-square test?

You should use Fisher's exact probability test instead of the chi-square test when the grand total of the cross tabulation is smaller than 20, or when one or more cells have an expected value smaller than 5. This article describes how to calculate the expected values. Each expected value is calculated from the marginal totals: the row total times the column total, divided by the grand total.

Consider the cross tabulation below:

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | a | b | a + b |
| NEGATIVE | c | d | c + d |
| Marginal total | a + c | b + d | N |

The expected value of each cell is as follows:

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | (a + b)*(a + c)/N | (a + b)*(b + d)/N | a + b |
| NEGATIVE | (c + d)*(a + c)/N | (c + d)*(b + d)/N | c + d |
| Marginal total | a + c | b + d | N |
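
The expected values and the rule of thumb above can be sketched in Python as follows. The function names `expected_values` and `prefer_fisher` are illustrative; the thresholds (grand total below 20, any expected value below 5) are the ones stated in this article.

```python
def expected_values(a, b, c, d):
    """Expected count of each cell: row total * column total / grand total."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    row_totals = (a + b, c + d)
    column_totals = (a + c, b + d)
    return [[row * column / n for column in column_totals] for row in row_totals]


def prefer_fisher(a, b, c, d):
    """True when the article's rule of thumb points to Fisher's exact test."""
    n = a + b + c + d
    smallest = min(min(row) for row in expected_values(a, b, c, d))
    return n < 20 or smallest < 5


print(expected_values(10, 20, 30, 40))
print(prefer_fisher(10, 20, 30, 40))
print(prefer_fisher(1, 9, 20, 70))
```

Note that the rule looks at expected values, not observed ones: a table can contain a small observed count and still be suitable for the chi-square test.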

How to calculate Fisher’s exact test with logarithms?

How to calculate the four cell counts from the marginal totals of a cross tabulation?

When you perform Fisher’s exact test on a cross tabulation, the marginal totals are held constant. Therefore, once you know ‘a’, the number of true positives, you can calculate the remaining three numbers. Because all four numbers must be 0 or greater, ‘a’ cannot exceed the smaller of the number of ‘TRUE’ and the number of ‘positive’. For the same reason, ‘a’ cannot fall below 0, nor below the number of ‘TRUE’ plus the number of ‘positive’ minus the grand total, since the true negative count must also be 0 or greater.

Fisher’s probability is a function of ‘a’ and follows the hypergeometric distribution. If ‘positive’ and ‘negative’ are separated by a cut-off value on a continuous variable, changing the cut-off changes the numbers of positives, negatives and true positives. The numbers of ‘TRUE’ and ‘FALSE’ never change, whatever the cut-off. Therefore, Fisher’s probability is also a function of the cut-off value.

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | a | P – a | P |
| NEGATIVE | T – a | a + N – P – T | N – P |
| Marginal total | T | N – T | N |
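
A minimal sketch of Fisher’s probability for a single value of ‘a’ is shown below. It works with logarithms of factorials, using the identity log(k!) = lgamma(k + 1), so that large factorials never overflow. The function names are hypothetical; the parameters `n`, `t` and `p` stand for N, T and P in the table above.

```python
import math


def log_factorial(k):
    """Natural logarithm of k!, computed as lgamma(k + 1)."""
    return math.lgamma(k + 1)


def fisher_probability(a, n, t, p):
    """Hypergeometric probability of 'a' true positives given N, T and P."""
    b = p - a          # false positives
    c = t - a          # false negatives
    d = a + n - p - t  # true negatives
    if min(a, b, c, d) < 0:
        return 0.0     # this table is impossible with the given marginal totals
    log_probability = (
        log_factorial(p) + log_factorial(n - p)
        + log_factorial(t) + log_factorial(n - t)
        - log_factorial(n)
        - log_factorial(a) - log_factorial(b)
        - log_factorial(c) - log_factorial(d)
    )
    return math.exp(log_probability)


print(fisher_probability(3, 8, 4, 4))
```

With N = 8, T = 4, P = 4 and a = 3, the probability is 16/70, about 0.2286. This is the probability of one table only; a p-value would sum such probabilities over the tables counted as at least as extreme.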

At first, you know only three numbers: ‘N’, the grand total; ‘T’, the number of ‘TRUE’; and ‘P’, the number of ‘positive’.

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | | | P |
| NEGATIVE | | | |
| Marginal total | T | | N |

Next, you can calculate ‘N – P’, the number of ‘negative’, and ‘N – T’, the number of ‘FALSE’. Now you have all the marginal totals.

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | | | P |
| NEGATIVE | | | N – P |
| Marginal total | T | N – T | N |

Once you know ‘a’, you can get the false negatives, ‘T – a’, and the false positives, ‘P – a’.

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | a | P – a | P |
| NEGATIVE | T – a | | N – P |
| Marginal total | T | N – T | N |

Finally, you can get the true negatives, ‘a + N – P – T’.

| | TRUE | FALSE | Marginal total |
|---|---|---|---|
| POSITIVE | a | P – a | P |
| NEGATIVE | T – a | a + N – P – T | N – P |
| Marginal total | T | N – T | N |
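
The steps above can be sketched in Python as follows. The function names `a_range` and `fill_table` are illustrative. The lower bound used here, the larger of 0 and P + T - N, follows from requiring the true negative count ‘a + N – P – T’ to be 0 or greater.

```python
def a_range(n, t, p):
    """Smallest and largest 'a' that keep all four cells at 0 or greater."""
    return max(0, p + t - n), min(t, p)


def fill_table(n, t, p, a):
    """Return (true positive, false positive, false negative, true negative)."""
    low, high = a_range(n, t, p)
    if not low <= a <= high:
        raise ValueError(f"'a' must lie between {low} and {high}")
    false_positive = p - a
    false_negative = t - a
    true_negative = a + n - p - t
    return a, false_positive, false_negative, true_negative


print(a_range(100, 40, 30))
print(fill_table(100, 40, 30, 10))
```

Looping ‘a’ over the range returned by `a_range` enumerates every table that is compatible with the fixed marginal totals.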