Download the data file adult.data.txt. A description of the data and its attributes is available at the UCI Machine Learning Repository: Adult Dataset. The data are a selection from the 1994 Census, with 48842 instances over 14 categorical, real and integer attributes.

Compute the contingency matrix for the variables `education` and `race`, and compute the \(\chi^2\) statistic using your own function, i.e., write a function that takes as input two categorical column-vectors and returns the \(\chi^2\) value and its p-value. At the 99% confidence level, are `education` and `race` dependent?

Retrieved from http://www.cs.rpi.edu/~zaki/dataminingbook/pmwiki.php/Main/ContingencyTableAnalysis

Page last modified on September 06, 2014, at 01:33 PM