Categorical variable

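In statistics, a categorical variable is a variable that can take only a finite number of values, each of which assigns an observation to a particular group or nominal category on the basis of some qualitative property.[1] In computer science and some branches of mathematics, categorical variables correspond to enumerations or enumerated types. Each possible value of a categorical variable is commonly called a level. The probability distribution of a random categorical variable is called a categorical distribution.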
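The correspondence with enumerated types can be sketched in Python. The class name `BloodType` and the helper `parse_blood_type` are hypothetical, chosen only for illustration; this is a minimal sketch, not a prescribed representation.

```python
from enum import Enum


class BloodType(Enum):
    """Hypothetical enumerated type; each member is one level."""
    A = "A"
    B = "B"
    AB = "AB"
    O = "O"


# The levels are the finite set of values the variable may take.
levels = [member.name for member in BloodType]


def parse_blood_type(label: str) -> BloodType:
    """Look up a level by its label; unknown labels raise ValueError."""
    return BloodType(label)
```

Because the set of members is fixed, a label outside the four levels cannot be turned into a value of the variable.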

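Categorical data is the statistical data type consisting of categorical variables and their data. More specifically, categorical data may be derived from qualitative data that are counted or summarized as contingency tables, or from quantitative data that are grouped into given intervals.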
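Both routes to categorical data can be illustrated with a short sketch. All observations, the interval boundaries, and the helper `age_group` are made up for the example.

```python
from collections import Counter

# Hypothetical qualitative observations: (smoking status, blood type).
observations = [
    ("smoker", "A"), ("non-smoker", "O"), ("non-smoker", "A"),
    ("smoker", "O"), ("non-smoker", "O"), ("non-smoker", "B"),
]

# A contingency table holds one count per combination of levels.
contingency = Counter(observations)

# Hypothetical quantitative data (ages in years) to be grouped.
ages = [3, 17, 25, 42, 67, 80]


def age_group(age: int) -> str:
    """Map an age to one of three illustrative intervals."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"


# Grouping by interval turns the quantitative values into categories.
age_counts = Counter(age_group(a) for a in ages)
```

Here `contingency[("non-smoker", "O")]` is 2, and each of the three age groups receives two observations.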

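A categorical variable that can take exactly two values is called a binary variable or dichotomous variable; the Bernoulli variable is an example. A categorical variable with more than two possible values is called a polytomous variable.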

Examples of categorical variables

Blood type: A, B, AB, or O.
The legal political parties of a country.
Rock type: igneous, sedimentary, or metamorphic.

Notation

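For ease of statistical processing, the values of a categorical variable may be assigned numeric indices, for example 1 through K for a K-way categorical variable (one with K possible values). The indices are arbitrary labels: they support equality comparison, and they can serve as elements of sets in set operations.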
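A minimal sketch of such an index encoding, using rock types as levels. The sample data, the site sets, and the name `index_of` are assumptions made for illustration.

```python
# Illustrative levels; the order used for indexing is arbitrary.
levels = ["igneous", "sedimentary", "metamorphic"]

# Assign the indices 1..K to the K levels.
index_of = {level: i for i, level in enumerate(levels, start=1)}

# Hypothetical samples, encoded by index.
samples = ["igneous", "metamorphic", "igneous", "sedimentary"]
codes = [index_of[s] for s in samples]  # [1, 3, 1, 2]

# Equality comparison is meaningful on the codes.
same_type = codes[0] == codes[2]  # True

# The codes can also be collected into sets and combined.
site_a = {index_of["igneous"], index_of["sedimentary"]}
site_b = {index_of["sedimentary"], index_of["metamorphic"]}
shared = site_a & site_b  # {2}
```

Ordering or arithmetic on the codes would only reflect the arbitrary indexing, which is why the sketch restricts itself to equality and set operations.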

The central tendency of a set of categorical values can be described by its mode; neither the mean nor the median can be defined.
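The mode needs only counting, so it can be computed directly on the labels. The sketch below uses Python's standard `statistics` module on hypothetical data.

```python
from statistics import mode, multimode

# Hypothetical sample of blood types.
blood_types = ["O", "A", "O", "B", "A", "O", "AB"]

# The most frequent level.
most_common = mode(blood_types)  # "O"

# When several levels tie, multimode returns all of them.
tied = multimode(["A", "B", "A", "B", "O"])  # ["A", "B"]
```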

Number of possible values

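Categorical random variables are described statistically by a categorical distribution, which expresses an arbitrary K-way categorical variable by a separate probability for each of its K possible values (that is, a discrete probability distribution over K values). Such multi-valued categorical variables are often analyzed using the multinomial distribution. Regression analysis of categorical outcomes is carried out with multinomial logistic regression, multinomial probit, or a related discrete choice model.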
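A categorical distribution can be sketched as one probability per level. The probabilities below are invented for the example and the seed is fixed only for reproducibility. Counting the outcomes of n independent draws gives a count vector that follows a multinomial distribution.

```python
import random
from collections import Counter

levels = ["A", "B", "AB", "O"]
# Hypothetical probabilities, one per level; they must sum to 1.
probs = [0.3, 0.1, 0.05, 0.55]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Draw n = 1000 independent values from the categorical distribution.
draws = rng.choices(levels, weights=probs, k=1000)

# Per-level counts of the draws (a multinomial count vector).
counts = Counter(draws)
```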

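A categorical variable may also have only two possible outcomes, in which case it is known as a binary variable or Bernoulli variable. Because of its importance, this case is often treated separately, with its own distribution (the Bernoulli distribution) and its own regression models (logistic regression, probit regression, and so on). As a result, the term "categorical variable" is often reserved for variables with three or more outcomes, sometimes called multi-way variables.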
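The two-outcome case is simply a categorical distribution with K = 2, which the following sketch makes explicit. The function name `bernoulli_pmf` and the value of `p` are assumptions for illustration.

```python
def bernoulli_pmf(p: float, x: int) -> float:
    """P(X = x) for x in {0, 1}, with success probability p."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable takes only the values 0 and 1")
    return p if x == 1 else 1.0 - p


# Hypothetical success probability.
p = 0.25

# The same distribution written as a two-level categorical distribution.
two_level = {1: p, 0: 1.0 - p}
```

Both forms assign the same probability to each outcome; the Bernoulli form needs only the single parameter p because the second probability is determined by the first.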

References

[1] Yates, Daniel S.; Moore, David S.; Starnes, Daren S. The Practice of Statistics. 2nd ed. New York: Freeman. 2003. ISBN 978-0-7167-4773-4. (Archived from the original on 2005-02-09; retrieved 2014-09-28.)

Further reading

Andersen, Erling B. Discrete Statistical Models with Social Science Applications. North Holland, 1980.
Bishop, Y. M. M.; Fienberg, S. E.; Holland, P. W. Discrete Multivariate Analysis: Theory and Practice. MIT Press. 1975. ISBN 978-0-262-02113-5. MR 0381130.
Christensen, Ronald. Log-linear models and logistic regression. Springer Texts in Statistics. 2nd ed. New York: Springer-Verlag. 1997: xvi+483. ISBN 0-387-98247-7. MR 1633357.
Friendly, Michael. Visualizing categorical data. SAS Institute, 2000.
Lauritzen, Steffen L. Lectures on Contingency Tables (PDF). Updated electronic version of the (University of Aalborg) 3rd (1989) edition. 2002 [1979]. (Archived (PDF) from the original on 2020-04-30; retrieved 2020-11-20.)
NIST/SEMATECH (2008) Handbook of Statistical Methods.
