Publication:
Outliers in Binary Choice Models

Publication Date
1995-05
Publisher
Facultad de Ciencias Económicas y Empresariales. Instituto Complutense de Análisis Económico (ICAE)
Abstract
This paper focuses on the problem of outliers in binary choice models. It is shown that identifying outliers as observations with a residual close to one in absolute value may be misleading, and that outlier detection procedures should rely on measures of influence on the fit. Two scalar measures are derived to evaluate the influence of each observation, as well as the influence of a group of observations, on i) the vector of estimated parameters and ii) the vector of estimated probabilities. In addition, the method proposed by Peña and Yohai (1995) to treat the problem of masking in linear models is generalized to the case of binary choice models. A small Monte Carlo study analyzes the performance of all measures, and an empirical application presents a diagnostic strategy for detecting outliers.
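The contrast drawn in the abstract between large residuals and influence on the fit can be illustrated with a minimal sketch. The code below is not the paper's method: it assumes a logit model fitted by Newton-Raphson on simulated data, and it uses two generic case-deletion quantities chosen here for illustration (a quadratic form in the change of the parameter vector, and the squared distance between fitted probability vectors). The names `fit_logit` and `deletion_influence` are hypothetical.

```python
import numpy as np


def fit_logit(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logit fit by Newton-Raphson; returns the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Newton step: (X' W X)^{-1} X' (y - p)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def deletion_influence(X, y):
    """Pearson residuals plus two illustrative single-case deletion measures.

    d_beta[i]: change in the coefficient vector when case i is dropped,
               weighted by the full-sample information matrix (assumed form).
    d_prob[i]: squared distance between the fitted probability vectors
               with and without case i (assumed form).
    """
    n = len(y)
    beta = fit_logit(X, y)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ ((p * (1.0 - p))[:, None] * X)
    pearson = (y - p) / np.sqrt(p * (1.0 - p))
    d_beta = np.empty(n)
    d_prob = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = fit_logit(X[keep], y[keep])  # exact refit without case i
        diff = beta - beta_i
        d_beta[i] = diff @ info @ diff
        p_i = 1.0 / (1.0 + np.exp(-X @ beta_i))
        d_prob[i] = np.sum((p - p_i) ** 2)
    return pearson, d_beta, d_prob


# Simulated data (an assumption for illustration, not the paper's data set).
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([np.ones_like(x), x])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x)))).astype(float)

pearson, d_beta, d_prob = deletion_influence(X, y)
print("largest |Pearson residual| at case", int(np.argmax(np.abs(pearson))))
print("largest parameter influence at case", int(np.argmax(d_beta)))
print("largest probability influence at case", int(np.argmax(d_prob)))
```

Comparing the three printed indices shows whether the case with the largest residual is also the one that moves the fit most; the two need not coincide. Exact refitting is used for clarity, although one-step approximations are usually cheaper for large samples.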
Citation
Amemiya, T. (1981), "Qualitative Response Models: A Survey", Journal of Economic Literature, XIX, 1483-1536.
Amemiya, T. (1985), Advanced Econometrics, Oxford: Basil Blackwell Ltd.
Bedrick, E. J. and Hill, J. R. (1990), "Outlier Tests for Logistic Regression, a Conditional Approach", Biometrika, 77, 4, 815-827.
Belsley, D. A., Kuh, E. and Welsch, R. E. (1981), Regression Diagnostics. Identifying Influential Data and Sources of Collinearity, New York: John Wiley & Sons.
Box, G. E. P. and Tiao, G. C. (1968), "A Bayesian Approach to some Outlier Problems", Biometrika, 55, 1, 119-129.
Cook, R. D. (1977), "Detection of Influential Observation in Linear Regression", Technometrics, 19, 1, 15-18.
Cook, R. D. and Weisberg, S. (1982), Residuals and Influence in Regression, New York: Chapman and Hall.
Copas, J. B. (1988), "Binary Regression Models for Contaminated Data", Journal of the Royal Statistical Society, B, 50, 2, 225-265.
Dhillon, U. S., Shilling, J. D. and Sirmans, C. F. (1987), "Choosing between Fixed and Adjustable Rate Mortgages", Journal of Money, Credit and Banking, 19, 1, 260-267.
Guttman, I. (1973), "Premium and Protection of Several Procedures for Dealing with Outliers when Sample Sizes are Moderate to Large", Technometrics, 15, 385-404.
Jennings, D. E. (1986), "Outliers and Residual Distributions in Logistic Regression", Journal of the American Statistical Association, 81, 396, 987-990.
McCullagh, P. and Nelder, J. A. (1983), Generalized Linear Models, London: Chapman and Hall.
McFadden, D. L. (1983), "Econometric Analysis of Qualitative Response Models", Handbook of Econometrics, Griliches, Z. and Intriligator, M. eds., 2, 24, 1396-1457.
Nelder, J. A. and Wedderburn, R. W. M. (1972), "Generalized Linear Models", Journal of the Royal Statistical Society, A, 135, 370-384.
Peña, D. and Ruiz-Castillo, J. (1984), "Robust Methods of Building Regression Models - An Application to the Housing Sector", Journal of Business and Economic Statistics, 2, 1, 10-20.
Peña, D. and Yohai, V. J. (1995), "The Detection of Influential Subsets in Linear Regression using an Influence Matrix", Journal of the Royal Statistical Society, B, 57, 2.
Pierce, D. A. and Schafer, D. V. (1986), "Residuals in Generalized Linear Models", Journal of the American Statistical Association, 81, 977-986.
Pregibon, D. (1981), "Logistic Regression Diagnostics", The Annals of Statistics, 9, 4, 705-724.
Rao, C. R. (1973), Linear Statistical Inference and its Applications, 2nd ed., John Wiley.
Williams, D. A. (1987), "Generalized Linear Model Diagnostics: The Deviance and Single Case Deletion", Applied Statistics, 36, 2, 181-191.