Estimation and Application of Quantile Sample Selection Models with Endogenous Variables
Nonrandom sample selection is a pervasive problem in empirical research. Typically, unobserved factors affect both an individual's choice decision and the resulting outcome, for example the decision to participate in the labor market and the wage received after participating, and this leads to selection bias when the model is estimated. In studies of wages and employment, wages are observed only for employed individuals, and entering the labor market is a self-selected decision, so conventional measures of the returns to education or of wage inequality may be biased. In addition to sample selection, endogeneity of explanatory variables caused by simultaneous equations, measurement error, or omitted variables is also common, and the two problems often occur together in empirical research in economics. In the wage example, the decision to enter the labor market creates a sample selection problem, while education, an explanatory variable in the outcome equation, is usually regarded as endogenous because unobserved ability in the outcome equation also affects education.

In this study, we propose an estimation method for a quantile sample selection model with continuous endogenous explanatory variables. We handle endogeneity with a control function approach: an additional unknown control function is included in the outcome equation to control for the correlation between the endogenous explanatory variables and the error term. The control function method thus transforms a linear quantile regression model with endogenous variables into a partially linear model in which the regressors can be treated as exogenous. We then approximate the nonlinear part with a series of basis functions, so that the partially linear quantile regression model can be estimated easily by minimizing a convex function. Compared with the IQR method, the control function method avoids a nonconvex optimization problem. Because it also avoids a grid search over the coefficients of the endogenous explanatory variables, it remains easy to compute as the number of endogenous explanatory variables increases. We correct for sample selection by modeling the joint distribution of the error terms in the outcome and selection equations with a bivariate copula. In practice, the copula can be specified to depend on a low-dimensional parameter vector.

The estimation algorithm for the quantile sample selection model with continuous endogenous explanatory variables consists of four steps. The first two steps estimate the control function and the selection equation. In the third step, for given copula parameters, the quantile parameters are estimated using the basis function approximation and "rotated" quantile regression, and the nonlinear part is estimated using local quantile regression. In the fourth step, the copula parameters are estimated by the method of moments.

We apply the proposed method to estimate the returns to education for married women using data from CHIPS 2013. The results show that the returns to education for married women range from 5.6% to 17.8% and decrease as the quantile level increases (where the quantile level can be interpreted as gains in unobservable individual ability). Additional years of education nevertheless have a significant positive effect on women's earnings, although at some quantile levels the estimates are positive but not statistically significant. The results are robust to the choice of basis functions and copula functions, which reflects the practical value and robustness of the proposed model.
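To make the "rotated" quantile regression of the third step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual copula-based selection setup in which an individual is selected when the selection error falls below the propensity score p, so that the quantile level tau in the selected sample is replaced by the rotated level G(tau, p) = C(tau, p) / p. The Frank copula is chosen here only because it has a closed form; the function names (`frank_copula`, `rotated_level`, `check_loss`, `rotated_objective`) and the linear index are illustrative assumptions, and the control function and basis terms are taken to be already included in `x`.

```python
import math


def frank_copula(u, v, theta):
    """Frank copula C(u, v; theta); theta near 0 is treated as independence."""
    if abs(theta) < 1e-8:
        return u * v
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def rotated_level(tau, p, theta):
    """Rotated quantile level G(tau, p; theta) = C(tau, p; theta) / p."""
    return frank_copula(tau, p, theta) / p


def check_loss(residual, level):
    """Quantile check function: residual * (level - 1{residual < 0})."""
    return residual * (level - (1.0 if residual < 0 else 0.0))


def rotated_objective(beta, y, x, p, tau, theta):
    """Average check loss over selected observations.

    Each observation i uses its own rotated level G(tau, p_i; theta),
    where p_i is the (hypothetical) estimated propensity score and
    x[i] holds the regressors, including any control function terms.
    """
    total = 0.0
    for yi, xi, pi in zip(y, x, p):
        level = rotated_level(tau, pi, theta)
        fitted = sum(b * xij for b, xij in zip(beta, xi))
        total += check_loss(yi - fitted, level)
    return total / len(y)
```

For a fixed copula parameter the objective is convex in `beta`, because each term is a check function evaluated at a linear index; this is what allows the third step to be solved with standard quantile regression tools. With `theta = 0` the rotated level equals `tau` and the objective reduces to ordinary quantile regression on the selected sample.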