This paper provides a novel mechanism for identifying and estimating latent group structures in panel data using penalized regression techniques. We focus on linear models where the slope parameters are heterogeneous across groups but homogeneous within a group, and where group membership is unknown. Two approaches are considered: penalized least squares (PLS) for models without endogenous regressors, and penalized GMM (PGMM) for models with endogeneity. In both cases we develop a new variant of Lasso called classifier-Lasso (C-Lasso) that serves to shrink individual coefficients to the unknown group-specific coefficients. C-Lasso achieves simultaneous classification and consistent estimation in a single step, and the classification exhibits the desirable property of uniform consistency. For PLS estimation, C-Lasso also achieves the oracle property, so that group-specific parameter estimators are asymptotically equivalent to infeasible estimators that use individual group identity information. For PGMM estimation, the oracle property of C-Lasso is preserved in some special cases. Simulations demonstrate good finite-sample performance of the approach in both classification and estimation. An empirical application investigating the determinants of cross-country savings rates finds two latent groups among 56 countries, providing empirical confirmation that higher savings rates go hand in hand with higher income growth.
Keywords: Classification, Cluster analysis, Convergence club, Dynamic panel, Group Lasso, High dimensionality, Oracle property, Panel structure model, Parameter heterogeneity, Penalized least squares, Penalized GMM
JEL Classification: C33, C36, C38, C51