Randomization Inference Tests for Shift-Share Designs [1]
[1] We would like to thank Peter Hull for excellent comments and suggestions.
First draft: June 1st, 2022
Abstract
We consider the problem of inference in shift-share research designs. The choice between existing approaches that allow for unrestricted spatial correlation involves tradeoffs, varying in terms of their validity when there are relatively few or concentrated shocks, and in terms of the assumptions on the shock assignment process and treatment effects heterogeneity. We propose alternative randomization inference methods that combine the advantages of different approaches. These methods are valid in finite samples under relatively stronger assumptions, while asymptotically valid under weaker assumptions.
Keywords: shift-share designs; inference; spatial correlation.
JEL Codes: C18, C21, C26.
1 Introduction
Shift-share research designs consider instrumental variables that are constructed from a common set of shocks that affect regions differentially, depending on each region's exposure to those shocks. Prominent examples of papers that use this methodology include Bartik (1991), Blanchard and Katz (1992), Card (2001), and Autor et al. (2013).
Adão et al. (2019) (henceforth, AKM) show that usual standard error formulas may substantially understate the true variability of the shift-share estimator if regions with similar exposures to the sector-level shocks also have correlated errors. AKM and Borusyak et al. (2021) (henceforth, BHJ) propose alternative estimators for the asymptotic variance of the shift-share estimator that are valid under arbitrary cross-regional correlation in the regression residuals. These methods rely on an asymptotic theory in which there is a large number of sectors and the relevance of each sector becomes asymptotically negligible. While these methods provide reliable inference in many applications, they may lead to large over-rejection when such asymptotic theory does not provide a reasonable approximation to the empirical setting (Ferman, 2019). Borusyak and Hull (2021) (henceforth, BH) propose another alternative, based on the ideas of randomization inference (RI), that is valid even in finite samples. However, their approach relies on assumptions on the shock assignment mechanism (for example, knowledge of the distribution of the shocks, or that shocks are iid), which may be relatively harder to justify in some settings. Given the advantages and disadvantages of each approach, BH state that “the choice between RI and asymptotic approaches involves tradeoffs.”
In this paper, we consider alternative inference methods based on RI that combine the advantages of the asymptotic methods proposed by AKM and BHJ, and of the RI method proposed by BH. The inference methods we propose are valid in finite samples under relatively stronger assumptions, including homogeneous treatment effects and correct specification of the distribution of the shocks up to a scale parameter. We also consider alternatives that rely on the assumptions that the distribution of shocks is symmetric around a known mean or that shocks are iid, instead of assuming correct specification of the distribution of the shocks. At the same time, these inference methods are also asymptotically valid when the number of sectors increases under weaker assumptions on the treatment effects heterogeneity, and even when the distribution of the shocks is misspecified. [5] Therefore, we provide inference methods for shift-share designs that are valid in finite samples under relatively stronger assumptions, and that remain valid under weaker assumptions once the number of sectors increases. This eliminates the tradeoffs between RI and asymptotic approaches mentioned by BH.
[5] The RI tests we consider will be asymptotically conservative whenever the conditions stated by AKM in their Appendix A.1.6 hold. Those conditions limit the correlation between treatment effect heterogeneity and exposure weights.
Our approaches build on a large literature that studies the use of RI methods in other settings, and considers RI with studentized test statistics that are valid under stronger assumptions (or, alternatively, for inference on sharper null hypotheses) in finite samples, and also asymptotically valid under weaker assumptions (or, alternatively, for inference on less stringent null hypotheses). See, for example, Janssen (1997), Chapter 15 of Lehmann and Romano (2005), Chung and Romano (2013), Bugni et al. (2018), Wu and Ding (2020), and Ferman (2021).
2 Main results
2.1 Setting
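For an outcome of interest $Y$, we consider the structural model
$$Y(x) = \beta x + \epsilon, \tag{1}$$
where $x$ denotes a treatment of interest, and $\epsilon$ are the remaining determinants of $Y$. We consider for simplicity the case without a constant and without other covariates. However, all our results remain valid for a more general setting. For a sample of $N$ units, observed outcomes are given by
$$Y_i = \beta X_i + \epsilon_i, \quad i = 1, \dots, N, \tag{2}$$
where $(X_i, \epsilon_i)$ are random variables. We have access to a shift-share instrument constructed as
$$Z_i = \sum_{s=1}^{S} w_{is}\, \mathcal{X}_s, \tag{3}$$
where $\{w_{is}\}_{s=1}^{S}$ are a set of exposures of unit $i$ and $\{\mathcal{X}_s\}_{s=1}^{S}$ are a set of sector shocks. Let $W = (w_{is})_{i,s}$ denote the $N \times S$ matrix of exposures and $\epsilon = (\epsilon_1, \dots, \epsilon_N)'$. We consider the following assumption on the distribution of the shocks, which is standard in the shift-share design literature (see AKM and BHJ).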
Assumption 1 (Shock exogeneity)
$\mathbb{E}[\mathcal{X}_s \mid W, \epsilon] = 0$ for every $s = 1, \dots, S$.
Assumption 1 imposes that shocks are mean-independent from unobserved determinants, conditional on exposures, with a common mean. More generally, we could have assumed that there exists $\mu$ such that $\mathbb{E}[\mathcal{X}_s \mid W, \epsilon] = \mu$ for every $s$. In this case, the researcher may conduct inference by working with statistics that depend on demeaned shocks (e.g., the shift-share estimator constructed with demeaned shocks). This is the solution proposed by BH to inference in linear shift-share designs where the sum of exposures may be uneven across units. [6] Another alternative would be to control for the sum of the exposures, as proposed by BHJ.
[6] Our results remain valid in such a setting. Specifically, valid finite-sample inference under correct specification of the shock assignment mechanism (Proposition 1) would solely require that shocks are correctly specified up to a common location shift.
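In our setting, the shift-share estimator is given by
$$\hat{\beta} = \frac{\sum_{i=1}^{N} Z_i Y_i}{\sum_{i=1}^{N} Z_i X_i} = \frac{Z'Y}{Z'X}, \tag{4}$$
where $Z = (Z_1, \dots, Z_N)'$, $Y = (Y_1, \dots, Y_N)'$ and $X = (X_1, \dots, X_N)'$. All our results remain valid if we consider the reduced-form case, in which case we set $X_i = Z_i$. Also, some of the results we present in Section 2.2 are only valid for the reduced-form case.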
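As a minimal sketch of the construction above, the following Python snippet builds a shift-share instrument as an exposure-weighted sum of sector shocks and computes the no-constant, no-controls IV estimator as a ratio of cross products. The function names, the simulated exposures (rows normalised to sum to one), and the parameter values are illustrative assumptions, not part of the paper.

```python
import random

def shift_share_instrument(W, shocks):
    """Instrument for each unit: exposure-weighted sum of the sector shocks."""
    return [sum(w * x for w, x in zip(row, shocks)) for row in W]

def shift_share_iv(Y, X, Z):
    """IV estimator without constant or controls: sum(Z*Y) / sum(Z*X)."""
    num = sum(z * y for z, y in zip(Z, Y))
    den = sum(z * x for z, x in zip(Z, X))
    return num / den

rng = random.Random(0)
N, S, beta = 200, 10, 2.0          # hypothetical sample sizes and effect

# Hypothetical exposures: each row is normalised to sum to one.
W = []
for _ in range(N):
    raw = [rng.random() for _ in range(S)]
    total = sum(raw)
    W.append([r / total for r in raw])

shocks = [rng.gauss(0.0, 1.0) for _ in range(S)]   # mean-zero sector shocks
Z = shift_share_instrument(W, shocks)

# Reduced-form case: the treatment is the instrument itself (X = Z).
eps = [rng.gauss(0.0, 0.1) for _ in range(N)]
X = Z
Y = [beta * x + e for x, e in zip(X, eps)]

beta_hat = shift_share_iv(Y, X, Z)
print(round(beta_hat, 3))
```

With noiseless outcomes the estimator recovers the coefficient exactly, which is a quick sanity check on the implementation.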
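AKM and BHJ show that, when shocks are independent and the importance of each sector becomes asymptotically negligible, then, as $S \to \infty$,
$$\mathbb{P}\left[ \frac{\sum_{i=1}^{N} Z_i \epsilon_i}{\sqrt{\sum_{s=1}^{S} \sigma_s^2 \left( \sum_{i=1}^{N} w_{is} \epsilon_i \right)^2}} \le c \;\middle|\; W, \epsilon \right] \to \Phi(c) \tag{5}$$
for every $c \in \mathbb{R}$, where $\sigma_s^2 = \mathbb{V}[\mathcal{X}_s \mid W, \epsilon]$ and $\Phi$ is the standard normal distribution function. This representation motivates the variance estimators proposed by AKM and BHJ. While their approaches provide reliable inference in empirical applications with many sectors, there may be substantial size distortions when there are few or concentrated sectors.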
2.2 Randomization inference in shift-share designs
Given that the inference methods proposed by AKM and BHJ may not work well in some applications when there are few or concentrated sectors, we consider the use of RI in this setting. BH propose randomization-based inference in shock-based designs for a more general setting in which we have non-random exposure to exogenous shocks, where shift-share designs would be a particular example. Unlike BH, by focusing on shift-share design applications we are able to consider RI tests that are valid under relatively stronger assumptions in finite samples (similar to the approach proposed by BH), but that are also valid under weaker assumptions when the number of sectors increases. The main reason is that, in the setting we consider, the shift-share estimator has well-established asymptotic results (AKM, BHJ), so we are able to consider a studentized test statistic. In contrast, many of the settings considered by BH do not have well-established asymptotic results.
2.2.1 Finite-sample results
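Suppose our goal is to test the null that $H_0: \beta = \beta_0$ against either a unilateral or bilateral alternative. Let $T = T(Y, X, W, \mathcal{X})$ be a test statistic, where large values of $T$ constitute evidence against the null. We assume the following condition on the test statistic.
Assumption 2
Under the null, the map $T$ satisfies $T(Y, X, W, \mathcal{X}) = \tilde{T}(Y - \beta_0 X, W, \mathcal{X})$ for some other map $\tilde{T}$.
That is, under the null, the test statistic depends on $Y$ and $X$ solely through the “null-imposed residuals” $\epsilon = Y - \beta_0 X$.
Suppose now that the researcher has a guess on the shock assignment mechanism, i.e., on the conditional probabilities $\mathbb{P}[\mathcal{X} \le x \mid W, \epsilon]$ for each $x \in \mathbb{R}^S$. Let us denote such guess by a conditional distribution function $G(\cdot \mid W, \epsilon)$, which specifies, for each value in the support of $(W, \epsilon)$, a cumulative distribution function on $\mathbb{R}^S$. In this case, the researcher is able to compute critical values by analysing the quantiles of
$$F(c \mid W, \epsilon) = \int 1\{\tilde{T}(\epsilon, W, x) \le c\}\, G(dx \mid W, \epsilon). \tag{6}$$
Such quantity is easily estimable by simulation. Indeed, if we are able to draw independent draws $\mathcal{X}^{(b)}$, $b = 1, \dots, B$, from $G(\cdot \mid W, \epsilon)$, then $F(\cdot \mid W, \epsilon)$ may be estimated as
$$\hat{F}_B(c) = \frac{1}{B} \sum_{b=1}^{B} 1\{\tilde{T}(\epsilon, W, \mathcal{X}^{(b)}) \le c\}. \tag{7}$$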
Clearly, if the shock-assignment process is correctly specified, then the procedure above provides valid inference.
Proposition 1
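Suppose that Assumption 2 holds, and that the guess $G(\cdot \mid W, \epsilon)$ coincides with the true conditional distribution of the shocks $\mathcal{X}$ given $(W, \epsilon)$. Then, under the null $H_0: \beta = \beta_0$,
$$\mathbb{P}[T \le c \mid W, \epsilon] = F(c \mid W, \epsilon) \quad \text{for every } c \in \mathbb{R}.$$
Consequently, a test that rejects the null if $T$ exceeds the $(1-\alpha)$ quantile of the distribution $F(\cdot \mid W, \epsilon)$ in (6) is (conditional on $(W, \epsilon)$) level $\alpha$, where $\alpha \in (0, 1)$. Similarly, a test that rejects the null if $T$ exceeds the $(1-\alpha)$ quantile of its simulation estimate $\hat{F}_B$ in (7) is conditionally level $1 - \lceil (1-\alpha) B \rceil / (B+1)$.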
The result presented above is similar to Proposition S.3 from BH, with two minor differences. First, we allow the shock assignment mechanism to depend on the non-observables. This way, we allow for the distribution of $\mathcal{X}$ to depend on $\epsilon$ (though in practice we expect that applied researchers would rarely choose a $G$ that depends on $\epsilon$). Second, we provide level guarantees with a finite number of simulations. We present details of the proof in Appendix A.1.1.
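The simulation-based critical value can be sketched in a few lines of Python. The sketch below takes the $(1-\alpha)$ quantile of the empirical distribution of the simulated statistics, and checks by Monte Carlo that, when the observed and simulated statistics come from the same continuous distribution, the rejection rate is close to $1 - k/(B+1)$ with $k = \lceil (1-\alpha)B \rceil$. That closed form is our reading of the order-statistic argument in Appendix A.1.1; the function names and parameter values are illustrative assumptions.

```python
import math
import random

def ri_critical_value(sim_stats, alpha):
    """(1 - alpha) quantile of the empirical distribution of simulated statistics."""
    B = len(sim_stats)
    k = max(1, math.ceil((1.0 - alpha) * B))   # rank of the quantile among B draws
    return sorted(sim_stats)[k - 1]

def finite_b_level(alpha, B):
    """Bound 1 - k/(B+1) on the rejection probability, with k = ceil((1-alpha)*B)."""
    k = max(1, math.ceil((1.0 - alpha) * B))
    return 1.0 - k / (B + 1)

# Monte Carlo check: observed and simulated statistics share one continuous
# distribution, as they would under a correctly specified shock distribution.
rng = random.Random(1)
alpha, B, reps = 0.10, 19, 4000
rejections = 0
for _ in range(reps):
    observed = rng.gauss(0.0, 1.0)
    sims = [rng.gauss(0.0, 1.0) for _ in range(B)]
    rejections += observed > ri_critical_value(sims, alpha)
rejection_rate = rejections / reps

print(rejection_rate, finite_b_level(alpha, B))
```

Choosing $B$ so that $\alpha (B+1)$ is an integer (for example $B = 19$ or $B = 999$) makes the bound coincide with the nominal level.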
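In their paper, BH consider basing inference on the test statistic
$$T^{\mathrm{BH}} = \frac{1}{N} \sum_{i=1}^{N} Z_i (Y_i - \beta_0 X_i), \tag{8}$$
which depends on $(Y, X)$ solely through $Y - \beta_0 X$, so Assumption 2 holds.
In contrast, we consider the test statistic
$$t_0 = \frac{\hat{\beta} - \beta_0}{\sqrt{\hat{V}_0}}, \tag{9}$$
where $\hat{V}_0$ is the null-imposed variance estimator from AKM and BHJ. [7]
[7] In a setting without an intercept or controls, BHJ show their variance estimator collapses to AKM’s.
Observe that, under the null (and taking the first-stage term $\sum_i Z_i X_i$ to be positive), the above map satisfies
$$t_0 = \frac{\sum_{i=1}^{N} Z_i \epsilon_i}{\sqrt{\sum_{s=1}^{S} \mathcal{X}_s^2 \left( \sum_{i=1}^{N} w_{is} \epsilon_i \right)^2}}, \tag{10}$$
so this test statistic also satisfies Assumption 2. This corresponds to a rescaled version of the test statistic in BH.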
Inference may be conducted as follows:
Algorithm 1
1. For simulations $b = 1, \dots, B$:
   (a) Draw $\mathcal{X}^{(b)} \sim G(\cdot \mid W, \epsilon)$.
   (b) Construct simulated instruments, $Z_i^{(b)} = \sum_{s=1}^{S} w_{is} \mathcal{X}_s^{(b)}$.
   (c) Run the shift-share IV estimator and null-imposed standard errors using the original data $(Y, X)$ and with artificial instruments $Z^{(b)}$. Construct the $t$-test based on the obtained values.
2. Reject the null if the observed test statistic is at the tails of the simulated distribution.
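The steps of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the null-imposed standard error uses our reading of the AKM formula in the case without intercept or controls, the test is two-sided (absolute values are compared, so the sign of the first stage is irrelevant), and `draw_shocks`, the data-generating process and all parameter values are hypothetical.

```python
import math
import random

def instrument(W, shocks):
    return [sum(w * x for w, x in zip(row, shocks)) for row in W]

def null_imposed_t(Y, X, W, shocks, beta0):
    """t-statistic with null-imposed, AKM-type standard errors (no controls)."""
    e0 = [y - beta0 * x for y, x in zip(Y, X)]          # null-imposed residuals
    R = [sum(W[i][s] * e0[i] for i in range(len(e0))) for s in range(len(shocks))]
    num = sum(x * r for x, r in zip(shocks, R))          # equals sum_i Z_i * e0_i
    den = math.sqrt(sum((x * r) ** 2 for x, r in zip(shocks, R)))
    return num / den

def algorithm_1(Y, X, W, shocks, beta0, draw_shocks, B, alpha, rng):
    t_obs = null_imposed_t(Y, X, W, shocks, beta0)
    # Steps 1(a)-(c): redraw shocks, rebuild the instrument, recompute the statistic.
    sims = sorted(abs(null_imposed_t(Y, X, W, draw_shocks(rng), beta0))
                  for _ in range(B))
    crit = sims[max(1, math.ceil((1.0 - alpha) * B)) - 1]
    return t_obs, crit, abs(t_obs) > crit                # step 2: two-sided rejection

rng = random.Random(2)
N, S, beta = 100, 20, 1.0
W = []
for _ in range(N):
    raw = [rng.random() for _ in range(S)]
    total = sum(raw)
    W.append([r / total for r in raw])
shocks = [rng.gauss(0.0, 1.0) for _ in range(S)]
Z = instrument(W, shocks)
eps = [rng.gauss(0.0, 1.0) for _ in range(N)]
X = [z + 0.5 * e + rng.gauss(0.0, 1.0) for z, e in zip(Z, eps)]  # endogenous treatment
Y = [beta * x + e for x, e in zip(X, eps)]

def draw_normal(r):
    """Hypothetical guess for the shock distribution: iid standard normal."""
    return [r.gauss(0.0, 1.0) for _ in range(S)]

t_obs, crit, reject = algorithm_1(Y, X, W, shocks, beta, draw_normal, 199, 0.05, rng)
print(round(t_obs, 3), round(crit, 3), reject)
```

Because the statistic depends on the data only through the null-imposed residuals, the residuals are computed from the original `Y` and `X` in every simulation and only the shocks change.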
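We adopt null-imposed standard errors in the test statistic in (9), because otherwise it would depend on the endogenous regressor $X$, and we do not model the assignment of $X$. If, however, $X = Z$, this problem disappears, and we may consider the test statistic
$$t = \frac{\hat{\beta} - \beta_0}{\sqrt{\hat{V}}}, \tag{11}$$
where $\sqrt{\hat{V}}$ is the AKM or BHJ standard error without imposing the null. Under the null, such statistic may be written as
$$t = \frac{\sum_{i=1}^{N} Z_i \epsilon_i}{\sqrt{\sum_{s=1}^{S} \mathcal{X}_s^2 \left( \sum_{i=1}^{N} w_{is} \hat{\epsilon}_i \right)^2}}, \qquad \hat{\epsilon}_i = \epsilon_i - Z_i \frac{\sum_{j=1}^{N} Z_j \epsilon_j}{\sum_{j=1}^{N} Z_j^2}. \tag{12}$$
Therefore, when $X = Z$, this test statistic satisfies Assumption 2. Note, however, that this assumption would not be satisfied for this test statistic if we considered the case in which $X \neq Z$.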
In this case, inference may be conducted as follows:
Algorithm 2
1. For a given $\beta_0$, compute $\epsilon = Y - \beta_0 Z$.
2. For simulations $b = 1, \dots, B$:
   (a) Draw $\mathcal{X}^{(b)} \sim G(\cdot \mid W, \epsilon)$.
   (b) Construct data $Z_i^{(b)} = \sum_{s=1}^{S} w_{is} \mathcal{X}_s^{(b)}$ and $Y_i^{(b)} = \beta_0 Z_i^{(b)} + \epsilon_i$.
   (c) Run the shift-share regression and shock-robust standard errors using the artificial data $Y^{(b)}$, $Z^{(b)}$ and $W$. Construct the $t$-test based on the obtained values.
3. Reject the null if the observed test statistic is at the tails of the simulated distribution.
Following Proposition 1, this inference procedure would be valid for settings in which $X = Z$.
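A minimal Python sketch of Algorithm 2 for the reduced-form case follows. It is an illustration under assumptions: the standard error without the null imposed follows our reading of the AKM formula without intercept or controls, the test is two-sided, and the shock generator and simulated data are hypothetical.

```python
import math
import random

def instrument(W, shocks):
    return [sum(w * x for w, x in zip(row, shocks)) for row in W]

def shock_robust_t(Y, Z, W, shocks, beta0):
    """Reduced-form t-statistic with AKM-type standard errors, null not imposed."""
    zz = sum(z * z for z in Z)
    beta_hat = sum(z * y for z, y in zip(Z, Y)) / zz
    resid = [y - beta_hat * z for y, z in zip(Y, Z)]     # estimated residuals
    R = [sum(W[i][s] * resid[i] for i in range(len(Z))) for s in range(len(shocks))]
    se = math.sqrt(sum((x * r) ** 2 for x, r in zip(shocks, R))) / zz
    return (beta_hat - beta0) / se

def algorithm_2(Y, W, shocks, beta0, draw_shocks, B, alpha, rng):
    Z = instrument(W, shocks)
    t_obs = shock_robust_t(Y, Z, W, shocks, beta0)
    eps0 = [y - beta0 * z for y, z in zip(Y, Z)]          # step 1
    sims = []
    for _ in range(B):                                     # step 2
        shocks_b = draw_shocks(rng)                        # (a) simulated shocks
        Z_b = instrument(W, shocks_b)                      # (b) artificial data
        Y_b = [beta0 * z + e for z, e in zip(Z_b, eps0)]
        sims.append(abs(shock_robust_t(Y_b, Z_b, W, shocks_b, beta0)))  # (c)
    sims.sort()
    crit = sims[max(1, math.ceil((1.0 - alpha) * B)) - 1]
    return t_obs, crit, abs(t_obs) > crit                  # step 3

rng = random.Random(5)
N, S, beta = 100, 20, 1.0
W = []
for _ in range(N):
    raw = [rng.random() for _ in range(S)]
    total = sum(raw)
    W.append([r / total for r in raw])
shocks = [rng.gauss(0.0, 1.0) for _ in range(S)]
Z = instrument(W, shocks)
Y = [beta * z + rng.gauss(0.0, 1.0) for z in Z]

def draw_normal(r):
    """Hypothetical guess for the shock distribution: iid standard normal."""
    return [r.gauss(0.0, 1.0) for _ in range(S)]

t_obs, crit, reject = algorithm_2(Y, W, shocks, beta, draw_normal, 199, 0.05, rng)
print(round(t_obs, 3), round(crit, 3), reject)
```

Unlike Algorithm 1, the outcome is regenerated in each simulation, so the estimator and residuals entering the standard error are recomputed with the simulated instrument.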
Remark 1 (Scale-invariance of the test statistics in (9) and (11))
We observe that, when inference is based on the test statistics in (9) or (11), the requirement in Proposition 1 that the distribution of shocks is correctly specified may be weakened to: the distribution of shocks is correctly specified, up to multiplication of $\mathcal{X}$ by a positive scalar. Indeed, the test statistics in (9) and (11) are invariant to multiplication of the shocks by a common positive constant. This contrasts with the test statistic in (8), which requires the researcher to correctly specify the scale of shocks.
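The scale-invariance property is easy to check numerically. The sketch below compares a non-studentized cross-product statistic with a studentized one (null-imposed, using our reading of the AKM-type formula without controls) when all shocks are multiplied by a common positive constant. The exposure matrix, residuals and shocks are made-up numbers used only for illustration.

```python
import math

def unstudentized(e0, W, shocks):
    """Average cross product of the instrument and the null-imposed residuals."""
    Z = [sum(w * x for w, x in zip(row, shocks)) for row in W]
    return sum(z * e for z, e in zip(Z, e0)) / len(e0)

def studentized(e0, W, shocks):
    """Same numerator, divided by a shock-level, null-imposed standard error."""
    R = [sum(W[i][s] * e0[i] for i in range(len(e0))) for s in range(len(shocks))]
    num = sum(x * r for x, r in zip(shocks, R))
    return num / math.sqrt(sum((x * r) ** 2 for x, r in zip(shocks, R)))

# Made-up exposures (rows sum to one), residuals and shocks.
W = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [0.0, 0.1, 0.9], [0.6, 0.0, 0.4]]
e0 = [0.3, -1.2, 0.7, 0.4]
shocks = [1.0, -0.5, 2.0]
scaled = [10.0 * x for x in shocks]     # same shocks, different scale

print(studentized(e0, W, shocks), studentized(e0, W, scaled))
print(unstudentized(e0, W, shocks), unstudentized(e0, W, scaled))
```

The studentized statistic is unchanged by the rescaling, while the non-studentized one is multiplied by the scaling constant, so only the latter requires the scale of the simulated shocks to match the truth.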
Remark 2 (Group transformations)
Instead of assuming that the shock-assignment mechanism is known, an alternative would be to consider a group $\mathcal{G}$ of transformations on $\mathbb{R}^S$ such that, under the null, for any $g \in \mathcal{G}$, $g\mathcal{X} \mid W, \epsilon \overset{d}{=} \mathcal{X} \mid W, \epsilon$. In these settings, it follows from well-established results on randomization tests (Lehmann and Romano, 2005, Theorem 15.21) that the procedure described in Proposition 1 remains valid if simulated shocks are constructed as $\mathcal{X}^{(b)} = g_b \mathcal{X}$, where $g_b \sim \text{Uniform}(\mathcal{G})$, independently from the data. For example, if, conditional on $(W, \epsilon)$, shocks were assumed independently drawn from symmetric distributions with known common symmetry point $\mu$, then one could take the group of transformations to be recentred sign changes, i.e. $g_\pi(x) = \mu 1_S + \pi \circ (x - \mu 1_S)$ for $\pi \in \{-1, 1\}^S$, where $\circ$ denotes entry-by-entry multiplication and $1_S$ is an $S$-dimensional vector of ones. [8] BH consider this kind of simulation in their Appendix D4. Similarly, if, conditional on $(W, \epsilon)$, shocks were assumed to be iid, then one could take the group to be the set of permutations of an $S$-dimensional vector, as also discussed by BH.
[8] If the symmetry point were estimated (for example, by using the sample mean as an estimator of $\mu$), then the simulation procedure would no longer retain finite-sample validity. In this case, conservative inference could be conducted by computing p-values under different choices of $\mu$, as $\mu$ varies over a valid confidence set, and then taking the supremum and adding one minus the confidence level of the confidence set to it (Berger and Boos, 1994). See Proposition S6 in BH for details.
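The two group transformations mentioned in this remark can be sketched as simple shock generators. The helper names below are hypothetical; the first reflects each shock around a known symmetry point with an independent random sign, and the second applies a uniformly drawn permutation.

```python
import random

def recentred_sign_change(shocks, mu, rng):
    """Reflect each shock around mu with an independent random sign."""
    return [mu + rng.choice((-1.0, 1.0)) * (x - mu) for x in shocks]

def permute_shocks(shocks, rng):
    """Uniformly random permutation of the observed shocks."""
    out = list(shocks)
    rng.shuffle(out)
    return out

rng = random.Random(6)
observed = [1.0, 4.0, -2.0, 0.5]
flipped = recentred_sign_change(observed, 1.0, rng)
shuffled = permute_shocks(observed, rng)
print(flipped, shuffled)
```

Either generator can be passed as the shock-drawing step of the simulation algorithms, replacing draws from a fully specified distribution.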
2.2.2 Asymptotic results
In this section, we consider the asymptotic properties of the simulation-based approach. We consider the properties of the tests in a framework where the number of sectors, $S$, is large. The number of units, $N$, is (implicitly) indexed by $S$, and is also allowed to grow.
We adopt a finite population perspective and allow for treatment effects to vary by unit. Formally, for a given , potential outcomes are given by
(13) |
and observed outcomes are given by
(14) |
whereas the instrument is given by . We will treat the , and as nonrandom throughout – the only source of randomness stems from the assignment of and the treatment . In other words, we follow a “design-based” approach. In this setting, the shift-share identification assumption is written as follows.
Assumption 3 (Shock-exogeneity)
.
We rewrite the outcome model as
(15) |
where
(16) |
and
(17) |
We consider the goal of the researcher to be to conduct inference on , an affine combination of individual treatment effects. Specifically, she would like to test the null that , and for that she uses one of the procedures described in the previous section.
Following AKM and BHJ, we put and assume . In the next proposition, we provide conditions for (conditional) asymptotic normality of , the test statistic constructed under simulated shocks .
Proposition 2 (Asymptotic normality of statistic)
Assume that, conditional on and , simulated shocks are drawn independently across from distributions (not necessarily identical) satisfying:
-
(i)
;
-
(ii)
, where ; and
-
(iii)
.
Then, the simulated distribution converges in distribution to a standard normal, in probability, i.e. , for every .
We present details of the proof in Appendix A.1.2. Proposition 2 provides high-level conditions for conditional asymptotic normality of the simulated statistic. In Appendix A.2.1, we show that these conditions are satisfied for three examples of simulation distributions: (i) when we sample with replacement from the (recentered) empirical distribution of shocks; (ii) when we consider iid $N(0,1)$ shocks, drawn independently from $\mathcal{X}$ and $X$; and (iii) when we consider sign changes of observed shocks. [9] The crucial point is that we consider a studentized test statistic, so the simulated test statistic is asymptotically standard normal. In contrast, if we considered alternative test statistics, such as the non-studentized statistic in (8), then we would not reach this conclusion. Studentizing the test statistic using robust standard errors would also generally not work.
[9] In the Appendix, we consider sign changes without recentering shocks ($\mu = 0$). We note, however, that convergence would hold for any choice of recentering parameter $\mu$, including the case in which it is misspecified, and the case in which $\mu$ is replaced by an estimator such as the sample mean of shocks, provided we work with the shift-share estimator that uses demeaned shocks (as per footnote 6).
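The role of studentization can be illustrated with a small simulation. The sketch below uses a hypothetical single-exposure design (each unit exposed to one distinct sector, so shock-level aggregates equal unit-level residuals), draws normal shocks with two different scales, and compares the second moment of a non-studentized statistic with that of the studentized one. The design, the residuals and the number of draws are assumptions made only for this illustration.

```python
import math
import random

rng = random.Random(3)
S = 200
# Hypothetical heteroskedastic residuals, held fixed across simulations.
e = [rng.gauss(0.0, 1.0) * (1.0 + 3.0 * rng.random()) for _ in range(S)]

def second_moment(values):
    return sum(v * v for v in values) / len(values)

def simulate(scale, draws=3000):
    """Second moments of the raw and studentized statistics under N(0, scale^2) shocks."""
    raw, stud = [], []
    for _ in range(draws):
        x = [rng.gauss(0.0, scale) for _ in range(S)]
        num = sum(xs * es for xs, es in zip(x, e))
        den = math.sqrt(sum((xs * es) ** 2 for xs, es in zip(x, e)))
        raw.append(num / math.sqrt(S))     # non-studentized, depends on the scale
        stud.append(num / den)             # studentized, scale-free
    return second_moment(raw), second_moment(stud)

raw_1, stud_1 = simulate(1.0)
raw_5, stud_5 = simulate(5.0)
print(round(raw_1, 2), round(raw_5, 2))    # differ by roughly a factor of 25
print(round(stud_1, 2), round(stud_5, 2))  # both close to one
```

The studentized statistic has second moment close to one regardless of the scale chosen for the simulated shocks, whereas the dispersion of the non-studentized statistic is driven by that choice.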
As a byproduct of Proposition 2, whenever inference based on and normal critical values provides asymptotically conservative inference, the simulation-based approach will also lead to asymptotically conservative inference. We summarize this fact in the corollary below.
Corollary 1
Suppose that, under the null , there exists such that, for every , . Assume that the conditions in Proposition 2 hold under the null. Then the simulation-based approach to inference of Algorithm 1 will be asymptotically conservative, in the sense that, under the null, the probability of rejecting the null converges to a number smaller than the nominal significance level.
When there is no treatment effect heterogeneity, it follows that, under the conditions in AKM and BHJ, the null-imposed $t$-statistic is asymptotically standard normal. These conditions include that shocks are independent, that the number of sectors increases, and that the relevance of each sector is asymptotically negligible (AKM and BHJ consider alternatives that relax the assumption that shocks are independent, and we discuss that in Remark 4). In this case, inference based on the simulation approach has asymptotic size equal to the nominal level $\alpha$. More generally, when there is treatment effect heterogeneity, AKM provide sufficient conditions for inference based on the null-imposed $t$-statistic and normal critical values to be conservative (i.e., for its asymptotic rejection probability under the null not to exceed the nominal level). These conditions limit the correlation between treatment effect heterogeneity and exposure weights. In this case, our simulation-based approach will also lead to conservative inference. Notice that, in contrast to our finite-sample results, which require homogeneous treatment effects, asymptotically our method may be able to provide conservative inference under treatment effect heterogeneity.
Remark 3
We note that the statement of Proposition 2 does not require the null to be true. Specifically, if the conditions in Proposition 2 can be shown to be valid under a given sequence of alternatives, [10] then it follows that the distribution of the simulated statistic converges to a standard normal along such sequence. In this case, the power of the null-imposed t-test and our simulation-based approach coincide asymptotically along this sequence.
[10] See Appendix A.2.1 for sufficient conditions in our three examples of simulation distributions.
Remark 4
Suppose that instead of assuming that shocks are independent, we consider that we have clusters of shocks that are independent, but that there may be correlation between shocks within the same cluster. In this case, our results from Proposition 2 and Corollary 1 should remain valid if we studentized the test statistic using AKM and BHJ standard errors with clusters of shocks (with the null imposed), provided the number of clusters is large. We may also consider using a distribution for the simulated shocks that allows for correlation within clusters.
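One way to simulate shocks that keeps the dependence within clusters intact is to apply a single random sign to every shock in a cluster. This is only an illustrative choice on our part, assuming mean-zero shocks; the remark above does not prescribe a particular distribution, and the function name and inputs below are hypothetical.

```python
import random

def cluster_sign_change(shocks, cluster_ids, rng):
    """Flip all shocks in a cluster together, with one random sign per cluster."""
    signs = {c: rng.choice((-1.0, 1.0)) for c in sorted(set(cluster_ids))}
    return [signs[c] * x for x, c in zip(shocks, cluster_ids)]

rng = random.Random(7)
shocks = [0.8, 1.1, -0.3, -0.9, 0.4, 0.2]
cluster_ids = ["a", "a", "b", "b", "b", "c"]   # hypothetical sector clusters
draw = cluster_sign_change(shocks, cluster_ids, rng)
print(draw)
```

Products of shocks within the same cluster are unchanged by the transformation, so within-cluster co-movement is preserved in every simulated draw.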
Next, we analyze the test statistic in (11). In this case, since the shift-share estimator is recomputed in each simulated sample and then used in the calculation of the standard error, we need to ensure that the simulated shift-share estimator from Algorithm 2 is consistent at a given rate. In addition to the assumptions in Proposition 2, we require a “strong simulated shock” assumption that ensures that the variance of the simulated shift-share regressor does not vanish asymptotically, as well as conditions that ensure that the estimation error of the standard error vanishes. We state these requirements in the proposition below:
Proposition 3
Suppose, in addition to the assumptions in Proposition 2, that: (i) . Then . Moreover, if we assume that:
-
(ii)
-
(iii)
we may then conclude that for every .
We present details of the proof in Appendix A.1.3. In Appendix A.2.2, we discuss assumptions (i)-(iii) of the proposition in the context of our three examples of simulation distributions.
Corollary 2
Suppose that, under the null , there exists such that, for every , . Assume that the conditions in Proposition 3 hold under the null. Then the simulation-based approach to inference provided by Algorithm 2 will be asymptotically conservative, in the sense that, under the null, the probability of rejecting the null converges to a number smaller than the nominal significance level.
3 Conclusions
We consider the problem of inference in shift-share research designs. There are two main existing approaches that allow for unrestricted spatial correlation. The RI approach is valid even with relatively few or concentrated shocks, but relies on relatively strong assumptions on the shock assignment process and on treatment effect heterogeneity. In contrast, the asymptotic approach relies on weaker assumptions on the shock assignment process and on treatment effect heterogeneity, but asymptotic approximations may be inaccurate in some applications.
We propose alternative RI methods that combine the advantages of both approaches. More specifically, the inference methods we propose are exact under relatively strong assumptions, and also asymptotically valid under weaker assumptions. The latter is achieved through studentization, which ensures convergence of the simulated distribution of the test-statistic to a standard normal under mild regularity conditions.
References
- Adão, R., Kolesár, M., and Morales, E. (2019). Shift-Share Designs: Theory and Inference. The Quarterly Journal of Economics, 134(4):1949–2010.
- Autor, D. H., Dorn, D., and Hanson, G. H. (2013). The China syndrome: Local labor market effects of import competition in the United States. American Economic Review, 103(6):2121–68.
- Bartik, T. (1991). Who Benefits from State and Local Economic Development Policies? W.E. Upjohn Institute for Employment Research.
- Berger, R. L. and Boos, D. D. (1994). P values maximized over a confidence set for the nuisance parameter. Journal of the American Statistical Association, 89(427):1012–1016.
- Billingsley, P. (1995). Probability and Measure. Wiley Series in Probability and Statistics. Wiley.
- Blanchard, O. and Katz, L. (1992). Regional evolutions. Brookings Papers on Economic Activity, 23(1):1–76.
- Borusyak, K. and Hull, P. (2021). Non-random exposure to exogenous shocks: Theory and applications. Working paper.
- Borusyak, K., Hull, P., and Jaravel, X. (2021). Quasi-Experimental Shift-Share Research Designs. The Review of Economic Studies, rdab030.
- Bugni, F. A., Canay, I. A., and Shaikh, A. M. (2018). Inference under covariate-adaptive randomization. Journal of the American Statistical Association, 113(524):1784–1796.
- Card, D. (2001). Immigrant inflows, native outflows, and the local labor market impacts of higher immigration. Journal of Labor Economics, 19(1):22–64.
- Chung, E. and Romano, J. P. (2013). Exact and asymptotically robust permutation tests. The Annals of Statistics, 41(2):484–507.
- Durrett, R. (2019). Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.
- Ferman, B. (2019). Assessing Inference Methods. arXiv e-prints, arXiv:1912.08772.
- Ferman, B. (2021). Matching estimators with few treated and many control observations. Journal of Econometrics, 225(2):295–307.
- Janssen, A. (1997). Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem. Statistics & Probability Letters, 36(1):9–21.
- Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses. Springer New York.
- Wu, J. and Ding, P. (2020). Randomization tests for weak null hypotheses in randomized experiments. Journal of the American Statistical Association, 0(0):1–16.
Appendix A Online Appendix
A.1 Proof of main results
A.1.1 Proof of Proposition 1
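Proof. The first assertion is immediate. We prove the second assertion. Recall that the quantile function of a distribution function $F$ is given by $Q(u) = \inf\{c \in \mathbb{R} : F(c) \ge u\}$. Consequently, under the null, $\mathbb{P}[T > Q(1-\alpha) \mid W, \epsilon] = 1 - F(Q(1-\alpha) \mid W, \epsilon) \le \alpha$, as desired. Similarly, by the definition of quantile function, $T > \hat{Q}_B(1-\alpha)$ implies that $\hat{F}_B(c) \ge 1 - \alpha$ for some $c < T$, where $\hat{Q}_B$ denotes the quantile function of $\hat{F}_B$. Observe that this implies the sample statistic is strictly greater than at least $k = \lceil (1-\alpha) B \rceil$ simulated draws. Under the null, the latter event occurs with probability $\mathbb{P}[T > T^*_{(k)} \mid W, \epsilon]$, where $T^*_{(k)}$ is the $k$-th order statistic associated with the simulations. Using the quantile representation of a random variable and the definition of a quantile function, one obtains that:
$$\mathbb{P}[T > T^*_{(k)} \mid W, \epsilon] = \mathbb{E}\left[1 - F(T^*_{(k)} \mid W, \epsilon) \,\middle|\, W, \epsilon\right] \le 1 - \mathbb{E}[U_{(k)}] = 1 - \frac{k}{B+1},$$
where $U_{(k)}$ is the $k$-th order statistic of a sample of $B$ independent standard uniform random variables, which has expected value $k/(B+1)$.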
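The expected-value formula used in this step is the standard one for order statistics of independent standard uniform draws: the $k$-th smallest of $B$ such draws has mean $k/(B+1)$. A quick Monte Carlo check, with hypothetical values of $k$ and $B$, is sketched below.

```python
import random

def mean_uniform_order_statistic(k, B, reps, rng):
    """Monte Carlo mean of the k-th smallest of B independent Uniform(0,1) draws."""
    total = 0.0
    for _ in range(reps):
        total += sorted(rng.random() for _ in range(B))[k - 1]
    return total / reps

rng = random.Random(4)
B, k = 19, 18
estimate = mean_uniform_order_statistic(k, B, 20000, rng)
print(round(estimate, 4), k / (B + 1))
```

With $B = 19$ and $k = 18$ the theoretical mean is $0.9$, so the implied bound on the rejection probability is $0.1$.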
A.1.2 Proof of Proposition 2
Proof.
Observe that, under the null is written as
(18) |
Let denote the conditional expectation on and . We begin by showing the squared denominator, rescaled by , which we denote by is consistent. To see this, we note that, by Assumption (iii) in the statement of the Proposition:
(19) | |||
which proves, by application of the conditional Markov inequality and bounded convergence, that . Combined with Assumption (ii) in the statement of the Proposition, we obtain that .
Next, we need to show that the distribution of the numerator, rescaled by , which we denote by , converges to a normal distribution. We first note that, passing through a subsequence if needed, the convergence in probability requirements in the statement of the proposition, as well as the consistency of the denominator previously shown, may be taken as almost-sure convergence (by the fact that a sequence converges in probability if, and only if, every subsequence admits a further subsequence that converges almost surely; Billingsley, 1995, Theorem 20.10). We are thus able to apply a CLT for triangular arrays (Durrett, 2019, Theorem 3.4.10). Indeed, we observe that
(20) |
with . Notice that, a.s. and a.s.. It thus suffices to verify the Lindeberg condition in our problem. Fix . We have:
(21) |
where the (a.s.) convergence follows by Assumption (iii) in the proposition. It then follows by the Lindeberg-Feller theorem and consistency of the denominator that, for each ,
(22) |
as desired.
A.1.3 Proof of Proposition 3
Proof. That follows from Assumption (i) in the statement of the proposition and convergence in distribution of the numerator of the shift-share regression estimator rescaled by , which we proved in the previous proposition. Moreover, in light of the previous proposition, to conclude that the test statistic converges in probability, it suffices to show that estimation of the residuals using does not asymptotically affect consistency of the variance estimator to . Specifically, it suffices to verify that, under the null,
(23) | |||
In particular, it is sufficient that
(24) | |||
which is ensured by assumptions (ii) and (iii).
A.2 Examples of simulation distributions
A.2.1 Verification of conditions of Proposition 2
We verify the conditions of Proposition 2 in three examples.
Nonparametric bootstrap
In this case, , where is the empirical distribution of recentered shocks, i.e. and . In this case, , which ensures condition (i). As for the second condition, since , requirement (ii) in the Proposition will be satisfied if converges in probability to a positive nonrandom limit and converges in probability to a positive nonrandom limit. Finally, requirement (iii) is satisfied if converges in probability and converges in probability to zero.
Normal distribution
Suppose , independently from . In this case, , for , which ensures condition (i). As for the second condition in the Proposition, , which converges in probability if the latter term converges. As discussed in Appendix A of AKM, convergence in probability of to a positive limit requires the existence of at least one “non-negligible” shock in most units, where by non-negligible shock in a unit we mean its exposure weight is bounded away from zero. In an “extreme” case, where and each unit is affected by a single distinct shock with unit exposure, this term simplifies to , which is expected to converge to a positive limit under mild conditions. Finally, the third condition simplifies to , which converges to zero if the latter term converges. In the single-shock-exposure setting, this term simplifies to , which converges in probability to zero under mild conditions.
Sign changes
In this case, , where denotes entry-by-entry multiplication, and , independently from . By construction, . As for the second condition, , which converges in probability to a positive constant if: (a) converges to a positive constant; and (b) converges to zero. A condition like (a) is required for existing inference methods in shift-share designs to work (see the discussion surrounding Assumption A.1 in AKM). Condition (b) is satisfied if shocks are independent, is uniformly bounded and converges in probability to zero. Finally, condition (iii) in the Proposition is satisfied if is uniformly bounded and converges to zero.
A.2.2 Verification of conditions of Proposition 3
We now discuss Assumptions (i)-(iii) of Proposition 3 in the context of the three examples in the previous section.
Verification of condition (i) of Proposition 3
When the distribution of simulated shocks is standard normal, , which we require to converge to a positive constant. Such condition is analogous to Assumption A1.(ii) in AKM. Moreover, we note that, in the single-exposure case, this quantity is exactly equal to one. To conclude that the denominator of the simulated shift-share regression estimator converges in probability to a positive constant, it is sufficient to require that . Observe that, in the single-exposure case, this variance is given by , which converges to zero as . Similar arguments establish convergence of the denominator of the shift-share regression estimator in bootstrap and sign changes examples.
Verification of conditions (ii)-(iii) of Proposition 3
Assumptions (ii)-(iii) implicitly restrict moments of the simulated shocks and the relation between exposure weights and the rate of growth of . Indeed, in the single-exposure case, requirement (ii) reduces to
(25) |
which is expected to hold under mild conditions in our three main examples. Similarly, in the single-exposure case, condition (iii) reduces to
(26) |
which is also expected to hold.