1 Introduction

At the end of 2019 an unknown virus spread throughout the city of Wuhan and the Hubei province of China, causing a SARS-type disease with severe respiratory symptoms and fatal consequences in numerous cases. The virus was later identified as belonging to the coronavirus family and termed SARS-CoV-2, and the related disease COVID-19. At the end of January 2020, a Public Health Emergency of International Concern was declared as the virus rapidly spread around the world. On March 11, the World Health Organization officially declared the COVID-19 emergency a pandemic. By April 21, 178 countries had confirmed cases of COVID-19 infection.Footnote 1 The total count of reported cases and casualties was still rising at the time of writing this paper.

Since the first weeks of the COVID-19 pandemic, many important questions have arisen within the scientific community. A very difficult puzzle, which is still largely unsolved, concerns the huge differences in contagion and mortality rates recorded across different parts of the world and even across different areas of the same country. The idea that COVID-19 is a “rich man’s disease” has emerged in the public debate, motivated by the observation that rich countries seem to have been hit more severely than poor ones at the beginning of the pandemic. Some indirect links between wealth and the COVID-19 pandemic, such as the age structure of the population in different countries, the international flows of goods and tourists, and the endowment of health facilities, may understandably have played a role during the first wave and possibly still do. However, the relationship between economic wealth and the initial spread of COVID-19 infection and mortality is still largely under-investigated. The aim of this work is precisely to assess whether such a relationship exists and whether it relates to both the spread and the mortality of COVID-19.

It is worth noting that the focus on the first wave of the pandemic offers a unique opportunity to check for the effects of wealth, since contagion and death rates in the following periods might have been influenced by the adoption of diversified containment policies. COVID-19 took a certain amount of time to spread worldwide, and countries reacted with very different speed and severity with respect to lockdown and other containment measures, which generate huge social and economic costs. All this affects the measurement of the effects of stock variables. Therefore, we measure the effects of wealth on the diffusion and mortality of COVID-19 from right after the start of the outbreak up until the moment when the strictest lockdown measures started to be lifted in the European countries that were hit earliest and most severely (Italy and Spain).

Focusing on such a period, we propose here an econometric analysis to test whether the COVID-19 infection and mortality rates are related to the economic wealth of countries, as measured by their GDP per capita, as well as to other relevant variables that characterize their demographic, health, and economic structure. We merge data on COVID-19 infections and deaths provided by the European Centre for Disease Prevention and Control (ECDC) with macro-economic data collected by the World Development Indicators of the World Bank group for 138 countries.

Our estimates show that economic wealth is among the factors most strongly correlated with both the early infection and the mortality rates of SARS-CoV-2, together with a high share of elderly population and the country’s endowment of health facilities. We also check the robustness of these results to endogeneity, to the possible poor quality of the data on COVID-19 deaths, to the potential role of early restriction policies adopted by some countries, and to the addition of subsequent waves of the pandemic.

The paper is organized as follows. Section 2 discusses the literature on the interactions between economic structure and COVID-19 pandemic. Section 3 presents the data and the variables used in the empirical analysis, and their main descriptive statistics; Sect. 4 outlines the empirical strategy; Sect. 5 presents the econometric results and Sect. 6 the robustness tests. Section 7 concludes.

2 Background Literature on the Interaction Between Economic Structure and COVID-19 Contagion

Given its large impact, the scientific literature on the COVID-19 pandemic is rapidly developing, including contributions in the socio-economic area. Several epidemiological studies have focused on the socio-economic factors that have promoted the diffusion of COVID-19. A parallel stream of literature has considered the reverse impact of COVID-19 on the economy. The present work is closer to the former.

There are several possible channels through which wealth may have enhanced the early stage of the COVID-19 pandemic. In this literature review we focus on: the age structure of the population; the quality of the air; the economic structure of the country; the local and international flows of goods and people; the quality of the health services; and the so-called “hygiene hypothesis”.

Studies of the effectiveness of the policies adopted in different countries to curb the spread of the pandemic do not reach clear conclusions. About 1 year after the initial spread of the COVID-19 pandemic, and after having observed many solutions adopted around the world, Fernandez-Villaverde and Jones (2020) achieve inconclusive results. The adoption of strict policies has been followed by both successful and unsuccessful outcomes across different countries. For example, Spain, Italy, and France suffered high rates of contagion and death despite the adoption of restrictive policies, much as Sweden or Chile did, where containment measures were much lighter.

In contrast, cross-section and panel analyses of regional economic data seem to be more conclusive. Adopting such an approach, but restricting it to a single country, Ascani et al. (2021) for Italy and Paez et al. (2020) for Spain observe that COVID-19 spread more heavily in the regions that generate the larger share of their country’s GDP. Paez et al. (2020) develop a spatio-temporal analysis of the incidence of COVID-19 comparing the 16 autonomous communities in Spain in the period from 13 March to 13 April. Adopting a spatial SUR model, they regress the incidence of COVID-19 on GDP per capita, the percentage of elderly people (age > 65), some climate variables, and the presence of a mass transit system. Interestingly, they find that higher GDP per capita and the presence of a mass transit system are associated with higher contagion and mortality rates of COVID-19, while a high percentage of elderly people is associated with lower levels. Ascani et al. (2021) focus on the strong differences in contagion rates across Italian regions. They provide evidence for the hypothesis that the prevalence of specific economic activities has played a role as a vehicle of disease transmission. Meetings, mobility (especially public) and face-to-face interactions occurring in standard economic activities are clear opportunities for contagion.

Turning specifically to the age structure of the population, the authors of both papers cited above assess a causal connection with the first wave of the COVID-19 pandemic. It is common knowledge that the share of elderly citizens in wealthier countries is higher than average. The impact of a high percentage of elderly people is analyzed by Paez et al. (2020) in their cross-section study of the Spanish regions, where they find a double influence on COVID-19, with opposite signs. On the one hand, as expected, the mortality rate increases in the regions with a higher percentage of people aged 65 or more. On the other hand, in the same regions the contagion rate turned out to be significantly lower. The plausible explanation is that younger generations are more likely to move and meet people on a regular basis. Therefore, an international cross-section analysis is relevant to analyze how wealth and a higher percentage of elderly citizens have interacted worldwide with the COVID-19 pandemic.

The role of air quality in the COVID-19 pandemic has also been widely debated. There is a twofold rationale behind the link between air pollution and the COVID-19 pandemic. First, it can be argued that poor air quality correlates with a greater diffusion of COVID-19 because pollutants, such as particulate matter (PM), would facilitate the spread of the virus conveyed by the droplets of human saliva floating in the air. Since the latter are one of the main sources of contagion, PM could serve as a carrier of the virus. Second, there may be a relationship between air quality and COVID-19 mortality because chronic exposure to environmental pollution in general, and poor air quality in particular, has a debilitating effect on the human body, increasing its exposure to other respiratory diseases and reducing the immune system’s response to infections. All these effects can increase the mortality risk associated with COVID-19. The relationship between the environment and the COVID-19 pandemic has also attracted attention because the areas hardest hit by the virus were also among the most polluted on the planet. Wuhan and the province of Hubei, where the outbreak began, the Lombardy region in Italy and the Madrid area in Spain, which have all suffered heavily from the viral infection, are in regions with normally very poor air quality. Research into these aspects is ongoing. Wu et al. (2020) provide evidence of the link between mortality rates and long-term exposure to air pollutants (PM2.5 in particular). To this purpose, they estimate an ecological regression model on the data of 35 US counties. Other published studies seem to confirm the above-mentioned links between air quality and coronavirus diffusion. Wu et al. (2020) estimated an 8% increase in the COVID-19 death rate associated with a rise of 1 μg/m3 in PM2.5 levels in parts of the US. Ogen (2020) found a positive correlation between NOx exposure and COVID-19-related mortality in 66 administrative regions in Italy, Spain, France and Germany. Setti et al. (2020) found evidence of the SARS-CoV-2 virus on outdoor PM in samples tested in the province of Bergamo (Lombardy, Italy), which experienced the highest diffusion and mortality rates in Italy (and among the highest worldwide).

Economic wealth interacts with air pollution levels in several contrasting ways, some implying a positive relationship and some a negative one. It is a generally accepted fact that developed countries are responsible for higher levels of emissions. Yet, we know that a combination of less efficient production and transport systems, particularly in less developed countries, and a lower quality of energy consumption corresponds to high environmental externalities (Sovacool 2012). The cross-country data we use here confirm the prevalence of the second type of factor, showing a negative relationship between real per capita GDP and air pollution, as measured by the concentration of fine particulate matter (PM2.5).

Countries with significant manufacturing sectors (where people are less able to work remotely) can have complex supply chains characterized by a greater degree of physical proximity than those of services. Such countries may also come under greater pressure from industrial lobbies to limit, or delay, the policies to prevent the spread of the contagion. The role of agriculture needs to be considered as well, for its contribution to GDP and for its relationship with air pollution. The latter, however, is unclear. On the one hand, countries with large agricultural sectors might be expected to be less exposed to air pollution because of the relatively smaller share of industrial activity. On the other hand, agro-industrial production, such as high-tech and intensive animal breeding, is associated with the extensive use of manure for fertilization, which can be associated with large particulate formation. In order to take these factors into account, our analysis of the COVID-19 pandemic includes the shares of the manufacturing and agriculture sectors in national GDP.

The movement of goods and people is a factor notably linked with the wealth of regions and countries. It can explain the spread of COVID-19 in two distinct ways: directly, through the number of contagion opportunities and, more subtly, through the timing of the first contagion in a region or country. Information about the actual starting time of the contagion in the different regions and countries worldwide is highly uncertain. Yet, it could explain much of the observed differences in both the spreading and the mortality rates. In particular, the time interval between the first (unobserved) contagion events and the adoption of containment measures can be a major explanatory factor. For example, for Lombardy (Italy), which has been one of the most severely hit regions worldwide, Russo et al. (2020) estimate that day zero of the COVID-19 outbreak was January 18. Parodi and Aloisi (2020) report that several general practitioners observed a suspicious increase in cases of bilateral pneumonia in Lombardy already during December 2019. Considering that Italy adopted the first national restriction policies by the third week of February 2020, the time interval during which the virus spread freely in Lombardy has possibly been about 3 months.

A factor that increases the probability of early contagion in a region, or a country, is certainly the movement of citizens across and within its borders. In their cross-sectional analysis of the Spanish regions, Paez et al. (2020) observe that the local public mass transportation system, more than international airport facilities, is a factor linked to a higher severity of contagion. The two types of connection (international and local) have probably acted differently. The strength and frequency of international links may have increased the chances of early contagion events, while a well-developed local mass transportation system may have played the role of second-order contagion enhancer. However, to offer evidence that international mobility (inter-continental in particular) has played a role, a cross-country analysis offers a larger set of comparisons than a regional one. Therefore, it is of interest to assess whether the mobility of merchandise and of people is an explanatory factor of the COVID-19 pandemic. In the empirical analysis, we consider this aspect by using the ratio of imports to GDP as a proxy for the international mobility of goods. In addition, we take the number of incoming tourists per capita as a proxy for the international mobility of people.

Higher levels of wealth improve the quality of the health care systems across different countries, in terms of facilities, personnel, and organization. Wealthier countries probably have a better chance of taking care of infected people and of testing larger proportions of the population for contagion. This last aspect might also be a factor introducing a significant measurement bias in the COVID-19-related hospitalizations and deaths statistics. In this analysis, we check the distinct contribution of the quality of the health system of a country by means of a variable accounting for the number of beds available to the population in the hospitals. We also check the robustness of data quality with respect to the death rates, by means of a specific test using the difference of such rates between 2018, 2019 and 2020.

Finally, it is worth mentioning the study of Chatterjee et al. (2021), who consider the so-called “hygiene hypothesis”, which contrasts with the expectation that a higher quality of the health system mitigates the severity of the COVID-19 pandemic. This hypothesis starts from the observation that richer countries tend to develop higher hygiene standards, including for example better sanitation, availability of safe drinking water, and hand washing facilities. It recognizes that these standards prevent the spread of various types of diseases. However, the hypothesis supposes that they also prevent exposure to pathogens early in life and so reduce the ability to develop robust self-protection from diseases. Such a hypothesis could explain a possible positive correlation between the spread of COVID-19 and the levels of wealth across different areas of the world. Chatterjee et al. (2021) develop a multivariate regression analysis including development and demographic variables, sanitation, tropical diseases, and autoimmune disorders as possible explanatory factors of the COVID-19 mortality rates observed across countries worldwide. A confirmation of such a hypothesis is clearly of high interest, and it can benefit from the inclusion of additional confounding covariates such as the economic structure, the level of imports and exports (which proxy the level of international mobility of countries), and the annual average level of particulate matter pollution. Moreover, it is of interest to observe whether the infection rate, and not only the mortality rate, is also explained by the hygiene hypothesis.

In any case, we highlight that our work does not suggest any causality of the socio-economic factors analyzed here on the initial spread of the COVID-19 pandemic. Rather, and similarly to previous contributions in the literature, our analysis helps to assess whether some economic factors have favored the spread and the severity of COVID-19, with the purpose of helping our society to identify its risk points and improve its future resilience.

3 Data

To build the dataset we merge information from two sources. Data on COVID-19 early diffusion are used to define our dependent variables, namely the infection (INF) and death (DEATH) rates, computed as the cumulative number of infections and deaths per million of resident population. Data are drawn from the ECDC, an EU agency for the protection of European citizens against infectious diseases and pandemics. The data on the distribution of COVID-19 worldwide are updated daily by the ECDC’s Epidemic Intelligence team, based on reports provided by national health authorities.Footnote 2
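The construction of the two dependent variables can be illustrated with a minimal sketch. The function name and the numbers below are hypothetical and used for illustration only; they are not ECDC figures.

```python
def rate_per_million(cumulative_count: int, population: int) -> float:
    """Cumulative infections (or deaths) per million residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cumulative_count * 1_000_000 / population


# Hypothetical country with 10 million residents (illustrative values only).
inf_rate = rate_per_million(50_000, 10_000_000)    # INF
death_rate = rate_per_million(1_200, 10_000_000)   # DEATH
print(inf_rate, death_rate)
```

Normalizing by resident population makes the cumulative counts comparable across countries of very different size.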

Data for these two variables were collected across 5 weeks which correspond to the first wave of the virus diffusion, between 24 March and 21 April 2020. We consider only these 5 weeks to account for possible factors which might have affected the diffusion of COVID-19 and its consequences. Specifically, at the beginning of our observation period, the relationship might have been influenced by the different pace at which COVID-19 was spreading around the globe, while, by the end of April 2020, lockdown measures might have influenced the spread of the phenomenon.Footnote 3 Figure 1 shows the evolution of COVID-19 infections and deaths (absolute numbers on the left and ratios-to-population on the right) across the weeks in March and April 2020.

Fig. 1 Evolution of COVID-19 pandemic in March–April 2020. Source: authors’ elaborations on ECDC data

Data on the infection and death rates are merged with macro-economic information provided by the World Development Indicators of the World Bank on: GDPPC: real per capita GDP (in 2010 US$ at PPP); POP65+: share of population aged 65 or more; HBEDS: hospital beds per capita; IMPORT/GDP: import intensity (i.e. import value as a proportion of domestic GDP); AGRVA/GDP: agriculture value added as a proportion of GDP; MANVA/GDP: manufacturing value added as a proportion of GDP;Footnote 4 PM2.5: mean annual exposure to PM2.5 (micrograms per cubic meter); TEMP: average temperature in February and March (in degrees Celsius, °C).

The variable GDPPC is used here as a measure of the economic wealth of a country, while POP65+ is included to control for the share of elderly population, which is the most vulnerable group. We also add the stock of public health facilities, given by the total number of hospital beds per capita (HBEDS), including inpatient beds in public, private, general, and specialized hospitals, and rehabilitation centers. As explained in Sect. 2, we also consider a variable of international trade, given by the 2019 value of imports over domestic GDP (IMPORT/GDP). We also include a variable of tourism intensity (TOURISM/POP), given by the total inflow of tourists over total resident population in 2019, as provided by the WDI. This variable can be considered a proxy for the international mobility of people but is available only for 131 countries. The two variables AGRVA/GDP and MANVA/GDP are used to account for the aggregate industry composition of a country’s GDP: we expect countries with a higher weight of the manufacturing sector to be more likely to be affected by COVID-19 diffusion because of the greater need for physical proximity that manufacturing processes require with respect to services. For the same reason, we expect countries characterized by a higher weight of agricultural activities to be less vulnerable to COVID-19 diffusion. The variable PM2.5 is taken to capture the degree of air pollution in a country. The World Bank provides information on PM2.5 exposure until 2017. To measure all the explanatory variables in the same year, 2019, we have added the country-level information on PM2.5 exposure in 2018 and 2019 by linear extrapolation.Footnote 5 Finally, TEMP captures some of the location-specific, or geographical, characteristics of a country, as proxied by the average temperature at the time of the first wave of the outbreak (average February–March 2020, in degrees Celsius).
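The extension of the PM2.5 series beyond 2017 can be sketched as follows. This is only one plausible reading of the procedure: we assume a least-squares line fitted to the last available years and projected forward, and the function name, the fitting window and the numbers are hypothetical.

```python
def linear_projection(years, values, target_years):
    """Fit a least-squares line to (years, values) and evaluate it at target_years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {t: intercept + slope * t for t in target_years}


# Hypothetical PM2.5 exposure (micrograms per cubic meter) for one country.
observed_years = [2015, 2016, 2017]
observed_pm25 = [20.0, 19.0, 18.0]
projected = linear_projection(observed_years, observed_pm25, [2018, 2019])
print(projected)
```

A projection of this kind assumes that the recent trend continues unchanged, so it is best read as a rough alignment of the PM2.5 series with the 2019 reference year.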

The final sample consists of 138 countries, observed across 5 weeks between 24 March and 21 April 2020. In this way we have a panel of countries, where the dependent variables vary across weeks, while all the explanatory variables are fixed in 2019. Table 1 shows the main summary statistics, while Table 2 shows the pairwise correlations among the explanatory variables.
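The panel layout described above can be sketched as follows, under the assumption that the data are weekly snapshots taken every 7 days from 24 March to 21 April 2020. The country codes are placeholders, not the actual sample.

```python
from datetime import date, timedelta
from itertools import product


def weekly_dates(start: date, end: date) -> list:
    """Dates spaced 7 days apart, from start to end inclusive."""
    dates, current = [], start
    while current <= end:
        dates.append(current)
        current += timedelta(days=7)
    return dates


weeks = weekly_dates(date(2020, 3, 24), date(2020, 4, 21))
countries = [f"C{i:03d}" for i in range(1, 139)]  # 138 placeholder codes

# One row per country-week; explanatory variables would be constant within country.
panel = [{"country": c, "week": w} for c, w in product(countries, weeks)]
print(len(weeks), len(panel))
```

Under this assumption the interval contains five weekly observation dates, which gives 138 × 5 country-week rows.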

Table 1 Summary statistics
Table 2 Correlation matrix

4 Econometric Strategy

To test for our research hypothesis, we estimate the following equation:

$$\log Y_{it} = \beta_{0} + \beta_{1} \log GDPPC_{i2019} + \mathbf{X}_{i2019}^{\prime} \beta_{X} + \delta_{t} + \mu_{r} + \varepsilon_{it},$$
(1)

where i represents the country, t the week, Y represents, alternatively, the rate of infections (INF) and deaths (DEATH), logGDPPC is the log value of real GDP per capita in 2019, X is the vector of additional country-level variables mentioned above, δt is a vector of week dummies to control for the global trends in the diffusion of the coronavirus, μr is a vector of area dummies that control for unobserved time-invariant factors that characterize countries belonging to the same geographical area,Footnote 6 εit is the stochastic error term, and the βs are the parameters to be estimated, with special attention on β1, which we do expect to be positive and statistically significant.

We start by estimating Eq. 1 using only logGDPPC and the week and area fixed effects as explanatory variables. Then, we add all the other regressors and we use the Akaike (AIC) and Bayesian (BIC) Information Criteria to check which model fits our data better.
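The model comparison by information criteria can be sketched as follows, assuming the standard Gaussian log-likelihood of a linear regression. The residual sums of squares and parameter counts are hypothetical, and statistical packages may count parameters slightly differently.

```python
import math


def gaussian_loglik(rss: float, n: int) -> float:
    """Maximized Gaussian log-likelihood of a linear model with residual sum of squares rss."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)


def aic(rss: float, n: int, k: int) -> float:
    return -2 * gaussian_loglik(rss, n) + 2 * k


def bic(rss: float, n: int, k: int) -> float:
    return -2 * gaussian_loglik(rss, n) + k * math.log(n)


n_obs = 690  # 138 countries x 5 weeks
# Hypothetical fits: a small model and a larger one with a slightly lower RSS.
small = {"rss": 900.0, "k": 10}
large = {"rss": 890.0, "k": 17}
for name, m in (("small", small), ("large", large)):
    print(name, round(aic(m["rss"], n_obs, m["k"]), 1), round(bic(m["rss"], n_obs, m["k"]), 1))
```

Lower values indicate the preferred model. Because ln(690) exceeds 2, BIC penalizes each additional regressor more heavily than AIC in a sample of this size.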

Since all our explanatory variables are observed in one single year, i.e., 2019, we cannot estimate Eq. 1 using a fixed effects estimator, so we simply use a pooled OLS estimator. On the one hand, we lose the possibility to control for unobserved country-specific fixed effects, but on the other we can estimate the impact of our set of time-invariant regressors, while accounting for time and area fixed effects. We also transform all our variables except TEMP into natural logarithms, so as to interpret the estimated coefficients as elasticities.Footnote 7 To control for unobserved arbitrary within-group correlation, we cluster the standard errors at the country level. Moreover, we check whether multicollinearity can be an issue using a variance inflation factor (VIF) test. As a robustness check, we also test for possible omitted variable bias using three statistics, the Ramsey RESET, the DeBenedictis and Giles test, and the Oster δ, as explained in Sect. 6.
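A pooled OLS estimate with country-clustered standard errors can be sketched as follows. This is an illustration on synthetic data, not the estimation code behind the tables: area dummies and the other controls are omitted, no small-sample correction is applied, and all names and parameter values are assumptions.

```python
import numpy as np


def pooled_ols_cluster(y, X, groups):
    """Pooled OLS coefficients and cluster-robust (sandwich) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]  # sum of x_it * u_it within the cluster
        meat += np.outer(score, score)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))


rng = np.random.default_rng(0)
n_countries, n_weeks = 138, 5
country = np.repeat(np.arange(n_countries), n_weeks)
week = np.tile(np.arange(n_weeks), n_countries)

log_gdppc = rng.normal(9.0, 1.0, n_countries)           # synthetic, fixed per country
week_dummies = (week[:, None] == np.arange(1, n_weeks)).astype(float)
X = np.column_stack([np.ones(country.size), log_gdppc[country], week_dummies])

country_shock = rng.normal(0.0, 0.5, n_countries)[country]  # within-country correlation
y = 1.0 + 0.8 * log_gdppc[country] + 0.3 * week + country_shock \
    + rng.normal(0.0, 0.2, country.size)

beta, se = pooled_ols_cluster(y, X, country)
print("beta_1:", beta[1], "clustered s.e.:", se[1])
```

Since the regressor of interest does not vary within a country, the five weekly observations of a country are far from independent; clustering at the country level keeps the standard error of β1 from being understated.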

5 Results

Table 3 shows the results of the estimates of Eq. 1 when the dependent variable is logINF. Column 1 refers to a specification where we do not include any other variable but logGDPPC, time, and region dummies. We find that the estimated coefficient of logGDPPC is positive and significant at the 1% level: a 10% increase in the level of economic wealth corresponds to an average 10% increase in the infection rate.
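The elasticity reading of the log-log coefficient can be made explicit with a short worked derivation, assuming Eq. 1 holds and all other regressors are kept fixed. Comparing two levels of GDP per capita, with GDPPC_1 = 1.10 × GDPPC_0:

$$\log Y_{1} - \log Y_{0} = \beta_{1}\left(\log GDPPC_{1} - \log GDPPC_{0}\right) \;\Rightarrow\; \frac{Y_{1}}{Y_{0}} = 1.10^{\beta_{1}}, \qquad \%\Delta Y = 100\left(1.10^{\beta_{1}} - 1\right).$$

With β1 = 1, which is the value implied by the 10%-for-10% reading above, the change is exactly 10%. For a purely hypothetical coefficient of 0.5 it would be 100(1.10^{0.5} − 1) ≈ 4.9%, close to the usual rule-of-thumb approximation of 10 × β1 percent.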

Table 3 Pooled OLS estimates: infection rate

In column 2 we replace the area dummies with the other variables on a country’s demographic, economic, health, and trade structure.Footnote 8 Still, we find that the estimated coefficient of logGDPPC remains positive and statistically different from zero at the 1% level, and slightly higher than in column 1. Quite intriguingly, we find that the COVID-19 infection rate increases not only with the level of economic wealth, but also with the share of elderly population, with the import intensity of a country, and with lower average temperatures in February and March. We do not find any significant result for the other variables.

The high value of the VIF statistic for the logGDPPC variable, however, reveals a possible problem of multicollinearity. For this reason, in column 3, we re-estimate Eq. 1 excluding logAGR/GDP and logMAN/GDP, two variables that were not significantly related to logINF. Our estimates show that the estimated coefficients remain very similar to those in column 2, while the VIF statistics become lower than the commonly accepted cutoff of 5.
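The VIF diagnostic used here can be sketched as follows: each regressor is regressed on all the others, and VIF = 1 / (1 − R²) of that auxiliary regression. The data are synthetic and the variable names are placeholders, not the regressors of Table 3.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (X excludes the constant)."""
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        tss = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - (resid @ resid) / tss
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)   # nearly collinear with x1
x3 = rng.normal(size=500)              # unrelated to the other two
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

In this synthetic example the two nearly collinear columns receive very large VIFs while the unrelated column stays close to 1, which is the pattern the cutoff of 5 is meant to flag.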

Finally, column 4 shows the results when tourism intensity is added as a regressor in Eq. 1. While the estimated coefficient of wealth remains in line with those shown in columns 1–3, the coefficient of logIMPORT/GDP is no longer statistically significant and that of logTOURISM/POP is positive and significant at the 10% level: specifically, a 10% increase in tourism penetration corresponds to an average 2% increase in the infection rate. The low values of the VIF statistics indicate that our estimates are not affected by multicollinearity,Footnote 9 while the AIC and BIC statistics reveal that the model in column 4 is the most appropriate one to explain the variability of the infection rate across countries and weeks. Unlike in the models in columns 1 to 3, however, the number of countries for which all the data are available is 131. Among the models in columns 1–3, the lowest values of the AIC and BIC statistics correspond to the model in column 1, with only logGDPPC, week, and region-specific dummies as regressors.

Table 4 presents the results of the estimates of Eq. 1 when the dependent variable is logDEATH. As for the estimates of the infection rate, in column 1 we find a positive and highly significant coefficient for logGDPPC. Specifically, we find that a 10% increase in GDP per capita corresponds to an average 3.2% increase in the death rate. Such a correlation becomes even higher when we add the variables in columns 2, 3 and 4: a 10% increase in GDP per capita corresponds to an average 4.2–4.7% increase in the number of deaths per capita. Unlike in Table 3, we find that the correlation between the share of elderly population and the mortality rate is even stronger than that with the infection rate, as the estimated coefficient of logPOP65+ is not only higher but also significant at the 1% level. Moreover, we find that another strong predictor of the death rate is the low quality of the health system, as captured by a lower availability of hospital beds per capita. In this case, we find that a 10% increase in the number of hospital beds per capita corresponds to an average 3.5% lower death rate.

Table 4 Pooled OLS estimates: death rate

As regards the role of international trade and tourism, in columns 2 to 4 we do not find a significant coefficient. As for the contagion rate, we do not find any significant relation of the mortality rate with the weight of agriculture and manufacturing in domestic GDP, or with PM2.5 exposure, while the estimated coefficient of the average temperature in February and March does not remain significant in column 4. Finally, the AIC and BIC statistics show that the model that best explains the country heterogeneity in the COVID-19 death rate is that in column 1.

6 Robustness Tests

In this Section, we test for the robustness of our results with respect to four aspects: (i) endogeneity, (ii) the possible poor quality of the data of COVID-19-related deaths, (iii) the potential role of early restriction policies adopted by some countries before, or right at the beginning, of the first wave of the COVID-19 pandemic, and (iv) the addition of successive waves of the pandemic up to 4 weeks after the 21st of April 2020.Footnote 10

With respect to point (i), the estimated coefficients in Tables 3 and 4, column 1, might be biased by two forms of endogeneity: (i.i) unobserved heterogeneity, and (i.ii) simultaneity. The former can occur because of unobserved omitted variables that might affect both the infection and the mortality rate, without being related to the level of wealth in a country. One standard way to deal with this issue is to use a fixed effects estimator. Since our explanatory variables are time-invariant, we cannot use this strategy. To mitigate a possible omitted variable bias, we first include week and area-specific dummies, which should capture, respectively, global trends and unobserved fixed factors that characterize all the countries belonging to a common area of the world. For example, some of these factors can be related to the level of institutional quality, the endowment of transport infrastructures, or the availability of health personnel. On top of this, we also use the Ramsey (1969) regression specification-error test (RESET) for omitted variables, where, for simplicity, we use the first-order locally valid Taylor-type polynomial approximation of the model’s conditional mean. Since the RESET test might have low power, or even be biased, we also use the test provided by DeBenedictis and Giles (1998), who use a globally valid Fourier approximation to the model’s conditional mean. Specifically, we use the FRESETS version of the test, namely the one based on a sinusoidal Fourier transformation, which is the one most recommended by the authors for panel data. In addition, we use the test recently developed by Oster (2019), which allows computing the share of logINF and logDEATH variance explained by unobserved components and comparing it to the share of variance explained by the observed controls, for given values of the R2. Specifically, we estimate Eq. 1 (using the specification of column 4 in Tables 3 and 4) through ordinary least squares (OLS), and we compute the coefficient of proportionality δ between the share of variance explained by unobserved variables and the share of variance explained by observed controls that corresponds to a value of β1 equal to 0. These δs are computed for given (maximum) values of the R2 that should be higher than those obtained in the main estimates of Tables 3 and 4. A value of δ higher than 1 can be interpreted as a sign that an omitted variable bias is unlikely, because the unobserved component would have to be more important than the set of observed controls to make β1 = 0.
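The logic of the Ramsey RESET test can be sketched as follows: powers of the fitted values are added to the regression and their joint significance is assessed with an F statistic. This is a textbook version on synthetic data. The choice of the second and third powers is an assumption, the statistic is not cluster-robust, and neither the Fourier-based FRESETS variant nor the Oster δ is reproduced here.

```python
import numpy as np


def reset_f_stat(y, X, max_power=3):
    """F statistic for adding powers 2..max_power of the fitted values to y ~ X."""
    def fit(A):
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        fitted = A @ coef
        resid = y - fitted
        return float(resid @ resid), fitted

    rss_restricted, fitted = fit(X)
    powers = np.column_stack([fitted ** p for p in range(2, max_power + 1)])
    X_aug = np.column_stack([X, powers])
    rss_unrestricted, _ = fit(X_aug)
    q = powers.shape[1]                  # number of added terms
    dof = len(y) - X_aug.shape[1]        # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)


rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

y_linear = 1.0 + x + rng.normal(scale=0.5, size=n)                   # correctly specified
y_quadratic = 1.0 + x + 0.5 * x ** 2 + rng.normal(scale=0.1, size=n)  # omitted x^2 term

f_linear = reset_f_stat(y_linear, X)
f_quadratic = reset_f_stat(y_quadratic, X)
print(f_linear, f_quadratic)
```

The statistic would be compared with the critical values of an F distribution with q and n − k − q degrees of freedom; a large value, as in the second synthetic case, signals that the linear specification misses some systematic component.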

From Table 3, columns 1 and 4, which correspond to the models with the best goodness of fit, we find that the RESET statistics never reject the null hypothesis of correct specification of the model in Eq. 1. We conclude that a model that includes only GDP per capita, time, and region-specific effects is enough to explain the cross-country variation in the COVID-19 infection rate.

From Table 4, column 1, instead, we find that the test rejects the null hypothesis at the 1% level, revealing a possible omitted variable problem. However, the RESET statistic in column 4 does not reject the null hypothesis, showing that adding the share of elderly population and the number of hospital beds increases the capability of our model to explain the heterogeneity in the rate of COVID-19 mortality across countries. As concerns the Oster (2019) omitted variable test, from Table 3 we find that, for a maximum value of the R2 coefficient equal to 0.9, the corresponding δ is either slightly below 1 (column 4) or above 1 (column 1, where the infection rate is regressed against GDP per capita, the time dummies, and the area dummies). From Table 4, we find that, for a maximum value of the R2 coefficient equal to 0.8, the Oster δ is always above 1, while it becomes lower than 1 in the specification of column 4 with a maximum value of the R2 of 0.9. In any case, the fact that the specification of column 1 is always associated with a value of δ above 1 leads us to conclude that our main parameter of interest β1 is unlikely to be biased by omitted variables. In this respect, we can also consider the estimated β1 in column 1 of Tables 3 and 4 as the most reliable.

With respect to point (i.ii), we check whether β1 is biased by the possible simultaneity of GDP per capita with both the infection and the death rate. Even if logGDPPC is measured at the end of 2019, making it reasonably exogenous with respect to logINF and logDEATH (these data refer to 2020), it may be that SARS-CoV-2 was already circulating in some countries at the end of 2019 (Apolone et al. 2020; Nsoesie et al. 2020). Although it is difficult to establish whether, and to what extent, the diffusion of the coronavirus in the pre-pandemic period could have affected the level of wealth in a country in 2020, we test for the possible simultaneity bias by using a two-stage least squares (2SLS) approach. As an instrument for the (log) level of GDP per capita in 2019, we use an index that captures the level of economic complexity of a country in 2009. Using data from the Atlas of Economic Complexity (http://www.atlas.cid.harvard.edu/) provided by Harvard University, we retrieve a ready-to-use economic complexity index (ECI). This index is computed using trade data from UN COMTRADE and merging two elements: the number of products that a country can produce with a (revealed) comparative advantage (diversity), and the number of countries that can manufacture a given product (ubiquity). Following the seminal contributions of Hidalgo et al. (2007) and Hidalgo and Hausmann (2009), the overall economic complexity of a country is obtained by applying the method of reflections and is greater the higher the diversity of its product basket and/or the lower the ubiquity of its products. Our identification strategy is the following: ceteris paribus, a higher economic complexity in the past does not have a direct influence on the COVID-19 infection and death rates, but only an indirect one, through the relation that past economic complexity has with the current level of a country’s GDP per capita.
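
The two-stage logic of this identification strategy can be sketched as follows. This is a minimal illustration on simulated data, not the estimation behind Table 5: the helper `two_stage_least_squares`, the data-generating process, and all parameter values are assumptions. The first-stage F statistic computed here is the simple homoskedastic one, not the Kleibergen–Paap statistic, and a manual second stage gives a valid point estimate only (its naive standard errors would be incorrect, so none are reported).

```python
import numpy as np


def ols(X, y):
    """OLS coefficients via least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, x_endog, z, controls):
    """2SLS with one endogenous regressor and one excluded instrument.

    Hypothetical helper: `controls` is an (n, k) array that includes the
    constant; `z` is the instrument (an ECI-like variable in the text).
    """
    # First stage: project the endogenous regressor on instrument + controls.
    Z = np.column_stack([controls, z])
    x_hat = Z @ ols(Z, x_endog)
    # Homoskedastic first-stage F for the single excluded instrument.
    rss_unrestricted = float(np.sum((x_endog - x_hat) ** 2))
    rss_restricted = float(np.sum((x_endog - controls @ ols(controls, x_endog)) ** 2))
    df_resid = len(y) - Z.shape[1]
    first_stage_f = (rss_restricted - rss_unrestricted) / (rss_unrestricted / df_resid)
    # Second stage: replace the endogenous regressor with its fitted values.
    beta = ols(np.column_stack([controls, x_hat]), y)
    return beta[-1], first_stage_f


# Simulated data: u is an unobserved confounder that biases plain OLS.
rng = np.random.default_rng(42)
n = 5000
true_beta = 1.0
z = rng.normal(size=n)                      # instrument, unrelated to u
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)        # endogenous regressor
y = 0.5 + true_beta * x + 2.0 * u + rng.normal(size=n)
const = np.ones((n, 1))

beta_ols = ols(np.column_stack([const, x]), y)[-1]
beta_2sls, f_first = two_stage_least_squares(y, x, z, const)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_2sls:.3f}  first-stage F: {f_first:.1f}")
```

In this simulation OLS is pulled away from the true coefficient by the confounder, while 2SLS recovers it because the instrument affects the outcome only through the endogenous regressor, which is the exclusion restriction assumed for past economic complexity in the text.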

Table 5 shows the results of the 2SLS estimates. Unfortunately, data on ECI are available only for 112 countries. For this reason, in columns 1 and 3 we also provide the results of the pooled OLS estimates on the restricted sample of 560 observations (112 countries × 5 weeks). The results show that the 2SLS estimates are in line with the pooled OLS ones. From columns 1 and 2 we find that a 10% increase in GDP per capita corresponds to an average 9.5–12.3% increase in the infection rate, while from columns 3 and 4 the corresponding increase in the death rate is of the order of 4–5.3%. The first-stage estimates show that ECI2009 is strongly related to the level of GDP per capita in 2019, and the value of the Kleibergen–Paap F statistic, well above 10, demonstrates that our instrument is strong. In any case, the endogeneity test does not reject the null hypothesis of exogeneity of logGDPPC. We conclude that our estimates are robust to a possible simultaneity bias.
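
Because both the dependent variables and GDP per capita enter in logs, the estimated coefficient can be read as an elasticity. The short sketch below shows how a percentage change in the regressor maps into a percentage change in the outcome, both exactly and under the usual linear approximation. The function name and the coefficient value of 0.95 are purely illustrative assumptions; the value is chosen only because, under the linear approximation, it would correspond to a 9.5% response to a 10% change.

```python
def pct_change_in_outcome(elasticity, pct_change_in_regressor):
    """Exact % change in y implied by a log-log coefficient.

    With log(y) = a + elasticity * log(x), scaling x by (1 + p/100)
    scales y by (1 + p/100) ** elasticity.
    """
    ratio = 1.0 + pct_change_in_regressor / 100.0
    return (ratio ** elasticity - 1.0) * 100.0


hypothetical_elasticity = 0.95   # illustrative value, not taken from Table 5
exact = pct_change_in_outcome(hypothetical_elasticity, 10.0)
approximate = hypothetical_elasticity * 10.0   # linear approximation
print(f"exact: {exact:.2f}%  approximate: {approximate:.2f}%")
```

For small changes the two readings are close; they diverge as the change in the regressor or the elasticity grows.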

Table 5 Wealth and COVID-19 diffusion: 2SLS estimate

A second aspect that can bias our results is the poor quality of the data on COVID-19 contagion and mortality, as in point (ii). Indeed, there is no guarantee with respect to the homogeneity of the data collection process across countries. Differences in spending capacity have probably influenced the rules for applying contagion tests to the population and for attributing the related costs. There has been a generalized scarcity of test kits, which has influenced how the phenomenon has been measured. Overall, it is safe to assume that the official COVID-19 counts fall well short of the real number of infections around the world.

This may be true of the real number of deaths as well, though probably to a lesser extent. There are non-trivial problems with certifying a death as being due to COVID-19. It preliminarily requires an exam. However, many of the elderly people infected with COVID-19 have been treated outside hospitals, and died in nursing homes, adding to the difficulty of applying such an exam and ultimately of including the cases in the national statistics. Besides, a share of the people dying with the infection (especially elderly people) suffered from other medical problems, including cardio-circulatory and respiratory problems. In such cases, establishing the ultimate cause of death is not obvious and cases of arbitrary classification choices have been reported.

The distribution of the accounting noise might depend on countries’ level of wealth, since wealthier countries might be in a better position to manage the monitoring process and bear the related costs. To check whether our estimates are affected by a possible misreporting of mortality data, we follow Rodriguez-Pose and Burlina (2021) and we calculate the excess mortality rate as the difference between logDEATH and the (log) average death rate in 2016–18, provided by the World Bank’s World Development Indicators. Such an index is considered more reliable than the standard death rate because it compares actual COVID-19 deaths with the expected death rate in a period where no other pandemics have been reported. Table 6 reports the pooled OLS estimates of Eq. 1, where we have replaced the dependent variable with the (log) excess mortality rate.
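
The construction of the dependent variable described above can be sketched in a few lines. This is an assumption-laden illustration rather than the authors' code: the function name is hypothetical, the two rates are assumed to be expressed in the same units, and the baseline is taken as the log of the 2016–18 average (the text could also be read as the average of the logs).

```python
import math


def log_excess_mortality(covid_death_rate, baseline_death_rates):
    """Log excess mortality: log COVID-19 death rate minus log baseline rate.

    `baseline_death_rates` holds the yearly death rates of the reference
    period (2016-18 in the text), in the same units as `covid_death_rate`.
    """
    baseline_mean = sum(baseline_death_rates) / len(baseline_death_rates)
    return math.log(covid_death_rate) - math.log(baseline_mean)


# Made-up numbers for one hypothetical country-week observation.
example = log_excess_mortality(2.0, [8.0, 8.0, 8.0])
print(f"log excess mortality: {example:.4f}")
```

Since the baseline is constant within a country, subtracting it shifts each country's log death rate by a country-specific amount, which is why the comparison is across countries rather than over the five weeks.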

Table 6 Wealth and excess mortality

Compared to Table 4, we find that the results remain of the same order. Again, there is a positive and strongly significant association between the death rate and a country’s level of economic wealth, and a negative and significant association between the death rate and the availability of health facilities. Interestingly, we find that the estimated coefficient of logPOP65+ is now not statistically significant, meaning that the share of elderly population is a variable that helps explain the average death rate across countries, but not why a country has a death rate above the average. We conclude that our results are robust to the possible poor quality of data on COVID-19 deaths.

Concerning point (iii), we consider countries that have adopted early contact tracing or restriction policies that could have significantly reduced the number of infections and deaths between March and April 2020. Due to the many local, sectoral, and national policies adopted, it is not easy to identify these countries. To account for this heterogeneity, we define two sets of countries according to the type of policies adopted. In the first (SET 1), we exclude from the estimates the following eight countries that have adopted early lockdown or movement control strategies, as documented by Han et al. (2020): Japan, New Zealand, Singapore, and South Korea in the Asia Pacific region, and Germany, Norway, Spain, and the UK in Europe. In the second (SET 2), we exclude all the countries in the South-East region of Asia (such as Brunei, Cambodia, Indonesia, Malaysia, Myanmar, Philippines, Singapore, Thailand, Timor-Leste, and Vietnam), many of which have adopted early contact tracing policies that have significantly limited the early spread of the pandemic (Djalante et al. 2020). Table 7 shows the results of the robustness estimates, with reference to the specification shown in column 4 of Tables 3 and 4: we do not find any significant change with respect to the main results (Footnote 11). We conclude that our estimates are not biased by the presence of countries adopting early restriction or contact tracing policies.
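
The sample restriction behind this check amounts to dropping all country-week rows for the listed countries before re-estimating the model. A minimal sketch follows; the row layout, the helper `drop_countries`, and the toy panel are assumptions, and the SET 2 list reproduces only the countries named in the text, which introduces them with "such as" and may therefore not be exhaustive.

```python
# Country lists as named in the text (SET 2 may be incomplete).
SET_1 = {"Japan", "New Zealand", "Singapore", "South Korea",
         "Germany", "Norway", "Spain", "UK"}
SET_2 = {"Brunei", "Cambodia", "Indonesia", "Malaysia", "Myanmar",
         "Philippines", "Singapore", "Thailand", "Timor-Leste", "Vietnam"}


def drop_countries(panel, excluded):
    """Return the panel rows whose country is not in `excluded`."""
    return [row for row in panel if row["country"] not in excluded]


# Toy panel: four countries observed over five weeks (hypothetical layout).
panel = [{"country": c, "week": w}
         for c in ("Italy", "Japan", "Vietnam", "Singapore")
         for w in range(1, 6)]

sample_without_set1 = drop_countries(panel, SET_1)
sample_without_set2 = drop_countries(panel, SET_2)
print(len(panel), len(sample_without_set1), len(sample_without_set2))
```

Each restricted sample would then be passed to the same estimator as the full sample, so that any change in the coefficients can be attributed to the excluded countries alone.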

Table 7 Wealth and COVID-19: excluding countries adopting early contact tracing policies

Finally, we test for the robustness of our results to a different time set. Specifically, we re-estimate Eq. 1 (using the specifications in columns 1 and 2 of Tables 3 and 4) by extending the period of the first wave to the four following weeks: 22–28 April; 29 April–5 May; 6–12 May; and 13–19 May 2020. We do not add further weeks to avoid the risk that the number of infections and deaths might be affected by the adoption of local and national restriction policies by all the countries in the dataset. After re-estimating Eq. 1, we check whether the estimated coefficients of our regressors change over time as the infection and death rates are computed over a wider period.

Tables 8 and 9 show the results for logINF and logDEATH respectively. Generally speaking, all the results remain of the same order as those shown in Tables 3 and 4, confirming the robustness of our estimates with respect to a wider time frame. In more detail, from Table 8, columns 1–4, we observe that the estimated coefficient of logGDPPC, as well as the R2, slightly decreases as we add weeks, whereas it remains stable in columns 5–8, where the other regressors are included. Interestingly, we also observe that the estimated coefficients of logPOP65+ and logTOURISM/POP decrease, and lose significance, over time. From Table 9, instead, we find that the estimated coefficient of logGDPPC, as well as those of logPOP65+ and logHBEDS, slightly increases across weeks, both in columns 1–4 and in columns 5–8. In any case, the coefficient of logGDPPC remains statistically significant at the 1% level, confirming its role as a key predictor of the dynamics of the first wave of the COVID-19 pandemic.

Table 8 Wealth and COVID-19: extending the infection rate until 19 May 2020
Table 9 Wealth and COVID-19: extending the death rate until 19 May 2020

7 Conclusions

This study confirms the positive relationship between the average wealth of countries (as measured by the level of GDP per capita) and the outcomes of the first wave of COVID-19 diffusion. We consider a sample of 138 countries and information on COVID-19 infection and mortality rates across 5 weeks between March and April 2020, observed with respect to a set of socio-economic and environmental variables. All the regressions and robustness tests confirm that the link between COVID-19 diffusion and countries’ economic wealth is positive and highly significant.

It is quite striking to observe that GDP per capita is much more significant than the average air quality of countries (as measured by the PM2.5 concentration). Indeed, the average level of pollution of countries turned out to be both negatively correlated with their GDP per capita and never statistically significant. However, since ours is a between-country analysis, our results cannot exclude that the relation between air quality and COVID-19 diffusion might occur within countries, or at sub-national level. What we can conclude is that the level of economic wealth, resulting from the daily professional activities of millions of people and the interpersonal physical relationships necessary to talk, move, cooperate, and accomplish tasks, explains the distribution of the first wave of COVID-19 diffusion around the world better than the concentration of PM2.5 in the air.

The economic covariates considered here provide interesting insights into why economic activity has such a high explanatory power. We find that mortality rates increase with the share of elderly population, which is a well-known feature of wealthy countries. This was expected to some extent. More surprising is our finding that contagion rates are also positively affected by higher shares of elderly people. As the literature has pointed out, there is a robust argument for why we should expect even a negative impact. Indeed, elderly people are typically excluded from the contagion-transmission circuit within and across work activities, so higher shares of people aged 65 or more should in principle reduce contagion rates. Two factors could have subverted such an expectation. First, rich countries have provided more tests in general (because of greater spending capacity and better health system quality) and, in particular, to elderly people (because of their greater vulnerability). Second, elderly people have weaker immune defenses and tend to develop the symptoms of COVID-19 more easily than younger generations. As is well known, significant shares of younger generations have developed the asymptomatic form of COVID-19. Therefore, even in the presence of equal or lower real contagion rates, elderly people might have been subject to a higher testing frequency.

The international flows of people, here captured by the inward tourism penetration, along with the imports of goods, are additional, but weaker, vehicles of COVID-19 transmission. Such an insight can be observed only by focusing on the first wave of the pandemic. First, because international transport was largely halted after the WHO declared the pandemic status. Second, because after international transport restarted, the infection and death dynamics evolved depending on local conditions and lockdown policies. This evidence is perhaps the most valuable in terms of preventive policy and pandemic risk management, and it would certainly deserve a deeper analysis. It confirms that travelling for tourism and business is a fundamental element of our wealth but is also a weakness for our societies. Indeed, COVID-19 has spread to and hit those countries with the higher volumes of imports and international tourism. Yet, such mobility continued to and from the region of Wuhan for about 2 months after the first worldwide announcement of COVID-19 in January 2020.

The availability of health infrastructures is a factor that has helped mitigate the mortality rate in the first wave of the COVID-19 outbreak. This does not come as a surprise. However, it is interesting to note that we also observe a negative impact (though not statistically significant) on the contagion rates. If the number of beds available to the population proxies the general quality of the health system of a country, then it can be argued that here we have faint evidence that the “hygiene hypothesis” is not working as expected. Such a hypothesis would predict that, contrary to our findings, the contagion rates should increase with the general quality of the health system of countries.

Our findings shed light on the relationship between wealth and COVID-19 diffusion and mortality in the first wave of the pandemic. This can be of interest for the current management of the pandemic, which is far from being over at the time of writing, and can also provide suggestions for policymakers in the unfortunate yet possible case of future pandemics.