Empirical Project 10 Working in Excel

Part 10.1 Summarizing the data

We will be using the World Bank’s Global Financial Development Database.

Download the data and documentation:

The World Bank’s Database contains information about four categories:

We will be looking at the first three categories, focusing particularly on measures of stability before and after the 2008 global financial crisis. Each category is measured by a number of indicators. Figure 10.1 shows the indicators we will be using in this project.

Category Indicator name Indicator code
Depth Private credit by deposit money banks to GDP (%) GFDD.DI.01
  Deposit money banks’ assets to GDP (%) GFDD.DI.02
Access Bank accounts per 1,000 adults GFDD.AI.01
  Bank branches per 100,000 adults GFDD.AI.02
  Firms with a bank loan or line of credit (%) GFDD.AI.03
  Small firms with a bank loan or line of credit (%) GFDD.AI.04
Stability Bank Z-score GFDD.SI.01
  Bank regulatory capital to risk-weighted assets (%) GFDD.SI.05

Indicators used in this project.

Figure 10.1 Indicators used in this project.

  1. The ‘Definitions and Sources’ tab in your spreadsheet contains a description of all indicators in the Database. Use the information provided to explain briefly why each of the indicators listed in Figure 10.1 may be a good measure of that category, or may give misleading information about that category.

The ‘Data – June 2016’ tab contains the values of each indicator over time (1960–2014) for various countries around the world, though data may be missing for some countries and years.

  1. For each category, calculate and interpret the correlation coefficient between the indicators. Remember that the correlation between A and B is the same as the correlation between B and A, so you only need to calculate the correlation for each pair of items once. (Hint: For Access, you may find it helpful to display the pairwise correlations in a table like Figure 10.2.)
  GFDD.AI.01 GFDD.AI.02 GFDD.AI.03 GFDD.AI.04
GFDD.AI.01 1
GFDD.AI.02   1
GFDD.AI.03     1
GFDD.AI.04       1

Correlation table for indicators in Question 2.

Figure 10.2 Correlation table for indicators in Question 2.

Box and whisker plots are useful for looking at a single variable and checking if there are many extreme values (either very large or very small, relative to the rest of the values).

  1. Make a separate box and whisker plot for each indicator, with the outliers displayed (see Excel Walk-through 6.3 for help on how to do this). From looking at your plots, do extreme values appear to be an issue? (In Question 6, we will look at one way to handle extreme values if there is a concern that one or a few very extreme values will significantly affect the average.)

Now we will use Excel’s PivotTable option to make summary tables of some indicators and look at how they have changed over time. Each country belongs to a particular region and income group.

  1. Choose one indicator in Depth and one indicator in Access:

So far, we have been looking at simple averages where each observation is given the same weight (importance), so we simply add up all the numbers and divide by the number of observations. However, when we take averages across regions or income groups, we may want to account for the fact that countries differ in size (population or GDP). For example, if one country is far larger than another, we may want that country to have a larger influence on the average. See the box below for more about weighted averages.

weighted average
A type of average that assigns greater importance (weight) to some components than to others, in contrast with a simple average, which weights each component equally. Components with a larger weight can have a larger influence on the average.

Weighted averages

An example of weighted averages that you have probably experienced is in calculating course grades. Usually, course grades are not calculated by simply summing up the scores in all components and dividing by the number of components. Instead, certain components such as the final exam are given more importance (influence over the overall grade) than the midterm exam or course assignments.

To calculate the weighted average, we first determine the weight of each component (these are fractions or proportions that sum to 1). Then we multiply each component by its respective weight, and then sum over all components. Using the course grade as an example, suppose the final exam is worth 70% of the final grade and the midterm exam is worth 30%, with both being scored out of 100. Then the weighted average would be:

In comparison, the simple average would give both components equal weight:

To develop your intuition for this concept, you can experiment by choosing values for the final exam score and midterm exam score and seeing how a change in one of the scores affects the weighted and simple averages.

The indicator ‘Bank regulatory capital to risk-weighted assets (%)’ in the Database also uses weights to account for the fact that some assets are riskier than others, and should therefore not be considered equally.

We will practice calculating weighted averages for the indicator ‘Bank accounts per 1,000 adults’, weighting according to total population in each region (so countries with a larger population will have a larger influence on the average). Since data is missing for some countries, we will calculate the total population in each region as the total population for countries with non-missing data.

  1. For each region and for the years 2004–2014:

Extension Using Winsorization to handle extreme values

If we are interested in combining indicators into a single index (as in Empirical Project 4), we may be concerned about extreme values, but still want to include these countries in the index (rather than excluding them from the calculations). On page 19 of the paper ‘Benchmarking financial systems around the world’, the authors discuss Winsorization (replacing extreme values with either the 95th or the 5th percentile value) as one way to handle these extreme values. Sometimes the extreme values are due to peculiar features of a single country, so we might want to adjust the data to make the values ‘less extreme’.

  1. For an indicator you have used in Questions 4 and 5 and for the year 2010:
  • Calculate the 95th and 5th percentile value of that indicator, across all countries. (Hint: Use Excel’s PERCENTILE.INC function.)
  • Replace any value larger than the 95th percentile value with the 95th percentile value, and replace any value smaller than the 5th percentile value with the 5th percentile value. (Hint: Use Excel’s IF function.)
  • Use your ‘Winsorized’ values from Question 6(b) to calculate the average values of the indicator, by region and income group (separately). Compare these values to the simple averages from Question 4(a).

Part 10.2 Comparing financial stability before and after the 2008 global financial crisis

Now we will assess whether financial stability (measured by the two indicators in Figure 10.1) has changed since the 2008 global financial crisis.

  1. For both indicators of stability in Figure 10.1, explain what effect the post-crisis banking regulations are likely to have on the value of the indicator (for example, would the value increase or decrease?), and why. You may find it helpful to research the regulations that were implemented as a result of the 2008 global financial crisis.
  1. For the years 2007 and 2014:
  1. For each indicator:
leverage ratio (for banks or households)
The value of assets divided by the equity stake (capital contributed by owners and shareholders) in those assets.