Empirical Project 12 Working in Google Sheets

Google Sheets-specific learning objectives

In addition to the learning objectives for this project, in this section you will learn how to convert cells from text to number format.

Part 12.1 Inequality

Learning objectives for this part

  • draw Lorenz curves and calculate Gini coefficients
  • assess the effect of a policy on income inequality
  • convert nominal values to real values (extension).

One reason cited for Scheme $6,000 was to share the gains from economic growth among everyone in the society. We will be using household income data collected by the Hong Kong government to assess the potential effects of this scheme.

Download the data. The spreadsheet contains information about household incomes for certain percentiles of the population. These incomes are ‘pre-intervention’, meaning that they do not include the effects of handouts or policy interventions from the government. (If you are curious about the difference between the two tables in this spreadsheet, see the extension section ‘Nominal and real values’ at the end of Part 12.1.)

  1. Using the table ‘Monthly real household income (pre-tax, $HKD)’, plot a separate line chart for each percentile, with year on the horizontal axis and income on the vertical axis. Describe any patterns you see over time.

Now we will use this data to draw Lorenz curves and compare changes in the income distribution for 2011–2012. One way to do this is to make the following simplifying assumptions:

  • There are 100 households in the economy (so we can think of each percentile as corresponding to one household).
  • Households between the 15th and 25th percentile have the same income as the household in the 15th percentile, households between the 25th and 50th percentile have the same income as the household in the 25th percentile, and so on. (Households below the 15th percentile earn nothing.)
  1. Draw Lorenz curves for 2011 and 2012 by carrying out the following:
  • Create a new column for 2012 only, showing incomes following the $6,000 handout. Assume that nothing has changed except for the cash handout. (Remember that this amount was given to all households, including those with no income.)
  • Calculate the economy-wide earnings in 2011 and 2012 (with the $6,000 cash handout included). (Hint: Multiply the income of a given percentile by the number of households assumed to earn that amount.)
  • Use your answer to Question 2(b) to complete the table in Figure 12.1 below. (The second row also shows zeros in 2011 because the bottom 15% of households earned nothing.) (For help on how to calculate cumulative shares, see Google Sheets walk-through 5.1).
Cumulative share of the population (%) Perfect equality line Cumulative share of income in 2011 (%) Cumulative share of income in 2012 (%)
0 0 0 0
15 15 0
25 25
50 50
75 75
85 85
100 100 100 100

Figure 12.1 Cumulative share of income, for some percentiles of the population.

  • Draw the Lorenz curves for 2011 and 2012 in the same chart, with cumulative share of population on the horizontal axis and cumulative share of income on the vertical axis. In order for the chart to be drawn to scale, use the scatterplot option ‘Scatter with Straight Lines and Markers’ instead of the usual line chart option. Add the perfect equality line to your chart, and add a chart legend.

Now we will compare the Gini coefficients in 2011, 2012 (with the cash handout), and 2013 (a year after the policy came into effect). Instead of calculating the coefficients manually, we will use an online Gini coefficient calculator, which takes a list of values and calculates the Gini coefficient.

  1. Calculate the Gini coefficient by carrying out the following:
  • Create a table as in Figure 12.2, and fill in the remaining values (some values for 2011 have been filled in for you). The first column should contain the numbers 1 to 100, in intervals of 1, and the remaining columns should contain the incomes earned by a household at that percentile (using the same assumption as in Question 2). Remember that the 2012 data should include the cash handout. To obtain more accurate answers, use two decimal places instead of rounding incomes to the nearest dollar.
Percentile 2011 2012 2013
1 0.00
2 0.00
3 0.00
98 44,516.67
99 44,516.67
100 44,516.67

Figure 12.2 Incomes earned by each percentile of the population.

  • Create a new column for each year, containing the incomes with a comma added at the end (e.g. ‘40,000’ would be ‘40,000,’). Then, for each year, copy and paste the income values for all percentiles into the Gini coefficient calculator to obtain the Gini coefficient for that year.

Hint: The CONCATENATE function can combine the contents of cells and/or text together. The formula =CONCATENATE(cell1, “,”) adds a comma after the contents of cell1.

  • Based on the Gini coefficients from Question 3(b), what effect did the $6,000 handout appear to have on income inequality in the short run, and in the long run? Suggest some explanations for what you observe.
  1. In our analysis we assumed that the $6,000 handout was the only policy that affected households in 2012. In reality a household’s disposable income will also depend on taxes and transfers. Without doing additional calculations, explain what would happen to the shape of the Lorenz curve and inequality in 2012 if:
  • households in and below the 15th percentile received cash transfers from the government
  • households in and above the 75th percentile had to pay income tax.

Extension Nominal and real values

In this extension section, we will discuss how the table ‘Monthly nominal household income (pre-tax, $HKD)’ taken from the ‘Hong Kong poverty situation report 2016’ was used to create the table ‘Monthly real household income (pre-tax, $HKD)’, and why we needed to make this conversion.

An increase in the general price level in the economy. Usually measured over a year. See also: deflation, disinflation.

The difference between real and nominal income is that real income takes inflation into account. You may be familiar with the concept of inflation, which is an increase in the general price level in the economy. Usually inflation is measured by taking a fixed bundle of goods and services and looking at how much it would cost to buy that bundle, compared to a reference year. (For more details about real vs nominal variables, see the Einstein ‘Comparing income at different times, and across different countries’ in Section 1.2 of The Economy). If the bundle of goods and services has become more expensive, then we conclude that the price level in the economy has increased.

In this case, the values from 2010 onwards have been adjusted to account for the fact that prices have increased since 2009, so the same income would be able to purchase fewer goods and services. Without making this adjustment, we would conclude that households in the 15th percentile had the same purchasing power in 2009 and 2010, when in fact they do not as they can buy fewer goods and services in 2010 because of the overall price increase.

  1. Convert nominal values to real values, using 2009 as the reference year:
  • To understand what happens to a given nominal value over time due to inflation, create a table as in Figure 12.3 below, and fill it in according to the percentage increases shown. (These percentages were taken from the Monthly Digest of Statistics.) (For example, $1 in 2009 would be $1 × (1 + (2.40/100)) = $1.024 in 2010.) For greater accuracy, round your answers to three decimal places. With a starting value of $1 in 2009, what would the value be in 2016?
Year Percentage increase (from previous year) Inflation index
2009   1.000
2010 2.40 1.024
2011 5.30  
2012 4.10  
2013 4.30  
2014 4.40  
2015 3.00  
2016 2.40  

Figure 12.3 Creating an index-based series from percentage increases.

  • Use this table to convert nominal incomes to real incomes by dividing the nominal income by the corresponding value in the third column (for example, divide nominal incomes in 2010 by 1.024 to get the value in 2009 terms). You should get the same values as in the ‘real household income’ table.

Part 12.2 Government popularity

Learning objectives for this part

  • assess the effect of a policy on government popularity.

One possible reason why the government implemented Scheme $6,000 was to gain public approval, since there was some pressure on the government to spend the surplus on alleviating current social issues rather than reinvesting it (for example, in pension schemes). We will use a public opinion poll conducted by the University of Hong Kong to assess whether this scheme could have improved public satisfaction with the government.

  1. Think about the groups who would be affected by this scheme (for example, the government or members of the public). Who would be likely to support or oppose this scheme, and why?

Download the data:

  • Go to the overall performance results page on the HKU POP website, which contains half-yearly survey data on the overall performance of the government.
  • Under the subheading ‘Collapsed data’, copy and paste the entire table directly into a new tab in Google Sheets. (The data is not available in Google Sheets format, so you will have to resort to this method.)
  • Google Sheets will recognize the data you have pasted as text, but to make charts, the data needs to be in number format. Follow the steps in Google Sheets walk-through 12.1 below to reformat the column called ‘Net Value’.
  • Read the HKU POP survey methods page for a description of how the survey data was collected. Explain whether you think the sample is representative of the target population, and discuss some limitations of the survey method.

Google Sheets walk-through 12.1 Using SUBSTITUTE to clean text in cells

Figure 12.4 How to use SUBSTITUTE to clean text in cells by removing spaces before and after the text.

The data

We will use the data in Column I as an example.

Numbers stored as text

Google Sheets currently recognizes the values in Column I as text rather than numbers, so when you try to plot a line chart, it will come out like this, with no actual data being plotted.

Remove unnecessary spaces

Google Sheets thinks the numbers are text because there is a space before and after the number. To remove the spaces, we use the SUBSTITUTE function (which removes non-breaking spaces i.e. spaces that do not result in a new line), then the TRIM function (which removes spaces before and after text).

Copy and paste values into a new column

The numbers in Column J are still stored as text, but we cannot convert Column J to numbers because there is a formula in those cells. Instead, we need to copy and paste (values only) into a new column, then convert the pasted data to numbers.

Convert text to numbers

Now Google Sheets will recognize that numbers are stored as text, and will give you the option to convert them to numbers.

Numbers correctly stored in Google Sheets

Now Google Sheets recognizes the values in Column K as numbers, so you can use them to plot a line chart.

  1. Assess public satisfaction with the government by carrying out the following:
  • Make a line chart with overall public satisfaction (net value, which is the difference between percentage of positive and negative responses) on the vertical axis, and time (Jan–June 2006 to the latest period available) on the horizontal axis. Remember to sort your data in ascending date order (earliest to latest) before making your chart (for help, see Google Sheets walk-through 5.3). Comment on any trends in overall public satisfaction over this time period.
  • Go to the POP polls main data page and choose one or two other indicators that are directly related to the policy (for example, improving people’s ‘Degree of prosperity’ or ‘Degree of equality’). Find a table of the data by clicking ‘Content’, then ‘Table’ (if half-yearly data is not available, choose the most similar time interval). Copy and paste the data into a new tab, and reformat the variable of interest as in Question 2.
  • For each of your chosen indicators, make a separate line chart as in Question 3(a) and comment on any similarities to or differences from the chart in Question 3(a). (Since some indicators may be measured on a different scale, focus on changes over time.)
  • Do you think the scheme had the intended effect on government popularity? Besides the scheme, what other factors or events could explain the observed patterns?
  1. In 2018, the government decided to do another cash handout. Read the article ‘Hong Kong cash handout scheme will cost government HK$330 million to administer’ and discuss how this scheme differs from the 2011 scheme. Explain whether you think this policy is an improvement over the 2011 scheme.
  1. Suppose you are a policymaker in a developed country with a large budget surplus, and one of the government’s aims is to reduce income inequality. Would you recommend that the government implement a scheme similar to either the 2011 or 2018 scheme? If you recommend a cash handout, suggest some modifications that could make the scheme more effective. If not, suggest other policies that may be more effective in reducing inequality. (You may find it helpful to research policies aimed at reducing income inequality, for example Universal Basic Income, which some countries have tried.)