# Empirical Project 2 Collecting and analysing data from experiments

### Learning objectives

In this project you will:

• collect data from an experiment and enter it into Excel
• use summary measures, for example, mean and standard deviation, and line and column charts to describe and compare data
• define and explain what statistical significance means (and what it does not mean)
• calculate and interpret the p-value
• evaluate the usefulness of experiments for determining causality, and the limitations of these experiments.

### Key concepts

• Concepts needed for this project: mean, variance, correlation, and causation.
• Concepts introduced in this project: standard deviation, range, statistical significance, and p-value.

## Introduction

### CORE projects

This empirical project is related to material in:

• Unit 2 of Economy, Society, and Public Policy
• Unit 4 of The Economy.

Just as scientists use experiments to investigate how physical processes work under certain conditions, social scientists use experiments to investigate how people might behave in particular situations. Although we cannot perfectly predict how people will actually behave, the controlled environment of experiments allows us to isolate the effects of a given change and identify specific reasons for the observed behaviour. If we keep all conditions the same and change only one thing, then we can be more certain that any differences we observe are due to that one change.

Economists use experiments to look at social interactions where one person’s decision affects the outcome for that individual and the outcomes for others. Some goods and services are called public goods because when one person bears the cost of providing the good, everyone can enjoy it. Irrigation projects and the production of new knowledge are examples of public goods. The problem with public goods provisioning is that completely self-interested people prefer to benefit from the good without paying anything for it—this is known as ‘free-riding’.

However, there are real-world examples of successful public goods provisioning, such as common irrigation projects in India and Nepal. What could explain such sustained contributions to a public good?

One explanation is that people contribute out of fear of peer punishment. If others in your community know that you haven’t contributed and could punish you (for example, by withholding help in the future or ostracizing you), then you may contribute even if you are self-interested. To see whether punishment could result in sustained contributions to a public good, researchers Herrmann, Thöni, and Gächter (2008) did a study where different groups of people, in various countries, participated in two public goods experiments.

The first experiment had ten rounds. In each round:

• Each person in the experiment (we call them subjects) is given $20.
• The subjects are randomly sorted into small groups, typically of four people who don’t know each other.
• They are asked to decide on a contribution from their $20 to a common pool of money.
• The pool of money is a public good. For every dollar contributed, each person in the group receives $0.40, including the contributor.
• After each round, the participants are told how much other members of their group contributed.

The second experiment was the same as the first, except with an additional punishment option. After observing the contributions of their group, individual players could pay to punish other players by making them pay a $3 fine. The punisher remained anonymous but had to pay $1 per player punished.

You can read about the Herrmann et al (2008) study in the Science magazine, and the economic concepts behind their experiment in Section 2.7 of Economy, Society, and Public Policy.
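The payoff rules of the two experiments can be made concrete with a short calculation. The Python sketch below is illustrative only and is not part of the study materials: the function names `payoffs` and `apply_punishment` are assumptions, and the numbers are the ones quoted in the text (a $20 endowment, $0.40 returned to each group member per dollar contributed, a $3 fine, and a $1 cost per player punished).

```python
ENDOWMENT = 20           # dollars given to each subject in a round
RETURN_PER_DOLLAR = 0.4  # paid to every group member per dollar in the pool
FINE = 3                 # dollars lost by a punished player
PUNISH_COST = 1          # dollars paid by the punisher per player punished


def payoffs(contributions):
    """Payoff of each group member in one round without punishment."""
    pool = sum(contributions)
    return [ENDOWMENT - c + RETURN_PER_DOLLAR * pool for c in contributions]


def apply_punishment(round_payoffs, punishments):
    """Adjust payoffs for a list of (punisher, target) index pairs.

    Hypothetical helper: it applies one fine and one cost per pair.
    """
    adjusted = list(round_payoffs)
    for punisher, target in punishments:
        adjusted[punisher] -= PUNISH_COST
        adjusted[target] -= FINE
    return adjusted


# Three players contribute everything; the fourth free-rides.
no_punishment = payoffs([20, 20, 20, 0])
print([round(p, 2) for p in no_punishment])

# Player 0 punishes the free-rider (player 3).
with_punishment = apply_punishment(no_punishment, [(0, 3)])
print([round(p, 2) for p in with_punishment])
```

In this example the free-rider ends the round with $44 while each contributor ends with $24, which is the incentive problem that the punishment option is meant to counteract.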

In this project, we will first learn more about how experimental data can be collected by playing a public goods game to collect our own data. Then we will look at ways to describe and analyse the experimental data from the two experiments described above, in order to answer the following research questions:

• Were there any differences in behaviour (average contributions) between the experiments?
• Can we attribute the observed differences in behaviour to the change in conditions, rather than to chance or coincidence?

## Working in Excel

### Part 2.1 Collecting data by playing a public goods game

(Note: You can still do Parts 2.2 and 2.3 without completing this part of the project.)

Before taking a closer look at the experimental data, you will play a similar public goods game with your classmates to learn how experimental data can be collected. If your instructor has not set up a game, follow the instructions below to set up your own game.

#### Instructions How to set up the public goods game

Form a group of at least four people. Choose one person to be the game administrator. The administrator will monitor the game, while the other people play the game.

1. Create the game: Go to this website, scroll down to the bottom of the page, and click ‘Create a Multiplayer Game and Get Logins’. Then click ‘Externalities and public goods’. Under the heading ‘Voluntary contribution to a public good’, click ‘Choose this Game’. Enter in the number of people playing the game, and select ‘1’ for the number of universes. Then click ‘Get Logins’. A pop-up will appear, showing the login IDs and passwords for the players and for the administrator.
2. Start the game: Give each player a different login ID. The game should be played anonymously, so make sure that players do not know the login IDs of other players. You are now ready to start the first round of the game. There are ten rounds in total.
3. Confirm that all the rounds are complete: On the top right corner of the webpage, click ‘Login’, enter your login ID and password, and then click the green ‘Login’ button. You will be taken to the game administration page, which will show the average contribution in each round, and the results of the round just played. Wait until all the players have finished playing ten rounds before refreshing this page.
4. Collect the game results: Once the players have finished playing ten rounds, refresh this page. The table at the top of the page will now show the average contribution (in euros) for each of the ten rounds played. Select the whole table, then copy and paste it into a new worksheet in Excel.

##### Players

2. Play the first round of the game: Read the instructions at the top of the page carefully before starting the game. In each round, you must decide how much to contribute to the public good. Enter your choice for each universe (group of players) that you are a part of (if the same players are in two universes, then make the same contribution in both), then click ‘Validate’.
3. View the results of the first round: You will then be shown the results of the first round, including how much each player (including yourself) contributed, the payoffs, and the profits. Click ‘Next’ to start the next round.
4. Complete all the rounds of the game: Repeat steps 2 and 3 until you have played ten rounds in total, then collect the results of the game from your administrator.

The results from your game will look like Figure 2.1. In the questions below you will compare your results with those in Figure 3 of Herrmann et al (2008), but first you need to reformat your table to look like Figure 2.2. Follow the steps in Walk-through 2.1 to reformat your table.

| Round | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|---|---|---|---|---|
| Average contribution | | | | | | | | | | |

Figure 2.1 A table formatted with ‘Round’ and ‘Average contribution’ as the row variables.

| Round | Average contribution |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
| 8 | |
| 9 | |
| 10 | |

Figure 2.2 A table formatted with ‘Round’ and ‘Average contribution’ as the column variables.

#### Walk-through 2.1 Reformatting a table

Figure 2.3 How to reformat a table.

The data

Figure 2.3a The table generated from playing the public goods game will look like the one shown in Rows 1–2. We need to reformat it so that the columns show the different variables, rather than the rows.

Figure 2.3b First, the table headings need to be copied and pasted so that they occupy one row, rather than one column.

Copy and paste the data values

Figure 2.3c After completing step 4, your table will look like this.

Rearrange rows in the correct order

Figure 2.3d The rows in Column A are currently arranged in descending order (Round 10, Round 9, and so on), so we will use Excel’s ‘Sort and Filter’ option to reverse this order.

Rearrange rows in the correct order

Figure 2.3e The ‘Expand the selection’ option means that the values in Column B will move along with the values in Column A (for example, the average contribution of 9 will still show up next to Round 1). This option prevents the information from being mismatched during the sorting process.

The reformatted table

Figure 2.3f The table is now reformatted as required.
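The same reformatting (turning rows into columns, then sorting by round) can be sketched outside Excel. The Python example below is a minimal illustration under stated assumptions: the contribution values are made up for the example, and the variable names `wide` and `tall` are hypothetical.

```python
# Hypothetical game results in the row-based layout of Figure 2.1.
wide = [
    ["Round", 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    ["Average contribution", 4, 5, 6, 7, 8, 10, 12, 14, 18, 9],
]

# Transpose: each row becomes a column, so the headings end up in the first row.
tall = [list(row) for row in zip(*wide)]
header, body = tall[0], tall[1:]

# Sort by round number. Each (round, contribution) pair moves as a unit,
# which is what the 'Expand the selection' option ensures in Excel.
body.sort(key=lambda row: row[0])

print(header)
for round_number, contribution in body:
    print(round_number, contribution)
```

Keeping each round and its contribution together in one list is what prevents the values from being mismatched during sorting.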

Use the results of the game you have played to answer the following questions.

1. Make a line chart with average contribution as the vertical axis variable, and period (from 1 to 10) on the horizontal axis. Describe how average contributions have changed over the course of the game.

#### Walk-through 2.2 Plotting a line chart with multiple variables

Figure 2.4 How to plot a line chart with multiple variables.

The data

Figure 2.4a This is what the data looks like. Each column has data for a particular country, and each row has data for a given time period (1 to 10). We will draw Figure 3 as an example; the steps to do Figure 2A are identical.

Draw a line graph

Figure 2.4b After completing step 3, the graph will look like this. Notice that the horizontal axis variable and vertical axis variables are not the same as Figure 3 (due to Excel’s default setting).

Switch the horizontal and vertical axis variables

Figure 2.4c We can switch the horizontal and vertical axis variables in the ‘Select Data’ options.

Switch the horizontal and vertical axis variables

Figure 2.4d After step 7, the lines on your chart will look like those in Figure 2.8a or Figure 2.8b.

Move the legend to the right

Figure 2.4e After step 9, the legend will now be on the right-hand side of your chart. You can also experiment with the other positions to see which looks better.

Add axis titles and a chart title

After step 13, your chart will look like Figure 2.8a or Figure 2.8b.

Figure 2.4f Add axis titles and a chart title

2. Compare your line chart with Figure 3 of Herrmann et al (2008). Comment on any similarities or differences between the results (for example, the amount contributed at the start and end, or the change in average contributions over the course of the game).
3. Can you think of any reasons why your results are similar to (or different from) those in Figure 3? You may find it helpful to read the ‘Experiments’ section of the Herrmann et al (2008) study for a more detailed description of how the experiments were conducted.

### Part 2.2 Describing the data

We will now use the data for Figures 2A and 3 of Herrmann et al (2008), and evaluate the effect of the punishment option on average contributions. Rather than compare two charts showing all of the data from each experiment, as the authors of the study did, we will use summary measures to compare the data, and show the data from both experiments (with and without punishment) on the same chart.

• The first table shows average contributions in a public goods game without punishment (Figure 3).
• The second shows average contributions in a public goods game with punishment (Figure 2A).

You can see that in each period (row), the average contribution varies across countries. In other words, there is a distribution of average contributions in each period.

variance
A measure of dispersion in a frequency distribution, equal to the mean of the squares of the deviations from the arithmetic mean of the distribution. The variance is used to indicate how ‘spread out’ the data is. A higher variance means that the data is more spread out. Example: The set of numbers 1, 1, 1 has zero variance (no variation), while the set of numbers 1, 1, 100 has a high variance of 2178 (large spread).

The mean and variance are two ways to summarize distributions. We will now use these measures, along with other measures (range and standard deviation) to summarize and compare the distribution of contributions in both experiments.

Before answering these questions, make sure you understand mean and variance, and how to calculate these measures in Excel.

1. Using the data for Figures 2A and 3 of Herrmann et al (2008):
• Calculate the mean contribution in each period (row) separately for both experiments, using Excel’s AVERAGE function.
• Plot a line chart of mean contribution on the vertical axis and time period (from 1 to 10) on the horizontal axis (with a separate line for each experiment). Make sure the lines in the legend are clearly labelled according to the experiment (with punishment or without punishment).
• Describe any differences and similarities you see in the mean contribution over time in both experiments.
2. Instead of looking at all periods, we can focus on contributions in the first and last period. Plot a column chart showing the mean contribution in the first and last period for both experiments. Your chart should look like Figure 2.6 below.

#### Walk-through 2.3 Drawing a column chart to compare two groups

Figure 2.5 How to draw a column chart to compare two groups.

The data

Figure 2.5a This is what the data looks like. Column R has the means for Figure 3 (without punishment). Column S has the means for Figure 2A (with punishment). We will use the cells in bold font to make the chart.

Draw the column chart

Figure 2.5b After completing step 3, the column chart will look like this.

Change horizontal and vertical axis variables, and change legend labels

Figure 2.5c After completing step 6, the chart will now look like this, with the data for Period 1 grouped together, and the data for Period 10 grouped together.

Change horizontal and vertical axis variables, and change legend labels

Figure 2.5d After step 7, ‘Series 1’ will be renamed as ‘Without punishment’.

Change the horizontal axis labels

Figure 2.5e Once you change the series title, the changes will show up in the legend.

Change the horizontal axis labels

Figure 2.5f Instead of ‘1’ and ‘2’ on the horizontal axis, the labels will change to ‘1’ and ‘10’ (referring to the period number in the game) once you exit the box.

Add data labels on top of the columns

Figure 2.5g After completing step 14, numbers showing the height of the column will appear on top of the columns selected.

Add axis titles and a chart title

Figure 2.5h After step 18, your chart will look like Figure 2.6.

Figure 2.6 Average contributions in Rounds 1 and 10, with and without punishment.

standard deviation
A measure of dispersion in a frequency distribution, equal to the square root of the variance. The standard deviation has a similar interpretation to the variance. A larger standard deviation means that the data is more spread out. Example: The set of numbers 1, 1, 1 has a standard deviation of zero (no variation or spread), while the set of numbers 1, 1, 100 has a standard deviation of 46.7 (large spread).

The mean is one useful measure of the ‘middle’ of a distribution, but is not a complete description of what our data looks like. We also need to know how ‘spread out’ the data is in order to get a clearer picture and make comparisons between the distributions. The variance is one way to measure spread—the higher the variance, the more spread out the data is.

A similar measure is the standard deviation, which is the square root of the variance. Standard deviation is commonly used because it provides a handy rule of thumb for large datasets: most of the data (about 95% if there are many observations and the distribution is roughly bell-shaped) will lie within two standard deviations of the mean.
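To see how the variance, the standard deviation, and the rule of thumb fit together, here is a minimal Python sketch. It is an illustration only: the contribution values are hypothetical, the helper name `share_within_two_sd` is an assumption, and the population versions of the measures (`pvariance`, `pstdev`) are used because they match the ‘mean of the squared deviations’ definition and Excel’s STDEV.P.

```python
from statistics import mean, pstdev, pvariance


def share_within_two_sd(values):
    """Fraction of observations within two standard deviations of the mean."""
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation
    inside = [v for v in values if abs(v - centre) <= 2 * spread]
    return len(inside) / len(values)


# Hypothetical average contributions for 16 groups in one period.
contributions = [8.0, 9.5, 10.1, 10.8, 11.2, 11.9, 12.4, 13.0,
                 9.0, 10.5, 14.2, 7.6, 11.0, 12.8, 10.3, 16.5]

print("mean:", round(mean(contributions), 2))
print("variance:", round(pvariance(contributions), 2))
print("standard deviation:", round(pstdev(contributions), 2))
print("share within two SDs:", share_within_two_sd(contributions))
```

With only a handful of observations the share will not be exactly 95%; the rule of thumb is an approximation that works best for large, roughly bell-shaped datasets.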

3. Using the data for Figures 2A and 3 of Herrmann et al (2008):
• Calculate the standard deviation for Periods 1 and 10 separately, for both experiments. Does the rule of thumb apply? (In other words, are most values within two standard deviations of the mean?)
• As shown in Figure 2.6, the mean contribution for both experiments was 10.6 in Period 1. With reference to your standard deviation calculations, explain whether this means that the two sets of data are the same.

#### Walk-through 2.4 Calculating and understanding the standard deviation

Figure 2.7 How to calculate and understand the standard deviation.

The data

Figure 2.7a We will compare the example data from Walk-through 2.1 with some new data that is less spread out. You can see from Column H that all the values are between 10 and 12 (inclusive).

Standard deviation calculation and interpretation

Figure 2.7b Excel’s STDEV.P function will calculate the standard deviation over the selected cells. To enter in the formula, click on an empty cell.

The relationship between the standard deviation and the variance

Figure 2.7c Both the variance and standard deviation measure spread. We need the variance to calculate the standard deviation, but we usually use the standard deviation to describe distributions because of the handy rule of thumb.

range
The interval formed by the smallest (minimum) and the largest (maximum) value of a particular variable. The range shows the two most extreme values in the distribution, and can be used to check whether there are any outliers in the data. (Outliers are a few observations in the data that are very different from the rest of the observations.)

Another measure of spread is the range, the interval formed by the smallest (minimum) and the largest (maximum) values of a particular variable. For example, we might say that the number of periods in the public goods experiment ranges from 1 to 10. Once we know the most extreme values in our dataset, we have a better picture of what our data looks like.

4. Calculate the maximum and minimum values for Periods 1 and 10 separately, for both experiments.

#### Walk-through 2.5 Finding the minimum, maximum, and range of a variable

Figure 2.8 How to find the minimum, maximum, and range of a variable.

The data

Figure 2.8a Here we are going to calculate the minimum and maximum value of the example data in Walk-through 2.1: Reformatting a table.

Calculate the minimum value

Figure 2.8b The minimum value is 4.

Calculate the maximum value and the range

Figure 2.8c The maximum value is 18. The range is the interval formed by the minimum and the maximum value. In words, we say ‘the range is 4 to 18’ or ‘the average contribution ranges from 4 to 18’. In numbers, we say ‘the range is [4,18]’.

5. A concise way to describe the data is in a summary table. With just four numbers (mean, standard deviation, minimum value, maximum value), we can get a general idea of what the data looks like.
• In Excel, create a summary table as shown in Figure 2.9 below. Make three more summary tables, for Period 10 (without punishment), Period 1 (with punishment), and Period 10 (with punishment). Use your answers to Questions 2 to 4 to complete the summary tables.
• Comment on any similarities and differences in the distributions, both across time and across experiments.

| | Mean | Standard deviation | Minimum | Maximum |
|---|---|---|---|---|
| Contribution (Period 1, without punishment) | | | | |

Figure 2.9 A summary table for contributions in a given period.
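A summary table like Figure 2.9 can also be produced programmatically. The sketch below is a hedged illustration, not the project’s method: the function name `summarize` and the data values are hypothetical, and the standard deviation is the population version (matching Excel’s STDEV.P).

```python
from statistics import mean, pstdev


def summarize(values):
    """Return the four summary measures used in the summary table."""
    return {
        "Mean": mean(values),
        "Standard deviation": pstdev(values),
        "Minimum": min(values),
        "Maximum": max(values),
    }


# Hypothetical Period 1 contributions for each experiment.
data = {
    "Period 1, without punishment": [8.0, 9.5, 10.1, 12.4, 13.0, 10.6],
    "Period 1, with punishment": [6.2, 9.9, 10.4, 11.8, 14.7, 10.6],
}

for label, values in data.items():
    row = summarize(values)
    cells = ", ".join(f"{name}: {value:.2f}" for name, value in row.items())
    print(f"Contribution ({label}) -> {cells}")
```

Printing one row per experiment and period makes it easy to compare the distributions side by side, as the question asks.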

### Part 2.3 Did changing the rules of the game have a significant effect on behaviour?

The punishment option was introduced into the public goods game in order to see whether it could help sustain contributions, compared to the game without a punishment option. We will now learn how to compare the results from both experiments more formally.

By comparing the results in Period 10 of both experiments, we can see that the mean contribution in the experiment with punishment is 8.5 units higher than in the experiment without punishment (see Figure 2.6). Is this difference more likely to be due to chance, or to the difference in experimental conditions?

1. You can conduct another experiment to understand why we might see differences in behaviour that are due to chance.
• First, flip a coin six times, using one hand only, and record the number of times that you get ‘heads’. Then, using the same hand, flip a coin six times and record the number of times that you get ‘heads’.
 • Compare the outcomes from Question 1(a). Did you get the same number of heads in both cases? Even if you did, was the sequence of the outcomes (for example, heads, then tails, then tails …) the same in both cases?

The important point to note is that even when we conduct experiments under the same controlled conditions, due to an element of randomness, we may not observe the exact same behaviour each time we do the experiment.
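The coin-flip exercise can be repeated many times in a simulation to see how often identical conditions produce different outcomes. This Python sketch is an illustration under stated assumptions: it assumes a fair coin, the function name `count_heads` is hypothetical, and the seed is fixed only so that the run is repeatable.

```python
import random


def count_heads(flips, rng):
    """Number of heads in a given number of fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(flips))


rng = random.Random(2)  # fixed seed so the example is repeatable

# One pair of six-flip sequences, as in the exercise.
first = count_heads(6, rng)
second = count_heads(6, rng)
print("heads in first six flips:", first)
print("heads in second six flips:", second)

# Repeat the pair many times and see how often the two counts differ.
trials = 10_000
different = sum(
    count_heads(6, rng) != count_heads(6, rng) for _ in range(trials)
)
share_different = different / trials
print("share of pairs with different head counts:", share_different)
```

Even though every pair of sequences is generated under identical conditions, the two head counts differ in most pairs, purely because of chance.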

statistically significant
When a relationship between two or more variables is unlikely to be due to chance, given the assumptions made about the variables (for example, having the same mean). Statistical significance does not tell us whether there is a causal link between the variables.

If the observed differences are not likely to be due to chance, then we say the differences are statistically significant. To determine whether the difference in means is statistically significant or not, we need to consider the size of the difference relative to the standard deviation of both distributions (how spread out the data is).

The fact that statistical significance relies on a relative comparison is very important. The size of the difference alone cannot tell us whether something is statistically significant or not. In fact, even if the observed differences are large, it is not a guarantee that the differences are statistically significant. Figures 2.10 and 2.11 show the mean exam score of two groups (represented by the height of the columns, and reported in the boxes above the columns), with the dots representing the underlying data. Figure 2.10 shows that a relatively large difference in means is not statistically significant because the data is widely spread out (the standard deviation is large), while Figure 2.11 shows that a relatively small difference is statistically significant because the data is tightly clustered together (the standard deviation is very small). In Figure 2.10, the difference in means is likely to be due to chance, but in Figure 2.11, the difference in means is not likely to be due to chance.

Figure 2.10 An example of a large difference in means that is not statistically significant.

Figure 2.11 An example of a small difference in means that is statistically significant.

To determine statistical significance, we use the size of the difference and the standard deviation to calculate a probability (called a p-value): the probability of seeing a difference at least as large as the one we observe, assuming that the means of both distributions are the same. Since the p-value is a probability, it ranges from 0 to 1 (inclusive). The smaller the p-value, the less likely it is that we would observe such data if the means really were the same, so the stronger the evidence that our assumption is false (in other words, that the means of the two distributions are not the same).

significance level
A cut-off probability that determines whether a p-value is considered statistically significant. If a p-value is smaller than the significance level, it is considered unlikely that the differences observed are due to chance, given the assumptions made about the variables (for example, having the same mean). Common significance levels are 1% (p-value of 0.01), 5% (p-value of 0.05), and 10% (p-value of 0.1). See also: statistically significant, p-value.

Our conclusions will depend on our definition of a ‘small’ probability. We define ‘small’ by choosing a cut-off (a percentage, also referred to as a significance level). Any probability smaller than that cut-off would be considered ‘small’. Some commonly used cut-offs are 1% (p-value of 0.01), 5% (p-value of 0.05), and 10% (p-value of 0.1).

To calculate the p-value, we use a function in Excel called T.TEST.

2. Using the data for Figures 2A and 3:
• Use Excel’s T.TEST function to calculate the p-value for the difference in means in Period 1 (with and without punishment). What is the p-value?
• With a cut-off of 5% (p-value of 0.05), can we conclude that the difference in means is significant? Why or why not?

#### Walk-through 2.6 Using Excel’s T.TEST function

Figure 2.12 How to use Excel’s T.TEST function.

The data

Figure 2.12a Here we are going to calculate the p-values we need.

Calculate the p-value for Period 1 data

Figure 2.12b Excel’s T.TEST function will calculate the p-value for the two groups of cells selected. In the example shown, the formula to type is =T.TEST(B3:Q3,B17:Q17,2,1).

Do a t-test with Period 10 data

Figure 2.12c Follow these steps to calculate the p-value for Period 10 (the cells in the dotted boxes). The only difference from the previous formula (for Period 1) is the data selected.
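In the T.TEST formula shown in this walk-through, the third argument (2) requests a two-tailed test and the fourth argument (1) requests a paired test. The Python sketch below shows roughly what such a calculation involves, using only the standard library. It is an approximation under stated assumptions, not Excel’s implementation: the function names are hypothetical, the data is made up, and the p-value is obtained by numerically integrating the density of Student’s t distribution with Simpson’s rule.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10_000):
    """Two-tailed p-value, integrating the density over [0, |t|].

    Uses Simpson's rule, so steps must be even.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability of lying in [-|t|, |t|]
    return max(0.0, 1 - central_area)


def paired_t_test(first, second):
    """Return (t statistic, two-tailed p-value) for paired observations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    diffs = [a - b for a, b in zip(first, second)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    t_stat = mean(diffs) / (spread / math.sqrt(n))
    return t_stat, two_tailed_p(t_stat, n - 1)


# Hypothetical Period 10 contributions for the same eight groups.
without_punishment = [4.1, 5.0, 3.2, 6.8, 2.9, 5.5, 4.4, 3.7]
with_punishment = [11.9, 13.2, 9.8, 15.1, 12.4, 10.7, 14.0, 12.6]

t_stat, p_value = paired_t_test(with_punishment, without_punishment)
print("t statistic:", round(t_stat, 3))
print("significant at the 5% level:", p_value < 0.05)
```

The t statistic divides the mean difference by a measure of its spread, which is why the same difference in means can be significant in one dataset and not in another.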

3. Using the data for Period 10:
• Use Excel’s T.TEST function to calculate the p-value for the difference in means in Period 10 (with and without punishment). What is the p-value?
• With a cut-off of 5%, can we conclude that the difference in means is significant? Why or why not?
 • With reference to Figures 2.10 and 2.11, explain why we cannot use the size of the difference to directly conclude whether the difference in means is significant or not.

spurious correlation
A strong linear association between two variables that does not result from any direct relationship, but instead may be due to coincidence or to another unseen factor.

An important point to note is that statistical significance cannot tell us anything about causation. In the example of house size and exam scores shown in Figure 2.11, there was a statistically significant relationship between the two variables (students living in a three-bedroom house had higher exam scores, on average, than students living in a two-bedroom house). However, we cannot say that the larger size of the house was the cause of higher exam scores because it’s unlikely that building an extra room would automatically make someone smarter. Statistical significance cannot help us detect these spurious correlations.

However, experiments can help us determine whether there is a causal link between two variables. If we conduct a controlled experiment in which only one condition differs between the groups, and find a statistically significant difference in outcomes, then we have good grounds to conclude that the change in that condition caused the difference.

4. Refer to the results from the public goods games.
• Which characteristics of the experimental setting make it likely that the punishment option was the cause of the change in behaviour?
• With reference to Figure 2.6, explain why we need to compare the two groups in Period 1 in order to conclude that there is a causal link between the punishment option and behaviour in the game.

Laboratory experiments can be useful for identifying causal links. However, if people’s behaviour in experimental conditions were different to their behaviour in the real world, our results would not be applicable anywhere outside the lab.

5. Discuss some limitations of lab experiments, and suggest some ways to address (or partially address) them. (You may find pages 158–171 of the paper ‘What do laboratory experiments measuring social preferences reveal about the real world?’ helpful, as well as the discussion on free-riding and altruism in Section 2.6 of Economy, Society, and Public Policy.)

## Working in R

This section is under development and will be in the next release of Doing Economics: Empirical Projects