Technical reference

Excel versions

The Excel walk-throughs have been created using Excel version 16.16.8 on Windows. Some of the Excel instructions will vary for Mac or Linux users. For more on using Excel see the official Excel help centre.

R versions

The R walk-throughs have been created using R version 3.5.0 and RStudio version 1.1.453 on Windows. If you are inexperienced in the use of R, we recommend the free Beginner’s guide to R by Computerworld. This course also describes the main components of the RStudio user interface. For R support more specific to economists, you may want to consult the pages on the ECLR wiki. On the CRAN Task Views page you can find links to useful packages sorted by topic.

Google Sheets versions

The Google Sheets walk-throughs have been created using Google Sheets on Windows, prior to the Google Sheets interface update in early 2019. There are no functionality changes, so the instructions are still valid, but there may be some visual differences between the walk-through images and the current Google Sheets interface. For more on using Google Sheets, see the official G Suite Learning Center.

Python versions

The Python walk-throughs were created with Python and Operating System versions:

Last updated: 2023-08-25T10:56:39.879607+01:00

Python implementation: CPython
Python version       : 3.9.10
IPython version      : 8.14.0

Compiler    : Clang 12.0.0 (clang-1200.0.32.29)
OS          : Darwin
Release     : 21.6.0
Machine     : x86_64
Processor   : i386
CPU cores   : 10
Architecture: 64bit

The versions of Python packages used are:

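import toml

# Read the project's dependency list from pyproject.toml
toml_file = toml.load("pyproject.toml")
pkgs = toml_file["tool"]["poetry"]["dependencies"]
del pkgs["python"]  # drop the Python version constraint, leaving only packages
pkgs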
{'pandas': '^2.0.3',
 'jupyterlab': '^3.2.1',
 'matplotlib': '^3.4.3',
 'openpyxl': '^3.0.9',
 'pingouin': '^0.5.0',
 'toml': '^0.10.2',
 'rich': '^10.15.2',
 'jupyter': '^1.0.0',
 'skimpy': '^0.0.10',
 'black-nb': '^0.7',
 'regex': '^2022.9.13',
 'html5lib': '^1.1',
 'lets-plot': '^3.2.0'}

Character encoding in R on Windows

Windows machines use a limited character encoding set by default, which means you might see some symbols as ‘???’. To resolve this, open the file in which you see the unusual symbols in RStudio, select ‘File’ > ‘Reopen with encoding…’ > ‘UTF-8’, and select ‘Set as default encoding for source files’, which will prevent this from happening in the future in RStudio.

Creating a folder on Mac

To create a folder on Mac, on your desktop or in Finder, use Shift-Command-N.

Finding your path on Mac

Right-click on the file or folder for which you want to find the path and select ‘Get info’. You can copy the file path from the ‘Where’ section of the info box that appears.

Finding your path on Windows

Right-click on the file or folder for which you want to find the path and select ‘Properties’. You can copy the file path from the ‘Location’ section of the ‘General’ tab of the properties box that appears.

Getting started in Python

Congratulations on starting your coding and economics journey! In this introductory chapter, we’re going to help you install or access the tools you need. We recommend you read and follow these instructions before starting your first project because it’s important to be able to run code. First, we’re going to give you some background on key concepts for coding. Then, we’ll give you an option to either follow instructions to start coding on your own computer or use a popular online cloud service for coding. Either way, by the end of these instructions, you will be running code.

Remember that all of the code in this section is also available online on GitHub: you can find it at Core Python.

Python and programming languages

Programming languages are a way to issue explicit instructions to a computer to perform operations—here, the operations will all be about doing economics!

This version of Doing Economics uses a programming language called Python. Python usually ranks as the most or second-most popular programming language in the world and is also considered to be one of the easiest to learn. It’s a general-purpose programming language, which means it can perform a wide range of tasks. This combination of being relatively easy to learn but extendable to many applications is why people say Python has a ‘low floor and a high ceiling’. As a language, it is widely used across industry, academia, and the public sector, and is frequently taught in schools. It has been applied to create the first image of a black hole, perform economic analysis, and it is behind the large language models (like ChatGPT) that are revolutionising how we think about AI.

Programming languages come in versions and, at the time of writing, Python 3.11 was the most recent version released. But the ‘base’ language isn’t the only thing you’ll need: some of the most important functionality of programming languages is provided by add-ons called packages or libraries, which themselves have versions.

The combination of the language and its version (e.g. Python 3.9), the packages and their versions (e.g. numpy 1.24), and the operating system the code is being run on (e.g. macOS Catalina) is called the computational environment.
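
As a rough illustration, the three ingredients of a computational environment can be read off from within Python itself. The sketch below uses only the standard library; the function name describe_environment and the package names passed to it are assumptions chosen for illustration, and any package that is not installed is simply reported as such.

```python
import platform
from importlib import metadata


def describe_environment(package_names):
    """Collect the language version, package versions, and operating system."""
    packages = {}
    for name in package_names:
        try:
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this Python environment.
            packages[name] = "not installed"
    return {
        "python": platform.python_version(),
        "os": f"{platform.system()} {platform.release()}",
        "packages": packages,
    }


# Illustrative package names; swap in whichever packages you care about.
env = describe_environment(["numpy", "pandas"])
print(env)
```

Two people who get different results from the same code can compare these summaries to see whether their environments differ.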

Preliminaries for doing economics with Python

Useful concepts

To do economics using Python, you will need three things on your computer or on a cloud computer. For now, we’re just going to tell you what they are and what they do—we’ll come to how to get them shortly.

  • An installation (or ‘distribution’) of the programming language, also known as an ‘interpreter’. Python is both a programming language that you can read, and a language that computers can read, interpret, and then carry out instructions based on. For a computer to be able to read and execute Python code, you will need to get a Python interpreter installed on it. There are lots of ways to install a Python ‘interpreter’ on your own computer; we recommend the Anaconda distribution of Python for its flexibility and simplicity.
  • An integrated development environment (IDE). This is where you write the instructions that are then executed by the interpreter. An IDE offers many features, and the most important of these is a way to write the code itself! IDEs are not the only way to programme, but they are perhaps the most useful. There are many IDEs out there. We strongly recommend Microsoft’s Visual Studio Code (or VS Code for short), which works on all major operating systems and is one of the most popular. Here are some of the useful features that Visual Studio Code provides:

    • a way to run your code interactively (line by line) or all at once
    • a way to debug (look for errors) in your code
    • a quick way to access helpful information about commonly used software packages
    • automatic code formatting, so that your code follows best practice guidelines
    • auto-completion of your code when you press TAB
    • automatic code checking for basic errors
    • colouring your brackets in pairs so you can keep track of the logical order of execution of your code!
  • Knowledge of how to install Python packages. A Python package is a collection of functions, data, and documentation that extends the capabilities of an installed version of Python. Using packages is key to most economic analysis because most of the functionality we’ll need comes from extra packages. You’ll see statements like import numpy as np at the start of many Python code scripts; these are instructions to use an installed package (here, one named numpy) and to give it a shortened name (np, for convenience) in the rest of the script. The functions in the numpy package are then accessed through the np. prefix; for example, you can take logs with np.log(x) where x is a variable containing a number. You need only install packages once, but you must import them into each script in which you use them.
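
The import-and-alias pattern described in the last bullet looks like this in practice. To keep the sketch runnable without installing anything, it uses Python’s built-in math module under the assumed alias m; with numpy installed, the equivalent lines would be import numpy as np and np.log(x).

```python
# Import a module once at the top of the script and give it a short alias.
# (With numpy installed, this line would read: import numpy as np)
import math as m

x = 10.0

# Functions are then reached through the alias; here, the natural logarithm.
log_x = m.log(x)
print(log_x)
```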

Typical coding workflow

It would also be useful for you to know what a typical coding workflow might look like. It goes something like this:

  • Open up your integrated development environment (IDE).
  • Write some code in a script (a text file with code in it) in your IDE.
  • If necessary for the analysis that you’re doing, install any extra packages.
  • Use the IDE to send bits of code from the script, or the entire script, to be executed by Python and add-on packages, and to display results.

We’ll see two ways to achieve this workflow:

  1. Installing an IDE, Python, and any extra packages on your own computer—essentially coding on your own computer.
  2. Coding on a computer in the cloud that you access through your internet browser. The cloud computer has an IDE and Python built in, and you can easily install extra packages in it too. However, you should be aware that the cloud service we recommend has a free tier of 60 hours per month; beyond this, you’ll need to pay for extra hours.

How to get started on your own computer

Use these instructions if you’ve decided to code on your own computer.

Installing Python

To download and install Python, we’ll use the Anaconda ‘distribution’ of Python, which is available on all major operating systems. To install it, follow the instructions below or watch this video on how to install Python using the Anaconda distribution of Python.

Download the individual edition of the Anaconda distribution of Python for your operating system and install it. This will provide you with a Python installation and a host of the most useful libraries. If you get stuck, there are more detailed instructions available for installing the Anaconda distribution of Python on Windows, on Mac, and on Linux.

You can confirm that you’ve set up Anaconda correctly by following the verify installation instructions on the Anaconda website.

If you’re using Microsoft’s Windows operating system, you can check if Anaconda has installed properly by opening the ‘Anaconda prompt’ (a special text-based way to issue commands to your computer) and typing where python. You should see a path rendered as text in the prompt that includes ‘Anaconda3’, for example something like C:\Users\<your-username>\Anaconda3\.... On Mac and Linux you may need to run conda init on your command line to activate your Anaconda Python environment. You can check you’ve got the right Python with which python, which should return something like /Users/<your-username>/opt/anaconda3/bin/python.
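
The same check can be sketched from Python using the standard library. shutil.which mimics the where/which lookup; the test for the word ‘anaconda’ in the path is an assumption based on the example paths above and may not match a customised install location.

```python
import shutil
import sys

# Path of the first `python` found on the PATH, like `which python` / `where python`.
python_on_path = shutil.which("python")
print("python on PATH:", python_on_path)

# Path of the interpreter that is running this code.
print("running interpreter:", sys.executable)

# Heuristic only: default Anaconda installs have 'anaconda' in the path.
looks_like_anaconda = "anaconda" in (python_on_path or "").lower()
print("looks like Anaconda:", looks_like_anaconda)
```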

Installing your integrated development environment, Visual Studio Code

Visual Studio Code is a free and open-source IDE from Microsoft that is available on all major operating systems. Just like Python itself, Visual Studio Code can be extended with packages, and it is those packages, called extensions in this case, that make it so useful. As well as Python, Visual Studio Code supports a ton of other languages.

Download and install Visual Studio Code. If you need some help, there is a video below that will walk you through downloading and installing Visual Studio Code, and then using it to run Python code in both scripts and in notebooks. We’ll go through these instructions in detail in the rest of this chapter.

Coding in the cloud

Use these instructions if you wish to code in the cloud rather than on your own computer.

GitHub Codespaces

There are many ways to use cloud computers to do economics, but we’re going to share with you the absolute simplest. For this, you will need to sign up for a GitHub account. GitHub is an organisation that’s owned by Microsoft and which provides a range of services, including a way to back up code on the cloud, and cloud computing. One of the services offered is GitHub Codespaces. A GitHub Codespace is an online cloud computer that you connect to from your browser window. It comes with a generous 60 free hours of computing per month, and it uses Visual Studio Code as an IDE.

Do note that if you go over the free tier hours on GitHub Codespaces, your credit card will be charged for any further hours you use.

Once you’ve signed up for a GitHub account, head to GitHub Codespaces and click on ‘Get Started for Free’. You should see a menu of ‘quick start templates’. Under where it says ‘Jupyter Notebook’, select ‘Use this template’.

You will find that a new page loads with several panels in it. This is an online version of Visual Studio Code that works much as if you had installed it on your own computer. It will already have a version of Python installed; you can check which one by running python --version in the terminal. The terminal is usually found in the lowest panel of Visual Studio Code, and in Codespaces it will typically display a welcome message.
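
If you would rather check from inside Python than from the terminal, a minimal sketch is below. It asks the running interpreter for its version, then runs the same --version command via subprocess for comparison. Nothing here is specific to Codespaces.

```python
import platform
import subprocess
import sys

# Version of the interpreter running this code, as a string such as '3.9.10'.
version = platform.python_version()
print(version)

# Equivalent of typing `python --version` in the terminal, using this interpreter.
result = subprocess.run(
    [sys.executable, "--version"], capture_output=True, text=True, check=True
)
reported = (result.stdout or result.stderr).strip()
print(reported)
```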

Binder

Binder is a free online service that provides code notebooks in the cloud. Notebooks mix text and code together and each Doing Economics project is available as a notebook too. Binder uses a different (but still phenomenal) IDE to the one we recommend, Visual Studio Code: it uses JupyterLab instead.

You can find the link to the notebooks of the Doing Economics projects here, or by clicking on the button below:

Binder

To get started in Binder, click on one of the chapters on the left (these have names like ‘empirical_project_1.ipynb’). This will start one of the chapter notebooks. You will need to install the packages for the chapters too. You can access a terminal to install packages by going to ‘New -> Terminal’, and then typing in the install package command, which is pip install packagename.

Alternative coding in the cloud options

As well as working through these projects on your own computer or on the cloud via GitHub Codespaces, you can run the code online through a few other options. The first is the easiest to get started with.

  1. Google Colab notebooks. Colab is really easy to use, is free, and has Python pre-installed. However, it only covers notebooks (a mix of text and code) rather than code-only scripts.
  2. Gitpod Workspace. An alternative to Codespaces. This is a remote, cloud-based version of Visual Studio Code with Python installed and will run Python scripts.

Running your first Python code

Getting to grips with Visual Studio Code

Once you have Visual Studio Code installed and opened (either on your own computer or in the cloud), navigate to the ‘extensions’ tab on the vertical bar of icons on the left (it’s the one that looks like four squares). You’ll need to install the Python extension for VS Code, which you can search for by using the text box within VS Code’s extensions panel. If you’re using the cloud version, you may find that it’s already installed.

There are some other extensions it’s useful to have and install (if they aren’t already installed):

  • Jupyter
  • Pylance
  • indent-rainbow

Although you won’t have any Python code to play with yet, or an interactive window to execute that Python code, it’s worth us spending a brief moment familiarising ourselves with the different bits of a typical view in Visual Studio Code.

A typical user view in Visual Studio Code

The figure above shows the typical layout of Visual Studio Code once you have a Python session running, and a Python script open. The long vertical panel on the far left-hand side changes what is seen in panels 1 and 2; it currently has the file explorer selected. Let’s run through the numbered parts of the figure.

  1. When the explorer option is selected from the icons to the left of 1 and 2, the contents of the folder that’s currently open are shown in 1.
  2. This is an outline of the key parts of the file that is open in 3.
  3. This is just a fancy text editor. In the figure above, it’s showing a Python script (a file that contains code and has a name that ends in .py). Shortly, we’ll see how selecting code and pressing Shift + Enter (‘Enter’ is labelled as ‘Return’ on some keyboards) will execute code whose results appear in panel 5.
  4. This is the command line or terminal, a place where you can type in commands that your computer will then execute. If you want to try a command, type date (Mac/Linux) or date /t (Windows). This is where we install extra packages.
  5. This is the interactive Python window, which is where code and code outputs appear after you select and execute them from a script (see 3). It shows the code that you executed and any outputs from that execution—in the screenshot shown, the code has created a plot. The name and version of Python you’re using appear at the top of the interactive window.

Note that there is lots of useful information in the status bar at the very bottom of the window, including the version of Python currently being used by VS Code.

Running Python code

Now you will create and run your first code. If you get stuck, there’s a more in-depth tutorial over at the VS Code documentation.

In Visual Studio Code, click on the ‘Explorer’ symbol (an icon showing some files, on the left-hand side of the screen) to bring up a file explorer. Check you’re in a good location on your computer to try things out and, if not, change the folder you’re in using File -> Open Folder until you’re happy.

Now, still with the explorer panel open, click on the symbol that looks like a blank piece of paper with a ‘+’ sign on it. This will create a new file, and your cursor should move to name it. Name it hello_world.py. The file extension, .py, is very important as it implicitly tells Visual Studio Code that this is a Python script.

In the Visual Studio Code editor, add a single line to the file:

print("Hello world!")

Save the file.

If you named this file with the extension .py then VS Code will recognise that it is Python code and you should see the name and version of Python pop up in the bar at the bottom of your VS Code window. (You can have multiple versions of Python installed—if you ever want to change which Python version your code uses, click on the version shown in the bar and select the version you want.)

Alright, shall we actually run some code? Select/highlight the print("Hello world!") text you typed in the file and right-click. You’ll get a lot of options here, but the one you want is “Run Selection/Line in Interactive Window”.

This should cause a new ‘interactive’ panel to appear within Visual Studio Code and you should now see:

print("Hello world!")
Hello world!

The interactive window is a convenient and flexible way to run code that you have open in a script or that you type directly into the interactive window code box. The interactive window will ‘remember’ any variables that have been assigned (for example, code statements like x = 5), whether they came from running some lines in your script or from you typing them in directly. Working with the interactive window will feel familiar to anyone who has used Stata, Matlab, or R. It doesn’t require you to write the whole script, start to finish, ahead of time. Instead, you can change code as you go, (re-)running it line by line.

It would be cumbersome to have to right-click every time we wanted to run some code, so we’re going to make a keyboard shortcut to send whatever code is highlighted to the interactive window to be executed. To do this:

  • Open up the Visual Studio Code configuration menu (the cog on the lower left-hand side).
  • Go to Settings.
  • Type ‘jupyter send’ in the box to make an entry ‘Interactive Window > Text Editor: Execute Selection’ appear.
  • Ensure the box next to this entry is ticked.

Now return to your script, put your cursor on the line with print("Hello world!") on, and press Shift+Enter. You should see ‘Hello world!’ appear again, only this time, it was much easier.

Let’s make more use of the interactive window. At the bottom of it, there is a box that says ‘Type code here and press Shift+Enter to run’. Go ahead and type print('Hello World!') directly in there to achieve the same effect as running the line from your script. Also, any variables you create in the interactive window (from your script or directly by entering them in the box) will persist.

To see how variables persist, type hello_string = 'Hello World!' into the interactive window’s code entry box and press Shift+Enter. If you now type hello_string and press Shift+Enter, you will see the contents of the variable you just created. You can also click the grid symbol at the top of the interactive window (between the stop symbol and the save file symbol); this is the variable explorer and will pop open a panel showing all of the variables you’ve created in this interactive session. You should see one called hello_string of type str with a value Hello World!.

This shows the two ways of working with the interactive window—running (segments) from a script, or writing code directly in the entry box. It doesn’t matter which way you entered variables, they will all be remembered within that session in your interactive window.
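
As a small sketch of this persistence, the lines below can be run one at a time with Shift+Enter, from a script or from the entry box; the later lines rely on the variable created by the first.

```python
# Run this line first: it creates a variable in the interactive session.
hello_string = "Hello World!"

# Run these afterwards: the session still remembers hello_string.
print(hello_string)
print(type(hello_string).__name__)  # the type shown in the variable explorer
```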

Packages and how to install them

Packages (also called libraries) are key to extending the functionality of Python. The default installation of Anaconda comes with many (around 250) of the packages you’ll need, but it won’t be long before you’ll need to install some extra ones. There are packages for geoscience, for building websites, for analysing genetic data, and, of course, for economics. Packages are typically not written by the core maintainers of the Python language but by its users, including enthusiasts, firms, researchers, and academics. Because anyone can write packages, they vary widely in their quality and usefulness. There are some that are key for an economics workflow, though, and you’ll be seeing them again and again.

Python packages don’t come built in (by default) so you need to install them (just once, like installing any other application), and then import them into your scripts (whenever you use them in a script). When you issue an install command for a specific package, it is automatically downloaded from the internet and installed in the appropriate place on your computer.
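
One way to see the difference between installing and importing is to ask Python whether a package is currently available. The sketch below uses importlib.util.find_spec from the standard library; the helper name is_installed and the package name not_a_real_package_123 are made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `import module_name` would find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


for name in ["json", "not_a_real_package_123"]:
    if is_installed(name):
        print(f"{name}: installed, so `import {name}` will work")
    else:
        print(f"{name}: not found, run `pip install {name}` in the terminal first")
```

Bear in mind that the name used to install a package is occasionally different from the name used to import it, so a check like this should use the import name.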

Installing packages

To install extra Python packages, you issue install commands to a text-based window called the ‘terminal’ or ‘command line’. This is a text-based way of issuing instructions to your computer’s operating system. In the figure earlier, the terminal is labelled as panel number 4, and it may well be open when you start Visual Studio Code. If not, to open up the command line within Visual Studio Code, use the ⌃ (Control) + ` keyboard shortcut (Mac) or Ctrl + ` (Windows/Linux), or click ‘View > Terminal’.

To install a package, navigate to the terminal, type in

pip install packagename

and then press return. Some very useful packages are not automatically included with the Anaconda distribution, so you may need to install them this way; the packages used in the Python walk-throughs are listed under ‘Python versions’ above.

Problems when using packages

There are a couple of common pitfalls when using packages (and Python environments) that it’s useful to be aware of, which we’ve included here. For now, you can skip this section, but remember to come back here if you do have a problem either installing a package or using it.

  1. You install them in the wrong place. The usual sign of this is a ‘module not found’ error or similar when you try to import a package (though it’s not the only reason you might get this). This happens because you can simultaneously have many versions of Python installed on your computer. This is useful for some developers, who want to test that the packages they’re creating can work against any version of Python! But it can lead to confusion. Essentially, whatever version of Python your terminal defaults to is the one that packages will also be installed into. You can check the location and version of that Python in the terminal using which python and python --version. For example, on a fresh install of Anaconda on macOS, the first command produces ‘/opt/anaconda3/bin/python’ and the second ‘Python 3.8.8’. If you’re using Anaconda, you can list all of your Anaconda Python environments with conda env list in your terminal (conda list, by contrast, lists the packages installed in the current environment). If you want to change to a different environment to then install packages, it’s conda activate environmentname, where environmentname is the name of the environment you want, taken from the output of conda env list (all Anaconda Python environments are named). If you can’t see the word (base) at the start of your terminal’s line, you may need to run conda activate first, or open the Anaconda prompt on Windows and use that instead of the terminal.
  2. You run the wrong version of Python in a script or notebook, so even though you’re sure you’ve installed the package in the right place, the code can’t find the package. Again, this results in a ‘module not found’ error. What’s going wrong is that you’re using the wrong Python environment to run your code. In this case, you need to tweak the settings in Visual Studio Code to use the correct Python environment. Whether you’re using a script or a notebook to code in, the currently active version of Python is in the top right-hand corner of the screen. For example, the default Anaconda environment will say base (Python x.x) where x.x is your Python version number.
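
When a ‘module not found’ error does appear, it can help to print which Python is actually running the code. The following is a sketch only: some_missing_package is a deliberately fictitious name used to trigger the error, and the suggested command assumes pip is available for that interpreter.

```python
import sys

try:
    import some_missing_package  # hypothetical name, used to trigger the error
    missing = False
except ModuleNotFoundError as err:
    missing = True
    print(f"Could not import: {err.name}")
    # The interpreter running this code; packages must be installed for this one.
    print(f"Python in use: {sys.executable}")
    # Installing via this exact interpreter avoids the 'wrong place' problem.
    print(f'Try: "{sys.executable}" -m pip install {err.name}')
```

If the path printed here differs from the one reported by which python in your terminal, the two are using different Python installations, which is the situation described in both pitfalls above.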

Python version used in examples

The Python walkthroughs in the Doing Economics projects were created with Python and Operating System versions:

Last updated: 2023-08-25T10:56:42.338334+01:00

Python implementation: CPython
Python version       : 3.9.10
IPython version      : 8.14.0

Compiler    : Clang 12.0.0 (clang-1200.0.32.29)
OS          : Darwin
Release     : 21.6.0
Machine     : x86_64
Processor   : i386
CPU cores   : 10
Architecture: 64bit