Summary and Schedule
In this lesson, you’ll explore the different types of graphs and their use cases. You’ll then dive into the concept of statistical inference. Next, you’ll get hands-on with Python coding to analyze the happiness and income dataset provided below. Finally, you’ll use the graphs you’ve created to make informed estimates about countries not included in the dataset.
Setup Instructions | Download files required for the lesson |
Duration: 00h 00m | 1. Introduction | How can the humanities benefit from data visualization? What are some of the most useful graphs for humanities research? What is inferential statistics? How can Python be used for data visualization to serve statistical inference and data storytelling? |
Duration: 00h 10m | 2. Graph Categories | Why is data visualization important in humanities research? What are some effective graph types for use in humanities research? |
Duration: 00h 25m | 3. Statistical Inference | What does statistical inference mean? What other mathematical concepts are needed to understand statistical inference better? |
Duration: 00h 35m | 4. Data Visualization with Python for Statistical Inference and Storytelling | How can you create scatter plots, bubble charts, and correlograms with Python? How can these graphs be used in data storytelling? How can you infer statistical information from a dataset using these visualizations? How can these visualizations contribute to humanities research? |
Duration: 01h 35m | 5. Conclusion | How can I apply what I have learned in this lesson? In which areas can I further expand my knowledge based on what I have learned? |
Duration: 01h 45m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
What background knowledge do you need for this lesson?
- Basic acquaintance with Python: you should know how to import Python packages and load data into your code. You also need basic familiarity with Python syntax.
- Basic mathematical background: you need a basic understanding of statistics and probability.
- Curiosity to learn more about Python programming, statistics, and data storytelling.
Dataset
The dataset we are working with in this lesson originates from Kaggle. If you wish to save the dataset on your computer, download the Income and Happiness Correlation dataset and save it to your working directory. Otherwise, you can load it directly into your code later using the following link:
https://raw.githubusercontent.com/HERMES-DKZ/stat_inf_data_vis/main/episodes/data/income_happiness_correlation.csv
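Loading the file directly from the link can be sketched as follows. This is a minimal example that uses only the Python standard library; the lesson itself may load the data differently (for example with pandas), and the function names `parse_csv_text` and `load_rows` are illustrative, not part of the lesson.

```python
import csv
import io
import urllib.request

# URL given in the lesson text.
DATA_URL = (
    "https://raw.githubusercontent.com/HERMES-DKZ/stat_inf_data_vis/"
    "main/episodes/data/income_happiness_correlation.csv"
)


def parse_csv_text(text):
    """Turn CSV text into a list of dicts, one per row, keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_rows(url=DATA_URL):
    """Download the CSV file at `url` and parse it (requires internet access)."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return parse_csv_text(text)
```

Calling `rows = load_rows()` downloads the file, after which `rows[0].keys()` lists the column names. All values are read as strings, so numeric columns need converting before plotting. If pandas is installed, `pandas.read_csv` also accepts a URL and handles that conversion for you.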
Software Setup
Python and Jupyter Notebook/Google Colab
To do the exercises in this lesson, you need an IDE (Integrated Development Environment). We recommend Jupyter Notebook or a cloud-based equivalent such as Google Colab.
If you’re using Google Colab, you don’t need to install anything. Create a Google account if you don’t have one already, open Google Colab, create a new notebook by clicking New Notebook, and start coding.
Otherwise, to install Jupyter Notebook and Python on your computer together, we recommend using Anaconda. To do so, follow the instructions for your operating system below.
Windows
- Open https://www.anaconda.com/download/success with your web browser.
- Download the Anaconda for Windows installer with Python 3. (If you are not sure which version to choose, you probably want the 64-bit Graphical Installer Anaconda3-…-Windows-x86_64.exe.)
- Install Python 3 by running the Anaconda Installer. Use all of the installation defaults, but make sure to check Add Anaconda to my PATH environment variable.
This video tutorial can help you with the installation.
macOS
- Open https://www.anaconda.com/download/success with your web browser.
- Download the Anaconda Installer with Python 3 for macOS (you can either use the Graphical or the Command Line Installer).
- Install Python 3 by running the Anaconda Installer using all of the defaults for installation.
This video tutorial can help you with the installation.
Linux
- Open https://www.anaconda.com/download/success with your web browser.
- Download the Anaconda Installer with Python 3 for Linux. (The installation requires using the shell. If you aren’t comfortable doing the installation yourself, stop here and request help at the workshop.)
- Open a terminal window and navigate to the directory where the executable is downloaded (e.g., cd ~/Downloads).
- Type bash Anaconda3- and then press Tab to autocomplete the full file name. The name of the file you just downloaded should appear.
- Press Enter (or Return, depending on your keyboard). You will follow the text-only prompts. To move through the text, press Spacebar. Type yes and press Enter (or Return) to approve the license. Press Enter (or Return) to approve the default location for the files. Type yes and press Enter (or Return) to prepend Anaconda to your PATH (this makes the Anaconda distribution the default Python).
- Close the terminal window.
After installing Python and Anaconda, make sure to install the Python libraries that we are using in this lesson. If you haven’t installed Python libraries on your computer before, ask the instructors for support.
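One way to check whether the libraries are already installed is to ask Python whether it can find them. The sketch below uses only the standard library. The package names in the example list are placeholders, since the text above does not name the libraries; replace them with the ones your instructors give you. The helper name `missing_packages` is also illustrative.

```python
import importlib.util


def missing_packages(names):
    """Return the names that Python cannot find as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Placeholder list (an assumption, not taken from the lesson).
packages_to_check = ["pandas", "matplotlib"]
print("Missing:", missing_packages(packages_to_check))
```

An empty list means every package in `packages_to_check` can be imported. Any name that is printed still needs to be installed before you start the exercises.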