Finding and choosing data sets

Introduction

Data is an essential ingredient of most research projects. In this module, we will explore data sources and repositories that you can use for your research, as well as tools available through the McGill Library that make these resources easier to access. Beyond finding relevant data, we will examine key properties of the data we collect and use, including some legal considerations.

Learning outcomes

This module will help you do the following:

Readings

Tools

Odesi
Canadian Census Analyser
OpenML

Assessment: data sources

Submission deadline: by Session 4.

Submitted material: a single PDF file covering the data sources you identified (basic stage) and the dataset you analyzed (in-depth stage).

Basic stage (up to 5 pts)

Identify three potentially relevant datasets or data repositories for your project. Prepare a summary of the resources found, highlighting the connection with your research.

Rubric:

In-depth stage (up to 5 pts)

Download one of the datasets you discussed in the basic stage and analyze it in detail (2 pts). Perform an exploratory data analysis of the gathered data and prepare a brief report with your initial results (2 pts). Based on your initial analysis, determine whether the selected dataset is viable for your research project (1 pt).
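
As a starting point for the exploratory analysis, the following is a minimal sketch of a first pass over a tabular dataset: it counts missing and distinct values per column and computes basic statistics for numeric columns. It uses only the Python standard library. The sample data, the column names, and the helper summarize_columns are hypothetical and stand in for whatever dataset you download; this is one possible approach, not a required method.

```python
import csv
import io
import statistics


def summarize_columns(rows):
    """Return a per-column summary for a list of dict rows (as produced by csv.DictReader)."""
    summary = {}
    if not rows:
        return summary
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        # Treat empty strings and absent fields as missing values.
        present = [v for v in values if v is not None and v.strip() != ""]
        info = {"missing": len(values) - len(present), "distinct": len(set(present))}
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # at least one value is not numeric: treat the column as categorical
        if numbers:
            info["min"] = min(numbers)
            info["max"] = max(numbers)
            info["mean"] = statistics.mean(numbers)
            info["median"] = statistics.median(numbers)
        summary[column] = info
    return summary


# Hypothetical sample standing in for a downloaded CSV file.
SAMPLE_CSV = """region,year,population,median_income
A,2016,1200,52000
B,2016,800,
C,2021,1500,61000
A,2021,1300,55000
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize_columns(rows)
for column, info in summary.items():
    print(column, info)
```

To run this on a real file, replace the io.StringIO(SAMPLE_CSV) argument with an opened file object. A summary like this can help with the viability question: columns with many missing values, or with too few distinct values, may not support the analysis your project needs.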