02-provenance.md

Keeping on top of provenance

  • Licenses
  • Streamlining for reproducibility

Licenses

Where does the file come from?

  • How can we describe this later to somebody?
    • Point-and-click steps take a long time to describe
    • What rights do we have?

What is a license?

A license (licence) is an official permission or permit to do, use, or own something (as well as the document of that permission or permit). [1] [2]

Examples

License applying to Geodist data

Can we re-publish the file?

Downloading via code

Easiest:

Stata

global URL "https://www.cepii.fr/distance/dist_cepii.dta"
use "$URL", clear

Why not?

  • Will the file still be there in two months? In six years?
  • What if the internet connection is down?

Easy:

Stata

global URL "https://www.cepii.fr/distance/dist_cepii.dta"
copy "$URL" (outputfile), replace

R

download.file(url="https://www.cepii.fr/distance/dist_cepii.dta", destfile="(outputfile)", mode="wb")
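
One way to address the "will it be there?" and "what if the connection is down?" concerns is to download only when no local copy exists yet. The sketch below is a minimal illustration under that assumption; `fetch_once`, its arguments, and the `data` folder in the usage lines are hypothetical names, not part of the original material.

```r
# Download a file only when no local copy exists yet.
# `fetch_once` and its arguments are hypothetical names for this sketch.
fetch_once <- function(url, destfile, downloader = utils::download.file) {
  if (file.exists(destfile)) {
    message("Using existing local copy: ", destfile)
    return(invisible(FALSE))  # nothing was downloaded
  }
  # Create the target folder if it is missing
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" keeps binary files such as .dta intact on Windows
  downloader(url = url, destfile = destfile, mode = "wb")
  invisible(TRUE)  # a fresh download happened
}

# Usage (needs an internet connection on the first run only):
# fetch_once("https://www.cepii.fr/distance/dist_cepii.dta",
#            file.path("data", "dist_cepii.dta"))
```

After the first successful run, later runs read the stored copy, so the analysis no longer depends on the remote server being reachable.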

We will get to even better methods a bit later

Creating a README

  • Template README
    • Cite both dataset and working paper
    • Add data URL and time accessed (can you think of a way to automate this?)
    • Add a link to license (also: download and store the license)
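
As one possible answer to the automation question above, the download script itself can write the URL and access time into the README. This is a minimal sketch; `provenance_entry` is a hypothetical helper name, and the license URL in the usage lines is a placeholder to be replaced with the data provider's actual license page.

```r
# Build README lines that record where a file came from and when.
# `provenance_entry` is a hypothetical helper, not part of any package.
provenance_entry <- function(url, license_url, accessed = Sys.time()) {
  # Record the access time in UTC so it is unambiguous across machines
  stamp <- format(accessed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
  c(
    paste0("- Data URL: ", url),
    paste0("- Accessed: ", stamp, " UTC"),
    paste0("- License: ", license_url)
  )
}

# Usage: append the entry to the README right after downloading.
# "(license URL)" is a placeholder, not a real address.
# entry <- provenance_entry("https://www.cepii.fr/distance/dist_cepii.dta",
#                           "(license URL)")
# cat(entry, file = "README.md", sep = "\n", append = TRUE)
```

The same `download.file()` call used for the data could also save a copy of the license page next to it, so the terms that applied at download time are preserved.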

Link

Step 1: Stata, R [3]

Footnotes

  1. Cambridge Dictionary

  2. Wikipedia

  3. 🔒Tag: stage1