---
title: "Problem Set 1"
author: "YOUR NAME HERE" # put your name here!
date: "ECON 480 — Fall 2020"
output: html_document # change to pdf_document if you'd like; html_document makes a webpage (you can email it and open it in any browser)
---
*Due by 11:59 PM Sunday, September 6, 2020*
# The Popularity of Baby Names
Install and load the package `babynames`. Run `?babynames` to see what the data includes.
```{r}
# write your code here!
```
## 1.
### a.
**What are the top 5 boys' names for 2017, and what *percent* of overall names is each?**
```{r 1-a}
# write your code here!
```
### b.
**What are the top 5 *girls'* names for 2017, and what *percent* of overall names is each?**
```{r 1-b}
# write your code here!
```
## 2.
**Make two barplots of these top 5 names, one for each sex. Map `aes`thetics `x` to `name` and `y` to `prop`^[Or `percent`, if you made that variable, as I did.] and use `geom_col()`. (You need `geom_col()` because you are declaring a specific `y`; otherwise you could use `geom_bar()` with just an `x`.)**
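The difference between `geom_col()` and `geom_bar()` can be sketched on a small made-up data frame. The object `toy` and its values are hypothetical and only stand in for your own table of top names.

```r
library(ggplot2)

# Hypothetical toy data: one row per name, with a precomputed share
toy <- data.frame(
  name = c("Ann", "Bob", "Cy"),
  prop = c(0.5, 0.3, 0.2)
)

# geom_col(): bar heights come from the y variable you supply
p_col <- ggplot(toy, aes(x = name, y = prop)) +
  geom_col()

# geom_bar(): only x is supplied; bar heights are the number of rows per x
p_bar <- ggplot(toy, aes(x = name)) +
  geom_bar()
```

Here `p_bar` would draw three bars of height 1, because each name appears in exactly one row.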
```{r 2}
# write your code here!
```
## 3.
**Find your name.^[If your name isn't in there 😟, pick a random name.] `count` by `sex` how many babies since 1880 have had your name.^[Hint: if you just `count` by `sex`, you'll get the number of *rows* (years) in the data. You want to add up the number of babies in each row (`n`), so inside `count`, add `wt=n` to weight the count by `n`.] Also add a variable for the percent of each sex.**
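To see what the `wt` argument changes, here is a minimal sketch on made-up data shaped like `babynames` (one row per year and sex). The tibble `toy` and its numbers are hypothetical, and the sketch assumes a recent version of `dplyr` that names the count column `n`.

```r
library(dplyr)

# Hypothetical toy data: one row per year/sex, n = babies in that row
toy <- tibble(
  year = c(2000, 2001, 2000, 2001),
  sex  = c("F", "F", "M", "M"),
  n    = c(10, 20, 3, 7)
)

# Unweighted: counts rows (years) per sex
rows_by_sex <- toy %>%
  count(sex)

# Weighted: sums n within each sex, then adds each sex's percent of the total
babies_by_sex <- toy %>%
  count(sex, wt = n) %>%
  mutate(percent = n / sum(n) * 100)
```

The unweighted version reports 2 rows for each sex, while the weighted version reports 30 and 10 babies.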
```{r 3}
# write your code here!
```
## 4.
**Make a line graph of the number of babies with your name over time, `color`ed by `sex`.**
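As a rough illustration of mapping `color` to a grouping variable, the sketch below draws one line per sex from a made-up data frame (`toy` is hypothetical, not the real data).

```r
library(ggplot2)

# Hypothetical toy data: yearly counts for each sex
toy <- data.frame(
  year = c(2000, 2001, 2002, 2000, 2001, 2002),
  sex  = c("F", "F", "F", "M", "M", "M"),
  n    = c(10, 20, 15, 3, 7, 5)
)

# Mapping color to sex splits the data into one line per sex
p_line <- ggplot(toy, aes(x = year, y = n, color = sex)) +
  geom_line()
```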
```{r 4}
# write your code here!
```
## 5.
### a.
**Make a table of the most common name for boys by year from 1980 to 2017.^[Hint: once you've got all the right conditions, you'll get a table with a lot of data. You only want to `slice` the `1`st row for each year.]**
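One way to keep only the first row per group is to combine `group_by()`, `arrange()`, and `slice()`. The sketch below uses a made-up tibble (`toy` is hypothetical) and is only one possible approach.

```r
library(dplyr)

# Hypothetical toy data: two names observed in two years
toy <- tibble(
  year = c(2000, 2000, 2001, 2001),
  name = c("Ann", "Bea", "Ann", "Bea"),
  n    = c(10, 20, 30, 5)
)

top_by_year <- toy %>%
  group_by(year) %>%
  arrange(desc(n)) %>%  # sort so the largest n comes first
  slice(1) %>%          # keep the first row within each year
  ungroup()
```

Because the data are grouped by `year`, `slice(1)` returns one row per year rather than one row overall.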
```{r 5-a}
# write your code here!
```
### b.
**Now do the same for girls.**
```{r 5-b}
# write your code here!
```
## 6. Now let's graph the evolution of the most common names since 1880.
### a.
**First, find the top 5 *overall* most popular names for boys and for girls. You may want to create two vectors, each with these top 5 names.**
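Storing names in a vector makes it easy to filter the full data later with `%in%`. A minimal sketch on made-up data follows, keeping the top 2 names instead of 5. The objects `toy`, `totals`, `top_names`, and `toy_top` are all hypothetical.

```r
library(dplyr)

# Hypothetical toy data: three names observed in two years
toy <- tibble(
  year = c(2000, 2000, 2000, 2001, 2001, 2001),
  name = c("Ann", "Bea", "Cat", "Ann", "Bea", "Cat"),
  n    = c(10, 20, 1, 30, 5, 2)
)

# Total babies per name across all years, sorted from most to least common
totals <- toy %>%
  count(name, wt = n, sort = TRUE)

# Pull the top 2 names out of the table as a character vector
top_names <- totals %>%
  slice(1:2) %>%
  pull(name)

# Keep only the rows whose name appears in that vector
toy_top <- toy %>%
  filter(name %in% top_names)
```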
```{r 6-a}
# write your code here!
```
### b.
**Now make two line graphs of these 5 names over time, one for boys and one for girls.**
```{r 6-b}
# write your code here!
```
## 7.
**_Bonus (hard!)_: What are the 10 most common "gender-neutral" names?^[This is hard to define. For our purposes, let's define these as names where between 48% and 52% of the babies with the name are male.]**
```{r 7}
# write your code here!
```
---
# Political and Economic Freedom Around the World
**For the remaining questions, we'll look at the relationship between Economic Freedom and Political Freedom in countries around the world today. Our data for economic freedom comes from the [Fraser Institute](https://www.fraserinstitute.org/economic-freedom/dataset?geozone=world&year=2016&page=dataset), and our data for political freedom comes from [Freedom House](https://freedomhouse.org/content/freedom-world-data-and-resources).**
## 8.
**Download these two datasets that I've cleaned up a bit:^[If you want, try downloading them from the websites yourself!]**
- [`econfreedom.csv`](http://metricsf20.classes.ryansafner.com/data/econfreedom.csv)
- [`freedomhouse2018.csv`](http://metricsf20.classes.ryansafner.com/data/freedomhouse2018.csv)
**Load them with `df<-read_csv("name_of_the_file.csv")`, where `df` stands for the name you choose: save one as `econfreedom` and the other as `polfreedom`. Look at each `tibble` you've created.**
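A self-contained sketch of what `read_csv()` does: the code below writes a tiny made-up CSV to a temporary file and reads it back as a tibble. The file contents and the name `toy` are hypothetical. With the real files, you would pass the path to wherever you saved the download.

```r
library(readr)

# Write a tiny hypothetical CSV to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("Country,score", "A,1.5", "B,2.5"), path)

# read_csv() returns a tibble; col_types = cols() lets readr guess the
# column types without printing the column specification message
toy <- read_csv(path, col_types = cols())
toy
```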
```{r 8}
# write your code here!
```
## 9.
**The `polfreedom` dataset is still a bit messy. Overwrite it (or assign the result to something like `polfreedom2`): select `Country.Territory` and `Total` (total freedom score), and rename `Country.Territory` to `Country`.**
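The general pattern for `select()` and `rename()` can be sketched on a made-up tibble. The column names `Region.Name`, `Score`, and `Other` are hypothetical and are deliberately not the ones in the real data.

```r
library(dplyr)

# Hypothetical toy data with one column we do not need
toy <- tibble(
  Region.Name = c("A", "B"),
  Score       = c(90, 40),
  Other       = c(1, 2)
)

toy2 <- toy %>%
  select(Region.Name, Score) %>%   # keep only these two columns
  rename(Region = Region.Name)     # rename(new_name = old_name)
```

Note the order inside `rename()`: the new name goes on the left of the `=`.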
```{r 9}
# write your code here!
```
## 10.
**Now we can merge these two datasets into one. Since they both have `Country` as a variable, we can merge these tibbles using `left_join(econfreedom, polfreedom, by="Country")`^[Note, if you saved it as something else in question 9, use that instead of `polfreedom`!] and save the result as a new tibble (something like `freedom`).**
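To see how `left_join()` treats rows that do not match, here is a minimal sketch with two made-up tibbles (`left`, `right`, and their values are hypothetical).

```r
library(dplyr)

# Hypothetical toy data: country C is only in `left`, country D only in `right`
left  <- tibble(Country = c("A", "B", "C"), ef = c(7.1, 6.2, 5.3))
right <- tibble(Country = c("A", "B", "D"), Total = c(90, 40, 10))

# Keep every row of `left`; attach matching columns from `right`
joined <- left_join(left, right, by = "Country")
```

Every row of the first tibble is kept, so country C appears with `NA` for `Total`, while country D (only in the second tibble) is dropped.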
```{r 10}
# write your code here!
```
## 11.
**Now make a scatterplot of Political Freedom (`Total`)^[Feel free to `rename` these!] as `y` on Economic Freedom (`ef`) as `x` and `color` by `continent`.**
```{r 11}
# write your code here!
```
## 12.
**Let's do this again, but highlight some key countries. Pick three countries, and make a new tibble from `freedom` that contains only the observations for those countries. Additionally, *install* and *load* a package called `ggrepel`.^[This automatically adjusts labels so they don't cover points on a plot!] Next, redo your plot from question 11, but now add a layer: `geom_label_repel`. Set its `data` to your three-country tibble and use the same `aes`thetics as your overall plot, but be sure to add `label = ISO` to label each point with its ISO country code.^[You might also want to set a low `alpha` level to make sure the labels don't obscure other points!]**
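A layer can use a different `data` argument from the rest of the plot. The sketch below, on made-up data, labels only two of four points. It assumes `ggrepel` is installed, and `toy`, `highlight`, and the ISO-style codes are hypothetical.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical toy data: four points with a label column
toy <- data.frame(
  ISO = c("AAA", "BBB", "CCC", "DDD"),
  x   = c(1, 2, 3, 4),
  y   = c(2, 4, 3, 5)
)

# The subset of rows we want to label
highlight <- subset(toy, ISO %in% c("AAA", "DDD"))

p_lab <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +                 # all four points
  geom_label_repel(
    data = highlight,            # this layer sees only the subset
    aes(label = ISO),            # x and y are inherited from ggplot()
    alpha = 0.7                  # slightly transparent labels
  )
```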
```{r 12}
# write your code here!
```
## 13.
**Make another plot similar to the one in question 12, except this time use GDP per Capita (`gdp`) as `y`. Feel free to try adding a regression line with `geom_smooth()`!^[If you do, be sure to set its data to the full `freedom`, not just your three countries!] If you are in my Development course, you just made my graphs from [Lesson 2](https://devf19.classes.ryansafner.com/slides/02-slides#23)!**
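A straight regression line can be added with `geom_smooth(method = "lm")`. The sketch below uses made-up data (`toy` is hypothetical) and passes `data = toy` explicitly to show where the full dataset would go.

```r
library(ggplot2)

# Hypothetical toy data
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 3, 5, 6)
)

p_fit <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(
    data    = toy,     # fit the line on the full data
    method  = "lm",    # linear regression instead of the default smoother
    formula = y ~ x,
    se      = FALSE    # hide the confidence band
  )
```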
```{r 13}
# write your code here!
```