3.8 — Polynomial Regression

ECON 480 • Econometrics • Fall 2020

Ryan Safner
Assistant Professor of Economics
safner@hood.edu
ryansafner/metricsF20
metricsF20.classes.ryansafner.com

Outline

The Quadratic Model

The Quadratic Model: Maxima and Minima

Are Polynomials Necessary?

Linear Regression

OLS is commonly known as "linear regression" as it fits a straight line to data points
Often, data and relationships between variables may not be linear
Get rid of the outliers (>$60,000)

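Nonlinear Effects in Linear Regression

Despite being "linear regression", OLS can handle this with an easy fix
OLS requires the model to be linear in its parameters (i.e. the $\beta$'s); the regressors (the $X$'s) can be nonlinear:
- fine, e.g.: $Y_i = \beta_0 + \beta_1 X_i^2$, $Y_i = \beta_0 + \beta_1 \sqrt{X_i}$, $Y_i = \beta_0 + \beta_1 \ln(X_i)$
- not fine, e.g.: $Y_i = \beta_0 + \beta_1^2 X_i$, $Y_i = \beta_0 + \sqrt{\beta_1} X_i$

In the end, each $X$ is always just a number in the data; OLS can always estimate parameters for it
Plotting the modelled points can result in a curve!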
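
A minimal sketch of the point above, using simulated data rather than anything from the lecture: the variable names (`x`, `y`, `quad_fit`) and the data-generating numbers are assumptions made up for illustration. The model is nonlinear in `x` but linear in the betas, so `lm()` estimates it by OLS as usual, and the fitted points trace out a curve.

```r
# Simulated data (hypothetical): y rises with x at a decreasing rate
set.seed(42)
x <- seq(0, 10, length.out = 200)
y <- 2 + 3 * x - 0.2 * x^2 + rnorm(length(x), sd = 1)

# Nonlinear in x, but still linear in the betas, so OLS can fit it
quad_fit <- lm(y ~ x + I(x^2))
coef(quad_fit)

# The fitted values form a curve rather than a straight line
plot(x, y, col = "grey60", pch = 16)
lines(x, fitted(quad_fit), col = "blue", lwd = 2)
```

The squared term is wrapped in `I()` so that `^` is treated as arithmetic rather than as a formula operator.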

Sources of Nonlinearities

Effect of $X_1 \rightarrow Y$ might be nonlinear if:

$X_1 \rightarrow Y$ is different for different levels of $X_1$
- e.g. diminishing returns: $\uparrow X_1$ increases $Y$ at a decreasing rate
- e.g. increasing returns: $\uparrow X_1$ increases $Y$ at an increasing rate
$X_1 \rightarrow Y$ is different for different levels of $X_2$
- e.g. interaction effects (last lesson)

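Nonlinearities Alter Marginal Effects

Linear: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$
marginal effect (slope), $\frac{\Delta Y}{\Delta X} = \hat{\beta}_1$, is constant for all $X$

Nonlinearities Alter Marginal Effects

Polynomial: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2$
Marginal effect, “slope” ($\neq \hat{\beta}_1$), depends on the value of $X$!

Sources of Nonlinearities III

Interaction Effect: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_1 \times X_2$
Marginal effect, “slope” of $X_1$, depends on the value of $X_2$!
Easy example: if $X_2$ is a dummy variable:
- $X_2 = 0$ (control) vs. $X_2 = 1$ (treatment)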

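Polynomial Functions of $X$ I

Linear: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

Quadratic: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2$

Cubic: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2 + \hat{\beta}_3 X^3$

Quartic: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2 + \hat{\beta}_3 X^3 + \hat{\beta}_4 X^4$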

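Polynomial Functions of $X$ I

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2 + \cdots + \hat{\beta}_r X_i^r$$

Where $r$ is the highest power $X_i$ is raised to
- quadratic: $r = 2$
- cubic: $r = 3$
The graph of an $r$th-degree polynomial function has up to $(r-1)$ bends
Just another multivariate OLS regression model!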
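
To illustrate that a polynomial of degree $r$ is just another multivariate regression, here is a rough sketch on simulated data (the data and object names are invented for illustration, not taken from the lecture). It fits a cubic two ways: writing each power out as its own regressor, and using base R's `poly()` with `raw = TRUE`.

```r
# Simulated data (hypothetical) from a cubic relationship
set.seed(1)
x <- runif(300, min = -3, max = 3)
y <- 1 + 2 * x - 0.5 * x^2 + 0.3 * x^3 + rnorm(300, sd = 0.5)

# Degree r = 3 written out term by term: each power of x is one more regressor
cubic_manual <- lm(y ~ x + I(x^2) + I(x^3))

# Same model via poly(); raw = TRUE keeps the plain powers x, x^2, x^3
# (the default, raw = FALSE, uses orthogonal polynomials, so the
#  coefficients would differ even though the fitted values would not)
cubic_poly <- lm(y ~ poly(x, 3, raw = TRUE))

# Both fits give the same four estimates: intercept plus one beta per power
cbind(manual = unname(coef(cubic_manual)), poly = unname(coef(cubic_poly)))
```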

The Quadratic Model

Quadratic Model

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$$

Quadratic model has $X$ and $X^2$ variables in it (yes, need both!)
How to interpret coefficients (betas)?
- $\hat{\beta}_0$ as “intercept” and $\hat{\beta}_1$ as “slope” makes no sense 🧐
- $\hat{\beta}_1$ as effect of $X_i \rightarrow Y_i$ holding $X_i^2$ constant??†

† Note: this is not a perfect multicollinearity problem! Correlation only measures linear relationships!

Estimate marginal effects by calculating predicted $\hat{Y}_i$ for different levels of $X_i$
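
The footnote on multicollinearity can be checked numerically. The small sketch below uses made-up vectors (`x_centered`, `x_positive`, and `z` are hypothetical names) to show that $X$ and $X^2$ are never perfectly correlated, while an exact linear function of $X$ is.

```r
# X centred on zero: X and X^2 are (numerically) uncorrelated
x_centered <- -10:10
cor(x_centered, x_centered^2)

# X all positive: highly, but not perfectly, correlated with X^2 (about 0.97)
x_positive <- 1:10
cor(x_positive, x_positive^2)

# Perfect multicollinearity needs an exact linear relationship, e.g. Z = 3 + 2X
z <- 3 + 2 * x_positive
cor(x_positive, z)  # essentially 1
```

A high correlation between $X$ and $X^2$ can still inflate standard errors, but OLS can estimate both coefficients because neither regressor is an exact linear function of the other.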

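Quadratic Model: Calculating Marginal Effects

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$$

What is the marginal effect of $\Delta X_i \rightarrow \Delta Y_i$?
Take the derivative of $Y_i$ with respect to $X_i$:
$$\frac{\partial Y_i}{\partial X_i} = \hat{\beta}_1 + 2\hat{\beta}_2 X_i$$
Marginal effect of a 1 unit change in $X_i$ is a $(\hat{\beta}_1 + 2\hat{\beta}_2 X_i)$ unit change in $Y_i$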
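
As a side note, the derivative is the effect of a very small change in $X$. A worked comparison with the exact effect of a full one-unit change, from $X$ to $X + 1$, shows how close the two are (this derivation is an added illustration, not part of the original slides):

```latex
\begin{align*}
\Delta Y &= \left[\hat{\beta}_0 + \hat{\beta}_1 (X+1) + \hat{\beta}_2 (X+1)^2\right]
          - \left[\hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2\right] \\
         &= \hat{\beta}_1 + \hat{\beta}_2 (2X + 1) \\
         &= \underbrace{\hat{\beta}_1 + 2\hat{\beta}_2 X}_{\text{derivative}} + \hat{\beta}_2
\end{align*}
```

The two differ only by $\hat{\beta}_2$, so the derivative is a good approximation whenever $\hat{\beta}_2$ is small relative to the marginal effect itself.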

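Quadratic Model: Example I

Example:
$$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP per capita}_i + \hat{\beta}_2 \text{GDP per capita}_i^2$$

Use gapminder package and data

library(gapminder)
library(dplyr) # provides %>% and mutate(), used in the following steps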

Quadratic Model: Example II

With gdpPercap measured in single dollars, the coefficients would be very small (the effect of one extra dollar), so let's transform gdpPercap to be in $1,000's

gapminder <- gapminder %>%
  mutate(GDP_t = gdpPercap/1000)
gapminder %>% head() # look at it

country <fctr>	continent <fctr>	year <int>	lifeExp <dbl>	pop <int>	gdpPercap <dbl>	GDP_t <dbl>
Afghanistan	Asia	1952	28.801	8425333	779.4453	0.7794453
Afghanistan	Asia	1957	30.332	9240934	820.8530	0.8208530
Afghanistan	Asia	1962	31.997	10267083	853.1007	0.8531007
Afghanistan	Asia	1967	34.020	11537966	836.1971	0.8361971
Afghanistan	Asia	1972	36.088	13079460	739.9811	0.7399811
Afghanistan	Asia	1977	38.438	14880372	786.1134	0.7861134

Quadratic Model: Example III

Let’s also create a squared term, GDP_sq

gapminder <- gapminder %>%
  mutate(GDP_sq = GDP_t^2)
gapminder %>% head() # look at it

6 rows, showing columns 1-7 of 8 (the new GDP_sq column is not displayed):

country <fctr>	continent <fctr>	year <int>	lifeExp <dbl>	pop <int>	gdpPercap <dbl>	GDP_t <dbl>
Afghanistan	Asia	1952	28.801	8425333	779.4453	0.7794453
Afghanistan	Asia	1957	30.332	9240934	820.8530	0.8208530
Afghanistan	Asia	1962	31.997	10267083	853.1007	0.8531007
Afghanistan	Asia	1967	34.020	11537966	836.1971	0.8361971
Afghanistan	Asia	1972	36.088	13079460	739.9811	0.7399811
Afghanistan	Asia	1977	38.438	14880372	786.1134	0.7861134

Quadratic Model: Example IV

Can “manually” run a multivariate regression with GDP_t and GDP_sq

library(broom)
reg1 <- lm(lifeExp ~ GDP_t + GDP_sq, data = gapminder)
reg1 %>% tidy()

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	50.52400578	0.2978134673	169.64984	0.000000e+00
GDP_t	1.55099112	0.0373734945	41.49976	1.292863e-260
GDP_sq	-0.01501927	0.0005794139	-25.92149	3.935809e-125

Quadratic Model: Example V

OR use GDP_t and wrap the squared term in I() inside the regression formula, I(GDP_t^2), so that ^ is treated as arithmetic

reg1_alt <- lm(lifeExp ~ GDP_t + I(GDP_t^2), data = gapminder)
reg1_alt %>% tidy()

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	50.52400578	0.2978134673	169.64984	0.000000e+00
GDP_t	1.55099112	0.0373734945	41.49976	1.292863e-260
I(GDP_t^2)	-0.01501927	0.0005794139	-25.92149	3.935809e-125

Quadratic Model: Example VI

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	50.52400578	0.2978134673	169.64984	0.000000e+00
GDP_t	1.55099112	0.0373734945	41.49976	1.292863e-260
GDP_sq	-0.01501927	0.0005794139	-25.92149	3.935809e-125

Positive effect ($\hat{\beta}_1 > 0$), with diminishing returns ($\hat{\beta}_2 < 0$)
Effect on Life Expectancy of increasing GDP depends on initial value of GDP!

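Quadratic Model: Example VII

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	50.52400578	0.2978134673	169.64984	0.000000e+00
GDP_t	1.55099112	0.0373734945	41.49976	1.292863e-260
GDP_sq	-0.01501927	0.0005794139	-25.92149	3.935809e-125

Marginal effect of GDP per capita (in $1,000's) on Life Expectancy:
$$\frac{\partial \, \widehat{\text{Life Expectancy}}}{\partial \, \text{GDP}} = \hat{\beta}_1 + 2\hat{\beta}_2 \text{GDP} \approx 1.55 + 2(-0.015) \, \text{GDP} = 1.55 - 0.03 \, \text{GDP}$$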
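
The marginal effect can be evaluated directly from the estimates in the table above. This is a small sketch: `marginal_effect()` is a hypothetical helper written for illustration, and the three GDP levels are arbitrary example values, not figures from the slides.

```r
# Estimates copied from the regression table above
# (GDP_t is GDP per capita in $1,000s)
beta_1 <- 1.55099112   # coefficient on GDP_t
beta_2 <- -0.01501927  # coefficient on GDP_sq

# Marginal effect of GDP_t on life expectancy: beta_1 + 2 * beta_2 * GDP_t
marginal_effect <- function(gdp_t) beta_1 + 2 * beta_2 * gdp_t

# Illustrative GDP levels (in $1,000s), chosen here only as examples
gdp_levels <- c(5, 25, 50)
data.frame(GDP_t = gdp_levels, marginal_effect = marginal_effect(gdp_levels))
```

Under these estimates, an extra $1,000 of GDP per capita is associated with roughly 1.40 more years of life expectancy at $5,000, about 0.80 years at $25,000, and about 0.05 years at $50,000, which is the diminishing-returns pattern.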

Quadratic Model: Example VIII