Unifying theory of GLMs

Prof. Maria Tackett

Feb 05, 2024

Announcements

  • HW 02 due Wed, Feb 07 at 11:59pm

  • Project 01

    • presentations in class Wed, Feb 14

    • write up due Thu, Feb 15 at noon

Topics

  • Identify the components common to all generalized linear models

  • Find the canonical link based on the distribution of the response variable

  • Properties of GLMs

Notes based on Chapter 5 Roback and Legler (2021) unless noted otherwise.

Unifying theory of GLMs

Many models; one family

We have studied models for a variety of response variables

  • Least squares (Normal)
  • Logistic (Bernoulli, Binomial, Multinomial)
  • Log-linear (Poisson, Negative Binomial)

These models are all examples of generalized linear models.

GLMs have a similar structure for their likelihoods, MLEs, and variances, so we can use a generalized approach to find the model estimates and associated uncertainty.

Components of a GLM

Nelder and Wedderburn (1972) define a broad class of models called generalized linear models that generalizes multiple linear regression. GLMs are characterized by three components:


1️⃣ Response variable with parameter θ whose probability function can be written in exponential family form (random component)


2️⃣ A linear combination of predictors, $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$ (systematic component)


3️⃣ A link function g(θ) that connects θ to η

One-parameter exponential family form

Suppose a probability (mass or density) function has a parameter θ. It is said to have a one-parameter exponential family form if


✅ The support (set of possible values) does not depend on θ, and

✅ The probability function can be written in the following form

$$f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}$$

Mean and variance

One-parameter exponential family form

$$f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}$$

Using this form:

$$E(Y) = -\frac{c'(\theta)}{b'(\theta)} \qquad \text{Var}(Y) = \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{[b'(\theta)]^3}$$
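The mean and variance formulas can be checked numerically. The sketch below is illustrative only: the helper names (`derivative`, `second_derivative`, `exp_family_mean_var`) are hypothetical, the derivatives are approximated by central finite differences, and the example plugs in the Poisson components b(λ) = log(λ) and c(λ) = −λ that these slides derive for the Poisson distribution.

```python
import math

def derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def exp_family_mean_var(b, c, theta):
    """Apply E(Y) = -c'/b' and Var(Y) = (b''c' - c''b') / (b')^3 numerically."""
    b1, c1 = derivative(b, theta), derivative(c, theta)
    b2, c2 = second_derivative(b, theta), second_derivative(c, theta)
    mean = -c1 / b1
    var = (b2 * c1 - c2 * b1) / b1 ** 3
    return mean, var

# Poisson components: b(lambda) = log(lambda), c(lambda) = -lambda
lam = 3.0  # arbitrary rate chosen for illustration
mean, var = exp_family_mean_var(math.log, lambda t: -t, lam)
print(mean, var)  # both should be close to lam, up to finite-difference error
```

Because the derivatives are approximate, the output agrees with λ only up to small numerical error; a symbolic calculation would give exact equality.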

Poisson in one-parameter exponential family form

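$$P(Y=y) = \frac{e^{-\lambda}\lambda^{y}}{y!} \qquad y = 0, 1, 2, \ldots, \infty$$

$$P(Y=y) = e^{-\lambda}\, e^{y\log(\lambda)}\, e^{-\log(y!)} = e^{y\log(\lambda) - \lambda - \log(y!)}$$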

Recall the form: $f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}$, where the parameter $\theta = \lambda$ for the Poisson distribution

  • $a(y) = y$
  • $b(\lambda) = \log(\lambda)$
  • $c(\lambda) = -\lambda$
  • $d(y) = -\log(y!)$
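As a quick sanity check on this decomposition, the following sketch evaluates the Poisson pmf both in its standard form and in exponential family form and confirms that the two agree. The function names and the rate value are hypothetical choices made for illustration.

```python
import math

def poisson_pmf(y, lam):
    """Standard form: exp(-lambda) * lambda^y / y!"""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def poisson_pmf_expfam(y, lam):
    """Exponential family form: exp(a(y) * b(lambda) + c(lambda) + d(y))."""
    a = y                               # a(y) = y
    b = math.log(lam)                   # b(lambda) = log(lambda)
    c = -lam                            # c(lambda) = -lambda
    d = -math.log(math.factorial(y))    # d(y) = -log(y!)
    return math.exp(a * b + c + d)

lam = 2.5  # hypothetical rate, for illustration only
for y in range(6):
    # The two forms should match up to floating-point rounding
    assert math.isclose(poisson_pmf(y, lam), poisson_pmf_expfam(y, lam), rel_tol=1e-9)
    print(y, poisson_pmf_expfam(y, lam))
```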

Poisson in exponential family form

✅ The support for the Poisson distribution is y=0,1,2,…,∞. This does not depend on the parameter λ.

✅ The probability mass function can be written in the form $f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}$


The Poisson distribution can be written in one-parameter exponential family form.
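As a worked check, the mean and variance formulas for the one-parameter exponential family can be applied by hand to the Poisson components $b(\lambda) = \log(\lambda)$ and $c(\lambda) = -\lambda$. The derivatives are $b'(\lambda) = 1/\lambda$, $b''(\lambda) = -1/\lambda^2$, $c'(\lambda) = -1$, and $c''(\lambda) = 0$, so

$$
\begin{aligned}
E(Y) &= -\frac{c'(\lambda)}{b'(\lambda)} = -\frac{-1}{1/\lambda} = \lambda \\
\text{Var}(Y) &= \frac{b''(\lambda)c'(\lambda) - c''(\lambda)b'(\lambda)}{[b'(\lambda)]^3}
 = \frac{\left(-\frac{1}{\lambda^2}\right)(-1) - (0)\left(\frac{1}{\lambda}\right)}{\left(\frac{1}{\lambda}\right)^3}
 = \frac{1}{\lambda^2} \cdot \lambda^3 = \lambda
\end{aligned}
$$

This recovers the familiar result that the Poisson mean and variance are both equal to $\lambda$.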

Canonical link

Suppose there is a response variable Y from a distribution with parameter θ and a set of predictors that can be written as a linear combination $\eta = \beta_0 + \sum_{j=1}^{p} \beta_j x_j = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$


  • A link function, g(), is a monotonic and differentiable function that connects θ to η

  • When working with a member of the one-parameter exponential family, b(θ) is called the canonical link

    • Most commonly used link function

Canonical link for Poisson

Recall the exponential family form:

$$P(Y=y) = e^{y\log(\lambda) - \lambda - \log(y!)}$$


then the canonical link is $b(\lambda) = \log(\lambda)$

GLM framework: Poisson response variable

1️⃣ Response variable with parameter θ whose probability function can be written in exponential family form

$$P(Y=y) = e^{y\log(\lambda) - \lambda - \log(y!)}$$


2️⃣ A linear combination of predictors, $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$


3️⃣ A function g(λ) that connects λ and η

$$\log(\lambda) = \eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$
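To make the three components concrete, here is a minimal sketch of how the linear predictor and the log link combine to give the Poisson mean. The function names, coefficients, and predictor values are all hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
import math

def linear_predictor(beta0, betas, xs):
    """Systematic component: eta = beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))

def poisson_mean(beta0, betas, xs):
    """Invert the log link: log(lambda) = eta, so lambda = exp(eta)."""
    return math.exp(linear_predictor(beta0, betas, xs))

# Hypothetical coefficients and predictor values, for illustration only
beta0, betas = 0.5, [0.2, -0.1]
lam_a = poisson_mean(beta0, betas, [1.0, 3.0])
lam_b = poisson_mean(beta0, betas, [2.0, 3.0])  # x1 increased by one unit

# With a log link, a one-unit increase in x1 multiplies lambda by exp(beta1)
print(lam_b / lam_a, math.exp(betas[0]))
```

Because the link is the log, coefficients act multiplicatively on λ: the ratio printed above equals exp(β1) regardless of the values of the other predictors.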

Activity: Generalized linear models

For your group’s distribution

  • Write the pmf or pdf in one-parameter exponential family form.

  • Describe an example of a setting where this random variable may be used.

  • Identify the canonical link function.

Activity: Generalized linear models

Distributions

  1. Exponential
  2. Gamma (with fixed r)
  3. Geometric
  4. Binary

See Section 3.6 of Roback and Legler (2021) for details on the distributions.

If your group finishes early, try completing the exercise for another distribution.


Using the exponential family form

The one-parameter exponential family form is used for

  • Calculating MLEs of coefficients (recall iteratively reweighted least squares)

  • Inference for coefficients

  • Likelihood ratio and drop-in-deviance tests

The specific calculations are beyond the scope of this course. See Section 4.6 of Dunn, Smyth, et al. (2018) for more detail (available at Duke library).
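Although the general calculations are out of scope, the idea behind iteratively reweighted least squares can be sketched for one simple case. The code below assumes a Poisson response with the canonical log link and a single predictor, uses made-up data, and solves each weighted least squares step by hand. The function name `fit_poisson_irls` is hypothetical, and this is a teaching sketch rather than how any particular software package implements the fit.

```python
import math

def fit_poisson_irls(x, y, tol=1e-10, max_iter=50):
    """Fit log(lambda_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response and weights for the log link with Poisson variance
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on x (2x2 normal equations solved directly)
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx ** 2
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Made-up data, for illustration only
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 2, 2, 4, 6, 9]
b0, b1 = fit_poisson_irls(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(b0, b1)
```

At convergence the fitted means satisfy the score equations for the log link: the residuals y − λ̂ sum to zero, and so do the residuals multiplied by x. This gives a simple way to check that the iteration has reached the maximum likelihood estimates.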

References

Dunn, Peter K., Gordon K. Smyth, et al. 2018. Generalized Linear Models with Examples in R. Vol. 53. Springer.
Nelder, John Ashworth, and Robert W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society Series A: Statistics in Society 135 (3): 370–84.
Roback, Paul, and Julie Legler. 2021. Beyond Multiple Linear Regression: Applied Generalized Linear Models and Multilevel Models in R. CRC Press.

🔗 STA 310 - Spring 2024
