Feb 05, 2024
HW 02 due Wed, Feb 07 at 11:59pm
Project 01
presentations in class Wed, Feb 14
write up due Thu, Feb 15 at noon
Identify the components common to all generalized linear models
Find the canonical link based on the distribution of the response variable
Properties of GLMs
We have studied models for a variety of response variables
These models are all examples of generalized linear models.
GLMs have a similar structure for their likelihoods, MLEs, and variances, so we can use a generalized approach to find the model estimates and associated uncertainty.
Nelder and Wedderburn (1972) define a broad class of models, called generalized linear models, that generalizes multiple linear regression. GLMs are characterized by three components:
1️⃣ Response variable with parameter \(\theta\) whose probability function can be written in exponential family form (random component)
2️⃣ A linear combination of predictors, \(\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\) (systematic component)
3️⃣ A link function \(g(\theta)\) that connects \(\theta\) to \(\eta\)
Suppose a probability (mass or density) function has a parameter \(\theta\). It is said to have a one-parameter exponential family form if
✅ The support (set of possible values) does not depend on \(\theta\), and
✅ The probability function can be written in the following form
\[f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}\]
One-parameter exponential family form
\[f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}\]
Using this form:
\[E(Y) = -\frac{c'(\theta)}{b'(\theta)} \hspace{20mm} Var(Y) = \frac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{[b'(\theta)]^3}\]
\[P(Y = y) = \frac{e^{-\lambda}\lambda^y}{y!} \hspace{10mm} y = 0, 1, 2, \ldots\]
\[\begin{aligned}P(Y = y) &= e^{-\lambda}e^{y\log(\lambda)}e^{-\log(y!)}\\ & = e^{y\log(\lambda) - \lambda - \log(y!)}\end{aligned}\]
Recall the form: \(f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}\), where the parameter \(\theta = \lambda\) for the Poisson distribution
✅ The support for the Poisson distribution is \(y = 0, 1, 2, \ldots\). This does not depend on the parameter \(\lambda\).
✅ The probability mass function can be written in the form \(f(y;\theta) = e^{[a(y)b(\theta) + c(\theta) + d(y)]}\)
The Poisson distribution can be written in one-parameter exponential family form.
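As a worked check (not part of the original slides), matching the Poisson form term by term against the general form identifies the four pieces, and substituting them into the mean and variance formulas for the one-parameter exponential family recovers the familiar Poisson results:

\[a(y) = y \hspace{10mm} b(\lambda) = \log(\lambda) \hspace{10mm} c(\lambda) = -\lambda \hspace{10mm} d(y) = -\log(y!)\]
\[b'(\lambda) = \frac{1}{\lambda} \hspace{10mm} b''(\lambda) = -\frac{1}{\lambda^2} \hspace{10mm} c'(\lambda) = -1 \hspace{10mm} c''(\lambda) = 0\]
\[E(Y) = -\frac{c'(\lambda)}{b'(\lambda)} = -\frac{-1}{1/\lambda} = \lambda \hspace{20mm} Var(Y) = \frac{\left(-\frac{1}{\lambda^2}\right)(-1) - (0)\left(\frac{1}{\lambda}\right)}{(1/\lambda)^3} = \frac{1/\lambda^2}{1/\lambda^3} = \lambda\]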
Suppose there is a response variable \(Y\) from a distribution with parameter \(\theta\) and a set of predictors that can be written as a linear combination \(\eta = \beta_0 + \sum_{j=1}^{p}\beta_jx_j = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\)
A link function, \(g(\cdot)\), is a monotonic and differentiable function that connects \(\theta\) to \(\eta\)
When working with a member of the one-parameter exponential family, \(b(\theta)\) is called the canonical link
Recall the exponential family form of the Poisson distribution:
\[P(Y = y) = e^{y\log(\lambda) - \lambda - \log(y!)}\]
Here \(b(\lambda) = \log(\lambda)\), so the canonical link is \(g(\lambda) = \log(\lambda)\)
1️⃣ Response variable with parameter \(\theta\) whose probability function can be written in exponential family form
\[P(Y = y) = e^{y\log(\lambda) - \lambda - \log(y!)}\]
2️⃣ A linear combination of predictors, \[\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]
3️⃣ A function \(g(\lambda)\) that connects \(\lambda\) and \(\eta\)
\[\log(\lambda) = \eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\]
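The three components of Poisson regression can also be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names, coefficient values, and predictor values are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(y, lam):
    """Standard Poisson pmf: exp(-lam) * lam^y / y!."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def poisson_pmf_expfam(y, lam):
    """Same pmf written as exp[a(y) b(lam) + c(lam) + d(y)]."""
    a_y = y                      # a(y) = y
    b_lam = math.log(lam)        # b(lam) = log(lam), the canonical link
    c_lam = -lam                 # c(lam) = -lam
    d_y = -math.lgamma(y + 1)    # d(y) = -log(y!), since lgamma(y + 1) = log(y!)
    return math.exp(a_y * b_lam + c_lam + d_y)


def linear_predictor(beta, x):
    """Systematic component: eta = beta_0 + beta_1 x_1 + ... + beta_p x_p."""
    return beta[0] + sum(b_j * x_j for b_j, x_j in zip(beta[1:], x))


def inverse_log_link(eta):
    """Invert the canonical link log(lam) = eta to recover lam."""
    return math.exp(eta)


# Hypothetical coefficients and predictor values, for illustration only.
beta = [0.5, 0.2, -0.1]
x = [1.0, 3.0]

eta = linear_predictor(beta, x)   # 0.5 + 0.2 * 1.0 - 0.1 * 3.0
lam = inverse_log_link(eta)       # Poisson mean implied by the log link

print("eta:", eta)
print("lambda:", lam)
print("P(Y = 2), standard form:", poisson_pmf(2, lam))
print("P(Y = 2), exponential family form:", poisson_pmf_expfam(2, lam))
```

The two printed probabilities should agree up to floating-point error, since both functions evaluate the same pmf.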
For your group’s distribution
Write the pmf or pdf in one-parameter exponential family form.
Describe an example of a setting where this random variable may be used.
Identify the canonical link function.
Distributions
See BMLR - Section 3.6 for details on the distributions.
If your group finishes early, try completing the exercise for another distribution.
The one-parameter exponential family form is used for
Calculating MLEs of coefficients (recall iteratively reweighted least squares)
Inference for coefficients
Likelihood ratio and drop-in-deviance tests
The specific calculations are beyond the scope of this course. See Section 4.6 of Dunn, Smyth, et al. (2018) for more detail (available at Duke library).
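For readers who want a concrete picture of iteratively reweighted least squares, here is a minimal Python sketch for Poisson regression with the log link, an intercept, and a single predictor. It is an illustration under simplifying assumptions, not the implementation used by any particular software; the function name `irls_poisson` and the toy data are made up for this example.

```python
import math


def irls_poisson(x, y, n_iter=25, tol=1e-10):
    """Fit log(lambda_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    # Start from the intercept-only fit (requires at least one positive count).
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]      # linear predictor
        mu = [math.exp(e) for e in eta]       # inverse of the log link
        # Working response and weights for the canonical log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        s_w = sum(w)
        s_wx = sum(wi * xi for wi, xi in zip(w, x))
        s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        s_wz = sum(wi * zi for wi, zi in zip(w, z))
        s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Made-up toy data, for illustration only.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1, 2, 4, 7, 12]

b0_hat, b1_hat = irls_poisson(x_demo, y_demo)
print("intercept:", b0_hat)
print("slope:", b1_hat)
```

Each pass fits a weighted least squares regression of the working response on the predictor, with weights equal to the current fitted means. At convergence the estimates satisfy the Poisson score equations, so the fitted means sum to the observed total count.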