Using likelihoods

Prof. Maria Tackett

Jan 22, 2024

Announcements

  • HW 01 due Wednesday at 11:59pm

Computing set up

library(tidyverse)
library(tidymodels)
library(GGally)
library(knitr)
library(patchwork)
library(viridis)
library(ggfortify)

ggplot2::theme_set(ggplot2::theme_bw(base_size = 16))
colors <- tibble::tibble(green = "#B5BA72")

Topics

  • Review inference for multiple linear regression

  • Using likelihoods

Inference for multiple linear regression

Data: Kentucky Derby Winners

Today’s data is from the Kentucky Derby, an annual 1.25-mile horse race held at the Churchill Downs race track in Louisville, KY. The data is in the file derbyplus.csv and contains information on races from 1896 to 2017.

Response variable

  • speed: Average speed of the winner in feet per second (ft/s)

Additional variable

  • winner: Winning horse

Predictor variables

  • year: Year of the race
  • condition: Condition of the track (good, fast, slow)
  • starters: Number of horses who raced

Goal: Understand variability in average winner speed based on characteristics of the race.

Data

derby <- read_csv("data/derbyplus.csv")
derby |>
  head(5) |> kable()
year winner condition speed starters
1896 Ben Brush good 51.66 8
1897 Typhoon II slow 49.81 6
1898 Plaudit good 51.16 4
1899 Manuel fast 50.00 5
1900 Lieut. Gibson fast 52.28 7

Candidate models

Model 1: Main effects model (year, condition, starters)

model1 <- lm(speed ~ starters + year + condition, data = derby)


Model 2: Main effects + \(year^2\), the quadratic effect of year

model2 <- lm(speed ~ starters + year + I(year^2) + condition,
             data = derby)


Model 3: Main effects + interaction between year and condition

model3 <- lm(speed ~ starters + year + condition + year * condition, 
             data = derby)

Inference for regression

Use statistical inference to

  • Evaluate if predictors are statistically significant (not necessarily practically significant!)

  • Quantify uncertainty in coefficient estimates

  • Quantify uncertainty in model predictions

If LINE assumptions are met, we can use inferential methods based on mathematical models. If at least linearity and independence are met, we can use simulation-based inference methods.

Inference for regression

When LINE assumptions are met…

  • Use least squares regression to obtain the estimates for the model coefficients \(\beta_0, \beta_1, \ldots, \beta_p\) and for \(\sigma^2\)

  • \(\hat{\sigma}\) is the regression standard error

    \[ \hat{\sigma} = \sqrt{\frac{\sum_{i=1}^n(y_i - \hat{y}_i)^2}{n - p - 1}} = \sqrt{\frac{\sum_{i=1}^n e_i^2}{n-p-1}} \]

    where \(p\) is the number of non-intercept terms in the model (e.g., \(p = 1\) in simple linear regression)

  • Goal is to use estimated values to draw conclusions about \(\beta_j\)

    • Use \(\hat{\sigma}\) to calculate \(SE_{\hat{\beta}_j}\) (a quick check of \(\hat{\sigma}\) in R is sketched below)
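A quick numerical check of \(\hat{\sigma}\), assuming model1 fit in the candidate models above (this snippet is not part of the original slides):

sigma(model1)                                      # regression standard error
sqrt(sum(resid(model1)^2) / df.residual(model1))   # same value, from the residuals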

Hypothesis testing for \(\beta_j\)

  1. State the hypotheses. \(H_0: \beta_j = 0 \text{ vs. } H_a: \beta_j \neq 0\), given the other variables in the model.

  2. Calculate the test statistic.

\[ t = \frac{\hat{\beta}_j - 0}{SE_{\hat{\beta}_j}} \]

  3. Calculate the p-value. The p-value is calculated from a \(t\) distribution with \(n - p - 1\) degrees of freedom.

    \[ \text{p-value} = 2P(T > |t|) \hspace{4mm} T \sim t_{n-p-1} \]

  4. State the conclusion in context of the data.
    • Reject \(H_0\) if the p-value is sufficiently small.
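As a concrete check (not shown on the slides), the p-value for any coefficient in model1 can be reproduced from the tidy() output:

# coefficient table: estimate, std.error, statistic, p.value
coefs <- tidy(model1)

# reproduce the p-value for the starters coefficient by hand
t_stat <- coefs$statistic[coefs$term == "starters"]
2 * pt(abs(t_stat), df = df.residual(model1), lower.tail = FALSE)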

Confidence interval for \(\beta_j\)

The \(C\%\) confidence interval for \(\beta_j\) is

\[\hat{\beta}_j \pm t^* \times SE_{\hat{\beta}_j}\]

where the critical value \(t^*\) is calculated from a \(t_{n-p-1}\) distribution


General interpretation for the confidence interval [LB, UB]:

We are \(C\%\) confident that for every one unit increase in \(x_j\), the response is expected to change by LB to UB units, holding all else constant.
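These intervals can be computed directly in R; a minimal sketch using model1 from above (not part of the original slides):

# 95% confidence intervals for the Model 1 coefficients
tidy(model1, conf.int = TRUE, conf.level = 0.95)

# equivalently, with base R
confint(model1, level = 0.95)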

Application exercise

Measures of model performance

  • \(R^2\): Proportion of variability in the response explained by the model

    • Will never decrease as predictors are added, so it shouldn’t be used to compare models
  • \(Adj. R^2\): Similar to \(R^2\) with a penalty for extra terms

  • \(AIC\): Likelihood-based approach balancing model performance and complexity

  • \(BIC\): Similar to AIC with stronger penalty for extra terms

Model summary statistics

Use the glance() function to get model summary statistics
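The code for the table below is not shown on the slide; one way to assemble it is to row-bind the glance() output for each model:

# collect summary statistics for the three candidate models
bind_rows(
  glance(model1) |> mutate(model = "Model1"),
  glance(model2) |> mutate(model = "Model2"),
  glance(model3) |> mutate(model = "Model3")
) |>
  select(model, r.squared, adj.r.squared, AIC, BIC) |>
  kable(digits = 3)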


model r.squared adj.r.squared AIC BIC
Model1 0.730 0.721 259.478 276.302
Model2 0.827 0.819 207.429 227.057
Model3 0.751 0.738 253.584 276.016

Which model do you choose based on these statistics?

Characteristics of a “good” final model

  • Model can be used to answer primary research questions

  • Predictor variables control for important covariates

  • Potential interactions have been investigated

  • Variables are centered, as needed, for more meaningful interpretations

  • Unnecessary terms are removed

  • Assumptions are met and influential points have been addressed

  • Model tells a “persuasive story parsimoniously”

Using likelihoods

Learning goals

  • Describe the concept of a likelihood

  • Construct the likelihood for a simple model

  • Define the Maximum Likelihood Estimate (MLE) and use it to answer an analysis question

  • Identify three ways to calculate or approximate the MLE and apply these methods to find the MLE for a simple model

  • Use likelihoods to compare models

What is the likelihood?

A likelihood is a function that tells us how likely we are to observe our data for a given parameter value (or values).

  • Unlike Ordinary Least Squares (OLS), likelihood methods do not require that the responses be independent, identically distributed, and normal (iidN)

  • They are not the same as probability functions

Probability function vs. likelihood

  • Probability function: Fixed parameter value(s) + input possible outcomes \(\Rightarrow\) probability of seeing the different outcomes given the parameter value(s)

  • Likelihood: Fixed data + input possible parameter values \(\Rightarrow\) probability of seeing the fixed data for each parameter value
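A small numerical illustration of this distinction (not in the original slides), using the foul-call setting introduced below:

# probability function: fix the parameter, vary the outcome
# P(y home fouls out of 3) when p_H = 0.5
dbinom(0:3, size = 3, prob = 0.5)

# likelihood: fix the data (say, 2 home fouls in 3), vary the parameter
p <- c(0.3, 0.5, 0.7)
p^2 * (1 - p)

# note: dbinom() includes the binomial coefficient choose(3, y); the
# likelihoods in these notes drop it, since constant factors do not
# change where the maximum occurs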

Data: Fouls in college basketball games

The data set 04-refs.csv includes 30 randomly selected NCAA men’s basketball games played in the 2009 - 2010 season (Roback and Legler 2021).

We will focus on the variables foul1, foul2, and foul3, which indicate which team had a foul called on them for the 1st, 2nd, and 3rd fouls, respectively.

  • H: Foul was called on the home team
  • V: Foul was called on the visiting team

We are focusing on the first three fouls for this analysis, but this could easily be extended to include all fouls in a game.

Fouls in college basketball games

refs <- read_csv("data/04-refs.csv")
refs |> slice(1:5) |> kable()
game date visitor hometeam foul1 foul2 foul3
166 20100126 CLEM BC V V V
224 20100224 DEPAUL CIN H H V
317 20100109 MARQET NOVA H H H
214 20100228 MARQET SETON V V H
278 20100128 SETON SFL H V V

We will treat the games as independent in this analysis.

Different likelihood models

Model 1 (Unconditional Model):

  • What is the probability the referees call a foul on the home team, assuming foul calls within a game are independent?

Model 2 (Conditional Model):

  • Is there a tendency for the referees to call more fouls on the visiting team or home team?

  • Is there a tendency for referees to call a foul on the team that already has more fouls?

Ultimately we want to decide which model is better.

Exploratory data analysis

refs |>
  count(foul1, foul2, foul3) |>
  kable()
foul1 foul2 foul3 n
H H H 3
H H V 2
H V H 3
H V V 7
V H H 7
V H V 1
V V H 5
V V V 2

There are

  • 46 total fouls on the home team
  • 44 total fouls on the visiting team
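These totals can be verified directly from the data; a minimal sketch, assuming the column names shown above:

# count fouls by team across foul1-foul3
refs |>
  pivot_longer(foul1:foul3, names_to = "foul", values_to = "team") |>
  count(team)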

Model 1: Unconditional model

What is the probability the referees call a foul on the home team, assuming foul calls within a game are independent?

Likelihood

Let \(p_H\) be the probability the referees call a foul on the home team. The likelihood for a single observation is \[Lik(p_H) = p_H^{y_i}(1 - p_H)^{n_i - y_i}\] where \(y_i\) is the number of fouls called on the home team in game \(i\). (In this example, we know \(n_i = 3\) for all observations.)

Example

For a single game where the first three fouls are \(H, H, V\), then \[Lik(p_H) = p_H^{2}(1 - p_H)^{3 - 2} = p_H^{2}(1 - p_H)\]

Model 1: Likelihood contribution

Foul 1 Foul 2 Foul 3 n Likelihood contribution
H H H 3 \(p_H^3\)
H H V 2 \(p_H^2(1 - p_H)\)
H V H 3 \(p_H^2(1 - p_H)\)
H V V 7 A
V H H 7 B
V H V 1 \(p_H(1 - p_H)^2\)
V V H 5 \(p_H(1 - p_H)^2\)
V V V 2 \((1 - p_H)^3\)

Fill in A and B.


Model 1: Likelihood function

Because the observations (the games) are independent, the likelihood is

\[Lik(p_H) = \prod_{i=1}^{n}p_H^{y_i}(1 - p_H)^{3 - y_i}\]

We will use this function to find the maximum likelihood estimate (MLE). The MLE is the value of \(p_H\), between 0 and 1, under which we are most likely to see the observed data.

Visualizing the likelihood

What is your best guess for the MLE, \(\hat{p}_H\)?

  1. 0.489
  2. 0.500
  3. 0.511
  4. 0.556

Finding the maximum likelihood estimate

There are three primary ways to find the MLE

✅ Approximate using a graph

✅ Numerical approximation

✅ Using calculus

Approximate MLE from a graph
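The plot itself is not reproduced here; a sketch of how it could be drawn with ggplot2, using the likelihood with 46 home and 44 visiting fouls:

# likelihood over a grid of p_H values
lik_df <- tibble(
  ph = seq(0, 1, length.out = 1000),
  lik = ph^46 * (1 - ph)^44
)

ggplot(lik_df, aes(x = ph, y = lik)) +
  geom_line() +
  geom_vline(xintercept = 46 / 90, linetype = "dashed") +  # MLE (derived later)
  labs(x = expression(p[H]), y = "Likelihood")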

Find the MLE using numerical approximation

Specify a finite set of possible values for \(p_H\) and calculate the likelihood for each value

# write an R function for the likelihood
ref_lik <- function(ph) {
  ph^46 * (1 - ph)^44
}

# search a grid of possible values for p_H and return the maximizer
nGrid <- 1000
ph <- seq(0, 1, length.out = nGrid)
lik <- ref_lik(ph)
ph[which.max(lik)]
[1] 0.5115115
# use the optimize function to find the MLE
optimize(ref_lik, interval = c(0,1), maximum = TRUE)
$maximum
[1] 0.5111132

$objective
[1] 8.25947e-28
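A variant worth noting (not on the slides): maximizing the log(Likelihood) instead gives the same maximizer and avoids the very small values seen in $objective above.

# log-likelihood version of the same search
ref_loglik <- function(ph) {
  46 * log(ph) + 44 * log(1 - ph)
}
optimize(ref_loglik, interval = c(0, 1), maximum = TRUE)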

Find MLE using calculus

  • Find the MLE by taking the first derivative of the likelihood function, setting it equal to 0, and solving for the parameter.

  • This can be tricky because of the Product Rule, so we can maximize the log(Likelihood) instead. Because \(\log\) is a monotonically increasing function, the same value maximizes both the likelihood and the log(Likelihood)

Find MLE using calculus

\[Lik(p_H) = \prod_{i=1}^{n}p_H^{y_i}(1 - p_H)^{3 - y_i}\]

\[ \begin{aligned}\log(Lik(p_H)) &= \sum_{i=1}^{n}\left[y_i\log(p_H) + (3 - y_i)\log(1 - p_H)\right]\\[10pt] &= 46\log(p_H) + 44\log(1 - p_H)\end{aligned} \]

Find MLE using calculus

\[\frac{d}{d p_H} \log(Lik(p_H)) = \frac{46}{p_H} - \frac{44}{1-p_H} = 0\]

\[\Rightarrow \frac{46}{p_H} = \frac{44}{1-p_H}\]

\[\Rightarrow 46(1-p_H) = 44p_H\]

\[\Rightarrow 46 = 90p_H\]

\[ \hat{p}_H = \frac{46}{90} = 0.511 \]


Model 2: Conditional model

Is there a tendency for referees to call more fouls on the visiting team or home team?

Is there a tendency for referees to call a foul on the team that already has more fouls?

Model 2: Conditional model

Now let’s assume fouls are not independent within each game. We will specify this dependence using conditional probabilities.

  • Conditional probability: \(P(A|B) =\) Probability of \(A\) given \(B\) has occurred

Define new parameters:

  • \(p_{H|N}\): Probability referees call foul on home team given there are equal numbers of fouls on the home and visiting teams
  • \(p_{H|H Bias}\): Probability referees call foul on home team given there are more prior fouls on the home team
  • \(p_{H|V Bias}\): Probability referees call foul on home team given there are more prior fouls on the visiting team

Model 2: Likelihood contributions

Foul 1 Foul 2 Foul 3 n Likelihood contribution
H H H 3 \((p_{H\vert N})(p_{H\vert H Bias})(p_{H\vert H Bias}) = (p_{H\vert N})(p_{H\vert H Bias})^2\)
H H V 2 \((p_{H\vert N})(p_{H\vert H Bias})(1 - p_{H\vert H Bias})\)
H V H 3 \((p_{H\vert N})(1 - p_{H\vert H Bias})(p_{H\vert N}) = (p_{H\vert N})^2(1 - p_{H\vert H Bias})\)
H V V 7 A
V H H 7 B
V H V 1 \((1 - p_{H\vert N})(p_{H\vert V Bias})(1 - p_{H\vert N}) = (1 - p_{H\vert N})^2(p_{H\vert V Bias})\)
V V H 5 \((1 - p_{H\vert N})(1-p_{H\vert V Bias})(p_{H\vert V Bias})\)
V V V 2 \((1 - p_{H\vert N})(1-p_{H\vert V Bias})(1-p_{H\vert V Bias}) = (1 - p_{H\vert N})(1-p_{H\vert V Bias})^2\)

Fill in A and B.

Likelihood function

\[\begin{aligned}Lik(p_{H| N}, p_{H|H Bias}, p_{H |V Bias}) &= [(p_{H| N})^{25}(1 - p_{H|N})^{23}(p_{H| H Bias})^8 \\ &(1 - p_{H| H Bias})^{12}(p_{H| V Bias})^{13}(1-p_{H|V Bias})^9]\end{aligned}\]

(Note: The exponents sum to 90, the total number of fouls in the data)


\[\begin{aligned}\log (Lik(p_{H| N}, p_{H|H Bias}, p_{H |V Bias})) &= 25 \log(p_{H| N}) + 23 \log(1 - p_{H|N}) \\ & + 8 \log(p_{H| H Bias}) + 12 \log(1 - p_{H| H Bias})\\ &+ 13 \log(p_{H| V Bias}) + 9 \log(1-p_{H|V Bias})\end{aligned}\]
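A numerical sketch for maximizing this log-likelihood over all three parameters (not part of the slides; ref_loglik2 is a name introduced here):

# Model 2 log-likelihood as a function of p = (p_H|N, p_H|H Bias, p_H|V Bias)
ref_loglik2 <- function(p) {
  25 * log(p[1]) + 23 * log(1 - p[1]) +   # p_H|N terms
   8 * log(p[2]) + 12 * log(1 - p[2]) +   # p_H|H Bias terms
  13 * log(p[3]) +  9 * log(1 - p[3])     # p_H|V Bias terms
}

# optim() minimizes by default; fnscale = -1 makes it maximize
optim(par = c(0.5, 0.5, 0.5), fn = ref_loglik2,
      method = "L-BFGS-B", lower = 0.001, upper = 0.999,
      control = list(fnscale = -1))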

If fouls within a game are independent, how would you expect \(\hat{p}_H\), \(\hat{p}_{H\vert H Bias}\) and \(\hat{p}_{H\vert V Bias}\) to compare?

  1. \(\hat{p}_H\) is greater than \(\hat{p}_{H\vert H Bias}\) and \(\hat{p}_{H \vert V Bias}\)

  2. \(\hat{p}_{H\vert H Bias}\) is greater than \(\hat{p}_H\) and \(\hat{p}_{H \vert V Bias}\)

  3. \(\hat{p}_{H\vert V Bias}\) is greater than \(\hat{p}_H\) and \(\hat{p}_{H \vert H Bias}\)

  4. They are all approximately equal.


If there is a tendency for referees to call a foul on the team that already has more fouls, how would you expect \(\hat{p}_H\) and \(\hat{p}_{H\vert H Bias}\) to compare?

  1. \(\hat{p}_H\) is greater than \(\hat{p}_{H\vert H Bias}\)

  2. \(\hat{p}_{H\vert H Bias}\) is greater than \(\hat{p}_H\)

  3. They are approximately equal.

Next time

  • Using likelihoods to compare models

  • Chapter 3: Distribution theory

References

Roback, Paul, and Julie Legler. 2021. Beyond multiple linear regression: applied generalized linear models and multilevel models in R. CRC Press.