# 04.Shrinkage Methods

## Shrinkage methods

1. Why? Some variables may be redundant, so we shrink their coefficients toward zero to get a simpler model.

### Lasso

1. A sufficiently small constraint $t$ causes some of the coefficients to shrink exactly to 0, so the lasso performs variable selection and produces a sparse model.
2. Fitting the lasso is a convex optimization problem.
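
For reference, the constraint $t$ above enters the lasso as a bound on the $\ell_1$ norm of the coefficients. The notation below ($N$ observations, $p$ predictors, intercept $\beta_0$, penalty $\lambda$) is assumed here rather than taken from these notes:

$$
\hat\beta^{\text{lasso}} = \arg\min_{\beta} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2
\quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le t .
$$

An equivalent Lagrangian form replaces the constraint with a penalty, where each $t$ corresponds to some $\lambda \ge 0$:

$$
\hat\beta^{\text{lasso}} = \arg\min_{\beta} \; \frac{1}{2} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert .
$$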

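Why does the lasso lead to coefficients that are exactly 0?

The reason becomes clear once you plot the constraint region together with the contours of the RSS (Fig. 3.11). The lasso constraint region is a diamond with corners on the axes, and the elliptical RSS contours often first touch it at a corner, where one coefficient is exactly 0. The ridge constraint region is a disk with no corners, so the contours almost never touch it at a point where a coefficient is exactly 0.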
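
The same effect can be seen numerically in the special case of orthonormal predictors, where both estimators have simple closed forms in terms of the least squares estimate. This is a minimal sketch under that assumption. The function names, the example coefficients, and the Lagrangian penalty `lam` (which plays the role of the constraint $t$) are illustrative choices, not part of the notes.

```python
import numpy as np

def ridge_orthonormal(beta_ols, lam):
    """Ridge estimate for orthonormal predictors: proportional shrinkage."""
    return beta_ols / (1.0 + lam)

def lasso_orthonormal(beta_ols, lam):
    """Lasso estimate for orthonormal predictors: soft-thresholding."""
    return np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)

# Hypothetical least squares coefficients, chosen only for illustration.
beta_ols = np.array([3.0, -0.5, 0.2, -2.0])
lam = 1.0

beta_ridge = ridge_orthonormal(beta_ols, lam)
beta_lasso = lasso_orthonormal(beta_ols, lam)

print("ridge:", beta_ridge)  # every coefficient is scaled down, none is exactly 0
print("lasso:", beta_lasso)  # coefficients with |beta| <= lam become exactly 0
```

Ridge rescales every coefficient by the same factor, so nonzero inputs stay nonzero. Soft-thresholding subtracts a fixed amount and truncates at 0, which is what produces the exact zeros.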

### Compare Lasso and Ridge

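When the true model is sparse (only a few coefficients are really nonzero), the lasso tends to do better. Otherwise, the lasso can fit worse than ridge.

There is no general rule of thumb for choosing between them.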
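
One way to explore this for a given setting is a small simulation. The sketch below is only a starting point and claims no particular outcome. The sample sizes, the two "true" coefficient vectors, and the penalty grid are arbitrary assumptions. Picking the best penalty by test error is a shortcut for illustration, not a realistic tuning procedure. The lasso is fitted with a plain coordinate descent loop written for this example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution (no intercept): (X^T X + lam I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def lasso_fit(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.copy()  # equals y - X @ beta because beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the residual that leaves out feature j.
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new_bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta

def simulate(beta_true, rng, n=100, n_test=1000):
    """Draw Gaussian training and test data from a linear model with unit noise."""
    p = beta_true.size
    X = rng.standard_normal((n, p))
    y = X @ beta_true + rng.standard_normal(n)
    X_test = rng.standard_normal((n_test, p))
    y_test = X_test @ beta_true + rng.standard_normal(n_test)
    return X, y, X_test, y_test

def best_test_mse(fit, data, lams):
    """Smallest test MSE over the penalty grid (an oracle choice, for illustration)."""
    X, y, X_test, y_test = data
    return min(float(np.mean((y_test - X_test @ fit(X, y, lam)) ** 2)) for lam in lams)

rng = np.random.default_rng(0)
p = 30
sparse_truth = np.zeros(p)
sparse_truth[:3] = 3.0            # hypothetical sparse model: 3 large coefficients
dense_truth = np.full(p, 0.5)     # hypothetical dense model: many small coefficients
lams = [0.1, 1.0, 10.0, 30.0, 100.0]

results = {}
for name, truth in [("sparse", sparse_truth), ("dense", dense_truth)]:
    data = simulate(truth, rng)
    results[name] = {
        "ridge": best_test_mse(ridge_fit, data, lams),
        "lasso": best_test_mse(lasso_fit, data, lams),
    }
print(results)
```

Rerunning with different seeds, dimensions, or coefficient patterns gives a feel for how much the comparison depends on the underlying truth.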

### Generalization

Ridge and lasso can be generalized by replacing the penalty term with $\sum_j \lvert \beta_j \rvert^q$ for $q \ge 0$.

1. $q=0$: subset selection
2. $q=1$: lasso
3. $q=2$: ridge

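Smaller $q$ leads to more aggressive variable selection. For $q > 1$ the penalty is differentiable at 0, so coefficients are generally not set exactly to 0. For $q < 1$ the constraint region is no longer convex.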
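
To make the three special cases concrete, the penalty can be evaluated directly for a small coefficient vector. This is a minimal sketch. The function name `lq_penalty` and the example vector are made up for illustration, and $q = 0$ is handled explicitly as the count of nonzero coefficients, since `0 ** 0` would otherwise evaluate to 1.

```python
import numpy as np

def lq_penalty(beta, q):
    """Penalty sum_j |beta_j|^q; q = 0 is treated as the number of nonzero coefficients."""
    beta = np.asarray(beta, dtype=float)
    if q == 0:
        return float(np.count_nonzero(beta))
    return float(np.sum(np.abs(beta) ** q))

beta = [3.0, 0.0, -4.0]  # hypothetical coefficients
penalties = {q: lq_penalty(beta, q) for q in (0, 1, 2)}
print(penalties)  # {0: 2.0, 1: 7.0, 2: 25.0}
```

With $q = 0$ the penalty depends only on how many coefficients are nonzero (subset selection). With $q = 1$ it is the sum of absolute values (lasso), and with $q = 2$ it is the sum of squares (ridge).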