## Shrinkage methods

- Why? Some variables might be redundant, so we shrink their coefficients to get a simpler model.

### Ridge Regression
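
For reference, here is a sketch of the standard constrained form of ridge regression. The notation is an assumption, since these notes do not define it: $y_i$ are the responses, $x_{ij}$ the inputs, $N$ the number of observations, $p$ the number of predictors, and $t$ the size of the constraint.

$$
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} \beta_j^2 \le t.
$$

The squared ($L_2$) constraint shrinks all coefficients toward 0, but generally not exactly to 0.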

### Lasso

- A sufficiently small constraint $t$ causes some of the coefficients to shrink exactly to 0: this is **variable selection**, which produces a **sparse model**.
- It is a convex optimization problem.
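
To make the role of $t$ concrete, the lasso can be sketched in its constrained form. The notation is an assumption, since these notes do not define it: $y_i$ are the responses, $x_{ij}$ the inputs, $N$ the number of observations, and $p$ the number of predictors.

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le t.
$$

Both the residual sum of squares and the constraint region are convex, which is why the problem is a convex optimization.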

**Why would lasso lead to exact 0 coefficients?**

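You can spot the reason by plotting the constraint region together with the contours of the RSS (Fig. 3.11). With two coefficients, the lasso constraint region is a diamond with corners on the axes; when the RSS contours first touch it at a corner, one of the coefficients is exactly 0. The ridge constraint region is a disk with no corners, so an exact 0 is unlikely.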
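
A complementary, algebraic view can be sketched in code. Assuming orthonormal inputs (an assumption made only for this illustration), both estimators have closed forms in terms of the least-squares coefficients: ridge rescales each coefficient, while lasso soft-thresholds it. The function names and numbers below are hypothetical, and `lam` stands for the Lagrangian penalty that corresponds to the constraint $t$.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge with orthonormal inputs: proportional shrinkage, never exactly 0."""
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_shrink(beta_ols, lam):
    """Lasso with orthonormal inputs: soft thresholding, sign(b) * max(|b| - lam, 0)."""
    out = []
    for b in beta_ols:
        shrunk = abs(b) - lam
        if shrunk <= 0:
            out.append(0.0)  # small coefficients are set exactly to 0
        else:
            out.append(shrunk if b > 0 else -shrunk)
    return out


# Illustrative least-squares coefficients (made-up numbers).
beta_ols = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

ridge = ridge_shrink(beta_ols, lam)
lasso = lasso_shrink(beta_ols, lam)

print("ridge:", ridge)
print("lasso:", lasso)
```

In this sketch every ridge coefficient stays nonzero, while the two lasso coefficients whose magnitude is below `lam` become exactly 0.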

### Compare Lasso and Ridge

When the true model is sparse, lasso tends to do better. Otherwise, lasso can fit worse than ridge.

There is no general rule of thumb for choosing between them.

### Generalization

Ridge and lasso can be generalized by replacing the penalty with $\sum_j \lvert \beta_j \rvert^q$ for $q \ge 0$:

- $q=0$: subset selection
- $q=1$: lasso
- $q=2$: ridge

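Smaller $q$ leads to more aggressive variable selection. For $q \le 1$ the constraint region has corners on the axes, so coefficients can be set exactly to 0; for $q > 1$ the penalty is differentiable at 0 and does not have this property.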
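
The family of penalties can be sketched as a single function. This is a minimal illustration, not a fitting routine: the name `lq_penalty` and the example vector are hypothetical, and $q=0$ is handled as a special case that counts the nonzero coefficients, which is the convention under which it corresponds to subset selection.

```python
def lq_penalty(beta, q):
    """Return sum_j |beta_j|^q, with q = 0 counting the nonzero coefficients."""
    if q == 0:
        # Python evaluates 0.0 ** 0 as 1.0, so count nonzeros explicitly.
        return float(sum(1 for b in beta if b != 0))
    return sum(abs(b) ** q for b in beta)


beta = [2.0, 0.0, -0.5]  # made-up coefficient vector

subset_size = lq_penalty(beta, 0)    # number of nonzero coefficients
lasso_penalty = lq_penalty(beta, 1)  # L1 norm
ridge_penalty = lq_penalty(beta, 2)  # squared L2 norm

print(subset_size, lasso_penalty, ridge_penalty)
```

Plugging any of these into the penalized criterion (RSS plus a multiple of the penalty) gives the corresponding member of the family.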
