The lars package implements one version of the LASSO fitting algorithm.
Its result is a path of coefficients, evaluated as a function of the
L1 norm of the estimated coefficients.
library(lars)
data(diabetes)
diabetes.lasso = lars(diabetes$x, diabetes$y, type = "lasso")
plot(diabetes.lasso)
You can also estimate the prediction error by cross-validation, using cv.lars from the same package.
cv.lars(diabetes$x, diabetes$y, K = 10, type = "lasso")
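The cross-validation curve can also be used to pick a point on the path. The following is a minimal sketch, assuming the list returned by cv.lars has the components index (the grid of fractions of the maximal L1 norm) and cv (the mean cross-validation error at each grid point); the names cv.out, best.frac and best.coef are introduced here for illustration only.

```r
library(lars)
data(diabetes)

set.seed(1)  # the folds are chosen at random, so fix the seed
cv.out <- cv.lars(diabetes$x, diabetes$y, K = 10, type = "lasso",
                  plot.it = FALSE)

# Fraction of the maximal L1 norm with the smallest mean CV error
best.frac <- cv.out$index[which.min(cv.out$cv)]

# Coefficients on the LASSO path at that fraction
diabetes.lasso <- lars(diabetes$x, diabetes$y, type = "lasso")
best.coef <- coef(diabetes.lasso, s = best.frac, mode = "fraction")
best.coef
```

Because the folds are random, a different seed may select a slightly different fraction.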
The LASSO can scale up to very large problems because efficient
solvers take advantage of the fact that, for large values of the
penalty parameter λ, the solution is sparse and can be computed quickly.
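To see why a large penalty gives a sparse solution, here is a toy coordinate descent solver in base R. It is only a sketch for illustration, not the implementation used by any package. It assumes the objective (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1 with centered data and no intercept; the function names soft_threshold and lasso_cd and the simulated data are hypothetical.

```r
# Soft-thresholding: shrink z toward zero by lambda, setting small values to 0
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  col_ss <- colSums(X^2) / n   # (1/n) * x_j' x_j for each column
  r <- y                       # residual when beta = 0
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      r <- r + X[, j] * beta[j]          # remove coordinate j from the fit
      z <- sum(X[, j] * r) / n           # correlation with partial residual
      beta[j] <- soft_threshold(z, lambda) / col_ss[j]
      r <- r - X[, j] * beta[j]          # put the updated coordinate back
    }
  }
  beta
}

set.seed(1)
n <- 100; p <- 20
X <- scale(matrix(rnorm(n * p), n, p))
y <- 2 * X[, 1] - X[, 2] + rnorm(n)
y <- y - mean(y)

# At or above this value every coordinate is thresholded to zero
lambda_max <- max(abs(crossprod(X, y))) / n
lambdas <- lambda_max * c(1.01, 0.5, 0.1, 0.01)
nonzero <- sapply(lambdas, function(l) sum(lasso_cd(X, y, l) != 0))
nonzero
```

Just above lambda_max no variable survives the thresholding step, so each sweep does almost no work; variables only enter as lambda decreases.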
library(glmnet)
## Loading required package: Matrix
## Loading required package: lattice
## Loaded glmnet 1.7.4
load("/Users/jonathantaylor/Documents/work/stats191.git/notebooks/NewsGroup.RData")
Y = (NewsGroup$y + 1)/2
X = NewsGroup$x
print(dim(X))
## [1] 11314 777811
news = glmnet(X, as.factor(Y), family = "binomial")
news
##
## Call: glmnet(x = X, y = as.factor(Y), family = "binomial")
##
## Df %Dev Lambda
## [1,] 0 2.01e-13 0.12000
## [2,] 1 3.68e-03 0.11400
## [3,] 1 7.03e-03 0.10900
## [4,] 1 1.01e-02 0.10400
## [5,] 1 1.29e-02 0.09930
## [6,] 1 1.54e-02 0.09480
## [7,] 1 1.78e-02 0.09050
## [8,] 3 2.10e-02 0.08640
## [9,] 4 2.53e-02 0.08250
## [10,] 4 3.04e-02 0.07870
## [11,] 4 3.53e-02 0.07510
## [12,] 4 3.99e-02 0.07170
## [13,] 5 4.46e-02 0.06850
## [14,] 5 4.97e-02 0.06540
## [15,] 7 5.54e-02 0.06240
## [16,] 7 6.12e-02 0.05960
## [17,] 15 6.89e-02 0.05680
## [18,] 15 7.81e-02 0.05430
## [19,] 17 8.76e-02 0.05180
## [20,] 18 9.68e-02 0.04940
## [21,] 21 1.06e-01 0.04720
## [22,] 24 1.17e-01 0.04510
## [23,] 26 1.27e-01 0.04300
## [24,] 28 1.38e-01 0.04100
## [25,] 31 1.48e-01 0.03920
## [26,] 40 1.58e-01 0.03740
## [27,] 46 1.70e-01 0.03570
## [28,] 53 1.82e-01 0.03410
## [29,] 63 1.94e-01 0.03250
## [30,] 72 2.07e-01 0.03110
## [31,] 85 2.21e-01 0.02960
## [32,] 95 2.36e-01 0.02830
## [33,] 101 2.50e-01 0.02700
## [34,] 112 2.64e-01 0.02580
## [35,] 118 2.78e-01 0.02460
## [36,] 129 2.92e-01 0.02350
## [37,] 143 3.05e-01 0.02240
## [38,] 157 3.19e-01 0.02140
## [39,] 169 3.33e-01 0.02040
## [40,] 184 3.47e-01 0.01950
## [41,] 203 3.61e-01 0.01860
## [42,] 216 3.75e-01 0.01780
## [43,] 230 3.88e-01 0.01700
## [44,] 249 4.02e-01 0.01620
## [45,] 273 4.15e-01 0.01550
## [46,] 312 4.29e-01 0.01480
## [47,] 352 4.42e-01 0.01410
## [48,] 385 4.56e-01 0.01340
## [49,] 429 4.69e-01 0.01280
## [50,] 457 4.83e-01 0.01220
## [51,] 539 4.96e-01 0.01170
## [52,] 587 5.10e-01 0.01120
## [53,] 637 5.23e-01 0.01070
## [54,] 693 5.37e-01 0.01020
## [55,] 756 5.50e-01 0.00971
## [56,] 842 5.64e-01 0.00926
## [57,] 918 5.78e-01 0.00884
## [58,] 993 5.91e-01 0.00844
## [59,] 1099 6.05e-01 0.00806
## [60,] 1189 6.18e-01 0.00769
## [61,] 1314 6.31e-01 0.00734
## [62,] 1425 6.44e-01 0.00701
## [63,] 1551 6.57e-01 0.00669
## [64,] 1660 6.70e-01 0.00639
## [65,] 1785 6.83e-01 0.00610
## [66,] 1895 6.95e-01 0.00582
## [67,] 2014 7.07e-01 0.00555
## [68,] 2225 7.19e-01 0.00530
## [69,] 2389 7.31e-01 0.00506
## [70,] 2473 7.42e-01 0.00483
## [71,] 2606 7.53e-01 0.00461
## [72,] 2726 7.64e-01 0.00440
## [73,] 2824 7.74e-01 0.00420
## [74,] 2927 7.84e-01 0.00401
## [75,] 3119 7.94e-01 0.00383
## [76,] 3207 8.03e-01 0.00365
## [77,] 3329 8.12e-01 0.00349
## [78,] 3502 8.21e-01 0.00333
## [79,] 3583 8.29e-01 0.00318
## [80,] 3801 8.37e-01 0.00303
## [81,] 3855 8.44e-01 0.00290
## [82,] 4001 8.51e-01 0.00276
## [83,] 4031 8.58e-01 0.00264
## [84,] 4160 8.65e-01 0.00252
## [85,] 4197 8.71e-01 0.00240
## [86,] 4267 8.77e-01 0.00229
## [87,] 4318 8.83e-01 0.00219
## [88,] 4396 8.88e-01 0.00209
## [89,] 4504 8.93e-01 0.00200
## [90,] 4578 8.98e-01 0.00191
## [91,] 4621 9.03e-01 0.00182
## [92,] 4814 9.07e-01 0.00174
## [93,] 4831 9.11e-01 0.00166
## [94,] 4907 9.15e-01 0.00158
## [95,] 4982 9.19e-01 0.00151
## [96,] 5101 9.23e-01 0.00144
## [97,] 5148 9.26e-01 0.00138
## [98,] 5205 9.30e-01 0.00131
## [99,] 5189 9.33e-01 0.00125
## [100,] 5255 9.36e-01 0.00120
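The NewsGroup data above is loaded from a local file, so the fit cannot be rerun without it. As a stand-in, the sketch below fits the same kind of model to a simulated sparse design matrix; the dimensions, the density and the true coefficients are made up for illustration. It assumes the fitted object exposes the Df and Lambda columns of the printout as the components df and lambda.

```r
library(glmnet)
library(Matrix)

set.seed(1)
n <- 200; p <- 5000

# Simulated sparse design: about 5% of the entries are nonzero
X.sim <- rsparsematrix(n, p, density = 0.05)

# Only the first 10 variables affect the response (hypothetical signal)
beta.true <- c(rep(3, 10), rep(0, p - 10))
prob <- 1 / (1 + exp(-as.vector(X.sim %*% beta.true)))
Y.sim <- rbinom(n, 1, prob)

fit.sim <- glmnet(X.sim, as.factor(Y.sim), family = "binomial")

# Number of nonzero coefficients and penalty at the start of the path
head(cbind(Df = fit.sim$df, Lambda = fit.sim$lambda))
```

The design matrix is kept in sparse format throughout, which is what makes a problem with hundreds of thousands of columns feasible.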
The package gglasso has an implementation of the group LASSO.
library(gglasso)
Here is an example from the documentation in the gglasso package.
data(bardet)
The 100 variables are split into 20 groups of size 5.
group1 = rep(1:20, each = 5)
group1
## [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5
## [24] 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10
## [47] 10 10 10 10 11 11 11 11 11 12 12 12 12 12 13 13 13 13 13 14 14 14 14
## [70] 14 15 15 15 15 15 16 16 16 16 16 17 17 17 17 17 18 18 18 18 18 19 19
## [93] 19 19 19 20 20 20 20 20
Let’s fit the model.
m1 <- gglasso(x = bardet$x, y = bardet$y, group = group1, loss = "ls")
m1
##
## Call: gglasso(x = bardet$x, y = bardet$y, group = group1, loss = "ls")
##
## Df Lambda
## s0 0 7.58e-03
## s1 5 7.07e-03
## s2 5 6.59e-03
## s3 5 6.14e-03
## s4 5 5.73e-03
## s5 10 5.34e-03
## s6 10 4.98e-03
## s7 10 4.65e-03
## s8 15 4.34e-03
## s9 20 4.04e-03
## s10 25 3.77e-03
## s11 25 3.52e-03
## s12 25 3.28e-03
## s13 25 3.06e-03
## s14 25 2.85e-03
## s15 35 2.66e-03
## s16 40 2.48e-03
## s17 40 2.31e-03
## s18 40 2.16e-03
## s19 35 2.01e-03
## s20 45 1.88e-03
## s21 55 1.75e-03
## s22 55 1.63e-03
## s23 60 1.52e-03
## s24 65 1.42e-03
## s25 65 1.32e-03
## s26 65 1.23e-03
## s27 65 1.15e-03
## s28 65 1.07e-03
## s29 65 1.00e-03
## s30 75 9.34e-04
## s31 75 8.71e-04
## s32 75 8.12e-04
## s33 75 7.58e-04
## s34 75 7.07e-04
## s35 75 6.59e-04
## s36 80 6.14e-04
## s37 80 5.73e-04
## s38 85 5.34e-04
## s39 90 4.98e-04
## s40 90 4.65e-04
## s41 90 4.34e-04
## s42 95 4.04e-04
## s43 95 3.77e-04
## s44 100 3.52e-04
## s45 100 3.28e-04
## s46 100 3.06e-04
## s47 100 2.85e-04
## s48 100 2.66e-04
## s49 100 2.48e-04
## s50 100 2.31e-04
## s51 100 2.16e-04
## s52 100 2.01e-04
## s53 100 1.88e-04
## s54 100 1.75e-04
## s55 100 1.63e-04
## s56 100 1.52e-04
## s57 100 1.42e-04
## s58 100 1.32e-04
## s59 100 1.23e-04
## s60 100 1.15e-04
## s61 100 1.07e-04
## s62 100 1.00e-04
## s63 100 9.34e-05
## s64 100 8.71e-05
## s65 100 8.12e-05
## s66 100 7.58e-05
## s67 100 7.07e-05
## s68 100 6.59e-05
## s69 100 6.14e-05
## s70 100 5.73e-05
## s71 100 5.34e-05
## s72 100 4.98e-05
## s73 100 4.65e-05
## s74 100 4.34e-05
## s75 100 4.04e-05
## s76 100 3.77e-05
## s77 100 3.52e-05
## s78 100 3.28e-05
## s79 100 3.06e-05
## s80 100 2.85e-05
## s81 100 2.66e-05
## s82 100 2.48e-05
## s83 100 2.31e-05
## s84 100 2.16e-05
## s85 100 2.01e-05
## s86 100 1.88e-05
## s87 100 1.75e-05
## s88 100 1.63e-05
## s89 100 1.52e-05
## s90 100 1.42e-05
## s91 100 1.32e-05
## s92 100 1.23e-05
## s93 100 1.15e-05
## s94 100 1.07e-05
## s95 100 1.00e-05
## s96 100 9.34e-06
## s97 100 8.71e-06
## s98 100 8.12e-06
## s99 100 7.58e-06
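The Df column above moves in multiples of 5, which suggests that variables enter the model a whole group at a time. The sketch below checks this at one point on the path. It assumes the fitted object stores the coefficients in a component beta, with one column per value of lambda; the names beta.s9 and nonzero.per.group are introduced here for illustration.

```r
library(gglasso)
data(bardet)

group1 <- rep(1:20, each = 5)
m1 <- gglasso(x = bardet$x, y = bardet$y, group = group1, loss = "ls")

# Coefficients at the 10th lambda on the path (row s9 of the printout)
beta.s9 <- as.matrix(m1$beta)[, 10]

# Number of nonzero coefficients inside each of the 20 groups
nonzero.per.group <- tapply(beta.s9 != 0, group1, sum)
nonzero.per.group
```

If groups are selected as units, every entry should be either 0 or the group size.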
plot(m1)
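As with lars, a value of lambda can be chosen by cross-validation. This is a minimal sketch using cv.gglasso from the same package, assuming squared-error prediction loss (pred.loss = "L2") and 5 folds; the seed and the name cv.m1 are arbitrary choices made here.

```r
library(gglasso)
data(bardet)

group1 <- rep(1:20, each = 5)

set.seed(1)  # the folds are chosen at random
cv.m1 <- cv.gglasso(x = bardet$x, y = bardet$y, group = group1,
                    loss = "ls", pred.loss = "L2", nfolds = 5)

# Value of lambda with the smallest cross-validated error
cv.m1$lambda.min

# Cross-validation curve over the lambda path
plot(cv.m1)
```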