Title: | SparseStep Regression |
---|---|
Description: | Implements the SparseStep model for solving regression problems with a sparsity constraint on the parameters. The SparseStep regression model was proposed in Van den Burg, Groenen, and Alfons (2017) <arXiv:1701.06967>. In the model, a regularization term is added to the regression problem which approximates the counting norm of the parameters. By iteratively improving the approximation a sparse solution to the regression problem can be obtained. In this package both the standard SparseStep algorithm is implemented as well as a path algorithm which uses golden section search to determine solutions with different values for the regularization parameter. |
Authors: | Gertjan van den Burg [aut, cre], Patrick Groenen [ctb], Andreas Alfons [ctb] |
Maintainer: | Gertjan van den Burg |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2024-11-11 02:57:57 UTC |
Source: | https://github.com/gjjvdburg/sparsestep |
In the SparseStep regression model the ordinary least-squares problem is
augmented with an approximation of the exact ℓ0 pseudonorm (the counting
norm of the parameters). The SparseStep algorithm makes this approximation
increasingly accurate, which results in a sparse solution to the regression
problem. See the references for more information.
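As a rough illustration of how such an approximation can tighten, the sketch below assumes the smooth surrogate beta_j^2 / (beta_j^2 + gamma^2) described in the cited paper. The helper name approx_count is hypothetical and not part of the package. As gamma shrinks, the surrogate approaches the exact count of nonzero coefficients.

```r
# Assumed surrogate for the counting norm: sum_j beta_j^2 / (beta_j^2 + gamma^2).
# approx_count is a hypothetical helper, not a sparsestep function.
approx_count <- function(beta, gamma) {
  sum(beta^2 / (beta^2 + gamma^2))
}

beta <- c(0, 0.5, -2, 0, 3)
exact <- sum(beta != 0)   # counting norm: number of nonzero entries

# Smaller gamma gives a closer approximation of the exact count.
for (gamma in c(10, 1, 0.1, 1e-4)) {
  cat(sprintf("gamma = %g: approx = %.6f (exact = %d)\n",
              gamma, approx_count(beta, gamma), exact))
}
```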
The main SparseStep functions are:
sparsestep
Fit a SparseStep model for a given range of λ values.
path.sparsestep
Fit the SparseStep model along a path of λ values which are generated such that a model is created at each possible level of sparsity, or until a given recursion depth is reached.
Other available functions are:
plot
Plot the coefficient path of the SparseStep model.
predict
Predict the outcome of the linear model using SparseStep
coef
Get the coefficients from the SparseStep model
print
Print a short description of the SparseStep model
Gerrit J.J. van den Burg, Patrick J.F. Groenen, Andreas Alfons
Maintainer: Gerrit J.J. van den Burg
Van den Burg, G.J.J., Groenen, P.J.F. and Alfons, A. (2017). SparseStep: Approximating the Counting Norm for Sparse Regularization, arXiv preprint arXiv:1701.06967 [stat.ME]. URL https://arxiv.org/abs/1701.06967.
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)
plot(fit)
fits <- path.sparsestep(x, y)
plot(fits)
x2 <- matrix(rnorm(50*20), 50, 20)
y2 <- predict(fits, x2)
Returns the coefficients of the SparseStep model.
## S3 method for class 'sparsestep'
coef(object, ...)
object | a sparsestep object |
... | further arguments are ignored |
The coefficients of the SparseStep model (i.e. the betas) as a dgCMatrix. If the model was fitted with an intercept this will be the first row in the resulting matrix.
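Because the intercept, when fitted, occupies the first row of the returned matrix, it can be convenient to split it off before counting nonzero coefficients. The sketch below uses a small hand-made dgCMatrix as a stand-in for the output of coef(); its values are purely illustrative and were not produced by the package.

```r
library(Matrix)

# Illustrative stand-in for coef(fit): 1 intercept row + 4 variables,
# 3 values of lambda (one column each). Values are made up.
B <- Matrix(c(0.2, 1.5, 0, -0.7, 0,
              0.3, 1.1, 0,  0,   0,
              0.4, 0,   0,  0,   0),
            nrow = 5, sparse = TRUE)

intercepts <- B[1, ]                 # first row: intercept per lambda
slopes <- B[-1, , drop = FALSE]      # remaining rows: variable coefficients

# Number of nonzero variable coefficients for each lambda.
n_nonzero <- colSums(slopes != 0)
print(n_nonzero)
```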
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)
coef(fit)
Fits the entire regularization path for SparseStep using a
golden section search. Note that this algorithm is approximate: there is no
guarantee that the solutions _between_ the computed values of lambda do not
differ from those calculated. For instance, if solutions are calculated at
λ_i and λ_{i+1}, this algorithm ensures that the solution at λ_{i+1} has one
more zero than the solution at λ_i (provided the recursion depth is large
enough). There is, however, no guarantee that there are no different
solutions between λ_i and λ_{i+1}. This is an ongoing research topic.
Note that this path algorithm is not faster than running the sparsestep
function with the same λ sequence.
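For readers unfamiliar with golden section search, the sketch below shows the generic technique for minimizing a unimodal function on an interval. It is not the package's path routine, which applies the idea to the regularization parameter in a way that is not detailed here. The function golden_section_min and the quadratic test function are hypothetical.

```r
# Generic golden section search for the minimum of a unimodal function f
# on [lower, upper]. Hypothetical helper, not part of sparsestep.
golden_section_min <- function(f, lower, upper, tol = 1e-6) {
  ratio <- (sqrt(5) - 1) / 2          # inverse golden ratio, about 0.618
  a <- lower
  b <- upper
  x1 <- b - ratio * (b - a)           # lower interior point
  x2 <- a + ratio * (b - a)           # upper interior point
  f1 <- f(x1)
  f2 <- f(x2)
  while (b - a > tol) {
    if (f1 < f2) {
      # Minimum lies in [a, x2]: the old x1 becomes the new upper interior point.
      b <- x2
      x2 <- x1
      f2 <- f1
      x1 <- b - ratio * (b - a)
      f1 <- f(x1)
    } else {
      # Minimum lies in [x1, b]: the old x2 becomes the new lower interior point.
      a <- x1
      x1 <- x2
      f1 <- f2
      x2 <- a + ratio * (b - a)
      f2 <- f(x2)
    }
  }
  (a + b) / 2
}

xmin <- golden_section_min(function(x) (x - 2)^2, lower = 0, upper = 5)
print(xmin)
```

Each iteration reuses one of the two interior function values, so only one new evaluation of f is needed per step.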
path.sparsestep(
  x,
  y,
  max.depth = 10,
  gamma0 = 1000,
  gammastop = 1e-04,
  IMsteps = 2,
  gammastep = 2,
  normalize = TRUE,
  intercept = TRUE,
  force.zero = TRUE,
  threshold = 1e-07,
  XX = NULL,
  Xy = NULL,
  use.XX = TRUE,
  use.Xy = TRUE,
  quiet = FALSE
)
x | matrix of predictors |
y | response |
max.depth | maximum recursion depth |
gamma0 | starting value of the gamma parameter |
gammastop | stopping value of the gamma parameter |
IMsteps | number of steps of the majorization algorithm to perform for each value of gamma |
gammastep | factor to decrease gamma with at each step |
normalize | if TRUE, each variable is standardized to have unit L2 norm, otherwise it is left alone. |
intercept | if TRUE, an intercept is included in the model (and not penalized), otherwise no intercept is included |
force.zero | if TRUE, absolute coefficients smaller than the provided threshold value are set to absolute zero as a post-processing step, otherwise no thresholding is performed |
threshold | threshold value to use for setting coefficients to absolute zero |
XX | The X'X matrix; useful for repeated runs where X'X stays the same |
Xy | The X'y matrix; useful for repeated runs where X'y stays the same |
use.XX | whether or not to compute X'X and return it |
use.Xy | whether or not to compute X'y and return it |
quiet | don't print search info while running |
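The gamma0, gammastop and gammastep arguments together define a decreasing sequence of gamma values. The sketch below assumes that gamma is divided by gammastep for as long as it remains larger than gammastop; the exact stopping rule inside the package may differ, and gamma_schedule is a hypothetical helper.

```r
# Assumed schedule: start at gamma0, divide by gammastep while gamma > gammastop.
# gamma_schedule is a hypothetical helper, not a sparsestep function.
gamma_schedule <- function(gamma0 = 1000, gammastop = 1e-4, gammastep = 2) {
  gammas <- numeric(0)
  gamma <- gamma0
  while (gamma > gammastop) {
    gammas <- c(gammas, gamma)
    gamma <- gamma / gammastep
  }
  gammas
}

g <- gamma_schedule()
length(g)   # number of gamma values visited with the default arguments
range(g)    # smallest and largest gamma in the schedule
```

Under this assumption the defaults give 24 gamma values, so with IMsteps = 2 each value of lambda would involve 48 majorization steps.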
A "sparsestep" S3 object is returned, for which print, predict, coef, and plot methods exist. It has the following items:
call | The call that was used to construct the model. |
lambda | The value(s) of lambda used to construct the model. |
gamma0 | The gamma0 value of the model. |
gammastop | The gammastop value of the model. |
IMsteps | The IMsteps value of the model. |
gammastep | The gammastep value of the model. |
intercept | Boolean indicating if an intercept was fitted in the model. |
force.zero | Boolean indicating if a force zero-setting was performed. |
threshold | The threshold used for a forced zero-setting. |
beta | The resulting coefficients stored in a sparse matrix format (dgCMatrix). This matrix has dimensions nvar x nlambda. |
a0 | The intercept vector, of length nlambda (one intercept for each value of lambda). |
normx | Vector used to normalize the columns of x. |
meanx | Vector of column means of x. |
XX | The matrix X'X if use.XX was set to TRUE. |
Xy | The matrix X'y if use.Xy was set to TRUE. |
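The returned items can be combined to inspect how sparsity changes along the path. The sketch below relies only on the lambda and beta items documented above and assumes the sparsestep and Matrix packages are installed; the variable names are illustrative.

```r
library(sparsestep)

set.seed(123)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

# quiet = TRUE suppresses the search output while the path is computed.
pth <- path.sparsestep(x, y, quiet = TRUE)

# beta is nvar x nlambda, so column sums of the nonzero pattern give the
# number of selected variables for each value of lambda.
nnz <- Matrix::colSums(pth$beta != 0)
print(data.frame(lambda = pth$lambda, nonzero = nnz))
```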
coef, print, predict, plot, and sparsestep.
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
pth <- path.sparsestep(x, y)
Plot the coefficients of the SparseStep path
## S3 method for class 'sparsestep'
plot(x, ...)
x | a sparsestep object |
... | further arguments to matplot |
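Since extra arguments are forwarded to matplot, it may help to see what a coefficient path drawn with matplot looks like. The sketch below is one plausible way to draw such a plot by hand; it is not the method's actual implementation, and the coefficient values are made up for illustration.

```r
# Illustrative values only: rows are variables, columns are lambda values.
lambda <- c(0.1, 0.5, 1, 5, 10)
beta <- rbind(
  c( 2.0,  1.8, 1.5, 0.9, 0),
  c(-1.0, -0.8, 0,   0,   0),
  c( 0.5,  0,   0,   0,   0)
)

# matplot draws one line per column, so transpose to get one line per variable.
matplot(lambda, t(beta), type = "l", lty = 1, log = "x",
        xlab = "lambda", ylab = "coefficient")
```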
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)
plot(fit)
pth <- path.sparsestep(x, y)
plot(pth)
Predicts the outcome variable for the SparseStep model for each value of lambda supplied to the model.
## S3 method for class 'sparsestep'
predict(object, newx, ...)
object | Fitted sparsestep object |
newx | Matrix of new values for x |
... | further arguments are ignored |
A matrix of numerical predictions of size nobs x nlambda.
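The shape of this result follows from ordinary linear prediction: each column of the coefficient matrix yields one column of predictions. The sketch below reproduces that computation by hand with a small made-up coefficient matrix whose first row plays the role of the intercept; it is a conceptual illustration rather than the package's code.

```r
# Illustrative coefficient matrix: (1 + nvar) x nlambda, first row = intercept.
B <- cbind(c(1, 2, 0, -1),
           c(0.5, 1.5, 0, 0))

# New observations: nobs x nvar.
newx <- matrix(c(1, 0, 2,
                 0, 1, 1), nrow = 2, byrow = TRUE)

# Prepend a column of ones so the intercept row is picked up.
yhat <- cbind(1, newx) %*% B
dim(yhat)   # nobs x nlambda
print(yhat)
```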
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)
yhat <- predict(fit, x)
Prints a short description of a fitted SparseStep model.
## S3 method for class 'sparsestep'
print(x, ...)
x | a sparsestep object |
... | further arguments are ignored |
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)
print(fit)
Fits the SparseStep model for chosen values of the regularization parameter.
sparsestep(
  x,
  y,
  lambda = c(0.1, 0.5, 1, 5, 10),
  gamma0 = 1000,
  gammastop = 1e-04,
  IMsteps = 2,
  gammastep = 2,
  normalize = TRUE,
  intercept = TRUE,
  force.zero = TRUE,
  threshold = 1e-07,
  XX = NULL,
  Xy = NULL,
  use.XX = TRUE,
  use.Xy = TRUE
)
x | matrix of predictors |
y | response |
lambda | regularization parameter |
gamma0 | starting value of the gamma parameter |
gammastop | stopping value of the gamma parameter |
IMsteps | number of steps of the majorization algorithm to perform for each value of gamma |
gammastep | factor to decrease gamma with at each step |
normalize | if TRUE, each variable is standardized to have unit L2 norm, otherwise it is left alone. |
intercept | if TRUE, an intercept is included in the model (and not penalized), otherwise no intercept is included |
force.zero | if TRUE, absolute coefficients smaller than the provided threshold value are set to absolute zero as a post-processing step, otherwise no thresholding is performed |
threshold | threshold value to use for setting coefficients to absolute zero |
XX | The X'X matrix; useful for repeated runs where X'X stays the same |
Xy | The X'y matrix; useful for repeated runs where X'y stays the same |
use.XX | whether or not to compute X'X and return it (boolean) |
use.Xy | whether or not to compute X'y and return it (boolean) |
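To make the roles of gamma0, gammastop, gammastep, IMsteps and threshold more concrete, the sketch below implements a simplified majorization loop in base R. It assumes the surrogate penalty beta_j^2 / (beta_j^2 + gamma^2) and the resulting weighted ridge update from the cited paper, omits the intercept and normalization handling, and is not the package's own code. The name sparsestep_sketch is hypothetical.

```r
# Simplified sketch of the iterative majorization loop (assumed from the
# cited paper): each step solves (X'X + lambda * Omega) beta = X'y, where
# Omega is diagonal with entries gamma^2 / (beta_j^2 + gamma^2)^2.
sparsestep_sketch <- function(x, y, lambda, gamma0 = 1000, gammastop = 1e-4,
                              IMsteps = 2, gammastep = 2, threshold = 1e-7) {
  XX <- crossprod(x)        # X'X, computed once
  Xy <- crossprod(x, y)     # X'y, computed once
  p <- ncol(x)
  beta <- rep(0, p)
  gamma <- gamma0
  while (gamma > gammastop) {
    for (i in seq_len(IMsteps)) {
      omega <- gamma^2 / (beta^2 + gamma^2)^2
      beta <- as.vector(solve(XX + lambda * diag(omega, p), Xy))
    }
    gamma <- gamma / gammastep   # tighten the approximation
  }
  beta[abs(beta) < threshold] <- 0   # force.zero-style post-processing
  beta
}

# Synthetic data with two truly nonzero coefficients.
set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)
y <- drop(x %*% c(3, 0, 0, -2, 0)) + 0.1 * rnorm(100)
b <- sparsestep_sketch(x, y, lambda = 1)
print(b)
```

Because X'X and X'y do not depend on lambda or gamma, they are computed once and reused in every step, which is the motivation for the XX and Xy arguments.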
A "sparsestep" S3 object is returned, for which print, predict, coef, and plot methods exist. It has the following items:
call | The call that was used to construct the model. |
lambda | The value(s) of lambda used to construct the model. |
gamma0 | The gamma0 value of the model. |
gammastop | The gammastop value of the model. |
IMsteps | The IMsteps value of the model. |
gammastep | The gammastep value of the model. |
intercept | Boolean indicating if an intercept was fitted in the model. |
force.zero | Boolean indicating if a force zero-setting was performed. |
threshold | The threshold used for a forced zero-setting. |
beta | The resulting coefficients stored in a sparse matrix format (dgCMatrix). This matrix has dimensions nvar x nlambda. |
a0 | The intercept vector, of length nlambda (one intercept for each value of lambda). |
normx | Vector used to normalize the columns of x. |
meanx | Vector of column means of x. |
XX | The matrix X'X if use.XX was set to TRUE. |
Xy | The matrix X'y if use.Xy was set to TRUE. |
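The XX and Xy items of a fitted object can be passed back through the XX and Xy arguments so that a second run on the same data does not recompute them. The sketch below shows this pattern using only the documented arguments and return items; it assumes the package is installed and that both runs use the same x, y, normalize and intercept settings.

```r
library(sparsestep)

set.seed(42)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

# First run computes and stores X'X and X'y (use.XX and use.Xy default to TRUE).
fit1 <- sparsestep(x, y, lambda = c(0.1, 1))

# Second run on the same data reuses the stored matrices for new lambda values.
fit2 <- sparsestep(x, y, lambda = c(5, 10), XX = fit1$XX, Xy = fit1$Xy)

dim(coef(fit2))   # one column per lambda value
```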
coef, print, predict, plot, and path.sparsestep.
x <- matrix(rnorm(100*20), 100, 20)
y <- rnorm(100)
fit <- sparsestep(x, y)