pycasso package

pycasso

PICASSO: Penalized Generalized Linear Model Solver - Unleash the Power of Non-convex Penalty

Author: Jason Ge, Haoming Jiang
Maintainer: Haoming Jiang <jianghm@gatech.edu>
pycasso.test()

Show welcome information.

pycasso.core

Main Interface of the package

class pycasso.core.Solver(x, y, lambdas=(100, 0.05), family='gaussian', penalty='l1', gamma=3, useintercept=False, prec=0.0001, max_ite=1000, verbose=False)

Bases: object

The PICASSO Solver For GLM.

Parameters:
  • x – An n*d design matrix where n is the sample size and d is the data dimension.
  • y – The n-dimensional response vector. y is a numeric vector for gaussian and sqrtlasso, a two-level factor for binomial, or a non-negative integer vector representing counts for poisson.
  • lambdas – The parameters controlling regularization. Can be one of the following two cases:
    Case1 (default): A tuple of 2 variables (n, lambda_min_ratio), where n here is the number of lambdas and the default values are (100, 0.05). The program calculates lambdas as an array of n elements running from lambda_max down to lambda_min_ratio * lambda_max on a log scale. lambda_max is the smallest regularization parameter that yields an all-zero estimate. Caution: logistic and poisson regression can be ill-conditioned if lambda is too small under a nonconvex penalty. We suggest avoiding any lambda_min_ratio smaller than 0.05 for logistic/poisson regression under a nonconvex penalty.
    Case2: A manually specified sequence (size > 2) of decreasing positive values to control the regularization.
  • family – Options for model. Sparse linear regression and sparse multivariate regression are applied if family = “gaussian”, sqrt lasso is applied if family = “sqrtlasso”, sparse logistic regression is applied if family = “binomial”, and sparse poisson regression is applied if family = “poisson”. The default value is “gaussian”.
  • penalty – Options for regularization. Lasso is applied if penalty = “l1”, MCP is applied if penalty = “mcp”, and SCAD is applied if penalty = “scad”. The default value is “l1”.
  • gamma – The concavity parameter for MCP and SCAD. The default value is 3.
  • useintercept – Whether or not to include intercept term. Default value is False.
  • prec – Stopping precision. The default value is 1e-4 (0.0001 in the signature above).
  • max_ite – The iteration limit. The default value is 1000.
  • verbose – Tracing information is disabled if verbose = False. The default value is False.
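
The lambdas (Case1), penalty and gamma parameters can be illustrated with a small standalone sketch. It does not call pycasso. The function names are hypothetical, and the MCP and SCAD formulas are the standard textbook definitions, which are assumed (not confirmed by this page) to match what the solver implements.

```python
import math


def lambda_path(lambda_max, n=100, lambda_min_ratio=0.05):
    """Return n values from lambda_max down to lambda_min_ratio * lambda_max,
    evenly spaced on a log scale (Case1 of the lambdas parameter)."""
    if n < 2:
        return [lambda_max]
    log_hi = math.log(lambda_max)
    log_lo = math.log(lambda_min_ratio * lambda_max)
    step = (log_lo - log_hi) / (n - 1)
    return [math.exp(log_hi + i * step) for i in range(n)]


def l1_penalty(t, lam):
    """Lasso penalty: grows linearly in |t|."""
    return lam * abs(t)


def mcp_penalty(t, lam, gamma=3.0):
    """Textbook MCP (gamma > 1): like l1 near zero, flat beyond gamma * lam."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def scad_penalty(t, lam, gamma=3.0):
    """Textbook SCAD (gamma > 2): l1 up to lam, quadratic blend, then flat."""
    a = abs(t)
    if a <= lam:
        return lam * a
    if a <= gamma * lam:
        return (2.0 * gamma * lam * a - a * a - lam * lam) / (2.0 * (gamma - 1.0))
    return 0.5 * lam * lam * (gamma + 1.0)
```

Unlike the l1 penalty, both nonconvex penalties stop growing once |t| exceeds gamma * lambda, so large coefficients are not shrunk further. A larger gamma moves both penalties closer to the l1 penalty.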
coef()

Extract model coefficients.

Returns: a dictionary of the model coefficients.
Return type: dict{name : value}

The details of the returned dictionary:

  • beta - A matrix of regression estimates whose columns correspond to regularization parameters for sparse linear regression and sparse logistic regression. For the sparse column inverse operator, a list of matrices of regression estimates corresponding to the regularization parameters.
  • intercept - The intercept values corresponding to regularization parameters for sparse linear regression and sparse logistic regression.
  • ite_lamb - Number of iterations for each lambda.
  • size_act - An array of solution sparsity (model degree of freedom).
  • train_time - The training time on each lambda.
  • total_train_time - The total training time.
  • state - The training state.
  • df - The number of nonzero coefficients
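
As a rough illustration of how beta relates to df, the sketch below counts nonzero coefficients per lambda. It uses a hand-written toy dictionary, not real solver output. It assumes the layout described in the list (one column of beta per lambda), so check the actual shape of beta in your installed version before relying on it. The helper name is hypothetical.

```python
def nonzeros_per_lambda(beta, tol=0.0):
    """Count coefficients with |value| > tol in each column of beta.

    beta is a list of rows (one per feature); each column is one lambda.
    """
    n_lambda = len(beta[0]) if beta else 0
    return [sum(1 for row in beta if abs(row[j]) > tol) for j in range(n_lambda)]


# Toy stand-in for the dictionary returned by coef(); values are made up.
toy_result = {
    "beta": [
        [0.0, 0.5, 0.8],   # feature 0 along the lambda path
        [0.0, 0.0, -0.3],  # feature 1
        [0.0, 0.0, 0.0],   # feature 2 never enters the model
    ],
    "intercept": [0.0, 0.0, 0.0],
}

df = nonzeros_per_lambda(toy_result["beta"])
print(df)  # [0, 1, 2]
```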
plot()

Visualize the solution path of regression estimate corresponding to regularization parameters.

predict(newdata=None, lambdidx=None)

Predict responses for new data.

Parameters:
  • newdata – An optional data frame in which to look for variables with which to predict. If omitted, the training data of the model are used.
  • lambdidx – Index of the lambda whose model coefficients are used for prediction.
Returns: The predicted response vectors based on the estimated models.
Return type: np.array
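
For the gaussian family, a prediction at a given lambda index presumably amounts to the linear predictor built from that lambda's coefficients and intercept. The sketch below shows that computation in plain Python. It is not pycasso's implementation: the helper name and toy values are hypothetical, beta is assumed to hold one column per lambda, and lambdidx is assumed to be zero-based. Whether a link function is applied for other families is not stated on this page.

```python
def linear_predict(newdata, beta, intercept, lambdidx):
    """Return intercept[lambdidx] + newdata @ beta[:, lambdidx] for each row."""
    coefs = [row[lambdidx] for row in beta]  # column of beta for this lambda
    return [
        intercept[lambdidx] + sum(x_k * b_k for x_k, b_k in zip(sample, coefs))
        for sample in newdata
    ]


# Toy coefficients: 3 features (rows) by 3 lambdas (columns); values are made up.
toy_beta = [
    [0.0, 0.5, 0.8],
    [0.0, 0.0, -0.3],
    [0.0, 0.0, 0.0],
]
toy_intercept = [0.0, 0.0, 0.0]
toy_newdata = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]

predictions = linear_predict(toy_newdata, toy_beta, toy_intercept, lambdidx=2)
```

With lambdidx=0 every toy coefficient is zero, so all predictions equal the intercept. That matches the description of lambda_max as the value yielding an all-zero estimate.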

train()

The trigger function for training the model.
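
Putting the methods together, a typical call sequence might look like the sketch below. It uses only the constructor arguments and methods documented on this page. The wrapper function is hypothetical. It assumes x and y are NumPy arrays, that train() must be called before coef() and predict(), and that lambdidx is zero-based.

```python
def fit_and_predict(x, y, newdata=None, lambdidx=0):
    """Train a lasso path and predict with the coefficients at one lambda.

    x: n*d design matrix, y: length-n response (assumed to be NumPy arrays).
    """
    # Imported lazily so this sketch can be read without the package installed.
    import pycasso

    solver = pycasso.Solver(
        x,
        y,
        lambdas=(100, 0.05),  # 100 lambdas, lambda_min_ratio = 0.05
        family="gaussian",
        penalty="l1",
        useintercept=False,
    )
    solver.train()            # fit the whole regularization path
    result = solver.coef()    # dictionary with beta, intercept, df, ...
    prediction = solver.predict(newdata=newdata, lambdidx=lambdidx)
    return result, prediction
```

Leaving newdata as None makes predict() fall back to the training data, as described for the newdata parameter.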

pycasso.libpath

Find the path to picasso dynamic library files.

exception pycasso.libpath.PicassoLibraryNotFound

Bases: Exception

Error raised when the picasso library is not found.

pycasso.libpath.find_lib_path()

Find the path to picasso dynamic library files.

Returns: List of all library paths found for picasso.
Return type: list(string)