# Welcome to SALSA’s documentation!

**SALSA** (Software lab for Advanced machine Learning with Stochastic Algorithms) is a native Julia implementation of stochastic algorithms for:

- linear and non-linear **Support Vector Machines**
- **sparse linear modelling**

**SALSA** is an open-source project available on GitHub under the GPLv3 license.

# Installation

The **SALSA** package can be installed from the `Julia` command line with `Pkg.add("SALSA")`, or by running the same command directly with the `Julia` executable: `julia -e 'Pkg.add("SALSA")'`.
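After installation the package is loaded with `using SALSA`. The following is a minimal quickstart sketch, assuming the high-level `salsa(X, Y, Xtest)` entry point for training a model and predicting on test data; the exact signature and defaults may differ between package versions, so consult the API reference.

```julia
using SALSA

# Toy binary classification problem: two Gaussian blobs,
# labelled -1 and +1 (rows are observations, columns are features).
X = [randn(50, 2) .- 1; randn(50, 2) .+ 1]
Y = [-ones(50); ones(50)]
Xtest = [randn(5, 2) .- 1; randn(5, 2) .+ 1]

# Train a linear SVM stochastically and predict on the test set
# (hypothetical default invocation for illustration).
model = salsa(X, Y, Xtest)
```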

# Mathematical background

The **SALSA** package aims at stochastically learning a classifier or regressor via the Regularized Empirical Risk Minimization [Vapnik1992] framework. We approach a family of well-known Machine Learning problems of the type:

$$\min_{w} \; \frac{1}{n}\sum_{i=1}^{n} \ell(w; \xi_i) + \Omega(w),$$

where $\xi_i = (x_i, y_i)$ is given as a pair of input-output variables and belongs to a set of independent observations $S_n = \{\xi_i\}_{i=1}^{n}$, the loss function $\ell$ measures the disagreement between the true target $y_i$ and the model prediction, while the regularization term $\Omega(w)$ penalizes the complexity of the model $w$. We draw $\xi_i$ uniformly from $S_n$ at most $k$ times due to the i.i.d. assumption and a fixed computational budget. Online passes and optimization with the full dataset are available too. The package includes stochastic algorithms for linear and non-linear Support Vector Machines [Boser1992] and sparse linear modelling [Hastie2015].

Particular choices of the loss function $\ell$ are (but are not restricted to the selection below):

- hinge loss: $\ell(w; \xi_i) = \max(0,\, 1 - y_i \langle w, x_i \rangle)$
- logistic loss: $\ell(w; \xi_i) = \log(1 + \exp(-y_i \langle w, x_i \rangle))$
- least-squares loss: $\ell(w; \xi_i) = (\langle w, x_i \rangle - y_i)^2$

Particular choices of the regularization term $\Omega$ are:

- $\ell_2$-regularization, *i.e.* $\Omega(w) = \frac{\lambda}{2}\|w\|_2^2$
- $\ell_1$-regularization, *i.e.* $\Omega(w) = \lambda\|w\|_1$
- reweighted $\ell_1$-regularization
- reweighted $\ell_2$-regularization
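To make the stochastic learning scheme concrete, below is a minimal Pegasos-style sketch combining the hinge loss with $\ell_2$-regularization: at every iteration one observation is drawn uniformly at random and $w$ is updated with a subgradient step. This is an illustration of the technique, not the package's internal implementation; the function name `pegasos` and the parameters `λ` (regularization strength) and `k` (iteration budget) are chosen here for exposition.

```julia
using LinearAlgebra

# Stochastic subgradient descent on the l2-regularized hinge loss
# (illustrative sketch only, not SALSA's internals).
function pegasos(X::Matrix{Float64}, Y::Vector{Float64}; λ::Float64=0.1, k::Int=1000)
    n, d = size(X)
    w = zeros(d)
    for t in 1:k
        i = rand(1:n)                  # draw ξ_i = (x_i, y_i) uniformly from S_n
        η = 1 / (λ * t)                # decaying step size
        if Y[i] * dot(X[i, :], w) < 1  # hinge loss is active at ξ_i
            w = (1 - η * λ) * w + η * Y[i] * X[i, :]
        else                           # only the regularizer contributes
            w = (1 - η * λ) * w
        end
    end
    return w
end
```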

# References

**SALSA** stems from the following algorithmic approaches:

- Pegasos: S. Shalev-Shwartz, Y. Singer, N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, in: Proceedings of the 24th international conference on Machine learning, ICML ’07, New York, NY, USA, 2007, pp. 807–814.
- RDA: L. Xiao, Dual averaging methods for regularized stochastic learning and online optimization, J. Mach. Learn. Res. 11 (2010), pp. 2543–2596.
- Adaptive RDA: J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011), pp. 2121–2159.
- Reweighted RDA: V. Jumutc, J.A.K. Suykens, Reweighted stochastic learning, Neurocomputing Special Issue - ISNN2014, 2015. (In Press)

# Dependencies

- MLBase: to support generic Machine Learning routines
- StatsBase: to support generic routines from Statistics
- Distances: to support distance metrics between vectors
- Distributions: to support sampling from various distributions
- DataFrames: to support processing files instead of in-memory matrices
- Clustering: to support Stochastic K-means Clustering (experimental feature)
- ProgressMeter: to support progress bars and ETA of different routines

# Bibliography

[Vapnik1992] Vapnik, Vladimir. “Principles of risk minimization for learning theory”, in Advances in Neural Information Processing Systems (NIPS), pp. 831–838, 1992.

[Boser1992] Boser, B., Guyon, I., Vapnik, V. “A training algorithm for optimal margin classifiers”, in Proceedings of the Fifth Annual Workshop on Computational Learning Theory (COLT ’92), pp. 144–152, 1992.

[Hastie2015] Hastie, T., Tibshirani, R., Wainwright, M. Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, 2015.