Subtropical algebra, or max-times algebra, is a commutative semi-ring over the nonnegative real numbers with the addition defined as the maximum-operator. Matrix multiplication over the subtropical algebra is defined as usual: if \(A\in\mathbb{R}_+^{m\times k}\) and \(B\in\mathbb{R}_+^{k\times n}\) are two nonnegative matrices, then their subtropical product, \(A\boxtimes B\), is defined elementwise as \[ (A\boxtimes B)_{ij} = \max_{l=1}^k A_{il}B_{lk} . \] It is closely related to the tropical, or max-plus, algebra.

Finding subtropical factorizations allows us to model different interplay between the rank-1 matrices than, say, NMF. When NMF models parts-of-whole interplay, where each rank-1 matrix contributes some part of the total sum of each element in the product matrix, subtropical factorization models the winner-takes-it-all behaviour: the value of each element is the largest value any rank-1 component matrix has at the corresponding element. For example, \[ \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 6 & 1 \\ 0 & 4 & 2 \end{pmatrix}, \] but \[ \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 0 & 2 \end{pmatrix} \boxtimes \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 0 \\ 2 & \mathbf{4} & 1 \\ 0 & 4 & 2 \end{pmatrix}. \] Indeed, subtropical product yields a matrix with full rank, although its subtropical rank is clearly only 2.

Our framework for approximate subtropical matrix factorizations is called Equator [4]. It contains our earlier algorithms, Capricorn [1] and Cancer [2], for finding subtropical factorizations with fixed subtropical rank. Capricorn is designed for discrete data and noise, e.g. counting data where the noise causes the count to be replaced with another random count. Cancer, on the other hand, is designed for continuous data with Gaussian noise.

Source code

Both Capricorn and Cancer have been implemented with Matlab, and have been tested with Matlab 2015b on Linux. The package for Equator contains the source code for both Capricorn and Cancer, together with the extension and scripts for experiments presented in the Data Mining and Knowledge Discovery paper [4]. The specific packages for Capricorn and Cancer contain the algorithms and scripts to run the experiments from the respective publications. See the README files in the packages for more information.

The software is given free of charge for academic use. We recommend using the Equator package, for which you should cite [4]. If you need the code from [2] or [3] specifically, you should cite those papers.

References

  1. Sanjar Karaev Pauli Miettinen Capricorn: An Algorithm for Subtropical Matrix Factorization. Proc. 2016 SIAM International Conference on Data Mining (SDM '16), , 702710.
    10.1137/1.9781611974348.79
    [manuscript | pdf (SIAM) | source code]
  2. Sanjar Karaev Pauli Miettinen Cancer: Another Algorithm for Subtropical Matrix Factorization. Proc. 2016 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML PKDD '16), LNCS vol. 9852, , 576592.
    10.1007/978-3-319-46227-1_36
    [manuscript | pdf (Springer) | source code]
  3. Sanjar Karaev Pauli Miettinen Algorithms for Approximate Subtropical Matrix Factorization. arXiv:1707.08872 [cs.LG] .
    [pdf (arXiv) | source code]
  4. Sanjar Karaev Pauli Miettinen Algorithms for Approximate Subtropical Matrix Factorization. Data Mining and Knowledge Discovery 33(2), , 526576.
    10.1007/s10618-018-0599-1
    [pdf (Springer) | tech. rep. | preliminary version 1 | preliminary version 2 | source code | manuscript]