Nonnegative matrix factorization (
But what if a matrix has neither (pure) parts-of-whole structure nor (pure) winner-takes-it all structure? For that, we can use a mixture model. A simple (universal) convex combination of the two models would be \[ A \approx \alpha(B\boxtimes C^T) + (1 - \alpha)(BC^T) , \] where \(B\) and \(C\) are the factor matrices and \(0 \leq \alpha \leq 1\) is the mixture coefficient. If different parts of the matrix exhibit the different structure, however, it is better to have a different \(\alpha\) for each element of \(A\). To as this allows us to decompose an arbitrary big matrix exactly in rank-1 decomposition [1, Proposition 3.1], we require that the parameter matrix has a rank-1 structure. In particular, we assume that we have a vector \(\vec{\theta}\) for row parameters and a vector \(\vec{\phi}\) for column parameters. Our full model now becomes \[ a_{ij} \approx \sigma(\theta_i + \phi_j)(B\boxtimes C^T)_{ij} + (1 - \sigma(\theta_i + \phi_j)(BC^T)_{ij} \] for the \(i, j\) element of \(A\). Here, \(\sigma(x) = 1/(1+\exp\{-x\})\) is the sigmoid function.
The mixed model provides more power to the decomposition algorithm – indeed, in our experiments we obtain results better than
Consider, for example, a dataset about the climate in Europe (the climate data is available from WorldClim). This data has 48 columns (monthly minimum, maximum, and average temperatures in degrees Celsius and monthly precipitation in millimetres). The rows correspond to different locations in Europe. The below figures depict the parameter vectors for rows (left) and columns (right). Values below 0 indicate more ordinary structure, while values above 0 indicate more subtropical structure.
Our algorithm for finding such mixed decompositions – the one used also to find the results in Figure 1 – is called Latitude [1].
Source code
Latitude has been implemented with Matlab, and have been tested with Matlab 2015b on Linux. The package for Latitude contains the source code for the algorithm and the scripts to generate the synthetic data and to run the experiments. See the README file in the package for more information.
The software is given free of charge for academic use. If you use either package on your own work, you must cite the original publication [1].
References
-
Latitude: A Model for Mixed Linear–Tropical Matrix Factorization.
Proc. 2018 SIAM International Conference on Data Mining (SDM '18),
,
360–368.
10.1137/1.9781611975321.41
[pdf (SIAM) | manuscript | tech. rep. | source code] -
Latitude: A Model for Mixed Linear–Tropical Matrix Factorization.
arXiv:1801.06136 [cs.LG]
.
[pdf (arXiv) | source code]