Nonnegative matrix factorization can be used to automatically detect topics within a corpus in an unsupervised fashion. The technique amounts to an approximation of a nonnegative matrix as the product of two nonnegative matrices of lower rank. In this paper, we show this factorization can be combined with regression on a continuous response variable. In practice, the method performs better than regression done after topics are identified and retrains interpretability.
Lindstrom, Michael R., et al. "Continuous Semi-Supervised Nonnegative Matrix Factorization." arXiv preprint arXiv:2212.09858 (2022).
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.