Towards provably efficient quantum algorithms for large-scale machine-learning models

Junyu Liu; Minzhao Liu; Jin-Peng Liu; Ziyu Ye; Yunfei Wang; Yuri Alexeev; Jens Eisert; Liang Jiang

doi:10.1038/s41467-023-43957-x

Nat Commun. 2024 Jan 10;15(1):434. doi: 10.1038/s41467-023-43957-x.

Authors

Junyu Liu¹,²,³,⁴,⁵,⁶; Minzhao Liu⁷,⁸; Jin-Peng Liu⁹,¹⁰,¹¹; Ziyu Ye²; Yunfei Wang¹²; Yuri Alexeev²,³,⁸; Jens Eisert¹³; Liang Jiang¹,³

Affiliations

¹ Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL, 60637, USA.
² Department of Computer Science, The University of Chicago, Chicago, IL, 60637, USA.
³ Chicago Quantum Exchange, Chicago, IL, 60637, USA.
⁴ Kadanoff Center for Theoretical Physics, The University of Chicago, Chicago, IL, 60637, USA.
⁵ qBraid Co., Chicago, IL, 60615, USA.
⁶ SeQure, Chicago, IL, 60615, USA.
⁷ Department of Physics, The University of Chicago, Chicago, IL, 60637, USA.
⁸ Computational Science Division, Argonne National Laboratory, Lemont, IL, 60439, USA.
⁹ Simons Institute for the Theory of Computing, University of California, Berkeley, CA, 94720, USA.
¹⁰ Department of Mathematics, University of California, Berkeley, CA, 94720, USA.
¹¹ Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
¹² Martin A. Fisher School of Physics, Brandeis University, Waltham, MA, 02453, USA.
¹³ Dahlem Center for Complex Quantum Systems, Free University Berlin, Berlin, 14195, Germany. jense@zedat.fu-berlin.de.

Abstract

Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time consumed in both pre-training and fine-tuning. In this work, we show that fault-tolerant quantum computing could provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as [Formula: see text], where n is the size of the model and T is the number of training iterations, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Building on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models ranging from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows that fault-tolerant quantum algorithms could contribute to most state-of-the-art, large-scale machine-learning problems.
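
As an illustration only, the following purely classical toy sketch shows the kind of setting the abstract describes: gradient descent with a small learning rate, restricted to a sparse subset of parameters after pruning. It is not the quantum algorithm of the paper. The quadratic loss, the magnitude-based pruning rule, the reading of "dissipative" as a positive-definite Hessian (so that each step contracts the parameters), and all function and variable names are assumptions made for this example.

```python
import numpy as np

def prune_by_magnitude(theta, keep_fraction):
    """Keep only the largest-magnitude entries (hypothetical pruning rule)."""
    k = max(1, int(round(keep_fraction * theta.size)))
    keep_idx = np.argsort(np.abs(theta))[-k:]
    mask = np.zeros(theta.shape, dtype=bool)
    mask[keep_idx] = True
    return theta * mask, mask

def sparse_gradient_descent(hessian, theta0, mask, learning_rate, steps):
    """Gradient descent on L(theta) = 0.5 * theta^T H theta, updating only masked entries."""
    theta = theta0 * mask
    losses = [0.5 * theta @ hessian @ theta]
    for _ in range(steps):
        grad = hessian @ theta
        # Pruned parameters stay at zero: only the sparse support is trained.
        theta = theta - learning_rate * grad * mask
        losses.append(0.5 * theta @ hessian @ theta)
    return theta, np.array(losses)

rng = np.random.default_rng(0)
n = 50
A = rng.normal(size=(n, n))
# Symmetric positive-definite Hessian: the toy stand-in for a dissipative model.
hessian = A @ A.T / n + 0.1 * np.eye(n)

theta0 = rng.normal(size=n)
theta_pruned, mask = prune_by_magnitude(theta0, keep_fraction=0.2)

# Small learning rate, well below the stability limit 2 / lambda_max.
learning_rate = 0.5 / np.linalg.eigvalsh(hessian).max()
theta_final, losses = sparse_gradient_descent(
    hessian, theta_pruned, mask, learning_rate, steps=200
)
```

In this toy setting the loss decreases at every step and the pruned entries remain exactly zero, so only the sparse set of surviving parameters would ever need to be read out or re-loaded.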

Grants and funding

CRC 183/Deutsche Forschungsgemeinschaft (German Research Foundation)