Quantum Circuit Distillation and Compression

Shunsuke Daimon daimon.shunsuke@qst.go.jp Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan. Department of Applied Physics and Physico-Informatics, Keio University, Yokohama 223-8522, Japan. Institute for AI and Beyond, The University of Tokyo, Tokyo 113-8656, Japan. Quantum Materials and Applications Research Center, National Institutes for Quantum Science and Technology, Tokyo 152-8550, Japan.    Kakeru Tsunekawa Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan.    Ryoto Takeuchi Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan.    Takahiro Sagawa Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan.    Naoki Yamamoto Department of Applied Physics and Physico-Informatics, Keio University, Yokohama 223-8522, Japan.    Eiji Saitoh Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan. Department of Applied Physics and Physico-Informatics, Keio University, Yokohama 223-8522, Japan. Institute for AI and Beyond, The University of Tokyo, Tokyo 113-8656, Japan. Institute for Materials Research, Tohoku University, Sendai 980-8577, Japan. WPI Advanced Institute for Materials Research, Tohoku University, Sendai 980-8577, Japan.
Abstract

Quantum coherence in a qubit is vulnerable to environmental noise. When a long quantum calculation is run on a quantum processor without error correction, the noise often causes fatal errors and corrupts the calculation. Here, we propose quantum-circuit distillation to generate quantum circuits that are short but retain enough functionality to produce an output almost identical to that of the original circuits. The distilled circuits are less sensitive to the noise and can complete the calculation before quantum coherence in the qubits is lost. We created a quantum-circuit distillator by building a reinforcement learning model, and applied it to the inverse quantum Fourier transform (IQFT) and Shor’s quantum prime factorization. The obtained distilled circuit allows correct calculation on IBM-Quantum processors. By working with the quantum-circuit distillator, we also found a general rule to generate quantum circuits approximating the general $n$-qubit IQFTs. The quantum-circuit distillator offers a new approach to improving the performance of noisy quantum processors.

I Introduction

Quantum calculation has been performed on gate-type quantum processors by designing a sequence of quantum logic gates called a quantum circuit [Ladd et al. (2010); Alexeev et al. (2021)]. Numerous algorithms based on quantum circuits have been proposed theoretically to realize quantum computation [Nielsen and Chuang (2010); Montanaro (2016)]. However, existing quantum processors are too noisy and not yet ready to perform these quantum algorithms properly [Preskill (2018); Córcoles et al. (2019)]. The noise causes errors in running quantum circuits and prevents the circuits from working properly [Chow et al. (2012); Willsch et al. (2017)], and the quantum processors cannot perform even simple quantum algorithms.

Here, we propose a quantum-circuit distillator to generate a quantum circuit that is short but retains enough functionality to yield almost the same output distribution as that of a long original quantum circuit. By demonstrating the distillation of the IQFT and Shor’s integer factorization circuits, we show that the quantum-circuit distillator effectively reduces errors and enables existing quantum processors to work properly. By extrapolating the outputs from the distillator, we also found approximate general $n$-qubit IQFTs for any qubit number $n$.

II Results and Discussion

First, we show a result of the conventional four-qubit IQFT obtained with an existing superconducting quantum computer, IBM Quantum (IBMQ) [Gambetta et al. (2017); Wendin (2017); IBM]. Figure 1(a) shows the conventional quantum circuit for the four-qubit IQFT [Nielsen and Chuang (2010)]. By applying the circuit to an initial quantum state and measuring the final state, the IBMQ outputs a four-digit bitstring. By repeating the measurement, we obtained a probability distribution of the bitstrings. Figure 1(e) shows an output distribution profile for the four-qubit IQFT obtained from IBMQ. On the other hand, Fig. 1(d) shows the exact analytical result of the four-qubit IQFT calculation for the same input. The distribution observed experimentally from IBMQ [Fig. 1(e)] is far different from the expected distribution [Fig. 1(d)], showing that the conventional quantum circuit is too long to work properly on IBMQ because of the noise.
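The exact reference distribution discussed above can be reproduced numerically. The following is a minimal sketch, assuming the textbook definition of the IQFT as the inverse discrete Fourier transform acting on the $2^{n}$ amplitudes; the function names `iqft_matrix` and `exact_distribution` are illustrative and do not come from the paper, and the bit-ordering convention of a particular processor may differ.

```python
import numpy as np

def iqft_matrix(n_qubits):
    """Unitary of the n-qubit inverse QFT: entries exp(-2*pi*i*j*k/N)/sqrt(N)."""
    dim = 2 ** n_qubits
    idx = np.arange(dim)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / dim) / np.sqrt(dim)

def exact_distribution(unitary, state):
    """Noise-free measurement probabilities of unitary|state> in the computational basis."""
    out = unitary @ state
    return np.abs(out) ** 2

# Example input (an assumption for illustration): the Fourier-basis state that
# encodes the integer 5. The ideal IQFT maps it back to the basis state |5>.
n = 4
dim = 2 ** n
fourier_state = np.exp(2j * np.pi * 5 * np.arange(dim) / dim) / np.sqrt(dim)
probs = exact_distribution(iqft_matrix(n), fourier_state)
```

Comparing such a noise-free `probs` vector with the histogram measured on hardware is what the comparison between Fig. 1(d) and Fig. 1(e) amounts to.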

Figure 1: (a) A quantum circuit for the inverse quantum Fourier transform (IQFT) for four qubits $q_{0}$, $q_{1}$, $q_{2}$, and $q_{3}$. The blue, red, green, and orange components represent the Hadamard, phase-shift, controlled NOT, and SWAP gates, respectively. (b) Schematic of a quantum circuit search. (c) A distilled quantum circuit for the four-qubit IQFT generated from the distillator. (d) A probability distribution obtained from the exact IQFT calculation without errors. (e), (f) Probability distributions obtained from the quantum computer IBMQ for the conventional four-qubit IQFT circuit (e) and for the distilled quantum circuit (f).

Before showing the details of the distillator developed in the present study, we demonstrate how it works. When our distillator is applied to the four-qubit IQFT [Fig. 1(b)], it compresses the conventional IQFT circuit and outputs a short distilled quantum circuit, as shown in Fig. 1(c). We can carry out the IQFT calculation using the obtained circuit. Figure 1(f) shows the probability distribution of the four-qubit IQFT obtained from the distilled quantum circuit for the same input as that in Figs. 1(d) and 1(e). Significantly, the output profile is almost the same as the exact calculation [Fig. 1(d)]; owing to the distillation, the four-qubit IQFT is successfully completed in spite of the noisy qubit environment in IBMQ. The distilled quantum circuit is four times shorter than the conventional quantum circuit, which has 36 quantum gates, and this brevity is responsible for the fewer errors and the successful calculation.

Figure 2: The conventional four-qubit IQFT circuit and the distilled quantum circuit were executed for randomly initialized input states on IBMQ. The probabilities obtained from the exact calculations without errors (the gray bars) and experimental results on IBMQ for the conventional circuit and the distilled quantum circuit (the green and orange bars, respectively) for six different initial states are exemplified.

The main concept of the quantum-circuit distillator is to compress a target quantum circuit into a shorter quantum circuit that approximates the input/output relation of the target quantum circuit well. To realize such a distillator, it is necessary to search for an ideal quantum circuit among a huge number of combinations of quantum logic gates. We addressed this problem by taking advantage of the remarkable searching abilities of reinforcement learning [Kaelbling et al. (1996); Arulkumaran et al. (2017)], which is known to be suitable for solving optimization problems over combinations of finite elements, such as quantum gates. To construct a reinforcement learning model for quantum circuit compression, we designed a variable-length quantum circuit search based on the Monte-Carlo tree search, in which a deep neural network is used to accelerate the tree search [Silver et al. (2017a, b)] (see Appendixes D and E and Figs. A·3 and A·4 for more details). In contrast to quantum circuit searches using fixed-length quantum circuits [Peruzzo et al. (2014); Farhi et al. (2014); Endo et al. (2021); Cerezo et al. (2021); Nakaji et al. (2022)], such as quantum circuit learning [Mitarai et al. (2018)], our model enables us to search quantum circuits with different lengths and different circuit structures. The deep neural network makes the quantum circuit search efficient by sequentially predicting the most suitable gate to be appended to the gate array constructed so far in the tree search.

We found that the Bhattacharyya coefficient $B$ [Fuchs and Van De Graaf (1999)] works well as a reward function in the reinforcement learning used in the quantum-circuit distillation. The Bhattacharyya coefficient $B$ is a classical expression of the quantum fidelity, which gives a measure of the similarity between two quantum circuits based on their output probability distributions (see Appendix A for the definition of $B$). $B$ takes a value between 0 and 1; it equals 1 when the two probability distributions are the same and approaches 0 as they become dissimilar. The quantum gate fidelity [Fuchs and Van De Graaf (1999)], which compares quantum circuits as unitary transformations, has often been used as a reward function in previous studies on quantum circuit search [Khaneja et al. (2005); Khatri et al. (2019); Dalgaard et al. (2020); Peng et al. (2020); Magann et al. (2021); Fösel et al. (2021); Moro et al. (2021); Kimura et al. (2022)]. However, we found that it imposes too strong a constraint on the search in the present reinforcement learning [see Fig. 3(e)]. On the other hand, $B$ only compares probability distributions obtained from quantum circuits and makes it easier to find optimal circuits among a large number of quantum gate combinations.
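The difference between the two reward choices can be illustrated with a small numerical example. This is a sketch under stated assumptions: the gate fidelity is taken as $|{\rm Tr}(U^{\dagger}V)|/d$, which is one common convention and not necessarily the definition used in the paper, and the two-qubit target and candidate unitaries are made up for illustration. The candidate differs from the target only by phases applied to the output basis states, so its measured distribution is identical while its unitary is very different.

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two probability vectors."""
    return float(np.sum(np.sqrt(p * q)))

def gate_fidelity(u, v):
    """|Tr(U^dagger V)| / d, one common convention (an assumption here)."""
    return float(abs(np.trace(u.conj().T @ v)) / u.shape[0])

rng = np.random.default_rng(0)
dim = 4  # two qubits

# Hypothetical target: a Hadamard gate on each qubit.
h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
target = np.kron(h, h)

# Hypothetical candidate: the target followed by phases on the output basis states.
phases = np.diag(np.exp(1j * np.pi * np.array([0.0, 0.5, 1.0, 1.5])))
candidate = phases @ target

# Random normalized input state.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

p_target = np.abs(target @ psi) ** 2
p_candidate = np.abs(candidate @ psi) ** 2

b_value = bhattacharyya(p_target, p_candidate)  # distributions coincide
f_value = gate_fidelity(target, candidate)      # unitaries do not
```

In this contrived case the distribution-based score is maximal while the unitary-based score vanishes, which is the sense in which a unitary comparison is the stricter requirement.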

By applying the developed quantum-circuit distillator using the reinforcement learning to the four-qubit IQFT circuit shown in Fig. 1(a), we obtained the distilled IQFT circuit shown in Fig. 1(c). As discussed above, the distilled circuit outputs the probability distribution shown in Fig. 1(f) for an input, and the distribution is almost the same as the exact calculation shown in Fig. 1(d), greatly improved from that of the original IQFT circuit [Fig. 1(e)]. By comparing the probability distributions obtained from the distilled circuit [Fig. 1(f)] and the exact calculation [Fig. 1(d)], we obtained a $B$ value of 0.91. In contrast, for the probability distribution obtained from the conventional quantum circuit shown in Fig. 1(e), $B$ takes a lower value, 0.69, demonstrating the superiority of the distilled circuit.

We confirmed the generalization performance of the present reinforcement learning in the circuit distillator; we found that the distilled quantum circuit shown in Fig. 1(c) well reproduces the exact calculation of the four-qubit IQFT also for the input quantum states not used for the learning (see Appendix C for the theoretical analysis of the distilled quantum circuit). Figure 2 shows output probability distributions for several input quantum states prepared by random sequences of quantum gates. All the outputs from the distilled quantum circuit (orange bars in Fig. 2) well reproduce the exact calculations (gray bars in Fig. 2), while the outputs from the conventional IQFT circuit (green bars in Fig. 2) are far from the exact calculations (gray bars in Fig. 2).

We calculated the average value of the Bhattacharyya coefficient, $B_{\rm{ave}}$, for randomly initialized input quantum states; we prepared input states by randomly applying quantum logic gates to the ground state of the qubits, and then applied the distilled quantum circuit to these initial states (see Appendix A and Fig. A·1). $B_{\rm{ave}}$ for the distilled four-qubit IQFT circuit was estimated to be $\sim 0.93$, which is greater than the value of 0.90 for the conventional circuit. The result shows again that the distilled quantum circuit with fewer quantum gates generates better outputs than the longer circuit derived from the exact quantum algorithm.

Figure 3: (a) The distilled quantum circuits obtained from the distillator for the two-, three-, and four-qubit IQFTs. (b) The generalized $n$-qubit quantum circuit for the $n$-qubit IQFT. (c) The average performance of the generalized quantum circuits normalized by that of the conventional IQFT circuits, where $B_{\rm{gen}}$ and $B_{\rm{conv}}$ are the averaged Bhattacharyya coefficients of the generalized circuits and the conventional IQFT circuits, respectively. The number of qubits $n$ is changed from 4 to 9. (d) The number of quantum gates used in the generalized circuits and the conventional IQFT circuits. (e) The averaged Bhattacharyya coefficients $B_{\rm{ave}}$ and the quantum gate fidelity $F_{\rm{ave}}$ numerically calculated for the generalized circuits. Here, $B_{\rm{ave}}$ ($F_{\rm{ave}}$) is calculated by comparing the probabilities (unitary matrices) obtained from the exact analytical calculations of the generalized quantum circuits without errors.

By exploiting some outputs from the quantum-circuit distillator, we can infer general quantum circuits that approximate the $n$-qubit IQFTs for any natural number $n$, as follows. In Fig. 3(a), we show distilled circuits obtained from our distillator for the 2-, 3-, and 4-qubit IQFTs. By comparing these distilled quantum circuits, we found a regularity in the array of the quantum gates: the gates are arranged in the order of Hadamard, SWAP, and CNOT gates in the distilled circuits [compare the distilled 2-, 3-, and 4-qubit IQFTs in Fig. 3(a)]. From this observation, we inductively predicted approximate quantum circuits generalized to arbitrary $n$, as shown in Fig. 3(b). We found that the generalized quantum circuits approximate the conventional $n$-qubit IQFT well even for $n$ greater than four. The averaged Bhattacharyya coefficients for the generalized circuits were found to be greater than those for the conventional IQFT circuits even for $n$ greater than four, as shown in Fig. 3(c). The result demonstrates that the generalized circuits outperform the conventional IQFT circuits on the existing quantum computer IBMQ. As shown in Fig. 3(d), the generalized quantum circuits reduce the number of gates from $O(n^{2})$ for the conventional IQFTs [Nielsen and Chuang (2010)] to $O(n)$. This number of gates is also smaller than the $O(n\log n)$ of an approximation circuit proposed in previous studies [Barenco et al. (1996); Nam et al. (2020)]. The gate number reduction mitigates errors in computation and contributes to the performance of the generalized circuits. Note that the generalized circuits do not approximate the IQFT in terms of unitary transformations, although they approximate the output probability distributions. As shown in Fig. 3(e), while the output probability distributions show high values of $B_{\rm{ave}}$, the quantum gate fidelity $F_{\rm{ave}}$ takes much lower values for all $n$. If the quantum gate fidelity is used as a reward for the search, as in previous studies [Fösel et al. (2021); Moro et al. (2021)], the distilled circuit cannot be found. In addition, we have succeeded in analytically justifying the generalized approximate quantum circuits for a special case, as described in Appendix C.
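The quadratic growth of the conventional circuit can be made concrete with a short gate count. The sketch below assumes the textbook IQFT layout ($n$ Hadamard gates, $n(n-1)/2$ controlled phase shifts, and $\lfloor n/2\rfloor$ SWAP gates) and assumes that each controlled phase shift is decomposed into two CNOT gates and three phase-shift gates; this decomposition is not stated in the text but reproduces the 36 gates quoted for the four-qubit circuit. The function name is illustrative, and the gate count of the generalized circuit is not reproduced because it depends on the layout in Fig. 3(b).

```python
def conventional_iqft_gate_count(n_qubits):
    """Gate count of the textbook IQFT under the assumed decomposition."""
    hadamards = n_qubits
    controlled_phases = n_qubits * (n_qubits - 1) // 2
    swaps = n_qubits // 2
    gates_per_controlled_phase = 5  # 2 CNOTs + 3 phase shifts (assumption)
    return hadamards + gates_per_controlled_phase * controlled_phases + swaps

# Counts for the qubit numbers considered in the text (n = 2 ... 9).
counts = {n: conventional_iqft_gate_count(n) for n in range(2, 10)}
```

Under these assumptions the count grows as roughly $5n^{2}/2$, so any circuit whose length is linear in $n$ becomes shorter by a factor that itself grows with $n$.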

Finally, we demonstrate the application of the distilled IQFT circuit to executing Shor’s integer factorization [Nielsen and Chuang (2010); Ekert and Jozsa (1996)] of 57. Figure 4(a) shows the conventional quantum circuit for the factorization. The quantum circuit ideally outputs 0000 or 1000 [Fig. 4(b)] for the input of 57, and the prime factors 3 and 19 can be derived from the output numbers. When the conventional quantum circuit is run on the quantum computer IBMQ, the output distribution profile [Fig. 4(c)] is far different from the exact calculation [Fig. 4(b)], and the desired numbers are no longer obtained as the dominant outputs. On the other hand, by replacing the IQFT with the distilled quantum circuit, we obtained almost the same distribution profile as the exact calculation [compare Figs. 4(b) and 4(d)]. The result shows that the distilled quantum circuits enable Shor’s integer factorization on the system.
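How the prime factors follow from the measured bitstrings can be sketched with the standard classical post-processing of Shor’s algorithm. The code below is an illustration rather than the authors’ implementation: it assumes the base $x=37$ used for Fig. 4, a four-bit counting register, and the usual continued-fraction step; the function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def order_from_bitstring(bitstring, x, modulus):
    """Estimate the order r of x modulo `modulus` from one measured bitstring."""
    measured = int(bitstring, 2)
    if measured == 0:
        return None  # the all-zero outcome carries no information about r
    phase = Fraction(measured, 2 ** len(bitstring))
    candidate = phase.limit_denominator(modulus).denominator
    # The denominator may be a proper divisor of r, so try small multiples.
    for multiple in range(1, modulus):
        r = candidate * multiple
        if pow(x, r, modulus) == 1:
            return r
    return None

def factors_from_order(x, r, modulus):
    """Derive nontrivial factors from an even order r, if possible."""
    if r is None or r % 2 == 1:
        return None
    half_power = pow(x, r // 2, modulus)
    if half_power == modulus - 1:
        return None
    p = gcd(half_power - 1, modulus)
    q = gcd(half_power + 1, modulus)
    if p in (1, modulus) or q in (1, modulus):
        return None
    return tuple(sorted((p, q)))

r = order_from_bitstring("1000", x=37, modulus=57)
factors = factors_from_order(37, r, 57)
```

For the output 1000 the estimated phase is 8/16 = 1/2, which gives $r=2$ (indeed $37^{2}=1369=24\times 57+1$), and the greatest common divisors of $37\mp 1$ with 57 are 3 and 19. The output 0000 yields no information and the run is simply repeated.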

Figure 4: (a) A quantum circuit for the factorization of 57. When the conventional IQFT is chosen for the gray part, the quantum circuit solves the order ($r$) finding problem $x^{r}\equiv 1\pmod{57}$, where we chose $x$ as 37. In this case, the quantum circuit is expected to output “0000” or “1000”, and then the prime factors 3 and 19 can be derived from the output numbers. The blue, yellow, and green components represent the Hadamard, NOT, and CNOT gates, respectively. (b) Ideal probabilities obtained from the exact analytical calculation without errors. (c), (d) Experimental results for the conventional IQFT circuit (c) and the distilled IQFT circuit (d) obtained from IBMQ.

III Conclusion

In summary, we proposed and demonstrated a quantum-circuit distillator to generate quantum circuits that are less sensitive to noise in existing quantum processors. We applied the quantum-circuit distillator to the four-qubit IQFT and demonstrated that the generated distilled quantum circuit outperforms the conventional IQFT circuit on an existing quantum computer, IBMQ. The distilled quantum circuits also let us discover the generalized quantum circuits approximating the general $n$-qubit IQFTs. The result points to a new direction of scientific research, in which humans and machine learning work together to create knowledge. Our method has a wide range of applications, including not only distillation of quantum algorithms such as the IQFT, but also data encoding for quantum machine learning. This work will accelerate the realization of valuable quantum computations for practical tasks.

Acknowledgements.
The authors thank N. Yokoi for valuable discussions. This work was supported by ERATO (No. JPMJER1402), CREST (Nos. JPMJCR20C1, JPMJCR20T2) from JST, Japan; Grant-in-Aid for Scientific Research (S) (No. JP19H05600), Grant-in-Aid for Research Activity Start-up (No. JP19K21035), Grant-in-Aid for Transformative Research Areas (No. JP22H05114) from JSPS KAKENHI, Japan. T.S. is supported by JST ERATO-FS (No. JPMJER2204). N.Y. is supported by MEXT Quantum Leap Flagship Program (Nos. JPMXS0118067285, JPMXS0120319794). We acknowledge the use of IBM Quantum services for this work as part of the IBM Quantum Hub at The University of Tokyo and Keio University. The views expressed are those of the authors and do not reflect the official policy or position of IBM or the IBM Quantum team.

Appendix A Performance estimation of quantum circuits

Performance of a quantum circuit on a quantum processor is estimated by calculating $B$ between the output probabilities from the quantum circuit, $p_{\rm{qc}}$, and the correct answer, $p_{\rm{ans}}$, calculated from the ideal IQFT for an initial quantum state $\left|\Psi\right>$. The initial quantum state is prepared by applying a random sequence of quantum gates $g_{1},g_{2},\cdots,g_{m}$ to the zero state $\left|0\right>$: $\left|\Psi\right>=g_{m}\cdots g_{2}g_{1}\left|0\right>$, where $g_{1},g_{2},\cdots,g_{m}$ are selected from the Hadamard, NOT, phase-shift, and CNOT gates. Figure A·3(b) shows an example of quantum gates used for the initialization of 3-qubit quantum states. The number of quantum gates for the initialization, $m$, is set to $4n$ for $n$-qubit quantum circuits in our calculation. Next, $\left|\Psi\right>$ is transformed by the quantum circuit and measured [Fig. A·1(b)]. By repeating the measurement, we obtain the probability distribution $p_{\rm{qc}}$ for the input quantum state $\left|\Psi\right>$. At the same time, we numerically calculate the IQFT for the input quantum state $\left|\Psi\right>$ and obtain the probability distribution $p_{\rm{ans}}$ [Fig. A·1(a)]. Then, $B(p_{\rm{qc}},p_{\rm{ans}})=\sum_{i\in\Omega}\sqrt{p_{\rm{qc}}(i)p_{\rm{ans}}(i)}$ is calculated, where $\Omega$ is the set of the observed binary numbers for the $n$-qubit quantum circuits. By changing the initialization gates and repeating the $B$ calculation for many input quantum states, $B_{\rm{ave}}$ is obtained as the average value of $B$ over the input quantum states.
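The estimation procedure above reduces to a few lines once the measurement results are available as bitstring counts. The following is a minimal sketch; the helper names and the example counts are hypothetical, and the ideal distribution is assumed to have been computed separately.

```python
import math

def counts_to_probs(counts):
    """Convert raw bitstring counts from repeated measurements to probabilities."""
    shots = sum(counts.values())
    return {bits: c / shots for bits, c in counts.items()}

def bhattacharyya(p_qc, p_ans):
    """B(p_qc, p_ans) = sum_i sqrt(p_qc(i) * p_ans(i)) over all bitstrings."""
    support = set(p_qc) | set(p_ans)
    return sum(math.sqrt(p_qc.get(b, 0.0) * p_ans.get(b, 0.0)) for b in support)

def average_bhattacharyya(pairs):
    """B_ave: mean of B over (p_qc, p_ans) pairs, one pair per random input state."""
    values = [bhattacharyya(p_qc, p_ans) for p_qc, p_ans in pairs]
    return sum(values) / len(values)

# Hypothetical two-qubit example: noisy counts versus an ideal distribution.
measured = counts_to_probs({"00": 480, "11": 420, "01": 60, "10": 40})
ideal = {"00": 0.5, "11": 0.5}
b_single = bhattacharyya(measured, ideal)
b_average = average_bhattacharyya([(measured, ideal), (ideal, ideal)])
```

Bitstrings that never appear in one of the two distributions contribute zero to the sum, so missing dictionary keys are treated as zero probability.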

Figure A·1: (a), (b) Procedure of performance estimation on a quantum processor for the conventional IQFT circuit (a) and the distilled quantum circuit (b). Firstly, a quantum state is prepared by randomly applying quantum gates to the ground state. Secondly, the prepared quantum state is transformed by the conventional IQFT or distilled quantum circuits. Thirdly, the transformed state is measured. The average performance $B_{\rm{ave}}$ is calculated by averaging the Bhattacharyya coefficient $B$ for many initial quantum states with different initialization quantum gates.

Appendix B Devices and their noise parameters used in the performance estimation

The calculation results shown in Figs. 1 and 2 were obtained from the quantum computer IBMQ Vigo, where the average relaxation times $T_{1}$ and $T_{2}$ and the readout error rate are about $8\times 10^{1}$ $\mu$s, $7\times 10^{1}$ $\mu$s, and $3\times 10^{-2}$, respectively. The results shown in Fig. 3(c) were obtained from the quantum computer IBMQ Melbourne, where the average relaxation times $T_{1}$ and $T_{2}$ and the readout error rate are about $6\times 10^{1}$ $\mu$s, $6\times 10^{1}$ $\mu$s, and $5\times 10^{-2}$, respectively. The results shown in Fig. 4 were obtained from the quantum computer IBMQ Sydney, where the average relaxation times $T_{1}$ and $T_{2}$ and the readout error rate are about $1\times 10^{2}$ $\mu$s, $8\times 10^{1}$ $\mu$s, and $3\times 10^{-2}$, respectively.

Appendix C Quantum phase estimation using the generalized $n$-qubit quantum circuits

The generalized $n$-qubit quantum circuit shown in Fig. 3(b) was found to work well in the quantum phase estimation (QPE) problem [Nielsen and Chuang (2010); Aspuru-Guzik et al. (2005)]. The QPE is a quantum algorithm to estimate the phase $\theta$ for a unitary matrix $U$ and its eigenstate $\left|\phi\right>$ such that $U\left|\phi\right>=e^{2\pi i\theta}\left|\phi\right>$. The QPE is performed in the following three steps. Firstly, a product state $\left|0\right>\otimes\left|\phi\right>$ is prepared. Secondly, the Hadamard and controlled phase-shift gates are applied to the state as shown in Fig. A·2(a), and the phase $\theta$ is introduced to the quantum state: $\frac{1}{\sqrt{2^{n}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\theta k}\left|k\right>\otimes\left|\phi\right>$, where $k$ denotes the integer representation of $n$-bit binary numbers. Thirdly, the IQFT is applied, and the state is converted into $\left|2^{n}\theta\right>\otimes\left|\phi\right>$, where we assume that $2^{n}\theta$ is an integer for simplicity. Then, we can obtain $\theta$ by measuring the quantum state. Here, we replace the IQFT with the generalized $n$-qubit quantum circuit shown in Fig. 3(b), and we obtain the final quantum state as $\sum_{j_{1},\cdots,j_{n}\in\{0,1\}}E_{n}^{0\oplus j_{1}}E_{n-1}^{j_{1}\oplus j_{2}}\cdots E_{1}^{j_{n-1}\oplus j_{n}}\left|j_{n}\cdots j_{2}j_{1}\right>$, where $E_{k}^{0}=\left(1+e^{i\pi 2^{k}\theta}\right)/2$ and $E_{k}^{1}=\left(1-e^{i\pi 2^{k}\theta}\right)/2$, and the symbol $\oplus$ denotes the exclusive disjunction between two binary digits. The probability that the state $\left|j_{n}\cdots j_{2}j_{1}\right>$ is observed is $\left|E_{n}^{0\oplus j_{1}}E_{n-1}^{j_{1}\oplus j_{2}}\cdots E_{1}^{j_{n-1}\oplus j_{n}}\right|^{2}$. We analytically proved that the probability takes the maximum value when $j_{n}\cdots j_{2}j_{1}$ is the binary representation of $2^{n}\theta$ or $2^{n}-2^{n}\theta$ [Fig. A·2(b)]. The result means that the generalized $n$-qubit quantum circuit outputs the correct answer with the maximum probability, at least in the QPE. Here, we note that the probability of obtaining the correct answer decreases exponentially with increasing number of qubits. The generalized quantum circuits therefore cannot solve the QPE in polynomial time with bounded error probability.
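The output probabilities of the generalized circuit in the QPE can be evaluated directly from the factors $E_{k}^{0}$ and $E_{k}^{1}$ defined above. The sketch below is a numerical check under the same assumption that $2^{n}\theta$ is an integer; the function name and the example value $\theta=3/16$ are chosen for illustration only.

```python
import numpy as np

def generalized_qpe_probs(n, theta):
    """P(j_n ... j_1) = |E_n^{0 xor j_1} E_{n-1}^{j_1 xor j_2} ... E_1^{j_{n-1} xor j_n}|^2."""
    probs = np.zeros(2 ** n)
    for j in range(2 ** n):
        prev_bit = 0  # plays the role of the leading "0" in E_n^{0 xor j_1}
        amplitude = 1.0 + 0.0j
        for k in range(1, n + 1):
            bit = (j >> (k - 1)) & 1      # j_k, with j_1 the least significant bit
            m = n - k + 1                 # index of the factor E_m
            phase = np.exp(1j * np.pi * 2 ** m * theta)
            if prev_bit ^ bit == 0:
                amplitude *= (1 + phase) / 2   # E_m^0
            else:
                amplitude *= (1 - phase) / 2   # E_m^1
            prev_bit = bit
        probs[j] = abs(amplitude) ** 2
    return probs

# Example: four qubits and 2^n * theta = 3.
n = 4
target = 3
probs = generalized_qpe_probs(n, target / 2 ** n)
max_prob = probs.max()
```

For this example the entries of `probs` at the integers 3 and $16-3=13$ both equal the maximum, in line with the statement above, and the probabilities sum to one because each factor contributes $|E_{m}^{0}|^{2}+|E_{m}^{1}|^{2}=1$.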

Figure A·2: (a) A quantum circuit to perform the QPE using the generalized $n$-qubit quantum circuit. Compared to the conventional QPE algorithm, the IQFT part is replaced with the generalized $n$-qubit quantum circuit. The blue boxes and the red components represent the Hadamard gates and the controlled phase-shift gates, respectively. (b) Probability distributions for the four-qubit QPE using the conventional IQFT (the gray bars) and the generalized quantum circuit (the orange bars) obtained from the exact calculations without errors. The generalized quantum circuit outputs binary representations of $2^{n}\theta$ and $2^{n}-2^{n}\theta$ with the maximum probability, where $\theta$ is the estimated phase for a unitary matrix $U$ and its eigenstate $\left|\phi\right>$ such that $U\left|\phi\right>=e^{2\pi i\theta}\left|\phi\right>$ (see Appendix C for detailed calculations of probability distributions).
Figure A·3: (a) The MCTS used as a policy of the reinforcement learning model. The MCTS searches quantum circuits by repeatedly appending a quantum gate starting from the present quantum circuit $s_{k}$. Reflecting state values of the searched quantum circuits, the MCTS generates a policy $\boldsymbol{\pi}_{k}$. Then, $s_{k}$ is updated to the next quantum circuit $s_{k+1}$ by selecting an action $a_{k}$ according to $\boldsymbol{\pi}_{k}$. By repeatedly updating the quantum circuit and conducting the MCTS, distilled quantum circuits are searched from a huge number of combinations of quantum gates. See Appendix D for a more detailed procedure of the quantum circuit search. (b) An example of the set of quantum gates used in the reinforcement learning for 3-qubit quantum circuits. The yellow boxes and the green components represent the NOT gates and the controlled NOT gates, respectively. An action $a_{k}$ is defined as selecting a quantum gate from the gates in (b) and appending the gate to $s_{k}$. A policy $\boldsymbol{\pi}_{k}$ is defined as a probability distribution over the actions. (c) A reward calculation for a quantum circuit $s_{k}$. First, the Bhattacharyya coefficient $B$ between the ideal probabilities and the output from $s_{k}$ is calculated for several initial quantum states $\left|\phi\right>$. Then, the reward $z_{k}$ is set to 1 or 0 depending on whether or not $B$ is larger than the threshold value $B_{\rm{th}}$ for all the initial quantum states.
Figure A·4: (a) The dual neural network consists of convolution (A), policy prediction (B), and state-value prediction (C) parts. The input to the network is an $N$-dimensional vector representing an input quantum circuit $s$. The outputs are a $G$-dimensional vector $\mathbf{q}$ predicting a policy and a scalar $v$ predicting a state value of the input quantum circuit $s$, where $N$, $G$, and $C$ are the maximum length of input quantum circuits, the number of selectable quantum gates, and the number of channels, respectively. Details of the network structure and hyperparameters are summarized in Table A·I.

Appendix D Reinforcement learning model

We formulated the search problem as a combinatorial optimization of quantum logic gates. We applied reinforcement learning [Arulkumaran et al. (2017)] to the problem by treating quantum circuits as states and the appending of one quantum gate to the end of a quantum circuit as an action [Fig. A·3(a)]. A state $s_{k}$ consisting of $k$ quantum gates is updated to a state $s_{k+1}$ consisting of $k+1$ quantum gates by applying an action $a_{k}$. The quantum gates are selected from the Hadamard, NOT, phase-shift, and CNOT gates, where the magnitude of the phase shift is selected from $-2\pi/2^{1}$, $-2\pi/2^{2}$, $-2\pi/2^{3}$, and $-2\pi/2^{4}$ to construct a universal gate set. For example, we prepared 24 quantum gates for constructing 3-qubit quantum circuits, as shown in Fig. A·3(b). Then, various quantum circuits with different lengths and different combinations of quantum gates can be generated depending on a series of actions. The action $a_{k}$ is selected by using the action probability function $\boldsymbol{\pi}(s_{k})$ [Kaelbling et al. (1996)] obtained from the Monte-Carlo tree search (MCTS) [Silver et al. (2017a, b)] (see Fig. A·3). The component of $\boldsymbol{\pi}(s_{k})$ is defined as $\boldsymbol{\pi}(s_{k},a)={\rm{Pr}}(a|s_{k})$, where $a$ is an action. The reward $z_{k}$ for the reinforcement learning is determined by calculating the Bhattacharyya coefficient $B_{k}$ between the output probabilities from $s_{k+1}$ and the correct answer [Fig. A·3(c)]. When $B_{k}$ is above a threshold value, $B_{\rm{th}}$, for several input quantum states, we assign 1 to $z_{k}$ and terminate the search. When $B_{k}$ is less than $B_{\rm{th}}$, we assign 0 to $z_{k}$ and repeat searching for the next action, updating the state, and comparing $B_{k+1}$ and $B_{\rm{th}}$. If no state satisfies the condition before $k$ reaches the upper limit $N$, we assign $-1$ to $z_{N}$. $N$ determines the maximum length of the quantum circuits in the search, and we set $N$ to around $4n$ in this study. The total reward $z$ for the search is defined as the sum of the rewards, $z=\sum_{k}z_{k}$. The obtained set $(s_{k},\boldsymbol{\pi}(s_{k}),z)$ is used as the training data to update the MCTS and the dual neural network, as explained below.
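The action space and the reward rule described above can be sketched as follows. The gate encoding as tuples and the function names are hypothetical, and the assumption that CNOT gates are available for every ordered pair of qubits is inferred only from the fact that it reproduces the 24 gates quoted for three qubits (3 Hadamard, 3 NOT, 12 phase-shift, and 6 CNOT gates).

```python
from itertools import permutations

PHASE_EXPONENTS = (1, 2, 3, 4)  # phase-shift angles -2*pi / 2**e

def build_gate_set(n_qubits):
    """Enumerate the selectable gates (actions) for an n-qubit search."""
    gates = []
    for q in range(n_qubits):
        gates.append(("H", q))
        gates.append(("NOT", q))
        for e in PHASE_EXPONENTS:
            gates.append(("PHASE", q, e))
    # Assumption: one CNOT per ordered (control, target) pair.
    for control, target in permutations(range(n_qubits), 2):
        gates.append(("CNOT", control, target))
    return gates

def step_reward(b_values, b_th, k, max_length):
    """Return (z_k, search_finished) for a circuit of k gates.

    b_values holds the Bhattacharyya coefficient for each test input state.
    """
    if all(b > b_th for b in b_values):
        return 1, True      # good enough for every input state: stop
    if k >= max_length:
        return -1, True     # length limit N reached without success
    return 0, False         # keep appending gates

gate_set = build_gate_set(3)
```

The threshold `b_th`, the list of test input states, and the exact indexing of $k$ relative to the appended gate are left to the caller here, since the text does not fix their values in this section.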

The reinforcement learning aims to learn a policy that maximizes the total reward. As the policy model, we used the AlphaZero algorithm based on the MCTS and a dual neural network [Silver et al. (2017a, b)]. The dual neural network predicts a policy, $\mathbf{q}$, and a state value, $v$, in response to the input state $s$ (see the next section, Fig. A·4, and Table A·I for more details). The network parameters are trained on the training data $(s,\boldsymbol{\pi}(s),z)$ by minimizing the loss function $L=(z-v)^{2}-\boldsymbol{\pi}^{\rm{T}}\log\mathbf{q}$. The MCTS efficiently searches quantum circuits reflecting the prediction of the dual neural network. The nodes and edges of the tree are defined as the states and the pairs of state and action, respectively, where the root node is the present state, $s_{k}$ [see Fig. A·3(a)]. Each edge $(s,a)$ has the following parameters: the visit count $N(s,a)$, the action value $Q(s,a)$, and the prior probability $P(s,a)$, where $N(s,a)$ and $Q(s,a)$ are initialized to zero. The tree search starts from the root node. Other nodes and edges are generated by selecting an action that maximizes the upper confidence bound $Q(s,a)+U(s,a)$, where $U(s,a)=c_{\rm{puct}}P(s,a)\sqrt{\sum_{b}N(s,b)}/[1+N(s,a)]$, until the search reaches one of the leaf nodes. Here, $c_{\rm{puct}}$ is an exploration parameter. At the leaf node, the next node $s^{\prime}$ is appended, and the prior probability $P(s^{\prime},\cdot)$ and the state value $V(s^{\prime})$ for $s^{\prime}$ are predicted by using the dual neural network as $P(s^{\prime},\cdot)=\mathbf{q}(s^{\prime})$ and $V(s^{\prime})=v(s^{\prime})$. Then the parameters of all visited edges are updated as follows: $N(s,a)\leftarrow N(s,a)+1$ and $Q(s,a)\leftarrow[N(s,a)Q(s,a)+V(s^{\prime})]/[1+N(s,a)]$. By repeating the search and updating the parameters, we obtain the policy for the root state $s_{k}$: $\boldsymbol{\pi}(a|s_{k})=N(s_{k},a)/\sum_{b}N(s_{k},b)$. Following the obtained policy $\boldsymbol{\pi}(s_{k})$, the present state $s_{k}$ is updated to $s_{k+1}$ and we move on to the next search [see Fig. A·3(a)].
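The selection rule, the edge update, and the visit-count policy can be sketched for a single tree node as follows. This is a simplified illustration, not the full tree search: the class and function names are hypothetical, the leaf values come from a fixed toy list instead of the dual neural network, and the update of $Q$ is interpreted as a running mean that uses the visit count before it is incremented, which is an assumption about the intended order of the two updates.

```python
import math

class EdgeStats:
    """Visit counts N, action values Q, and priors P for the edges of one node."""
    def __init__(self, priors):
        self.N = [0] * len(priors)
        self.Q = [0.0] * len(priors)
        self.P = list(priors)

def select_action(stats, c_puct):
    """Pick the action maximizing Q(s,a) + U(s,a)."""
    total = sum(stats.N)
    scores = [
        q + c_puct * p * math.sqrt(total) / (1 + n)
        for q, p, n in zip(stats.Q, stats.P, stats.N)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def backup(stats, action, value):
    """Update Q as a running mean of the leaf values, then increment N."""
    n = stats.N[action]
    stats.Q[action] = (n * stats.Q[action] + value) / (1 + n)
    stats.N[action] = n + 1

def visit_policy(stats):
    """pi(a|s) = N(s,a) / sum_b N(s,b)."""
    total = sum(stats.N)
    return [n / total for n in stats.N]

# Toy run: three actions with uniform priors and fixed hypothetical leaf values.
toy_values = [0.2, 0.8, -0.5]
stats = EdgeStats([1 / 3, 1 / 3, 1 / 3])
for _ in range(100):
    a = select_action(stats, c_puct=1.0)
    backup(stats, a, toy_values[a])
policy = visit_policy(stats)
```

In the toy run the visit counts concentrate on the action with the highest value while the exploration term still forces occasional visits to the others, which is the behavior the resulting policy $\boldsymbol{\pi}(s_{k})$ relies on.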

Appendix E Dual neural network

The dual neural network consists of convolution, policy prediction, and state-value prediction parts, labeled (A), (B), and (C), respectively, in Fig. A·4. The convolution part (A) is composed of 5 convolution layers and 2 fully-connected layers with Leaky ReLU activation functions [Xu et al. (2015)], the batch normalization technique [Ioffe and Szegedy (2015)], and the dropout technique [Srivastava et al. (2014)]. The input to the convolution part is an $N$-dimensional vector representing a quantum circuit, and the output is a $2G$-dimensional vector. The policy prediction part (B) is a fully-connected network with a softmax activation function. The input to the policy prediction part is a $2G$-dimensional vector, and the output is a $G$-dimensional vector $\mathbf{q}$ predicting a policy to search quantum circuits, where $G$ is the number of selectable quantum gates. The state-value prediction part (C) is a fully-connected network with a hyperbolic tangent activation function. The input to the state-value prediction part is a $2G$-dimensional vector, and the output is a scalar $v$ predicting a state value of the input quantum circuit. Please see Table A·I for more detailed parameters of the network. The network is trained by adaptive moment estimation (Adam) [Kingma and Ba (2014)] with the beta hyperparameters of $(0.9,0.999)$ and a learning rate of 0.001.
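The two output heads and the loss $L=(z-v)^{2}-\boldsymbol{\pi}^{\rm{T}}\log\mathbf{q}$ from Appendix D can be sketched in NumPy. This is only an illustration of the shapes and activations: each head is reduced to a single fully-connected layer, the weights are random, and the feature vector stands in for the output of the convolution part, all of which are assumptions rather than the architecture of Table A·I.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def heads(features, w_policy, b_policy, w_value, b_value):
    """Map a 2G-dimensional feature vector to a policy q (G-dim) and a value v."""
    q = softmax(w_policy @ features + b_policy)       # policy prediction part (B)
    v = float(np.tanh(w_value @ features + b_value))  # state-value prediction part (C)
    return q, v

def loss(z, v, pi, q, eps=1e-12):
    """L = (z - v)^2 - pi^T log q; eps guards against log(0)."""
    return (z - v) ** 2 - float(pi @ np.log(q + eps))

G = 24  # number of selectable gates, e.g. the 3-qubit gate set
rng = np.random.default_rng(0)
features = rng.normal(size=2 * G)             # stand-in for the convolution output
w_policy = 0.1 * rng.normal(size=(G, 2 * G))  # hypothetical weights
w_value = 0.1 * rng.normal(size=2 * G)
q, v = heads(features, w_policy, np.zeros(G), w_value, 0.0)

pi = np.full(G, 1.0 / G)  # hypothetical MCTS policy target
value = loss(1.0, v, pi, q)
```

The softmax keeps $\mathbf{q}$ a valid probability distribution over the $G$ gates, and the hyperbolic tangent bounds $v$ to the interval $(-1,1)$, matching the range of the per-step rewards.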

References

  • Ladd et al. (2010) T. D. Ladd, F. Jelezko, R. Laflamme, Y. Nakamura, C. Monroe,  and J. L. O’Brien, Nature 464, 45 (2010).
  • Alexeev et al. (2021) Y. Alexeev, D. Bacon, K. R. Brown, R. Calderbank, L. D. Carr, F. T. Chong, B. DeMarco, D. Englund, E. Farhi, B. Fefferman, et al., PRX Quantum 2, 017001 (2021).
  • Nielsen and Chuang (2010) M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, 10th anniversary ed. (Cambridge Univ. Press, Cambridge, 2010).
  • Montanaro (2016) A. Montanaro, npj Quant. Inform. 2, 1 (2016).
  • Preskill (2018) J. Preskill, Quantum 2, 79 (2018).
  • Córcoles et al. (2019) A. D. Córcoles, A. Kandala, A. Javadi-Abhari, D. T. McClure, A. W. Cross, K. Temme, P. D. Nation, M. Steffen,  and J. M. Gambetta, Proc. IEEE 108, 1338 (2019).
  • Chow et al. (2012) J. M. Chow, J. M. Gambetta, A. D. Córcoles, S. T. Merkel, J. A. Smolin, C. Rigetti, S. Poletto, G. A. Keefe, M. B. Rothwell, J. R. Rozen, et al., Phys. Rev. Lett. 109, 060501 (2012).
  • Willsch et al. (2017) D. Willsch, M. Nocon, F. Jin, H. De Raedt,  and K. Michielsen, Phys. Rev. A 96, 062302 (2017).
  • Gambetta et al. (2017) J. M. Gambetta, J. M. Chow,  and M. Steffen, npj Quant. Inform. 3, 2 (2017).
  • Wendin (2017) G. Wendin, Rep. Prog. Phys. 80, 106001 (2017).
  • (11) IBM Quantum. https://quantum-computing.ibm.com/ (2021).
  • Kaelbling et al. (1996) L. P. Kaelbling, M. L. Littman,  and A. W. Moore, J. Artif. Intel. Res. 4, 237 (1996).
  • Arulkumaran et al. (2017) K. Arulkumaran, M. P. Deisenroth, M. Brundage,  and A. A. Bharath, IEEE Signal Process. Mag. 34, 26 (2017).
  • Silver et al. (2017a) D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., Nature 550, 354 (2017a).
  • Silver et al. (2017b) D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, et al., arXiv preprint arXiv:1712.01815  (2017b).
  • Peruzzo et al. (2014) A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, Nat. Commun. 5, 4213 (2014).
  • Farhi et al. (2014) E. Farhi, J. Goldstone,  and S. Gutmann, arXiv preprint arXiv:1411.4028  (2014).
  • Endo et al. (2021) S. Endo, Z. Cai, S. C. Benjamin,  and X. Yuan, J. Phys. Soc. Jpn. 90, 032001 (2021).
  • Cerezo et al. (2021) M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, et al., Nat. Rev. Phys. 3, 625 (2021).
  • Nakaji et al. (2022) K. Nakaji, S. Uno, Y. Suzuki, R. Raymond, T. Onodera, T. Tanaka, H. Tezuka, N. Mitsuda,  and N. Yamamoto, Phys. Rev. Res. 4, 023136 (2022).
  • Mitarai et al. (2018) K. Mitarai, M. Negoro, M. Kitagawa,  and K. Fujii, Phys. Rev. A 98, 032309 (2018).
  • Fuchs and Van De Graaf (1999) C. A. Fuchs and J. Van De Graaf, IEEE Trans. Inform. Theory 45, 1216 (1999).
  • Khaneja et al. (2005) N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen,  and S. J. Glaser, J. Mag. Res. 172, 296 (2005).
  • Khatri et al. (2019) S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sornborger,  and P. J. Coles, Quantum 3, 140 (2019).
  • Dalgaard et al. (2020) M. Dalgaard, F. Motzoi, J. J. Sørensen,  and J. Sherson, npj Quant. Inform. 6, 6 (2020).
  • Peng et al. (2020) T. Peng, A. W. Harrow, M. Ozols,  and X. Wu, Phys. Rev. Lett. 125, 150504 (2020).
  • Magann et al. (2021) A. B. Magann, C. Arenz, M. D. Grace, T.-S. Ho, R. L. Kosut, J. R. McClean, H. A. Rabitz, and M. Sarovar, PRX Quantum 2, 010101 (2021).
  • Fösel et al. (2021) T. Fösel, M. Y. Niu, F. Marquardt,  and L. Li, arXiv preprint arXiv:2103.07585  (2021).
  • Moro et al. (2021) L. Moro, M. G. Paris, M. Restelli,  and E. Prati, Commun. Phys. 4, 178 (2021).
  • Kimura et al. (2022) T. Kimura, K. Shiba, C.-C. Chen, M. Sogabe, K. Sakamoto,  and T. Sogabe, J. Phys. Commun. 6, 075006 (2022).
  • Barenco et al. (1996) A. Barenco, A. Ekert, K.-A. Suominen,  and P. Törmä, Phys. Rev. A 54, 139 (1996).
  • Nam et al. (2020) Y. Nam, Y. Su,  and D. Maslov, npj Quant. Inform. 6, 26 (2020).
  • Ekert and Jozsa (1996) A. Ekert and R. Jozsa, Rev. Mod. Phys. 68, 733 (1996).
  • Aspuru-Guzik et al. (2005) A. Aspuru-Guzik, A. D. Dutoi, P. J. Love,  and M. Head-Gordon, Science 309, 1704 (2005).
  • Xu et al. (2015) B. Xu, N. Wang, T. Chen,  and M. Li, arXiv preprint arXiv:1505.00853  (2015).
  • Ioffe and Szegedy (2015) S. Ioffe and C. Szegedy, arXiv preprint arXiv:1502.03167 (2015).
  • Srivastava et al. (2014) N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever,  and R. Salakhutdinov, JMLR 15, 1929 (2014).
  • Kingma and Ba (2014) D. P. Kingma and J. Ba, arXiv preprint arXiv:1412.6980  (2014).

Table A·I: Architecture of the dual neural network

Part  Layer #  Layer                Hyperparameters
(A)   1        Convolution          output_channel=256, kernel_size=3, stride=1, padding=1
      2        Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      3        Leaky ReLU           negative_slope=0.01
      4        Convolution          output_channel=256, kernel_size=3, stride=1, padding=1
      5        Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      6        Leaky ReLU           negative_slope=0.01
      7        Convolution          output_channel=256, kernel_size=3, stride=1, padding=1
      8        Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      9        Leaky ReLU           negative_slope=0.01
      10       Convolution          output_channel=256, kernel_size=3, stride=1, padding=1
      11       Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      12       Leaky ReLU           negative_slope=0.01
      13       Convolution          output_channel=256, kernel_size=3, stride=1, padding=1
      14       Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      15       Leaky ReLU           negative_slope=0.01
      16       Fully-connected      in_features=256×G, out_features=4×G, bias=True
      17       Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      18       Leaky ReLU           negative_slope=0.01
      19       Dropout              p=0.3
      20       Fully-connected      in_features=4×G, out_features=2×G, bias=True
      21       Batch normalization  eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
      22       Leaky ReLU           negative_slope=0.01
      23       Dropout              p=0.3
(B)   24-1     Fully-connected      in_features=2×G, out_features=G, bias=True
      25-1     Log softmax
(C)   24-2     Fully-connected      in_features=2×G, out_features=1, bias=True
      25-2     tanh