Implementing arbitrary quantum operations via quantum walks on a cycle graph

Jia-Yi Lin National Lab of Solid State Microstructure, Collaborative Innovation Center of Advanced Microstructures, and Department of Physics, Nanjing University, Nanjing 210093, China.    Xin-Yu Li Institute for Brain Sciences and Kuang Yaming Honors School, Nanjing University, Nanjing 210023, China.    Yu-Hao Shao Institute for Brain Sciences and Kuang Yaming Honors School, Nanjing University, Nanjing 210023, China.    Wei Wang wangwei@nju.edu.cn National Lab of Solid State Microstructure, Collaborative Innovation Center of Advanced Microstructures, and Department of Physics, Nanjing University, Nanjing 210093, China. Institute for Brain Sciences and Kuang Yaming Honors School, Nanjing University, Nanjing 210023, China.    Shengjun Wu sjwu@nju.edu.cn National Lab of Solid State Microstructure, Collaborative Innovation Center of Advanced Microstructures, and Department of Physics, Nanjing University, Nanjing 210093, China. Institute for Brain Sciences and Kuang Yaming Honors School, Nanjing University, Nanjing 210023, China. Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
Abstract

The quantum circuit model is the most commonly used model for implementing quantum computers and quantum neural networks, whose essential task is to realize certain unitary operations. The circuit model usually implements a desired unitary operation $U(N)$ by a sequence of single-qubit and two-qubit unitary gates from a universal set. Although this is convenient for experimentalists, who only need to prepare a few kinds of universal gates, the number of gates required to implement an arbitrary desired unitary operation is usually large. Hence the efficiency in terms of the circuit depth or running time is not guaranteed. Here we propose an alternative approach; we use a simple discrete-time quantum walk (DTQW) on a cycle graph to model an arbitrary unitary operation $U(N)$ without the need to decompose it into a sequence of gates of smaller sizes. Our model is essentially a quantum neural network based on DTQW. Firstly, it is universal, as we show that any unitary operation $U(N)$ can be realized via an appropriate choice of coin operators. Secondly, our DTQW-based neural network can be updated efficiently via a learning algorithm, i.e., a modified stochastic gradient descent algorithm adapted to our network. By training this network, one can find approximations to arbitrary desired unitary operations. With an additional measurement on the output, the DTQW-based neural network can also implement general measurements described by positive-operator-valued measures (POVMs). We show its capacity in implementing arbitrary 2-outcome POVM measurements via numerical simulation. We further demonstrate that the network can be simplified and can overcome device noise during the training so that it becomes more friendly for laboratory implementations. Our work shows the capability of the DTQW-based neural network in quantum computation and its potential in laboratory implementations.

I Introduction

Quantum walk Aharonov , the quantum counterpart of classical random walk, has been widely applied in achieving various quantum information processing tasks Portugal . Because of its quadratic enhancement of variances, the quantum walk plays a vital role in many quantum search algorithms and provides possible exponential speedups due to the quantum interference during the walk Kempe . Moreover, various experimental implementations of quantum walks prove its feasibility in real-life circumstances of quantum information processing Kia .

On the other hand, machine learning is a core technology in the age of artificial intelligence. Since machine learning faces the challenge of the lack of computational power and quantum computing has a vast computational potential, the possibility of combining quantum computing and machine learning has been considered. Quantum neural networks (QNNs), a newer class of models in the field of quantum machine learning, operate on quantum computers and perform calculations using quantum effects like superposition, entanglement, and interference. Investigations on QNNs farhi2018classification ; zhao2019building ; mitarai2018quantum ; dallaire2018quantum ; amin2018quantum ; CZoufal ; VDunjko ; MSchuld have revealed their potential advantages, such as training and processing speedups. Despite significant developments in the growing field of quantum machine learning, the trade-offs between quantum and classical models have not been systematically studied. In particular, the question of whether quantum neural networks are more powerful than classical neural networks is still open SAaronson2015nph .

A gate-model QNN is a QNN constructed on a gate-model quantum computer using a sequence of unitaries with associated gate parameters farhi2018classification . Recent developments, such as quantum generative adversarial networks and quantum circuit learning, have more general and diverse QNN structures dallaire2018quantum ; mitarai2018quantum ; gyongyosi2019training . Researchers have already proved that typical quantum walks are universal for quantum computation childs2009universal ; lovett2010universal ; kurzynski2013quantum ; bian2015realization ; zhao2015experimental . However, these works mainly focus on state processing, and many auxiliary systems should be employed in general. In contrast, what we are attempting to achieve in this work is a universal control of the quantum system to implement arbitrary quantum operations, without any auxiliary system. For this purpose, we shall introduce a QNN based on discrete-time quantum walks (DTQW) on a cycle graph with specifically parameterized coin operators. We choose the graph to be a cycle because it is simple for laboratory implementations. We will prove that the DTQW-based QNN is indeed capable of realizing arbitrary unitary evolution of the closed system.

Determining the parameters of the DTQW-based QNN analytically is possible. However, any further adjustment of the network, such as a reduction of the circuit depth, poses extraordinary difficulties for analytical methods. In contrast, we will show that such adjustments can be made effectively with gradient descent, a well-known optimization algorithm frequently employed to train machine learning models, including both classical and quantum neural networks darken1992learning ; bengio2013advances . Another significant advantage of using gradient descent is that explicitly decomposing the desired operator into a sequence of gates from a universal set is no longer necessary. Furthermore, we shall simplify the network in various ways to facilitate laboratory implementations. For example, we shall use only rotations along the x-axis as the gates involved in the DTQW. We can still find decent approximations of the desired quantum operations in this situation using our DTQW-based QNN.

Our work is organized as follows. We first introduce our DTQW-based neural network in Sec. II and then prove its universality for quantum control in Sec. III. We further modify gradient descent and apply it to our DTQW-based QNN in Sec. IV. Finally, we simplify the QNN in Sec. V to facilitate the laboratory implementations.

II Quantum neural network based on discrete time quantum walk

The quantum neural network based on quantum gates, the gate-model QNN, was first introduced due to its high experimental feasibility farhi2018classification . The gate-model QNNs utilize a series of unitary operations in a certain order to process the quantum state. The unitary operations involve adjustable parameters. By optimizing these parameters and encoding information to the input and output states, the gate-model QNNs are sufficient to solve various learning tasks. In this section, we introduce the DTQW on a cycle graph. We choose the graph to be a cycle because it is simple for laboratory implementation. We will show that such DTQW also involves a series of adjustable unitary gates and is sufficient to learn quantum operations. Thus the DTQW on a cycle graph can be treated as a special type of the gate-model QNN.

The DTQW on a cycle graph involves two Hilbert spaces, namely the coin space $\mathcal{H}_{c}$ and the position space $\mathcal{H}_{p}$, which are spanned by the orthonormal bases $\{|{0}\rangle_{c},|{1}\rangle_{c}\}$ and $\{|{x}\rangle_{p}\}_{x=0}^{n-1}$ respectively, where $n$ is the number of sites in the cycle. The walker state $|{\Psi}\rangle$ is then in the space $\mathcal{H}=\mathcal{H}_{c}\otimes\mathcal{H}_{p}$. A schematic representation of the DTQW on a cycle graph is shown in Fig. 1. The process of the DTQW is an iteration of applying coin operators $\hat{C}^{(t)}$ and shift operators

$$\hat{S}=\sum_{c=0}^{1}\sum_{x=0}^{n-1}|c,x+\delta_{c}\ (\mathrm{mod}\ n)\rangle\langle c,x| \tag{1}$$

to the walker state, i.e.,

$$|{\Psi^{(t+1)}}\rangle=\hat{S}\hat{C}^{(t)}|{\Psi^{(t)}}\rangle, \tag{2}$$

where $t=0,1,2,\dots$ denotes the ordinal of iterations, and the integer $\delta_{c}$ represents how far the walker is shifted if its coin is in the state $|{c}\rangle$. For simplicity, we choose $\delta_{c}=c$ throughout this work. To make sure that the DTQW is flexible enough to implement various quantum operations, the coin operator $\hat{C}^{(t)}$ needs to be site-dependent, i.e.,

$$\hat{C}^{(t)}=\sum_{x=0}^{n-1}\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|, \tag{3}$$

where $\hat{c}_{x}^{(t)}\in\mathrm{U}(2)$ flips the coin of the walker during the $t$-th iteration if the walker is at the site $x$. Since the operators $\hat{c}_{x}^{(t)}$ are applied to the coin only if the walker is at certain sites $x$, they are called single-site coin operators.

Figure 1: A schematic representation of the DTQW on a cycle graph with 12 sites, i.e., $n=12$. The spin-1/2 particle represents the coin system of the DTQW. The dots on the cycle represent the positions that the walker can take. The walker walks on the sites for $T$ steps. The coin system and the position system together form the total quantum system of the DTQW.

Since the operations during every iteration are unitary, the total effect of a $T$-step DTQW

$$\hat{U}_{T,0}=\mathcal{T}\prod_{t=0}^{T-1}\hat{S}\hat{C}^{(t)} \tag{4}$$

is also unitary, where $\mathcal{T}\prod$ denotes the time-ordered product. We define $\hat{U}_{t_{1},t_{0}}=\mathcal{T}\prod_{t=t_{0}}^{t_{1}-1}\hat{S}\hat{C}^{(t)}$ so that it is the time evolution operator, i.e., $|{\Psi^{(t_{1})}}\rangle=\hat{U}_{t_{1},t_{0}}|{\Psi^{(t_{0})}}\rangle$. One can notice that our version of the DTQW on a cycle graph is a straightforward generalization of the conventional Hadamard walk, for which $\hat{c}_{x}^{(t)}=\hat{H}$ and $\delta_{c}=1-2c$.
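The construction of Eqs. (1)-(4) can be sketched numerically. The snippet below is only an illustration under stated assumptions: the basis ordering (index $c\,n+x$ for $|c,x\rangle$), the helper names, and the way random coins are drawn are choices made here, not part of the original text. It builds $\hat{U}_{T,0}$ for a small cycle and reports how far the result is from being unitary.

```python
import cmath
import math
import random

def shift(n):
    """Shift operator of Eq. (1) with delta_c = c; basis index is c * n + x."""
    dim = 2 * n
    S = [[0j] * dim for _ in range(dim)]
    for c in range(2):
        for x in range(n):
            S[c * n + (x + c) % n][c * n + x] = 1 + 0j
    return S

def coin(coins, n):
    """Site-dependent coin operator of Eq. (3); coins[x] is a 2x2 matrix."""
    dim = 2 * n
    C = [[0j] * dim for _ in range(dim)]
    for x in range(n):
        for a in range(2):
            for b in range(2):
                C[a * n + x][b * n + x] = coins[x][a][b]
    return C

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def random_u2(rng):
    """Some element of U(2) (not Haar-distributed); for illustration only."""
    phi, a, b, theta = (rng.uniform(0, 2 * math.pi) for _ in range(4))
    c, s = math.cos(theta), math.sin(theta)
    g = cmath.exp(1j * phi)
    return [[g * cmath.exp(1j * a) * c, g * cmath.exp(1j * b) * s],
            [-g * cmath.exp(-1j * b) * s, g * cmath.exp(-1j * a) * c]]

def walk_unitary(coin_schedule, n):
    """U_{T,0} of Eq. (4): later steps multiply from the left."""
    dim = 2 * n
    U = [[1 + 0j if i == j else 0j for j in range(dim)] for i in range(dim)]
    S = shift(n)
    for coins in coin_schedule:  # t = 0, 1, ..., T-1
        U = matmul(matmul(S, coin(coins, n)), U)
    return U

def unitarity_error(U):
    """Largest entry of |U^dagger U - I|."""
    G = matmul(dagger(U), U)
    return max(abs(G[i][j] - (1 if i == j else 0))
               for i in range(len(U)) for j in range(len(U)))

rng = random.Random(0)
n, T = 3, 5  # hypothetical sizes chosen for the demonstration
schedule = [[random_u2(rng) for _ in range(n)] for _ in range(T)]
U = walk_unitary(schedule, n)
print(unitarity_error(U))  # expected to be at the level of rounding error
```

Dense matrices are used here for readability; for larger cycles one would more likely apply the coin and shift directly to a state vector.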

Every step of the DTQW is unitary and is parameterized by $\hat{c}_{x}^{(t)}\in\mathrm{U}(2)$. These operators $\hat{c}_{x}^{(t)}$ can be treated as the adjustable gates in a gate-model quantum neural network. By adjusting these gates $\hat{c}_{x}^{(t)}$, we can use the DTQW to implement various quantum operations. Therefore the DTQW can be seen as a special type of gate-model quantum neural network. A schematic representation of the quantum neural network based on the DTQW on a cycle graph is shown in Fig. 2. The circuit depth of this network is the number of walking steps $T$ of the DTQW. In this work, we will denote the quantum neural network based on the DTQW on a cycle graph simply as the DTQW-QNN. We call the system of the quantum walker the underlying system of the DTQW-QNN.

Figure 2: The DTQW on a cycle graph is represented in a fashion similar to gate-model QNNs. The operators $\hat{c}_{x}^{(t)}$ in little boxes are single-site coin operators of the DTQW, while the operators $\hat{S}$ in large dashed boxes are shift operators. Each $\hat{c}_{x}^{(t)}$ acts on two energy levels. These $\hat{c}_{x}^{(t)}$ are adjusted so that the total effect of the DTQW meets one's needs.

III Universality and complexity of DTQW-based neural network

The implementation of quantum operations via the DTQW on a cycle graph is one of the primary motivations of our work. The universality of DTQW for quantum computation has been shown in general childs2009universal ; lovett2010universal . While the previous work mainly focuses on the mapping from the initial state to the final state in a certain small subspace of the total system, in this work we take the overall effect on the total system into account. In this section, we investigate the capacity and universality of the DTQW on a cycle graph in implementing quantum operations, and show that it is universal for unitary operations, which is the main theorem of this section.

By saying that the DTQW on a cycle graph is universal, we mean that any unitary operation on the overall Hilbert space $\mathcal{H}=\mathcal{H}_{c}\otimes\mathcal{H}_{p}$ can be realized by a DTQW. Hence it is not only universal for computation but also universal for controlling the whole quantum system. To be more formal and specific, the following theorem is provided.

Theorem 1.

For any unitary operator $\hat{V}\in\mathrm{U}(2n)$, there exists a positive integer $T$ and a family of single-site coin operators $\{\hat{c}_{x}^{(t)}\}\subset\mathrm{U}(2)$ indexed by the set $\{(x,t):0\leq x<n\mbox{ and }0\leq t<T\}$ such that the total effect of the $T$-step DTQW is $\hat{V}$, i.e., $\hat{U}_{T,0}=\hat{V}$, as long as $\delta_{0}\neq\delta_{1}$ and $\gcd(|\delta_{0}-\delta_{1}|,n)=1$.
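The hypothesis on the shifts is easy to check for a given convention. The following sketch (the function name is a hypothetical helper introduced here) tests the two conditions of Theorem 1 for the choice $\delta_{c}=c$ used in this work and for the Hadamard-walk convention $\delta_{c}=1-2c$. A `False` result only means that the theorem as stated does not apply; it is not by itself a statement about universality.

```python
from math import gcd

def satisfies_theorem_1(delta0, delta1, n):
    """Check delta_0 != delta_1 and gcd(|delta_0 - delta_1|, n) == 1."""
    return delta0 != delta1 and gcd(abs(delta0 - delta1), n) == 1

# delta_c = c: |delta_0 - delta_1| = 1, so the condition holds for every n.
print(all(satisfies_theorem_1(0, 1, n) for n in range(2, 50)))  # -> True

# delta_c = 1 - 2c: |delta_0 - delta_1| = 2, so only odd n pass the check.
print([n for n in range(2, 10) if satisfies_theorem_1(1, -1, n)])  # -> [3, 5, 7, 9]
```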

We prove the universality of the DTQW on a cycle by decomposing an arbitrary unitary operator $\hat{V}$ into a product of two-level unitary operators, $\hat{V}=\hat{u}_{m}\cdots\hat{u}_{2}\hat{u}_{1}$, and constructing a DTQW to implement every $\hat{u}_{i}$ for $i=1,2,\dots,m$. A detailed proof is provided in Appendix A.

As a demonstration of Theorem 1, we first implement the controlled NOT (CNOT) gate with a DTQW on a cycle with two sites. We can find that according to Eq. (1), the shift operator

$$\hat{S}=\begin{bmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0\end{bmatrix} \tag{5}$$

is just the CNOT gate we need. Hence a simple one-step DTQW is equivalent to the CNOT gate if we choose all the single-site coin operators $\hat{c}_{x}^{(t)}$ to be the identity operator.
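This identification can be checked directly from Eq. (1). The sketch below assumes the basis ordering $|c,x\rangle\mapsto c\,n+x$ (coin first, then position), which is the ordering under which Eq. (5) has the displayed form; the function name is illustrative.

```python
def shift(n):
    """Matrix of the shift operator in Eq. (1) with delta_c = c."""
    dim = 2 * n
    S = [[0] * dim for _ in range(dim)]
    for c in range(2):
        for x in range(n):
            # |c, x> is mapped to |c, x + c mod n>
            S[c * n + (x + c) % n][c * n + x] = 1
    return S

CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

print(shift(2) == CNOT)  # -> True
```

In this picture the coin plays the role of the control qubit and the two-site position register plays the role of the target.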

Next, let us consider a more complicated two-level unitary operator, a unitary $\hat{U}$ controlled by two qubits

$$\hat{V}=\begin{bmatrix}1&0&0&0&0&0&0&0\\ 0&1&0&0&0&0&0&0\\ 0&0&1&0&0&0&0&0\\ 0&0&0&1&0&0&0&0\\ 0&0&0&0&1&0&0&0\\ 0&0&0&0&0&1&0&0\\ 0&0&0&0&0&0&a&b\\ 0&0&0&0&0&0&c&d\end{bmatrix}, \tag{6}$$

where $a,b,c,d$ are the four matrix elements of $\hat{U}$. Comparing this operator $\hat{V}$ with the general form of two-level operators in Eq. (17), we find that $c_{0}=c_{1}=1$, $x_{0}=3$ and $x_{1}=4$. By substituting $c_{0},c_{1},x_{0},x_{1}$ in Eqs. (26) and (27) with their respective values, we get

$$\hat{c}_{x}^{(t)}=\begin{cases}\hat{\sigma}_{x}&\mbox{if $t=0$ or $4$, and $x=4$}\\ \begin{bmatrix}d&c\\ b&a\end{bmatrix}&\mbox{if $t=1$ and $x=4$}\\ \hat{I}_{c}&\mbox{otherwise}\end{cases} \tag{7}$$

where $\hat{\sigma}_{x}$ is the Pauli $x$ matrix. By choosing the single-site coin operators $\hat{c}_{x}^{(t)}$ according to Eq. (7), we can realize the unitary operator $\hat{V}$ with an eight-step DTQW on a cycle with four sites.

For the most general two-level unitary operators $\hat{V}$, the calculation is essentially the same as in the above example, i.e., find the values of $c_{0},c_{1},x_{0},x_{1}$ by comparing $\hat{V}$ with Eq. (17) and then substitute them in Eqs. (18) and (19) if $c_{0}=c_{1}$, or in Eqs. (26) and (27) otherwise. Unitary operators $\hat{V}$ which are not two-level are decomposed into a product of two-level unitary operators, $\hat{V}=\hat{u}_{m}\cdots\hat{u}_{2}\hat{u}_{1}$ Nielsen2007Quantum . By combining the DTQWs for the $\hat{u}_{i}$ one after another, we can realize $\hat{V}$ with the final combined DTQW. As an example, the calculation to implement the Fourier transformation is provided in Appendix B.

Implementing a unitary operation with the construction in the proof of Theorem 1 as above involves numerous steps of the walk. To reduce the number of steps, we provide in Appendix C a further optimized scheme for implementations. With this scheme, no more than $2n^{2}-2n+1$ steps of the walk are needed for the DTQW-QNN to be universal, where $n$ is the number of sites in the cycle.
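For orientation, the bound can be evaluated for cycle sizes that appear in the examples of this work. This is a worked arithmetic check only; it says nothing about the depths actually used in the numerical experiments.

$$\begin{aligned}
n=2:&\quad 2\cdot 2^{2}-2\cdot 2+1=5,\\
n=4:&\quad 2\cdot 4^{2}-2\cdot 4+1=25,\\
n=20:&\quad 2\cdot 20^{2}-2\cdot 20+1=761.
\end{aligned}$$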

IV Finding approximations via gradient descent

It is sometimes cumbersome to find exact realizations of desired quantum operations analytically. However, fair approximations to desired operations are often acceptable for practical purposes. In this section, we introduce a machine-learning-style algorithm that finds such approximations by applying gradient descent to the DTQW-QNN. With this algorithm, the required depth can be further reduced when approximations are allowed.

In order to apply gradient descent to the DTQW-QNN, we have to do the following three things in advance.

  1. Parameterize the single-site coin operators with a four-dimensional real vector $\vec{\alpha}^{(x,t)}$:

    $$\hat{c}_{x}^{(t)}=e^{i\alpha_{3}^{(x,t)}\hat{\sigma}_{3}}e^{i\alpha_{2}^{(x,t)}\hat{\sigma}_{2}}e^{i\alpha_{1}^{(x,t)}\hat{\sigma}_{1}}e^{i\alpha_{0}^{(x,t)}\hat{\sigma}_{0}}, \tag{8}$$

    in which $\hat{\sigma}_{j}$ is the $j$th Pauli matrix and $\hat{\sigma}_{0}=\hat{I}$.

  2. Introduce a state-wise loss function $L_{|{\Psi}\rangle}$:

    $$L_{|{\Psi}\rangle}=\frac{1}{2}\left\lVert|{\Psi^{(T)}}\rangle-|{\Phi^{(T)}}\rangle\right\rVert^{2}, \tag{9}$$

    where $|{\Psi^{(T)}}\rangle=\hat{U}_{T,0}|{\Psi}\rangle$ and $|{\Phi^{(T)}}\rangle=\hat{V}|{\Psi}\rangle$ are the final state and the desired final state respectively.

  3. Derive the partial derivative:

    $$\frac{\partial{L_{|{\Psi}\rangle}}}{\partial{\alpha^{(x,t)}_{j}}}=\operatorname{Im}\left(\langle{\Phi^{(t)}}|{\hat{\Sigma}_{j}^{(x,t)}}|{\Psi^{(t)}}\rangle\right), \tag{10}$$

    where $|{\Psi^{(t)}}\rangle=\hat{U}_{t,0}|{\Psi}\rangle$ and $|{\Phi^{(t)}}\rangle=\hat{U}_{T,t}^{\dagger}|{\Phi^{(T)}}\rangle$ are the forward-propagation and back-propagation states respectively, $\hat{\Sigma}_{j}^{(x,t)}=(\hat{n}_{j}^{(x,t)}\cdot{\vec{\sigma}})\otimes|x\rangle\langle x|+\sum_{\xi\neq x}\hat{I}_{c}\otimes|\xi\rangle\langle\xi|$, ${\vec{\sigma}}=\sum_{j=0}^{3}\hat{\sigma}_{j}\vec{e}_{j}$, and $\hat{n}_{0}^{(x,t)}$, $\hat{n}_{1}^{(x,t)}$, $\hat{n}_{2}^{(x,t)}$, $\hat{n}_{3}^{(x,t)}$ equal

    $$\begin{bmatrix}1\\ 0\\ 0\\ 0\end{bmatrix},\quad\begin{bmatrix}0\\ 1\\ 0\\ 0\end{bmatrix},\quad\begin{bmatrix}0\\ 0\\ \cos{2\alpha_{1}^{(x,t)}}\\ \sin{2\alpha_{1}^{(x,t)}}\end{bmatrix},\quad\begin{bmatrix}0\\ \sin{2\alpha_{2}^{(x,t)}}\\ -\cos{2\alpha_{2}^{(x,t)}}\sin{2\alpha_{1}^{(x,t)}}\\ \cos{2\alpha_{2}^{(x,t)}}\cos{2\alpha_{1}^{(x,t)}}\end{bmatrix}$$

    respectively.
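The parameterization in Eq. (8) and the direction vectors $\hat{n}_{j}$ listed above can be checked on a single coin. The sketch below builds $\hat{c}$ from $\vec{\alpha}$ using $e^{i\alpha\hat{\sigma}_{j}}=\cos\alpha\,\hat{I}+i\sin\alpha\,\hat{\sigma}_{j}$ and compares a central finite difference of $\hat{c}$ with $i\,\hat{c}\,(\hat{n}_{j}\cdot\vec{\sigma})$. That single-coin identity follows from Eq. (8) by commuting the Pauli generator to the right; it is used here only as a consistency check and is not a restatement of Eq. (10). All function names are illustrative.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULIS = [I2, SX, SY, SZ]

def mm(A, B):
    """2x2 matrix product."""
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

def rot(alpha, sigma):
    """exp(i * alpha * sigma) for a Pauli matrix sigma."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c * I2[i][j] + 1j * s * sigma[i][j] for j in range(2)]
            for i in range(2)]

def coin_from_alpha(alpha):
    """Eq. (8) with alpha = (alpha_0, alpha_1, alpha_2, alpha_3)."""
    a0, a1, a2, a3 = alpha
    m = mm(rot(a3, SZ), mm(rot(a2, SY), rot(a1, SX)))
    phase = cmath.exp(1j * a0)
    return [[phase * m[i][j] for j in range(2)] for i in range(2)]

def n_dot_sigma(alpha, j):
    """n_j . sigma as a 2x2 matrix, with n_j as listed after Eq. (10)."""
    _, a1, a2, _ = alpha
    n = [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, math.cos(2 * a1), math.sin(2 * a1)],
        [0, math.sin(2 * a2),
         -math.cos(2 * a2) * math.sin(2 * a1),
         math.cos(2 * a2) * math.cos(2 * a1)],
    ][j]
    return [[sum(n[k] * PAULIS[k][r][c] for k in range(4)) for c in range(2)]
            for r in range(2)]

def derivative_mismatch(alpha, j, h=1e-6):
    """Max entry of |dc/dalpha_j - i c (n_j . sigma)| via central differences."""
    plus, minus = list(alpha), list(alpha)
    plus[j] += h
    minus[j] -= h
    cp, cm = coin_from_alpha(plus), coin_from_alpha(minus)
    analytic = mm(coin_from_alpha(alpha), n_dot_sigma(alpha, j))
    return max(abs((cp[r][c] - cm[r][c]) / (2 * h) - 1j * analytic[r][c])
               for r in range(2) for c in range(2))

alpha = [0.3, -1.1, 0.7, 2.0]  # arbitrary test point
print([derivative_mismatch(alpha, j) for j in range(4)])
```

All four mismatches are expected to be at the level of finite-difference error, which supports the form of the vectors $\hat{n}_{2}$ and $\hat{n}_{3}$ given above.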

Gradient descent iteratively moves the parameters in the opposite direction of the gradient, i.e.,

$$\mbox{new }\alpha_{j}^{(x,t)}\leftarrow\mbox{old }\alpha_{j}^{(x,t)}-\eta\frac{\partial{L_{|{\Psi}\rangle}}}{\partial{\alpha_{j}^{(x,t)}}}, \tag{11}$$

where $\eta$ is a positive real number called the learning rate. Hence, the loss gradually drops during the iteration and the approximation to $\hat{V}$ by $\hat{U}_{T,0}$ becomes better and better.

The details of the algorithm to find the parameters of the DTQW-QNN $\left\{\vec{\alpha}^{(x,t)}:0\leq x<n\mbox{ and }0\leq t<T\right\}$ to approximate a desired unitary operator $\hat{V}$ are as follows.

  1. Set the depth $T$ to an appropriate positive integer and the learning rate $\eta$ to an appropriate positive real number.

  2. Randomly initialize all the parameters $\alpha^{(x,t)}_{j}$.

  3. Randomly sample a state $|{\Psi}\rangle$ from the total Hilbert space $\mathcal{H}$.

  4. Calculate the partial derivatives $\frac{\partial{L_{|{\Psi}\rangle}}}{\partial{\alpha^{(x,t)}_{j}}}$ for all $t$, $x$, $j$ according to Eq. (10).

  5. Update all the parameters according to Eq. (11).

  6. Repeat Steps 3 to 5 until an acceptable approximation is reached.
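The six steps above can be sketched as a small state-vector simulation. Several things in this sketch are assumptions made for illustration and are not taken from the original text: the gradient in step 4 is estimated by central finite differences of the loss in Eq. (9) instead of being evaluated through Eq. (10), the values of $n$, $T$, $\eta$ and the number of updates are arbitrary, the stopping test of step 6 is replaced by a fixed budget, and the basis index of $|c,x\rangle$ is taken to be $c\,n+x$.

```python
import cmath
import math
import random

PAULI = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mm(A, B):
    """2x2 matrix product."""
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

def rot(alpha, j):
    """exp(i * alpha * sigma_j) = cos(alpha) I + i sin(alpha) sigma_j."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c * PAULI[0][r][k] + 1j * s * PAULI[j][r][k] for k in range(2)]
            for r in range(2)]

def coin(alpha):
    """Single-site coin operator of Eq. (8)."""
    m = mm(rot(alpha[3], 3), mm(rot(alpha[2], 2), rot(alpha[1], 1)))
    g = cmath.exp(1j * alpha[0])
    return [[g * m[r][k] for k in range(2)] for r in range(2)]

def run_walk(params, psi, n):
    """Apply U_{T,0} to psi; params[t][x] holds alpha^{(x,t)}."""
    for layer in params:
        new = [0j] * (2 * n)
        for x in range(n):
            c = coin(layer[x])
            # coin state 0 stays at site x, coin state 1 hops to x + 1
            new[x] = c[0][0] * psi[x] + c[0][1] * psi[n + x]
            new[n + (x + 1) % n] = c[1][0] * psi[x] + c[1][1] * psi[n + x]
        psi = new
    return psi

def loss(params, psi, target, n):
    """State-wise loss of Eq. (9) for the target unitary `target`."""
    out = run_walk(params, psi, n)
    want = [sum(row[k] * psi[k] for k in range(2 * n)) for row in target]
    return 0.5 * sum(abs(o - w) ** 2 for o, w in zip(out, want))

def random_state(rng, dim):
    """Normalized vector with complex Gaussian entries."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / norm for z in v]

def sgd_step(params, psi, target, n, eta, h=1e-6):
    """Steps 4 and 5; finite differences stand in for Eq. (10)."""
    grads = []
    for layer in params:
        for alpha in layer:
            for j in range(4):
                old = alpha[j]
                alpha[j] = old + h
                up = loss(params, psi, target, n)
                alpha[j] = old - h
                down = loss(params, psi, target, n)
                alpha[j] = old
                grads.append((alpha, j, (up - down) / (2 * h)))
    for alpha, j, g in grads:
        alpha[j] -= eta * g  # update rule of Eq. (11)

rng = random.Random(1)
n, T, eta = 2, 4, 0.05  # step 1, illustrative values only
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
params = [[[rng.uniform(-math.pi, math.pi) for _ in range(4)]
           for _ in range(n)] for _ in range(T)]  # step 2
probe = random_state(rng, 2 * n)
print("probe loss before:", loss(params, probe, SWAP, n))
for _ in range(30):  # step 6, with a fixed budget
    sgd_step(params, random_state(rng, 2 * n), SWAP, n, eta)  # steps 3 to 5
print("probe loss after: ", loss(params, probe, SWAP, n))
```

The printed values depend on the seed and on the illustrative hyperparameters, so no particular numbers are claimed. Finite differences cost two loss evaluations per parameter, which is why an analytic expression evaluated from forward- and back-propagated states is attractive in practice.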

Figure 3: The circuit to calculate the gradient with a measurement on an ancillary qubit. The order of the operators applied is from top to bottom. The operator $\hat{U}_{T,0}^{(j,x,t)}$ equals $\hat{U}_{T,t}\hat{\Sigma}_{j}^{(x,t)}\hat{U}_{t,0}$. Some states during the computation with this circuit are listed on the right side. At the end, the average value $\langle\hat{\sigma}_{3}\rangle$ of the ancillary qubit is measured.

One can notice that our choice of the loss function leads to a form of the gradients, Eq. (10), that is convenient for numerical calculation. The states $\left|{\Psi^{(t)}}\right\rangle$ and $\left|{\Phi^{(t)}}\right\rangle$ can be calculated efficiently by a forward-propagation and a back-propagation. Moreover, the gradients can be calculated by implementing a circuit with the help of an ancillary qubit as shown in Fig. 3. At the end of the circuit, the average value $\langle\hat{\sigma}_{3}\rangle$ of the ancillary qubit is measured. The result $\langle\hat{\sigma}_{3}\rangle$ can be used to update the parameters of the DTQW-QNN since $\langle\hat{\sigma}_{3}\rangle$ always coincides with the partial derivative $\partial{L_{|{\Psi}\rangle}}/\partial{\alpha^{(x,t)}_{j}}$ in Eq. (10). This might enable us to implement simultaneous tomography and cloning of an unknown unitary operation.

Besides, the position space $\mathcal{H}_{p}$ is commonly much larger than the coin space $\mathcal{H}_{c}$. Theorem 1 thus indicates that one can indirectly control a large system by controlling a small two-level coin system via DTQW on a cycle graph. For example, unitary operations and general two-outcome measurements described by positive-operator-valued measures (POVMs) can be applied to the position space in this way straightforwardly according to Theorem 1. If we are only interested in the unitary operators that act on the position space $\mathcal{H}_{p}$, it suffices to allow nonidentity coin operators at a single, arbitrarily chosen site. The details are provided in Appendix D.

Numerical results

We first test our algorithm with a DTQW-QNN to learn the $\mathrm{SWAP}$ gate. Because all matrix elements of $\mathrm{SWAP}$ are either $0$ or $1$, it is visually clear whether a unitary operator is close to $\mathrm{SWAP}$ once the operator is visualized. The change of the DTQW unitary operator $\hat{U}_{T,0}$ during the training is visualized in Fig. 4. As the DTQW-QNN is trained, $\hat{U}_{T,0}$ becomes closer and closer to the desired gate $\mathrm{SWAP}$. After the training is finished, the DTQW-QNN realizes the $\mathrm{SWAP}$ gate.

Figure 4: The training of a SWAP gate is visualized; each subfigure shows the matrix elements of the unitary operator $\hat{U}_{T,0}$ represented by the DTQW-QNN in different stages of the training: (a) shows the random matrix picked up by the DTQW-QNN before the training, and (b-d) give the updated matrix after $30$, $60$, and $240$ updates to the parameters of the DTQW-QNN respectively. After $240$ updates, the DTQW-QNN represents a SWAP gate precisely. Each bar corresponds to a matrix element, which is a complex number, of the unitary operator $\hat{U}_{T,0}$. The labels on the bottom left and right corners represent, respectively, the row indices and the column indices of the matrix elements. The height of a bar represents the magnitude of the matrix element of $\hat{U}_{T,0}$, while the color of a bar represents its phase angle.

To measure how well the DTQW $\hat{U}_{T,0}$ approximates the desired unitary $\hat{V}$, we introduce the distance

$$d(\hat{U}_{T,0},\hat{V})=\sqrt{1-\left|\mathrm{tr}(\hat{U}_{T,0}\hat{V}^{\dagger})/{2n}\right|^{2}} \tag{12}$$

between the operators $\hat{U}_{T,0}$ and $\hat{V}$. The smaller this distance is, the better the DTQW approximates the desired operator.
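A direct transcription of Eq. (12) is short. In the sketch below the function name and the test matrices are illustrative choices; the only formula used is Eq. (12), with $\mathrm{tr}(\hat{U}\hat{V}^{\dagger})=\sum_{i,k}U_{ik}V_{ik}^{*}$ and $2n$ equal to the matrix dimension.

```python
import cmath
import math

def distance(U, V):
    """Eq. (12): sqrt(1 - |tr(U V^dagger) / dim|^2), with dim = 2n."""
    dim = len(U)
    tr = sum(U[i][k] * V[i][k].conjugate()
             for i in range(dim) for k in range(dim))
    # Clamp at zero so rounding error cannot produce a negative radicand.
    return math.sqrt(max(0.0, 1.0 - abs(tr / dim) ** 2))

IDENTITY = [[1 + 0j if i == k else 0j for k in range(4)] for i in range(4)]
SWAP = [[1 + 0j, 0j, 0j, 0j],
        [0j, 0j, 1 + 0j, 0j],
        [0j, 1 + 0j, 0j, 0j],
        [0j, 0j, 0j, 1 + 0j]]
PHASED = [[cmath.exp(0.7j) * z for z in row] for row in IDENTITY]

print(distance(IDENTITY, IDENTITY))  # identical operators
print(distance(PHASED, IDENTITY))    # differ only by a global phase
print(distance(IDENTITY, SWAP))      # tr = 2, so sqrt(1 - 0.25)
```

Two features are worth noting: the distance ignores a global phase between the two operators, and because of the square root, rounding errors of order $10^{-16}$ in the radicand can appear as distances of order $10^{-8}$.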

Figure 5: The evolution of the distance $d(\hat{U}_{T,0},\hat{V})$ between the DTQW unitary operator and the desired operator during the training of DTQW-QNNs. The horizontal axis represents the number of times (epochs) that the DTQW-QNN is updated. The vertical axis represents the distance. The thick dashed red line represents the average of distances calculated from 200 sampled operators $\hat{V}$ and their corresponding DTQW-QNNs, while the thin blue line represents the distance of the worst sample, i.e., the largest distance among those samples.

In order to show that the DTQW-QNN can actually approximate arbitrary unitary operators, we sample 200 desired operators $\hat{V}$ from $\mathrm{U}(4)$ according to the Haar measure and train 200 DTQW-QNNs in parallel to approximate these operators $\hat{V}$ respectively. The evolution of the distance during the training is plotted in Fig. 5. After the training, the final distance between the DTQW-QNN and the desired operator is smaller than $10^{-7}$ even for the worst case of the 200 samples. For DTQW-QNNs with different numbers $n$ of sites on the cycle, Fig. 6 shows that the average distance is also always smaller than $10^{-7}$. From Fig. 6, we can also notice that with more sites on the cycle, the training of the DTQW-QNNs is faster, i.e., fewer updates are needed.
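The text does not say how the Haar-distributed targets were generated. One standard way, shown here purely as an assumption about how such samples could be drawn, is to orthonormalize the columns of a matrix with independent complex Gaussian entries by Gram-Schmidt, which corresponds to a QR decomposition whose triangular factor has a positive diagonal.

```python
import math
import random

def haar_unitary(dim, rng):
    """Sample from U(dim) by Gram-Schmidt on complex Gaussian columns."""
    cols = []
    for _ in range(dim):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
        for q in cols:
            # Remove the component of v along the already accepted column q.
            overlap = sum(qi.conjugate() * vi for qi, vi in zip(q, v))
            v = [vi - overlap * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(abs(z) ** 2 for z in v))
        cols.append([z / norm for z in v])
    # cols[j] is the j-th column; return the matrix in row-major form.
    return [[cols[j][i] for j in range(dim)] for i in range(dim)]

def unitarity_error(U):
    """Largest entry of |U^dagger U - I|."""
    dim = len(U)
    return max(
        abs(sum(U[k][i].conjugate() * U[k][j] for k in range(dim))
            - (1 if i == j else 0))
        for i in range(dim) for j in range(dim))

rng = random.Random(0)
targets = [haar_unitary(4, rng) for _ in range(200)]  # 200 samples from U(4)
print(max(unitarity_error(V) for V in targets))
```

The seed and helper names are arbitrary; only the sample count and the group $\mathrm{U}(4)$ come from the text.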

Figure 6: The distance $d(\hat{U}_{T,0},\hat{V})$ evolves differently during the training of DTQW-QNNs with different numbers $n$ of sites on the cycle. The horizontal axis represents the number of times that the DTQW-QNN is updated. The vertical axis represents the distance $d(\hat{U}_{T,0},\hat{V})$.

Training the DTQW-QNN exhibits phenomena similar to those seen in training classical machine learning models. For example, the implicit acceleration by overparameterization arora2018on also emerges in the training of the DTQW-QNN. The implicit acceleration by overparameterization is a phenomenon where the neural network training becomes faster if more layers are added to the network. For DTQW-QNNs, more layers mean more steps of the walk, i.e., a larger depth $T$. As shown in Fig. 7, when the depth $T$ is larger, the distance drops faster during the training.

Figure 7: The distance $d(\hat{U}_{T,0},\hat{V})$ of DTQW-QNNs with larger depths $T$ drops faster during the training. The horizontal axis represents the number of times that the DTQW-QNN is updated. The vertical axis represents the distance $d(\hat{U}_{T,0},\hat{V})$.

Figure 8: The evolution of the distance $d(\hat{U}_{T,0},\hat{V})$ during the training of a DTQW-QNN with a $40$-dimensional underlying quantum system, i.e., there are $20$ sites on the cycle. The horizontal axis represents the number of times that the DTQW-QNN is updated. The vertical axis represents the distance.

To show that the algorithm also works for larger quantum systems, we apply it to a DTQW on a cycle graph with $20$ sites as a demonstration to realize the quantum Fourier transformation. As shown in Fig. 8, this DTQW-QNN with a $40$-dimensional underlying quantum system can still be trained to implement the operator we want. For the hyperparameters used to generate the numerical results throughout this work, see Appendix E.

V Making the DTQW-based neural network more friendly for implementations

In all previous parts of this work, we have assumed that the single-site coin operators $\hat{c}_{x}^{(t)}$ can take arbitrary values in $\mathrm{U}(2)$. This means that a single-site coin operator can have an arbitrary phase and an arbitrary rotation axis. However, it would be much easier to implement rotations along a fixed axis with fixed phases in laboratories. Hence, in this section, we simplify the DTQW so that it becomes easier to implement. Also, there is always noise when DTQW-QNNs are implemented in laboratories. We therefore test the DTQW-QNN in the situation where noise is present in the single-site coin operators $\hat{c}_{x}^{(t)}$. Throughout this section, the numerical demonstrations are all based on DTQWs on a cycle with two sites.

Figure 9: The evolution of the distance $d(\hat{U}_{T,0},\hat{V})$ during the training of a DTQW-QNN with different simplifications: (a) the phases of single-site coin operators are random and fixed; (b) all single-site coin operators are rotations along the $x$-axis. The horizontal axis represents the number of times that the DTQW-QNN is updated. The vertical axis represents the distance. The thick dashed red line shows the average of distances calculated from 200 sampled operators $\hat{V}$ and their corresponding DTQW-QNNs, while the thin blue line shows the largest distance among those samples.

V.1 Random fixed phases

Firstly, it can be observed that the phases $e^{i\alpha_{0}^{(x,t)}\hat{\sigma}_{0}}$ in Eq. (8) of the single-site coin operators $\hat{c}_{x}^{(t)}$ are relative phases when the terms $\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|$ are summed in Eq. (3). They are not merely a contribution to the global phase of the DTQW $\hat{U}_{T,0}$. Hence, any change in one of the phases may cause a nontrivial change in $\hat{U}_{T,0}$. This seemingly requires a tedious tuning of all the phase factors of single-site coin operators at different times $t$ and at different sites $x$ when the DTQW is implemented.

Fortunately, we find that these phase factors $e^{i\alpha_{0}^{(x,t)}\hat{\sigma}_{0}}$ actually need no adjustment. As shown in Fig. 9a, the DTQW-QNN can still approximate an arbitrary operator $\hat{V}$ via gradient descent even if all the phase factors assigned to different sites are random and fixed during the training, i.e.,

$$\hat{c}_{x}^{(t)}=e^{i\bm{a}^{(x)}}e^{i\alpha_{3}^{(x,t)}\hat{\sigma}_{3}}e^{i\alpha_{2}^{(x,t)}\hat{\sigma}_{2}}e^{i\alpha_{1}^{(x,t)}\hat{\sigma}_{1}}, \tag{13}$$

where the phases $\bm{a}^{(x)}$ are independent real random variables. This releases us from the cumbersome tuning of the phases of single-site coin operators.

V.2 Simple rotations along x-axis only

The formalism of the single-site coin operators $\hat{c}_{x}^{(t)}$ in Eq. (8) involves three consecutive rotations, namely, $e^{i\alpha_{j}^{(x,t)}\hat{\sigma}_{j}}$, $j=1,2,3$, each along a different axis. To make it easier for laboratory implementations, we simplify the single-site coin operators to be simple rotations only along the $x$-axis, i.e.,

$$\hat{c}_{x}^{(t)}=e^{i\bm{a}^{(x)}}e^{i\alpha^{(x,t)}\hat{\sigma}_{1}}, \tag{14}$$

where $\alpha^{(x,t)}$ now is merely a real parameter. In this situation the DTQW-QNN can still realize arbitrary operators via gradient descent, as indicated by Fig. 9b.

Comparing Figs. 9a and 9b, the DTQW-QNN of this section takes much longer to train than the DTQW-QNN of Sec. V.1. To find the cause, we trained 200 DTQW-QNNs to approximate 200 randomly sampled operators $\hat{V}$, one network per operator. We set a threshold of $10^{-1}$ and marked the DTQW-QNNs whose distance was still above the threshold after 200 training iterations. The phase differences $\bm{a}^{(0)}-\bm{a}^{(1)}$ of the marked DTQW-QNNs are all near $0$ or $\pm\pi$, as shown in Fig. 10. We therefore conclude that these specific phase differences make the DTQW-QNN slow to train. This result also corroborates that the phases of the single-site coin operators contribute nontrivially to the overall DTQW $\hat{U}_{T,0}$, as stated in Sec. V.1. Knowing the cause, one can simply avoid these phase differences when implementing DTQW-QNNs.

Figure 10: Distribution of the phase differences $\bm{a}^{(0)}-\bm{a}^{(1)}$. The horizontal axis indicates the phase difference. For a bar whose base spans from $a$ to $b$ on the horizontal axis, the height gives the proportion of DTQW-QNNs whose phase difference lies between $a$ and $b$. The blue and yellow striped bars together correspond to all DTQW-QNNs; the yellow striped bars represent the portion of DTQW-QNNs whose distance after training is larger than the threshold $10^{-1}$.

V.3 Noise on rotation axes

When the DTQW-QNN is implemented in a laboratory, the rotation axes of $\hat{c}_{x}^{(t)}$ cannot all be aligned perfectly with the $x$ direction. There is always noise on the rotation axes, i.e.,

\hat{c}_{x}^{(t)}=e^{i\bm{a}^{(x)}}e^{i\alpha^{(x,t)}(\hat{\bm{n}}^{(x,t)}\cdot\vec{\sigma})}, (15)

where

\hat{\bm{n}}^{(x,t)}=\begin{bmatrix}0\\ \cos{\bm{\theta}^{(x,t)}}\\ \sin{\bm{\theta}^{(x,t)}}\cos{\bm{\varphi}^{(x,t)}}\\ \sin{\bm{\theta}^{(x,t)}}\sin{\bm{\varphi}^{(x,t)}}\end{bmatrix}, (16)

and $\bm{\theta}^{(x,t)}$ and $\bm{\varphi}^{(x,t)}$ are independent real random variables. In this situation, approximations to the desired operators can still be found via gradient descent, as shown in Fig. 11.
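As an illustration of Eqs. (15) and (16), the sketch below builds a single noisy coin in NumPy. It assumes that $\vec{\sigma}=(\hat{\sigma}_{0},\hat{\sigma}_{1},\hat{\sigma}_{2},\hat{\sigma}_{3})$, so that the leading zero in Eq. (16) removes the $\hat{\sigma}_{0}$ component and only the three spatial components are needed; the function names are hypothetical. The noise distributions follow the values quoted for Fig. 11 in Appendix E.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3
PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def noisy_axis(theta, phi):
    """Spatial part of the unit vector in Eq. (16); theta = 0 gives the exact x-axis."""
    return np.array([
        np.cos(theta),
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
    ])

def noisy_coin(a, alpha, theta, phi):
    """Eq. (15), using exp(i*alpha*n.sigma) = cos(alpha)*I + i*sin(alpha)*n.sigma for unit n."""
    n_vec = noisy_axis(theta, phi)
    n_sigma = sum(n_j * s_j for n_j, s_j in zip(n_vec, PAULIS))
    return np.exp(1j * a) * (np.cos(alpha) * np.eye(2) + 1j * np.sin(alpha) * n_sigma)

rng = np.random.default_rng(0)
theta = rng.normal(0.0, 0.01)          # small tilt away from the x-axis
phi = rng.uniform(0.0, 2 * np.pi)      # direction of the tilt
coin = noisy_coin(a=0.4, alpha=1.3, theta=theta, phi=phi)
print(np.allclose(coin.conj().T @ coin, np.eye(2)))
```

The closed form for the exponential avoids a matrix-exponential routine and makes it explicit that the noisy coin remains unitary for any tilt angles.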

Figure 11: The evolution of the distance $d(\hat{U}_{T,0},\hat{V})$ during the training of a DTQW-QNN in the presence of noise. The horizontal axis shows the number of updates applied to the DTQW-QNN, and the vertical axis shows the distance. The thick dashed red line is the average distance over 100 sampled operators $\hat{V}$ and their corresponding DTQW-QNNs, while the thin blue line is the largest distance among those samples.

VI Conclusion

In conclusion, we have proposed a quantum neural network based on a simple DTQW on a cycle graph, and used the network to implement arbitrary quantum computation tasks, i.e., unitary operations on an arbitrary $N$-dimensional Hilbert space.

In order to implement an arbitrary unitary operation via a circuit model, one needs to decompose the unitary into a sequence of smaller unitary operators. However, via our DTQW-QNN, we only need to update the parameters by a learning algorithm. In other words, our model is adaptive to new tasks. With a new computational task given, our network can simply evolve according to the learning algorithm, and there is no need to decompose the desired operation into a sequence of smaller gates.

Regarding the universality of our model, we presented a specific construction realizing arbitrary two-level unitary operations on the computational basis, and proved that the DTQW-QNN is universal for all unitary operations on the overall Hilbert space of the involved quantum systems. The DTQW-QNN is not only universal for quantum computation but also universal for controlling the whole quantum system. We also provided an optimization so that the circuit depth of the DTQW-QNN need not exceed $2n^{2}-2n+1$ to realize an arbitrary unitary operator on a $2n$-dimensional Hilbert space. However, this is only a worst-case theoretical bound on the network size, derived for the purpose of the analytical proof. The appropriate number of nodes for each task may vary, and finding this number for a given task is an open question.

Our network evolves according to a learning algorithm based on gradient descent, with the loss function chosen so that the parameter updates can be calculated efficiently in a back-propagation fashion and can, in principle, be read out directly from a measurement. The algorithm performs well in updating the parameters of the neural network. We have shown good approximations of unitary operations on Hilbert spaces of up to 40 dimensions, as well as of arbitrary two-outcome POVMs. Finally, we have also simplified the DTQW-QNN in various respects. For example, the rotation gates involved in the DTQW can all be restricted to rotations about the $x$-axis. Such simplifications make the DTQW-QNN more suitable for laboratory implementation while maintaining its capability of implementing desired operations.

We have shown the capability of the DTQW-QNNs in both analytical and numerical ways. Further studies might reveal their total capacity in completing various quantum computation tasks as well as solving machine learning problems, and further experimental implementations would make them more practically useful and closer to real-life applications.

Acknowledgements.
This work is supported by the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0301701) and the National Natural Science Foundation of China (Grant No. 12175104). Part of the numerical simulations in this work uses QuTiP [24].

References

  • (1) Y. Aharonov, L. Davidovich, and N. Zagury, Phys. Rev. A 48, 1687 (1993).
  • (2) R. Portugal, Quantum Walks and Search Algorithms (Springer, New York, 2013).
  • (3) J. Kempe, Contemp. Phys. 44, 307 (2003).
  • (4) K. Manouchehri and J. Wang, Physical Implementation of Quantum Walks (Springer Berlin, Heidelberg, 2014).
  • (5) E. Farhi and H. Neven, arXiv:1802.06002 (2018).
  • (6) J. Zhao, Y.-H. Zhang, C.-P. Shao, Y.-C. Wu, G.-C. Guo, and G.-P. Guo, Phys. Rev. A 100, 012334 (2019).
  • (7) K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Phys. Rev. A 98, 032309 (2018).
  • (8) P.-L. Dallaire-Demers and N. Killoran, Phys. Rev. A 98, 012324 (2018).
  • (9) M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Phys. Rev. X 8, 021050 (2018).
  • (10) C. Zoufal, A. Lucchi, and S. Woerner, npj Quantum Information 5, 103 (2019).
  • (11) V. Dunjko and H. J. Briegel, Rep. Prog. Phys. 81, 074001 (2018).
  • (12) M. Schuld, I. Sinayskiy, and F. Petruccione, Quantum Information Processing 13, 2567 (2014).
  • (13) S. Aaronson, Nature Physics 11, 291 (2015).
  • (14) L. Gyongyosi and S. Imre, Sci. Rep. 9 (2019).
  • (15) A. M. Childs, Phys. Rev. Lett. 102, 180501 (2009).
  • (16) N. B. Lovett, S. Cooper, M. Everitt, M. Trevers, and V. Kendon, Phys. Rev. A 81, 042330 (2010).
  • (17) P. Kurzynski and A. Wojcik, Phys. Rev. Lett. 110, 200404 (2013).
  • (18) Z. Bian, J. Li, H. Qin, X. Zhan, R. Zhang, B. C. Sanders, and P. Xue, Phys. Rev. Lett. 114, 203602 (2015).
  • (19) Y.-Y. Zhao, N.-K. Yu, P. Kurzynski, G.-Y. Xiang, C.-F. Li, and G.-C. Guo, Phys. Rev. A 91, 042101 (2015).
  • (20) C. Darken, J. Chang, and J. Moody, in Neural Networks for Signal Processing II Proceedings of the 1992 IEEE Workshop, Vol. 2 (1992) pp. 3-12.
  • (21) Y. Bengio, N. Boulanger-Lewandowski, and R. Pascanu, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, New York, 2013) pp. 8624–8628.
  • (22) M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information (Cambridge University Press, New York, 2010) Chap. 4, Sec. 5, p. 189.
  • (23) S. Arora, N. Cohen, and E. Hazan, in Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 80 (PMLR, Stockholm, 2018), pp. 244-253.
  • (24) J. R. Johansson, P. D. Nation, and F. Nori, Comput Phys Commun 184, 1234 (2013).

Appendix A Proof of Theorem 1

Proof of Theorem 1.

Since every unitary operator can be decomposed into a product of two-level unitary operators [22], we only need to show that Theorem 1 holds for $\hat{V}$ of the form

\sum_{i,j=0}^{1}v_{i,j}|c_{i},x_{i}\rangle\langle c_{j},x_{j}|+\sum_{\begin{subarray}{c}e\neq(c_{0},x_{0})\\ e\neq(c_{1},x_{1})\end{subarray}}|e\rangle\langle e|. (17)

We prove this by constructing the family of single-site coin operators $\{\hat{c}_{x}^{(t)}\}$ explicitly.

If $c_{0}\neq c_{1}$, let $t_{\mbox{meet}}$ be the integer solution $t$ of

\begin{cases}x_{0}+t\delta_{c_{0}}=x_{1}+t\delta_{c_{1}}\ (\mathrm{mod}\ n)\\ 0\leq t<n\end{cases} (18)

and let $x_{\mbox{meet}}$ be $x_{0}+t_{\mbox{meet}}\delta_{c_{0}}\ (\mathrm{mod}\ n)$. The solution $t_{\mbox{meet}}$ exists and is unique since $\delta_{0}\neq\delta_{1}$ and $\gcd(|\delta_{0}-\delta_{1}|,n)=1$. Choose $T=n$ and

\hat{c}_{x}^{(t)}=\begin{cases}\sum_{i,j=0}^{1}v_{i,j}|c_{i}\rangle_{c}\langle c_{j}|&\mbox{if $t=t_{\mbox{meet}}$ and $x=x_{\mbox{meet}}$}\\ \hat{I}_{c}&\mbox{otherwise}\end{cases}. (19)

We can verify that this $T$-step quantum walk realizes the two-level unitary operator $\hat{V}$ by the following calculation:

\hat{U}_{T,0}|c,x\rangle = \hat{U}_{T,t_{m}+1}\hat{U}_{t_{m}+1,t_{m}}\hat{U}_{t_{m},0}|c,x\rangle (20)
= \hat{U}_{T,t_{m}+1}\hat{U}_{t_{m}+1,t_{m}}|c,x+t_{m}\delta_{c}\ (\mathrm{mod}\ n)\rangle (21)
= \begin{cases}\hat{U}_{T,t_{m}+1}\sum_{i=0}^{1}v_{i,0}|c_{i},x_{m}+\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{0},x_{0})$}\\ \hat{U}_{T,t_{m}+1}\sum_{i=0}^{1}v_{i,1}|c_{i},x_{m}+\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{1},x_{1})$}\\ \hat{U}_{T,t_{m}+1}|c,x+(t_{m}+1)\delta_{c}\ (\mathrm{mod}\ n)\rangle&\mbox{otherwise}\end{cases} (22)
= \begin{cases}\sum_{i=0}^{1}v_{i,0}|c_{i},x_{m}+(n-t_{m})\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{0},x_{0})$}\\ \sum_{i=0}^{1}v_{i,1}|c_{i},x_{m}+(n-t_{m})\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{1},x_{1})$}\\ |c,x+n\delta_{c}\ (\mathrm{mod}\ n)\rangle&\mbox{otherwise}\end{cases} (23)
= \begin{cases}\sum_{i=0}^{1}v_{i,0}|c_{i},x_{i}+n\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{0},x_{0})$}\\ \sum_{i=0}^{1}v_{i,1}|c_{i},x_{i}+n\delta_{c_{i}}\ (\mathrm{mod}\ n)\rangle&\mbox{if $(c,x)=(c_{1},x_{1})$}\\ |c,x\rangle&\mbox{otherwise}\end{cases} (24)
= \begin{cases}\sum_{i=0}^{1}v_{i,0}|c_{i},x_{i}\rangle&\mbox{if $(c,x)=(c_{0},x_{0})$}\\ \sum_{i=0}^{1}v_{i,1}|c_{i},x_{i}\rangle&\mbox{if $(c,x)=(c_{1},x_{1})$}\\ |c,x\rangle&\mbox{otherwise}\end{cases} (25)

where $\hat{U}_{t_{1},t_{0}}$ stands for $\mathcal{T}\prod_{t=t_{0}}^{t_{1}-1}\hat{S}\hat{C}^{(t)}$, and $t_{m}$ and $x_{m}$ stand for $t_{\mbox{meet}}$ and $x_{\mbox{meet}}$, respectively.

If $c_{0}=c_{1}$, let $t_{\mbox{meet}}$ be the unique integer solution $t$ of

\begin{cases}x_{0}+t\delta_{\tilde{c}_{0}}=x_{1}+t\delta_{\tilde{c}_{1}}\ (\mathrm{mod}\ n)\\ 0<t<n\end{cases}, (26)

where $\tilde{c}_{0}=c_{0}$ and $\tilde{c}_{1}=1-c_{1}$. Denote $x_{0}+t_{\mbox{meet}}\delta_{\tilde{c}_{0}}\ (\mathrm{mod}\ n)$ by $x_{\mbox{meet}}$. Choose $T=2n$ and

\hat{c}_{x}^{(t)}=\begin{cases}\hat{\sigma}_{x}&\mbox{if $t=0$ or $n$, and $x=x_{1}$}\\ \sum_{i,j=0}^{1}v_{i,j}|\tilde{c}_{i}\rangle_{c}\langle\tilde{c}_{j}|&\mbox{if $t=t_{\mbox{meet}}$ and $x=x_{\mbox{meet}}$}\\ \hat{I}_{c}&\mbox{otherwise}\end{cases}. (27)

It is straightforward to verify that this walk realizes the two-level unitary operator $\hat{V}$. ∎
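Both cases of the construction can be checked numerically on a small cycle. The following NumPy sketch is an illustration under stated assumptions rather than the authors' code: the basis index $c\,n+x$ for $|c,x\rangle$, the helper names, the choice $\delta_{0}=0$, $\delta_{1}=1$, and the brute-force search for $t_{\mbox{meet}}$ are all choices made here. It builds the coins of Eq. (19) or Eq. (27) and compares the resulting walk with the two-level unitary of Eq. (17).

```python
import numpy as np

def shift_operator(n, deltas):
    """S|c,x> = |c, x + delta_c mod n>, with basis index c*n + x (assumed ordering)."""
    S = np.zeros((2 * n, 2 * n), dtype=complex)
    for c in range(2):
        for x in range(n):
            S[c * n + (x + deltas[c]) % n, c * n + x] = 1.0
    return S

def coin_operator(site_coins):
    """C = sum_x c_x (tensor) |x><x| for a list of 2x2 single-site coins."""
    n = len(site_coins)
    C = np.zeros((2 * n, 2 * n), dtype=complex)
    for x, c_x in enumerate(site_coins):
        for c_out in range(2):
            for c_in in range(2):
                C[c_out * n + x, c_in * n + x] = c_x[c_out, c_in]
    return C

def block_in_coin_basis(v, order):
    """sum_{i,j} v[i,j] |order[i]><order[j]| as a 2x2 matrix."""
    coin = np.zeros((2, 2), dtype=complex)
    for i in range(2):
        for j in range(2):
            coin[order[i], order[j]] = v[i, j]
    return coin

def two_level_walk(v, c0, x0, c1, x1, n, deltas=(0, 1)):
    """Walk of Eq. (19) if c0 != c1 (T = n) or of Eq. (27) if c0 == c1 (T = 2n)."""
    flip = c0 == c1
    order = [c0, 1 - c1] if flip else [c0, c1]  # coin labels that meet
    d0, d1 = deltas[order[0]], deltas[order[1]]
    # Brute-force solution of Eq. (18) or Eq. (26).
    t_meet = next(t for t in range(n) if (x0 + t * d0 - x1 - t * d1) % n == 0)
    x_meet = (x0 + t_meet * d0) % n
    sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
    S = shift_operator(n, deltas)
    U = np.eye(2 * n, dtype=complex)
    for t in range(2 * n if flip else n):
        coins = [np.eye(2, dtype=complex) for _ in range(n)]
        if flip and t in (0, n):
            coins[x1] = sigma_x
        if t == t_meet:
            coins[x_meet] = block_in_coin_basis(v, order)
        U = S @ coin_operator(coins) @ U
    return U

def two_level_target(v, c0, x0, c1, x1, n):
    """Eq. (17): v on span{|c0,x0>, |c1,x1>}, identity elsewhere."""
    V = np.eye(2 * n, dtype=complex)
    idx = [c0 * n + x0, c1 * n + x1]
    for i in range(2):
        for j in range(2):
            V[idx[i], idx[j]] = v[i, j]
    return V

rng = np.random.default_rng(1)
v, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
cases = [(0, 2, 1, 0), (0, 0, 0, 1), (1, 0, 1, 2)]  # (c0, x0, c1, x1) on a 3-cycle
results = [
    np.allclose(two_level_walk(v, *case, n=3), two_level_target(v, *case, n=3))
    for case in cases
]
print(results)
```

The first case has $c_{0}\neq c_{1}$ and uses $T=n$ steps; the other two have $c_{0}=c_{1}$ and use $T=2n$ steps with the two $\hat{\sigma}_{x}$ flips at $x_{1}$.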

Appendix B Implementing the Fourier transformation

In this section, we demonstrate the calculation for implementing the four-by-four Fourier transformation. First, we decompose the Fourier transformation as $\mathrm{QFT}=\hat{u}_{6}\hat{u}_{5}\hat{u}_{4}\hat{u}_{3}\hat{u}_{2}\hat{u}_{1}$ [22], where

\hat{u}_{1}=\frac{1}{2}\begin{bmatrix}2&0&0&0\\ 0&2&0&0\\ 0&0&\sqrt{2}&\sqrt{2}\\ 0&0&-\sqrt{2}i&\sqrt{2}\end{bmatrix}, (28)
\hat{u}_{2}=\frac{1}{3}\begin{bmatrix}3&0&0&0\\ 0&-\sqrt{3}&-\sqrt{6}&0\\ 0&\sqrt{6}&-\sqrt{3}&0\\ 0&0&0&3\end{bmatrix}, (29)
\hat{u}_{3}=\frac{1}{4}\begin{bmatrix}4&0&0&0\\ 0&4&0&0\\ 0&0&-1+3i&\sqrt{3}(i-1)\\ 0&0&\sqrt{3}(i+1)&-1-3i\end{bmatrix}, (30)
\hat{u}_{4}=\frac{1}{2}\begin{bmatrix}1&-\sqrt{3}&0&0\\ \sqrt{3}&1&0&0\\ 0&0&2&0\\ 0&0&0&2\end{bmatrix}, (31)
\hat{u}_{5}=\frac{1}{3}\begin{bmatrix}3&0&0&0\\ 0&\sqrt{3}&-\sqrt{6}&0\\ 0&\sqrt{6}&\sqrt{3}&0\\ 0&0&0&3\end{bmatrix}, (32)
\hat{u}_{6}=\frac{1}{2}\begin{bmatrix}2&0&0&0\\ 0&2&0&0\\ 0&0&\sqrt{2}&-\sqrt{2}\\ 0&0&\sqrt{2}&\sqrt{2}\end{bmatrix}. (33)

All these $\hat{u}_{i}$ are two-level unitary operators. By comparing $\hat{u}_{i}$ with Eq. (17), we can read off $c_{0},c_{1},x_{0},x_{1}$ for each $\hat{u}_{i}$. We then substitute these values into Eqs. (18) and (19) if $c_{0}\neq c_{1}$, or into Eqs. (26) and (27) if $c_{0}=c_{1}$, to obtain the DTQW implementing each $\hat{u}_{i}$. The DTQWs for the individual $\hat{u}_{i}$ are concatenated in the temporal order of the $\hat{u}_{i}$ to form one large DTQW. In other words, the walker first walks according to the DTQW implementing $\hat{u}_{1}$; once that is finished, it continues with the DTQW implementing $\hat{u}_{2}$, then $\hat{u}_{3}$, $\hat{u}_{4}$, and so on. The single-site coin operators $\hat{c}_{x}^{(t)}$ of the combined DTQW implementing the quantum Fourier transformation are listed in the following table, where X stands for the Pauli $x$ matrix and I for the identity matrix.

Coin operators $\hat{c}_{x}^{(t)}$ for $t=0,\dots,9$ (rows: site $x$; columns: time $t$):

| $x$ \ $t$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | I | I | I | I | I | I | I | I | I | I |
| 1 | X | $\frac{\sqrt{2}}{2}\begin{pmatrix}1&-i\\ 1&1\end{pmatrix}$ | X | I | I | $-\frac{\sqrt{3}}{3}\begin{pmatrix}1&\sqrt{2}\\ -\sqrt{2}&1\end{pmatrix}$ | X | $-\frac{1}{4}\begin{pmatrix}1+3i&\sqrt{3}(1+i)\\ \sqrt{3}(1-i)&1-3i\end{pmatrix}$ | X | I |

Coin operators $\hat{c}_{x}^{(t)}$ for $t=10,\dots,19$:

| $x$ \ $t$ | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | I | $\frac{1}{2}\begin{pmatrix}1&-\sqrt{3}\\ \sqrt{3}&1\end{pmatrix}$ | I | I | I | I | I | I | I | I |
| 1 | X | I | X | I | I | $\frac{\sqrt{3}}{3}\begin{pmatrix}1&-\sqrt{2}\\ \sqrt{2}&1\end{pmatrix}$ | X | $\frac{\sqrt{2}}{2}\begin{pmatrix}1&1\\ -1&1\end{pmatrix}$ | X | I |

Appendix C Optimization of depth required

We show in this section that any unitary operator $\hat{V}\in\mathrm{U}(2n)$ can be realized with a DTQW-based neural network of depth $2n^{2}-2n+1$, by explicitly constructing the implementation.

Before the construction, we first introduce the following lemma, which makes the overall effect of our DTQW-based neural networks more transparent.

Lemma 1.

Any $\hat{V}\in\mathrm{U}(2n)$ is realizable by a $T$-step DTQW on an $n$-cycle if and only if

\left[\mathcal{T}\prod_{\tau=0}^{T-1}\left(\prod_{\xi=0}^{n-1}\hat{U}^{(\xi+\tau\delta_{0},\tau)}_{|0,\xi\rangle,|1,\xi+\tau\delta\rangle}\right)\right]\hat{V}^{\dagger}\hat{S}^{T}=\hat{I} (34)

for a family of two-level unitary operators $\left\{\hat{U}^{(\xi,\tau)}_{|0,\xi\rangle,|1,\xi+\tau\delta\rangle}\right\}$ indexed by the set $\{(\xi,\tau):0\leq\xi<n\mbox{ and }0\leq\tau<T\}$, where $\hat{U}^{(\xi,\tau)}_{|0,\xi\rangle,|1,\xi+\tau\delta\rangle}$ is a two-level unitary acting on the subspace spanned by $\{|0,\xi\rangle,|1,\xi+\tau\delta\rangle\}$, and $\delta=\delta_{0}-\delta_{1}$.

This lemma is proved by the following calculation:

\hat{U}_{T,0} = \mathcal{T}\prod_{t=0}^{T-1}\hat{S}\hat{C}^{(t)} (35)
= \mathcal{T}\prod_{t=0}^{T-1}\left[\hat{S}\cdot\prod_{x=0}^{n-1}\left(\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|+\sum_{\xi\neq x}\hat{I}_{c}\otimes|\xi\rangle\langle\xi|\right)\right] (36)
= \mathcal{T}\prod_{t=0}^{T-1}\left[\hat{S}\cdot\hat{S}^{t}\cdot\hat{S}^{-t}\prod_{x=0}^{n-1}\left(\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|+\sum_{\xi\neq x}\hat{I}_{c}\otimes|\xi\rangle\langle\xi|\right)\hat{S}^{t}\cdot\hat{S}^{-t}\right] (37)
= \mathcal{T}\prod_{t=0}^{T-1}\left[\hat{S}^{t+1}\prod_{x=0}^{n-1}\hat{S}^{-t}\left(\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|+\sum_{\xi\neq x}\hat{I}_{c}\otimes|\xi\rangle\langle\xi|\right)\hat{S}^{t}\cdot\hat{S}^{-t}\right] (38)
= \hat{S}^{T}\cdot\mathcal{T}\prod_{t=0}^{T-1}\left(\prod_{x=0}^{n-1}\hat{U}^{(x,t)}\right), (39)

where $\hat{U}^{(x,t)}=\hat{S}^{-t}\left(\hat{c}_{x}^{(t)}\otimes|x\rangle\langle x|+\sum_{\xi\neq x}\hat{I}_{c}\otimes|\xi\rangle\langle\xi|\right)\hat{S}^{t}$.

Notice that if $\xi+t\cdot\delta_{c}\neq x$,

\hat{U}^{(x,t)}|c,\xi\rangle=|c,\xi\rangle. (40)

Hence, $\hat{U}^{(x,t)}$ is a two-level unitary that can act nontrivially only on the subspace spanned by $\left\{|0,x-t\delta_{0}\rangle,|1,x-t\delta_{1}\rangle\right\}$. Thus

\hat{U}_{T,0} = \hat{S}^{T}\cdot\mathcal{T}\prod_{t=0}^{T-1}\left(\prod_{x=0}^{n-1}\hat{U}_{|0,x-t\delta_{0}\rangle,|1,x-t\delta_{1}\rangle}^{(x,t)}\right) (41)
= \hat{S}^{T}\cdot\mathcal{T}\prod_{t=0}^{T-1}\left(\prod_{\xi=0}^{n-1}\hat{U}_{|0,\xi\rangle,|1,\xi+t\delta\rangle}^{(\xi+t\delta_{0},t)}\right),

where $\delta=\delta_{0}-\delta_{1}$ and $\xi=x-t\delta_{0}$. By moving all shift operators in Eq. (4) to the far left, this lemma is proved.
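The rearrangement behind the lemma can be checked numerically for random coins. The NumPy sketch below is an illustration only: the basis index $c\,n+x$, the helper names, and the small values of $n$ and $T$ are assumptions. It compares the direct product of Eq. (35) with the shifted form of Eq. (39), taking $\hat{U}^{(x,t)}=\hat{S}^{-t}(\cdots)\hat{S}^{t}$, and also checks that each $\hat{U}^{(x,t)}$ differs from the identity only on $\{|0,x-t\delta_{0}\rangle,|1,x-t\delta_{1}\rangle\}$.

```python
import numpy as np

def shift_operator(n, deltas=(0, 1)):
    """S|c,x> = |c, x + delta_c mod n>, with basis index c*n + x (assumed ordering)."""
    S = np.zeros((2 * n, 2 * n), dtype=complex)
    for c in range(2):
        for x in range(n):
            S[c * n + (x + deltas[c]) % n, c * n + x] = 1.0
    return S

def site_factor(c_x, x, n):
    """c_x (tensor) |x><x| + sum_{xi != x} I (tensor) |xi><xi|."""
    F = np.eye(2 * n, dtype=complex)
    for c_out in range(2):
        for c_in in range(2):
            F[c_out * n + x, c_in * n + x] = c_x[c_out, c_in]
    return F

def random_unitary(rng, dim=2):
    """A random unitary from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
    return q

n, T, deltas = 3, 4, (0, 1)
rng = np.random.default_rng(2)
S = shift_operator(n, deltas)
coins = [[random_unitary(rng) for _ in range(n)] for _ in range(T)]

U_direct = np.eye(2 * n, dtype=complex)  # Eq. (35)
U_inner = np.eye(2 * n, dtype=complex)   # time-ordered product of all U^(x,t)
two_level = True
for t in range(T):
    S_t = np.linalg.matrix_power(S, t)
    C_t = np.eye(2 * n, dtype=complex)
    for x in range(n):
        F = site_factor(coins[t][x], x, n)
        C_t = F @ C_t
        U_xt = S_t.conj().T @ F @ S_t    # U^(x,t) = S^{-t} (...) S^{t}
        U_inner = U_xt @ U_inner
        # Basis states on which U^(x,t) is allowed to act nontrivially
        support = [(x - t * deltas[0]) % n, n + (x - t * deltas[1]) % n]
        rest = [k for k in range(2 * n) if k not in support]
        deviation = U_xt - np.eye(2 * n)
        two_level = bool(
            two_level
            and np.allclose(deviation[rest, :], 0)
            and np.allclose(deviation[:, rest], 0)
        )
    U_direct = S @ C_t @ U_direct

U_lemma = np.linalg.matrix_power(S, T) @ U_inner  # Eq. (39)
print(np.allclose(U_direct, U_lemma), two_level)
```

Because the factors for different sites at a fixed time commute, the order of the inner product over $x$ does not matter; only the time ordering does.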

With this lemma, we can start the construction of the implementation for an arbitrary unitary operator $\hat{V}$. Let us denote

\hat{V}_{t}=\left\{\begin{array}{ll}\hat{V}_{t}\cdot\hat{V}_{t-1}&\mbox{if $t\geq 2$}\\ \prod_{\xi=0}^{n-1}\hat{U}_{|0,\xi\rangle,|1,\xi+t\delta\rangle}^{(\xi+t\delta_{0},t)}\cdot\hat{S}^{-1}&\mbox{if $t=1$};\\ \hat{V}_{0}\cdot\hat{S}^{-1}&\mbox{if $t=0$}\end{array}\right. (42)

if $2kn\leqslant\tau<2kn+n-k$ and $\xi=(k-1)\delta\ (\mathrm{mod}\ n)$:

\hat{U}^{(\xi+\tau\delta_{0},\tau)}=\hat{U}_{x|1,\xi+\tau\delta\rangle}^{(\xi+\tau\delta_{0},\tau)}\left(\tilde{V}_{\tau}|0,k\delta\rangle\right), (43)

if $2kn\leqslant\tau<2kn+n-k-1$ and $\xi=-\delta\ (\mathrm{mod}\ n)$:

\hat{U}^{(\xi+\tau\delta_{0},\tau)}=\hat{U}_{x|0,\xi\rangle}^{(\xi+\tau\delta_{0},\tau)}\left(\tilde{V}_{\tau}|0,k\delta\rangle\right), (44)

if $2kn+n+k\leqslant\tau<2(k+1)n$ and $\xi=(k-1-t)\delta\ (\mathrm{mod}\ n)$:

\hat{U}^{(\xi+\tau\delta_{0},\tau)}=\hat{U}_{x|0,\xi\rangle}^{(\xi+\tau\delta_{0},\tau)}\left(\tilde{V}_{\tau}|1,k\delta\rangle\right), (45)

if $(2k+1)n\leqslant\tau<2(k+1)n-k$ and $\xi=k\delta\ (\mathrm{mod}\ n)$:

\hat{U}^{(\xi+\tau\delta_{0},\tau)}=\hat{U}_{x|1,\xi+\tau\delta\rangle}^{(\xi+\tau\delta_{0},\tau)}\left(\tilde{V}_{\tau}|1,k\delta\rangle\right), (46)

where $\delta=\delta_{0}-\delta_{1}$, $k=\lfloor\frac{\tau}{2n}\rfloor$, and $U_{x|\varphi\rangle}^{(\xi+\tau\delta_{0},\tau)}(|\psi\rangle)$ is any two-level unitary subject to $\left\langle\varphi\left|U_{x|\varphi\rangle}^{(\xi+\tau\delta_{0},\tau)}\right|\psi\right\rangle=0$. One can easily verify that such a $U_{x|\varphi\rangle}^{(\xi+\tau\delta_{0},\tau)}(|\psi\rangle)$ always exists as long as $|\psi\rangle=|0,\xi\rangle$ or $|1,\xi+\tau\delta\rangle$.

For induction on $\lfloor\frac{t}{2n}\rfloor$, let $t=2n^{2}-2n+1$. With $c_{x}^{(t)}={}_{p}\langle x|\hat{S}^{t}\hat{U}^{(x,t)}\hat{S}^{-t}|x\rangle_{p}$, for all $c\leqslant 1$ and all $l<\lfloor\frac{t}{2n}\rfloor$, we have

\hat{V}_{t}|c,l\delta\rangle=|c,l\delta\rangle. (47)

Appendix D Controlling large systems via DTQW-based neural network

Figure 12: The evolution of the distance between the DTQW and the desired operation during the training of DTQW-QNNs. The horizontal axis shows the number of updates applied to the DTQW-QNN, and the vertical axis shows the distance. The thick dashed red line is the average distance over 200 sampled operations and their corresponding DTQW-QNNs, while the thin blue line is the largest distance among those samples.

In Sec. IV, we mentioned the possibility of controlling a large system indirectly via the DTQW, by controlling only the $2$-level coin system. As the numerical results in Fig. 12 indicate, this is feasible when the desired operation on the position system is unitary.

Figure 13: The evolution of the distance during the training of DTQW-QNNs. The horizontal axis represents the number of times that the DTQW-QNN is updated. The vertical axis represents the distance. The thick dashed red line shows the average of distances calculated from 200 sampled operations and their corresponding DTQW-QNNs, while the thin blue line shows the largest distance among those samples.

Not only unitary operations but also more general quantum operations, such as POVM measurements, can be realized in this indirect fashion, as shown in Fig. 13. To apply gradient descent in this situation, the loss is defined as

L_{|\psi\rangle_{p}}=\frac{1}{2}\sum_{j=0}^{1}\left\lVert|\psi^{(T)}_{j}\rangle_{p}-|\phi^{(T)}_{j}\rangle_{p}\right\rVert^{2}, (48)

where $|\psi^{(T)}_{j}\rangle_{p}={}_{c}\langle j|\hat{U}_{T,0}|0\rangle_{c}|\psi\rangle_{p}$ and $|\phi^{(T)}_{j}\rangle_{p}=\hat{M}_{j}|\psi\rangle_{p}$. This loss is chosen so that the form of the partial derivatives in Eq. (10) needs no modification, i.e.,

\frac{\partial L_{|\Psi\rangle}}{\partial\alpha^{(x,t)}_{j}}=\operatorname{Im}\left(\langle\Phi^{(t)}|\hat{\Sigma}_{j}^{(x,t)}|\Psi^{(t)}\rangle\right), (49)

where $|\Psi^{(t)}\rangle=\hat{U}_{t,0}|0\rangle_{c}|\psi\rangle_{p}$, $|\Phi^{(t)}\rangle=\hat{U}_{t,0}^{\dagger}|\Phi^{(T)}\rangle$, and $|\Phi^{(T)}\rangle=\sum_{j=0}^{1}|j\rangle_{c}|\phi^{(T)}_{j}\rangle_{p}$. The distance between the two measurements $\{\hat{N}_{j}={}_{c}\langle j|\hat{U}_{T,0}|0\rangle_{c}\}_{j=0}^{1}$ and $\{\hat{M}_{j}\}_{j=0}^{1}$ in Fig. 13 is measured by

d(\{\hat{N}_{j}\}_{j=0}^{1},\{\hat{M}_{j}\}_{j=0}^{1})=\frac{1}{2n\sqrt{2}}\sum_{j=0}^{1}\sqrt{\mathrm{tr}^{2}(\hat{M}_{j}^{\dagger}\hat{M}_{j})+\mathrm{tr}^{2}(\hat{N}_{j}^{\dagger}\hat{N}_{j})-2|\mathrm{tr}(\hat{N}_{j}^{\dagger}\hat{M}_{j})|^{2}}. (50)
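The distance of Eq. (50) is straightforward to evaluate once the measurement operators are extracted from the walk unitary. The NumPy sketch below is a hedged illustration: it assumes the basis index $c\,n+x$ (so that $\hat{N}_{j}$ is a block of the first $n$ columns), reads $\mathrm{tr}^{2}(\cdot)$ as the square of the trace, takes $n$ from the operator dimension, and uses hypothetical function names.

```python
import numpy as np

def measurement_operators(U, n):
    """N_j = <j|_c U |0>_c for basis index c*n + x: blocks of the first n columns of U."""
    return [U[j * n:(j + 1) * n, :n] for j in range(2)]

def povm_distance(N_ops, M_ops):
    """Distance of Eq. (50) between two 2-outcome measurements on an n-dimensional space."""
    n = N_ops[0].shape[0]
    total = 0.0
    for N, M in zip(N_ops, M_ops):
        tr_MM = np.trace(M.conj().T @ M).real
        tr_NN = np.trace(N.conj().T @ N).real
        tr_NM = abs(np.trace(N.conj().T @ M))
        radicand = tr_MM ** 2 + tr_NN ** 2 - 2 * tr_NM ** 2
        total += np.sqrt(max(radicand, 0.0))  # clamp tiny negative rounding errors
    return total / (2 * n * np.sqrt(2))

def random_unitary(rng, dim):
    """A random unitary from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
    return q

rng = np.random.default_rng(3)
n = 4
N_ops = measurement_operators(random_unitary(rng, 2 * n), n)
M_ops = measurement_operators(random_unitary(rng, 2 * n), n)
print(povm_distance(N_ops, N_ops), povm_distance(N_ops, M_ops))
```

Since the first $n$ columns of a unitary are orthonormal, the extracted operators automatically satisfy $\sum_{j}\hat{N}_{j}^{\dagger}\hat{N}_{j}=\hat{I}$, and the distance of a measurement to itself vanishes up to rounding.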

Appendix E Meta parameters used in numerical simulation

For all numerical simulations, $\delta_{0}$ and $\delta_{1}$ of the shift operator $\hat{S}$ are set to $0$ and $1$, respectively. All real initial parameters $\alpha^{(x,t)}_{j}$ of the coin operators are sampled uniformly and independently from $[-2\pi,2\pi]$ before the training. The training sets always consist of Haar-random pure states from the appropriate Hilbert space. The desired operator $\hat{V}$, the depth $T$, the number of sites $n$ in the cycle, the learning rate $\eta$, the number $N_{\text{sample}}$ of DTQW-QNNs trained in parallel, and any other randomness involved are listed in the table below [where $\mathrm{U}(2)$ and $\mathrm{U}(4)$ are equipped with the corresponding Haar measures].

| Figure | $\hat{V}$ | $T$ | $n$ | $\eta$ | $N_{\text{sample}}$ | Other randomness |
|---|---|---|---|---|---|---|
| Fig. 4 | $\mathrm{SWAP}$ | 5 | 2 | 0.1 | / | / |
| Fig. 5 | $\mathrm{U}(2)$ | / | 2 | 0.05 | 200 for each $T$ | / |
| Fig. 6 | $\mathrm{QFT}$ | $2n^{2}$ | / | 0.05 | 200 for $n=2,3$; 50 for $n=4,5$ | / |
| Fig. 7 | $\mathrm{QFT}$ | / | 2 | 0.01 | 200 for each $T$ | / |
| Fig. 8 | $\mathrm{QFT}$ | 500 | 20 | 0.05 | 10 | / |
| Fig. 9a | $\mathrm{U}(4)$ | 20 | 2 | 0.05 | 200 | $\bm{a}^{(x)}$ uniformly sampled from $[0,2\pi]$ |
| Fig. 9b | $\mathrm{U}(4)$ | 20 | 2 | 0.05 | 200 | $\bm{a}^{(x)}$ uniformly sampled from $[0,2\pi]$ and shared by all samples |
| Fig. 10 | $\mathrm{U}(4)$ | 20 | 2 | 0.05 | 200 | $\bm{a}^{(x)}$ uniformly sampled from $[0,2\pi]$ and independent for all samples |
| Fig. 11 | $\mathrm{QFT}$ | 20 | 2 | 0.1 | 100 | $\bm{a}^{(x)}$ uniformly sampled from $[0,2\pi]$ and shared by all samples; $\bm{\theta}^{(x,t)}$ sampled from a normal distribution with standard deviation $0.01$; $\bm{\varphi}^{(x,t)}$ uniformly sampled from $[0,2\pi]$ |
| Fig. 12 | $\mathrm{U}(2)$ | 20 | 4 | 0.01 | 150 | / |
| Fig. 13 | $\mathrm{U}(4)$ | 20 | 4 | 0.01 | 150 | / |
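The sampling steps described in this appendix can be sketched as follows. This is a minimal NumPy illustration with hypothetical function names, not the code used for the figures: Haar-random states are drawn as normalized complex Gaussian vectors, Haar-random unitaries via a QR decomposition with the usual phase correction, and the number of real parameters per coin (four, for $\alpha_{0},\dots,\alpha_{3}$ of Eq. (8)) is an assumption exposed as an argument.

```python
import numpy as np

def haar_random_state(dim, rng):
    """Haar-random pure state: a normalized vector of i.i.d. complex Gaussians."""
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

def haar_random_unitary(dim, rng):
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale each column by the phase of the matching diagonal entry of r

def initial_coin_parameters(T, n, rng, params_per_coin=4):
    """Initial alpha_j^(x,t), uniform and independent on [-2*pi, 2*pi]."""
    return rng.uniform(-2 * np.pi, 2 * np.pi, size=(T, n, params_per_coin))

rng = np.random.default_rng(4)
target = haar_random_unitary(4, rng)        # e.g. a sample from U(4)
training_state = haar_random_state(4, rng)  # one training input
alphas = initial_coin_parameters(T=20, n=2, rng=rng)
print(alphas.shape, np.linalg.norm(training_state))
```

The column rescaling in `haar_random_unitary` removes the phase ambiguity of the QR decomposition, which would otherwise bias the distribution away from the Haar measure.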