This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Optimal and two-step adaptive quantum detector tomography

Shuixin Xiao xiaoshuixin@sjtu.edu.cn    Yuanlong Wang yuanlong.wang.qc@gmail.com    Daoyi Dong daoyidong@gmail.com    Jun Zhang zhangjun12@sjtu.edu.cn UMich-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai 200240, China Centre for Quantum Computation and Communication Technology (Australian Research Council), Centre for Quantum Dynamics, Griffith University, Brisbane, Queensland 4111, Australia School of Engineering and Information Technology, University of New South Wales, Canberra ACT 2600, Australia
Abstract

Quantum detector tomography is a fundamental technique for calibrating quantum devices and performing quantum engineering tasks. In this paper, we design optimal probe states for detector estimation based on the minimum upper bound of the mean squared error (UMSE) and the maximum robustness. We establish the minimum UMSE and the minimum condition number for quantum detectors and provide concrete examples that can achieve optimal detector tomography. In order to enhance the estimation precision, we also propose a two-step adaptive detector tomography algorithm to optimize the probe states adaptively based on a modified fidelity index. We present a sufficient condition on when the estimation error of our two-step strategy scales inversely proportional to the number of state copies. Moreover, the superposition of coherent states is used as probe states for quantum detector tomography and the estimation error is analyzed. Numerical results demonstrate the effectiveness of both the proposed optimal and adaptive quantum detector tomography methods.

keywords:
Quantum detector tomography, quantum system identification, adaptive estimation, quantum systems
thanks: This research was supported by the National Natural Science Foundation of China (62173229), the Australian Government via AUSMURI Grant No. AUSMURI000002 and the Australian Research Council Discovery Projects Funding Scheme under Project DP190101566. The material in this paper was partially presented at the 60th IEEE conference on Decision and Control, Austin, Texas, USA, December 13-17, 2021.

, , ,

thanks: Corresponding author.
\endNoHyper

1 Introduction

In the past decades, we have witnessed significant progress in a variety of fields of quantum science and technology, including quantum computation [1], quantum communication [2] and quantum sensing [3]. In these applications, a fundamental task is to develop efficient estimation and identification methods to acquire information of quantum states, system parameters and quantum detectors. There are three typical classes of quantum estimation and identification tasks: (i) quantum state tomography (QST) which aims to estimate unknown states [4, 5, 6]; (ii) quantum process tomography which targets in identifying parameters of evolution operators [7, 8, 9, 10, 11] (e.g., the system Hamiltonian [12, 13, 14, 15, 16, 17]); and (iii) quantum detector tomography (QDT) which aims to identify and calibrate quantum measurement devices. In this paper, we focus on QDT and aim to present optimal and adaptive strategies for enhancing the efficiency and precision of QDT.

For general QDT protocols, the first solution was proposed in [18] using maximum likelihood estimation (MLE). Subsequent works divide quantum detectors into phase-insensitive detectors and phase-sensitive detectors. Phase-insensitive detectors only have diagonal elements in the photon number basis, and therefore are relatively straightforward to be characterized using linear regression [19], function fitting [20], or convex optimization [21, 22, 23]. In experiment, a regularized least-square method was used in [24, 25] for phase-insensitive detectors. For phase-sensitive detectors, non-diagonal elements can be nonzero and they are usually more challenging to reconstruct. The work in [26, 27] formulated QDT as a convex quadratic optimization problem for this type of detectors. Ref. [28] proposed a two-stage solution with an analytical computational complexity and error upper bound. Ref. [29] further studied a binary detector tomography method with lower computational complexity by projection. Self-characterization of one-qubit QDT was proposed in [30] which does not rely on precisely calibrated probe states.

Refer to caption
Figure 1: General procedures for optimal detector tomography and adaptive detector tomography. For optimal detector tomography, we only consider Step 1 and it is non-adaptive. We focus on finding the optimal probe states. For adaptive detector tomography, we choose adaptive probe states in Step 2 based on Step 1 estimation to obtain a more accurate estimate.

In this paper, we consider optimal QDT and adaptive QDT which are applicable to both phase insensitive and phase sensitive detectors. The general tomography procedures are shown in Fig. 1. For optimal QDT, we only consider Step 1 where we obtain measurement data and then use the two-stage reconstruction algorithm in [28] to identify the unknown detectors. A natural question would be which input probe states are optimal, according to some requirements or criteria. Here we use the upper bound of the mean squared error (UMSE) [28] and the robustness described by the condition number against measurement errors as two criteria [31]. We prove that the minimum UMSE is (n1)(d4+d3d2)4N\frac{(n-1)(d^{4}+d^{3}-d^{2})}{4N} where dd is the dimension of detector matrices, nn is the number of detector matrices and NN is the resource number (i.e., number of copies of probe states). We also prove the minimum condition number is d+1\sqrt{d+1}. We then provide two examples of optimal probe states–SIC (symmetric informationally complete) states with the smallest M=d2M=d^{2} and MUB (mutually unbiased) states for M=d(d+1)M=d(d+1) where MM is the type of probe states. When restricted to product states, we prove the minimum of UMSE is 20m(n1)4N\frac{20^{m}(n-1)}{4N} and that of the condition number is 3m23^{\frac{m}{2}} where mm is the number of qubits.

Another focus we consider is to develop adaptive QDT to enhance the identification accuracy of quantum detectors. Adaptive strategies have been employed in QST and existing results show that adaptive QST has great potential to enhance the estimation precision of quantum states [32, 33, 34, 35, 36, 37]. Inspired by adaptive QST, we propose a two-step adaptive QDT as shown in Fig. 1. Step 1 is the same as optimal QDT, and adaptive probe states are chosen in Step 2 based on a rough detector estimation in Step 1. With these adaptive probe states, we can improve the infidelity from O(1/N)O(1/\sqrt{N}) to optimal value O(1/N)O(1/N). We also use the superposition of coherent states to realize QDT and use numerical examples to demonstrate the effectiveness of our optimal and adaptive QDT.

This paper is organized as follows. In Section 2, we present background knowledge and reconstruction algorithm. In Section 3, we propose optimal QDT and provide concrete examples. In Section 4, we present a two-step adaptive QDT. In Section 5, we give numerical examples of optimal and adaptive QDT. Conclusions are presented in Section 6.

Notation: For a matrix AA, A0A\geq 0 means AA is positive semidefinite. The conjugation and transpose (T)(T) of AA is AA^{\dagger}. The trace of AA is Tr(A)\operatorname{Tr}(A). The identity matrix is II. The real and complex domains are \mathbb{R} and \mathbb{C}, respectively. The tensor product is \otimes. The set of all dd-dimensional complex/real vectors is d/d\mathbb{C}^{d}/\mathbb{R}^{d}. Row and column vectors also denoted as ψ|\langle\psi| and |ψ|\psi\rangle, respectively. The Frobenius norm for matrix and 2-norm for vector are \|\cdot\|. The Kronecker delta function is δ\delta. i=1\mathrm{i}=\sqrt{-1}. The column vectorization function is vec\operatorname{vec}. The diagonal elements of a diagonal matrix diag(X)\operatorname{diag}(X) are the elements in XX and XX is a vector. The estimation of variable XX denotes X^\hat{X}. For any positive semidefinite Xd×dX_{d\times d} with spectral decomposition X=UPU,X=UPU^{\dagger}, define X\sqrt{X} or X12X^{\frac{1}{2}} as Udiag(P11,P22,,Pdd)UU\operatorname{diag}\left(\sqrt{P_{11}},\sqrt{P_{22}},\ldots,\sqrt{P_{dd}}\right)U^{\dagger}. Pauli matrices are σx=(0110)\sigma_{x}=\left(\begin{array}[]{ll}0&1\\ 1&0\end{array}\right), σy=(0ii0)\sigma_{y}=\left(\begin{array}[]{cc}0&-\mathrm{i}\\ \mathrm{i}&0\end{array}\right), σz=(1001)\sigma_{z}=\left(\begin{array}[]{cc}1&0\\ 0&-1\end{array}\right). The fidelity between quantum states ρ\rho and ρ^\hat{\rho} is Fs(ρ^,ρ)=[Trρ^ρρ^]2F_{s}(\hat{\rho},\rho)=[\operatorname{Tr}\sqrt{\sqrt{\hat{\rho}}\rho\sqrt{\hat{\rho}}}]^{2}.

2 Preliminaries

Here we present the background knowledge and briefly introduce the two-stage QDT reconstruction algorithm in [28], which will be employed as a critical part for developing optimal QDT.

2.1 Quantum state and measurement

For a dd-dimensional quantum system, its state can be described by a d×dd\times d Hermitian matrix ρ\rho, satisfying ρ0,Tr(ρ)=1\rho\geq 0,\operatorname{Tr}(\rho)=1. When ρ=|ψψ|\rho=|\psi\rangle\langle\psi| for some |ψd|\psi\rangle\in\mathbb{C}^{d}, we call ρ\rho a pure state, and its purity Tr(ρ2)\operatorname{Tr}(\rho^{2}) reaches the maximum value 11. Otherwise, ρ\rho is called a mixed state, and can be represented using pure states {|ψi}:ρ=ici|ψiψi|\left\{\left|\psi_{i}\right\rangle\right\}:\rho=\sum_{i}c_{i}\left|\psi_{i}\right\rangle\left\langle\psi_{i}\right| where cic_{i}\in\mathbb{R} and ici=1\sum_{i}c_{i}=1.

A quantum detector connects the classical and quantum world through a set of operators known as positive-operator-valued measure (POVM). A set of POVM elements is a set of Hermitian and positive semidefinite operators {Pi}\left\{P_{i}\right\}, which is the mathematical representation of quantum detectors, satisfying the completeness constraint iPi=I\sum_{i}P_{i}=I. We directly call {Pi}\left\{P_{i}\right\} a POVM element in this paper. The operators PiP_{i} may be finite or infinite dimensional in theory. When infinite dimensional, we usually truncate them at a finite dimension dd in practice. When ρ\rho is measured using {Pi}\left\{P_{i}\right\}, the probability of obtaining the ii-th result is given by the Born’s rule

pi=Tr(Piρ).p_{i}=\operatorname{Tr}\left(P_{i}\rho\right). (1)

Because of the completeness constraint, we have ipi=1\sum_{i}p_{i}=1. In practical experiments, suppose NN (called resource number) identical copies of ρ\rho are prepared and the ii-th result occurs NiN_{i} times. Then p^i=Ni/N\hat{p}_{i}=N_{i}/N is the experimental estimation of the true value pip_{i}.

2.2 Problem formulation of QDT

By applying known quantum states to an unknown quantum detector and obtaining the measurement results, one can estimate the detector, which is called quantum detector tomography (QDT). Denote the true value of the detector as {Pi}i=1n\left\{P_{i}\right\}_{i=1}^{n}. We design MM different types of quantum states (called probe states), where each state ρj\rho_{j} use the same resource number N/MN/M thus their total number of copies is NN. One can formulate the problem of QDT as [28]:

Problem 1.

Given experimental data {p^ij}\left\{\hat{p}_{ij}\right\}, solve

min{P^i}i=1ni=1nj=1M[p^ijTr(P^iρj)]2\min_{\left\{\hat{P}_{i}\right\}_{i=1}^{n}}\sum_{i=1}^{n}\sum_{j=1}^{M}\left[\hat{p}_{ij}-\operatorname{Tr}\left(\hat{P}_{i}\rho_{j}\right)\right]^{2}

such that P^i=P^i,P^i0\hat{P}_{i}=\hat{P}_{i}^{\dagger},\hat{P}_{i}\geq 0 for 1in1\leq i\leq n and i=1nP^i=I\sum_{i=1}^{n}\hat{P}_{i}=I.

There are various methods to formulate and solve the problem of QDT. For maximum likelihood estimation (MLE) [18], if large data are given, the MLE can asymptotically reach the Crame´\acute{e}r-Rao bound for parameter estimation while the computational complexity is usually high. There also exist several convex optimization approaches [21, 22, 23] which might be more efficient. However, an analytical error upper bound is missing for MLE and convex optimization approaches, making it difficult to optimize the input probe states in the general case without available prior information about the detector. Here we choose the linear regression method due to its simplicity and computational efficiency. Moreover, Ref. [28] presents the two-stage QDT solution to Problem 1, giving an analytical error upper bound in favor of optimizing general input states. We thus review this solving procedure as follows.

Let {Ωi}i=1d2\left\{\Omega_{i}\right\}_{i=1}^{d^{2}} be a complete basis set of orthonormal operators with dd-dimension. Without loss of generality, let Tr(ΩiΩj)=δij\operatorname{Tr}\left(\Omega_{i}^{\dagger}\Omega_{j}\right)=\delta_{ij}, Ωi=Ωi\Omega_{i}=\Omega_{i}^{\dagger} where Tr(Ωi)=0\operatorname{Tr}\left(\Omega_{i}\right)=0 except Ω1=I/d\Omega_{1}=I/\sqrt{d}. Then we parameterize the detector and probe states as

Pi\displaystyle P_{i} =a=1d2θiaΩa,\displaystyle=\sum_{a=1}^{d^{2}}\theta_{i}^{a}\Omega_{a}, (2)
ρj\displaystyle\rho_{j} =b=1d2ϕjbΩb,\displaystyle=\sum_{b=1}^{d^{2}}\phi_{j}^{b}\Omega_{b},

where θiaTr(PiΩa)\theta_{i}^{a}\triangleq\operatorname{Tr}\left(P_{i}\Omega_{a}\right) and ϕjbTr(ρjΩb)\phi_{j}^{b}\triangleq\operatorname{Tr}\left(\rho_{j}\Omega_{b}\right) are real. Using Born’s rule, we can obtain

pij=a=1d2ϕjaθiaϕjTθi,p_{ij}=\sum_{a=1}^{d^{2}}\phi_{j}^{a}\theta_{i}^{a}\triangleq\phi_{j}^{T}\theta_{i}, (3)

where ϕj\phi_{j}, θi\theta_{i} is the parameterization of ρj\rho_{j} and PiP_{i}, respectively. Suppose the outcome for PiP_{i} appears nijn_{ij} times, then p^ij=nij/(N/M)\hat{p}_{ij}=n_{ij}/(N/M). Denote the estimation error as eij=p^ijpije_{ij}=\hat{p}_{ij}-p_{ij}. According to the central limit theorem, eije_{ij} converges in distribution to a normal distribution with mean zero and variance (pijpij2)/(N/M)\left(p_{ij}-p_{ij}^{2}\right)/(N/M). We thus have the linear regression equation

p^ij=ϕjTθi+eij.\hat{p}_{ij}=\phi_{j}^{T}\theta_{i}+e_{ij}. (4)

Let Θ=(θ1T,θ2T,,θnT)T\Theta=\left(\theta_{1}^{T},\theta_{2}^{T},\ldots,\theta_{n}^{T}\right)^{T} be the vector of all the unknown parameters to be estimated. Collect the parameterization of the probe states as X=(ϕ1,ϕ2,,ϕM)TX=\left(\phi_{1},\phi_{2},\ldots,\phi_{M}\right)^{T}. Let 𝓎^=\hat{\mathcal{y}}= (p^11,p^12,,p^1M,p^21,p^22(\hat{p}_{11},\hat{p}_{12},\ldots,\hat{p}_{1M},\hat{p}_{21},\hat{p}_{22}, ,p^2M,,p^nM)T\ldots,\hat{p}_{2M},\ldots,\hat{p}_{nM})^{T}, 𝒳=InX\mathcal{X}=I_{n}\otimes X, =(e11,e12,,e1M\mathcal{e}=(e_{11},e_{12},\ldots,e_{1M}, e21,e22,,e2M,,enM)Te_{21},e_{22},\ldots,e_{2M},\ldots,e_{nM})^{T}, =(1,1,,1)1×nId2,𝒹d2×1=(d,0,,0)T\mathcal{H}=(1,1,\ldots,1)_{1\times n}\otimes I_{d^{2}},\mathcal{d}_{d^{2}\times 1}=(\sqrt{d},0,\ldots,0)^{T}. Then the regression equations can be rewritten in a compact form [28]:

𝓎^=𝒳Θ+,\hat{\mathcal{y}}=\mathcal{X}\Theta+\mathcal{e}, (5)

with a linear constraint

Θ=𝒹.\mathcal{H}\Theta=\mathcal{d}. (6)

Now Problem 1 can be transformed into the following equivalent form:

Problem 2.

Given experimental data 𝓎^\hat{\mathcal{y}}, solve
min{P^i}i=1n𝓎^𝒳Θ^2\min_{\left\{\hat{P}_{i}\right\}_{i=1}^{n}}\|\hat{\mathcal{y}}-\mathcal{X}\hat{\Theta}\|^{2} such that Θ^=𝒹\mathcal{H}\hat{\Theta}=\mathcal{d} and P^i0\hat{P}_{i}\geq 0 for 1in1\leq i\leq n, where Θ^\hat{\Theta} is the parametrization of {P^i}\left\{\hat{P}_{i}\right\}.

In [28], Problem 2 is split into two approximate subproblems:

Problem 2.1.\bm{1}.

Given experimental data 𝓎^\hat{\mathcal{y}}, solve min{E^i}i=1n𝓎^𝒳Θ^2\min_{\left\{\hat{E}_{i}\right\}_{i=1}^{n}}\|\hat{\mathcal{y}}-\mathcal{X}\hat{\Theta}\|^{2} such that Θ^=𝒹\mathcal{H}\hat{\Theta}=\mathcal{d} where Θ^\hat{\Theta} is the parametrization of {E^i}\left\{\hat{E}_{i}\right\}.

Problem 2.2.\bm{2}.

Given i=1nE^i=I\sum_{i=1}^{n}\hat{E}_{i}=I, solve
min{P^i}i=1ni=1nE^iP^i2\min_{\left\{\hat{P}_{i}\right\}_{i=1}^{n}}\sum_{i=1}^{n}\left\|\hat{E}_{i}-\hat{P}_{i}\right\|^{2} such that i=1nP^i=I\sum_{i=1}^{n}\hat{P}_{i}=I and P^i0\hat{P}_{i}\geq 0 for 1in1\leq i\leq n.

The reconstruction algorithm has two stages, as shown in Fig. 1. In Stage 1, Problem 2.1 is solved by Constrained Least Squares (CLS) method and the form of the solution is further simplified. In Stage 2, Problem 2.2 is solved by matrix decomposition. We briefly review the two-stage QDT reconstruction algorithm in [28] in the next subsection.

2.3 Two-stage QDT reconstruction algorithm

In Stage 1, we directly give the simplified form of the CLS solution to Problem 2.1 as

Θ^CLS=((XTX)1XT(y^11ny0)+1n𝒹(XTX)1XT(y^n1ny0)+1n𝒹),\hat{\Theta}_{\text{CLS}}=\left(\begin{array}[]{c}\left(X^{T}X\right)^{-1}X^{T}\left(\hat{y}_{1}-\frac{1}{n}y_{0}\right)+\frac{1}{n}\mathcal{d}\\ \vdots\\ \left(X^{T}X\right)^{-1}X^{T}\left(\hat{y}_{n}-\frac{1}{n}y_{0}\right)+\frac{1}{n}\mathcal{d}\end{array}\right), (7)

where y^i=(p^i1,p^i2,,p^iM)T\hat{y}_{i}=\left(\hat{p}_{i1},\hat{p}_{i2},\ldots,\hat{p}_{iM}\right)^{T} for 1in1\leq i\leq n and y0=((1,,1)1×M)T=iy^iy_{0}=\left((1,\ldots,1)_{1\times M}\right)^{T}=\sum_{i}\hat{y}_{i}. Let the ii-th block be Θ^i,CLS=(θ^i1,,θ^id2)T\hat{\Theta}_{i,\text{CLS}}=\left(\hat{\theta}_{i}^{1},\ldots,\hat{\theta}_{i}^{d^{2}}\right)^{T} and the Stage 1 estimate is E^i=a=1d2θ^iaΩa\hat{E}_{i}=\sum_{a=1}^{d^{2}}\hat{\theta}_{i}^{a}\Omega_{a} which may have negative eigenvalues. Thus, Stage 2 is designed to obtain a physical estimate Pi^\hat{P_{i}} by eigenvalue correction with three substages. Firstly, we decompose E^i=F^iG^i\hat{E}_{i}=\hat{F}_{i}-\hat{G}_{i} with F^i0,G^i0\hat{F}_{i}\geq 0,\hat{G}_{i}\geq 0. We then perform a spectral decomposition to obtain E^i=R^iK^iR^i\hat{E}_{i}=\hat{R}_{i}\hat{K}_{i}\hat{R}_{i}^{\dagger}. Assume there are n^i\hat{n}_{i} nonpositive eigenvalues for E^i\hat{E}_{i}, and they are in decreasing order. Thus, the best decomposition in the sense of minimizing G^i\|\hat{G}_{i}\| is F^i=R^idiag((K^i)11,(K^i)22,,(K^i)(dn^i)(dn^i),0,,0)R^i\hat{F}_{i}=\hat{R}_{i}\text{diag}((\hat{K}_{i})_{11},(\hat{K}_{i})_{22},...,(\hat{K}_{i})_{(d-\hat{n}_{i})(d-\hat{n}_{i})},0,...,0)\hat{R}_{i}^{\dagger}, and G^i=R^idiag(0,,0,(K^i)(dn^i+1)(dn^i+1)\hat{G}_{i}=-\hat{R}_{i}\text{diag}(0,...,0,(\hat{K}_{i})_{(d-\hat{n}_{i}+1)(d-\hat{n}_{i}+1)},
(K^i)(dn^i+2)(dn^i+2),,(K^i)dd)R^i(\hat{K}_{i})_{(d-\hat{n}_{i}+2)(d-\hat{n}_{i}+2)},...,(\hat{K}_{i})_{dd})\hat{R}_{i}^{\dagger}. Secondly, we apply decomposition I+iG^i=iF^i=C^C^I+\sum_{i}\hat{G}_{i}=\sum_{i}\hat{F}_{i}=\hat{C}\hat{C}^{\dagger}, and hence iC^1F^iC^=I\sum_{i}\hat{C}^{-1}\hat{F}_{i}\hat{C}^{-\dagger}=I. Let B^i=C^1F^iC^\hat{B}_{i}=\hat{C}^{-1}\hat{F}_{i}\hat{C}^{-\dagger} which is positive semidefinite and iB^i=I\sum_{i}\hat{B}_{i}=I. For any unitary U^\hat{U}, C^C^=C^U^U^C^\hat{C}\hat{C}^{\dagger}=\hat{C}\hat{U}\hat{U}^{\dagger}\hat{C}^{\dagger} holds. Therefore, U^B^iU^\hat{U}^{\dagger}\hat{B}_{i}\hat{U} can also be an estimate of the detector. To neutralize the effect of U^C^1()(C^)1U^\hat{U}^{\dagger}\hat{C}^{-1}(\cdot)(\hat{C}^{\dagger})^{-1}\hat{U} on F^i\hat{F}_{i}, we define an optimal unitary matrix U^op\hat{U}_{\text{op}} to minimize C^U^opI\|\hat{C}\hat{U}_{\text{op}}-I\| [28] and its analytical expression is U^op=C^C^C^1\hat{U}_{\text{op}}=\sqrt{\hat{C}^{\dagger}\hat{C}}\hat{C}^{-1}. Therefore, the final estimate is P^i=U^opB^iU^op\hat{P}_{i}=\hat{U}_{\text{op}}^{\dagger}\hat{B}_{i}\hat{U}_{\text{op}}. This general algorithm can be used for arbitrary n2n\geq 2.

3 Optimal quantum detector tomography

In detector tomography as shown in Step 1 in Fig. 1, a natural way to reduce the tomography error is to carefully choose the probe states according to a certain optimality criterion, which should be independent of the detector. In this section, we propose a criterion based on two non-conflicting indices (UMSE and condition number), subsequently present the conditions on achieving optimal detector tomography and provide illustrative examples.

3.1 Optimality criterion

One index to evaluate probe states is to score their worst performance, i.e., their upper bounds of estimation errors. Let 𝔼\mathbb{E} denote the expectation w.r.t. all possible measurement results. The Stage 1 error E^iPi\left\|\hat{E}_{i}-P_{i}\right\| between the estimate E^i\hat{E}_{i} and its true value PiP_{i} is bounded by UMSE (upper bound of the mean squared error ) [28]

𝔼(iE^iPi2)\displaystyle\mathbb{E}\left(\sum_{i}\left\|\hat{E}_{i}-P_{i}\right\|^{2}\right) =𝔼(Θ^CLSΘ2)\displaystyle=\mathbb{E}\left(\left\|\hat{\Theta}_{\text{CLS}}-\Theta\right\|^{2}\right) (8)
(n1)M4NTr[(XTX)1],\displaystyle\leq\frac{(n-1)M}{4N}\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]},

and the final estimation error for all detectors
𝔼(ΣiP^iPi2)\mathbb{E}\left(\Sigma_{i}\left\|\hat{P}_{i}-P_{i}\right\|^{2}\right) is bounded by [28]

𝔼(iP^iPi2)\displaystyle\mathbb{E}\left(\sum_{i}\left\|\hat{P}_{i}-P_{i}\right\|^{2}\right) (9)
\displaystyle\leq (dn+2dn+1)(n1)M4NTr[(XTX)1]+o(1N).\displaystyle\frac{(dn+2\sqrt{d}n+1)(n-1)M}{4N}\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]}+o\left(\frac{1}{N}\right).

Since in (8) and (9) the parts dependent on probe states are both MTr[(XTX)1]M\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]}, we take it as our first criterion. For general QDT, we have no special prior knowledge on the detectors, and the following conditions are equivalent: (i) the detector can not be uniquely identified; (ii) the probe states do not span the space of all dd-dimensional states; (iii) XTXX^{T}X is singular; (iv) the UMSE is infinite. From (i) and (iv), it is thus reasonable to take MTr(XTX)1M\operatorname{Tr}(X^{T}X)^{-1} as an evaluation index. We require the optimal probe states to minimize MTr(XTX)1M\operatorname{Tr}(X^{T}X)^{-1}. This criterion is conservative and its limitation is that the UMSE might not be tight. The relation (8) is only tight for the case P1=P2=I/2P_{1}=P_{2}=I/2 and (9) is always loose. Therefore, minimizing UMSE is not equivalent to minimizing MSE. In addition, UMSE depends on the assumption that there only exists statistics error in the measurement data from finite state copies (discussed in Remark 4). Despite these limitations, numerical results in [28] (e.g., Fig. 4 therein) indicate a very similar behavior between UMSE and MSE when changing the probe states while maintaining the other parameters nn, MM, NN fixed. Hence, UMSE is also a useful index evaluating the probe state set.

The second index we consider is, for a set of probe states, how robust the generated estimation result is w.r.t. measurement noise. Note that our CLS estimation (7) is in fact equivalent to the least squares estimation of the linear regression problem

(y^11ny0y^n1ny0)=(InX)(Θ11n𝒹Θn1n𝒹).\left(\begin{array}[]{c}\hat{y}_{1}-\frac{1}{n}y_{0}\\ \vdots\\ \hat{y}_{n}-\frac{1}{n}y_{0}\end{array}\right)=(\begin{array}[]{l}I_{n}\otimes X\end{array})\left(\begin{array}[]{c}\Theta_{1}-\frac{1}{n}\mathcal{d}\\ \vdots\\ \Theta_{n}-\frac{1}{n}\mathcal{d}\end{array}\right). (10)

Typically, the sensitivity of a linear system solution to perturbations in the data is evaluated by the condition number of the coefficient matrix, defined (among several possible choices) in this paper as cond(A)=σmax(A)σmin(A)\operatorname{cond}(A)=\frac{\sigma_{\max}(A)}{\sigma_{\min}(A)}, where σmax/min(A)\sigma_{\max/\min}(A) is the maximum/minimum singular value of AA. Hence, to maximize the estimation robustness w.r.t. measurement noise amounts to minimizing the condition number in (10) cond(InX)=cond(In)cond(X)=cond(X)\operatorname{cond}(I_{n}\otimes X)=\operatorname{cond}(I_{n})\operatorname{cond}(X)=\operatorname{cond}(X). Therefore, the second evaluation index is chosen as cond(X)\operatorname{cond}(X). To sum up, the optimal probe states should minimize MTr(XTX)1M\operatorname{Tr}(X^{T}X)^{-1} and minimize cond(X)\operatorname{cond}(X) simultaneously. In the following we give specific characterization and show that the two optimal indices can be achieved simultaneously for several examples.

3.2 Optimal probe states

We now give the condition on optimal probe states (OPS).

Theorem 1.

For a dd-dimensional detector with nn matrices, assume each type of probe states has the same number of copies, altogether summed to NN copies. Then the minimum of UMSE is (n1)(d4+d3d2)4N\frac{(n-1)(d^{4}+d^{3}-d^{2})}{4N} and the minimum of cond(X)\operatorname{cond}(X) is d+1\sqrt{d+1}. These minima are achieved simultaneously if and only if there exist MM different types of probe states such that XTXX^{T}X is diagonal and its eigenvalues are λ1=Md\lambda_{1}=\frac{M}{d} and λ2==λd2=Md(d+1)\lambda_{2}=\dots=\lambda_{d^{2}}=\frac{M}{d(d+1)}.

{pf}

We assume that Md2M\geq d^{2} different types of probe states are used, since clearly M<d2M<d^{2} is not optimal. Denote the eigenvalues of XTXX^{T}X as λ1λ2λd2\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{d^{2}}. For minimizing UMSE, it can be formulated as

min i=1d2Mλi\displaystyle\sum_{i=1}^{d^{2}}\frac{M}{\lambda_{i}} (11)
s.t. i=1d2λiM,λ1Md.\displaystyle\sum_{i=1}^{d^{2}}\lambda_{i}\leq M,\lambda_{1}\geq\frac{M}{d}.

For maximizing robustness, it can be formulated as

min λ1λd2\displaystyle\sqrt{\frac{\lambda_{1}}{\lambda_{d^{2}}}} (12)
s.t. i=1d2λiM,λ1Md.\displaystyle\sum_{i=1}^{d^{2}}\lambda_{i}\leq M,\lambda_{1}\geq\frac{M}{d}.

The constraint i=1d2λiM\sum_{i=1}^{d^{2}}\lambda_{i}\leq M comes from the purity requirement. For the parameterization of each probe state, we have

ϕj2\displaystyle\left\|\phi_{j}\right\|^{2} =b=1d2|ϕjb|2\displaystyle=\sum_{b=1}^{d^{2}}\left|\phi_{j}^{b}\right|^{2} (13)
=Tr[(b=1d2ϕjbΩb)(b=1d2ϕjbΩb)]\displaystyle=\operatorname{Tr}\left[\left(\sum_{b=1}^{d^{2}}\phi_{j}^{b}\Omega_{b}\right)^{\dagger}\left(\sum_{b=1}^{d^{2}}\phi_{j}^{b}\Omega_{b}\right)\right]
=Tr(ρjρj)=Tr(ρj2)1.\displaystyle=\operatorname{Tr}\left(\rho_{j}^{\dagger}\rho_{j}\right)=\operatorname{Tr}\left(\rho_{j}^{2}\right)\leq 1.

Thus for the parameterization matrix of probe states, we have

i=1d2λi\displaystyle\sum_{i=1}^{d^{2}}\lambda_{i} =Tr(XTX)=X2=j=1Mϕj2\displaystyle=\operatorname{Tr}\left(X^{T}X\right)=\|X\|^{2}=\sum_{j=1}^{M}\left\|\phi_{j}\right\|^{2} (14)
=j=1MTr(ρj2)M.\displaystyle=\sum_{j=1}^{M}\operatorname{Tr}\left(\rho_{j}^{2}\right)\leq M.

The second constraint λ1Md\lambda_{1}\geq\frac{M}{d} is from the matrix element satisfying

(XTX)11=j=1M(ϕj1)2=M(1d)2=Md\left(X^{T}X\right)_{11}=\sum_{j=1}^{M}\left(\phi_{j}^{1}\right)^{2}=M\left(\frac{1}{\sqrt{d}}\right)^{2}=\frac{M}{d} (15)

in our chosen basis {Ωi}i=1d2\left\{\Omega_{i}\right\}_{i=1}^{d^{2}}.

For (11) and (12), using Lagrange multiplier method as in [4], the minimum of i=1d2Mλi\sum_{i=1}^{d^{2}}\frac{M}{\lambda_{i}} is d4+d3d2d^{4}+d^{3}-d^{2}, achieved when λ1=Md\lambda_{1}=\frac{M}{d} and λ2==λd2=Md(d+1)\lambda_{2}=\dots=\lambda_{d^{2}}=\frac{M}{d(d+1)}. Hence, the minimum of UMSE is

(n1)M4NTr[(XTX)1]\displaystyle\quad\frac{(n-1)M}{4N}\operatorname{Tr}\left[\left(X^{T}X\right)^{-1}\right] (16)
(n1)M4N[dM+d(d+1)M(d21)]\displaystyle\geq\frac{(n-1)M}{4N}\left[\frac{d}{M}+\frac{d(d+1)}{M}(d^{2}-1)\right]
=(n1)(d4+d3d2)4NO(nd4N),\displaystyle=\frac{(n-1)(d^{4}+d^{3}-d^{2})}{4N}\sim O\left(\frac{nd^{4}}{N}\right),

which does not rely on the type number of probe states MM. The minimum of λ1λd2\sqrt{\frac{\lambda_{1}}{\lambda_{d^{2}}}} is d+1\sqrt{d+1} because λ1Md\lambda_{1}\geq\frac{M}{d} and λd2Md(d+1)\lambda_{d^{2}}\leq\frac{M}{d(d+1)}. These minima can be achieved at the same time if and only if λ1=Md\lambda_{1}=\frac{M}{d} and λ2==λd2=Md(d+1)\lambda_{2}=\dots=\lambda_{d^{2}}=\frac{M}{d(d+1)}.

Now, we show that XTXX^{T}X is a diagonal matrix. Assume XTX=Vdiag(Md,Md(d+1)Id1)VTX^{T}X=V\operatorname{diag}\left(\frac{M}{d},\frac{M}{d(d+1)}I_{d-1}\right)V^{T}. For (XTX)11\left(X^{T}X\right)_{11}, we have

Md\displaystyle\frac{M}{d} =(XTX)11\displaystyle=\left(X^{T}X\right)_{11} (17)
=vTVdiag(Md,Md(d+1)Id1)VTv\displaystyle=v^{T}V\operatorname{diag}\left(\frac{M}{d},\frac{M}{d(d+1)}I_{d-1}\right)V^{T}v
=Md|V11|2+Md(d+1)i=2d|V1i|2\displaystyle=\frac{M}{d}\left|V_{11}\right|^{2}+\frac{M}{d(d+1)}\sum_{i=2}^{d}\left|V_{1i}\right|^{2}
=Md(d+1)+Md+1|V11|2\displaystyle=\frac{M}{d(d+1)}+\frac{M}{d+1}\left|V_{11}\right|^{2}

where v=[1,0,,0]Tv=\left[1,0,\cdots,0\right]^{T}. Thus, V11=1V_{11}=1 and V=diag(1,V~d1)V=\operatorname{diag}(1,\tilde{V}_{d-1}) where V~d1\tilde{V}_{d-1} is a d1d-1 dimensional orthogonal matrix. Therefore, we have

XTX\displaystyle\quad X^{T}X (18)
=diag(1,V~d1)diag(Md,Md(d+1)Id1)diag(1,V~d1T)\displaystyle=\operatorname{diag}\left(1,\tilde{V}_{d-1}\right)\operatorname{diag}\left(\frac{M}{d},\frac{M}{d(d+1)}I_{d-1}\right)\operatorname{diag}\left(1,\tilde{V}_{d-1}^{T}\right)
=diag(Md,Md(d+1)Id1),\displaystyle=\operatorname{diag}\left(\frac{M}{d},\frac{M}{d(d+1)}I_{d-1}\right),

which is a diagonal matrix. \Box

Our results also show that the description of OPS in Assumption 11 in [28] needs to be more precise. For certain MM, if there exist OPS, all of them must be pure states because j=1MTr(ρj2)=j=1d2λi=M\sum_{j=1}^{M}\operatorname{Tr}(\rho_{j}^{2})=\sum_{j=1}^{d^{2}}\lambda_{i}=M from (14). This indicates that pure states are better than mixed states. However, it is not clear whether these OPS exist for arbitrary Md2M\geq d^{2}. Thus, whether we can always find a pure state set which is better than arbitrary mixed state set for given MM is still an open problem.

Remark 1.

In system identification, the similar problem called input design problem has been widely discussed. There are many existing results, e.g., D,A,E-optimal input design [38]. The common idea behind the problem of our optimal probe states and optimal input design is that both problems consider minimizing the trace of the covariance matrix, which is A-optimal input design [38]. The difference is that we also consider robustness and other physical eigenvalue constraints (e.g., purity) for probe states. Thus, we cannot directly adapt classical input design problem for the quantum case.

Remark 2.

If we only want to reach the minimum condition number d+1\sqrt{d+1} without considering UMSE, we need to ensure λ1λd2=d+1\frac{\lambda_{1}}{\lambda_{d^{2}}}=d+1 which can be satisfied even for mixed states. For example, for a concentric sphere inside the Bloch sphere, we can also find the corresponding platonic solid on it which has the smallest condition number d+1\sqrt{d+1} (this case will be discussed later). Therefore, if we only consider minimum condition number, we cannot obtain the minimum UMSE. However, if we only consider minimum UMSE, the eigenvalues also satisfy the requirement of minimum condition number. Hence, when we solve the optimization problem and analyze optimal probe states, it suffices to only consider minimum UMSE.

Remark 3.

In quantum state tomography, the condition number for the optimal measurement is 11, achieved by special measurement such as the optimal generalized Pauli operators [39]. However, in QDT, OPS need to satisfy the unit-trace constraint. Therefore, the largest eigenvalue λ1\lambda_{1} must be equal to or larger than Md\frac{M}{d} and the minimum condition number is d+1\sqrt{d+1}.

Remark 4.

We consider two criteria for optimization–upper bound of the mean squared error (UMSE) and the robustness described by the condition number. For UMSE, we assume there only exists statistics error from finite state copies and the analytical upper bound depends on this assumption. If this assumption is not satisfied, the UMSE should be adjusted by adding the unmodeled noise (e.g., apparatus noise) to the error \mathcal{e} in (5). For robustness, condition number characterizes the sensitivity of the estimation result to errors in measurement data. Hence, this criterion is unrelated to the specific sources of the errors.

We then give two examples of OPS for M=d2M=d^{2} and M=d(d+1)M=d(d+1) which are motivated by projection measurements.

The first example is motivated from SIC-POVM. To reconstruct an unknown quantum state ρ\rho, a generalized measurement must have at least d2d^{2} linear independent elements, which is called informationally complete. Furthermore, if the measurement results are maximally independent, the POVM is called symmetric informationally complete POVM (SIC-POVM). The simplest mathematical definition of an SIC-POVM is a set of d2d^{2} normalized vectors |ϕk\left|\phi_{k}\right\rangle in d\mathbb{C}^{d} satisfying [40]

|ϕj|ϕk|2=1d+1,jk.\left|\left\langle\phi_{j}|\phi_{k}\right\rangle\right|^{2}=\frac{1}{d+1},\quad j\neq k. (19)

It has been conjectured that SIC-POVMs exist for all dimensions [40] and their existence has been given for d121d\leq 121, and some other sporadic values [41]. When SIC-POVMs exist in dd dimension, we define the d2d^{2} pure states |ϕkd(1kd2)|\phi_{k}\rangle\in\mathbb{C}^{d}(1\leq k\leq d^{2}) to be SIC states if they satisfy (19).

Proposition 3.2.

dd-dimensional SIC states (when they exist) are a set of OPS with the smallest MM as M=d2M=d^{2}.

{pf}

Note that choosing different orthogonal bases does not change the inner product between vectors. Therefore, for parameterization of the SIC states, we have

(XXT)ij={1,i=j1d+1,ij\left(XX^{T}\right)_{ij}=\left\{\begin{array}[]{c}1,i=j\\ \frac{1}{d+1},i\neq j\end{array}\right. (20)

and M=d2M=d^{2}. We assume the standard singular value decomposition of XX is X=UΣVTX=U\Sigma V^{T} where Σ\Sigma is a diagonal matrix, and UU and VV are unitary matrices. According to (20), we have

XXT=(11d+11d+11)=UΣ2UT.XX^{T}=\left(\begin{array}[]{ccc}1&\cdots&\frac{1}{d+1}\\ \vdots&\ddots&\vdots\\ \frac{1}{d+1}&\cdots&1\end{array}\right)=U\Sigma^{2}U^{T}. (21)

Thus, Σ=diag(d,dd+1Id21)\Sigma=\operatorname{diag}(d,{\frac{d}{d+1}I_{d^{2}-1}}) and

XTX\displaystyle X^{T}X =VΣUTUΣVT\displaystyle=V\Sigma U^{T}U\Sigma V^{T} (22)
=Vdiag(d,dd+1Id21)VT.\displaystyle=V\operatorname{diag}(d,{\frac{d}{d+1}I_{d^{2}-1}})V^{T}.

The largest eigenvalue of XTXX^{T}X is dd and the other eigenvalues are dd+1\frac{d}{d+1}, which satisfies Theorem 1. Therefore, SIC states are OPS. Also, among all dd-dimensional OPS, SIC states have the smallest MM as M=d2M=d^{2}. If M<d2M<d^{2}, it is not informationally complete, and the detector cannot be uniquely determined without other prior knowledge. \Box The second example is motivated from MUB measurement. Two sets of orthogonal bases k={|ψik:i=1,\mathcal{B}^{k}=\{\left|\psi_{i}^{k}\right\rangle:i=1,\ldots,d},d\} and ={|ψj:j=1,,d}\mathcal{B}^{\ell}=\left\{\left|\psi_{j}^{\ell}\right\rangle:j=1,\ldots,d\right\} are called mutually unbiased if and only if [42]

|ψik|ψj|2={1/d for k,δi,j for k=.\left|\left\langle\psi_{i}^{k}|\psi_{j}^{\ell}\right\rangle\right|^{2}=\left\{\begin{array}[]{ll}1/d&\text{ for }k\neq\ell,\\ \delta_{i,j}&\text{ for }k=\ell.\end{array}\right. (23)

In particular, one can find maximally d+1d+1 sets of mutually unbiased bases in Hilbert spaces of prime-power dimension d=pk,d=p^{k}, with pp a prime and k{k} a positive integer [43]. When d+1d+1 sets of MUB measurement exist in d\mathbb{C}^{d}, we view each projective MUB measurement operator as a pure state and we call them MUB states. Thus, MUB states always exist for mm-qubit systems (d=2m)(d=2^{m}).

Proposition 3.3.

dd-dimensional MUB states (when they exist) are a set of OPS for M=d(d+1)M=d(d+1).

{pf}

For MUB states in (23), we have M=d(d+1)M=d(d+1). For their parameterization, we have

(XXT)ij={δij,kd+1i,j(k+1)d,0kd,1d, otherwise. \left(XX^{T}\right)_{ij}=\left\{\begin{array}[]{c}\delta_{ij},kd+1\leq i,j\leq(k+1)d,0\leq k\leq d,\\ \frac{1}{d},\text{ otherwise. }\end{array}\right. (24)

We assume the standard singular value decomposition of XX is X=UΣVTX=U\Sigma V^{T}. Thus, we have

XXT=(Id(1d)d(1d)d(1d)dId(1d)d(1d)d(1d)dId)=UΣΣTUT,XX^{T}=\left(\begin{array}[]{cccc}I_{d}&\left(\frac{1}{d}\right)_{d}&\cdots&\left(\frac{1}{d}\right)_{d}\\ \left(\frac{1}{d}\right)_{d}&I_{d}&\cdots&\left(\frac{1}{d}\right)_{d}\\ \vdots&\vdots&\vdots&\vdots\\ \left(\frac{1}{d}\right)_{d}&\left(\frac{1}{d}\right)_{d}&\cdots&I_{d}\end{array}\right)=U\Sigma\Sigma^{T}U^{T}, (25)

where (1d)d\left(\frac{1}{d}\right)_{d} denotes a d×dd\times d matrix and all the elements are 1d\frac{1}{d}. We have

XTX\displaystyle X^{T}X =VΣTUTUΣVT\displaystyle=V\Sigma^{T}U^{T}U\Sigma V^{T} (26)
=Vdiag(d+1,Id21)VT,\displaystyle=V\operatorname{diag}(d+1,I_{d^{2}-1})V^{T},

where the largest eigenvalue is d+1d+1 and the remaining d21d^{2}-1 eigenvalues are 11. They are OPS because their eigenvalues satisfy Theorem 1. \Box For Md2M\neq d^{2} and Md(d+1)M\neq d(d+1), we leave it an open problem when there always exist OPS satisfying the two indices simultaneously. For two-qubit detectors, from the above results we know the optimal probe states can be constructed using 44-dimensional MUB states and SIC states as shown in Appendix A. A similar two-qubit problem for QST was discussed in [4] to determine the optimal measurement based on UMSE and in [39] based on condition number.

For one-qubit probe states, they have simple geometric property; i.e., they are in the Bloch sphere. The pure states are on the surface and the mixed states are inside the sphere. Hence, an alternative method to search for one-qubit OPS is based on geometric symmetry. For M=4M=4, when the probe states are on the surface of Bloch sphere and become the four vertices of a tetrahedron concentric with Bloch sphere, they are OPS. In fact, they are also SIC states. For M=6M=6, one can construct OPS similarly using octahedron, and they are also MUB states. We also have cube, icosahedron, dodecahedron for M=8,12,20M=8,12,20, respectively, which are OPS. We conjecture all one-qubit OPS are constructed from the five platonic solids on the Bloch sphere in this way, and we show there do not exist OPS for M=5M=5 in Appendix B. For multi-qubit states, we still do not fully know the geometric property and it is thus an open problem to find other optimal probe states.

3.3 Product probe state

In experiment, product states are among the ones most straightforwardly to be implemented. In this subsection we consider the case d=2md=2^{m} for some integer mm and each probe state is an mm-qubit tensor product state as ρj=ρj1(1)ρjm(m)\rho_{j}=\rho^{(1)}_{j_{1}}\otimes\cdots\otimes\rho^{(m)}_{j_{m}} where 1j1M1,,1jmMm{1\leq j_{1}\leq M_{1},\cdots,1\leq j_{m}\leq M_{m}} and there are MiM_{i} different types of one-qubits for the ii-th qubit of the product states. The total number of these mm-qubit tensor product states is thus i=1mMi\prod_{i=1}^{m}{M_{i}}. We then give their optimal UMSE and condition number.

Theorem 3.4.

The minimum UMSE of mm-qubit product probe state is 20m(n1)4N\frac{20^{m}(n-1)}{4N} and the minimum condition number is 3m\sqrt{3^{m}}. These minima are achieved simultaneously if and only if each qubit is among optimal one-qubit states.

{pf}

Assume the parameterization of all the ii-th single-qubit states {ρji(i)}ji=1Mi\left\{\rho_{j_{i}}^{(i)}\right\}_{j_{i}=1}^{M_{i}} is XiX_{i}. We thus have

XTX\displaystyle X^{T}X =(X1X2Xm)T(X1X2Xm)\displaystyle=\left(X_{1}\otimes X_{2}\cdots\otimes X_{m}\right)^{T}\left(X_{1}\otimes X_{2}\cdots\otimes X_{m}\right) (27)
=X1TX1XmTXm,\displaystyle=X_{1}^{T}X_{1}\otimes\cdots\otimes X_{m}^{T}X_{m},

and therefore,

Tr[(XTX)1]\displaystyle\operatorname{Tr}\left[\left(X^{T}X\right)^{-1}\right] =i=1mTr[(XiTXi)1]i=1m20Mi,\displaystyle=\prod_{i=1}^{m}\operatorname{Tr}\left[\left(X_{i}^{T}X_{i}\right)^{-1}\right]\geq\prod_{i=1}^{m}\frac{20}{M_{i}}, (28)

where 2020 comes from d4+d3d2d^{4}+d^{3}-d^{2} for d=2d=2. Thus, the UMSE is (n1)M4NTr[(XTX)1]20m(n1)4N\frac{(n-1)M}{4N}\operatorname{Tr}\left[\left(X^{T}X\right)^{-1}\right]\geq\frac{20^{m}(n-1)}{4N}. For condition number,

cond(X)=i=1mcond(Xi)i=1m3=3m,\operatorname{cond}(X)=\prod_{i=1}^{m}\operatorname{cond}\left(X_{i}\right)\geq\prod_{i=1}^{m}\sqrt{3}=\sqrt{3^{m}}, (29)

where 33 comes from d+1d+1 for d=2d=2. We can reach these minima if and only if for each ii, {ρji(i)}ji=1Mi\left\{\rho_{j_{i}}^{(i)}\right\}_{j_{i}=1}^{M_{i}} is an OPS set. \Box

Here we compare the two criteria–UMSE and condition number between OPS and optimal product states for an mm-qubit detector with the dimension d=2md=2^{m}. The UMSEs are (n1)(16m+8m4m)4N\frac{(n-1)(16^{m}+8^{m}-4^{m})}{4N} for OPS and 20m(n1)4N\frac{20^{m}(n-1)}{4N} for optimal product states. UMSE of optimal probe states is always smaller than that of optimal product states for m2m\geq 2. For m=1m=1, they are both 20(n1)4N\frac{20(n-1)}{4N}. For condition number, it is 2m+1\sqrt{2^{m}+1} for OPS and 3m\sqrt{3^{m}} for optimal product states. Thus, for both UMSE and condition number, OPS play better than optimal product states in multi-qubit systems. Because most of the OPS are entangled states, this is an example showcasing the advantage of entanglement in QDT.

We give an example of two-qubit product probe states. Let the parameterization of the jj-th two-qubit product probe state ρj\rho_{j} be

ϕj\displaystyle\phi_{j} =ϕj1(1)ϕj2(2)=[12,12ψj2(2),12ψj1(1),ψj1(1)ψj2(2)]T,\displaystyle=\phi_{j_{1}}^{(1)}\otimes\phi_{j_{2}}^{(2)}=\left[\frac{1}{2},\frac{1}{\sqrt{2}}\psi_{j_{2}}^{(2)},\frac{1}{\sqrt{2}}\psi_{j_{1}}^{(1)},\psi_{j_{1}}^{(1)}\otimes\psi_{j_{2}}^{(2)}\right]^{T}, (30)

where ϕji(i)=[12,ψji(i)]T\phi_{j_{i}}^{(i)}=\left[\frac{1}{\sqrt{2}},\psi_{j_{i}}^{(i)}\right]^{T} is the parameterization of the ii-th qubit of ρj\rho_{j}. Since optimal one-qubit probe states are pure states, ψj1(1)=ψj2(2)=12\left\|\psi_{j_{1}}^{(1)}\right\|=\left\|\psi_{j_{2}}^{(2)}\right\|=\frac{1}{\sqrt{2}}. Denote the eigenvalues of XTXX^{T}X as λ1λ2λd2\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{d^{2}}. We can obtain three constraints as

(XTX)11=j=1M(ϕj1)2=M(12)2=M4,\left(X^{T}X\right)_{11}=\sum_{j=1}^{M}\left(\phi_{j}^{1}\right)^{2}=M\left(\frac{1}{2}\right)^{2}=\frac{M}{4}, (31)
i=14λii=14(XTX)ii=M(12)2+j=1M12ψj1(2)2=M2,\sum_{i=1}^{4}\lambda_{i}\geq\sum_{i=1}^{4}\left(X^{T}X\right)_{ii}=M\left(\frac{1}{2}\right)^{2}+\sum_{j=1}^{M}\left\|\frac{1}{\sqrt{2}}\psi_{j_{1}}^{(2)}\right\|^{2}=\frac{M}{2}, (32)
i=17λii=17(XTX)ii\displaystyle\sum_{i=1}^{7}\lambda_{i}\geq\sum_{i=1}^{7}\left(X^{T}X\right)_{ii} (33)
=M(12)2+j=1M12ψj1(1)2+j=1M12ψj2(2)2=3M4.\displaystyle=M\left(\frac{1}{2}\right)^{2}+\sum_{j=1}^{M}\left\|\frac{1}{\sqrt{2}}\psi_{j_{1}}^{(1)}\right\|^{2}+\sum_{j=1}^{M}\left\|\frac{1}{\sqrt{2}}\psi_{j_{2}}^{(2)}\right\|^{2}=\frac{3M}{4}.

Thus, the optimization problem is

min i=116Mλi\displaystyle\sum_{i=1}^{16}\frac{M}{\lambda_{i}} (34)
s.t. i=116λi=M,λ1M4,\displaystyle\sum_{i=1}^{16}\lambda_{i}=M,\lambda_{1}\geq\frac{M}{4},
i=14λiM2,i=17λi3M4.\displaystyle\sum_{i=1}^{4}\lambda_{i}\geq\frac{M}{2},\sum_{i=1}^{7}\lambda_{i}\geq\frac{3M}{4}.

It can be proven that i=116Mλi\sum_{i=1}^{16}\frac{M}{\lambda_{i}} reaches its minimum 400400 and the minimum UMSE is 100(n1)N\frac{100(n-1)}{N} when λ1=M4\lambda_{1}=\frac{M}{4},λ2==λ7=M12,\lambda_{2}=\cdots=\lambda_{7}=\frac{M}{12}, λ8==λ16=M36\lambda_{8}=\cdots=\lambda_{16}=\frac{M}{36}. It can be verified that this minimum UMSE can be reached by using the tensor of platonic solid states such as M=36M=36 (Cube). The minimum condition number is λ1λ16=M/4M/36=3\sqrt{\frac{\lambda_{1}}{\lambda_{16}}}=\sqrt{\frac{M/4}{M/36}}=3.

3.4 Superposition of coherent probe state

In quantum optics experiments, the preparation of number states (or Fock states) |k(k)|k\rangle(k\in\mathbb{N}) is usually a difficult task, especially when kk is large. Therefore, it is also difficult to prepare SIC and MUB states. Coherent states, more straightforward to be prepared, are more commonly used as probe states for QDT in practice. Thus, a good approach is to use the superposition of several coherent states to approximate SIC states and MUB states.

A coherent state is denoted as |α|\alpha\rangle where α\alpha\in\mathbb{C} and it can be expanded using number states as

|α=e|α|22i=0αii!|i.|\alpha\rangle=e^{-\frac{|\alpha|^{2}}{2}}\sum_{i=0}^{\infty}\frac{\alpha^{i}}{\sqrt{i!}}|i\rangle. (35)

The inner product relationship between two states |α|\alpha\rangle and |β|\beta\rangle is

β|α=e12(|β|2+|α|22βα).\langle\beta|\alpha\rangle=\mathrm{e}^{-\frac{1}{2}\left(|\beta|^{2}+|\alpha|^{2}-2\beta^{*}\alpha\right)}. (36)

Let |αd=e|α|22i=0d1αii!|i|\alpha_{d}\rangle=\mathrm{e}^{-\frac{|\alpha|^{2}}{2}}\sum_{i=0}^{d-1}\frac{\alpha^{i}}{\sqrt{i}!}|i\rangle. Coherent states are in essence infinite dimensional. To estimate a dd-dimensional detector, we employ |αd|\alpha_{d}\rangle as the approximate description of |α|\alpha\rangle, and we assume that the discarded part |α|αd|\alpha\rangle-|\alpha_{d}\rangle has a small enough influence such that it can be neglected. A matrix or vector with subscript dd means it is truncated in dd dimension.

Remark 5.

It can lead to significant error to approximate a general pure state using only one coherent state instead of the superposition of many. This problem is often referred to as the nonclassicality of states. As an example, we now consider a Fock state |n|n\rangle. The infidelity is

1Fs(|nn|,|αα|)=[2(1exp(|α|2)|α|2nn!)]1/2.1-F_{s}(|n\rangle\langle n|,|\alpha\rangle\langle\alpha|)=\left[2\left(1-\exp(-|\alpha|^{2})\frac{|\alpha|^{2n}}{n!}\right)\right]^{1/2}.

The minimum infidelity is obtained by |α|2=n|\alpha|^{2}=n and for n=0,1,2n=0,1,2, the minimum infidelity is 0,0.6321,0.72930,0.6321,0.7293, respectively. Thus, the distance between one Fock state and one coherent state may be quite large. This also shows that the Fock state |n|n\rangle becomes more and more non-classical as the value nn increases [44].

In quantum state engineering, a central problem is how to construct a pure state by superposition of coherent states. There are two main approaches to do this. One is to write the pure state as a superposition of coherent states along the real axis in phase space

|ψ=(α)|α𝑑α,|\psi\rangle=\int_{-\infty}^{\infty}\mathcal{F}(\alpha)|\alpha\rangle d\alpha, (37)

where (α)\mathcal{F}(\alpha) is the distribution function. The other choice is the superposition on a circle

|ψ=02πR(ϕ)|Reiϕ𝑑ϕ,|\psi\rangle=\int_{0}^{2\pi}\mathcal{F}_{R}(\phi)\left|Re^{i\phi}\right\rangle d\phi, (38)

where RR is the radius of the circle and R(ϕ)\mathcal{F}_{R}(\phi) a circle distribution function. The analytical solutions of (α)\mathcal{F}(\alpha) and R(ϕ)\mathcal{F}_{R}(\phi) were given in [45]. They also gave the discrete superposition of coherent states to construct pure states by discretizing the above equations. Ref. [46] evaluated the performance to construct squeezed displaced number states and [45] gave a representation of a Fock state |n|n\rangle by n+1n+1 coherent states. If we can construct all Fock states with high accuracy, we can use these Fock states to construct all the ideal pure states we need. However, in practice, the technique to superpose many coherent states arbitrarily is still developing.

In this paper, we use the finite superposition of ss coherent states |ψ~=k=1sck|αdk|\tilde{\psi}\rangle=\sum\limits_{k=1}^{s}{{c_{k}}}\left|{{\alpha_{d}}}\right\rangle_{k} to approximate an ideal pure state |ψ|\psi\rangle where |ψ~|\tilde{\psi}\rangle indicates that it has not been normalized. Hence, |ψ~|\tilde{\psi}\rangle is not yet a quantum state in the most strict sense. To find the superposition state |ψ~|\tilde{\psi}\rangle, at first sight it can be formulated as an optimization problem to maximize fidelity as

max|ψ~|ψ|ψ~|2,\max_{{|\tilde{\psi}\rangle}}|\langle\psi|\tilde{\psi}\rangle|^{2}, (39)

Since the fidelity is Fs(ρ^,ρ)=[Trρ^ρρ^]2=|ψ|ψ~|2F_{s}(\hat{\rho},\rho)=[\operatorname{Tr}\sqrt{\sqrt{\hat{\rho}}\rho\sqrt{\hat{\rho}}}]^{2}=|\langle\psi|\tilde{\psi}\rangle|^{2}. However, for the purpose of QDT, this cost function is not the most suitable. The most important part of the superposed state |ψ~|\tilde{\psi}\rangle is its direction. We hope to align it in the same direction as |ψ|\psi\rangle, even if the norm|ψ~\||\tilde{\psi}\rangle\| can be different from |ψ=1\||\psi\rangle\|=1. Hence, we need the normalized state |ψ~ψ~|ψ~\frac{|\tilde{\psi}\rangle}{\sqrt{\langle\tilde{\psi}|\tilde{\psi}\rangle}} to be a good approximation to |ψ|\psi\rangle, which thus leads to the numerator of (40). Also, if |ψ~\||\tilde{\psi}\rangle\| is too small, we cannot neglect elements in the dimension larger than dd and the corresponding measurement data Tr(|ψ~ψ~|Pi)\operatorname{Tr}\left(\left|\tilde{\psi}\right\rangle\left\langle\tilde{\psi}\right|P_{i}\right) will be small, leading to low measurement accuracy for given the same resource number NN. Therefore, we add a penalty ψ~|ψ~\langle\tilde{\psi}|\tilde{\psi}\rangle to prevent the norm |ψ~|\tilde{\psi}\rangle from being too small. Thus, we design a new cost function as

min|ψ~|ψ~ψ~|ψ~|ψ2ψ~|ψ~.\displaystyle\min_{{|\tilde{\psi}\rangle}}\frac{\|\frac{|\tilde{\psi}\rangle}{\sqrt{\langle\tilde{\psi}|\tilde{\psi}\rangle}}-|\psi\rangle\|^{2}}{\langle\tilde{\psi}|\tilde{\psi}\rangle}. (40)

This optimization problem is usually non-convex, and thus we select different initial points and numerically search for the best solution.

3.5 Error analysis for state preparation

When we prepare probe states such SIC states and MUB states in experiment, there usually exist state preparation errors. For example, if we use superposition of coherent states, there exists approximation error between the ideal target probe state and the superposition of coherent states when the number of coherent states for superposition is not large enough. We give the error analysis on UMSE and condition number when there exists state preparation error.

Theorem 3.5.

Given two probe state sets {ρj}1M\left\{\rho_{j}\right\}_{1}^{M} and {ρ^j}1M\left\{\hat{\rho}_{j}\right\}_{1}^{M}, if maxjρjρ^jε\max_{j}\left\|\rho_{j}-\hat{\rho}_{j}\right\|\leq\varepsilon and ελd22M\varepsilon\leq\frac{\lambda_{d^{2}}}{2M}, the corresponding error in (n1)M4N|Tr(XTX)1Tr(X^TX^)1|\frac{(n-1)M}{4N}\big{|}\operatorname{Tr}\left(X^{T}X\right)^{-1}\!-\!\operatorname{Tr}\left(\hat{X}^{T}\hat{X}\right)^{-1}\big{|} is upper bounded by 2(n1)d2M2ε4N(λd22Mε)2\dfrac{2(n-1)d^{2}M^{2}\varepsilon}{4N{\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}}} where λd2>0\lambda_{d^{2}}>0 is the smallest eigenvalue of the parameterization matrix XTXX^{T}X and the corresponding error on condition number is upper bounded by M(λ1+λd2)ε(λd22Mε)2\dfrac{M\left(\lambda_{1}+\lambda_{d^{2}}\right)\varepsilon}{\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}} where λ1\lambda_{1} is the largest eigenvalue of XTXX^{T}X.

{pf}

See Appendix C.

4 Adaptive detector tomography

As shown in Fig. 1, after employing partial resources to obtain a rough estimate of the detector through Step 1, one can design new probe states dependent on the specific estimation value of the detector to further improve the accuracy in Step 2. Our adaptive QDT scheme is applicable for arbitrary reconstruction algorithm.

4.1 Evaluation index

In quantum information, fidelity (or infidelity) has profound physical meaning to characterize the distance and similarity between two quantum states (or operations). It has been a widely used metric [2, 47, 48]. In developing adaptive QDT, we also employ infidelity as the evaluation index.

The fidelity between two arbitrary states ρ^\hat{\rho} and ρ\rho is defined by

Fs(ρ^,ρ)[Trρ^ρρ^]2,F_{s}(\hat{\rho},\rho)\triangleq[\operatorname{Tr}\sqrt{\sqrt{\hat{\rho}}\rho\sqrt{\hat{\rho}}}]^{2}, (41)

which has three basic properties:

(i)Fs(ρ^,ρ)=Fs(ρ,ρ^);(i)\ \ F_{s}(\hat{\rho},\rho)=F_{s}(\rho,\hat{\rho}); (42)
(ii) 0Fs(ρ^,ρ)1;(ii)\ \ 0\leq F_{s}(\hat{\rho},\rho)\leq 1; (43)
(iii)Fs(ρ^,ρ)=1ρ^=ρ.(iii)\ \ F_{s}(\hat{\rho},\rho)=1\ \ \Leftrightarrow\ \ \hat{\rho}=\rho. (44)

To extend the fidelity definition from states to detectors, a natural idea is to normalize the POVM element such that it has the same mathematical property as a quantum state. This leads to the definition

F0(P^i,Pi)[TrP^iPiP^i]2/[Tr(Pi)Tr(P^i)].F_{0}\left(\hat{P}_{i},P_{i}\right)\triangleq\left[\operatorname{Tr}\sqrt{\sqrt{\hat{P}_{i}}{P}_{i}\sqrt{\hat{P}_{i}}}\right]^{2}/\left[\operatorname{Tr}\left(P_{i}\right)\operatorname{Tr}\left(\hat{P}_{i}\right)\right]. (45)

This definition has been widely used in QDT [21, 22, 27] to evaluate the estimation performance, and corresponding infidelity is defined as 1F0(P^i,Pi)1-F_{0}\left(\hat{P}_{i},P_{i}\right). However, we find that this definition is not always appropriate, because in certain circumstances the third property (44) dose not hold for (45) (we call this phenomenon distortion). More specifically, for certain detector {Pi}i=1n\left\{{P}_{i}\right\}_{i=1}^{n}, distortion means there exists {P^i}i=1n\left\{\hat{P}_{i}\right\}_{i=1}^{n} such that F0(P^i,Pi)=1F_{0}\left(\hat{P}_{i},P_{i}\right)=1 for all 1in1\leq i\leq n while P^i=Pi\hat{P}_{i}=P_{i} fails for at least one ii. For example, suppose the detector is P1=P2=P3=I3P_{1}=P_{2}=P_{3}=\frac{I}{3} and the estimations are P^1=a1P1\hat{P}_{1}=a_{1}P_{1}, P^2=a2P2\hat{P}_{2}=a_{2}P_{2}, P^3=a3P3\hat{P}_{3}=a_{3}P_{3} where a1a_{1}, a2a_{2} and a3a_{3} are three arbitrary positive numbers satisfying a1+a2+a3=3a_{1}+a_{2}+a_{3}=3. The fidelities for the three detectors are all maximum 11, but the estimation is in fact usually not accurate. We characterize when the evaluation index (45) will distort in the following proposition, and present its proof in Appendix D.

Proposition 4.6.

The evaluation index (45) will distort if and only if {Pi}i=1n\left\{P_{i}\right\}_{i=1}^{n} are linearly dependent, i.e., there exists nonzero cnc\in\mathbb{R}^{n} such that i=1nciPi=0\sum_{i=1}^{n}c_{i}P_{i}=0. Specially, if d2<nd^{2}<n, there must exist distortion.

When distortion happens, limF0(P^i,Pi)1P^i=Pi\lim_{F_{0}(\hat{P}_{i},P_{i})\rightarrow 1}\hat{P}_{i}=P_{i} fails. That is, the estimation can be on a wrong track even if the fidelity approaches to 1. To fix this problem, we add [Tr(PiP^i)]2/d2\big{[}\operatorname{Tr}(P_{i}-\hat{P}_{i})\big{]}^{2}/d^{2} and propose a new detector fidelity as

F(P^i,Pi)=\displaystyle F\left(\hat{P}_{i},P_{i}\right)= [Tr(P^iPiP^i)]2/[Tr(Pi)Tr(P^i)]\displaystyle\left[\operatorname{Tr}(\sqrt{\sqrt{\hat{P}_{i}}{P}_{i}\sqrt{\hat{P}_{i}}})\right]^{2}/\left[\operatorname{Tr}\left(P_{i}\right)\operatorname{Tr}\left(\hat{P}_{i}\right)\right] (46)
[Tr(PiP^i)]2/d2.\displaystyle-\left[\operatorname{Tr}\left(P_{i}-\hat{P}_{i}\right)\right]^{2}/d^{2}.

The fidelity (46) takes values in (1d1,1](\frac{1}{d}-1,1] and we give the proof of the lower bound in Appendix E. When F(P^i,Pi)=1F\left(\hat{P}_{i},P_{i}\right)=1, we must have P^i=Pi\hat{P}_{i}=P_{i} (and vice versa), solving the distortion problem of (45). For the rest of this paper, we refer to “fidelity” as (46), unless otherwise declared. The infidelity is defined as 1F(P^i,Pi)1-F\left(\hat{P}_{i},P_{i}\right).

4.2 Two-step adaptive quantum detector tomography

Let {|λt}\left\{\left|\lambda_{t}\right\rangle\right\} be the eigenvectors of PiP_{i} for given ii, where the zero eigenvalues are λr+1==λd=0\lambda_{r+1}=\cdots=\lambda_{d}=0. We can view PiTr(Pi)\frac{P_{i}}{\operatorname{Tr}\left(P_{i}\right)} and P^iTr(P^i)\frac{\hat{P}_{i}}{\operatorname{Tr}\left(\hat{P}_{i}\right)} as two quantum states. Thus, we may use the analysis in [37] to obtain the Taylor series expansion of the new infidelity 1F(P^i,Pi)1-F(\hat{P}_{i},P_{i}) based on (46) up to the second order as

𝔼(1F(P^i,Pi))\displaystyle\mathbb{E}\left(1-F(\hat{P}_{i},P_{i})\right) (47)
=\displaystyle= 𝔼(1F0(P^i,Pi)+[Tr(PiP^i)]2/d2)\displaystyle\mathbb{E}\left(1-F_{0}\left(\hat{P}_{i},P_{i}\right)+\left[\operatorname{Tr}\left(P_{i}-\hat{P}_{i}\right)\right]^{2}/d^{2}\right)
=\displaystyle= 𝔼(1F0(P^i,Pi)+(t=1dλt|Δ2|λt)2/d2)\displaystyle\mathbb{E}\left(1-F_{0}\left(\hat{P}_{i},P_{i}\right)+\left(\sum_{t=1}^{d}\left\langle\lambda_{t}|\Delta_{2}|\lambda_{t}\right\rangle\right)^{2}/d^{2}\right)
=\displaystyle= 𝔼(t=r+1dλt|Δ1|λt)+𝔼(12t,k=1r|λt|Δ1|λk|2λt+λk)\displaystyle\mathbb{E}\left(\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right)+\mathbb{E}\left(\frac{1}{2}\sum_{t,k=1}^{r}\frac{\left|\left\langle\lambda_{t}|\Delta_{1}|\lambda_{k}\right\rangle\right|^{2}}{\lambda_{t}+\lambda_{k}}\right)
𝔼(14[t=r+1dλt|Δ1|λt]2)\displaystyle-\mathbb{E}\left(\frac{1}{4}\left[\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right]^{2}\right)
+𝔼(t=1dk=1dλt|Δ2|λtλk|Δ2|λk/d2)+O(Δ13),\displaystyle+\mathbb{E}\left(\sum_{t=1}^{d}\sum_{k=1}^{d}\left\langle\lambda_{t}|\Delta_{2}|\lambda_{t}\right\rangle\left\langle\lambda_{k}|\Delta_{2}|\lambda_{k}\right\rangle/d^{2}\right)+O\left(\|\Delta_{1}\|^{3}\right),

where Δ1=PiTr(Pi)P^iTr(P^i),Δ2=PiP^i\Delta_{1}=\frac{P_{i}}{\operatorname{Tr}\left(P_{i}\right)}-\frac{\hat{P}_{i}}{\operatorname{Tr}\left(\hat{P}_{i}\right)},\Delta_{2}=P_{i}-\hat{P}_{i}. Crucially, the new term [Tr(PiP^i)]2/d2-\left[\operatorname{Tr}\left(P_{i}-\hat{P}_{i}\right)\right]^{2}/d^{2} is in the second order instead of in the first order, and we can thus imitate the analysis for QST in [37]. Note that we use a perturbation method thus (47) is valid only around PiP_{i}. The condition can be guaranteed because we are analyzing the asymptotic behavior as NN tends to infinity (large enough).

According to the Taylor series, the scaling performance of the infidelity depends on the rank of the detector. For instance, for a full-rank POVM element, the first-order term vanishes and the infidelity scales as O(1/N)O(1/N) using just non-adaptive QDT. For a rank deficient PiP_{i}, the first-order term dominates and thus the infidelity scales as O(1/N)O(1/\sqrt{N}) by QDT which only has Step 1 in Fig. 1. The optimal scaling of infidelity 1Fs(ρ^,ρ)1-F_{s}(\hat{\rho},\rho) is O(1/N)O(1/N) for unbiased estimate in QST [49] and the optimal scaling of [Tr(PiP^i)]2/d2-\left[\operatorname{Tr}\left(P_{i}-\hat{P}_{i}\right)\right]^{2}/d^{2} is always not worse than O(1/N)O(1/N). Thus, the optimal scaling of infidelity 1F0(P^i,Pi)1-F_{0}(\hat{P}_{i},P_{i}) or 1F(P^i,Pi)1-F(\hat{P}_{i},P_{i}) is also O(1/N)O(1/N). From [37], we know the second-order terms always scale as O(1/N)O(1/N). Therefore, for rank-deficient detectors, the behavior of the infidelity can be corrected to O(1/N)O(1/N) if one can eliminate the influence of the first-order term [37]. This depends on the diagonal coefficients of Δ1\Delta_{1} in the kernel of PiTr(Pi)\frac{P_{i}}{\operatorname{Tr}\left(P_{i}\right)}, which suggests performing QDT in a basis aligning with the eigenvectors of PiP_{i} [37].

The detailed adaptive procedure is as follows. To simplify the expression, we use PP to represent each POVM element PiP_{i}. For PP, given a probe state set {ρj}j=1M\{\rho_{j}\}_{j=1}^{M}, we choose one among them (e.g., ρa\rho_{a}) and its spectral decomposition is ρa=VaΓaVa\rho_{a}=V_{a}\Gamma_{a}V_{a}^{\dagger}. The spectral decomposition of PiP_{i} is P=UΛUP=U\Lambda U^{\dagger}. To perform QDT in a basis that agrees with the eigenvectors of PP [37], we change each probe state ρj\rho_{j} to ρ~j\tilde{\rho}_{j} by a common conjugation as

ρ~j=UVaρjVaU.\tilde{\rho}_{j}=UV_{a}^{\dagger}\rho_{j}V_{a}U^{\dagger}. (48)

However, in practice, we do not know UiU_{i}. Therefore, we have two steps as shown in Fig. 1. In Step 1, we obtain an estimator {P~}\left\{\tilde{P}\right\} by applying non-adaptive QDT on an ensemble of size N0N_{0}. Then in Step 2, we use new probe states ρ~j\tilde{\rho}_{j} as

ρ~j=U~VaρjVaU~,\tilde{\rho}_{j}=\tilde{U}V_{a}^{\dagger}\rho_{j}V_{a}\tilde{U}^{\dagger}, (49)

for an ensemble of size NN0N-N_{0}, where P~=U~Λ~U~\tilde{P}=\tilde{U}\tilde{\Lambda}\tilde{U}^{\dagger}. For each POVM element, we need to repeat the above procedure.

Furthermore, to guarantee that the infidelity scaling is improved to O(1/N)O(1/N), one needs to carefully choose the probe states. Before showcasing how to do this, we first introduce several basic definitions. For a dd-dimensional Hermitian space, we define bases ({|ii=1d})\mathcal{B}\left(\left\{|i\rangle_{i=1}^{d}\right\}\right) consisting of the following elements,

σiz=|ii|(1id),\sigma_{i}^{z}=|i\rangle\langle i|\quad(1\leq i\leq d), (50)
σijx=(|ij|+|ji|)/2(1i<jd),\sigma_{ij}^{x}=(|i\rangle\langle j|+|j\rangle\langle i|)/\sqrt{2}\quad(1\leq i<j\leq d), (51)
σijy=(i|ij|+i|ji|)/2(1i<jd),\sigma_{ij}^{y}=(-\mathrm{i}|i\rangle\langle j|+\mathrm{i}|j\rangle\langle i|)/\sqrt{2}\quad(1\leq i<j\leq d), (52)

where the set {|ii=1d}\left\{|i\rangle_{i=1}^{d}\right\} is an arbitrary orthogonal basis. Note that all the operators are orthogonal w.r.t. the inner product A,B=Tr(AB)\langle A,B\rangle=\operatorname{Tr}\left(A^{\dagger}B\right). For a given basis {ξk}k=1K1\{\xi_{k}\}_{k=1}^{K_{1}}, denote span{{ξk}k=1K1}\operatorname{span}\left\{\{\xi_{k}\}_{k=1}^{K_{1}}\right\} as the set of all finite real linear combinations of ξk\xi_{k} (1kK1)(1\leq k\leq K_{1}). If span{{ξk}k=1K1}=span{{μk}k=1K2}\operatorname{span}\left\{\{\xi_{k}\}_{k=1}^{K_{1}}\right\}=\operatorname{span}\left\{\{\mu_{k}\}_{k=1}^{K_{2}}\right\}, we say {ξk}k=1K1\{\xi_{k}\}_{k=1}^{K_{1}} and {μk}k=1K2\{\mu_{k}\}_{k=1}^{K_{2}} are equivalent. If the span of the probe state set contains all dd-dimensional Hermitian matrices and M=d2M=d^{2}, we say the probe state set is complete. If we further have M>d2M>d^{2}, we say it is over-complete.

In this section, we assume the true value of a rank rr POVM element is PP, the Step 1 estimation is P~\tilde{P} and the Step 2 estimation is P^\hat{P}. The spectral decomposition of PP is P=i=1dλi|λiλi|{P}=\sum_{i=1}^{d}{\lambda}_{i}\left|{\lambda}_{i}\right\rangle\left\langle{\lambda}_{i}\right| where λ1λ2λr>0\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{r}>0, λr+1==λd=0\lambda_{r+1}=\cdots=\lambda_{d}=0. In (50)-(52), if we change {|i}\{|i\rangle\} to {|λi}\{|\lambda_{i}\rangle\} for i=1,,di=1,\ldots,d, we call this new basis ({|λii=1d})\mathcal{B}\left(\left\{|\lambda_{i}\rangle_{i=1}^{d}\right\}\right) ideal bases. If we change {|i}\{|i\rangle\} to {|λi}\{|\lambda_{i}\rangle\} with ii restricted in [r+1,d][r+1,d], we call the set ({|λii=r+1d})\mathcal{B}\left(\left\{|\lambda_{i}\rangle_{i=r+1}^{d}\right\}\right) of these new (dr)2(d-r)^{2} elements as the null bases of PP. If we change {|i}\{|i\rangle\} to {|λi}\{|\lambda_{i}\rangle\} with ii restricted in [1,r][1,r], we call the set ({|λii=1r})\mathcal{B}\left(\left\{|\lambda_{i}\rangle_{i=1}^{r}\right\}\right) of these new r2r^{2} elements as the range bases of PP. After Step 1, we obtain an estimate P~\tilde{P} and the spectral decomposition is P~=i=1dλ~i|λ~iλ~i|\tilde{P}=\sum_{i=1}^{d}\tilde{\lambda}_{i}\left|\tilde{\lambda}_{i}\right\rangle\left\langle\tilde{\lambda}_{i}\right|. If we change {|i}\{|i\rangle\} to {|λ~i}\{|\tilde{\lambda}_{i}\rangle\} for i=1,,di=1,\ldots,d, we call this new basis ({|λ~ii=1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=1}^{d}\right\}\right) estimated bases. If we change {|i}\{|i\rangle\} to {|λ~i}\{|\tilde{\lambda}_{i}\rangle\} with ii restricted in [r+1,d][r+1,d], we call the set ({|λ~ii=r+1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=r+1}^{d}\right\}\right) of these new (dr)2(d-r)^{2} elements estimated null bases.

Then we give the following theorem as a guideline to design the probe state set.

Theorem 4.7.

For two-step adaptive QDT using arbitrary reconstruction algorithm with an O(1/N)O(1/N) scaling for the MSE, suppose the resource number is N0N_{0} in Step 1 and NN0N-N_{0} in Step 2, both evenly distributed for each probe state. The infidelity 𝔼(1F(P^,P))\mathbb{E}\left(1-F(\hat{P},P)\right) of any rank-deficient POVM element scales as O(1N0(NN0))+O(1NN0)O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right)+O\left(\frac{1}{N-N_{0}}\right) if

  1. c1)

    the probe states in Step 1 are complete or over-complete;

  2. c2)

    the probe states in Step 2 are complete or over-complete;

  3. c3)

    the probe state set in Step 2 includes a subset equivalent to the (dr)2(d-r)^{2} estimated null basis ({|λ~ii=r+1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=r+1}^{d}\right\}\right) from Step 1.

{pf}

From Condition c1), we obtain a rank r~\tilde{r} estimation P~\tilde{P} and the spectral decomposition is P~=i=1dλ~i|λ~iλ~i|\tilde{P}=\sum_{i=1}^{d}\tilde{\lambda}_{i}\left|\tilde{\lambda}_{i}\right\rangle\left\langle\tilde{\lambda}_{i}\right| where λ~1λ~2λ~r~>0\tilde{\lambda}_{1}\geq\tilde{\lambda}_{2}\geq\cdots\geq\tilde{\lambda}_{\tilde{r}}>0, λ~r~+1==λ~d=0\tilde{\lambda}_{\tilde{r}+1}=\cdots=\tilde{\lambda}_{d}=0. Thereby, the inner product between the eigenvectors of PP and their estimates obtained from P~\tilde{P} [37] is

𝔼(λi|λ~jλ~j|λi)\displaystyle\mathbb{E}\left(\left\langle\lambda_{i}|\tilde{\lambda}_{j}\right\rangle\left\langle\tilde{\lambda}_{j}|\lambda_{i}\right\rangle\right) =O(1N0),\displaystyle=O\left(\frac{1}{N_{0}}\right), (53)
𝔼(λi|λ~iλ~i|λi)\displaystyle\mathbb{E}\left(\left\langle\lambda_{i}|\tilde{\lambda}_{i}\right\rangle\left\langle\tilde{\lambda}_{i}|\lambda_{i}\right\rangle\right) =1+O(1N0).\displaystyle=1+O\left(\frac{1}{N_{0}}\right).

In the estimated basis ({|λ~ii=1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=1}^{d}\right\}\right), the true value matrix PP can be uniquely represented as

P=1i<jd(Sijxσ~ijx+Sijyσ~ijy)+k=1dSkzσ~kz.\displaystyle{P}=\sum_{1\leq i<j\leq d}\left(S_{ij}^{x}\tilde{\sigma}_{ij}^{x}+S_{ij}^{y}\tilde{\sigma}_{ij}^{y}\right)+\sum_{k=1}^{d}S_{k}^{z}\tilde{\sigma}_{k}^{z}. (54)

Then in Step 2, according to Conditions c2) and c3), we can obtain an estimate

P^=1i<jd(S^ijxσ~ijx+S^ijyσ~ijy)+k=1dS^kzσ~kz.\displaystyle{\hat{P}}=\sum_{1\leq i<j\leq d}\left(\hat{S}_{ij}^{x}\tilde{\sigma}_{ij}^{x}+\hat{S}_{ij}^{y}\tilde{\sigma}_{ij}^{y}\right)+\sum_{k=1}^{d}\hat{S}_{k}^{z}\tilde{\sigma}_{k}^{z}. (55)

For Step 2 estimation, the first-order term is

𝔼(t=r+1dλt|Δ1|λt)\displaystyle\mathbb{E}\left(\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right) (56)
=𝔼[1i<jdt=r+1d(S^ijxTr(P^)SijxTr(P))λt|σ~ijx|λt\displaystyle=\mathbb{E}\left[\sum_{1\leq i<j\leq d}\sum_{t=r+1}^{d}\left(\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right)\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{t}\right\rangle\right.
+1i<jdt=r+1d(S^ijyTr(P^)SijyTr(P))λt|σ~ijy|λt\displaystyle+\sum_{1\leq i<j\leq d}\sum_{t=r+1}^{d}\left(\frac{\hat{S}_{ij}^{y}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{y}}{\operatorname{Tr}(P)}\right)\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{y}\right|\lambda_{t}\right\rangle
+k=1dt=r+1d(S^kzTr(P^)SkzTr(P))λt|σ~kz|λt]\displaystyle+\left.\sum_{k=1}^{d}\sum_{t=r+1}^{d}\left(\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(\hat{P})}-\frac{S_{k}^{z}}{\operatorname{Tr}(P)}\right)\left\langle\lambda_{t}\left|\tilde{\sigma}_{k}^{z}\right|\lambda_{t}\right\rangle\right]
𝔼(1i<jdt=r+1d|S^ijxTr(P^)SijxTr(P)||λt|σ~ijx|λt|)\displaystyle\leq\mathbb{E}\left(\sum_{1\leq i<j\leq d}\sum_{t=r+1}^{d}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left|\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{t}\right\rangle\right|\right)
+𝔼(1i<jdt=r+1d|S^ijxTr(P^)SijyTr(P)||λt|σ~ijy|λt|)\displaystyle+\mathbb{E}\left(\sum_{1\leq i<j\leq d}\sum_{t=r+1}^{d}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{y}}{\operatorname{Tr}(P)}\right|\left|\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{y}\right|\lambda_{t}\right\rangle\right|\right)
+𝔼(k=1dt=r+1d|S^kzTr(P^)SkzTr(P)||λt|σ~kz|λt|).\displaystyle+\mathbb{E}\left(\sum_{k=1}^{d}\sum_{t=r+1}^{d}\left|\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(\hat{P})}-\frac{S_{k}^{z}}{\operatorname{Tr}(P)}\right|\left|\left\langle\lambda_{t}\left|\tilde{\sigma}_{k}^{z}\right|\lambda_{t}\right\rangle\right|\right).

Then we show the first-order term scales as

𝔼(t=r+1dλt|Δ1|λt)=O(1N0(NN0)),\displaystyle\mathbb{E}\left(\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right)=O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right), (57)

and the detailed calculation can be found in Appendix F.

Note that the (dr)2(d-r)^{2} estimated null bases are not probe states. Hence, we cannot directly use them. We need to use the linear combinations of these estimated null bases to construct quantum states. According to Condition c3), the probe state set {ρ~l}l=1M\left\{\tilde{\rho}_{l}\right\}_{l=1}^{M} includes a subset {ρ~l}l=1(dr)2\left\{\tilde{\rho}_{l}\right\}_{l=1}^{(d-r)^{2}} equivalent to the (dr)2(d-r)^{2} estimated null bases ({|λ~ii=r+1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=r+1}^{d}\right\}\right). We assume the practical measurement results are p^l\hat{p}_{l} and the ideal measurement results are p~l\tilde{p}_{l} for these probe states. For 1l(dr)21\leq l\leq(d-r)^{2}, the measurement results p~l\tilde{p}_{l} are used to estimate the null bases of PP, and for the other M(dr)2M-(d-r)^{2} measurement results, they are used to estimate the range bases of PP. Even though we do not know ideal measurement results p~l\tilde{p}_{l}, we can still use them like (F.98) and (F.99) because we focus on infidelity. Using the linear combinations of {p~l}l=1(dr)2\left\{\tilde{p}_{l}\right\}_{l=1}^{(d-r)^{2}}, we can obtain S^ijx=l=1(dr)2a^lijp~l\hat{S}_{ij}^{x}=\sum_{l=1}^{(d-r)^{2}}\hat{a}_{l}^{ij}\tilde{p}_{l} and Sijx=l=1(dr)2alijp~l{S}_{ij}^{x}=\sum_{l=1}^{(d-r)^{2}}a_{l}^{ij}\tilde{p}_{l} for r+1i<jdr+1\leq i<j\leq d for certain real numbers a^lij\hat{a}_{l}^{ij} and alij{a}_{l}^{ij}. The relation (F.99) also holds because |a^lijalij|=O(1NN0)\left|\hat{a}_{l}^{ij}-{a}_{l}^{ij}\right|=O\left(\frac{1}{\sqrt{N-N_{0}}}\right). Then we assume S^kz=l=1(dr)2clkp^l\hat{S}_{k}^{z}=\sum_{l=1}^{(d-r)^{2}}c_{l}^{k}\hat{p}_{l} for certain real numbers clk{c}_{l}^{k}. From condition c3), {ρ~l}l=1(dr)2\left\{\tilde{\rho}_{l}\right\}_{l=1}^{(d-r)^{2}} are linear combinations of the estimated null bases ({|λ~ii=r+1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=r+1}^{d}\right\}\right), and thus p~l=O(1N0)\tilde{p}_{l}=O\left(\frac{1}{N_{0}}\right). Hence, like (F.101), we have

var(S^kz)=var(l=1(dr)2clkp^l)\displaystyle\operatorname{var}\left({\hat{S}_{k}^{z}}\right)=\operatorname{var}\left(\sum_{l=1}^{(d-r)^{2}}c_{l}^{k}\hat{p}_{l}\right) (58)
=1Nkz[l=1(dr)2(clk)2p~l(1p~l)+covij(cikp~i,cjkp~j)]\displaystyle=\frac{1}{N_{k}^{z}}\left[\sum_{l=1}^{(d-r)^{2}}(c_{l}^{k})^{2}\tilde{p}_{l}\left(1-\tilde{p}_{l}\right)+\operatorname{cov}_{i\neq j}\left(c_{i}^{k}\tilde{p}_{i},c_{j}^{k}\tilde{p}_{j}\right)\right]
=O(1N0(NN0)).\displaystyle=O\left(\frac{1}{N_{0}\left(N-N_{0}\right)}\right).

Therefore, using probe states satisfying Conditions c1)-c3), the first-order term scales as

𝔼(t=r+1dλt|Δ1|λt)=O(1N0(NN0)).\displaystyle\mathbb{E}\left(\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right)=O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right). (59)

Finally, the second-order term scales as

𝔼(12t,k=1r|λt|Δ1|λk|2λt+λk)O(1NN0),\displaystyle\mathbb{E}\left(\frac{1}{2}\sum_{t,k=1}^{r}\frac{\left|\left\langle\lambda_{t}\left|\Delta_{1}\right|\lambda_{k}\right\rangle\right|^{2}}{\lambda_{t}+\lambda_{k}}\right)\sim O\left(\frac{1}{N-N_{0}}\right), (60)
𝔼(14[t=r+1dλt|Δ1|λt]2)O(1N0(NN0)),\displaystyle\mathbb{E}\left(\frac{1}{4}\left[\sum_{t=r+1}^{d}\left\langle\lambda_{t}\left|\Delta_{1}\right|\lambda_{t}\right\rangle\right]^{2}\right)\sim O\left(\frac{1}{N_{0}\left(N-N_{0}\right)}\right),
𝔼(t=1dk=1dλt|Δ2|λtλk|Δ2|λk/d2)O(1NN0).\displaystyle\mathbb{E}\left(\sum_{t=1}^{d}\sum_{k=1}^{d}\left\langle\lambda_{t}\left|\Delta_{2}\right|\lambda_{t}\right\rangle\left\langle\lambda_{k}\left|\Delta_{2}\right|\lambda_{k}\right\rangle/d^{2}\right)\sim O\left(\frac{1}{N-N_{0}}\right).

Using (59) and (60), the infidelity 𝔼(1F(P^,P))\mathbb{E}\left(1-F(\hat{P},P)\right) of any rank-deficient POVM element scales as
O(1N0(NN0))+O(1NN0)O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right)+O\left(\frac{1}{N-N_{0}}\right). \Box

We have the following corollary if we use GPB (generalized Pauli basis) states shown in (A.70)–(A.72) in Step 2,

Corollary 4.8.

For a POVM element PP, using GPB states in Step 2, the infidelity reaches the optimal scaling O(1/N)O(1/N) if N0=αNN_{0}=\alpha N for certain 0<α<10<\alpha<1.

The proof is straightforward. If the POVM element PP is full rank, it does not have first-order term and the infidelity scales as O(1/N)O(1/N). If PP is rank deficient with an unknown rank rr, GPB states always satisfy Conditions c1)-c3) and thus the infidelity still scales as O(1/N)O(1/N) by choosing N0=αNN_{0}=\alpha N for certain 0<α<10<\alpha<1. Because GPB states are effective for all cases, we will show their performance in Numerical Examples (Section 5).

We may use three methods to further make the estimation from Step 1 more accurate if the rank is not precise. The first one is to use Corollary 4.8. We use d2d^{2} GPB states which always span the null basis. The second one is the possible scenario where we know the rank value from prior information. The third method is, from the rough estimation of Step 1, we may know an exact rank interval of the to-be-estimated POVM element. Since we know the upper bound of the estimated error as (9) here, we thus know the variation range of all the eigenvalues from Lemma Appendix C.9. Assume the rank interval is [a,b][a,b] where 1a<bd1\leq a<b\leq d, and r[a,b]r\in[a,b] is the unknown rank. The more accurate is the estimation result from Step 1, the smaller is the interval length bab-a. Then to ensure the infidelity scaling is O(1/N)O(1/N), a conservative method is to add (da)2(d-a)^{2} quantum states spanning the estimated null basis ({|λ~ii=a+1d})\mathcal{B}\left(\left\{|\tilde{\lambda}_{i}\rangle_{i=a+1}^{d}\right\}\right).

Remark 6.

The key in proving Theorem 4.7 is to characterize the first-order term in (56), which is unaffected by the added term [Tr(PiP^i)]2/d2-\left[\operatorname{Tr}\left(P_{i}-\hat{P}_{i}\right)\right]^{2}/d^{2} comparing (46) with (45). Hence, when (45) holds without distortion (see Proposition 4.6), one can still use (45) as the definition of fidelity, and GPB states still reach O(1/N)O(1/N) scaling for the infidelity.

The choice of N0N_{0} plays a key role in the performance of the two-step adaptive QDT. In this paper, we choose N0=N2N_{0}=\frac{N}{2} and the infidelity scales as O(1/N)O(1/N). Note that the infidelity behavior for different detector matrices can be different, even for binary detectors. For example, if P1=U1diag(1,0,0,0)U1P_{1}=U_{1}\operatorname{diag}\left(1,0,0,0\right)U_{1}^{\dagger} for certain unitary U1U_{1} and P2=IP1P_{2}=I-P_{1}, P2P_{2}’s eigenvalues are 1,1,1,01,1,1,0. Since P2P_{2} has one zero eigenvalue, the infidelity reaches O(1/N)O(1/\sqrt{N}) by non-adaptive QDT. However, if P1=U1diag(0.1,0,0,0)U1P_{1}=U_{1}\operatorname{diag}\left(0.1,0,0,0\right)U_{1}^{\dagger}, P2P_{2}’s eigenvalues are 1,1,1,0.91,1,1,0.9. Since P2P_{2} has no zero eigenvalues, the infidelity can reach O(1/N)O(1/{N}) by non-adaptive QDT. Therefore, for a complete characterization and analysis, we need to calculate the infidelity for every POVM element.

5 Numerical Examples

Both the non-adaptive and adaptive QDT protocols need to prepare certain (in this paper pure) states {|ψ}\{|\psi\rangle\} (we call them ideal states) as the probe states, which can be difficult to achieve in practice. As stated in Sec. 3.4, a realistic way in quantum optics experiments is to use the superposition of coherent states to approximate the ideal states. In this section we demonstrate the performance of our optimal and adaptive protocols both using ideal probe states and using superposed coherent probe states. We use the two-stage QDT reconstruction algorithm for numerical simulations.

5.1 Optimal detector tomography

TABLE I. Comparison of various four-dimensional QDT protocols. Protocol Probe states Number(MM) MTr[(XTX)1]M\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]} cond(X)\operatorname{cond}(X) Eqs or Ref 1 SIC 16 304304^{*} 5\sqrt{5}^{*} (A.68) 2 MUB 20 304304^{*} 5\sqrt{5}^{*} (A.67) 3 Cube\text{Cube}^{**} 36 400400 33 (A.69) 4 GPB 16 640640 9+73973\sqrt{\dfrac{9+\sqrt{73}}{9-\sqrt{73}}} (A.70),(A.71),(A.72) 5 Random Pure 32 629.16629.16 2.392.39 [50, 51] 6 1-coherent SIC 16 2.48×104{2.48\times 10^{4}} 307.11307.11 (40) 7 1-coherent MUB 20 2.46×1032.46\times 10^{3} 82.3782.37 (40) 8 1-coherent Random 32 9.81×1039.81\times 10^{3} 26.0326.03 [28] 9 2-coherent SIC 16 400.61400.61 3.443.44 (40) 10 2-coherent MUB 20 470.67470.67 3.273.27 (40) 11 2-coherent Random 32 607.35607.35 5.485.48 [28] 12 3-coherent SIC 16 352.82352.82 2.792.79 (40) 13 3-coherent MUB 20 364.01364.01 2.602.60 (40) 14 3-coherent Random 32 532.85532.85 4.954.95 [28]

  • *

    This is the optimal value.

  • **

    Cube states are product probe states.

For non-adaptive QDT in d=4d=4 systems, we test 1414 different protocols using different probe states in Table I. In protocols 1-5, we compare MTr[(XTX)1]M\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]} and condition numbers of ideal pure states such as MUB, Cube, SIC, GPB probe states which are constructed as in Appendix A. “Cube” states here are product states of one-qubit MUB states, assuming that the d=4d=4 system here is the composition of two d=2d=2 systems. GPB states are shown in (A.70)–(A.72), similar to the optimal measurement-generalized Pauli operators in [39]. “Random Pure” means that we generate 3232 random pure states using the algorithm in [50, 51] where MTr[(XTX)1]M\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]} and condition number are obtained by the average of 10001000 results. We find SIC states and MUB states have the minimum value of MTr[(XTX)1]M\operatorname{Tr}\big{[}(X^{T}X)^{-1}\big{]} as 304304 and minimum condition number as 5\sqrt{5}, satisfying Theorem 1. For two-qubit product states–Cube states which are easier to generate in experiment, the values of UMSE and conidtion number are 400400 and 33, respectively, a little larger than the optimal values and satisfying Theorem 3.4.

In protocols 6-14, we use superposition of ncn_{c} (nc=1,2,3)(n_{c}=1,2,3) coherent states (denoted as ncn_{c}-coherent SIC/MUB) to approximate the four-dimensional SIC and MUB states by solving the optimization problem (40). The case of nc=1n_{c}=1 is using one coherent state without superposition. As a comparison, we also add the protocols of using the superposition of ncn_{c} random coherent states (denoted as ncn_{c}-coherent Random). The random algorithm is the same as the coherent states preparation procedure in [28]. The optimization landscape might have local minima. Therefore, we run the optimization algorithm with 100100 different initial values and choose the best result.

We find for one coherent state, it cannot approximate SIC and MUB states well and both the condition number and UMSE are large. For the superposition of two and three coherent states, they can approximate SIC and MUB states well and the corresponding condition number and UMSE are close to the optimal values, and smaller than 2,3-coherent Random protocols. Also, with ncn_{c} increasing, the superposition result becomes close to ideal probe states and thus the UMSE and condition number decrease.

5.2 Adaptive QDT using ideal probe states

5.2.1 Binary detectors

For binary detectors P1+P2=IP_{1}+P_{2}=I, P1P_{1} and P2P_{2} can be simultaneously diagonalized by a common unitary [29]. Hence, the eigenvalues of P1P_{1} will affect the eigenvalues of P2P_{2}. This determines the rank of P2P_{2}, which will further influence the scaling of non-adaptive tomography. Let

P1=U1diag(μ,0,0,0)U1.\displaystyle P_{1}=U_{1}\operatorname{diag}\left(\mu,0,0,0\right)U_{1}^{\dagger}. (61)

With non-adaptive tomography, when μ<1\mu<1, P2P_{2} is full-ranked and the infidelity of estimating P2P_{2} scales as O(1/N)O(1/N), while for μ=1\mu=1, P2P_{2} is rank-deficient and the infidelity of estimating P2P_{2} scales as O(1/N)O(1/\sqrt{N}).

Therefore, we firstly consider a binary detector for μ=1\mu=1 where

P1=U1diag(1,0,0,0)U1.\displaystyle P_{1}=U_{1}\operatorname{diag}\left(1,0,0,0\right)U_{1}^{\dagger}. (62)

This detector is fully specified by the projection measurement P1P_{1}. The matrix U1U_{1} is randomly generated using the algorithm in [50, 52]. For each resource number, we run the algorithm 100100 times and obtain the average infidelity and standard deviation.

Refer to caption
Refer to caption
Figure 2: Infidelity for binary detectors with given unitary matrix U1U_{1} in (62). (a) P1P_{1}; (b) P2P_{2}.

The four curves in Fig. 2 are as follows:

  • Random Non-adaptive: We only have Step 1 and choose 48 random pure states.

  • Adaptive SIC: In Step 1, we use 1616 SIC states {ρj(SIC)}\left\{\rho_{j}^{\text{(SIC)}}\right\} and in Step 2, we use 3232 new states as (49).

  • Adaptive MUB: In Step 1, we use 2020 MUB states {ρj(MUB)}\left\{\rho_{j}^{\text{(MUB)}}\right\} and in Step 2, we use 4040 new states as (49).

  • Adaptive GPB: In Step 1, we use 1616 GPB states and in Step 2, we use 3232 new GBP states by replacing the set {|i}\{|i\rangle\} by {|λ~i}\left\{\left|\tilde{\lambda}_{i}\right\rangle\right\} in (A.70), (A.71) and (A.72).

As shown in Fig. 2, Random Non-adaptive tomography only reaches 1F=O(1/N)1-F=O(1/\sqrt{N}) for both P1P_{1} and P2P_{2} because they both have zero eigenvalues and the first-order term scales as O(1/N)O(1/\sqrt{N}). Adaptive GPB tomography can reach 1F=O(1/N)1-F=O(1/{N}) for P1P_{1} and P2P_{2} as proved in Corollary 4.8. For adaptive SIC and MUB tomography, they can only reach 1F=O(1/N)1-F=O(1/\sqrt{N}) for P1P_{1} probably because adaptive SIC and MUB states do not have a subset equivalent to the 99 estimated null bases and does not satisfy Condition c3) in Theorem 4.7. They can reach 1F=O(1/N)1-F=O(1/{N}) for P2P_{2} because adaptive SIC and MUB states have a subset equivalent to the estimated null basis and satisfy Condition c3).

Refer to caption
Refer to caption
Figure 3: Infidelity for non-adaptive tomography with different eigenvalues μ\mu of binary detectors. (a) P1P_{1}; (b) P2P_{2}.

Then we show infidelity for non-adaptive tomography with different eigenvalues μ=0.25,0.5,0.75,1\mu=0.25,0.5,0.75,1 in (61) of binary detectors in Fig. 3. For each resource number, we run the algorithm 100100 times and obtain the average infidelity and the standard deviation. Because P1P_{1} is always rank deficient, the infidelity for P1P_{1} scales as O(1/N)O(1/\sqrt{N}). As μ\mu increases, the measurement accuracy increases and thus the infidelity of P1P_{1} becomes smaller for a given NN. Because P2P_{2} is full rank for μ<1\mu<1, its infidelity scales as O(1/N)O(1/N). In addition, the infidelity of P2P_{2} becomes larger as μ\mu increases for a given NN. When μ\mu increases to 11, both the infidelities of P1P_{1} and P2P_{2} scale as O(1/N)O(1/\sqrt{N}) because they are both rank deficient.

To test the robustness of our adaptive protocol, we perform adaptive QDT on 200200 random binary detectors in the form of (62) by changing the unitary matrix U1U_{1} which is randomly created using the algorithm in [50, 52]. For each U1U_{1}, we run our tomography algorithm 100100 times and obtain the mean infidelities for given resource number NN. Then we calculate the mean values and the standard deviations of these 200200 mean infidelities. The result is shown in Fig. 4. It is clear that Adaptive GPB tomography can reach 1F=O(1/N)1-F=O(1/{N}) for P1P_{1} and P2P_{2}. In addition, all the standard deviations are small, which demonstrates that our Adaptive GPB protocol is robust.

Refer to caption
Refer to caption
Figure 4: Infidelity for 200200 different binary detectors by changing U1U_{1} in (62). (a) P1P_{1}; (b) P2P_{2}.

We then consider the case that P1P_{1} is a perturbed projection measurement,

P1+P2=I,\displaystyle P_{1}+P_{2}=I, (63)
P1=U1diag(0.6,0.001,0.001,0.001)U1.\displaystyle P_{1}=U_{1}\operatorname{diag}\left(0.6,0.001,0.001,0.001\right)U_{1}^{\dagger}.
Refer to caption
Refer to caption
Figure 5: Infidelity for binary detectors in (63). (a) P1P_{1}; (b) P2P_{2}.

The tomography errors are shown in Fig. 5. From Fig. 5(a), we can see that the curves of adaptive QDT can be roughly divided into three segments from left to right for the detector P1P_{1}, which is similar to the phenomenon in QST [36]. In the first segment, the resource number is not large enough (N104N\leq 10^{4}) and the near-zero eigenvalues are not strong enough to make a difference from zero. The performance is thus the same as projection measurement. Hence, the infidelity decreases as O(1/N)O(1/{N}) firstly. When the resource number increases (104N105.510^{4}\leq N\leq 10^{5.5}), the near-zero eigenvalues start to take effect. We cannot distinguish them from zero accurately and thus the infidelity scales as O(1/N)O(1/\sqrt{N}). Finally, when the resource number is large enough (N105.5N\geq 10^{5.5}) to clearly distinguish between the near-zero eigenvalues and zero, we are performing full rank detector tomography actually, which has O(1/N)O(1/{N}) decay rate for infidelity. For Random Non-adaptive tomography, it can be divided into two segments. When the resource number is not enough (N106.5N\leq 10^{6.5}) to estimate the near-zero eigenvalues accurately, the infidelity decreases as O(1/N)O(1/\sqrt{N}). When the resource number is large enough (N106.5N\geq 10^{6.5}) to clearly distinguish between the near-zero eigenvalues and zero, the infidelity scales as O(1/N)O(1/{N}). Overall, the Adaptive GPB tomography is the best among these methods.

For detector P2P_{2}, all the eigenvalues are significantly larger than zero, and P2P_{2} is full-rank. Hence, the infidelity decreases as O(1/N)O(1/{N}) for both non-adaptive and adaptive tomography.

5.2.2 Three-valued detectors

Three-valued detectors is different from binary detectors because three-valued detector matrices generally cannot be diagonalized by the same unitary matrix like binary detectors. We consider a three-valued detector as

P1+P2+P3=I,\displaystyle P_{1}+P_{2}+P_{3}=I, (64)
P1=U1diag(0.4,0,0,0)U1=0.4U1(|0000|)U1,\displaystyle P_{1}=U_{1}\operatorname{diag}\left(0.4,0,0,0\right)U_{1}^{\dagger}=0.4U_{1}(|00\rangle\langle 00|)U_{1}^{\dagger},
P2=U2diag(0,0.5,0,0)U2=0.5U2(|0101|)U2.\displaystyle P_{2}=U_{2}\operatorname{diag}\left(0,0.5,0,0\right)U_{2}^{\dagger}=0.5U_{2}(|01\rangle\langle 01|)U_{2}^{\dagger}.

This detector is constructed from two-qubit MUB measurement (see Appendix A). The first detector P1P_{1} is from |00|00\rangle where we product a coefficient 0.40.4 and conjugate a unitary rotation U1U_{1}. In a similar way, P2P_{2} is from |01|01\rangle where we product a coefficient 0.50.5 and conjugate a unitary rotation U2U_{2}. We can prove that P3=IP1P2P_{3}=I-P_{1}-P_{2} is always positive semidefinite. The unitary rotations U1U_{1} and U2U_{2} are generated by the random unitary algorithm in [50, 52]. For each resource number, we run the algorithm 100100 times and obtain the average infidelity and standard deviation.

Refer to caption
Figure 6: Infidelity for three-valued detectors with given unitary matrices U1U_{1} and U2U_{2} in (64). (a) P1P_{1}; (b) P2P_{2}; (c) P3P_{3}.

We focus on the comparison between Adaptive GPB tomography and Random Non-adaptive tomography. The simulation result is in Fig. 6 where our adaptive tomography can reach O(1/N)O(1/N) for P1P_{1} and P2P_{2} as proved in Corollary 4.8, improving the O(1/N)O(1/\sqrt{N}) scaling of non-adaptive tomography. For P3P_{3}, all the eigenvalues are far from zero and both tomography methods can reach O(1/N)O(1/N). We also test robustness by performing adaptive QDT on 200200 random three-valued detectors in the form of (64) by changing unitary matrices U1U_{1} and U2U_{2}, which is similar as binary detectors. The result is shown in Fig. 7. For P1P_{1} and P2P_{2}, the adaptive tomography is robust (with small standard deviation) and their infidelities can reach O(1/N)O(1/N). For full rank P3P_{3}, both tomography methods can reach O(1/N)O(1/N).

Refer to caption
Figure 7: Infidelity for 200200 different three-valued detectors by changing U1U_{1} and U2U_{2} in (64). (a) P1P_{1}; (b) P2P_{2}; (c) P3P_{3}.

Then we consider that P1P_{1} and P2P_{2} have three small eigenvalues as

P1+P2+P3=I,\displaystyle P_{1}+P_{2}+P_{3}=I, (65)
P1=U1diag(0.4,0.001,0.001,0.001)U1,\displaystyle P_{1}=U_{1}\operatorname{diag}\left(0.4,0.001,0.001,0.001\right)U_{1}^{\dagger},
P2=U2diag(0.001,0.5,0.001,0.001)U2.\displaystyle P_{2}=U_{2}\operatorname{diag}\left(0.001,0.5,0.001,0.001\right)U_{2}^{\dagger}.
Refer to caption
Figure 8: Infidelity for three-valued detectors in (65). (a) P1P_{1}; (b) P2P_{2}; (c) P3P_{3}.

The results for P1P_{1} and P2P_{2} can also be divided into three segments as shown in Fig. 8 and we have explained for binary detectors. For P3P_{3}, all the eigenvalues are far from zero and hence, both tomography can reach O(1/N)O(1/N).

5.3 Adaptive QDT using coherent states

Since the adaptive GPB states can improve the infidelity for all detectors if they have zero or near-zero eigenvalues, in this part, we use the superposition of coherent states to approximate the ideal adaptive GPB states. We consider binary detectors as (62). We use ncn_{c}-coherent states as shown in Fig. 9 where nc=1,2,3n_{c}=1,2,3.

Refer to caption
Refer to caption
Figure 9: Superposition of ncn_{c}-coherent states for (62) where nc=1,2,3n_{c}=1,2,3. (a) P1P_{1}; (b) P2P_{2}.

For given ncn_{c}, the adaptive tomography performance is usually better than non-adaptive one. As ncn_{c} increases, the approximation error decreases. When the resource number NN is not large enough to distinguish the approximation error, the infidelity scales close to O(1/N)O(1/N), like nc=2n_{c}=2, adaptive for P2P_{2} in Fig. 9(b) when N105N\leq 10^{5}. When N105N\geq 10^{5}, the infidelity scales to O(1/N)O(1/\sqrt{N}) because of the approximation error.

For binary detectors as (63), the similar results are shown in Fig. 10.

Refer to caption
Refer to caption
Figure 10: Superposition of ncn_{c}-coherent states for (63) where nc=1,2,3n_{c}=1,2,3. (a) P1P_{1}; (b) P2P_{2}.

For P1P_{1}, all the curves can be roughly divided into two segments. For 1-coherent state, the infidelity scales as O(1/N)O(1/\sqrt{N}) when N107N\leq 10^{7} and scales as O(1/N)O(1/N) when N107N\geq 10^{7}. For 2,3-coherent states, the infidelity scales as O(1/N)O(1/\sqrt{N}) when N106N\leq 10^{6} and scales as O(1/N)O(1/N) when N106N\geq 10^{6}. This result is similar to Fig. 5 without the first segment. For P2P_{2}, the infidelity scales as O(1/N)O(1/N) because all the eigenvalues are significantly larger than zero.

6 Conclusion

In this paper we have investigated how to optimize the probe states in quantum detector tomography. We have characterized the optimal probe state sets based on minimizing the UMSE and minimizing the condition number. We have proven that SIC and MUB states are optimal. In the adaptive scenario we have proposed a two-step strategy to adaptively optimize the probe states, and proven that our strategy can improve the modified infidelity from O(1/N)O(1/\sqrt{N}) to O(1/N)O(1/N) under certain conditions. Numerical examples were presented to demonstrate the effectiveness of our strategies.

{ack}

Y. Wang would like to thank Prof. Howard M. Wiseman for the helpful discussion. We thank the anonymous referees and the Associate Editor for the constructive comments.

Appendix Appendix A Several two-qubit measurements and the induced probe states

Five MUB measurement sets for two-qubit quantum states [42] are

{|ψn(MUB)}={{|ψnA},{|ψnB},{|ψnC},{|ψnD},{|ψnE}},\left\{\!|\psi_{n}^{(\text{MUB})}\rangle\!\right\}\!=\!\left\{\!\left\{\left|\psi_{n}^{A}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{B}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{C}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{D}\right\rangle\!\right\}\!,\!\left\{\left|\psi_{n}^{E}\right\rangle\!\right\}\!\right\}, (A.66)

and

{|ψnA}={|00,|01,|10,|11},\displaystyle\left\{\left|\psi_{n}^{A}\right\rangle\right\}=\{|00\rangle,|01\rangle,|10\rangle,|11\rangle\}, (A.67)
{|ψnB}={|R±,|L±},\displaystyle\left\{\left|\psi_{n}^{B}\right\rangle\right\}=\{|R\pm\rangle,|L\pm\rangle\},
{|ψnC}={|±R,|±L},\displaystyle\left\{\left|\psi_{n}^{C}\right\rangle\right\}=\{|\pm R\rangle,|\pm L\rangle\},
{|ψnD}={12(|R0±i|L1),12(|R1±i|L0)},\displaystyle\left\{\left|\psi_{n}^{D}\right\rangle\right\}=\left\{\frac{1}{\sqrt{2}}(|R0\rangle\pm\mathrm{i}|L1\rangle),\frac{1}{\sqrt{2}}(|R1\rangle\pm\mathrm{i}|L0\rangle)\right\},
{|ψnE}={12(|RR±i|LL),12(|RL±i|LR)}.\displaystyle\left\{\left|\psi_{n}^{E}\right\rangle\right\}=\left\{\frac{1}{\sqrt{2}}(|RR\rangle\pm\mathrm{i}|LL\rangle),\frac{1}{\sqrt{2}}(|RL\rangle\pm\mathrm{i}|LR\rangle)\right\}.

where |±=(|0±|1)/2|\pm\rangle=(|0\rangle\pm|1\rangle)/\sqrt{2}, |R=(|0i|1)/2|R\rangle=(|0\rangle-i|1\rangle)/\sqrt{2}, and |L=(|0+i|1)/2|L\rangle=(|0\rangle+i|1\rangle)/\sqrt{2}. This protocol was applied in the QST experiment in [42]. In this paper, Pn=|ψn(MUB)ψn(MUB)|P_{n}=|\psi_{n}^{(\text{MUB})}\rangle\langle\psi_{n}^{(\text{MUB})}| is a projection measurement and ρn=|ψn(MUB)ψn(MUB)|\rho_{n}=|\psi_{n}^{(\text{MUB})}\rangle\langle\psi_{n}^{(\text{MUB})}| is called a MUB state.

For two-qubit SIC POVM, one expression [53], ignoring overall phases and normalization, is

(xxxxiiiiiiiiiiii1111xxxxiiii111111111111xxxxiiii1111iiii1111xxxx),\left(\begin{array}[]{rrrrrrrrrrrrrrrr}x&x&x&x&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-\mathrm{i}\\ 1&1&-1&-1&x&x&x&x&\mathrm{i}&-\mathrm{i}&\mathrm{i}&-\mathrm{i}&1&-1&1&-1\\ 1&-1&1&-1&1&-1&1&-1&x&x&x&x&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}\\ 1&-1&-1&1&-\mathrm{i}&\mathrm{i}&\mathrm{i}&-\mathrm{i}&-1&1&1&-1&x&x&x&x\end{array}\right), (A.68)

where x=2+5x=\sqrt{2+\sqrt{5}} and |ψn(SIC)\left|\psi_{n}^{(\text{SIC})}\right\rangle is the nn-th column of (A.68). In this paper, the set of ρi=|ψn(SIC)ψn(SIC)|Tr(|ψn(SIC)ψn(SIC)|)\rho_{i}=\frac{\left|\psi_{n}^{(\text{SIC})}\right\rangle\left\langle\psi_{n}^{(\text{SIC})}\right|}{\operatorname{Tr}\left(\left|\psi_{n}^{(\text{SIC})}\right\rangle\left\langle\psi_{n}^{(\text{SIC})}\right|\right)} consists of SIC states.

For Cube measurement [54], it is based on the projections onto all the 36 tensor products of the eigenstates of the standard single-qubit Pauli operators

{|ψn(Cube)}={|0,|1,|+,|,|R,|L}2.\left\{\left|\psi_{n}^{(\text{Cube})}\right\rangle\right\}=\{|0\rangle,|1\rangle,|+\rangle,|-\rangle,|R\rangle,|L\rangle\}^{\otimes 2}. (A.69)

In this paper, ρn=|ψn(Cube)ψn(Cube)|\rho_{n}=|\psi_{n}^{(\text{Cube})}\rangle\langle\psi_{n}^{(\text{Cube})}| is a Cube state which is a product state of two single-qubit MUB states.

For GPB (generalized Pauli basis) states, they are

ρiz=|ii|,\rho_{i}^{z}=|i\rangle\langle i|, (A.70)

where i=1,,di=1,\ldots,d and

ρijx=(|ij|+|ji|+|ii|+|jj|)/2,\rho_{ij}^{x}=(|i\rangle\langle j|+|j\rangle\langle i|+|i\rangle\langle i|+|j\rangle\langle j|)/2, (A.71)
ρijy=(i|ij|+i|ji|+|ii|+|jj|)/2,\rho_{ij}^{y}=(-\mathrm{i}|i\rangle\langle j|+\mathrm{i}|j\rangle\langle i|+|i\rangle\langle i|+|j\rangle\langle j|)/2, (A.72)

where 1i<jd1\leq i<j\leq d. The set {|i}\{|i\rangle\} of states with i=1,,di=1,\ldots,d is an arbitrary orthonormal basis. The total number of these states is d2d^{2}. To generate new GPB states in Step 2 of adaptive QDT, we replace the set {|i}i=1d\{|i\rangle\}_{i=1}^{d} by {|λi}i=1d\left\{\left|\lambda_{i}\right\rangle\right\}_{i=1}^{d} which are the eigenvectors of Step 1 estimation.

For random pure states, we choose 3232 random pure states using the algorithm in [50, 51] which are uniformly distributed according to Haar measure.

Appendix Appendix B Optimal one-qubit states

For one-qubit state, we can expand them in Pauli matrices as

ρj=I2+ϕjxσx2+ϕjyσy2+ϕjzσz2,\rho_{j}=\frac{I}{2}+\phi_{j}^{x}\frac{\sigma_{x}}{\sqrt{2}}+\phi_{j}^{y}\frac{\sigma_{y}}{\sqrt{2}}+\phi_{j}^{z}\frac{\sigma_{z}}{\sqrt{2}}, (B.73)

where (ϕjx)2+(ϕjy)2+(ϕjz)212(\phi_{j}^{x})^{2}+(\phi_{j}^{y})^{2}+(\phi_{j}^{z})^{2}\leq\frac{1}{2}. The parameterization XTXX^{T}X for MM one-qubit states are

XTX=(M222j=1Mϕjx22j=1Mϕjy22j=1Mϕjz22j=1Mϕjxj=1M(ϕjx)2j=1Mϕjxϕjyj=1Mϕjxϕjz22j=1Mϕjyj=1Mϕjxϕjyj=1M(ϕjy)2j=1Mϕjyϕjz22j=1Mϕjzj=1Mϕjxϕjzj=1Mϕjyϕjzj=1M(ϕjz)2).X^{T}X=\left(\begin{array}[]{cccc}\frac{M}{2}&\frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{x}&\frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{y}&\frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{z}\\ \frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{x}&\sum_{j=1}^{M}(\phi_{j}^{x})^{2}&\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{y}&\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{z}\\ \frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{y}&\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{y}&\sum_{j=1}^{M}(\phi_{j}^{y})^{2}&\sum_{j=1}^{M}\phi_{j}^{y}\phi_{j}^{z}\\ \frac{\sqrt{2}}{2}\sum_{j=1}^{M}\phi_{j}^{z}&\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{z}&\sum_{j=1}^{M}\phi_{j}^{y}\phi_{j}^{z}&\sum_{j=1}^{M}(\phi_{j}^{z})^{2}\end{array}\right). (B.74)

If these MM states are OPS, by Theorem 1, we have

j=1Mϕjx=j=1Mϕjy=j=1Mϕjz=0,\sum_{j=1}^{M}\phi_{j}^{x}=\sum_{j=1}^{M}\phi_{j}^{y}=\sum_{j=1}^{M}\phi_{j}^{z}=0, (B.75)
j=1Mϕjxϕjy=j=1Mϕjxϕjz=j=1Mϕjyϕjz=0,\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{y}=\sum_{j=1}^{M}\phi_{j}^{x}\phi_{j}^{z}=\sum_{j=1}^{M}\phi_{j}^{y}\phi_{j}^{z}=0, (B.76)
j=1M(ϕjx)2=j=1M(ϕjy)2=j=1M(ϕjz)2=M6,\sum_{j=1}^{M}(\phi_{j}^{x})^{2}=\sum_{j=1}^{M}(\phi_{j}^{y})^{2}=\sum_{j=1}^{M}(\phi_{j}^{z})^{2}=\frac{M}{6}, (B.77)

and

(ϕjx)2+(ϕjy)2+(ϕjz)2=12,1jM.(\phi_{j}^{x})^{2}+(\phi_{j}^{y})^{2}+(\phi_{j}^{z})^{2}=\frac{1}{2},\forall 1\leq j\leq M. (B.78)

We conjecture that real solutions exists for these equations (i.e., OPS exist) if and only if M=4,6,8,12M=4,6,8,12 and 2020, corresponding to the five platonic solids on the Bloch sphere. We prove this is true for M=5M=5.

For M=5M=5, if there exist OPS, then we construct a 5×45\times 4 parameterization matrix of probe states XX as

X=(1565ϕ1x65ϕ1y65ϕ1z1565ϕ5x65ϕ5y65ϕ5z),X=\left(\begin{array}[]{cccc}\frac{1}{\sqrt{5}}&\sqrt{\frac{6}{5}}\phi_{1}^{x}&\sqrt{\frac{6}{5}}\phi_{1}^{y}&\sqrt{\frac{6}{5}}\phi_{1}^{z}\\ \vdots&\vdots&\vdots&\vdots\\ \frac{1}{\sqrt{5}}&\sqrt{\frac{6}{5}}\phi_{5}^{x}&\sqrt{\frac{6}{5}}\phi_{5}^{y}&\sqrt{\frac{6}{5}}\phi_{5}^{z}\end{array}\right), (B.79)

where the columns of XX are orthogonal to each other and their 2-norm is all 11. The 2-norm of each row of XX is 45\sqrt{\frac{4}{5}}. Using Gram-Schmidt process, there must exist a vector bb such that [X,b][X,b] is an orthogonal matrix. By considering its row-norm, we know each element of bb squares to 15\frac{1}{5}, and thus is either 15\sqrt{\frac{1}{5}} or 15-\sqrt{\frac{1}{5}}. However, such bb can never be orthogonal to the first column of XX, contradicting the fact that [X,b][X,b] is an orthogonal matrix. Thus, there do not exist OPS for M=5M=5.

Appendix Appendix C Proof of Theorem 3.5

{pf}

a) Preliminary calculations

The parameterization error in one probe state is

ρjρ^j2=Tr[(ρjρ^j)(ρjρ^j)]=Tr[(b=1d2(ϕjbϕ^jb)Ωb)(b=1d2(ϕjbϕ^jb)Ωb)]=b=1d2|ϕjbϕ^jb|2=ϕjϕ^j2.\begin{array}[]{l}\left\|\rho_{j}-\hat{\rho}_{j}\right\|^{2}=\operatorname{Tr}\left[\left(\rho_{j}^{\dagger}-\hat{\rho}_{j}^{\dagger}\right)\left(\rho_{j}-\hat{\rho}_{j}\right)\right]\\ =\operatorname{Tr}\left[\left(\sum_{b=1}^{d^{2}}\left(\phi_{j}^{b}-\hat{\phi}_{j}^{b}\right)\Omega_{b}\right)^{\dagger}\left(\sum_{b=1}^{d^{2}}\left(\phi_{j}^{b}-\hat{\phi}_{j}^{b}\right)\Omega_{b}\right)\right]\\ =\sum_{b=1}^{d^{2}}\left|\phi_{j}^{b}-\hat{\phi}_{j}^{b}\right|^{2}=\left\|\phi_{j}-\hat{\phi}_{j}\right\|^{2}.\end{array} (C.80)

Hence, the total parameterization error in XX is

XX^2=j=1Mϕjϕ^j2Mε2.\|X-\hat{X}\|^{2}=\sum_{j=1}^{M}\left\|\phi_{j}-\hat{\phi}_{j}\right\|^{2}\leq M\varepsilon^{2}. (C.81)

For the error in XTXX^{T}X, we thus have

XTXX^TX^XTXXTX^+XTX^X^TX^XXX^+X^XX^=(X+X^)XX^2MXX^2Mε,\begin{array}[]{l}\left\|X^{T}X-\hat{X}^{T}\hat{X}\right\|\leq\left\|X^{T}X-X^{T}\hat{X}\right\|+\left\|X^{T}\hat{X}-\hat{X}^{T}\hat{X}\right\|\\ \leq\|X\|\cdot\|X-\hat{X}\|+\|\hat{X}\|\cdot\|X-\hat{X}\|\\ =(\|X\|+\|\hat{X}\|)\|X-\hat{X}\|\\ \leq 2\sqrt{M}\|X-\hat{X}\|\leq 2M\varepsilon,\end{array} (C.82)

where X2=j=1Mϕj2=j=1MTr(ρρ)=j=1MTr(ρ2)M\|X\|^{2}=\sum_{j=1}^{M}\left\|\phi_{j}\right\|^{2}=\sum_{j=1}^{M}\operatorname{Tr}\left(\rho^{\dagger}\rho\right)=\sum_{j=1}^{M}\operatorname{Tr}\left(\rho^{2}\right)\leq M and similarly X^M\|\hat{X}\|\leq\sqrt{M}.

b) Error in the UMSE

We introduce Weyl’s perturbation theorem, which can be found in [55],

Lemma Appendix C.9.

[55] Let A,BA,B be Hermitian matrices with eigenvalues λ1(A)λn(A)\lambda_{1}(A)\geq\cdots\geq\lambda_{n}(A) and λ1(B)λn(B)\lambda_{1}(B)\geq\cdots\geq\lambda_{n}(B), respectively. Then

maxj|λj(A)λj(B)|AB.\max_{j}\left|\lambda_{j}(A)-\lambda_{j}(B)\right|\leq\|A-B\|. (C.83)

Lemma Appendix C.9 was for the operator norm, not larger than its Frobenius norm for any finite dimension square matrix [55]. Therefore, it also holds for the Frobenius norm. Suppose the eigenvalues of XTXX^{T}X are λ1λ2λd2>0\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{d^{2}}>0 and the eigenvalues of X^TX^\hat{X}^{T}\hat{X} are μ1μ2μd2>0\mu_{1}\geq\mu_{2}\geq\cdots\geq\mu_{d^{2}}>0. Thus, using Lemma Appendix C.9, we have

|λ1μ1|XTXX^TX^2Mε,|λd2μd2|XTXX^TX^2Mε,\begin{array}[]{l}\left|\lambda_{1}-\mu_{1}\right|\leq\left\|X^{T}X-\hat{X}^{T}\hat{X}\right\|\leq 2M\varepsilon,\\ \left|\lambda_{d^{2}}-\mu_{d^{2}}\right|\leq\left\|X^{T}X-\hat{X}^{T}\hat{X}\right\|\leq 2M\varepsilon,\end{array} (C.84)

and therefore,

λ12Mεμ1λ1+2Mε,λd22Mεμd2λd2+2Mε.\begin{array}[]{l}\lambda_{1}-2M\varepsilon\leq\mu_{1}\leq\lambda_{1}+2M\varepsilon,\\ \lambda_{d^{2}}-2M\varepsilon\leq\mu_{d^{2}}\leq\lambda_{d^{2}}+2M\varepsilon.\end{array} (C.85)

For UMSE, we have

|Tr(XTX)1Tr(X^TX^)1|=|i=1d2(1λi1μi)|i=1d2|1λi1μi|i=1d2|λiμi|(λd22Mε)2d2maxi|λiμi|(λd22Mε)2d2(λd22Mε)2XTXX^TX^,\begin{array}[]{l}\left|\operatorname{Tr}\left(X^{T}X\right)^{-1}-\operatorname{Tr}\left(\hat{X}^{T}\hat{X}\right)^{-1}\right|=\left|\sum_{i=1}^{d^{2}}\left(\frac{1}{\lambda_{i}}-\frac{1}{\mu_{i}}\right)\right|\\ \leq\sum_{i=1}^{d^{2}}\left|\frac{1}{\lambda_{i}}-\frac{1}{\mu_{i}}\right|\leq\dfrac{\sum_{i=1}^{d^{2}}\left|\lambda_{i}-\mu_{i}\right|}{(\lambda_{d^{2}}-2M\varepsilon)^{2}}\\ \leq\dfrac{d^{2}\max_{i}\left|\lambda_{i}-\mu_{i}\right|}{(\lambda_{d^{2}}-2M\varepsilon)^{2}}\leq\dfrac{d^{2}}{(\lambda_{d^{2}}-2M\varepsilon)^{2}}\left\|X^{T}X-\hat{X}^{T}\hat{X}\right\|,\end{array} (C.86)

because λiμimin{λd22,μd22}(λd22Mε)2>0\lambda_{i}\mu_{i}\geq\min\left\{\lambda_{d^{2}}^{2},\mu_{d^{2}}^{2}\right\}\geq\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}>0 for i=1,2,,d2i=1,2,\cdots,d^{2}.

We combine (C.80), (C.81), (C.82) and (C.86) to obtain

(n1)M4N|Tr(XTX)1Tr(X^TX^)1|\displaystyle\frac{(n-1)M}{4N}\big{|}\operatorname{Tr}\left(X^{T}X\right)^{-1}-\operatorname{Tr}\left(\hat{X}^{T}\hat{X}\right)^{-1}\big{|} (C.87)
\displaystyle\leq 2(n1)d2M2ε4N(λd22Mε)2.\displaystyle\frac{2(n-1)d^{2}M^{2}\varepsilon}{4N{\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}}}.

c) Error in the condition number

For condition number, we have

|λ1λd2μ1μd2|=|λ1μd2μ1λd2|λd2μd2\displaystyle\quad\left|\sqrt{\frac{\lambda_{1}}{\lambda_{d^{2}}}}-\sqrt{\frac{\mu_{1}}{\mu_{d^{2}}}}\right|=\frac{\left|\sqrt{\lambda_{1}\mu_{d^{2}}}-\sqrt{\mu_{1}\lambda_{d^{2}}}\right|}{\sqrt{\lambda_{d^{2}}\mu_{d^{2}}}} (C.88)
=|λ1μd2λd2μ1|λd2μd2(λ1μd2+λd2μ1)\displaystyle=\frac{\left|\lambda_{1}\mu_{d^{2}}-\lambda_{d^{2}}\mu_{1}\right|}{\sqrt{\lambda_{d^{2}}\mu_{d^{2}}}\left(\sqrt{\lambda_{1}\mu_{d^{2}}}+\sqrt{\lambda_{d^{2}}\mu_{1}}\right)}
|λ1(λd2+2Mε)λd2(λ12Mε)|2(λd22Mε)2\displaystyle\leq\frac{\left|{\lambda_{1}\left(\lambda_{d^{2}}+2M\varepsilon\right)}-{\lambda_{d^{2}}\left(\lambda_{1}-2M\varepsilon\right)}\right|}{2\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}}
=M(λ1+λd2)ε(λd22Mε)2.\displaystyle=\frac{M\left(\lambda_{1}+\lambda_{d^{2}}\right)\varepsilon}{\left(\lambda_{d^{2}}-2M\varepsilon\right)^{2}}.

\Box

Appendix Appendix D On the distortion of F0F_{0} in (45)

According to the definition, distortion happens when there exists {P^i}i=1n{Pi}i=1n\left\{\hat{P}_{i}\right\}_{i=1}^{n}\neq\left\{P_{i}\right\}_{i=1}^{n} such that F0(P^i,Pi)=1F_{0}\left(\hat{P}_{i},P_{i}\right)=1 for all ii, which means P^iTr(P^i)=PiTr(Pi)\frac{\hat{P}_{i}}{\operatorname{Tr}\left(\hat{P}_{i}\right)}=\frac{P_{i}}{\operatorname{Tr}\left(P_{i}\right)} for all ii. This indicates that the vector a^0=[Tr(P^1)Tr(P1),,Tr(P^n)Tr(Pn)]T\hat{a}_{0}=\left[\frac{\operatorname{Tr}\left(\hat{P}_{1}\right)}{\operatorname{Tr}\left(P_{1}\right)},\cdots,\frac{\operatorname{Tr}\left(\hat{P}_{n}\right)}{\operatorname{Tr}\left(P_{n}\right)}\right]^{T} and a0=[1,,1]Ta_{0}=[1,\cdots,1]^{T} are different. Note that i=1nP^i=i=1na^0iPi=I\sum_{i=1}^{n}\hat{P}_{i}=\sum_{i=1}^{n}\hat{a}_{0i}P_{i}=I and i=1nPi=I\sum_{i=1}^{n}P_{i}=I. Hence, the following linear equation

[vec(P1),,vec(Pn)]a=vec(I),\left[\operatorname{vec}\left(P_{1}\right),\cdots,\operatorname{vec}\left(P_{n}\right)\right]a=\operatorname{vec}(I), (D.89)

has at least two different solutions a=a^0a=\hat{a}_{0} and a=a0a=a_{0}. Therefore, Rank[vec(P1),,vec(Pn)]<n\operatorname{Rank}\left[\operatorname{vec}\left(P_{1}\right),\cdots,\operatorname{vec}\left(P_{n}\right)\right]<n, and thus {Pi}i=1n\left\{P_{i}\right\}_{i=1}^{n} are linearly dependent.

Conversely, if {Pi}i=1n\left\{P_{i}\right\}_{i=1}^{n} are linearly dependent, (D.89) will have non-trivial homogeneous general solutions, among which we randomly pick a non-zero solution a=[v1,,vn]Ta=[v_{1},\cdots,v_{n}]^{T}. Since (D.89) already has a trivial solution a0a_{0}, we choose w>0w>0 small enough, such that a=[1+wv1,,1+wvn]Ta^{\prime}=[1+wv_{1},\cdots,1+wv_{n}]^{T} has all the elements strictly positive. Hence, aa^{\prime} is another set of positive numbers not equal to [1,,1]T[1,\cdots,1]^{T} such that i=1naiPi=I\sum_{i=1}^{n}a^{\prime}_{i}P_{i}=I. Let the estimation be P^i=aiPi\hat{P}_{i}=a^{\prime}_{i}P_{i}, and distortion happens.

Notice that there are d2d^{2} real degrees of freedom in Pi{P_{i}}, and hence the row rank of LHS of (D.89) is at most d2d^{2}. If d2<nd^{2}<n, we must have Rank[vec(P1),,vec(Pn)]<n\operatorname{Rank}\left[\operatorname{vec}\left(P_{1}\right),\cdots,\operatorname{vec}\left(P_{n}\right)\right]<n and there must exist distortion.

Appendix Appendix E Lower bound of the new fidelity (46)

We show a tight lower bound of our new fidelity (46) as 1d1\frac{1}{d}-1. One can arbitrarily approximate this lower bound but cannot reach it.

We assume that the spectral decomposition of PiP_{i} and P^i\hat{P}_{i} are Pi=UiΔiUiP_{i}=U_{i}\Delta_{i}U_{i}^{\dagger} where Δi=\Delta_{i}= diag(δ1,,δd)\operatorname{diag}\left(\delta_{1},\cdots,\delta_{d}\right), 0δ1δd10\leq\delta_{1}\leq\cdots\leq\delta_{d}\leq 1 and P^i=U^iΔ^iU^i\hat{P}_{i}=\hat{U}_{i}\hat{\Delta}_{i}\hat{U}_{i}^{\dagger} where Δ^i=diag(δ^1,,δ^d)\hat{\Delta}_{i}=\operatorname{diag}\left(\hat{\delta}_{1},\cdots,\hat{\delta}_{d}\right), 1δ^1δ^d01\geq\hat{\delta}_{1}\geq\cdots\geq\hat{\delta}_{d}\geq 0. Define Δi=diag(δd,,δ1){\Delta}_{i}^{\downarrow}=\operatorname{diag}\left({\delta}_{d},\cdots,{\delta}_{1}\right). According to Theorem 2.1 in [56], we have

F0(Δ^i,Δi)F0(P^i,Pi)F0(Δ^i,Δi).F_{0}\left(\hat{\Delta}_{i},\Delta_{i}\right)\leq F_{0}\left(\hat{P}_{i},P_{i}\right)\leq F_{0}\left(\hat{\Delta}_{i},\Delta_{i}^{\downarrow}\right). (E.90)

Therefore,

F(Δ^i,Δi)F(P^i,Pi)F(Δ^i,Δi),F\left(\hat{\Delta}_{i},\Delta_{i}\right)\leq F\left(\hat{P}_{i},P_{i}\right)\leq F\left(\hat{\Delta}_{i},\Delta_{i}^{\downarrow}\right), (E.91)

and we only need to consider the lower bound of F(Δ^i,Δi)F\left(\hat{\Delta}_{i},\Delta_{i}\right). We have

F(Δ^i,Δi)\displaystyle F\left(\hat{\Delta}_{i},\Delta_{i}\right) =(j=1dδjδ^j)2(j=1dδj)(j=1dδ^j)(j=1d(δjδ^j))2d2\displaystyle\!=\!\frac{\left(\sum_{j=1}^{d}\sqrt{\delta_{j}\hat{\delta}_{j}}\right)^{2}}{\left(\!\sum_{j=1}^{d}\delta_{j}\!\right)\left(\!\sum_{j=1}^{d}\hat{\delta}_{j}\!\right)}\!-\!\frac{\left(\!\sum_{j=1}^{d}\left(\delta_{j}-\hat{\delta}_{j}\right)\!\right)^{2}}{d^{2}} (E.92)
=j=1dδjδ^j+jkδjδ^jδkδ^k(j=1dδj)(j=1dδ^j)\displaystyle\!=\!\frac{\sum_{j=1}^{d}\delta_{j}\hat{\delta}_{j}+\sum_{j\neq k}\sqrt{\delta_{j}\hat{\delta}_{j}\delta_{k}\hat{\delta}_{k}}}{\left(\sum_{j=1}^{d}\delta_{j}\right)\left(\sum_{j=1}^{d}\hat{\delta}_{j}\right)}
(j=1d(δjδ^j))2d2\displaystyle\!\quad-\frac{\left(\sum_{j=1}^{d}\left(\delta_{j}-\hat{\delta}_{j}\right)\right)^{2}}{d^{2}}
j=1dδjδ^j(j=1dδj)(j=1dδ^j)(j=1d(δjδ^j))2d2.\displaystyle\!\geq\!\frac{\sum_{j=1}^{d}\delta_{j}\hat{\delta}_{j}}{\left(\!\sum_{j=1}^{d}\delta_{j}\!\right)\left(\!\sum_{j=1}^{d}\hat{\delta}_{j}\!\right)}-\frac{\left(\!\sum_{j=1}^{d}\left(\delta_{j}-\hat{\delta}_{j}\right)\!\right)^{2}}{d^{2}}.

If j=1dδjj=1dδ^j\sum_{j=1}^{d}\delta_{j}\geq\sum_{j=1}^{d}\hat{\delta}_{j}, we have

F(Δ^i,Δi)\displaystyle F\left(\hat{\Delta}_{i},\Delta_{i}\right) j=1dδjδ^j(j=1dδj)(j=1dδ^j)(j=1d(δjδ^j))2d2\displaystyle\!\geq\!\frac{\sum_{j=1}^{d}\delta_{j}\hat{\delta}_{j}}{\left(\!\sum_{j=1}^{d}\delta_{j}\!\right)\left(\!\sum_{j=1}^{d}\hat{\delta}_{j}\!\right)}\!-\!\frac{\left(\!\sum_{j=1}^{d}\left(\delta_{j}-\hat{\delta}_{j}\right)\!\right)^{2}}{d^{2}} (E.93)
δ1(j=1dδ^j)(j=1dδj)(j=1dδ^j)(j=1dδj)2d2\displaystyle\!\geq\!\frac{\delta_{1}\left(\sum_{j=1}^{d}\hat{\delta}_{j}\right)}{\left(\sum_{j=1}^{d}\delta_{j}\right)\left(\sum_{j=1}^{d}\hat{\delta}_{j}\right)}-\frac{\left(\sum_{j=1}^{d}\delta_{j}\right)^{2}}{d^{2}}
=δ1j=1dδj(j=1dδj)2d2\displaystyle\!=\!\frac{\delta_{1}}{\sum_{j=1}^{d}\delta_{j}}-\frac{\left(\sum_{j=1}^{d}\delta_{j}\right)^{2}}{d^{2}}
δ1δ1+d1(δ1+d1)2d21d1,\displaystyle\!\geq\!\frac{\delta_{1}}{\delta_{1}+d-1}-\frac{\left(\delta_{1}+d-1\right)^{2}}{d^{2}}\geq\frac{1}{d}-1,

where the last inequality can be obtained by analyzing the derivative on δ1\delta_{1}. Similarly, if j=1dδj<j=1dδ^j\sum_{j=1}^{d}\delta_{j}<\sum_{j=1}^{d}\hat{\delta}_{j}, we have

F(Δ^i,Δi)\displaystyle F\left(\hat{\Delta}_{i},\Delta_{i}\right) δ^dj=1dδ^j(j=1dδ^j)2d21d1.\displaystyle\geq\frac{\hat{\delta}_{d}}{\sum_{j=1}^{d}\hat{\delta}_{j}}-\frac{\left(\sum_{j=1}^{d}\hat{\delta}_{j}\right)^{2}}{d^{2}}\geq\frac{1}{d}-1. (E.94)

The minimum is achieved if and only if one of PiP_{i} and P^i\hat{P}_{i} is an identity matrix and the other is a zero matrix, which cannot happen in practice. Hence, 1d1\frac{1}{d}-1 cannot be achieved. Finally, consider for example Pi=diag(1τ,1,1,,1)P_{i}=\operatorname{diag}(1-\tau,1,1,\cdots,1) and P^i=diag(τ,0,0,,0)\hat{P}_{i}=\operatorname{diag}({\tau},0,0,\cdots,0) where τ>0\tau>0. Then as τ\tau tends to zero, F(P^i,Pi)F\left(\hat{P}_{i},P_{i}\right) can be arbitrarily close to 1d1\frac{1}{d}-1. Hence, 1d1\frac{1}{d}-1 is a tight but unattainable lower bound.

Appendix Appendix F Detailed calculation of the first-order term in infidelity

Firstly, we analyze the error between real coefficients {Sijx,Sijy,Skz}\left\{S_{ij}^{x},S_{ij}^{y},S_{k}^{z}\right\} and estimated coefficients {S^ijx,S^ijy,S^kz}\left\{\hat{S}_{ij}^{x},\hat{S}_{ij}^{y},\hat{S}_{k}^{z}\right\}.

Using a reconstruction algorithm whose MSE scales as O(1/N)O(1/N) such as two-stage QDT reconstruction algorithm in [28] and MLE in [18], the mean squared error from Step 2 for the detector PP is bounded by O(1NN0)O\left(\frac{1}{N-N_{0}}\right), and thus

𝔼(P^P2)\displaystyle\quad\mathbb{E}\left(\left\|\hat{P}-P\right\|^{2}\right) (F.95)
=\displaystyle= 1i<jd𝔼(S^ijxSijx)2+1i<jd𝔼(S^ijySijy)2\displaystyle\sum_{1\leq i<j\leq d}\mathbb{E}\left(\hat{S}_{ij}^{x}-S_{ij}^{x}\right)^{2}+\sum_{1\leq i<j\leq d}\mathbb{E}\left(\hat{S}_{ij}^{y}-S_{ij}^{y}\right)^{2}
+k=1d𝔼(S^kzSkz)2O(1NN0).\displaystyle+\sum_{k=1}^{d}\mathbb{E}\left(\hat{S}_{k}^{z}-S_{k}^{z}\right)^{2}\sim O\left(\frac{1}{N-N_{0}}\right).

Therefore, the errors between real coefficients {Sijx,Sijy,Skz}\left\{S_{ij}^{x},S_{ij}^{y},S_{k}^{z}\right\} and estimated coefficients {S^ijx,S^ijy,S^kz}\left\{\hat{S}_{ij}^{x},\hat{S}_{ij}^{y},\hat{S}_{k}^{z}\right\} all scale as O(1NN0)O(\frac{1}{\sqrt{N-N_{0}}}). Also,

𝔼|Tr(P^)Tr(P)|=𝔼|k=1d(S^kzSkz)|O(1NN0),\mathbb{E}\left|\operatorname{Tr}(\hat{P})-\operatorname{Tr}(P)\right|=\mathbb{E}\left|\sum_{k=1}^{d}\left(\hat{S}_{k}^{z}-S_{k}^{z}\right)\right|\sim O\left(\frac{1}{\sqrt{N-N_{0}}}\right), (F.96)

and thus,

𝔼|S^ijxTr(P^)SijxTr(P)|\displaystyle\mathbb{E}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right| (F.97)
=𝔼|SijxTr(P)+Tr(P)O(1NN0)SijxTr(P)SijxO(1NN0)|Tr(P)[Tr(P)+O(1NN0)]\displaystyle=\mathbb{E}\frac{\left|S_{ij}^{x}\operatorname{Tr}(P)+\operatorname{Tr}(P)O\left(\frac{1}{\sqrt{N-N_{0}}}\right)-S_{ij}^{x}\operatorname{Tr}(P)-S_{ij}^{x}O\left(\frac{1}{\sqrt{N-N_{0}}}\right)\right|}{\operatorname{Tr}(P)\left[\operatorname{Tr}(P)+O\left(\frac{1}{\sqrt{N-N_{0}}}\right)\right]}
O(1NN0).\displaystyle\sim O\left(\frac{1}{\sqrt{N-N_{0}}}\right).

Similarly, 𝔼|S^ijyTr(P^)SijyTr(P)|O(1NN0)\mathbb{E}\left|\frac{\hat{S}_{ij}^{y}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{y}}{\operatorname{Tr}(P)}\right|\sim O\left(\frac{1}{\sqrt{N-N_{0}}}\right) and
𝔼|S^kzTr(P^)SkzTr(P)|O(1NN0)\mathbb{E}\left|\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(\hat{P})}-\frac{S_{k}^{z}}{\operatorname{Tr}(P)}\right|\sim O\left(\frac{1}{\sqrt{N-N_{0}}}\right).

For the first term in RHS of (56), if tit\neq i and tjt\neq j, using (53), we have

𝔼|S^ijxTr(P^)SijxTr(P)||λt|σ~ijx|λt|\displaystyle\mathbb{E}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left|\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{t}\right\rangle\right| (F.98)
𝔼[|S^ijxTr(P^)SijxTr(P)|(|λt|λ~iλ~j|λt|\displaystyle\leq\mathbb{E}\Bigg{[}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left(\left|\left\langle\lambda_{t}|\tilde{\lambda}_{i}\right\rangle\left\langle\tilde{\lambda}_{j}|\lambda_{t}\right\rangle\right|\right.
+|λt|λ~jλ~i|λt|)]\displaystyle+\left.\left|\left\langle\lambda_{t}|\tilde{\lambda}_{j}\right\rangle\left\langle\tilde{\lambda}_{i}|\lambda_{t}\right\rangle\right|\right)\Bigg{]}
=O(1NN0)[O(1N0)+O(1N0)]\displaystyle=O\left(\frac{1}{\sqrt{N-N_{0}}}\right)\left[O\left(\frac{1}{{N_{0}}}\right)+O\left(\frac{1}{{N_{0}}}\right)\right]
=O(1N0NN0).\displaystyle=O\left(\frac{1}{N_{0}\sqrt{N-N_{0}}}\right).

Thus, 𝔼|S^ijxTr(P^)SijxTr(P)|λt|σ~ijx|λt=O(1N0NN0)\mathbb{E}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left\langle\lambda_{t}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{t}\right\rangle=O\left(\frac{1}{N_{0}\sqrt{N-N_{0}}}\right).

Otherwise, if t=it=i or t=jt=j, without loss of generality, we assume t=it=i. Using (53), we have

𝔼|S^ijxTr(P^)SijxTr(P)||λi|σ~ijx|λi|\displaystyle\mathbb{E}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left|\left\langle\lambda_{i}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{i}\right\rangle\right| (F.99)
𝔼[|S^ijxTr(P^)SijxTr(P)|(|λi|λ~iλ~j|λi|\displaystyle\leq\mathbb{E}\Bigg{[}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left(\left|\left\langle\lambda_{i}|\tilde{\lambda}_{i}\right\rangle\left\langle\tilde{\lambda}_{j}|\lambda_{i}\right\rangle\right|\right.
+|λi|λ~jλ~i|λi|)]\displaystyle+\left.\left|\left\langle\lambda_{i}|\tilde{\lambda}_{j}\right\rangle\left\langle\tilde{\lambda}_{i}|\lambda_{i}\right\rangle\right|\right)\Bigg{]}
=O(1NN0)((1+O(1N0))O(1N0)\displaystyle=O\left(\frac{1}{\sqrt{N-N_{0}}}\right)\left(\left(1+O\left(\frac{1}{N_{0}}\right)\right)O\left(\frac{1}{\sqrt{N_{0}}}\right)\right.
+O(1N0)(1+O(1N0)))\displaystyle+\left.O\left(\frac{1}{\sqrt{N_{0}}}\right)\left(1+O\left(\frac{1}{N_{0}}\right)\right)\right)
=O(1N0(NN0)).\displaystyle=O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right).

Thus, 𝔼|S^ijxTr(P^)SijxTr(P)|λi|σ~ijx|λi=O(1N0(NN0))\mathbb{E}\left|\frac{\hat{S}_{ij}^{x}}{\operatorname{Tr}(\hat{P})}-\frac{S_{ij}^{x}}{\operatorname{Tr}(P)}\right|\left\langle\lambda_{i}\left|\tilde{\sigma}_{ij}^{x}\right|\lambda_{i}\right\rangle=O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right).

Therefore, the total error for the first term in RHS of (56) scales as O(1N0(NN0))O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right). Similarly, the same scaling holds for the second term in RHS of (56).

Finally, we calculate the third term in RHS of (56) and also divide it into two cases. For the first case, we consider 1kr,tr+11\leq k\leq r,t\geq r+1 and have

𝔼|S^kzTr(P^)SkzTr(P)|λt|σ~kz|λt\displaystyle\mathbb{E}\left|\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(\hat{P})}-\frac{S_{k}^{z}}{\operatorname{Tr}(P)}\right|\left\langle\lambda_{t}\left|\tilde{\sigma}_{k}^{z}\right|\lambda_{t}\right\rangle (F.100)
=O(1NN0)O(1N0)\displaystyle=O\left(\frac{1}{\sqrt{N-N_{0}}}\right)O\left(\frac{1}{{N_{0}}}\right)
=O(1N0NN0).\displaystyle=O\left(\frac{1}{N_{0}\sqrt{N-N_{0}}}\right).

For the second case where r+1kdr+1\leq k\leq d, from [37] we have

var(S^kz)=1Nkz(p~kz(1p~kz)),\operatorname{var}\left(\hat{S}_{k}^{z}\right)=\frac{1}{N_{k}^{z}}\left(\tilde{p}_{k}^{z}\left(1-\tilde{p}_{k}^{z}\right)\right), (F.101)

where NkzO(NN0){N_{k}^{z}}\sim O\left({{N-N_{0}}}\right) is the resource number used in Step 2 to estimate S^kz\hat{S}_{k}^{z}, and p~kz{\tilde{p}_{k}^{z}} is

p~kz\displaystyle{\tilde{p}_{k}^{z}} =Tr(P|λ~kλ~k|)=Tr(i=1dλi|λiλi|λ~kλ~k|)\displaystyle=\operatorname{Tr}\left(P|\tilde{\lambda}_{k}\rangle\langle\tilde{\lambda}_{k}|\right)=\operatorname{Tr}\left(\sum_{i=1}^{d}\lambda_{i}\left|\lambda_{i}\right\rangle\langle\lambda_{i}|\tilde{\lambda}_{k}\rangle\langle\tilde{\lambda}_{k}|\right) (F.102)
=tkλtTr(|λtλt|λ~kλ~k|)+λkTr(|λkλk|λ~kλ~k|)\displaystyle=\sum_{t\neq k}\lambda_{t}\operatorname{Tr}\left(\left|\lambda_{t}\right\rangle\langle\lambda_{t}|\tilde{\lambda}_{k}\rangle\langle\tilde{\lambda}_{k}|\right)+\lambda_{k}\operatorname{Tr}\left(\left|\lambda_{k}\right\rangle\langle\lambda_{k}|\tilde{\lambda}_{k}\rangle\langle\tilde{\lambda}_{k}|\right)
=tkλt(O(1N0))\displaystyle=\sum_{t\neq k}\lambda_{t}\left(O\left(\frac{1}{N_{0}}\right)\right)
=O(1N0),\displaystyle=O\left(\frac{1}{N_{0}}\right),

because λk=0\lambda_{k}=0. Hence, we have

var(S^kzTr(P^))=var(S^kzTr(P)+O(1/(NN0)))\displaystyle\operatorname{var}\left(\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(\hat{P})}\right)=\operatorname{var}\left(\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(P)+O\left(1/\left(N-N_{0}\right)\right)}\right) (F.103)
=var(S^kzTr(P)S^kzO(1/(NN0))Tr(P)(Tr(P)+O(1/(NN0))))\displaystyle=\operatorname{var}\left(\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(P)}-\frac{\hat{S}_{k}^{z}O\left(1/\left(N-N_{0}\right)\right)}{\operatorname{Tr}(P)\left(\operatorname{Tr}(P)+O\left(1/\left(N-N_{0}\right)\right)\right)}\right)
var(S^kzTr(P))var(S^kz)O(1N0(NN0)).\displaystyle\sim\operatorname{var}\left(\frac{\hat{S}_{k}^{z}}{\operatorname{Tr}(P)}\right)\sim\operatorname{var}\left(\hat{S}_{k}^{z}\right)\sim O\left(\frac{1}{N_{0}\left(N-N_{0}\right)}\right).

Therefore, the third term in RHS of (56) scales as O(1N0(NN0))O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right). Hence, the first-order term (56) scales as

𝔼(t=r+1dλt|Δ1|λt)=O(1N0(NN0)).\displaystyle\mathbb{E}\left(\sum_{t=r+1}^{d}\left\langle\lambda_{t}|\Delta_{1}|\lambda_{t}\right\rangle\right)=O\left(\frac{1}{\sqrt{N_{0}(N-N_{0})}}\right). (F.104)

References

  • [1] D. P. DiVincenzo, “Quantum computation,” Science, vol. 270, no. 5234, pp. 255–261, 1995.
  • [2] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 2010.
  • [3] C. L. Degen, F. Reinhard, and P. Cappellaro, “Quantum sensing,” Review of Modern Physics, vol. 89, p. 035002, 2017.
  • [4] B. Qi, Z. Hou, L. Li, D. Dong, G.-Y. Xiang, and G.-C. Guo, “Quantum state tomography via linear regression estimation,” Scientific Reports, vol. 3, no. 1, pp. 1–6, 2013.
  • [5] Z. Hou, H. S. Zhong, Y. Tian, D. Dong, B. Qi, L. Li, Y. Wang, F. Nori, G.-Y. Xiang, C.-F. Li, and G.-C. Guo, “Full reconstruction of a 14-qubit state within four hours,” New Journal of Physics, vol. 18, no. 8, p. 083036, 2016.
  • [6] B. Mu, H. Qi, I. R. Petersen, and G. Shi, “Quantum tomography by regularized linear regressions,” Automatica, vol. 114, p. 108837, 2020.
  • [7] Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, “Quantum gate identification: Error analysis, numerical results and optical experiment,” Automatica, vol. 101, pp. 269 – 279, 2019.
  • [8] J. Fiurášek and Z. Hradil, “Maximum-likelihood estimation of quantum processes,” Physical Review A, vol. 63, p. 020101, 2001.
  • [9] Q. Yu, D. Dong, Y. Wang, and I. R. Petersen, “Adaptive quantum process tomography via linear regression estimation,” in Proc. of 2020 IEEE International Conference on Systems, Man, and Cybernetics, pp. 4173–4178, 2020.
  • [10] Q. Yu, Y. Wang, D. Dong, and I. R. Petersen, “On the capability of a class of quantum sensors,” Automatica, vol. 129, p. 109612, 2021.
  • [11] S. Xiao, S. Xue, D. Dong, and J. Zhang, “Identification of time-varying decoherence rates for open quantum systems,” IEEE Transactions on Quantum Engineering, vol. 2, pp. 1–12, 2021.
  • [12] Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa, “A quantum Hamiltonian identification algorithm: Computational complexity and error analysis,” IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1388–1403, 2018.
  • [13] Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, “Quantum Hamiltonian identifiability via a similarity transformation approach and beyond,” IEEE Transactions on Automatic Control, vol. 65, no. 11, pp. 4632–4647, 2020.
  • [14] J. Zhang and M. Sarovar, “Quantum Hamiltonian identification from measurement time traces,” Physical Review Letters, vol. 113, p. 080401, 2014.
  • [15] A. Sone and P. Cappellaro, “Hamiltonian identifiability assisted by a single-probe measurement,” Physical Review A, vol. 95, p. 022335, 2017.
  • [16] J. Zhang and M. Sarovar, “Identification of open quantum systems from observable time traces,” Physical Review A, vol. 91, no. 5, p. 052121, 2015.
  • [17] A. Sone and P. Cappellaro, “Exact dimension estimation of interacting qubit systems assisted by a single quantum probe,” Physical Review A, vol. 96, p. 062334, 2017.
  • [18] J. Fiurášek, “Maximum-likelihood estimation of quantum measurement,” Physical Review A, vol. 64, p. 024102, 2001.
  • [19] S. Grandi, A. Zavatta, M. Bellini, and M. G. A. Paris, “Experimental quantum tomography of a homodyne detector,” New Journal of Physics, vol. 19, no. 5, p. 053015, 2017.
  • [20] J. J. Renema, G. Frucci, Z. Zhou, F. Mattioli, A. Gaggero, R. Leoni, M. J. A. de Dood, A. Fiore, and M. P. van Exter, “Modified detector tomography technique applied to a superconducting multiphoton nanodetector,” Optics Express, vol. 20, no. 3, pp. 2806–2813, 2012.
  • [21] A. Feito, J. S. Lundeen, H. Coldenstrodt-Ronge, J. Eisert, M. B. Plenio, and I. A. Walmsley, “Measuring measurement: theory and practice,” New Journal of Physics, vol. 11, no. 9, p. 093038, 2009.
  • [22] J. S. Lundeen, A. Feito, H. Coldenstrodt-Ronge, K. L. Pregnell, C. Silberhorn, T. C. Ralph, J. Eisert, M. B. Plenio, and I. A. Walmsley, “Tomography of quantum detectors,” Nature Physics, vol. 5, no. 1, pp. 27–30, 2009.
  • [23] C. M. Natarajan, L. Zhang, H. Coldenstrodt-Ronge, G. Donati, S. N. Dorenbos, V. Zwiller, I. A. Walmsley, and R. H. Hadfield, “Quantum detector tomography of a time-multiplexed superconducting nanowire single-photon detector at telecom wavelengths,” Optics Express, vol. 21, no. 1, pp. 893–902, 2013.
  • [24] G. Brida, L. Ciavarella, I. P. Degiovanni, M. Genovese, A. Migdall, M. G. Mingolla, M. G. A. Paris, F. Piacentini, and S. V. Polyakov, “Ancilla-assisted calibration of a measuring apparatus,” Physical Review Letters, vol. 108, p. 253601, 2012.
  • [25] G. Brida, L. Ciavarella, I. P. Degiovanni, M. Genovese, L. Lolli, M. G. Mingolla, F. Piacentini, M. Rajteri, E. Taralli, and M. G. A. Paris, “Quantum characterization of superconducting photon counters,” New Journal of Physics, vol. 14, no. 8, p. 085001, 2012.
  • [26] L. Zhang, H. B. Coldenstrodt-Ronge, A. Datta, G. Puentes, J. S. Lundeen, X.-M. Jin, B. J. Smith, M. B. Plenio, and I. A. Walmsley, “Mapping coherence in measurement via full quantum tomography of a hybrid optical detector,” Nature Photonics, vol. 6, no. 6, p. 364, 2012.
  • [27] L. Zhang, A. Datta, H. B. Coldenstrodt-Ronge, X.-M. Jin, J. Eisert, M. B. Plenio, and I. A. Walmsley, “Recursive quantum detector tomography,” New Journal of Physics, vol. 14, no. 11, p. 115005, 2012.
  • [28] Y. Wang, S. Yokoyama, D. Dong, I. R. Petersen, E. H. Huntington, and H. Yonezawa, “Two-stage estimation for quantum detector tomography: Error analysis, numerical and experimental results,” IEEE Transactions on Information Theory, vol. 67, no. 4, pp. 2293–2307, 2021.
  • [29] Y. Wang, D. Dong, and H. Yonezawa, “Tomography of binary quantum detectors,” in Proc. of 2019 IEEE 58th Conference on Decision and Control, pp. 396–400, 2019.
  • [30] A. Zhang, J. Xie, H. Xu, K. Zheng, H. Zhang, Y.-T. Poon, V. Vedral, and L. Zhang, “Experimental self-characterization of quantum measurements,” Physical Review Letters, vol. 124, p. 040402, 2020.
  • [31] S. Xiao, Y. Wang, D. Dong, and J. Zhang, “Optimal quantum detector tomography via linear regression estimation,” in Proc. of 2021 IEEE 60th Conference on Decision and Control, 2021.
  • [32] F. Huszár and N. M. T. Houlsby, “Adaptive Bayesian quantum tomography,” Physical Review A, vol. 85, p. 052120, 2012.
  • [33] G. I. Struchalin, I. A. Pogorelov, S. S. Straupe, K. S. Kravtsov, I. V. Radchenko, and S. P. Kulik, “Experimental adaptive quantum tomography of two-qubit states,” Physical Review A, vol. 93, p. 012103, 2016.
  • [34] K. S. Kravtsov, S. S. Straupe, I. V. Radchenko, N. M. T. Houlsby, F. Huszár, and S. P. Kulik, “Experimental adaptive Bayesian tomography,” Physical Review A, vol. 87, p. 062122, 2013.
  • [35] D. H. Mahler, L. A. Rozema, A. Darabi, C. Ferrie, R. Blume-Kohout, and A. M. Steinberg, “Adaptive quantum state tomography improves accuracy quadratically,” Physical Review Letters, vol. 111, p. 183601, 2013.
  • [36] B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, “Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment,” npj Quantum Information, vol. 3, no. 1, 2017.
  • [37] L. Pereira, L. Zambrano, J. Cortés-Vega, S. Niklitschek, and A. Delgado, “Adaptive quantum tomography in high dimensions,” Physical Review A, vol. 98, p. 012339, 2018.
  • [38] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
  • [39] A. Miranowicz, K. Bartkiewicz, J. Peřina, M. Koashi, N. Imoto, and F. Nori, “Optimal two-qubit tomography based on local and global measurements: Maximal robustness against errors as described by condition numbers,” Physical Review A, vol. 90, p. 062123, 2014.
  • [40] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves, “Symmetric informationally complete quantum measurements,” Journal of Mathematical Physics, vol. 45, no. 6, pp. 2171–2180, 2004.
  • [41] A. J. Scott, “SICs: Extending the list of solutions,” arXiv preprint arXiv:1703.03993, 2017.
  • [42] R. B. A. Adamson and A. M. Steinberg, “Improving quantum state estimation with mutually unbiased bases,” Physical Review Letters, vol. 105, p. 030406, 2010.
  • [43] T. Durt, B.-G. Englert, I. Bengtsson, and K. Życzkowski, “On mutually unbiased bases,” International Journal of Quantum Information, vol. 08, no. 04, pp. 535–640, 2010.
  • [44] A. Wünsche, V. Dodonov, O. Man’ko, and V. Man’ko, “Nonclassicality of states in quantum optics,” Fortschritte der Physik, vol. 49, no. 10‐11, pp. 1117–1122, 2001.
  • [45] S. Szabo, P. Adam, J. Janszky, and P. Domokos, “Construction of quantum states of the radiation field by discrete coherent-state superpositions,” Physical Review A, vol. 53, pp. 2698–2710, 1996.
  • [46] J. Janszky, P. Domokos, S. Szabó, and P. Adam, “Quantum-state engineering via discrete coherent-state superpositions,” Physical Review A, vol. 51, pp. 4191–4193, 1995.
  • [47] R. Jozsa, “Fidelity for mixed quantum states,” Journal of Modern Optics, vol. 41, no. 12, pp. 2315–2323, 1994.
  • [48] M. Hübner, “Explicit computation of the bures distance for density matrices,” Physics Letters A, vol. 163, no. 4, pp. 239 – 242, 1992.
  • [49] H. Zhu, Quantum State Estimation and Symmetric Informationally Complete POMs. PhD thesis, National University of Singapore, 2012.
  • [50] K. Zyczkowski and M. Kus, “Random unitary matrices,” Journal of Physics A: Mathematical and General, vol. 27, no. 12, pp. 4235–4245, 1994.
  • [51] J. A. Miszczak, “Generating and using truly random quantum states in Mathematica,” Computer Physics Communications, vol. 183, no. 1, pp. 118 – 124, 2012.
  • [52] N. Johnston, “QETLAB: A MATLAB toolbox for quantum entanglement, version 0.9,” Jan. 2016.
  • [53] I. Bengtsson, “From SICs and MUBs to Eddington,” Journal of Physics: Conference Series, vol. 254, p. 012007, 2010.
  • [54] M. D. de Burgh, N. K. Langford, A. C. Doherty, and A. Gilchrist, “Choice of measurement sets in qubit tomography,” Physical Review A, vol. 78, p. 052122, 2008.
  • [55] R. Bhatia, Matrix Analysis. Springer-Verlag New York, 1997.
  • [56] L. Zhang and S.-M. Fei, “Quantum fidelity and relative entropy between unitary orbits,” Journal of Physics A: Mathematical and Theoretical, vol. 47, no. 5, p. 055301, 2014.