
Machine Learning Regularization
for the Minimum Volume Formula of Toric Calabi-Yau 3-folds

Eugene Choi^{a} and Rak-Kyeong Seong^{a,b}
xeugenechoi@gmail.com, seong@unist.ac.kr
^{a} Department of Mathematical Sciences, and
^{b} Department of Physics,
Ulsan National Institute of Science and Technology,
50 UNIST-gil, Ulsan 44919, South Korea
Abstract

We present a collection of explicit formulas for the minimum volume of Sasaki-Einstein 5-manifolds. The cone over these 5-manifolds is a toric Calabi-Yau 3-fold. These toric Calabi-Yau 3-folds are associated with an infinite class of 4d \mathcal{N}=1 supersymmetric gauge theories, which are realized as worldvolume theories of D3-branes probing the toric Calabi-Yau 3-folds. Under the AdS/CFT correspondence, the minimum volume of the Sasaki-Einstein base is inversely proportional to the central charge of the corresponding 4d \mathcal{N}=1 superconformal field theories. The presented formulas for the minimum volume are in terms of geometric invariants of the toric Calabi-Yau 3-folds. These explicit results are derived by implementing machine learning regularization techniques that advance beyond previous applications of machine learning for determining the minimum volume. Moreover, the use of machine learning regularization allows us to present interpretable and explainable formulas for the minimum volume. Our work confirms that, even for extensive sets of toric Calabi-Yau 3-folds, the proposed formulas approximate the minimum volume with remarkable accuracy.

preprint: UNIST-MTH-23-RS-05

I Introduction

Since the introduction of machine learning techniques in He:2017aed ; Krefl:2017yox ; Ruehle:2017mzq ; Carifio:2017bov ; Cole:2019enn ; Cole:2020gkd ; Halverson:2020trp ; Gukov:2020qaj ; Abel:2021rrj ; Krippendorf:2021uxu ; Cole:2021nnt ; Berglund:2023ztk ; Demirtas:2023fir for studying problems that occur in the context of string theory, machine learning – both supervised Bull:2018uow ; Jejjala:2019kio ; Brodie:2019dfx ; He:2020lbz ; Erbin:2020tks ; Anagiannis:2021cco ; Larfors:2022nep and unsupervised Krippendorf:2020gny ; Berman:2021mcw ; Bao:2021olg ; Seong:2023njx – has led to a variety of applications in string theory. A problem that appeared particularly suited for machine learning in 2017 Krefl:2017yox was the problem of identifying a formula for the minimum volume of Sasaki-Einstein 5-manifolds Martelli:2006yb ; Martelli:2005tp . The cone over these Sasaki-Einstein 5-manifolds is a toric Calabi-Yau 3-fold fulton ; 1997hep.th…11013L . Given that there are infinitely many toric Calabi-Yau 3-folds with corresponding Sasaki-Einstein 5-manifolds and that there is an infinite class of 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theories associated to them via string theory Greene:1996cy ; Douglas:1997de ; Witten:1998qj ; Klebanov:1998hh ; Douglas:1996sw ; Lawrence:1998ja ; Feng:2000mi ; Feng:2001xr , this beautiful correspondence between geometry and gauge theory was identified in Krefl:2017yox as an ideal testbed for introducing machine learning for string theory.

These 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theories corresponding to toric Calabi-Yau 3-folds are realized as worldvolume theories of D3-branes probing the Calabi-Yau singularities. Via the AdS/CFT correspondence Maldacena:1997re ; Morrison:1998cs ; Acharya:1998db , the minimum volume of the Sasaki-Einstein 5-manifolds is related to the maximized aa-function Intriligator:2003jj ; Butti:2005vn ; Butti:2005ps that gives the central charges of the corresponding 4d4d 𝒩=1\mathcal{N}=1 superconformal field theories Gubser:1998vd ; Henningson:1998gx . The proposal in Krefl:2017yox was that machine learning techniques can be used to give a formula of the minimum volume in terms of features taken from the toric diagram of the corresponding toric Calabi-Yau 3-folds. Such a formula would significantly simplify the computation of the minimum volume, which conventionally is computed by minimizing the volume function obtained from the equivariant index Martelli:2006yb ; Martelli:2005tp or Hilbert series of the toric Calabi-Yau 3-fold Benvenuti:2006qr ; Feng:2007ur .

In Krefl:2017yox , we made use of multiple linear regression gauss1823theoria ; fisher1922mathematical ; mendenhall2003second ; freedman2009statistical ; jobson2012applied and a combination of a regression model and a convolutional neural network (CNN) lecun1998gradient ; krizhevsky2012imagenet ; lecun2015deep ; schmidhuber2015deep to learn the minimum volume for toric Calabi-Yau 3-folds. As is often the case for supervised machine learning rumelhart1986learning ; hastie2009elements , the models lacked interpretability and explainability, achieving high accuracies in estimating the minimum volume while giving only little insight into the mathematical structure and physical origin of the estimating formula.

        0  1  2  3  4  5  6  7  8  9
D5      ×  ×  ×  ×  ·  ×  ·  ×  ·  ·
NS5     ×  ×  ×  ×  ----- Σ -----  ·  ·
Table 1: Type IIB brane configuration for brane tilings, where Σ:P(x,y)=0\Sigma:P(x,y)=0 refers to the holomorphic curve defined by the corresponding toric Calabi-Yau 3-fold and the Newton polynomial P(x,y)P(x,y) of the associated toric diagram Δ\Delta Hori:2000kt ; Feng:2005gw .

In this work, we aim to highlight the pivotal role of regularization techniques in machine learning tikhonov1963regularization ; hastie2009elements . We demonstrate that employing regularized machine learning models can effectively address the limitations inherent in supervised machine learning, especially for problems that appear in string theory and, more broadly, for problems at the intersection of mathematics and physics. While the primary objective of regularization in machine learning is to prevent overfitting, certain versions of it can be employed to eliminate model parameters, echoing the spirit of regularization in quantum field theory.

By focusing on Least Absolute Shrinkage and Selection Operator (Lasso) regularization tibshirani1996regression for polynomial and logarithmic regression models, we identify several candidate formulas for the minimum volume of Sasaki-Einstein 5-manifolds corresponding to toric Calabi-Yau 3-folds. The discovered formulas depend on either 3 or 6 parameters that come from features of the corresponding toric diagrams fulton ; 1997hep.th…11013L – convex lattice polygons on \mathbb{Z}^{2} that uniquely characterize the associated toric Calabi-Yau 3-fold. Compared to the extremely large number of parameters in the regression and CNN models used in our previous work in Krefl:2017yox , the formulas obtained in this study are presentable, interpretable, and, most importantly, reusable for the computation of the minimum volume for toric Calabi-Yau 3-folds.

II Calabi-Yau 3-Folds and Quiver Gauge Theories

In this work, we concentrate on non-compact toric Calabi-Yau 3-folds 𝒳\mathcal{X}. These geometries can be considered as cones over Sasaki-Einstein 5-manifolds Y5Y_{5} Maldacena:1997re ; Morrison:1998cs ; Acharya:1998db ; Martelli:2004wu ; Benvenuti:2004dy ; Benvenuti:2005ja ; Butti:2005sw . The toric Calabi-Yau 3-folds are fully characterized by convex lattice polygons Δ\Delta on 2\mathbb{Z}^{2} known as toric diagrams fulton ; 1997hep.th…11013L . The associated Calabi-Yau singularities can be probed by D3-branes whose worldvolume theories form a class of 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theories Greene:1996cy ; Douglas:1997de ; Witten:1998qj ; Klebanov:1998hh ; Douglas:1996sw ; Lawrence:1998ja ; Feng:2000mi ; Feng:2001xr .

This class of 4d \mathcal{N}=1 supersymmetric gauge theories can be represented in terms of a T-dual Type IIB brane configuration known as a brane tiling Franco:2005rj ; Hanany:2005ve ; Franco:2005sm . Table 1 summarizes the Type IIB brane configuration. Brane tilings can be illustrated in terms of bipartite graphs on a 2-torus T^{2} 2003math…..10326K ; kasteleyn1967graph and encapsulate both the field theory information and the information about the associated toric Calabi-Yau geometry. Figure 1 shows an example of a brane tiling and its associated toric Calabi-Yau 3-fold, which is in this case the cone over the zeroth Hirzebruch surface F_{0} hirzebruch1968singularities ; brieskorn1966beispiele ; Morrison:1998cs ; Feng:2000mi . The mesonic moduli space Witten:1993yc ; Benvenuti:2006qr ; Feng:2007ur ; Butti:2007jv formed by the mesonic gauge invariant operators of these 4d \mathcal{N}=1 supersymmetric gauge theories with U(1) gauge groups is precisely the associated toric Calabi-Yau 3-fold. When all the gauge groups of the 4d \mathcal{N}=1 supersymmetric gauge theory are U(N), the mesonic moduli space is instead given by the N-th symmetric product of the toric Calabi-Yau 3-fold.

Refer to caption
Figure 1: (a) The brane tiling for the second phase of the zeroth Hirzebruch surface F0F_{0}, and (b) its corresponding toric diagram hirzebruch1968singularities ; brieskorn1966beispiele ; Morrison:1998cs ; Feng:2000mi .

The gravity dual of the 4d worldvolume theories is Type IIB string theory on AdS_{5}\times Y_{5}, where Y_{5} is the Sasaki-Einstein 5-manifold that forms the base of the associated toric Calabi-Yau 3-fold Maldacena:1997re ; Morrison:1998cs ; Acharya:1998db ; Martelli:2004wu ; Benvenuti:2004dy ; Benvenuti:2005ja ; Butti:2005sw . These 4d \mathcal{N}=1 supersymmetric gauge theories are known to flow at low energies to a superconformal fixed point. Under a procedure known as a-maximization Intriligator:2003jj ; Butti:2005vn ; Butti:2005ps , the superconformal R-charges of the 4d theory are determined. This procedure involves the maximization of the trial a-charge, which takes the form

a(R;Y5)=332(3TrR3TrR).\displaystyle a(R;Y_{5})=\frac{3}{32}(3\text{Tr}R^{3}-\text{Tr}R)~{}.~{} (II.1)

The maximization procedure gives the value of the central charge of the superconformal field theory at the conformal fixed point.

Under the AdS/CFT correspondence Maldacena:1997re ; Morrison:1998cs ; Acharya:1998db , the central charge is directly related to the minimized volume of the corresponding Sasaki-Einstein 5-manifold Y5Y_{5} Gubser:1998vd ; Henningson:1998gx . We have,

a(R;Y5)=π3N24V(R;Y5),\displaystyle a(R;Y_{5})=\frac{\pi^{3}N^{2}}{4V(R;Y_{5})}~{},~{} (II.2)

where the R-charges R, and as a result the volume function V(R;Y_{5}), can be expressed in terms of the Reeb vector components b_{i} of the corresponding Sasaki-Einstein 5-manifold Martelli:2006yb ; Martelli:2005tp . Conversely, computing the minimum volume,

Vmin=minbiV(bi;Y5),\displaystyle V_{min}=\text{min}_{b_{i}}~{}V(b_{i};Y_{5})~{},~{} (II.3)

is equivalent to obtaining the maximum value of the central charge a(R;Y5)a(R;Y_{5}). This correspondence is true for all 4d4d theories living on a stack of NN D3-branes probing toric Calabi-Yau 3-folds and has been checked extensively in various examples Intriligator:2003jj ; Butti:2005vn ; Butti:2005ps .

In this work, we will focus on the toric Calabi-Yau 3-folds and the corresponding Sasaki-Einstein 5-manifold Y5Y_{5}, with particular emphasis on the minimum volume VminV_{min} of the Sasaki-Einstein 5-manifolds Y5Y_{5}. Building on the pioneering work of Krefl:2017yox , this work proposes the use of more advanced machine learning techniques. In particular, we introduce machine learning regularization by using the Least Absolute Shrinkage and Selection Operator (Lasso) tibshirani1996regression in order to identify an explicit formula for the minimum volume VminV_{min} for Sasaki-Einstein 5-manifolds Y5Y_{5}. We expect to be able to write the minimum volume formula in terms of features obtained from the toric diagram of the corresponding toric Calabi-Yau 3-folds. The use of machine learning regularization allows us to eliminate parameters, reducing the necessary parameters for the volume formula to a manageable amount that is interpretable, presentable and reusable.

Before discussing these machine learning techniques, let us first review in the following section the computation of the volume functions for toric Calabi-Yau 3-folds using Hilbert series.

III Hilbert Series and Calabi-Yau Volumes

Given \mathcal{X} as a cone over a projective variety X, where X is realized as an affine variety in \mathbb{C}^{k}, the Hilbert series Benvenuti:2006qr ; Feng:2007ur is the generating function for the dimension of the graded pieces of the coordinate ring

[x1,,xk]/fi,\displaystyle\mathbb{C}[x_{1},\dots,x_{k}]/\langle f_{i}\rangle~{},~{} (III.4)

where fif_{i} are the defining polynomials of XX. Accordingly, the Hilbert series takes the general form

g(t;𝒳)=i=0dim(Xi)ti.\displaystyle g(t;\mathcal{X})=\sum_{i=0}^{\infty}\text{dim}_{\mathbb{C}}(X_{i})t^{i}~{}.~{} (III.5)

For 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theories given by brane tilings Franco:2005rj ; Hanany:2005ve ; Franco:2005sm , we have an associated toric Calabi-Yau 3-fold 𝒳\mathcal{X}, which becomes the mesonic moduli space Witten:1993yc ; Benvenuti:2006qr ; Feng:2007ur ; Butti:2007jv of the 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theory when the gauge groups are all U(1)U(1). The corresponding Hilbert series is the generating function of mesonic gauge invariant operators that form the mesonic moduli space. For the purpose of the remaining discussion, we will consider the 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theories given by brane tilings as abelian theories with U(1)U(1) gauge groups.

Following the forward algorithm for brane tilings Feng:2000mi , we can use GLSM fields Witten:1993yc given by perfect matchings pαp_{\alpha} Hanany:2005ve ; Franco:2005rj of the brane tilings in order to express the mesonic moduli space of the abelian 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theory as the following symplectic quotient,

𝒳=Irr//QD=([pα]//QF)//QD,\displaystyle\mathcal{X}={}^{\text{Irr}}\mathcal{F}^{\flat}//Q_{D}=\left(\mathbb{C}[p_{\alpha}]//Q_{F}\right)//Q_{D}~{},~{} (III.6)

where Irr{\text{Irr}}\mathcal{F}^{\flat} is the largest irreducible component, also known as the coherent component, of the master space \mathcal{F}^{\flat} Hanany:2010zz ; Forcella:2008bb ; Forcella:2008eh of the 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theory. The master space is the spectrum of the coordinate ring generated by the chiral fields encoded in pαp_{\alpha} and quotiented by the F-term relations encoded in QFQ_{F}. In (III.6), QFQ_{F} is the FF-term charge matrix summarizing the U(1)U(1) charges originating from the FF-terms, and QDQ_{D} is the DD-term charge matrix which summarizes the U(1)U(1) gauge charges on perfect matchings pαp_{\alpha}.

Following the symplectic quotient description of the mesonic moduli space in (III.6), the Hilbert series can be obtained by solving the Molien integral Pouliot:1998yv ,

g(y_{\alpha};\mathcal{X})=\prod_{i=1}^{c-3}\oint_{|z_{i}|=1}\frac{\mathrm{d}z_{i}}{2\pi iz_{i}}\times\prod_{\alpha=1}^{c}\frac{1}{1-y_{\alpha}\prod_{j=1}^{c-3}z_{j}^{(Q_{t})_{j\alpha}}}~{},~{} (III.7)

where cc is the number of perfect matchings in the brane tiling and Qt=(QF,QD)Q_{t}=(Q_{F},Q_{D}) is the total charge matrix.

Martelli:2006yb ; Martelli:2005tp showed that the same Hilbert series can be obtained directly from the toric diagram Δ\Delta of the toric Calabi-Yau 3-fold 𝒳\mathcal{X}. Given that the toric diagram Δ\Delta is a convex lattice polygon on 2\mathbb{Z}^{2} with an ideal triangulation 𝒯(Δ)\mathcal{T}(\Delta) into unit sub-triangles Δi𝒯(Δ)\Delta_{i}\in\mathcal{T}(\Delta), the Hilbert series of the corresponding toric Calabi-Yau 3-fold 𝒳\mathcal{X} can be written as

g(t_{i};\mathcal{X})=\sum_{i=1}^{r}\prod_{j=1}^{3}\frac{1}{(1-\mathbf{t}^{\mathbf{u}_{i,j}})}~{},~{} (III.8)

where i=1,,ri=1,\dots,r is the index for the rr unit triangles Δi𝒯(Δ)\Delta_{i}\in\mathcal{T}(\Delta), and j=1,2,3j=1,2,3 is the index for the 33 boundary edges of each unit triangle Δi\Delta_{i}. For each boundary edge ejΔie_{j}\in\Delta_{i}, we have a 33-dimensional outer normal vector 𝐮i,j\mathbf{u}_{i,j} whose components are assigned the following product of fugacities,

\mathbf{t}^{\mathbf{u}_{i,j}}=\prod_{a=1}^{3}t_{a}^{\mathbf{u}_{i,j}(a)}~{},~{} (III.9)

where \mathbf{u}_{i,j}(a) indicates the a-th component of \mathbf{u}_{i,j}. We note that \mathbf{u}_{i,j} is a 3-dimensional vector because the defining vertices of \Delta and \Delta_{i} all lie on a plane at height z=1, such that their coordinates are of the form (x,y,1). As a result, the vector \mathbf{u}_{i,j} corresponding to the edge e_{j}\in\Delta_{i} is normal to the plane spanned by the vectors connecting the origin (0,0,0) to the two bounding vertices of e_{j}\in\Delta_{i}.

Refer to caption
Figure 2: (a) The triangulated toric diagram for the zeroth Hirzebruch surface F0F_{0}, and (b) the corresponding normal vectors 𝐮i,j\mathbf{u}_{i,j} for each unit triangle Δi\Delta_{i} in the triangulation.

It is important to note that the fugacities t1,t2,t3t_{1},t_{2},t_{3} in (III.9) relate to the components of normal vectors 𝐮i,j\mathbf{u}_{i,j}, and therefore depend on the triangulation and the particular instance in a given GL(2,)GL(2,\mathbb{Z}) toric orbit of a toric diagram on the z=1z=1 plane. In comparison, the fugacities yαy_{\alpha} in (III.7) refer to the GLSM fields pαp_{\alpha} given by perfect matchings of the corresponding brane tiling. Since perfect matchings can be mapped directly to chiral fields in the 4d4d 𝒩=1\mathcal{N}=1 supersymmetric gauge theory, the fugacities yαy_{\alpha} in (III.7) can be mapped to fugacities counting global symmetry charges carried by chiral fields in the 4d4d theory. Because both Hilbert series from (III.7) and (III.8) refer to the same toric Calabi-Yau 3-fold 𝒳\mathcal{X}, there exists a fugacity map between yαy_{\alpha} and t1,t2,t3t_{1},t_{2},t_{3} that identifies the two Hilbert series with each other.

For the rest of the discussion, let us consider Hilbert series for toric Calabi-Yau 3-folds 𝒳\mathcal{X} that are in terms of fugacities t1,t2,t3t_{1},t_{2},t_{3} corresponding to coordinates of the normal vectors 𝐮i,j3\mathbf{u}_{i,j}\in\mathbb{Z}^{3} of the toric diagram Δ\Delta. Given the Hilbert series g(ti;𝒳)g(t_{i};\mathcal{X}), we can obtain the volume function Martelli:2006yb ; Martelli:2005tp of the Sasaki-Einstein 5-manifold Y5Y_{5} using,

V(bi;Y5)=limμ0μ3g(ti=exp[μbi];𝒳),\displaystyle V(b_{i};Y_{5})=\lim_{\mu\rightarrow 0}\mu^{3}g(t_{i}=\exp[-\mu b_{i}];\mathcal{X})~{},~{} (III.10)

where b_{i} are the Reeb vector components with i=1,\dots,3. We note that the Reeb vector {\bf b}=(b_{1},b_{2},b_{3}) always lies in the interior of the toric diagram \Delta and can be chosen such that one of its components is set to

b3=3,\displaystyle b_{3}=3~{},~{} (III.11)

for toric Calabi-Yau 3-folds 𝒳\mathcal{X}. We further note that the limit in (III.10) takes the leading order in μ\mu in the expansion for g(ti=exp[μbi];𝒳)g(t_{i}=\exp[-\mu b_{i}];\mathcal{X}), which is shown to refer to the volume of the Sasaki-Einstein base Y5Y_{5} in Martelli:2006yb ; Martelli:2005tp .

Let us consider in the following paragraph an example of the computation of the volume function in terms of Reeb vector components bib_{i} for the Sasaki-Einstein base of the cone over the zeroth Hirzebruch surface F0F_{0} hirzebruch1968singularities ; brieskorn1966beispiele ; Morrison:1998cs ; Feng:2000mi .

Example: F_{0}. The toric diagram, its triangulation and the outer normal vectors \mathbf{u}_{i,j} for the cone over the zeroth Hirzebruch surface F_{0} hirzebruch1968singularities ; brieskorn1966beispiele ; Morrison:1998cs ; Feng:2000mi are shown in Figure 2. The cone over the zeroth Hirzebruch surface F_{0} is an interesting toric Calabi-Yau 3-fold because it has two distinct corresponding 4d \mathcal{N}=1 supersymmetric gauge theories, represented by two distinct brane tilings that are related by Seiberg duality Seiberg:1994pq ; 2001JHEP…12..001B ; Feng:2000mi . One of the brane tilings is shown in Figure 1.

Using the outer normal vectors 𝐮i,j\mathbf{u}_{i,j} for each of the four unit sub-triangles Δi\Delta_{i} of the toric diagram for F0F_{0} in Figure 2(b), we can use (III.8) to write down the Hilbert series,

g(t_{i};F_{0})=\frac{1}{(1-t_{1})(1-t_{2}^{-1})(1-t_{1}^{-1}t_{2}t_{3}^{-1})}
+\frac{1}{(1-t_{1}^{-1})(1-t_{2}^{-1})(1-t_{1}t_{2}t_{3}^{-1})}
+\frac{1}{(1-t_{1})(1-t_{2})(1-t_{1}^{-1}t_{2}^{-1}t_{3}^{-1})}
+\frac{1}{(1-t_{1}^{-1})(1-t_{2})(1-t_{1}t_{2}^{-1}t_{3}^{-1})}~{}.~{} (III.12)

Using the limit in (III.10), we can derive the volume function of the Sasaki-Einstein base directly from the Hilbert series as follows,

V(b_{i};F_{0})=\frac{24}{(b_{1}-b_{2}-3)(b_{1}-b_{2}+3)(b_{1}+b_{2}-3)(b_{1}+b_{2}+3)}~{},~{} (III.13)

where b3=3b_{3}=3. When we find the global minimum of the volume function V(bi;F0)V(b_{i};F_{0}), we obtain

Vmin=minbiV(bi;F0)=8270.29630,\displaystyle V_{min}=\text{min}_{b_{i}}~{}V(b_{i};F_{0})=\frac{8}{27}\simeq 0.29630~{},~{} (III.14)

up to 5 decimal places, which occurs at the critical Reeb vector components b_{1}^{*}=b_{2}^{*}=0. In the remainder of this work, we will maintain a precision level of 5 decimal places for all numerical measurements.
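As a quick numerical cross-check of this example (our own sketch, not part of the original volume minimization pipeline), the global minimum in (III.14) can be reproduced by minimizing the volume function (III.13) directly with SciPy:

```python
# Numerical cross-check of the F0 example: minimize V(b1, b2; F0) from (III.13)
# over the Reeb vector components, with b3 = 3 fixed.
from scipy.optimize import minimize

def volume_F0(b):
    b1, b2 = b
    return 24.0 / ((b1 - b2 - 3) * (b1 - b2 + 3) * (b1 + b2 - 3) * (b1 + b2 + 3))

res = minimize(volume_F0, x0=[0.5, 0.5])
print(res.x, res.fun)  # expect b1* = b2* = 0 and V_min = 8/27 ~ 0.29630
```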

IV Features of Toric Diagrams and Regression

The aim of this work is to identify an expression for the minimum volume VminV_{min} of Sasaki-Einstein 5-manifolds Y5Y_{5} in terms of parameters that we know from the corresponding toric Calabi-Yau 3-folds 𝒳\mathcal{X}. We refer to these parameters as features, denoted as xax_{a}, of the toric Calabi-Yau 3-fold 𝒳\mathcal{X}.

Assuming that we have NxN_{x} features xax_{a} for a given toric Calabi-Yau 3-fold, the proposal in Krefl:2017yox states that we can write down a candidate linear function for the inverse minimum volume in terms of these features as follows,

1/V^min(xaj)y^j=β0+a=1Nxβaxaj,\displaystyle 1/\hat{V}_{min}(x^{j}_{a})\equiv\hat{y}^{j}=\beta_{0}+\sum_{a=1}^{N_{x}}\beta_{a}x_{a}^{j}~{},~{} (IV.15)

where β0\beta_{0} and βa\beta_{a} are real coefficients, and jj labels the particular toric Calabi-Yau 3-fold 𝒳j\mathcal{X}^{j} with its corresponding toric diagram Δj2\Delta^{j}\in\mathbb{Z}^{2}.

Let us refer to the inverse of the actual minimum volume obtained by volume minimization as 1/Vminjyj1/V_{min}^{j}\equiv y^{j} for a given toric Calabi-Yau 3-fold 𝒳j\mathcal{X}^{j}. If for a set SS of N=|S|N=|S| toric Calabi-Yau 3-folds 𝒳j\mathcal{X}^{j}, we know the actual minimum volumes VminjV_{min}^{j} via volume minimization, then we can calculate the following residual sum of squares of the difference between the inverses of the actual and the expected minimum volumes for the entire set SS,

\mathcal{L}=\frac{1}{2N}\sum_{j=1}^{N=|S|}\left(y^{j}-\hat{y}^{j}\right)^{2}
=\frac{1}{2N}\sum_{j=1}^{N}\left(1/V_{min}^{j}-\beta_{0}-\sum_{a=1}^{N_{x}}\beta_{a}x_{a}^{j}\right)^{2}~{}.~{} (IV.16)

Here, \mathcal{L} can be considered as a loss function goodfellow2016deep that evaluates the performance of the candidate function for the minimum volume in (IV.15). In multiple linear regression gauss1823theoria ; fisher1922mathematical ; mendenhall2003second ; freedman2009statistical ; jobson2012applied , as initially proposed in Krefl:2017yox , the optimization task is to minimize the loss function in (IV.16) for a given dataset S of toric Calabi-Yau 3-folds,

argminβ0,βa.\displaystyle\text{argmin}_{\beta_{0},\beta_{a}}\mathcal{L}~{}.~{} (IV.17)

In Krefl:2017yox , multiple linear regression was used to obtain a candidate minimum volume function using the following feature set,

xaj{f1,f2,f3,f1f2,f1f3,,f12,f22,f32}j,\displaystyle x_{a}^{j}\in\{f_{1},f_{2},f_{3},f_{1}f_{2},f_{1}f_{3},\dots,f_{1}^{2},f_{2}^{2},f_{3}^{2}\}^{j}~{},~{} (IV.18)

where

f1=I,f2=E,f3=V,\displaystyle f_{1}=I~{},~{}f_{2}=E~{},~{}f_{3}=V~{},~{} (IV.19)

corresponding respectively to the number of internal lattice points in Δj\Delta^{j}, the number of boundary lattice points in Δj\Delta^{j}, and the number of vertices that form the extremal corner points in Δj\Delta^{j}, for a given toric Calabi-Yau 3-fold 𝒳j\mathcal{X}^{j}. Under Pick’s theorem pick1899geometrisches , these features are related as follows,

A=I+E/21,\displaystyle A=I+E/2-1~{},~{} (IV.20)

where AA is the area of the toric diagram Δ\Delta, with the area of the smallest unit triangle in 2\mathbb{Z}^{2} having A=1/2A=1/2.
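As a simple check of (IV.20), the toric diagram of the cone over F_{0} shown in Figure 2(a) has I=1 internal lattice point and E=4 boundary lattice points, giving A=1+4/2-1=2, in agreement with its triangulation into four unit sub-triangles of area 1/2 each.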

With a dataset SS of N=15,147N=15,147 toric Calabi-Yau 3-folds, the work in Krefl:2017yox showed that the candidate linear function in (IV.15) with features given by (IV.18) is able to estimate the inverse minimum volume with an expected percentage relative error of 2.2%. In this work, we expand upon the accomplishments of Krefl:2017yox by introducing novel features that describe toric Calabi-Yau 3-folds, augmenting the datasets for toric Calabi-Yau 3-folds, and applying machine learning techniques incorporating regularization. These improvements are designed to address some of the shortcomings of the work in Krefl:2017yox as well as give explicit interpretable formulas for the minimum volume for toric Calabi-Yau 3-folds.
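For concreteness, a minimal sketch of this baseline (our own paraphrase, not the original implementation of Krefl:2017yox ) can be written with scikit-learn, fitting the linear model (IV.15) on the quadratic feature set (IV.18):

```python
# Sketch of the multiple linear regression baseline: fit 1/V_min against the
# feature set (IV.18) built from f1 = I, f2 = E, f3 = V of each toric diagram.
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def fit_baseline(F, y):
    """F: array of shape (N, 3) with columns (I, E, V); y: inverse minimum volumes 1/V_min."""
    X = PolynomialFeatures(degree=2, include_bias=False).fit_transform(F)  # f_a, f_a f_b, f_a^2
    model = LinearRegression().fit(X, y)
    return model.intercept_, model.coef_  # beta_0 and beta_a in (IV.15)
```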

Refer to caption
Figure 3: (a) The toric diagram Δ1\Delta_{1} for the cone over dP1\text{dP}_{1}, and (b) the corresponding 22-enlarged toric diagram Δ2\Delta_{2} with n=2n=2.

New Features. We introduce several new features that describe a toric Calabi-Yau 3-fold and are obtained from the corresponding toric diagram \Delta. We define the n-enlarged toric diagram as,

Δn={nv=(nx,ny)|v=(x,y)Δ},\displaystyle\Delta_{n}=\{nv=(nx,ny)~{}|~{}v=(x,y)\in\Delta\}~{},~{} (IV.21)

where n+n\in\mathbb{Z}^{+} and v=(x,y)2v=(x,y)\in\mathbb{Z}^{2} are the coordinates of the vertices in the original toric diagram Δ\Delta. We note that Δ1=Δ\Delta_{1}=\Delta. These nn-enlarged toric diagrams Δn\Delta_{n} also appeared in Berglund:2021ztg for the study of Hodge numbers of Calabi-Yau manifolds that are constructed as hypersurfaces in toric varieties given by Δ\Delta.

Using the n-enlarged toric diagram \Delta_{n}^{j} for a given toric Calabi-Yau 3-fold \mathcal{X}^{j}, we can now refer to the area of \Delta_{n} as A_{n}, the number of internal lattice points of \Delta_{n} as I_{n}, and the number of boundary lattice points in \Delta_{n} as E_{n}. We further note that the number of vertices V_{n} corresponding to extremal corner points in \Delta_{n} is the same as V in \Delta for all n, i.e. V_{n}=V.

In our work, we use features of a toric Calabi-Yau 3-fold 𝒳j\mathcal{X}^{j} that are composed from members of the following set,

{A,V,E,In}j,\displaystyle\{A,V,E,I_{n}\}^{j}~{},~{} (IV.22)

where n=1,,7n=1,\dots,7. These are defined through the corresponding toric diagram Δj\Delta^{j} and its corresponding nn-enlarged toric diagram Δnj\Delta_{n}^{j}. Through the application of machine learning regularization, our objective is to differentiate between features that contribute to the expression for the minimum volume associated with a toric Calabi-Yau 3-fold and those that do not.
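As an illustration (our own sketch, with the assumption that a toric diagram is specified by its extremal corner vertices listed in cyclic order), the features in (IV.22) can be computed directly from the toric diagram using the shoelace formula for the area, gcd counting for boundary lattice points, and Pick's theorem (IV.20) for the internal points of the n-enlarged diagrams:

```python
# Compute the features {A, V, E, I_n} of a toric diagram from its corner vertices.
from math import gcd

def toric_features(vertices, n_max=7):
    """vertices: extremal corner points of the toric diagram, in cyclic order."""
    def area(verts):  # shoelace formula
        return abs(sum(x1 * y2 - x2 * y1
                       for (x1, y1), (x2, y2) in zip(verts, verts[1:] + verts[:1]))) / 2

    def boundary(verts):  # number of lattice points on the boundary
        return sum(gcd(abs(x2 - x1), abs(y2 - y1))
                   for (x1, y1), (x2, y2) in zip(verts, verts[1:] + verts[:1]))

    features = {"A": area(vertices), "V": len(vertices), "E": boundary(vertices)}
    for n in range(1, n_max + 1):
        scaled = [(n * x, n * y) for (x, y) in vertices]  # n-enlarged toric diagram
        # internal points via Pick's theorem: I_n = A_n - E_n / 2 + 1
        features[f"I_{n}"] = int(area(scaled) - boundary(scaled) / 2 + 1)
    return features

# Example: the toric diagram of the cone over F0
print(toric_features([(1, 0), (0, 1), (-1, 0), (0, -1)]))  # A = 2, V = 4, E = 4, I_1 = 1, ...
```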

Set       Description                                       |S_{m}|
S_{1a}    all polytopes in a 5\times 5 lattice box          15,327
S_{1b}    all polytopes in an r=3.5 circle                  31,324
S_{2a}    selected polytopes in a 30\times 30 lattice box   202,015
S_{2b}    selected polytopes in an r=15 circle              201,895
Table 2: For training the machine learning models, we make use of 4 sets S_{m} of toric diagrams with different sizes |S_{m}|.

New Sets of Toric Calabi-Yau 3-folds. The aim of this work is to make use of machine learning with regularization in order to identify an interpretable formula that accurately estimates the minimum volume of Sasaki-Einstein 5-manifolds corresponding to toric Calabi-Yau 3-folds. The interpretability of the minimum volume formula is achieved by the lowest possible number of features on which the formula depends. In order to train such a regularized machine learning model, we establish four sets S_{m} of toric Calabi-Yau 3-folds \mathcal{X}^{j}, for which the corresponding minimum volumes are known. These sets S_{m} are defined as follows:

  • S_{1a}: This set consists of toric Calabi-Yau 3-folds whose toric diagrams fit into a 5\times 5 lattice box in \mathbb{Z}^{2} as illustrated in Figure 4(a). This set contains a certain degree of redundancy given that convex lattice polygons related by a GL(2,\mathbb{Z}) transformation on their vertices refer to the same toric Calabi-Yau 3-fold. Accordingly, we restrict ourselves to toric diagrams \Delta^{j} that give unique combinations of the form (1/V_{min}^{j},V^{j},E^{j},I^{j}). This results in a dataset of |S_{1a}|=15,327 distinct toric diagrams with unique inverse minimum volumes 1/V_{min}^{j} up to 6 decimal places.

  • S_{1b}: The second set consists of toric Calabi-Yau 3-folds whose toric diagrams fit inside a circle centered at the origin (0,0) on the \mathbb{Z}^{2} lattice with radius r=3.5 as illustrated in Figure 4(b). By imposing the condition that we want GL(2,\mathbb{Z})-distinct toric diagrams \Delta^{j} with unique combinations of the form (1/V_{min}^{j},V^{j},E^{j},I^{j}), we obtain |S_{1b}|=31,324 toric diagrams for this set.

  • S_{2a}: For this set, we randomly choose 300,000 toric diagrams that fit into a 30\times 30 lattice box in \mathbb{Z}^{2}. By imposing the condition that the toric diagrams \Delta^{j} have unique combinations of the form (1/V_{min}^{j},V^{j},E^{j},I^{j}), we obtain |S_{2a}|=202,015 toric diagrams for this set.

  • S_{2b}: For this set, we randomly choose 300,000 toric diagrams that fit into a circle centered at the origin (0,0) on the \mathbb{Z}^{2} lattice with radius r=15. By imposing the condition that the toric diagrams \Delta^{j} have unique combinations of the form (1/V_{min}^{j},V^{j},E^{j},I^{j}), we obtain |S_{2b}|=201,895 toric diagrams for this set.

The distribution of inverse minimum volumes 1/Vmin1/V_{min} for the above sets of toric diagrams is illustrated together with the mean inverse minimum volume y¯=1/Vmin=1|Sm|j=1|Sm|1/Vminj\overline{y}=\langle 1/V_{min}\rangle=\frac{1}{|S_{m}|}\sum_{j=1}^{|S_{m}|}1/V_{min}^{j} in Figure 5. In the following sections, we make use of regularized machine learning in order to identify functions that optimally estimate the inverse minimum volume 1/Vmin1/V_{min} in each of the above datasets.

Refer to caption
Figure 4: (a) Toric diagrams in datasets S1aS_{\text{1a}} and S2aS_{\text{2a}} are constrained by a nx×nyn_{x}\times n_{y} lattice box in 2\mathbb{Z}^{2}, whereas (b) toric diagrams in datasets S1bS_{\text{1b}} and S2bS_{\text{2b}} are constrained by a circle of radius rr with the center at (0,0)2(0,0)\in\mathbb{Z}^{2}.
Refer to caption
Figure 5: The distribution of inverse minimum volumes y=1/V_{min} for the datasets (a) S_{1a}, (b) S_{1b}, (c) S_{2a} and (d) S_{2b}. The mean expected value \overline{y} is indicated by a white line. The histograms for values of y=1/V_{min} are obtained for bin sizes \Delta y with the number of toric diagrams in \text{bin}_{h} given by N(\text{bin}_{h}).

Machine Learning Models and Regularization. In order to obtain a function for the minimum volume of Sasaki-Einstein 5-manifolds corresponding to toric Calabi-Yau 3-folds in terms of features obtained from the corresponding toric diagrams, we make use of the following machine learning models:

  • Polynomial Regression (PR). We make use of polynomial regression montgomery2021introduction , where the relationship between the feature variables x_{a}^{j} and the predicted variable \hat{y}^{j} is given by

    y^j=β0+a=1Nxβaxaj.\displaystyle\hat{y}^{j}=\beta_{0}+\sum_{a=1}^{N_{x}}\beta_{a}x_{a}^{j}~{}.~{} (IV.23)

    Here, \beta_{0} and \beta_{a} are real coefficients, N_{x} is the number of features, and j labels the particular sample in the data set that is used to train this machine learning model. In our case, the data set consists of toric Calabi-Yau 3-folds \mathcal{X}^{j}, where the corresponding inverse minimum volume is given by y^{j}=1/V_{min}^{j}. Here we note that the features x_{a}^{j} are taken from the set \{(f_{u}^{j})^{a}(f_{v}^{j})^{b}~{}|~{}1\leq a+b\leq 2,~{}a,b\in\mathbb{Z}_{\geq 0}\} with f_{u}^{j}\in\{A,V,E,I_{n}\}^{j}, where n=1,\dots,7.

  • Logarithmic Regression (LR). We make use of logarithmic regression montgomery2021introduction in order to help linearize relationships between features xajx_{a}^{j} that are potentially multiplicative in their contribution towards the predicted variable y^j\hat{y}^{j}. To be more precise, we make use of a log\log-log\log model where we log\log-transform both the predicted variable y^j\hat{y}^{j} and the features xajx_{a}^{j}. The predicted variable is then given by,

    log(y^j)=β0+a=1Nxβalog(xaj)\displaystyle\log(\hat{y}^{j})=\beta_{0}+\sum_{a=1}^{N_{x}}\beta_{a}\log(x_{a}^{j}) (IV.24)

    where \beta_{0} and \beta_{a} are real coefficients, and N_{x} is the number of \log-transformed features of the form \log(x_{a}^{j}). The label j corresponds to a particular toric Calabi-Yau 3-fold \mathcal{X}^{j} whose inverse minimum volume is given by y^{j}=1/V_{min}^{j}. Here we note that the \log-transformed features of the form \log(x_{a}^{j}) are taken from the set \{(\log(f_{u}^{j}))^{a}(\log(f_{v}^{j}))^{b}~{}|~{}1\leq a+b\leq 2,~{}a,b\in\mathbb{Z}_{\geq 0}\} with f_{u}^{j}\in\{A,V,E,I_{n}\}^{j}, where n=3,\dots,7. Here, we do not make use of I_{1} and I_{2}.

When we introduce regularization tikhonov1963regularization ; hastie2009elements into polynomial regression and logarithmic regression, we minimize the following loss function between the predicted variable y^j\hat{y}^{j} and the expected variable yy,

=12Nj=1N(yjy^j)2+Δ,\displaystyle\mathcal{L}=\frac{1}{2N}\sum_{j=1}^{N}(y^{j}-\hat{y}^{j})^{2}~{}+\Delta\mathcal{L}~{},~{} (IV.25)

where \Delta\mathcal{L} is the regularization term in the loss function. The loss function in (IV.25) is iteratively minimized during the optimization process, and for all following computations we set the maximum number of iterative steps to N_{max}=10,000. The precise form of the regularization term in the loss function, as well as the different regularization schemes in machine learning, are discussed in the following section.

V Least Absolute Shrinkage and Selection Operator (Lasso) and Regularization

The Least Absolute Shrinkage and Selection Operator (Lasso) tibshirani1996regression is a machine learning regularization technique primarily employed to prevent overfitting in supervised machine learning. However, it can also be utilized for feature selection. In our work, the overarching goal in employing Lasso is to introduce a machine learning model capable of delivering optimal predictions for the minimum volume for toric Calabi-Yau 3-folds while using the fewest features from the training dataset. For problems such as the one considered in this work, it is essential to be able to obtain formulas with a small number of parameters. As a result, using Lasso is particularly suited for discovering new mathematical formulas such as the one aimed for in this work for the minimum volume for toric Calabi-Yau 3-folds.

In the following section, we give a brief overview of several regularization schemes including Lasso in the context of supervised machine learning for the minimum volume formula for toric Calabi-Yau 3-folds.

Regularization. Regularization in machine learning is a technique usually used to avoid overfitting the dataset during model training. This is done by adding a penalty term in the loss function. The introduction of the added regularization term Δ\Delta\mathcal{L}, resulting in an updated loss function of the form,

+Δ,\displaystyle\mathcal{L}+\Delta\mathcal{L}~{},~{} (V.26)

serves the purpose of constraining the possible parameter values within the supervised machine learning model. In the case of multiple linear regression, as first introduced in Krefl:2017yox and reviewed in section §IV, these parameters would be the real coefficients \beta_{0} and \beta_{a} in the candidate linear function in (IV.15) for the expected minimum volume given by \hat{y}^{j}=1/\hat{V}^{j}_{min}. By restricting the values of these parameters, regularization effectively makes it harder for the supervised machine learning model to give a candidate function for the minimum volume V_{min} with many terms. This prevents the machine learning model from overfitting the dataset of minimized volumes for toric Calabi-Yau 3-folds.

Let us review the following three regularization schemes:

  • L1 Regularization (Lasso). This regularization scheme also known as Least Absolute Shrinkage and Selection Operator (Lasso) tibshirani1996regression adds the following linear regularization term to the loss function of the regression model,

    ΔL1=αa=1Nx|βa|,\displaystyle\Delta\mathcal{L}_{\text{L1}}=\alpha\sum_{a=1}^{N_{x}}|\beta_{a}|~{},~{} (V.27)

    where βa\beta_{a} are the real parameters of the regression model. α\alpha is a real regularization parameter. Increasing the value of α\alpha has the effect of increasing the strength of the L1 regularization.

  • L2 Regularization (Ridge). Another regularization scheme is known as Ridge regularization or L2 regularization hoerl1970ridge . It adds the following quadratic regularization term to the loss function of the regression model,

    ΔL2=αa=1Nxβa2,\displaystyle\Delta\mathcal{L}_{\text{L2}}=\alpha\sum_{a=1}^{N_{x}}\beta_{a}^{2}~{},~{} (V.28)

    where βa\beta_{a} are the real parameters of the regression model and α\alpha is again the real regularization parameter.

  • Elastic Net (L1 and L2). Elastic Net zou2005regularization is a combination of L1 (Lasso) and L2 (Ridge) regularization and adds the following regularization terms to the loss function,

    ΔL1,L2=α1a=1Nx|βa|+α2a=1Nxβa2,\displaystyle\Delta\mathcal{L}_{\text{L1,L2}}=\alpha_{1}\sum_{a=1}^{N_{x}}|\beta_{a}|+\alpha_{2}\sum_{a=1}^{N_{x}}\beta_{a}^{2}~{},~{} (V.29)

    where α1\alpha_{1} and α2\alpha_{2} are relative real regularization parameters that regulate the proportion of L1 regularization and L2 regularization in this regularization scheme.

Amongst these regularization schemes in supervised machine learning, we are going to mainly focus on Lasso and L1 regularization for the remainder of this work. While all three regularization schemes share the common goal of constraining the range of values for the model parameters βa\beta_{a}, it is noteworthy that only Lasso possesses the unique property of inducing sparsity among the model parameters, resulting in the complete elimination of certain parameters during the training process.
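As a brief usage note (an assumption about tooling on our side, not a statement about the implementation used in this work), the three schemes above map onto standard scikit-learn estimators, keeping in mind that the normalization conventions of the penalty prefactors differ slightly from (V.27)-(V.29):

```python
# The three regularization schemes as scikit-learn estimators.
from sklearn.linear_model import Lasso, Ridge, ElasticNet

lasso = Lasso(alpha=0.1)                     # L1 penalty:  alpha * sum_a |beta_a|
ridge = Ridge(alpha=0.1)                     # L2 penalty:  alpha * sum_a beta_a^2
enet  = ElasticNet(alpha=0.1, l1_ratio=0.5)  # mixture of L1 and L2 penalties
```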

Refer to caption
Figure 6: Parametric plots for \beta_{1} and \beta_{2} for a 2-parameter model hastie2009elements . (a) In L1 regularization (Lasso), the minimum of the regularized loss function \min(\mathcal{L}+\Delta\mathcal{L}) is more likely to be located where one of the parameters vanishes, in comparison to the case (b) of L2 regularization (Ridge), where the minimum of the regularized loss function \min(\mathcal{L}+\Delta\mathcal{L}) is more likely located at non-zero values of both parameters. This illustrates that L1 regularization is better suited for eliminating parameters under optimization.
Refer to caption
Figure 7: The L1 (Lasso) regularization parameter α\alpha for polynomial regression on dataset S1aS_{\text{1a}} (15,327 toric diagrams in 5×55\times 5 lattice box) against (a) the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), (b) the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)}, and (c) the corresponding R2(α)R^{2}(\alpha)-score. The optimal regularization parameter α\alpha^{*} was found in the range α=104,,101\alpha=10^{-4},\dots,10^{1} by taking steps of Δα1.12202\Delta\alpha\simeq 1.12202. We also have the L1 (Lasso) regularization parameter α\alpha for polynomial regression on dataset S2aS_{\text{2a}} (202,015 random toric diagrams in 30×3030\times 30 lattice box) against (c) the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), (d) the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)}, and (e) the corresponding R2(α)R^{2}(\alpha)-score. The optimal regularization parameter α\alpha^{*} was found in the range α=104,,103\alpha=10^{-4},\dots,10^{3} by taking steps of Δα1.17490\Delta\alpha\simeq 1.17490.
Refer to caption
Figure 8: The L1 (Lasso) regularization parameter α\alpha for logarithmic regression on dataset S1aS_{\text{1a}} (15,327 toric diagrams in 5×55\times 5 lattice box) against (a) the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), (b) the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)}, and (c) the corresponding R2(α)R^{2}(\alpha)-score. The optimal regularization parameter α\alpha^{*} was found in the range α=104,,101\alpha=10^{-4},\dots,10^{1} by taking steps of Δα1.12202\Delta\alpha\simeq 1.12202. We also have the L1 (Lasso) regularization parameter α\alpha for logarithmic regression on dataset S2aS_{\text{2a}} (202,015 random toric diagrams in 30×3030\times 30 lattice box) against (c) the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), (d) the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)}, and (e) the corresponding R2(α)R^{2}(\alpha)-score. The optimal regularization parameter α\alpha^{*} was found in the range α=104,,103\alpha=10^{-4},\dots,10^{3} by taking steps of Δα1.17490\Delta\alpha\simeq 1.17490.

There are several arguments why Lasso enables the complete elimination of some of the model parameters and the corresponding features in the candidate function for the minimum volume VminV_{min} for toric Calabi-Yau 3-folds. In order to illustrate this, let us consider the case with Nx=2N_{x}=2 features x1jx_{1}^{j} and x2jx_{2}^{j}, for which the L1 and L2 regularization terms take respectively the following form,

ΔL1=α(|β1|+|β2|),ΔL2=α(β12+β22).\displaystyle\Delta\mathcal{L}_{\text{L1}}=\alpha(|\beta_{1}|+|\beta_{2}|)~{},~{}\Delta\mathcal{L}_{\text{L2}}=\alpha(\beta_{1}^{2}+\beta_{2}^{2})~{}.~{} (V.30)

If we assume that under optimization, the regularization terms reach a value ΔL1=ϵ\Delta\mathcal{L}_{\text{L1}}=\epsilon and ΔL2=ϵ\Delta\mathcal{L}_{\text{L2}}=\epsilon for α>0\alpha>0 and ϵ\epsilon\in\mathbb{R}, we can draw the parametric plots for the two regularization terms as shown in Figure 6 hastie2009elements . We can see from the plots in Figure 6 that for L1 regularization, the minimum of the total loss function is more likely achieved when one of the two parameters β1\beta_{1} or β2\beta_{2} approaches 0. This is in part due to the absolute values taken for the parameters in the linear L1 regularization term.

As a result, Lasso regularization is particularly suited for feature selection and parameter elimination in regression models. In our work, we employ L1 regularization to derive a formula for the minimum volume VminV_{min} of Sasaki-Einstein 5-manifolds corresponding to toric Calabi-Yau 3-folds that is interpretable, presentable and reusable.

VI Candidates for Minimum Volume Functions

In this work, our aim is to apply Lasso regularization in order to identify explicit formulas for the minimum volume for toric Calabi-Yau 3-folds. By doing so, our aim is to maximize the accuracy of the formulas that we find while minimizing the number of parameters the formulas depend on, making them interpretable and readily presentable.

data set   y=1/V_{min}                                                    \alpha^{*}   N_{\beta_{a}(\alpha^{*})}   R^{2}(\alpha^{*})
S_{1a}     \hat{y}_{1a}^{PR}=1.28837A-0.71753V+0.07208I_{2}+5.18969       0.03548      3                           0.98354
S_{1b}     \hat{y}_{1b}^{PR}=1.36089A-0.61041V+0.15561I+5.31028           0.01995      3                           0.98697
S_{2a}     \hat{y}_{2a}^{PR}=1.61574A-19.35740V+0.06419I+101.58972        0.97724      3                           0.98743
S_{2b}     \hat{y}_{2b}^{PR}=1.61494A-19.42096V+0.06494I+101.84952        0.97724      3                           0.98740
Table 3: Optimal candidate formulas for the minimum volume for toric Calabi-Yau 3-folds given by y=1/V_{min} and obtained under L1 (Lasso) regularized polynomial regression (PR) on datasets S_{1a}, S_{1b}, S_{2a} and S_{2b}. For each optimal candidate formula, we give the optimal regularization parameter \alpha^{*} that maximizes the corresponding R^{2}-score and minimizes the number of non-zero coefficients N_{\beta_{a}} in the formula.
data set   y=1/V_{min}                                                                                                        \alpha^{*}   N_{\beta_{a}(\alpha^{*})}   R^{2}(\alpha^{*})
S_{1a}     \hat{y}_{1a}^{LR}=1.97348A^{0.77011}V^{-0.21355}I_{3}^{0.08796}I_{4}^{0.02722}I_{5}^{0.00202}e^{0.00923(\log{I_{3}})^{2}}   0.00045      6                           0.98932
S_{1b}     \hat{y}_{1b}^{LR}=1.75668A^{0.74154}V^{-0.182009}E^{0.00050}I_{3}^{0.16451}I_{4}^{0.00679}e^{0.00447(\log{I_{3}})^{2}}      0.00032      6                           0.98992
S_{2a}     \hat{y}_{2a}^{LR}=2.50772A^{0.95411}V^{-0.21992}I_{3}^{0.02867}                                                             0.00112      3                           0.99281
S_{2b}     \hat{y}_{2b}^{LR}=2.51288A^{0.95322}V^{-0.21970}I_{3}^{0.02898}                                                             0.00112      3                           0.99297
Table 4: Optimal candidate formulas for the minimum volume for toric Calabi-Yau 3-folds given by y=1/V_{min} and obtained under L1 (Lasso) regularized logarithmic regression (LR) on datasets S_{1a}, S_{1b}, S_{2a} and S_{2b}. For each optimal candidate formula, we give the optimal regularization parameter \alpha^{*} that maximizes the corresponding R^{2}-score and minimizes the number of non-zero coefficients N_{\beta_{a}} in the formula.

Parameter Sparsity vs Accuracy. As in all regression problems, we introduce a measure of how well the model fits the observed data, the R^{2}-score montgomery2021introduction ; hastie2009elements , given by,

R2=1SresStot,\displaystyle R^{2}=1-\frac{S_{res}}{S_{tot}}~{},~{} (VI.31)

where the residual sum of squares SresS_{res} is given by,

Sres=j=1N(yjy^j)2\displaystyle S_{res}=\sum_{j=1}^{N}(y^{j}-\hat{y}^{j})^{2} (VI.32)

and the total sum of squares StotS_{tot} is given by,

Stot=j=1N(yjy¯)2.\displaystyle S_{tot}=\sum_{j=1}^{N}(y^{j}-\overline{y})^{2}~{}.~{} (VI.33)

Here, y^j\hat{y}^{j} denotes the predicted value for the minimum volume VminjV_{min}^{j} given by yj=1/Vminjy^{j}=1/V_{min}^{j}, whereas y¯\overline{y} denotes the mean of the expected values yjy^{j}.
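For reference, the R^{2}-score in (VI.31)-(VI.33) amounts to the following short computation (equivalent to sklearn.metrics.r2_score):

```python
# R^2-score of (VI.31): 1 - S_res / S_tot for predictions y_hat against targets y.
import numpy as np

def r_squared(y, y_hat):
    S_res = np.sum((y - y_hat) ** 2)        # residual sum of squares (VI.32)
    S_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares (VI.33)
    return 1.0 - S_res / S_tot
```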

We recall that the optimization problem for the L1-regularized regression model is to minimize the loss function +ΔL1\mathcal{L}+\Delta\mathcal{L}_{\text{L1}} with the L1 regularization term. As we discussed in the sections above, this optimization problem focuses on minimizing the mean squared error with a penalty for non-zero coefficients βa(α)\beta_{a}(\alpha), which depends on the regularization parameter α\alpha.

Here, we note that there is an additional optimization problem regarding the maximization of the R2R^{2}-score in (VI.31) and the minimization of the number Nβa(α)N_{\beta_{a}(\alpha)} of non-zero coefficients βa(α)\beta_{a}(\alpha). We can formulate this additional optimization problem as follows,

maxα{R2(α)λNβa(α)Nx},\displaystyle\max_{\alpha}\left\{R^{2}(\alpha)-\lambda\frac{N_{\beta_{a}(\alpha)}}{N_{x}}\right\}~{},~{} (VI.34)

where 0<Nβa(α)Nx0<N_{\beta_{a}(\alpha)}\leq N_{x}, and the values of the coefficients βa(α)\beta_{a}(\alpha) and the R2(α)R^{2}(\alpha)-score all depend on the regularization parameter α\alpha. λ\lambda is a positive hyperparameter that regulates how much we value sparsity of feature coefficients βa(α)\beta_{a}(\alpha) over the accuracy of the estimate given by R2(α)R^{2}(\alpha).

Candidate Formulas. The candidate formulas for the minimum volume for toric Calabi-Yau 3-folds are identified by an optimal regularization parameter α\alpha^{*} that maximizes the R2R^{2}-score of the candidate formula and minimizes the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)} corresponding to features in the chosen regression model. In order to identify the optimal regularization parameter α\alpha^{*} for the optimization problem in (VI.34), we search for α\alpha^{*} in a given fixed range for α\alpha as specified in Figure 7 and Figure 8. We do the search for the optimal regularization parameter α\alpha^{*} for all four datasets in Table 2 for both L1-regularized polynomial regression and L1-regularized logarithmic regression as discussed in sections §IV and §V. The chosen L1-regularized regression models are trained for a particular value of the regularization parameter α\alpha under a fixed randomly chosen 80% training and 20% testing data split, where the corresponding R2R^{2}-score depending on α\alpha is obtained from the testing data.
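A minimal sketch of this search (our own paraphrase of the procedure described above, not the authors' code) sweeps the regularization parameter \alpha over a log-spaced grid, trains an L1-regularized model on standardized features under an 80/20 split, and scores each \alpha according to (VI.34); the sparsity weight lam below is a hyperparameter of our sketch:

```python
# Search for the optimal regularization parameter alpha* as in (VI.34):
# maximize R^2(alpha) - lam * N_beta(alpha) / N_x over a grid of alpha values.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

def search_alpha(X, y, alphas=np.logspace(-4, 1, 100), lam=0.05):
    """X: feature matrix (e.g. degree-2 monomials in A, V, E, I_n); y: 1/V_min."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    best = None
    for alpha in alphas:
        model = make_pipeline(StandardScaler(), Lasso(alpha=alpha, max_iter=10_000))
        model.fit(X_tr, y_tr)
        n_nonzero = np.count_nonzero(model.named_steps["lasso"].coef_)  # N_beta(alpha)
        score = model.score(X_te, y_te) - lam * n_nonzero / X.shape[1]  # objective (VI.34)
        if best is None or score > best[0]:
            best = (score, alpha, n_nonzero, model)
    return best  # (objective value, alpha*, number of surviving coefficients, fitted model)
```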

Refer to caption
Figure 9: The L1-regularized logarithmic regression models trained on datasets S_{1a} and S_{1b} perform better on toric diagrams with larger areas A (see selection in (e)-(h)) than on toric diagrams with smaller areas A (see selection in (a)-(d)). The performance is measured by the relative percentage error \epsilon(1/\hat{y}) of the predicted minimum volume given by 1/\hat{y}. The R^{2}-scores for the L1-regularized logarithmic regression models trained on datasets S_{1a} and S_{1b} are R^{2}(y_{1a}^{LR})=0.98932 and R^{2}(y_{1b}^{LR})=0.98992, respectively.

Figure 7 shows respectively for datasets S1aS_{\text{1a}} and S2aS_{\text{2a}} plots for the L1 regularization parameter α\alpha for polynomial regression against standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), against the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)}, and against the R2R^{2}-score. Here, the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha) are obtained when the training is conducted over normalized features x¯a\overline{x}_{a}. When the training is completed for a specific value of α\alpha, the candidate formula for the minimum volume given by y=1/Vminy=1/V_{min} is obtained by reversing the normalization on the features, giving us the coefficients βa(α)\beta_{a}(\alpha) of the candidate formula. We also have Figure 8 which shows respectively for datasets S1aS_{\text{1a}} and S2aS_{\text{2a}} plots for the L1 regularization parameter α\alpha for logarithmic regression against the standardized coefficients β¯a(α)\overline{\beta}_{a}(\alpha), the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)} and the R2R^{2}-score. Similar plots can also be obtained for datasets S1bS_{\text{1b}} and S2bS_{\text{2b}} for both L1-regularized polynomial regression and L1-regularized logarithmic regression.

Overall, the plots illustrate that the identified optimal regularization parameters α\alpha^{*} minimize the number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)} in the formula estimating the minimum volume given by y=1/Vminy=1/V_{min}, as well as maximize the accuracy of the formulas measured by the R2R^{2}-score. Table 3 and Table 4 summarize respectively the most optimal candidate formulas for the minimum volume given by y=1/Vminy=1/V_{min} under L1-regularized polynomial regression and L1-regularized logarithmic regression for the four datasets in Table 2, with the corresponding optimal regularization parameters α\alpha^{*}, the corresponding number of non-zero coefficients Nβa(α)N_{\beta_{a}(\alpha)} and the R2R^{2}-score.

A closer look reveals that for all models, the identified optimal regularization parameters \alpha^{*} result in formulas that approximate the minimum volume given by y=1/V_{min} extremely well for all the datasets S_{1a}, S_{1b}, S_{2a} and S_{2b}. Overall, the L1-regularized logarithmic regression models seem to give more accurate results than the L1-regularized polynomial regression models, with N_{\beta_{a}(\alpha)}\leq 6 over all datasets. In particular, the L1-regularized logarithmic regression models trained on datasets S_{2a} and S_{2b} have R^{2}-scores above 0.99, which is exceptionally high.

Having a closer look at explicit examples of toric Calabi-Yau 3-folds in the datasets reveals however that the performances of the regularized regression models can vary between different toric Calabi-Yau 3-folds. For example, focusing on the L1-regularized logarithmic regression models trained on S1aS_{\text{1a}} and S1bS_{\text{1b}}, we observe that the minimum volumes given by 1/y^1aLR1/\hat{y}_{\text{1a}}^{\text{LR}} and 1/y^1bLR1/\hat{y}_{\text{1b}}^{\text{LR}} in Table 4 perform differently for toric diagrams with smaller areas AA compared to toric diagrams with larger areas AA as illustrated in Figure 9. Similar observations can be made for the L1-regularized logarithmic regression models trained on S2aS_{\text{2a}} and S2bS_{\text{2b}} as well as the L1-regularized polynomial regression models.

In summary, we can calculate the expected relative percentage errors E[ϵ]E[\epsilon] of the predicted minimum volumes given by 1/y^1/\hat{y} and the corresponding standard deviations σ[ϵ]\sigma[\epsilon] for the L1-regularized logarithmic regression models as follows,

E[ϵ1aLR]=2.158%,σ[ϵ1aLR]=1.696%,\displaystyle E\left[\epsilon_{\text{1a}}^{\text{LR}}\right]=2.158\%~{},~{}\sigma\left[\epsilon_{\text{1a}}^{\text{LR}}\right]=1.696\%~{},~{}
E[ϵ1bLR]=1.884%,σ[ϵ1bLR]=1.545%,\displaystyle E\left[\epsilon_{\text{1b}}^{\text{LR}}\right]=1.884\%~{},~{}\sigma\left[\epsilon_{\text{1b}}^{\text{LR}}\right]=1.545\%~{},~{}
E[ϵ2aLR]=3.577%,σ[ϵ2aLR]=2.396%,\displaystyle E\left[\epsilon_{\text{2a}}^{\text{LR}}\right]=3.577\%~{},~{}\sigma\left[\epsilon_{\text{2a}}^{\text{LR}}\right]=2.396\%~{},~{}
E[ϵ2bLR]=3.579%,σ[ϵ2bLR]=2.399%.\displaystyle E\left[\epsilon_{\text{2b}}^{\text{LR}}\right]=3.579\%~{},~{}\sigma\left[\epsilon_{\text{2b}}^{\text{LR}}\right]=2.399\%~{}.~{} (VI.35)

We note that the models trained on S2aS_{\text{2a}} and S2bS_{\text{2b}} have a larger expected relative percentage error than the ones trained on S1aS_{\text{1a}} and S1bS_{\text{1b}}. This is partly due to the fact that S2aS_{\text{2a}} and S2bS_{\text{2b}} contain randomly selected toric diagrams in a 30×3030\times 30 lattice box in 2\mathbb{Z}^{2} and r=15r=15 circle, respectively, whereas S1aS_{\text{1a}} and S1bS_{\text{1b}} contain the full set of toric diagrams in a 5×55\times 5 lattice box in 2\mathbb{Z}^{2} and r=3.5r=3.5 circle, respectively, as defined in Table 2.

We also note that the R2R^{2}-scores of the L1-regularized logarithmic regression models in Table 4,

R2(y1aLR)=0.98932,R2(y1bLR)=0.98992,\displaystyle R^{2}\left(y_{\text{1a}}^{\text{LR}}\right)=0.98932~{},~{}R^{2}\left(y_{\text{1b}}^{\text{LR}}\right)=0.98992~{},~{}
R2(y2aLR)=0.99281,R2(y2bLR)=0.99297,\displaystyle R^{2}\left(y_{\text{2a}}^{\text{LR}}\right)=0.99281~{},~{}R^{2}\left(y_{\text{2b}}^{\text{LR}}\right)=0.99297~{},~{} (VI.36)

are overall very high and close to 1. Compared to the expected relative percentage errors in (VI.35), which measure how far off the predictions of the minimum volume given by 1/\hat{y} are, the R^{2}-score is a measure of the accuracy of the trained regression model. It quantifies the proportion of the variation in y=1/V_{min} that can be predicted using the features selected from the corresponding toric diagrams of the toric Calabi-Yau 3-folds.

VII Discussions and Conclusions

With this work, we demonstrated that employing regularization in machine learning models can effectively address the limitations of supervised machine learning techniques applied to problems arising in string theory. In particular, we have shown that the minimum volume $V_{min}$ of Sasaki-Einstein 5-manifolds corresponding to toric Calabi-Yau 3-folds can be expressed in terms of just 3 features of the associated toric diagrams $\Delta$ with an $R^{2}$-score of at least $0.98$. These 3 features are the area $A$ of $\Delta$, the number of vertices $V$ of $\Delta$, and the number of internal points of the toric diagram $\Delta_{3}$ obtained by enlarging $\Delta$ by a factor of $n=3$.
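As an illustration of how these 3 features can be extracted from the lattice vertices of a toric diagram, the following sketch computes the area via the shoelace formula, counts the vertices, and counts the internal lattice points of the diagram enlarged by a factor of $n=3$ by brute force. Whether the area is the Euclidean or the lattice-normalized one, and the precise normalization of the enlarged diagram, follow the conventions of this work; the choices below are assumptions.

```python
import numpy as np

def toric_features(vertices, n=3):
    """Return (A, V, internal points of the n-times enlarged diagram) for a convex toric diagram."""
    P = np.asarray(vertices, dtype=int)        # vertices in counter-clockwise order
    V = len(P)
    x, y = P[:, 0], P[:, 1]
    # Shoelace formula for the (Euclidean) area of the polygon.
    A = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    # Brute-force count of lattice points strictly inside the enlarged polygon n * Delta.
    Q = n * P
    xs, ys = Q[:, 0], Q[:, 1]
    interior = 0
    for px in range(xs.min() + 1, xs.max()):
        for py in range(ys.min() + 1, ys.max()):
            if _strictly_inside(Q, (px, py)):
                interior += 1
    return A, V, interior

def _strictly_inside(polygon, point):
    """Check whether a point lies strictly inside a convex polygon with ccw vertices."""
    px, py = point
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) <= 0:
            return False
    return True
```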

By simultaneously maximizing the $R^{2}$-score and minimizing the number of surviving parameters in the candidate function for $y=1/V_{min}$ through variation of the regularization strength set by the regularization parameter $\alpha$, the regularized regression models proposed in this work give far more presentable, interpretable and explainable results than our previous work in Krefl:2017yox. Above all, as suggested in Figure 9, the candidate formulas for the minimum volumes of toric Calabi-Yau 3-folds obtained in this study are concise enough to facilitate the examination of why some toric Calabi-Yau 3-folds have minimum volumes that are more challenging to predict than those of other toric Calabi-Yau 3-folds. We plan to report on these investigations in the near future. We foresee that the application of regularization schemes to other supervised machine learning applications in string theory will open up equally promising research opportunities.
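A minimal sketch of such an $\alpha$-scan is given below: for each value of the regularization strength, an L1-regularized model is fitted, its $R^{2}$-score and number of surviving coefficients are recorded, and a value $\alpha^{*}$ is selected that keeps the fit accurate with as few surviving parameters as possible. The grid of $\alpha$ values and the $R^{2}$ threshold are illustrative assumptions, not necessarily the selection criterion used in this work.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

def scan_alpha(logX, y, alphas=np.logspace(-4, 0, 50), r2_threshold=0.98):
    """Return (alpha*, R^2, number of surviving parameters) balancing accuracy and sparsity."""
    best = None
    for alpha in alphas:
        model = Lasso(alpha=alpha, max_iter=50_000).fit(logX, y)
        r2 = r2_score(y, model.predict(logX))
        n_surviving = int(np.sum(model.coef_ != 0))
        # Prefer fewer surviving parameters as long as the fit stays sufficiently accurate.
        if r2 >= r2_threshold and (best is None or n_surviving < best[2]):
            best = (alpha, r2, n_surviving)
    return best
```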

Acknowledgements.
R.K.-S. would like to thank the Simons Center for Geometry and Physics at Stony Brook University, the City University of New York Graduate Center, the Institute for Basic Science Center for Geometry and Physics, as well as the Kavli Institute for the Physics and Mathematics of the Universe for hospitality during various stages of this work. He is supported by a Basic Research Grant of the National Research Foundation of Korea (NRF-2022R1F1A1073128). He is also supported by a Start-up Research Grant for new faculty at UNIST (1.210139.01), a UNIST AI Incubator Grant (1.230038.01) and UNIST UBSI Grants (1.230168.01, 1.230078.01), as well as an Industry Research Project (2.220916.01) funded by Samsung SDS in Korea. He is also partly supported by the BK21 Program (“Next Generation Education Program for Mathematical Sciences”, 4299990414089) funded by the Ministry of Education in Korea and the National Research Foundation of Korea (NRF).

References

  • (1) Y.-H. He, Deep-Learning the Landscape, 1706.02714.
  • (2) D. Krefl and R.-K. Seong, Machine Learning of Calabi-Yau Volumes, Phys. Rev. D 96 (2017) 066014, [1706.03346].
  • (3) F. Ruehle, Evolving neural networks with genetic algorithms to study the String Landscape, JHEP 08 (2017) 038, [1706.07024].
  • (4) J. Carifio, J. Halverson, D. Krioukov and B. D. Nelson, Machine Learning in the String Landscape, JHEP 09 (2017) 157, [1707.00655].
  • (5) A. Cole, A. Schachner and G. Shiu, Searching the Landscape of Flux Vacua with Genetic Algorithms, JHEP 11 (2019) 045, [1907.10072].
  • (6) A. Cole, G. J. Loges and G. Shiu, Interpretable Phase Detection and Classification with Persistent Homology, in 34th Conference on Neural Information Processing Systems, 12, 2020. 2012.00783.
  • (7) J. Halverson, A. Maiti and K. Stoner, Neural Networks and Quantum Field Theory, Mach. Learn. Sci. Tech. 2 (2021) 035002, [2008.08601].
  • (8) S. Gukov, J. Halverson, F. Ruehle and P. Sułkowski, Learning to Unknot, Mach. Learn. Sci. Tech. 2 (2021) 025035, [2010.16263].
  • (9) S. Abel, A. Constantin, T. R. Harvey and A. Lukas, Evolving Heterotic Gauge Backgrounds: Genetic Algorithms versus Reinforcement Learning, Fortsch. Phys. 70 (2022) 2200034, [2110.14029].
  • (10) S. Krippendorf, R. Kroepsch and M. Syvaeri, Revealing systematics in phenomenologically viable flux vacua with reinforcement learning, 2107.04039.
  • (11) A. Cole, S. Krippendorf, A. Schachner and G. Shiu, Probing the Structure of String Theory Vacua with Genetic Algorithms and Reinforcement Learning, in 35th Conference on Neural Information Processing Systems, 11, 2021. 2111.11466.
  • (12) P. Berglund, Y.-H. He, E. Heyes, E. Hirst, V. Jejjala and A. Lukas, New Calabi-Yau Manifolds from Genetic Algorithms, 2306.06159.
  • (13) M. Demirtas, J. Halverson, A. Maiti, M. D. Schwartz and K. Stoner, Neural Network Field Theories: Non-Gaussianity, Actions, and Locality, 2307.03223.
  • (14) K. Bull, Y.-H. He, V. Jejjala and C. Mishra, Machine Learning CICY Threefolds, Phys. Lett. B 785 (2018) 65–72, [1806.03121].
  • (15) V. Jejjala, A. Kar and O. Parrikar, Deep Learning the Hyperbolic Volume of a Knot, Phys. Lett. B 799 (2019) 135033, [1902.05547].
  • (16) C. R. Brodie, A. Constantin, R. Deen and A. Lukas, Machine Learning Line Bundle Cohomology, Fortsch. Phys. 68 (2020) 1900087, [1906.08730].
  • (17) Y.-H. He and A. Lukas, Machine Learning Calabi-Yau Four-folds, Phys. Lett. B 815 (2021) 136139, [2009.02544].
  • (18) H. Erbin and R. Finotello, Machine learning for complete intersection Calabi-Yau manifolds: a methodological study, Phys. Rev. D 103 (2021) 126014, [2007.15706].
  • (19) V. Anagiannis and M. C. N. Cheng, Entangled q-convolutional neural nets, Mach. Learn. Sci. Tech. 2 (2021) 045026, [2103.11785].
  • (20) M. Larfors, A. Lukas, F. Ruehle and R. Schneider, Numerical metrics for complete intersection and Kreuzer–Skarke Calabi–Yau manifolds, Mach. Learn. Sci. Tech. 3 (2022) 035014, [2205.13408].
  • (21) S. Krippendorf and M. Syvaeri, Detecting Symmetries with Neural Networks, 2003.13679.
  • (22) D. S. Berman, Y.-H. He and E. Hirst, Machine learning Calabi-Yau hypersurfaces, Phys. Rev. D 105 (2022) 066002, [2112.06350].
  • (23) J. Bao, Y.-H. He and E. Hirst, Neurons on Amoebae, J. Symb. Comput. 116 (2022) 1–38, [2106.03695].
  • (24) R.-K. Seong, Unsupervised Machine Learning Techniques for Exploring Tropical Coamoeba, Brane Tilings and Seiberg Duality, 2309.05702.
  • (25) D. Martelli, J. Sparks and S.-T. Yau, Sasaki-Einstein manifolds and volume minimisation, Commun. Math. Phys. 280 (2008) 611–673, [hep-th/0603021].
  • (26) D. Martelli, J. Sparks and S.-T. Yau, The geometric dual of a-maximisation for toric Sasaki- Einstein manifolds, Commun. Math. Phys. 268 (2006) 39–65, [hep-th/0503183].
  • (27) W. Fulton, Introduction to toric varieties. Annals of mathematics studies. Princeton Univ. Press, Princeton, NJ, 1993.
  • (28) N. C. Leung and C. Vafa, Branes and Toric Geometry, ArXiv High Energy Physics - Theory e-prints (Nov., 1997) , [hep-th/9711013].
  • (29) B. R. Greene, String theory on Calabi-Yau manifolds, in Theoretical Advanced Study Institute in Elementary Particle Physics (TASI 96): Fields, Strings, and Duality, pp. 543–726, 6, 1996. hep-th/9702155.
  • (30) M. R. Douglas, B. R. Greene and D. R. Morrison, Orbifold resolution by D-branes, Nucl.Phys. B506 (1997) 84–106, [hep-th/9704151].
  • (31) E. Witten, Anti-de Sitter space and holography, Adv. Theor. Math. Phys. 2 (1998) 253–291, [hep-th/9802150].
  • (32) I. R. Klebanov and E. Witten, Superconformal field theory on three-branes at a Calabi-Yau singularity, Nucl.Phys. B536 (1998) 199–218, [hep-th/9807080].
  • (33) M. R. Douglas and G. W. Moore, D-branes, Quivers, and ALE Instantons, hep-th/9603167.
  • (34) A. E. Lawrence, N. Nekrasov and C. Vafa, On conformal field theories in four-dimensions, Nucl.Phys. B533 (1998) 199–209, [hep-th/9803015].
  • (35) B. Feng, A. Hanany and Y.-H. He, D-brane gauge theories from toric singularities and toric duality, Nucl. Phys. B595 (2001) 165–200, [hep-th/0003085].
  • (36) B. Feng, A. Hanany and Y.-H. He, Phase structure of D-brane gauge theories and toric duality, JHEP 08 (2001) 040, [hep-th/0104259].
  • (37) J. M. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2 (1998) 231–252, [hep-th/9711200].
  • (38) D. R. Morrison and M. R. Plesser, Nonspherical horizons. 1., Adv.Theor.Math.Phys. 3 (1999) 1–81, [hep-th/9810201].
  • (39) B. S. Acharya, J. M. Figueroa-O’Farrill, C. M. Hull and B. J. Spence, Branes at conical singularities and holography, Adv. Theor. Math. Phys. 2 (1999) 1249–1286, [hep-th/9808014].
  • (40) K. A. Intriligator and B. Wecht, The Exact superconformal R symmetry maximizes a, Nucl. Phys. B 667 (2003) 183–200, [hep-th/0304128].
  • (41) A. Butti and A. Zaffaroni, R-charges from toric diagrams and the equivalence of a- maximization and Z-minimization, JHEP 11 (2005) 019, [hep-th/0506232].
  • (42) A. Butti and A. Zaffaroni, From toric geometry to quiver gauge theory: The Equivalence of a-maximization and Z-minimization, Fortsch.Phys. 54 (2006) 309–316, [hep-th/0512240].
  • (43) S. S. Gubser, Einstein manifolds and conformal field theories, Phys. Rev. D 59 (1999) 025006, [hep-th/9807164].
  • (44) M. Henningson and K. Skenderis, The Holographic Weyl anomaly, JHEP 07 (1998) 023, [hep-th/9806087].
  • (45) S. Benvenuti, B. Feng, A. Hanany and Y.-H. He, Counting BPS operators in gauge theories: Quivers, syzygies and plethystics, JHEP 11 (2007) 050, [hep-th/0608050].
  • (46) B. Feng, A. Hanany and Y.-H. He, Counting Gauge Invariants: the Plethystic Program, JHEP 03 (2007) 090, [hep-th/0701063].
  • (47) C.-F. Gauss, Theoria combinationis observationum erroribus minimis obnoxiae. Henricus Dieterich, 1823.
  • (48) R. A. Fisher, On the mathematical foundations of theoretical statistics, Philosophical transactions of the Royal Society of London. Series A, containing papers of a mathematical or physical character 222 (1922) 309–368.
  • (49) W. Mendenhall, T. Sincich and N. S. Boudreau, A second course in statistics: regression analysis, vol. 6. Prentice Hall Upper Saddle River, NJ, 2003.
  • (50) D. A. Freedman, Statistical models: theory and practice. cambridge university press, 2009.
  • (51) J. D. Jobson, Applied multivariate data analysis: regression and experimental design. Springer Science & Business Media, 2012.
  • (52) Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (1998) 2278–2324.
  • (53) A. Krizhevsky, I. Sutskever and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems 25 (2012) .
  • (54) Y. LeCun, Y. Bengio and G. Hinton, Deep learning, nature 521 (2015) 436–444.
  • (55) J. Schmidhuber, Deep learning in neural networks: An overview, Neural networks 61 (2015) 85–117.
  • (56) D. E. Rumelhart, G. E. Hinton and R. J. Williams, Learning representations by back-propagating errors, nature 323 (1986) 533–536.
  • (57) T. Hastie, R. Tibshirani, J. H. Friedman and J. H. Friedman, The elements of statistical learning: data mining, inference, and prediction, vol. 2. Springer, 2009.
  • (58) K. Hori and C. Vafa, Mirror symmetry, hep-th/0002222.
  • (59) B. Feng, Y.-H. He, K. D. Kennaway and C. Vafa, Dimer models from mirror symmetry and quivering amoebae, Adv. Theor. Math. Phys. 12 (2008) 489–545, [hep-th/0511287].
  • (60) A. Tikhonov, Regularization of incorrectly posed problems, in Soviet Math. Dokl., pp. 1624–1627, 1963.
  • (61) R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society Series B: Statistical Methodology 58 (1996) 267–288.
  • (62) D. Martelli and J. Sparks, Toric geometry, Sasaki-Einstein manifolds and a new infinite class of AdS/CFT duals, Commun. Math. Phys. 262 (2006) 51–89, [hep-th/0411238].
  • (63) S. Benvenuti, S. Franco, A. Hanany, D. Martelli and J. Sparks, An infinite family of superconformal quiver gauge theories with Sasaki-Einstein duals, JHEP 06 (2005) 064, [hep-th/0411264].
  • (64) S. Benvenuti and M. Kruczenski, From Sasaki-Einstein spaces to quivers via BPS geodesics: L^{p,q|r}, JHEP 04 (2006) 033, [hep-th/0505206].
  • (65) A. Butti, D. Forcella and A. Zaffaroni, The Dual superconformal theory for L^{p,q,r} manifolds, JHEP 09 (2005) 018, [hep-th/0505220].
  • (66) S. Franco, A. Hanany, K. D. Kennaway, D. Vegh and B. Wecht, Brane Dimers and Quiver Gauge Theories, JHEP 01 (2006) 096, [hep-th/0504110].
  • (67) A. Hanany and K. D. Kennaway, Dimer models and toric diagrams, hep-th/0503149.
  • (68) S. Franco et al., Gauge theories from toric geometry and brane tilings, JHEP 01 (2006) 128, [hep-th/0505211].
  • (69) R. Kenyon, An introduction to the dimer model, ArXiv Mathematics e-prints (Oct., 2003) , [math/0310326].
  • (70) P. Kasteleyn, Graph theory and crystal physics, Graph theory and theoretical physics (1967) 43–110.
  • (71) F. Hirzebruch, Singularities and exotic spheres. Société Mathématique de France, 1968.
  • (72) E. Brieskorn, Beispiele zur differentialtopologie von singularitäten, Inventiones mathematicae 2 (1966) 1–14.
  • (73) E. Witten, Phases of N = 2 theories in two dimensions, Nucl. Phys. B403 (1993) 159–222, [hep-th/9301042].
  • (74) A. Butti, D. Forcella, A. Hanany, D. Vegh and A. Zaffaroni, Counting Chiral Operators in Quiver Gauge Theories, JHEP 0711 (2007) 092, [0705.2771].
  • (75) A. Hanany and A. Zaffaroni, The master space of supersymmetric gauge theories, Adv.High Energy Phys. 2010 (2010) 427891.
  • (76) D. Forcella, A. Hanany, Y.-H. He and A. Zaffaroni, The Master Space of N=1 Gauge Theories, JHEP 0808 (2008) 012, [0801.1585].
  • (77) D. Forcella, A. Hanany, Y.-H. He and A. Zaffaroni, Mastering the Master Space, Lett.Math.Phys. 85 (2008) 163–171, [0801.3477].
  • (78) P. Pouliot, Molien function for duality, JHEP 01 (1999) 021, [hep-th/9812015].
  • (79) N. Seiberg, Electric - magnetic duality in supersymmetric nonAbelian gauge theories, Nucl. Phys. B435 (1995) 129–146, [hep-th/9411149].
  • (80) C. E. Beasley and M. Ronen Plesser, Toric duality is Seiberg duality, Journal of High Energy Physics 12 (Dec., 2001) 001, [hep-th/0109053].
  • (81) I. Goodfellow, Y. Bengio and A. Courville, Deep learning. MIT press, 2016.
  • (82) G. Pick, Geometrisches zur zahlenlehre, Sitzenber. Lotos (Prague) 19 (1899) 311–319.
  • (83) P. Berglund, B. Campbell and V. Jejjala, Machine Learning Kreuzer-Skarke Calabi-Yau Threefolds, 2112.09117.
  • (84) D. C. Montgomery, E. A. Peck and G. G. Vining, Introduction to linear regression analysis. John Wiley & Sons, 2021.
  • (85) A. E. Hoerl and R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1970) 55–67.
  • (86) H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society Series B: Statistical Methodology 67 (2005) 301–320.