
Sum and Difference Sets in Generalized Dihedral Groups

Ruben Ascoli, Justin Cheigh, Guilherme Zeus Dantas e Moura, Ryan Jeong, Andrew Keisling, Astrid Lilly, Steven J. Miller, Prakod Ngamlamai, Matthew Phang
Abstract.

Given a group $G$, we say that a set $A\subseteq G$ has more sums than differences (MSTD) if $|A+A|>|A-A|$, has more differences than sums (MDTS) if $|A+A|<|A-A|$, or is sum-difference balanced if $|A+A|=|A-A|$. A problem of recent interest has been to understand the frequencies of these types of subsets.

The seventh author and Vissuet studied the problem for arbitrary finite groups $G$ and proved that almost all subsets $A\subseteq G$ are sum-difference balanced as $|G|\to\infty$. For the dihedral group $D_{2n}$, they conjectured that of the remaining sets, most are MSTD, i.e., there are more MSTD sets than MDTS sets. Some progress on this conjecture was made by Haviland et al. in 2020, when they introduced the idea of partitioning the subsets by size: if, for each $m$, there are more MSTD subsets of $D_{2n}$ of size $m$ than MDTS subsets of size $m$, then the conjecture follows.

We extend the conjecture to generalized dihedral groups $D=\mathbb{Z}_{2}\ltimes G$, where $G$ is an abelian group of size $n$ and the nonidentity element of $\mathbb{Z}_{2}$ acts by inversion. We make further progress on the conjecture by considering subsets with a fixed number of rotations and reflections. By bounding the expected number of overlapping sums, we show that the collection $\mathcal{S}_{D,m}$ of subsets of the generalized dihedral group $D$ of size $m$ has more MSTD sets than MDTS sets when $6\leq m\leq c_{j}\sqrt{n}$ for $c_{j}=1.3229/\sqrt{111+5j}$, where $j$ is the number of elements in $G$ with order at most $2$. We also analyze the expectation for $|A+A|$ and $|A-A|$ for $A\subseteq D_{2n}$, proving an explicit formula for $|A-A|$ when $n$ is prime.

Key words and phrases:
More Sums Than Differences, Dihedral Group, Generalized Dihedral Group
2020 Mathematics Subject Classification:
11P99, 05B10
This work was supported by NSF grant DMS1947438, Williams College, and Harvey Mudd College.

1. Introduction and Main Results

Given a set $A$ of integers, the sumset and difference set of $A$ are defined as

$$A+A\ =\ \{a_{1}+a_{2}:a_{1},a_{2}\in A\}\quad\text{ and }\quad A-A\ =\ \{a_{1}-a_{2}:a_{1},a_{2}\in A\}. \tag{1}$$

These elementary operations are fundamental in additive number theory. A natural problem of recent interest has been to understand the relative sizes of the sum and difference sets of sets $A$.

Definition 1.1.

We say that a set $A$ has more sums than differences (MSTD) if $|A+A|>|A-A|$; has more differences than sums (MDTS) if $|A+A|<|A-A|$; or is sum-difference balanced if $|A+A|=|A-A|$.

We intuitively expect most sets to be MDTS since addition is commutative and subtraction is not. Nevertheless, MSTD subsets of integers exist. Nathanson detailed in [Nat07] the history of the problem, and attributed to John Conway the first recorded example of an MSTD subset of integers, $\{0,2,3,4,7,11,12,14\}$. Martin and O’Bryant proved in [MO07] that the proportion of the $2^{n}$ subsets $A$ of $\{0,1,\ldots,n-1\}$ which are MSTD is bounded below by a positive value for all $n\geq 15$. They proved this by controlling the “fringe” elements of $A$, those close to $0$ and $n-1$, which have the most influence over whether elements are missing from the sum and difference sets. In [Zha11], Zhao gave a deterministic algorithm to compute the limit of the ratio of MSTD subsets of $\{0,1,\ldots,n-1\}$ as $n$ goes to infinity and found that this ratio is at least $4.28\times 10^{-4}$. For more on the problem of MSTD sets in the integers, see also [Heg07] and [Nat07a] for constructive examples of infinite families of MSTD sets, [MOS10] and [Zha10] for non-constructive proofs of existence of infinite families of MSTD sets, and [HM09] and [HM13] for an analysis of sets with each integer from $0$ to $n-1$ included with probability $cn^{-\delta}$.

More recently, several authors have examined analogous problems for groups $G$ other than the integers. For example, Do, Kulkarni, Moon, Wellens, Wilcox, and the seventh author studied in [DKMMWW15] the analogous problem for higher-dimensional integer lattices.

For finite groups, although the usual notation for the operation of the group is multiplication, we match the notation from previous work and define, for a subset $A\subseteq G$, its sumset and difference set as

$$A+A\ =\ \{a_{1}a_{2}:a_{1},a_{2}\in A\}\quad\text{ and }\quad A-A\ =\ \{a_{1}a_{2}^{-1}:a_{1},a_{2}\in A\}. \tag{2}$$

Definition 1.1 of MSTD, MDTS, and sum-difference balanced sets applies in this context.

The approaches used to study MSTD subsets of integers do not generalize to MSTD subsets of finite groups due to the lack of fringes. Zhao proved asymptotics for numbers of MSTD subsets of finite abelian groups as the size of the group goes to infinity in [Zha10a]. The seventh author and Vissuet examined the problem for arbitrary finite groups $G$, also with the size of the group going to infinity, and proved Theorem 1.2.

Theorem 1.2 ([MV14]).

Let $\{G_{n}\}$ be a sequence of finite groups, not necessarily abelian, with $|G_{n}|\to\infty$. Let $A_{n}$ be a uniformly chosen random subset of $G_{n}$. Then $\mathbb{P}[A_{n}+A_{n}=A_{n}-A_{n}=G_{n}]\to 1$ as $n\to\infty$. In other words, as the size of the finite groups increases without bound, almost all subsets are balanced (with sumset and difference set equalling the entire group).

Furthermore, for the case of dihedral groups $D_{2n}$, they proposed Conjecture 1.3.

Conjecture 1.3 ([MV14]).

Let $n\geq 3$ be an integer. There are more MSTD subsets of $D_{2n}$ than MDTS subsets of $D_{2n}$.

Given a set $A\subseteq D_{2n}=\langle r,s\mid r^{n},s^{2},rsrs\rangle$, define $R$ (resp. $F$) as the set of elements of $A$ of the form $r^{i}$ (resp. $r^{i}s$), called rotation elements (resp. flip elements). Hence, $A=R\cup F$. Then, we can write

$$\begin{aligned}
A+A\ &=\ (R+R)\cup(F+F)\cup(R+F)\cup(-R+F),\\
A-A\ &=\ (R-R)\cup(F+F)\cup(R+F).
\end{aligned} \tag{3}$$

Intuition for Conjecture 1.3 comes from noting that $F+F$ and $R+F$ contribute to both $A+A$ and $A-A$; $R+R$ and $-R+F$ contribute only to $A+A$; and $R-R$ contributes only to $A-A$.
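The definitions above can be explored computationally. The following is a minimal Python sketch, not taken from the paper: it assumes the encoding of $r^{g}$ as the pair `(0, g)` and of $r^{g}s$ as `(1, g)`, and the function names `mul`, `inv`, `sumset`, `diffset`, and `classify` are illustrative choices.

```python
from itertools import product

def mul(x, y, n):
    """Multiply two elements of D_{2n}; (0, g) stands for r^g and (1, g) for r^g s."""
    (z1, g1), (z2, g2) = x, y
    # A flip on the left inverts the rotation part of the right factor.
    return ((z1 + z2) % 2, (g1 + (-g2 if z1 else g2)) % n)

def inv(x, n):
    """Inverse in D_{2n}: flips are involutions, rotations are negated."""
    z, g = x
    return x if z else (0, (-g) % n)

def sumset(A, n):
    return {mul(a, b, n) for a, b in product(A, repeat=2)}

def diffset(A, n):
    return {mul(a, inv(b, n), n) for a, b in product(A, repeat=2)}

def classify(A, n):
    """Return 'MSTD', 'MDTS', or 'balanced' for a subset A of D_{2n}."""
    s, d = len(sumset(A, n)), len(diffset(A, n))
    return "MSTD" if s > d else "MDTS" if s < d else "balanced"
```

For instance, `classify({(0, 1), (1, 0)}, 3)` compares the sum and difference sets of $\{r, s\}\subseteq D_{6}$, where the sumset has four elements and the difference set has two.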

In 2020, Haviland, Kim, Lâm, Lentfer, Trejos Suáres, and the seventh author made progress towards Conjecture 1.3 by partitioning subsets of $D_{2n}$ by size. They proposed Conjecture 1.4 as a means of proving Conjecture 1.3.

Conjecture 1.4 ([HKLLMT20]).

Let $n\geq 3$ be an integer, and let $\mathcal{S}_{2n,m}$ denote the collection of subsets of $D_{2n}$ of size $m$. For any $m\leq 2n$, $\mathcal{S}_{2n,m}$ has at least as many MSTD sets as MDTS sets.

They showed that Conjecture 1.4 holds for $m=2$, which we reproduce in this paper, and we also extend their approach to $m=3$. They also showed that Conjecture 1.4 holds for $m>n$ by showing that all sets in $\mathcal{S}_{2n,m}$ are sum-difference balanced. We prove this result in this paper, using Lemma 1.5. These results are proved in Section 2.

Lemma 1.5.

Let $n\geq 3$ be an integer, and let $A\subseteq D_{2n}$. Let $R$ (resp. $F$) be the subset of rotations (resp. flips) in $A$. Suppose that $|F|>\frac{n}{2}$ or $|R|>\frac{n}{2}$. Then, $A$ cannot be MDTS.

Furthermore, we extend Conjecture 1.3 as follows. A generalized dihedral group is given by $D=\mathbb{Z}_{2}\ltimes G$, where $G$ is any abelian group and where the nonidentity element of $\mathbb{Z}_{2}$ acts on $G$ by inversion.

Conjecture 1.6.

Let $G$ be an abelian group with at least one element of order $3$ or greater, and let $D=\mathbb{Z}_{2}\ltimes G$ be the corresponding generalized dihedral group. Then, there are more MSTD subsets of $D$ than MDTS subsets of $D$.

Conjecture 1.3 is a special case of Conjecture 1.6, with $G=\mathbb{Z}_{n}$. We also state Conjecture 1.7, analogous to Conjecture 1.4.

Conjecture 1.7.

Let $D$ be a generalized dihedral group of size $2n$, and let $\mathcal{S}_{D,m}$ denote the collection of subsets of $D$ of size $m$. For any $m\leq 2n$, $\mathcal{S}_{D,m}$ has at least as many MSTD sets as MDTS sets.

The version of Lemma 1.5 that we prove in Section 2 deals with the generalized dihedral group.

In Section 3, we prove our main theorem, verifying Conjecture 1.7 for the case of $m\leq c_{j}\sqrt{n}$, where $c_{j}$ is a constant (independent of $n$) depending only on the quantity $j$, the number of elements of order at most $2$ in the abelian group $G$. More explicitly, we show the following.

Theorem 1.8.

Let $D=\mathbb{Z}_{2}\ltimes G$ be a generalized dihedral group of size $2n$. Let $\mathcal{S}_{D,m}$ denote the collection of subsets of $D$ of size $m$, and let $j$ denote the number of elements in $G$ with order at most $2$. If $6\leq m\leq c_{j}\sqrt{n}$, where $c_{j}=1.3229/\sqrt{111+5j}$, then there are more MSTD sets than MDTS sets in $\mathcal{S}_{D,m}$.

See Section 3 for the proof of this result and two related theorems. We also extend these results to the dihedral group on finitely generated abelian groups in Section 3.3.

Next, in Section 4, we discuss the following result about the expected size of $|A-A|$ when $A$ is a randomly chosen set in $\mathcal{S}_{2n,m}$, the collection of subsets of $D_{2n}$ of size $m$.

Theorem 1.9.

If $n$ is prime, and $A$ is chosen uniformly at random from $\mathcal{S}_{2n,m}$, then

$$\mathbb{E}[|A-A|]\ =\ 2n-\frac{nm2^{m}\binom{n}{m}+2n(n-1)\binom{n-m-1}{m-1}}{m\binom{2n}{m}}-\frac{n^{2}(n-1)}{\binom{2n}{m}}\sum_{k=1}^{m-1}\frac{\binom{n+k-m-1}{m-k-1}\binom{n-k-1}{k-1}}{k(m-k)}. \tag{4}$$
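As a sanity check on Equation (4), the right-hand side can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the helper `binom` assumes the convention that a binomial coefficient vanishes outside $0\leq b\leq a$, which the statement above does not spell out, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binom(a, b):
    """Binomial coefficient, taken to be 0 outside 0 <= b <= a (an assumed convention)."""
    return comb(a, b) if 0 <= b <= a else 0

def expected_diffset_size(n, m):
    """Right-hand side of Equation (4) as an exact fraction; n is assumed prime, 1 <= m <= 2n."""
    first = Fraction(
        n * m * 2**m * binom(n, m) + 2 * n * (n - 1) * binom(n - m - 1, m - 1),
        m * binom(2 * n, m),
    )
    tail = sum(
        Fraction(binom(n + k - m - 1, m - k - 1) * binom(n - k - 1, k - 1), k * (m - k))
        for k in range(1, m)
    )
    return 2 * n - first - Fraction(n * n * (n - 1), binom(2 * n, m)) * tail
```

For $n=3$ and $m=2$ this evaluates to $12/5$, which matches a direct count over the $15$ two-element subsets of $D_{6}$: the six subsets consisting of two rotations or two flips have $|A-A|=3$, and the nine mixed subsets have $|A-A|=2$.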

Finally, in Section 5, we discuss directions for further research.

2. Direct Analysis

2.1. Small Subsets

For the case of the usual dihedral group $D_{2n}$, we have the following two results.

Lemma 2.1 ([HKLLMT20]).

Let $n\geq 3$, and let $\mathcal{S}_{2n,2}$ denote the collection of subsets of $D_{2n}$ of size $2$. Then, $\mathcal{S}_{2n,2}$ has strictly more MSTD sets than MDTS sets.

Lemma 2.2.

Let $n\geq 3$, and let $\mathcal{S}_{2n,3}$ denote the collection of subsets of $D_{2n}$ of size $3$. Then, $\mathcal{S}_{2n,3}$ has strictly more MSTD sets than MDTS sets.

The proofs for both these lemmas use basic and somewhat tedious casework; they can be found in Appendix A.
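For small $n$, statements of this kind can also be checked by exhaustive enumeration. The following Python sketch is not the casework proof from Appendix A; it simply counts MSTD and MDTS sets among all size-$m$ subsets of $D_{2n}$, under the assumed encoding of $r^{g}$ as `(0, g)` and $r^{g}s$ as `(1, g)`. All names are illustrative.

```python
from itertools import combinations, product

def mul(x, y, n):
    """Product in D_{2n}; a flip on the left inverts the right factor's rotation part."""
    (z1, g1), (z2, g2) = x, y
    return ((z1 + z2) % 2, (g1 + (-g2 if z1 else g2)) % n)

def inv(x, n):
    z, g = x
    return x if z else (0, (-g) % n)

def count_mstd_mdts(n, m):
    """Return (#MSTD, #MDTS) among the size-m subsets of D_{2n}, by brute force."""
    group = [(z, g) for z in (0, 1) for g in range(n)]
    mstd = mdts = 0
    for A in combinations(group, m):
        sums = {mul(a, b, n) for a, b in product(A, repeat=2)}
        diffs = {mul(a, inv(b, n), n) for a, b in product(A, repeat=2)}
        mstd += len(sums) > len(diffs)
        mdts += len(sums) < len(diffs)
    return mstd, mdts
```

The running time grows like $\binom{2n}{m}m^{2}$, so this is only practical for small parameters. For $m=2$ the only unbalanced sets are the pairs $\{r^{a}, r^{b}s\}$ with $2a\neq 0$, which are all MSTD, so `count_mstd_mdts(3, 2)` gives six MSTD sets and no MDTS sets.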

Similar results for the generalized dihedral group likely follow from similar arguments.

2.2. Large Subsets

We consider what happens when $m$ gets close to $n$. Here we can prove a result for any generalized dihedral group. For the rest of this section, let $G$ be a finite abelian group of size $n$. Recall that the generalized dihedral group is given by $D=\mathbb{Z}_{2}\ltimes G$, where the nonidentity element of $\mathbb{Z}_{2}$ acts on $G$ by inversion. Note that $|D|=2n$. Writing an element of the group $D$ as $(z,g)$ where $z\in\{0,1\}$ and $g\in G$, we write $R_{D}$ to mean the subset of $D$ consisting of elements with $z=0$ and $F_{D}$ to mean the subset of $D$ consisting of elements with $z=1$. Note that $|R_{D}|=|F_{D}|=n$. For the case where $D=D_{2n}=\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}$ is the usual dihedral group, $R_{D}$ and $F_{D}$ are the sets of rotations and flips, respectively; out of convenience, we will use these terms for the general case as well.

It turns out that having $m>n$ ensures that $A$ is balanced.

Lemma 2.3.

Let $D$ be a generalized dihedral group of size $2n$. Let $A\subseteq D$, and let $R=A\cap R_{D}$ and $F=A\cap F_{D}$. If $\max(|R|,|F|)>n/2$, then $R_{D}\subseteq A+A$ and $R_{D}\subseteq A-A$.

Proof.

Let $L$ be the larger of $R$ and $F$, and define $L_{D}=R_{D}$ if $L=R$ and $L_{D}=F_{D}$ if $L=F$.

For each rotation $r\in R_{D}$, define $rL^{-1}=\{r\ell^{-1}\ |\ \ell\in L\}$ and $rL=\{r\ell\ |\ \ell\in L\}$. Note that $|rL^{-1}|=|rL|=|L|>n/2$, and $rL^{-1}$, $rL$, $L$ are subsets of the set $L_{D}$, which has size $n$. Hence, by the inclusion–exclusion principle, $L\cap rL^{-1}$ and $L\cap rL$ are nonempty. Thus, $r\in L+L\subseteq A+A$ and $r\in L-L\subseteq A-A$.

Therefore, as desired, $R_{D}\subseteq A+A$ and $R_{D}\subseteq A-A$. ∎

Remark 1.

Note that Lemma 2.3 implies that if $\max(|R|,|F|)>n/2$, then $A$ is not MDTS. This follows from the discussion after the statement of Conjecture 1.3 (which we explicitly extend to the general dihedral group case in Section 3). For a set to be MDTS, $R-R$ must contribute rotations to $A-A$ that the set $A+A$ does not have. But here we have shown that if $\max(|R|,|F|)>n/2$, then $A+A$ has all the rotations.

Lemma 2.4.

Let $D$ be a generalized dihedral group of size $2n$, and let $A\subseteq D$ with $|A|=m$. If $m>n$, then $A+A=A-A=D$.

Proof.

Let $R$ (resp. $F$) be the subset of rotations (resp. flips) in $A$; hence $A=R\cup F$. Let $L,S$ be the larger and smaller of $R$ and $F$, respectively. Define $|L|=n_{1},|S|=n_{2}$, so that $n_{1}+n_{2}=m>n$. Thus, $n_{1}>n/2$.

By Lemma 2.3, we have that $R_{D}\subseteq A+A$ and $R_{D}\subseteq A-A$.

For each flip $f\in F_{D}$, define $fL^{-1}=\{f\ell^{-1}\ |\ \ell\in L\}$ and $fL=\{f\ell\ |\ \ell\in L\}$.

Note that $|fL^{-1}|=|fL|=|L|=n_{1}$, $|S|=n_{2}$, and $fL^{-1},fL,S$ are subsets of $S_{D}$ (where $S_{D}=R_{D}$ if $S=R$ and $S_{D}=F_{D}$ if $S=F$), which has size $n<n_{1}+n_{2}$. Hence, by the inclusion–exclusion principle, $fL^{-1}\cap S$ and $fL\cap S$ are nonempty. Thus, $f\in S+L\subseteq A+A$ and $f\in S-L\subseteq A-A$. Therefore, $F_{D}\subseteq A+A$ and $F_{D}\subseteq A-A$.

Thus, we have $R_{D},F_{D}\subseteq A+A$ and $R_{D},F_{D}\subseteq A-A$, which imply $A+A=A-A=D$. ∎
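Lemma 2.4 can be spot-checked on small generalized dihedral groups. The sketch below is an illustration rather than part of the proof: it assumes $G$ is given as a product of cyclic groups $\mathbb{Z}_{q_{1}}\times\cdots\times\mathbb{Z}_{q_{t}}$ written additively, with the list `moduli` holding the $q_{i}$, and all function names are hypothetical.

```python
from itertools import combinations, product

def gd_mul(x, y, moduli):
    """Product in the generalized dihedral group over G = Z_{q1} x ... x Z_{qt}."""
    (z1, g1), (z2, g2) = x, y
    sign = -1 if z1 else 1  # a flip on the left inverts the right factor's G-part
    return ((z1 + z2) % 2, tuple((a + sign * b) % q for a, b, q in zip(g1, g2, moduli)))

def gd_inv(x, moduli):
    """Flips are involutions; rotations are inverted inside G."""
    z, g = x
    return x if z else (0, tuple((-a) % q for a, q in zip(g, moduli)))

def large_subsets_fill_group(moduli, m):
    """Check whether A+A = A-A = D for every size-m subset A of D."""
    G = list(product(*(range(q) for q in moduli)))
    D = {(z, g) for z in (0, 1) for g in G}
    for A in combinations(sorted(D), m):
        sums = {gd_mul(a, b, moduli) for a, b in product(A, repeat=2)}
        diffs = {gd_mul(a, gd_inv(b, moduli), moduli) for a, b in product(A, repeat=2)}
        if sums != D or diffs != D:
            return False
    return True
```

With `moduli = (2, 2)` and $m=5>n=4$, every subset passes. The hypothesis $m>n$ cannot be weakened to $m\geq n$: with `moduli = (3,)` and $m=3$ the check fails, since the set of all rotations has $A+A=R_{D}$.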

3. Collision Analysis

Let $G$ be a finite abelian group of size $n$, written multiplicatively. Recall that the generalized dihedral group is given by $D=\mathbb{Z}_{2}\ltimes G$, where the nonidentity element of $\mathbb{Z}_{2}$ acts on $G$ by inversion.

This section is dedicated to proving the following.

See Theorem 1.8.

Note that the theorem is only useful when $c_{j}\sqrt{n}\geq 6$, or $n\geq(6/c_{j})^{2}$.

If $n$ is arbitrarily large and $j$ is a constant compared to $n$, we can make a stronger statement: we can replace $c_{j}$ in the above theorem with a constant arbitrarily close to $\sqrt{2/7}\approx 0.5345$. Specifically, we have the following.

Theorem 3.1.

For fixed $j$ and $\epsilon>0$, there exists $n_{j,\epsilon}$ with the following property. Let $G$ be an abelian group of size $n\geq n_{j,\epsilon}$ with at most $j$ elements of order $2$ or $1$. Then with $D$ and $\mathcal{S}_{D,m}$ defined as in Theorem 1.8, we have that if $6\leq m\leq\left(\sqrt{2/7}-\epsilon\right)\sqrt{n}$, then there are more MSTD sets than MDTS sets in $\mathcal{S}_{D,m}$.

We can give a stronger, more general statement on the proportion of MSTD sets in $\mathcal{S}_{D,m}$ if $m$ is large and also bounded above by a (smaller) constant times $\sqrt{n}$.

Theorem 3.2.

Let $D,j$, and $\mathcal{S}_{D,m}$ be defined as in Theorem 1.8. For any $\epsilon>0$, there exist $m_{\epsilon}$ and $c_{\epsilon,j}$ such that if $m_{\epsilon}\leq m\leq c_{\epsilon,j}\sqrt{n}$, the proportion of MSTD sets in $\mathcal{S}_{D,m}$ is at least $1-\epsilon$.

Here $m_{\epsilon}$ and $c_{\epsilon,j}$ are independent of $n$, but similarly to before, this theorem is only useful when $n\geq(m_{\epsilon}/c_{\epsilon,j})^{2}$.

Remark 2.

In practice, Theorems 1.8 and 3.2 are most useful if $j$ is essentially a constant compared to $n$. This is indeed the case for the original dihedral group $D_{2n}=\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}$, where $j=1$ when $n$ is odd and $j=2$ when $n$ is even, yielding $c_{j}\geq 0.12$. The family of original dihedral groups is also a good example of how to use Theorem 3.1: we can apply that theorem with $j=2$ and $\epsilon$ arbitrarily small to get that for large enough dihedral groups, the coefficient of $\sqrt{n}$ in the theorem can be made very close to $\sqrt{2/7}$, which is a significant improvement over $0.12$.

However, these theorems are not useful when, for example, $G=\mathbb{Z}_{2}\times\ldots\times\mathbb{Z}_{2}\times\mathbb{Z}_{3}$. Here $G$ does have an element of order at least $3$, so Conjecture 1.6 applies, but we have $j=n/3$, which is too large for Theorems 1.8 and 3.2 to apply to any values of $m$. In fact, if $j\geq n/100$, then $n\leq(6/c_{j})^{2}$, and Theorem 1.8 does not apply to any values of $m$.

We first prove Theorem 1.8 here. Then, in Subsection 3.1 we demonstrate Theorems 3.1 and 3.2.

Recall the definitions of $R_{D}$ and $F_{D}$ from Section 2.2. Note that any element in $F_{D}$ has order $2$ in $D$. Furthermore, any element in $R_{D}$ has order at most $2$ in $D$ if and only if it has order at most $2$ in $G$.

We begin with a set $A$ with size $m$ and count the number of elements in $A+A$ and $A-A$. In this count, we will make a naive assumption: there are no overlaps between sums and differences that we do not expect to overlap. Decompose $A$ into the union of the set of rotations $R=A\cap R_{D}$ and the set of flips $F=A\cap F_{D}$, and define $k=|F|$. We have:

$$\begin{aligned}
A+A\ &=\ (R+F)\cup(F+R)\cup(R+R)\cup(F+F);\\
A-A\ &=\ (R-F)\cup(F-R)\cup(R-R)\cup(F-F).
\end{aligned} \tag{5}$$

Consider first the flips in $A+A$ and $A-A$. In $A+A$, these are in $R+F$ and $F+R$. Note that we do not expect a lot of overlap in general; for a rotation $(0,g_{1})\in R$ and a flip $(1,g_{2})\in F$, $(0,g_{1})\cdot(1,g_{2})=(1,g_{1}g_{2})$ does not equal $(1,g_{2})\cdot(0,g_{1})=(1,g_{2}g_{1}^{-1})$ unless $g_{1}$ has order $1$ or $2$ in $G$. On the other hand, for the flips in $A-A$, we have $R-F=F-R$. This is because $(0,g_{1})\cdot(1,g_{2})^{-1}=(0,g_{1})\cdot(1,g_{2})=(1,g_{1}g_{2})$, and $(1,g_{2})\cdot(0,g_{1})^{-1}=(1,g_{2})\cdot(0,g_{1}^{-1})=(1,g_{1}g_{2})$, so these two are the same. There are $m-k$ rotations and $k$ flips in $A$, so we thus expect the flips to contribute $2(m-k)k$ to $A+A$ but only $(m-k)k$ to $A-A$.

Next, consider the rotations in $A+A$ and $A-A$. Begin with $F+F$ and $F-F$. Since all flips have order $2$, these are in fact the same set and thus always contribute equally to $A+A$ and to $A-A$. Also note that if $k\neq 0$ (we will treat the $k=0$ case later), the identity $1$ is contained in $F+F$ and $F-F$.

Next consider $R+R$ and $R-R$. Adding rotations is commutative, so we expect $R+R$ to contribute $\binom{m-k}{2}+(m-k)$ to the size of $A+A$, where the $m-k$ term comes from the sum of each rotation in $R$ with itself. On the other hand, in general $g_{1}g_{2}^{-1}\neq g_{2}g_{1}^{-1}$, so $R-R$ is expected to contribute $2\binom{m-k}{2}$. Here there is no additional $m-k$ term since when $g_{1}=g_{2}$, we have $g_{1}g_{2}^{-1}=1$, and $1$ was already counted in $A-A$ from $F-F$.

We now put this all together. For $|A+A|>|A-A|$, we need

$$\begin{aligned}
&2(m-k)k+\binom{m-k}{2}+(m-k)\ >\ (m-k)k+2\binom{m-k}{2},\\
\iff\ &k\ >\ m/3-1\text{ and }m\ \neq\ k,\\
\iff\ &m/3\ \leq\ k\ <\ m.
\end{aligned} \tag{6}$$

Note that when $k=m$ the set is necessarily balanced as $F+F=F-F$. Further, one can now see why we may assume $k\neq 0$: when $k$ is smaller than $m/3$, we expect the set to be MDTS, and indeed we will assume this is the case.
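The naive count behind Equation (6) is easy to tabulate. The following Python sketch, with illustrative function names, computes the two sides of the first line of Equation (6) for a set with $m$ elements and $k$ flips. Both sides omit the shared contribution of $F+F=F-F$.

```python
from math import comb

def naive_counts(m, k):
    """Naive contributions to |A+A| and |A-A|, excluding the shared set F+F = F-F."""
    rot = m - k  # number of rotations in A
    sums = 2 * rot * k + comb(rot, 2) + rot   # (R+F), (F+R), and R+R
    diffs = rot * k + 2 * comb(rot, 2)        # R-F = F-R, and R-R without the identity
    return sums, diffs

def naively_mstd(m, k):
    """True when the naive count predicts |A+A| > |A-A|."""
    sums, diffs = naive_counts(m, k)
    return sums > diffs
```

The difference of the two counts factors as $(m-k)(3k-m+3)/2$, so `naively_mstd(m, k)` holds exactly when $k>m/3-1$ and $k\neq m$. For example, with $m=6$ the counts are $(25,25)$ at $k=1$ and $(26,20)$ at $k=2$.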

To use this naive estimate to prove our theorem, we first formalize our assumption that we have minimal overlaps within the sumset.

Definition 3.3.

Let $A\subseteq D$ such that $|A|=m$, and let $q=(a,b,c,d)\in A^{4}$. We say that $q$ represents a collision if $ab=cd$.

Every collision that occurs in $A$ has the potential to make $|A+A|$ smaller relative to our naive estimate, unless the quadruple $q$ is of a form that we already took into account. For example, if $q=(a,b,a,b)$, then this is not a collision we need to count as $ab=ab$ trivially. Similarly, collisions of the form $q=(a,b,b,a)$ with $a$ and $b$ both rotations do not subtract from $|A+A|$ in Equation (3) as we already accounted for commutativity of addition for rotations. And finally, if $q=(a,b,c,d)$ is a collision where $a$, $b$, $c$, and $d$ are all flips, then $q$ does not impact Equation (3) as $F+F=F-F$ does not affect the relative sizes of the sum and difference sets. We refer to these three kinds of quadruples as redundant.

Every non-redundant collision $(a,b,c,d)$ of $A$, together with $(c,d,a,b)$, decreases the size of $A+A$ by at most $1$ from our naive estimate. Let $X_{A}$ be half the total number of non-redundant collisions of $A$. Then combining the above analysis with Equation (3), we are guaranteed to have that $A$ is MSTD when
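To make the definition of $X_{A}$ concrete, the following Python sketch counts non-redundant collisions by brute force in the usual dihedral group $D_{2n}$. It is an illustration under assumptions: elements are encoded as `(0, g)` for $r^{g}$ and `(1, g)` for $r^{g}s$, and the names `is_redundant` and `collision_count` are hypothetical.

```python
from itertools import product

def mul(x, y, n):
    """Product in D_{2n}; a flip on the left inverts the right factor's rotation part."""
    (z1, g1), (z2, g2) = x, y
    return ((z1 + z2) % 2, (g1 + (-g2 if z1 else g2)) % n)

def is_redundant(q):
    """The three redundant shapes of a quadruple (a, b, c, d) described in the text."""
    a, b, c, d = q
    if (a, b) == (c, d):
        return True                       # ab = ab holds trivially
    if (c, d) == (b, a) and a[0] == 0 and b[0] == 0:
        return True                       # commuting rotations, already in the naive count
    return all(x[0] == 1 for x in q)      # four flips: F+F = F-F

def collision_count(A, n):
    """X_A: half the number of non-redundant quadruples in A^4 with ab = cd."""
    hits = sum(
        1
        for q in product(A, repeat=4)
        if mul(q[0], q[1], n) == mul(q[2], q[3], n) and not is_redundant(q)
    )
    return hits // 2  # hits is even: (a,b,c,d) and (c,d,a,b) are counted together
```

For $A=\{r^{2},s\}\subseteq D_{8}$ this returns $2$: the rotation $r^{2}$ has order $2$, so $r^{2}r^{2}=ss$ and $r^{2}s=sr^{2}$, and each coincidence contributes a mirrored pair of quadruples.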

$$\begin{aligned}
&2(m-k)k+\binom{m-k}{2}+(m-k)-X_{A}\ >\ (m-k)k+2\binom{m-k}{2},\\
\iff\ &m/3+\frac{2X_{A}}{3(m-k)}\ \leq\ k<m,\\
\iff\ &3k^{2}-4mk+m^{2}+2X_{A}\ \leq\ 0\text{ (and $k\ <\ m$)}.
\end{aligned} \tag{7}$$

We use the quadratic formula to solve for when the above quantity equals $0$ and obtain $k=(1/6)(4m\pm\sqrt{16m^{2}-12(m^{2}+2X_{A})})=(1/3)(2m\pm\sqrt{m^{2}-6X_{A}})$. Thus Equation (7) is satisfied when:

$$\frac{2m-\sqrt{m^{2}-6X_{A}}}{3}\ \leq\ k\ \leq\ \frac{2m+\sqrt{m^{2}-6X_{A}}}{3}\ \ \text{ (and $k\ <\ m$)}. \tag{8}$$
Remark 3.

We are assuming that no “collisions” of the form $ab^{-1}=cd^{-1}$ happen to lower the size of $A-A$. Because our objective is to guarantee $|A+A|>|A-A|$ for a large proportion of $A$, this assumption still gives a sufficient condition on $X_{A}$ and $k$.

One therefore sees that for most values of $k$, when the number of collisions is not too large, $A$ is MSTD. More formally, suppose that $A$ is chosen randomly out of the subsets of $D$ with size $m$. Suppose that the expected value of $X_{A}$ is bounded above by $c_{1}m^{2}$, where $c_{1}=7/1152\approx 0.006076$. Then the actual value of $X_{A}$ exceeds $12c_{1}m^{2}$ at most $1/12$ of the time by Markov’s inequality. Of sets $A$ with $5m/12\leq k\leq 11m/12$, the actual value of $X_{A}$ exceeds $12c_{1}m^{2}$ at most $2/12=1/6$ of the time. Thus when $5m/12\leq k\leq 11m/12$, Equation (8) is true at least $5/6$ of the time since when $X_{A}=12c_{1}m^{2}$, the equation reads

$$\begin{aligned}
&\frac{2m-\sqrt{m^{2}-72c_{1}m^{2}}}{3}\ \leq\ k\ \leq\ \frac{2m+\sqrt{m^{2}-72c_{1}m^{2}}}{3},\\
\iff\ &\frac{2m-\sqrt{9/16\cdot m^{2}}}{3}\ \leq\ k\ \leq\ \frac{2m+\sqrt{9/16\cdot m^{2}}}{3},\\
\iff\ &\frac{5m}{12}\ \leq\ k\ \leq\ \frac{11m}{12}.
\end{aligned} \tag{9}$$
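The arithmetic leading from Equation (8) to Equation (9) can be reproduced numerically. The sketch below is illustrative; the names `C1` and `k_interval` are not from the paper.

```python
from fractions import Fraction
from math import sqrt

C1 = Fraction(7, 1152)  # the constant c_1 from the text

def k_interval(m, x):
    """Endpoints of Equation (8) for a set of size m with X_A = x (needs m^2 >= 6x)."""
    root = sqrt(m * m - 6 * x)
    return (2 * m - root) / 3, (2 * m + root) / 3
```

Since $72c_{1}=7/16$, taking $X_{A}=12c_{1}m^{2}$ leaves $\sqrt{9m^{2}/16}=3m/4$ under the root. With $m=12$, `k_interval(12, 12 * C1 * 144)` gives the endpoints $5$ and $11$, that is, $5m/12$ and $11m/12$.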

Now, we need to make sure that for our values of $m$, more than a $(1/2)/(5/6)=3/5$ proportion of sets in $\mathcal{S}_{D,m}$ have $5m/12\leq k\leq 11m/12$. This will ensure that a proportion greater than $1/2$ of sets in $\mathcal{S}_{D,m}$ satisfy Equation (8) and are therefore MSTD.

More formally speaking, we require

$$\lim_{n\to\infty}\frac{\sum_{k=\left\lceil 5m/12\right\rceil}^{\left\lfloor 11m/12\right\rfloor}\binom{n}{k}\binom{n}{m-k}}{\binom{2n}{m}}\ >\ 3/5 \tag{10}$$

$$\iff\ \frac{m!}{2^{m}}\sum_{k=\left\lceil 5m/12\right\rceil}^{\left\lfloor 11m/12\right\rfloor}\frac{1}{k!(m-k)!}\ >\ 3/5, \tag{11}$$

where we may take the limit as $n\to\infty$ because the left hand side of Equation (10) decreases with increasing $n$. Equation (11) can be verified numerically to be true for $m\geq 6$. (Footnote 1: In fact, when $m$ is large, we expect almost all of the sets in $\mathcal{S}_{D,m}$ to have $5m/12\leq k\leq 11m/12$; see Subsection 3.1 for further discussion on this fact. For smaller values of $m\geq 6$, one can verify Equations (10) and (11) using the following Desmos link: https://www.desmos.com/calculator/e4zqwbmmcr.)
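The numerical verification of Equation (11) can also be done in a few lines of Python. This is a sketch with an illustrative function name; it uses the identity $\frac{m!}{k!(m-k)!}=\binom{m}{k}$ and exact rational arithmetic so that the comparison with $3/5$ involves no rounding.

```python
from fractions import Fraction
from math import comb

def middle_mass(m):
    """Left-hand side of Equation (11): binomial mass on ceil(5m/12) <= k <= floor(11m/12)."""
    lo = -(-5 * m // 12)   # ceil(5m/12) via integer arithmetic
    hi = 11 * m // 12      # floor(11m/12)
    return Fraction(sum(comb(m, k) for k in range(lo, hi + 1)), 2**m)
```

For $m=6$ the value is $41/64\approx 0.64$, which exceeds $3/5$, while for $m=5$ it is $15/32$, which does not. This is consistent with the lower bound $m\geq 6$ in Theorem 1.8.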

The problem has therefore been reduced to placing an upper bound on the values of $m$ such that the expected value of $X_{A}$ is at most $c_{1}m^{2}$ when $A$ is chosen uniformly at random from $\mathcal{S}_{D,m}$. The following lemma gives us the result we need.

Lemma 3.4.

When $A$ is chosen uniformly at random from $\mathcal{S}_{D,m}$, we have

$$\mathbb{E}[X_{A}]\ \leq\ \left(\frac{7}{32}+\frac{1}{m}+\frac{5j}{8m^{2}}\right)\frac{m^{4}}{n}. \tag{12}$$

Much of the machinery of this proof lies in Lemma 3.4; its proof is rather technical and can be found in Subsection 3.2.

When $m\geq 6$, Lemma 3.4 implies that under the hypothesis of the lemma,

$$\mathbb{E}[X_{A}]\ \leq\ c_{2}\frac{m^{4}}{n},\quad\text{ where }c_{2}\ =\ \frac{7}{32}+\frac{1}{6}+\frac{5j}{288}\ =\ \frac{111+5j}{288}. \tag{13}$$

We are ready to complete the proof of Theorem 1.8. Recall that we wanted to have $\mathbb{E}[X_{A}]\leq c_{1}m^{2}$ to ensure most subsets of size $m$ would be MSTD. Thus the requisite upper bound on $m$ is determined as

$$\begin{aligned}
&c_{2}\frac{m^{4}}{n}\ \leq\ c_{1}m^{2},\\
\iff\ &m\ \leq\ \sqrt{\frac{c_{1}n}{c_{2}}}\ =\ c_{j}\sqrt{n},
\end{aligned} \tag{14}$$

where $c_{j}=\sqrt{c_{1}/c_{2}}=\sqrt{7/(4(111+5j))}\approx 1.3229/\sqrt{5j+111}$.
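The constants in Equations (13) and (14) can be evaluated directly. In the sketch below, the function names `c_j` and `min_group_size` are illustrative. The second function computes the smallest $n$ for which the range $6\leq m\leq c_{j}\sqrt{n}$ is nonempty, using exact fractions to avoid rounding at the threshold.

```python
from fractions import Fraction
from math import ceil, sqrt

def c_j(j):
    """c_j = sqrt(c_1 / c_2) with c_1 = 7/1152 and c_2 = (111 + 5j)/288."""
    return sqrt(Fraction(7, 1152) / Fraction(111 + 5 * j, 288))

def min_group_size(j):
    """Smallest integer n with c_j * sqrt(n) >= 6, i.e. n >= 36 * c_2 / c_1."""
    return ceil(36 * Fraction(111 + 5 * j, 288) / Fraction(7, 1152))
```

For the usual dihedral groups with $n$ even ($j=2$), this gives $c_{2}\approx 0.1203$ and a threshold of $n\geq 2490$, so Theorem 1.8 only has content for fairly large groups.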

This concludes the proof of Theorem 1.8.

3.1. Proof of Theorems 3.1 and 3.2

In this section we prove Theorem 3.1, and in the process we outline the steps needed to prove Theorem 3.2. We now assume that $n>n_{j,\epsilon}$, where $j$ is a constant upper bound on the number of elements of order at most $2$ in the group $G$ and $n_{j,\epsilon}$ is sufficiently large.

Having proved Theorem 1.8 for $6\leq m\leq(1.3229/\sqrt{5j+111})\sqrt{n}$, we may now assume $m\geq(1.3229/\sqrt{5j+111})\sqrt{n}$; that is, $m$ is now large. We will follow similar steps to the previous proof, but using this assumption, we will increase $c_{1}$ to be arbitrarily close to $1/16$, and we will decrease $c_{2}$ to be arbitrarily close to $7/32$. Then we will have that the coefficient of $\sqrt{n}$, which is $\sqrt{c_{1}/c_{2}}$ (as discussed in the previous proof), is arbitrarily close to $\sqrt{2/7}$.

Take small $\epsilon_{1}>0$. Note that inside of $\mathcal{S}_{D,m}$, the distribution of values of $k$ is a hypergeometric distribution. This is because one can construct a random set in $\mathcal{S}_{D,m}$ by taking $m$ random elements of the group $D$ without replacement, one at a time; to begin with there is a $1/2$ chance each time that we choose a flip. Thus since $n$ is very large and $j$ is fixed, having $m\geq(1.3229/\sqrt{5j+111})\sqrt{n}$ is sufficient for a proportion at least $1-\epsilon_{1}$ of sets in $\mathcal{S}_{D,m}$ to have $k\in[(1/2-\epsilon_{1})m,(1/2+\epsilon_{1})m]$.
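The hypergeometric distribution of $k$ can be computed exactly for given $n$ and $m$. The following Python sketch is illustrative, with hypothetical function names; it evaluates the probability mass function $\binom{n}{k}\binom{n}{m-k}/\binom{2n}{m}$ and the share of sets whose flip count lies within $\epsilon_{1}m$ of $m/2$.

```python
from fractions import Fraction
from math import comb

def flip_count_pmf(n, m, k):
    """P[|F| = k] for a uniform size-m subset of a group with n rotations and n flips."""
    return Fraction(comb(n, k) * comb(n, m - k), comb(2 * n, m))

def mass_near_half(n, m, eps):
    """Share of size-m subsets whose flip count k satisfies |k - m/2| <= eps * m."""
    return sum(
        flip_count_pmf(n, m, k)
        for k in range(m + 1)
        if abs(Fraction(k) - Fraction(m, 2)) <= Fraction(eps) * m
    )
```

The probabilities sum to $1$ over $0\leq k\leq m$ by the Vandermonde identity. Evaluating `mass_near_half` for growing $m$ at a fixed `eps` gives a concrete view of the concentration used in the argument above.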

Going back to Equation (8), we thus see that we just need m26XA\sqrt{m^{2}-6X_{A}} to be at least m/2+ϵ1m/2+\epsilon_{1}, or 6XAm2(m/2+ϵ1)26X_{A}\leq m^{2}-(m/2+\epsilon_{1})^{2}, slightly more than half the time when kk is in the relevant interval. More specifically, we need

[6XA3m24mϵ1ϵ12|(1/2ϵ1)mk(5/6+ϵ1)m]121(1ϵ1)2.\mathbb{P}\left[6X_{A}\leq\frac{3m^{2}}{4}-m\epsilon_{1}-\epsilon_{1}^{2}\ \Big{|}\ (1/2-\epsilon_{1})m\leq k\leq(5/6+\epsilon_{1})m\right]\ \geq\ \frac{1}{2}\cdot\frac{1}{(1-\epsilon_{1})^{2}}. (15)

Then, among sets with (1/2ϵ1)mk(5/6+ϵ1)m(1/2-\epsilon_{1})m\leq k\leq(5/6+\epsilon_{1})m (which, recall, form a proportion of at least 1ϵ11-\epsilon_{1} of sets in 𝒮D,m\mathcal{S}_{D,m}), at least a proportion of (1/2)/(1ϵ1)2(1/2)/(1-\epsilon_{1})^{2} satisfy Equation (8). This means that a proportion of at least (1/2)/(1ϵ1)>1/2(1/2)/(1-\epsilon_{1})>1/2 of sets in 𝒮D,m\mathcal{S}_{D,m} are MSTD.

For Equation (15) to hold, we claim that it suffices to have the following probability bound, not conditioned on the size of kk:

[XAm28mϵ16ϵ126]1211ϵ1+ϵ1.\mathbb{P}\left[X_{A}\leq\frac{m^{2}}{8}-\frac{m\epsilon_{1}}{6}-\frac{\epsilon_{1}^{2}}{6}\right]\ \geq\ \frac{1}{2}\cdot\frac{1}{1-\epsilon_{1}}+\epsilon_{1}. (16)

To see why Equation (16) implies Equation (15), call BB the event that (1/2ϵ1)mk(5/6+ϵ1)m(1/2-\epsilon_{1})m\leq k\leq(5/6+\epsilon_{1})m and CC the event that XAm28mϵ16ϵ126X_{A}\leq\frac{m^{2}}{8}-\frac{m\epsilon_{1}}{6}-\frac{\epsilon_{1}^{2}}{6}. Then, we manipulate conditional probabilities as follows.

[C]=[C|B][B]+[C|¬B][¬B][C|B][B]+[¬B],\displaystyle\mathbb{P}[C]\ =\ \mathbb{P}\left[C\ |\ B\right]\mathbb{P}[B]+\mathbb{P}\left[C\ |\ \neg{B}\right]\mathbb{P}[\neg{B}]\ \leq\ \mathbb{P}\left[C\ |\ B\right]\mathbb{P}[B]+\mathbb{P}[\neg B],
[C|B][C][¬B][B]=[C](1[B])[B]= 11[C][B].\displaystyle\iff\mathbb{P}\left[C\ |\ B\right]\ \geq\ \frac{\mathbb{P}[C]-\mathbb{P}[\neg B]}{\mathbb{P}[B]}\ =\ \frac{\mathbb{P}[C]-(1-\mathbb{P}[B])}{\mathbb{P}[B]}\ =\ 1-\frac{1-\mathbb{P}[C]}{\mathbb{P}[B]}. (17)

Since [B]1ϵ1\mathbb{P}[B]\geq 1-\epsilon_{1} and Equation (16) says that [C](1/2)/(1ϵ1)+ϵ1\mathbb{P}[C]\geq(1/2)/(1-\epsilon_{1})+\epsilon_{1}, we have that if Equation (16) is true, then

[C|B] 11((1/2)/(1ϵ1)+ϵ1)1ϵ1= 1(1ϵ1)(1/2)/(1ϵ1)1ϵ1=1/2(1ϵ1)2,\displaystyle\mathbb{P}[C\ |\ B]\ \geq\ 1-\frac{1-((1/2)/(1-\epsilon_{1})+\epsilon_{1})}{1-\epsilon_{1}}\ =\ 1-\frac{(1-\epsilon_{1})-(1/2)/(1-\epsilon_{1})}{1-\epsilon_{1}}\ =\ \frac{1/2}{(1-\epsilon_{1})^{2}}, (18)

and the claim is shown.

To ensure that Equation (16) is true, we require

𝔼[XA](1ϵ11/21ϵ1)(m28mϵ16ϵ126).\mathbb{E}[X_{A}]\ \leq\ \left(1-\epsilon_{1}-\frac{1/2}{1-\epsilon_{1}}\right)\left(\frac{m^{2}}{8}-\frac{m\epsilon_{1}}{6}-\frac{\epsilon_{1}^{2}}{6}\right). (19)

Then by Markov’s inequality, the probability that XAX_{A} exceeds m28mϵ16ϵ126\frac{m^{2}}{8}-\frac{m\epsilon_{1}}{6}-\frac{\epsilon_{1}^{2}}{6} is at most 1ϵ11/21ϵ11-\epsilon_{1}-\frac{1/2}{1-\epsilon_{1}}, which is equivalent to Equation (16).

We may now choose a small value ϵ2\epsilon_{2} such that Equation (19) is true if

𝔼[XA](116ϵ2)m2.\mathbb{E}[X_{A}]\ \leq\ \left(\frac{1}{16}-\epsilon_{2}\right)m^{2}. (20)

Notice that in the limit ϵ10\epsilon_{1}\to 0, Equation (19) boils down to the statement that 𝔼[XA]m2/16\mathbb{E}[X_{A}]\leq m^{2}/16, so we can make ϵ2\epsilon_{2} be as small as desired by making ϵ1\epsilon_{1} be small.

We now revisit Lemma 3.4. We are now assuming that mm is large and jj is a constant, so only the first term dominates:

𝔼[XA](732+ϵ3)m4n.\mathbb{E}[X_{A}]\ \leq\ \left(\frac{7}{32}+\epsilon_{3}\right)\frac{m^{4}}{n}. (21)

We wanted to have 𝔼[XA](1/16ϵ2)m2\mathbb{E}[X_{A}]\leq(1/16-\epsilon_{2})m^{2}. Thus the upper bound on mm is now determined as

(732+ϵ3)m4n(116ϵ2)m2\displaystyle\left(\frac{7}{32}+\epsilon_{3}\right)\frac{m^{4}}{n}\ \leq\ \left(\frac{1}{16}-\epsilon_{2}\right)m^{2}
mcn,\displaystyle\iff m\ \leq\ c\sqrt{n}, (22)

where

c=1/16ϵ7/32+ϵ3.\displaystyle c\ =\ \sqrt{\frac{1/16-\epsilon^{\prime}}{7/32+\epsilon_{3}}}. (23)

If nn is arbitrarily large and jj is constant compared to nn, we can choose ϵ2\epsilon_{2} and ϵ3\epsilon_{3} to be very small, so that cc is arbitrarily close to (1/16)/(7/32)0.5345\sqrt{(1/16)/(7/32)}\approx 0.5345.
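As a quick numerical illustration of this limit, the following minimal sketch evaluates the constant of Equation (23) for shrinking slack terms. The function name and the parameter names `eps_num` and `eps_den` (the slack subtracted from $1/16$ and added to $7/32$) are hypothetical labels chosen for this example.

```python
import math

def threshold_constant(eps_num: float = 0.0, eps_den: float = 0.0) -> float:
    """c = sqrt((1/16 - eps_num) / (7/32 + eps_den)), following Equation (23)."""
    return math.sqrt((1 / 16 - eps_num) / (7 / 32 + eps_den))

# Shrinking both slack terms pushes c up toward sqrt(2/7).
for eps in (1e-2, 1e-3, 1e-4, 0.0):
    print(f"eps = {eps:g}: c = {threshold_constant(eps, eps):.4f}")
```

With both slack terms set to zero the function returns $\sqrt{2/7}$, which is the constant appearing in Corollary 3.7.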

This completes the proof of Theorem 3.1. To prove Theorem 3.2, we follow a very similar method. We choose mϵm_{\epsilon} large enough that almost all of the sets in 𝒮D,m\mathcal{S}_{D,m} have (0.5ϵ1)mk(0.5+ϵ1)m(0.5-\epsilon_{1})m\leq k\leq(0.5+\epsilon_{1})m. The only difference is that now, in Equation (15) we replace the right-hand side with (1ϵ1)(1-\epsilon_{1}), so that the proportion of MSTD sets is at least (1ϵ1)2(1-\epsilon_{1})^{2}. (We may choose ϵ1\epsilon_{1} so that (1ϵ1)2(1-\epsilon_{1})^{2} equals the desired (1ϵ)(1-\epsilon)). Then in Equation (16) we replace the right-hand side with 1(1ϵ1)ϵ11-(1-\epsilon_{1})\epsilon_{1}, leading to the analog of Equation (20) being that 𝔼[XA]ϵ2m2\mathbb{E}[X_{A}]\leq\epsilon_{2}m^{2} for some small ϵ2\epsilon_{2} depending on ϵ1\epsilon_{1}. The rest of the proof continues as before, leading to a coefficient cϵ,jc_{\epsilon,j} proportional to ϵ2\sqrt{\epsilon_{2}}. This completes the proof.

3.2. Proof of Lemma 3.4

We now prove Lemma 3.4. Recall that XAX_{A} is defined to be half the number of non-redundant collisions in the set AA, and we are interested in bounding above the expectation value of XAX_{A} when AA is chosen uniformly at random from 𝒮D,m\mathcal{S}_{D,m}.

To more easily count the collisions in AA, we make the following definition.

Definition 3.5.

A redundant triple is a triple (a,b,c)D3(a,b,c)\in D^{3} such that the quadruple (a,b,c,c1ab)(a,b,c,c^{-1}ab) is redundant. That is, a triple (a,b,c)(a,b,c) is redundant if a=ca=c, or if a,b,a,b, and cc are all flips, or if b=cb=c and aa and bb are both rotations. Denote TD3T\subseteq D^{3} to be the set of non-redundant triples.

Define the function χ:𝒮D,m×T{0,1}\chi:\mathcal{S}_{D,m}\times T\to\{0,1\} by χ(A,t)=1\chi(A,t)=1 if for the non-redundant triple t=(a,b,c)t=(a,b,c), the element c1abc^{-1}ab is in AA, and χ(A,t)=0\chi(A,t)=0 otherwise.

For a fixed set AA, the set A3TA^{3}\cap T is the set of non-redundant triples with all three elements contained in AA. Notice that we have

XA=12tA3Tχ(A,t).X_{A}\ =\ \frac{1}{2}\sum_{t\in A^{3}\cap T}\chi(A,t). (24)

That is, the number of non-redundant collisions in AA is the same as the number of non-redundant triples (a,b,c)A3T(a,b,c)\in A^{3}\cap T such that the element d=c1abd=c^{-1}ab is in AA, forming a quadruple (a,b,c,d)(a,b,c,d) representing a collision ab=cdab=cd.

By definition of expectation value, we write

𝔼[XA]=A𝒮D,m[A]XA.\mathbb{E}[X_{A}]\ =\ \sum_{A\in\mathcal{S}_{D,m}}\mathbb{P}[A]X_{A}. (25)

Since AA is chosen uniformly at random from the (2nm)\binom{2n}{m} sets in 𝒮D,m\mathcal{S}_{D,m}, we have [A]=1/(2nm)\mathbb{P}[A]=1/\binom{2n}{m}. Thus,

𝔼[XA]=1(2nm)A𝒮D,m12tA3Tχ(A,t).\mathbb{E}[X_{A}]\ =\ \frac{1}{\binom{2n}{m}}\sum_{A\in\mathcal{S}_{D,m}}\frac{1}{2}\sum_{t\in A^{3}\cap T}\chi(A,t). (26)

We swap the order of the sums.

𝔼[XA]=121(2nm)tTA𝒮D,mA3tχ(A,t).\mathbb{E}[X_{A}]\ =\ \frac{1}{2}\frac{1}{\binom{2n}{m}}\sum_{t\in T}\sum_{\begin{subarray}{c}A\in\mathcal{S}_{D,m}\\ A^{3}\ni t\end{subarray}}\chi(A,t). (27)

To compute the inner sum, we must simply count the number of sets A𝒮D,mA\in\mathcal{S}_{D,m} with t=(a,b,c)A3t=(a,b,c)\in A^{3} such that AA contains c1abc^{-1}ab. That is,

𝔼[XA]=121(2nm)(a,b,c)T|{A𝒮D,m|a,b,c,c1abA}|.\mathbb{E}[X_{A}]\ =\ \frac{1}{2}\frac{1}{\binom{2n}{m}}\sum_{(a,b,c)\in T}|\{A\in\mathcal{S}_{D,m}\ |\ a,b,c,c^{-1}ab\in A\}|. (28)

We now break this sum into seven pieces for different kinds of triples (a,b,c)T(a,b,c)\in T. These are:

  • T1={(a,b,c)T|a,b,c,c1ab are distinct }T_{1}\ =\ \{(a,b,c)\in T\ |\ a,b,c,c^{-1}ab\text{ are distinct }\}

  • T2={(a,b,c)T|a=b;a,c,c1a2 are distinct}T_{2}\ =\ \{(a,b,c)\in T\ |\ a=b;\ a,c,c^{-1}a^{2}\text{ are distinct}\}

  • T3={(a,b,c)T|b=c;a,c,c1ac are distinct}T_{3}\ =\ \{(a,b,c)\in T\ |\ b=c;\ a,c,c^{-1}ac\text{ are distinct}\}

  • T4={(a,b,c)T|c1ab=a;a,b,c are distinct}T_{4}\ =\ \{(a,b,c)\in T\ |\ c^{-1}ab=a;\ a,b,c\text{ are distinct}\}

  • T5={(a,b,c)T|c1ab=c;a,b,c are distinct}T_{5}\ =\ \{(a,b,c)\in T\ |\ c^{-1}ab=c;\ a,b,c\text{ are distinct}\}

  • T6={(a,b,c)T|b=c;a=c1ac;a,c are distinct}T_{6}\ =\ \{(a,b,c)\in T\ |\ b=c;a=c^{-1}ac;\ a,c\text{ are distinct}\}

  • T7={(a,b,c)T|a=b;c=c1a2;a,c are distinct}T_{7}\ =\ \{(a,b,c)\in T\ |\ a=b;c=c^{-1}a^{2};\ a,c\text{ are distinct}\}

We have T=i=17TiT=\bigcup_{i=1}^{7}T_{i}, for these seven cases cover all the cases of possible equalities between the four elements except for those where a=ca=c, or equivalently, c1ab=bc^{-1}ab=b, since those cases are redundant triples. Furthermore, this union is disjoint.

Note that for triples (a,b,c)T1(a,b,c)\in T_{1}, the quantity |{A𝒮D,m|a,b,c,c1abA}||\{A\in\mathcal{S}_{D,m}\ |\ a,b,c,c^{-1}ab\in A\}| is given by (2n4m4)\binom{2n-4}{m-4} since we are requiring four distinct elements to be in AA, and we have 2n42n-4 choices for the remaining m4m-4 elements. For triples in T2,T3,T4,T_{2},T_{3},T_{4}, and T5T_{5}, we are requiring three distinct elements to be in AA, so we have |{A𝒮D,m|a,b,c,c1abA}|=(2n3m3)|\{A\in\mathcal{S}_{D,m}\ |\ a,b,c,c^{-1}ab\in A\}|=\binom{2n-3}{m-3}. Finally, for triples in T6T_{6} and T7T_{7} we have |{A𝒮D,m|a,b,c,c1abA}|=(2n2m2)|\{A\in\mathcal{S}_{D,m}\ |\ a,b,c,c^{-1}ab\in A\}|=\binom{2n-2}{m-2}.

Therefore, from Equation (28), we may write:

\begin{align}
\mathbb{E}[X_{A}]\ &=\ \frac{1}{2}\frac{1}{\binom{2n}{m}}\left[\binom{2n-4}{m-4}|T_{1}|+\binom{2n-3}{m-3}\left(|T_{2}|+|T_{3}|+|T_{4}|+|T_{5}|\right)+\binom{2n-2}{m-2}\left(|T_{6}|+|T_{7}|\right)\right] \nonumber\\
&\leq\ \frac{1}{2}\left[\left(\frac{m}{2n}\right)^{4}|T_{1}|+\left(\frac{m}{2n}\right)^{3}\left(|T_{2}|+|T_{3}|+|T_{4}|+|T_{5}|\right)+\left(\frac{m}{2n}\right)^{2}\left(|T_{6}|+|T_{7}|\right)\right], \tag{29}
\end{align}

where in the second line we used the fact that

(2n4m4)(2nm)=m(m1)(m2)(m3)2n(2n1)(2n2)(2n3)(m2n)4,\frac{\binom{2n-4}{m-4}}{\binom{2n}{m}}\ =\ \frac{m(m-1)(m-2)(m-3)}{2n(2n-1)(2n-2)(2n-3)}\ \leq\ \left(\frac{m}{2n}\right)^{4}, (30)

and similarly for the other two terms.
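The identity and inequality in Equation (30), and their analogs for three and two fixed elements, can be checked exactly with rational arithmetic. The sketch below is only an illustration; the helper names are hypothetical, and it assumes $4\leq m\leq 2n$ so that every binomial coefficient is defined.

```python
from fractions import Fraction
from math import comb

def inclusion_ratio(n: int, m: int, r: int) -> Fraction:
    """Probability that r fixed elements all lie in a uniform m-subset of a 2n-element group."""
    return Fraction(comb(2 * n - r, m - r), comb(2 * n, m))

def falling_product(n: int, m: int, r: int) -> Fraction:
    """The product m(m-1)...(m-r+1) / (2n(2n-1)...(2n-r+1)) from Equation (30)."""
    out = Fraction(1)
    for i in range(r):
        out *= Fraction(m - i, 2 * n - i)
    return out

def check_equation_30(n: int, m: int) -> bool:
    """Check the equality and the bound (m/2n)^r for r = 2, 3, 4."""
    for r in (2, 3, 4):
        ratio = inclusion_ratio(n, m, r)
        if ratio != falling_product(n, m, r):
            return False
        if ratio > Fraction(m, 2 * n) ** r:
            return False
    return True

print(check_equation_30(10, 6))
```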

Now, to use Equation (29) to find an upper bound on 𝔼[XA]\mathbb{E}[X_{A}], all that remains to be done is find an upper bound on each of the |Ti||T_{i}|’s. We do so next.

We bound |T1||T_{1}| using the trivial inequality |T1||T||T_{1}|\leq|T|. There are (2n)3(2n)^{3} total triples in D3D^{3}, but we may subtract the redundant triples, including the n3n^{3} triples consisting of three flips. Thus we obtain

|T1| 7n3.|T_{1}|\ \leq\ 7n^{3}. (31)

We bound |T2||T_{2}| and |T3||T_{3}| next. We have 2n2n choices for bb. In T2T_{2}, aa must equal bb, and in T3T_{3}, cc must equal bb, and in both, we have at most 2n2n choices for the remaining value of aa or cc. The extra condition that c1abc^{-1}ab is distinct from the others only lowers |T2||T_{2}| and |T3||T_{3}|, so we do not have to take it into account to obtain an upper bound. Thus,

|T2|,|T3| 4n2.|T_{2}|,\ |T_{3}|\ \leq\ 4n^{2}. (32)

For |T4||T_{4}| and |T5||T_{5}|, we have 2n2n choices for aa and 2n2n choices for cc, but bb must be the element a1caa^{-1}ca for T4T_{4} or a1c2a^{-1}c^{2} for T5T_{5}. Thus we have

|T4|,|T5| 4n2.|T_{4}|,\ |T_{5}|\ \leq\ 4n^{2}. (33)

When considering T6T_{6}, we note that since b=cb=c, the triple (a,b,c)(a,b,c) is redundant if aa and cc are both rotations, or if both are flips. So, one must be a rotation and the other must be a flip, and we require that a=c1aca=c^{-1}ac, or ca=acca=ac. Thus to bound |T6||T_{6}| we must count the number of pairs of elements with (0,g1)(1,g2)=(1,g2)(0,g1)(0,g_{1})\cdot(1,g_{2})=(1,g_{2})\cdot(0,g_{1}), that is, (1,g1g2)=(1,g2g11)(1,g_{1}g_{2})=(1,g_{2}g_{1}^{-1}). This happens if and only if g11=g1g_{1}^{-1}=g_{1}, or g12=1g_{1}^{2}=1. Recalling that there are jj elements in GG with order 22 or less, there are therefore jj choices for the rotation element, and nn choices for the flip element. We multiply by 22 since aa can be either the flip or the rotation, and cc is the other of the two. Thus,

|T6| 2nj.|T_{6}|\ \leq\ 2nj. (34)

Finally, for T7T_{7}, we again split into two cases: firstly where aa is a rotation, and secondly where aa is a flip, so cc is a rotation to avoid redundancy. Here we require c=c1a2c=c^{-1}a^{2}, or c2=a2c^{2}=a^{2}. For the first case, we first consider the number of pairs with a2=c2=1a^{2}=c^{2}=1. Since aa must be a rotation, there are only jj choices for a2=1a^{2}=1; cc can be a flip or a rotation, so there are n+jn+j ways to have c2=1c^{2}=1. So, there are at most (n+j)j(n+j)j pairs of this kind. If aa is a rotation and a21a^{2}\neq 1, then cc must also be a rotation since otherwise c2=1a2c^{2}=1\neq a^{2}. Thus aa and cc commute, so a2=c2(ac1)2=1a^{2}=c^{2}\iff(ac^{-1})^{2}=1. There are njn-j choices of aa with a21a^{2}\neq 1, and for each, there are jj choices of cc which have (ac1)2=1(ac^{-1})^{2}=1. Thus there are at most (nj)j(n-j)j pairs of this kind.

Next we consider the case where aa is a flip and cc is a rotation. Here a2=1a^{2}=1, so there are only jj choices for cc that have c2=a2c^{2}=a^{2}. Thus there are njnj pairs of this kind. In total,

|T7|(n+j)j+(nj)j+nj= 3nj.|T_{7}|\ \leq\ (n+j)j+(n-j)j+nj\ =\ 3nj. (35)

We now substitute these bounds on each |Ti||T_{i}| into Equation (29).

\begin{align}
\mathbb{E}[X_{A}]\ &\leq\ \frac{1}{2}\left[\left(\frac{m}{2n}\right)^{4}\left(7n^{3}\right)+\left(\frac{m}{2n}\right)^{3}\left(4n^{2}+4n^{2}+4n^{2}+4n^{2}\right)+\left(\frac{m}{2n}\right)^{2}\left(2nj+3nj\right)\right] \nonumber\\
&=\ \left(\frac{7}{32}+\frac{1}{m}+\frac{5j}{8m^{2}}\right)\frac{m^{4}}{n}. \tag{36}
\end{align}

This concludes the proof of Lemma 3.4.
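The case analysis in this proof can be checked by brute force for small groups. The sketch below is an illustration only: it assumes $G=\mathbb{Z}_{n}$ (so $D$ is the classical dihedral group), encodes elements as pairs `(z, g)` with the product $(z_{1},g_{1})(z_{2},g_{2})=(z_{1}+z_{2},\,g_{1}+(-1)^{z_{1}}g_{2})$, and uses hypothetical helper names. It sorts every non-redundant triple into one of $T_{1},\dots,T_{7}$ by the equalities among $a,b,c,c^{-1}ab$ and compares the counts with the bounds (31) through (35).

```python
from itertools import product

def mul(x, y, n):
    """Product in Dih(Z_n): (z1, g1)(z2, g2) = (z1 + z2, g1 + (-1)^z1 * g2)."""
    (z1, g1), (z2, g2) = x, y
    return ((z1 + z2) % 2, (g1 - g2 if z1 else g1 + g2) % n)

def inv(x, n):
    """Flips are their own inverses; rotations invert in Z_n."""
    z, g = x
    return x if z else (0, (-g) % n)

def is_redundant(a, b, c):
    """Redundancy test for triples, as in Definition 3.5."""
    all_flips = a[0] == b[0] == c[0] == 1
    equal_rotations = b == c and a[0] == 0 and b[0] == 0
    return a == c or all_flips or equal_rotations

# Pattern of equalities (a == b, b == c, d == a, d == c) -> index i of T_i.
CLASS_OF_PATTERN = {
    (False, False, False, False): 1,
    (True, False, False, False): 2,
    (False, True, False, False): 3,
    (False, False, True, False): 4,
    (False, False, False, True): 5,
    (False, True, True, False): 6,
    (True, False, False, True): 7,
}

def class_counts(n):
    """Return {i: |T_i|} for Dih(Z_n); a KeyError would mean the seven cases miss a triple."""
    D = [(z, g) for z in (0, 1) for g in range(n)]
    counts = {i: 0 for i in range(1, 8)}
    for a, b, c in product(D, repeat=3):
        if is_redundant(a, b, c):
            continue
        d = mul(inv(c, n), mul(a, b, n), n)
        counts[CLASS_OF_PATTERN[(a == b, b == c, d == a, d == c)]] += 1
    return counts

def class_bounds(n):
    """Upper bounds (31)-(35), with j the number of g in Z_n satisfying 2g = 0."""
    j = sum(1 for g in range(n) if (2 * g) % n == 0)
    return {1: 7 * n**3, 2: 4 * n**2, 3: 4 * n**2, 4: 4 * n**2,
            5: 4 * n**2, 6: 2 * n * j, 7: 3 * n * j}

for n in (3, 4, 5, 6):
    print(n, class_counts(n), class_bounds(n))
```

Because the lookup table contains only the seven equality patterns, the enumeration also confirms on these examples that the classes are disjoint and cover $T$.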

3.3. Finitely Generated Abelian Groups

We transition to a discussion of finitely generated abelian groups $G$. When $|G|<\infty$, Theorems 1.8, 3.1, and 3.2 hold, so the remaining case is when $G$ is an infinite group. However, we must impose some restriction to ensure that taking subsets uniformly at random is well defined. By the fundamental theorem of finitely generated abelian groups,

Gr0q1r1qkrk,G\cong\mathbb{Z}^{r_{0}}\oplus\mathbb{Z}_{q_{1}}^{r_{1}}\oplus\cdots\oplus\mathbb{Z}_{q_{k}}^{r_{k}}, (37)

where qiq_{i} are powers of (not necessarily distinct) primes. We denote elements of GG as a tuple (g0,1,g0,2,,g0,r0,g1,1,,gk,rk)G(g_{0,1},g_{0,2},\dots,g_{0,r_{0}},g_{1,1},\dots,g_{k,r_{k}})\in G, where g0,bg_{0,b}\in\mathbb{Z} and ga,bqag_{a,b}\in\mathbb{Z}_{q_{a}} for a>0a>0. Since this section will occasionally require us to deal with multiple groups simultaneously, we update our notation of DD as the generalized dihedral group of GG to the standard Dih(G)\mathrm{Dih}(G). We still denote elements of Dih(G)\mathrm{Dih}(G) as (z,g)(z,g), where z{0,1},gGz\in\{0,1\},g\in G.

For some fixed α\alpha\in\mathbb{N}, we will consider taking subsets uniformly at random from the finite

Dih(Gα)={(z,(g0,1,,g0,r0,,gk,rk))Dih(G)| 0g0,b<α}.\mathrm{Dih}(G_{\alpha})=\{(z,(g_{0,1},\dots,g_{0,r_{0}},\dots,g_{k,r_{k}}))\in\mathrm{Dih}(G)\ |\ 0\leq g_{0,b}<\alpha\}. (38)

Our goal is to leverage Theorems 1.8, 3.1, and 3.2, and, to do so, we will refer to Dih(G)\mathrm{Dih}(G^{\prime}), where

G=αr0q1r1qkrk.G^{\prime}=\mathbb{Z}_{\alpha}^{r_{0}}\oplus\mathbb{Z}_{q_{1}}^{r_{1}}\oplus\cdots\oplus\mathbb{Z}_{q_{k}}^{r_{k}}. (39)

To adhere to prior notation, let jj be the number of elements in GG^{\prime} that are at most order 22 and let 𝒮m\mathcal{S}_{m} denote the collection of subsets of Dih(Gα)\mathrm{Dih}(G_{\alpha}) that have size mm. Then we get the following corollaries of Theorems 1.8, 3.1, and 3.2:

Corollary 3.6.

If $6\leq m\leq c_{j}\sqrt{n}$, then there are more MSTD than MDTS sets in $\mathcal{S}_{m}$.

Corollary 3.7.

If $n\geq n_{j,\epsilon}$ and $6\leq m\leq\left(\sqrt{2/7}-\epsilon\right)\sqrt{n}$, then there are more MSTD than MDTS sets in $\mathcal{S}_{m}$.

Corollary 3.8.

For any ϵ>0\epsilon>0, there exist mϵm_{\epsilon} and cϵ,jc_{\epsilon,j} such that if mϵmcϵ,jnm_{\epsilon}\leq m\leq c_{\epsilon,j}\sqrt{n}, the proportion of MSTD sets in 𝒮m\mathcal{S}_{m} is at least 1ϵ1-\epsilon.

Proof.

We will establish a bijection $\phi:\mathrm{Dih}(G_{\alpha})\rightarrow\mathrm{Dih}(G^{\prime})$ such that, if $(a,b,c,d)\in(\mathrm{Dih}(G_{\alpha}))^{4}$ is a collision, then $(\phi(a),\phi(b),\phi(c),\phi(d))\in(\mathrm{Dih}(G^{\prime}))^{4}$ is also a collision. Since we are still working with generalized dihedral groups, this immediately implies Corollaries 3.6, 3.7, and 3.8, as the number of non-redundant collisions in $\mathrm{Dih}(G_{\alpha})$ is bounded above by the number of non-redundant collisions in $\mathrm{Dih}(G^{\prime})$.

Let $\phi:\mathrm{Dih}(G_{\alpha})\rightarrow\mathrm{Dih}(G^{\prime})$ be defined by $\phi((z,(g_{0,1},\dots,g_{k,r_{k}})))=(z,(g_{0,1},\dots,g_{k,r_{k}}))$. As defined, this is clearly a bijection. Let $(x_{1},x_{2},x_{3},x_{4})\in(\mathrm{Dih}(G_{\alpha}))^{4}$ be a collision. By definition, $x_{1}x_{2}=x_{3}x_{4}$. Let $x_{j}=(z_{j},(g_{0,1,j},\dots,g_{k,r_{k},j}))$. Define the binary operation $\star_{t}:\mathbb{Z}_{t}\times\mathbb{Z}_{t}\rightarrow\mathbb{Z}_{t}$ by

\[
g_{1}\star_{t}g_{2}\ =\ \begin{cases}g_{1}+g_{2}\bmod t, & z_{1}=0,\\ g_{1}-g_{2}\bmod t, & z_{1}=1.\end{cases} \tag{40}
\]

Then we get

x1x2=(z1+z2 mod 2,(g0,1,1ng0,1,2,,gk,rk,1qkgk,rk,2)).x_{1}x_{2}=(z_{1}+z_{2}\text{ mod }2,(g_{0,1,1}\star_{n}g_{0,1,2},\dots,g_{k,r_{k},1}\star_{q_{k}}g_{k,r_{k},2})). (41)

In other words, we are given the following system of equations:

z1+z2z3+z4mod2,\displaystyle z_{1}+z_{2}\equiv z_{3}+z_{4}\mod 2,
g0,1,1+g0,1,2=g0,1,3+g0,1,4,\displaystyle g_{0,1,1}+g_{0,1,2}=g_{0,1,3}+g_{0,1,4},
\displaystyle\vdots
gk,rk,1qkgk,rk,2=gk,rk,3qkgk,rk,4.\displaystyle g_{k,r_{k},1}\star_{q_{k}}g_{k,r_{k},2}=g_{k,r_{k},3}\star_{q_{k}}g_{k,r_{k},4}.

Consider q=(ϕ(x1),ϕ(x2),ϕ(x3),ϕ(x4))q=(\phi(x_{1}),\phi(x_{2}),\phi(x_{3}),\phi(x_{4})). For qq to be a collision in Dih(G)\mathrm{Dih}(G^{\prime}), we require the following system of equations:

z1+z2z3+z4mod2,\displaystyle z_{1}+z_{2}\equiv z_{3}+z_{4}\mod 2,
g0,1,1αg0,1,2=g0,1,3αg0,1,4,\displaystyle g_{0,1,1}\star_{\alpha}g_{0,1,2}=g_{0,1,3}\star_{\alpha}g_{0,1,4},
\displaystyle\vdots
gk,rk,1qkgk,rk,2=gk,rk,3qkgk,rk,4,\displaystyle g_{k,r_{k},1}\star_{q_{k}}g_{k,r_{k},2}=g_{k,r_{k},3}\star_{q_{k}}g_{k,r_{k},4},

which is implied by our given system. Therefore the result follows. ∎
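The direction of this implication can be illustrated on the smallest infinite case. The sketch below assumes $G=\mathbb{Z}$ (that is, $r_{0}=1$ and no torsion part), takes the box $\{0,\dots,\alpha-1\}$, and counts every quadruple with $ab=cd$, redundant or not. All function names are hypothetical.

```python
from itertools import product

def mul_int(x, y):
    """Product in Dih(Z): the flip bit of the left factor negates the right factor."""
    (z1, g1), (z2, g2) = x, y
    return ((z1 + z2) % 2, g1 - g2 if z1 else g1 + g2)

def mul_mod(x, y, alpha):
    """Product of the images under phi, computed in Dih(Z_alpha)."""
    z, g = mul_int(x, y)
    return (z, g % alpha)

def collision_counts(alpha):
    """Return (#collisions over Z, #collisions of the images, whether the first implies the second)."""
    box = [(z, g) for z in (0, 1) for g in range(alpha)]
    over_z = over_mod = 0
    implied = True
    for a, b, c, d in product(box, repeat=4):
        hit_z = mul_int(a, b) == mul_int(c, d)
        hit_mod = mul_mod(a, b, alpha) == mul_mod(c, d, alpha)
        over_z += hit_z
        over_mod += hit_mod
        implied = implied and (hit_mod or not hit_z)
    return over_z, over_mod, implied

for alpha in range(1, 6):
    print(alpha, collision_counts(alpha))
```

The converse fails in general: for $\alpha=3$ the quadruple $((0,2),(0,2),(0,1),(0,0))$ is a collision modulo $3$ but not over $\mathbb{Z}$, which is why the argument gives only an upper bound.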

We present an example concerning the number of collisions in another generalized dihedral group, $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$. We show that $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ possesses exactly the same number of possible collisions as the dihedral group $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n^{2}}$ if and only if $n$ is odd, and that there are more collisions in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ otherwise. This result only provides some intuition on how the expected number of collisions depends on $j$, the number of elements of order at most $2$ within the particular abelian group.

Lemma 3.9.

The number of possible collisions in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ equals the number in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n^{2}}$ if and only if $n$ is odd, and there are more collisions in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ otherwise.

To prove this, we make use of the following results:

Lemma 3.10.

The number of pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ with $a_{1}-b_{1}\equiv x\pmod{n}$ and $a_{2}-b_{2}\equiv y\pmod{n}$, and the number of pairs $a_{1}n+a_{2},\,b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ with $(a_{1}n+a_{2})-(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$, are both equal to $n^{2}$.

Proof.

For any given $(b_{1},b_{2})$, there exists exactly one $(a_{1},a_{2})$ satisfying the first set of conditions, and for any given $b_{1}n+b_{2}$, there exists exactly one $a_{1}n+a_{2}$ satisfying the second. Thus, there are $n^{2}$ such pairs in both cases. ∎

Lemma 3.11.

The number of pairs (a1,a2),(b1,b2)n2(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2} where a1+b1xmodna_{1}+b_{1}\equiv x\mod n and a2+b2ymodna_{2}+b_{2}\equiv y\mod n is as follows:

  • when nn is odd, there are n2+12\frac{n^{2}+1}{2} such pairs

  • when nn is even and yy is odd or xx is odd, there are n22\frac{n^{2}}{2} such pairs

  • when nn, yy, and xx are all even, there are n2+42\frac{n^{2}+4}{2} such pairs

Proof.

When nn is odd, each choice of a1a_{1} is paired with a single possible b1b_{1}, with one such pair being a1=b1a_{1}=b_{1}. Similarly, the same holds for a2a_{2} and b2b_{2}. This gives n2n^{2} as the number of pairs. However, we over-counted since swapping a1a_{1} with b1b_{1} and a2a_{2} with b2b_{2} does not yield a distinct pair. Thus, we divide by 22 except for the single pair where a1=b1a_{1}=b_{1} and a2=b2a_{2}=b_{2} which we did not over-count to get n212+1=n2+12\frac{n^{2}-1}{2}+1=\frac{n^{2}+1}{2} pairs of (a1,a2),(b1,b2)(a_{1},a_{2}),(b_{1},b_{2}).

When nn is even but yy is odd, we choose pairs (a2,b2)(a_{2},b_{2}) first which gives n/2n/2 distinct pairs where a2a_{2} is never equal to b2b_{2}. For each of these pairs, we can choose any choice of a1a_{1} which forces b1b_{1} without over-counting. Thus, we get n22\frac{n^{2}}{2} pairs. Similar arguments hold for when xx is odd.

When all of $n$, $y$, and $x$ are even, there are two pairs of identical numbers and $\frac{n-2}{2}$ pairs of different numbers in $\mathbb{Z}_{n}$ which sum to $x$, and similarly for $y$. We choose $a_{2},b_{2}$ first. If $a_{2}=b_{2}$, we have $\frac{n-2}{2}+2=\frac{n+2}{2}$ choices for $a_{1},b_{1}$. If $a_{2}\neq b_{2}$, then any $a_{1}$ uniquely determines $b_{1}$ without over-counting, and thus there are $n$ choices. In total, we have $2\cdot\frac{n+2}{2}+n\cdot\frac{n-2}{2}=\frac{n^{2}+4}{2}$ pairs of $(a_{1},a_{2}),(b_{1},b_{2})$. ∎
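The three cases of Lemma 3.11 can be confirmed by direct enumeration for small $n$. This is only a sanity check, under the reading that pairs are unordered and that a pair may consist of one element taken twice; the function names are hypothetical.

```python
from itertools import product

def count_sum_pairs_grid(n, x, y):
    """Unordered pairs {a, b} in Z_n^2 (a = b allowed) with a + b = (x, y)."""
    pairs = set()
    for a in product(range(n), repeat=2):
        b = ((x - a[0]) % n, (y - a[1]) % n)  # the unique partner of a
        pairs.add(frozenset((a, b)))          # {a, b} and {b, a} coincide
    return len(pairs)

def lemma_3_11(n, x, y):
    """The count stated in Lemma 3.11."""
    if n % 2 == 1:
        return (n * n + 1) // 2
    if x % 2 == 1 or y % 2 == 1:
        return n * n // 2
    return (n * n + 4) // 2

for n in (3, 4):
    print(n, [(x, y, count_sum_pairs_grid(n, x, y)) for x in range(2) for y in range(2)])
```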

Lemma 3.12.

The number of pairs a1n+a2,b1n+b2n2a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}} where (a1n+a2)+(b1n+b2)xn+ymodn2(a_{1}n+a_{2})+(b_{1}n+b_{2})\equiv xn+y\mod n^{2} is as follows:

  • when nn is odd, there are n2+12\frac{n^{2}+1}{2} such pairs

  • when nn is even but yy is odd, there are n22\frac{n^{2}}{2} such pairs

  • when both nn and yy are even, there are n2+22\frac{n^{2}+2}{2} such pairs

Proof.

When nn is odd, we have that each of a1n+a2a_{1}n+a_{2} is paired with another b1n+b2b_{1}n+b_{2}, with exactly one being paired with itself. Thus, there are n212+1=n2+12\frac{n^{2}-1}{2}+1=\frac{n^{2}+1}{2} pairs of a1n+a2a_{1}n+a_{2} and b1n+b2b_{1}n+b_{2}.

When nn is even but yy is odd, we have that each of a1n+a2a_{1}n+a_{2} is paired with another b1n+b2b_{1}n+b_{2}, necessarily distinct. Thus, there are n22\frac{n^{2}}{2} pairs of a1n+a2a_{1}n+a_{2} and b1n+b2b_{1}n+b_{2}.

When nn is even and yy is even, we have that each of a1n+a2a_{1}n+a_{2} is paired with another b1n+b2b_{1}n+b_{2}, with exactly two being paired with themselves. Thus, there are n222+2=n2+22\frac{n^{2}-2}{2}+2=\frac{n^{2}+2}{2} pairs of a1n+a2a_{1}n+a_{2} and b1n+b2b_{1}n+b_{2}. ∎
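Lemma 3.12 admits the same kind of enumeration check in the cyclic group $\mathbb{Z}_{n^{2}}$. As before, this sketch treats pairs as unordered with repetition allowed, and the function names are hypothetical.

```python
def count_sum_pairs_cyclic(n, x, y):
    """Unordered pairs {a, b} in Z_{n^2} (a = b allowed) with a + b = x*n + y."""
    modulus = n * n
    target = (x * n + y) % modulus
    return len({frozenset((a, (target - a) % modulus)) for a in range(modulus)})

def lemma_3_12(n, y):
    """The count stated in Lemma 3.12; it does not depend on x."""
    if n % 2 == 1:
        return (n * n + 1) // 2
    if y % 2 == 1:
        return n * n // 2
    return (n * n + 2) // 2

for n in (3, 4):
    print(n, [(y, count_sum_pairs_cyclic(n, 0, y)) for y in range(n)])
```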

Combining these results over the casework where elements of our pairs may be a rotation or a flip yield the desired result, with the details of the proof in Appendix B.

4. Expected Size of Sum and Difference Sets

In this section, we only consider the classical dihedral groups

\[
D_{2n}\ =\ \mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}\ =\ \langle r,s\mid r^{n},s^{2},rsrs\rangle. \tag{42}
\]

Throughout this section, we use 𝒮2n,m\mathcal{S}_{2n,m} to denote the set of subsets of size mm in D2nD_{2n}.

The method of collision analysis will likely not be sufficient to prove that $\mathcal{S}_{2n,m}$ has more MSTD sets than MDTS sets for values of $m$ greater in order of magnitude than $\sqrt{n}$. The intuition for this comes from the fact that the sum and difference sets for $A\subseteq D_{2n}$ should very roughly have size on the same order of magnitude as $|A|^{2}$. Hence, one would expect to usually have $A+A=A-A=D_{2n}$ when $m$ is much greater than $\sqrt{n}$. The analysis for relative numbers of MSTD and MDTS sets in $\mathcal{S}_{2n,m}$ for these larger values of $m$ should therefore be based on counting the number of missed sums and differences in $D_{2n}$, in direct analogy with the case of slow decay for the integers in [HM09].

We take the first steps toward such an analysis by proving the following special case.

See Theorem 1.9.

This follows from the following straightforward yet useful lemma, which reduces the problem of computing the probability of missing a sum or difference to an analogous problem in n\mathbb{Z}_{n}.

Lemma 4.1.

Let $A$ be a subset in $\mathcal{S}_{2n,m}$ chosen uniformly at random. Then if $r^{i}$ is a rotation in $D_{2n}$, we have

\[
\mathbb{P}[r^{i}\notin A+A]\ =\ \sum_{k=0}^{m}\ \mathbb{P}[A\text{ has }k\text{ flips}]\cdot\mathbb{P}\big[i\notin S+S\ \big|\ S\subseteq\mathbb{Z}_{n},\ |S|=m-k\big]\cdot\mathbb{P}\big[i\notin S-S\ \big|\ S\subseteq\mathbb{Z}_{n},\ |S|=k\big], \tag{43}
\]

and

\[
\mathbb{P}[r^{i}\notin A-A]\ =\ \sum_{k=0}^{m}\ \mathbb{P}[A\text{ has }k\text{ flips}]\cdot\mathbb{P}\big[i\notin S-S\ \big|\ S\subseteq\mathbb{Z}_{n},\ |S|=m-k\big]\cdot\mathbb{P}\big[i\notin S-S\ \big|\ S\subseteq\mathbb{Z}_{n},\ |S|=k\big]. \tag{44}
\]

If $r^{i}s$ is a flip in $D_{2n}$, we have

\[
\mathbb{P}[r^{i}s\notin A+A]\ =\ \sum_{k=0}^{m}\ \mathbb{P}[A\text{ has }k\text{ flips}]\cdot\mathbb{P}\big[i\notin S_{1}+S_{2}\ \land\ i\notin S_{2}-S_{1}\ \big|\ S_{1},S_{2}\subseteq\mathbb{Z}_{n},\ |S_{1}|=m-k,\ |S_{2}|=k\big], \tag{45}
\]

and

\[
\mathbb{P}[r^{i}s\notin A-A]\ =\ \sum_{k=0}^{m}\ \mathbb{P}[A\text{ has }k\text{ flips}]\cdot\mathbb{P}\big[i\notin S_{1}+S_{2}\ \big|\ S_{1},S_{2}\subseteq\mathbb{Z}_{n},\ |S_{1}|=m-k,\ |S_{2}|=k\big]. \tag{46}
\]
Proof.

Partition $A$ into its set of rotations $R$ and flips $F$, and suppose that $|F|=k$ and $|R|=m-k$. The rotation element $r^{i}$ can appear in $A+A$ precisely as $r^{j}r^{\ell}=r^{j+\ell}$ for $r^{j},r^{\ell}\in R$ or as $(r^{j}s)(r^{\ell}s)=r^{j-\ell}$ for $r^{j}s,r^{\ell}s\in F$. Taking the probabilities of the negations of these events gives, respectively, the second and third probabilities appearing in the sum of Equation (43). Equation (44) follows similarly.

The flip element $r^{i}s$ can appear in $A+A$ precisely as $r^{j}(r^{\ell}s)=r^{j+\ell}s$ or as $(r^{\ell}s)r^{j}=r^{\ell-j}s$ for $r^{j}\in R$, $r^{\ell}s\in F$, but unlike in the previous cases, these events are no longer independent, so Equation (45) cannot be broken up into a product of simpler probabilities. Finally, $r^{i}s$ can appear in $A-A$ precisely as $r^{j}(r^{\ell}s)^{-1}=r^{j+\ell}s=(r^{\ell}s)(r^{j})^{-1}$, from which Equation (46) follows. ∎
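The product identities used in this proof are easy to confirm mechanically. The sketch below assumes the encoding of $r^{g}s^{z}$ as the pair `(z, g)`, which is a choice made for this illustration, and all helper names are hypothetical.

```python
def mul(x, y, n):
    """Product in D_2n with (z, g) standing for r^g s^z, using s r = r^{-1} s."""
    (z1, g1), (z2, g2) = x, y
    return ((z1 + z2) % 2, (g1 - g2 if z1 else g1 + g2) % n)

def inv(x, n):
    """Flips are involutions; the inverse of r^g is r^{-g}."""
    z, g = x
    return x if z else (0, (-g) % n)

def rot(j, n):
    return (0, j % n)

def flip(j, n):
    return (1, j % n)

def identities_hold(n):
    """Check every product formula quoted in the proof of Lemma 4.1."""
    for j in range(n):
        for l in range(n):
            ok = (
                mul(rot(j, n), rot(l, n), n) == rot(j + l, n)
                and mul(flip(j, n), flip(l, n), n) == rot(j - l, n)
                and mul(rot(j, n), flip(l, n), n) == flip(j + l, n)
                and mul(flip(l, n), rot(j, n), n) == flip(l - j, n)
                and mul(rot(j, n), inv(flip(l, n), n), n) == flip(j + l, n)
                and mul(flip(l, n), inv(rot(j, n), n), n) == flip(j + l, n)
            )
            if not ok:
                return False
    return True

print(all(identities_hold(n) for n in range(1, 10)))
```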

A number of these probabilities can be expressed explicitly in terms of nn, mm, and kk. To prove Theorem 1.9, we need to compute all probabilities appearing in Equation (44) and Equation (46), but we will also compute the probabilities appearing in Equation (43) for completeness.

Lemma 4.2.

Let SS be a subset of n\mathbb{Z}_{n} of size mkm-k chosen uniformly at random, and let ii be any element of n\mathbb{Z}_{n}. Then

\[
\mathbb{P}[i\notin S+S]\ =\ \begin{cases}\dfrac{2^{m-k}\binom{\frac{n}{2}-1}{m-k}}{\binom{n}{m-k}}, & \text{$n$ and $i$ both even,}\\[2ex] \dfrac{2^{m-k}\binom{\frac{n}{2}}{m-k}}{\binom{n}{m-k}}, & \text{$n$ even and $i$ odd,}\\[2ex] \dfrac{2^{m-k}\binom{\frac{n-1}{2}}{m-k}}{\binom{n}{m-k}}, & \text{$n$ odd.}\end{cases} \tag{47}
\]
Proof.

If nn and ii are both even, then there exist exactly 22 elements of n\mathbb{Z}_{n} that give ii when added to themselves. The remaining elements of n\mathbb{Z}_{n} partition into pairs of distinct elements adding to ii, and any SS such that iS+Si\notin S+S is obtained by selecting mkm-k of these n21\frac{n}{2}-1 pairs and one element from each pair, hence the result. The case where nn is even and ii is odd is identical except that all nn of the elements of n\mathbb{Z}_{n} are now partitioned into pairs as there are no elements that give ii when added to themselves. Finally, if nn is odd, then there is always a unique element that gives ii when added to itself, and the remaining elements partition into n12\frac{n-1}{2} pairs, after which the same analysis can be applied. ∎
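The pairing argument can be compared against exhaustive enumeration. In the sketch below, `t` plays the role of $m-k$, the sumset $S+S$ includes $s+s$ as in the proof, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def miss_sum_brute(n, t, i):
    """P[i not in S + S] over all t-subsets S of Z_n, by enumeration."""
    total = missed = 0
    for S in combinations(range(n), t):
        total += 1
        sums = {(a + b) % n for a in S for b in S}
        if (i % n) not in sums:
            missed += 1
    return Fraction(missed, total)

def miss_sum_formula(n, t, i):
    """2^t * C(p, t) / C(n, t), where p counts pairs of distinct elements summing to i."""
    if n % 2 == 1:
        p = (n - 1) // 2
    elif i % 2 == 0:
        p = n // 2 - 1
    else:
        p = n // 2
    return Fraction(2**t * comb(p, t), comb(n, t))

print(miss_sum_brute(6, 2, 0), miss_sum_formula(6, 2, 0))
```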

Lemma 4.3.

Let SS be a subset of n\mathbb{Z}_{n} of size kk chosen uniformly at random, and let ii be any nonzero element of n\mathbb{Z}_{n} of order n/dn/d. Then

\[
\mathbb{P}[i\notin S-S]\ =\ \frac{1}{\binom{n}{k}}\sum_{\substack{(k_{1},\ldots,k_{d})\in\mathbb{Z}_{\geq 0}^{d}\\ k_{1}+\cdots+k_{d}=k}}\ \prod_{t=1}^{d}g\left(\frac{n}{d},k_{t}\right), \tag{48}
\]

where

\[
g\left(\frac{n}{d},k_{t}\right)\ =\ \begin{cases}1, & k_{t}=0,\\[1ex] \dfrac{\frac{n}{d}\binom{\frac{n}{d}-1-k_{t}}{k_{t}-1}}{k_{t}}, & k_{t}>0.\end{cases} \tag{49}
\]
Remark 4.

When nn is prime, we must have d=1d=1 always and Equation (48) simplifies significantly to

[iSS]=g(n,k)(nk)=1(nk){1,k=0,n(n1kk1)k,k>0.\mathbb{P}[i\notin S-S]\ =\ \frac{g(n,k)}{{\binom{n}{k}}}\ =\ \frac{1}{{\binom{n}{k}}}\cdot\begin{cases}1,\ k=0,\\ \frac{n{\binom{n-1-k}{k-1}}}{k},\ k>0.\end{cases} (50)

This is the reason why we restrict to such nn in Theorem 1.9.

Proof.

Partition n\mathbb{Z}_{n} into the dd additive cosets of the subgroup ini\mathbb{Z}_{n}. Each of these cosets has size n/dn/d and has elements that can be cyclically ordered such that the difference between any element and its predecessor is ii. Choosing a set SS of size kk such that iSSi\notin S-S is then equivalent to partitioning kk into k1++kdk_{1}+\cdots+k_{d} with each kt0k_{t}\geq 0, and choosing ktk_{t} non-adjacent elements from the ttht^{\text{th}} coset to include in SS. This is precisely what is counted by Equation (48) if g(n/d,kt)g(n/d,k_{t}) is equal to the number of ways to choose ktk_{t} non-adjacent elements from a cyclically ordered set of size n/dn/d.

Indeed, this is clear for kt=0k_{t}=0, so assume kt>0k_{t}>0. There are n/dn/d choices for the first element to be included, and each selection of ktk_{t} elements may be made in ktk_{t} different ways by designating different elements to be this first choice. Once the first element has been selected, the number of ways to choose the remaining elements is equal to the number of ways to choose kt1k_{t}-1 non-adjacent elements from a linearly ordered set of size n/d1n/d-1 without choosing the extremal elements. This is the classic stars and bars problem, which yields precisely the binomial coefficient in the definition of g(n/d,kt)g(n/d,k_{t}). ∎
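Equation (48) can be evaluated without listing the compositions explicitly by convolving over the $d$ cosets one at a time, and the result can be compared with enumeration. In this sketch $d$ is computed as $\gcd(i,n)$, binomial coefficients with a negative upper entry are taken to be $0$, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, gcd

def nonadjacent_on_cycle(length, k):
    """g(length, k): ways to choose k pairwise non-adjacent points on a cycle."""
    if k == 0:
        return 1
    top = length - 1 - k
    if top < 0:
        return 0
    return length * comb(top, k - 1) // k

def miss_diff_formula(n, k, i):
    """Right-hand side of Equation (48) for a nonzero i in Z_n."""
    d = gcd(i, n)  # number of cosets of the subgroup generated by i
    # ways[s]: number of valid selections of s elements from the cosets handled so far
    ways = [1] + [0] * k
    for _ in range(d):
        ways = [
            sum(ways[s - kt] * nonadjacent_on_cycle(n // d, kt) for kt in range(s + 1))
            for s in range(k + 1)
        ]
    return Fraction(ways[k], comb(n, k))

def miss_diff_brute(n, k, i):
    """P[i not in S - S] over all k-subsets S of Z_n, by enumeration."""
    total = missed = 0
    for S in combinations(range(n), k):
        total += 1
        diffs = {(a - b) % n for a in S for b in S}
        if (i % n) not in diffs:
            missed += 1
    return Fraction(missed, total)

print(miss_diff_formula(6, 2, 3), miss_diff_brute(6, 2, 3))
```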

Lemma 4.4.

Let $S_{1}$ and $S_{2}$ be subsets of $\mathbb{Z}_{n}$ of sizes $m-k$ and $k$, respectively, chosen uniformly at random, and let $i$ be any element of $\mathbb{Z}_{n}$. Then

[iS1+S2]=(nkmk)(nmk).\mathbb{P}[i\notin S_{1}+S_{2}]\ =\ \frac{{\binom{n-k}{m-k}}}{{\binom{n}{m-k}}}. (51)
Proof.

For any fixed choice of S2S_{2}, we have (nmk)\binom{n}{m-k} total choices for S1S_{1}. For each of these S1S_{1}, we have iS1+S2i\notin S_{1}+S_{2} if and only if ijS1i-j\notin S_{1} for all jS2j\in S_{2}. Because |S2|=k|S_{2}|=k, this leaves (nkmk)\binom{n-k}{m-k} choices for S1S_{1} such that iS1+S2i\notin S_{1}+S_{2}. ∎
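A brute-force comparison for small $n$ illustrates this count. In the sketch, `a` and `b` stand for $m-k$ and $k$, the two subsets are enumerated independently, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def miss_cross_brute(n, a, b, i):
    """P[i not in S1 + S2] over all pairs of subsets of Z_n with |S1| = a and |S2| = b."""
    total = missed = 0
    for S1 in combinations(range(n), a):
        for S2 in combinations(range(n), b):
            total += 1
            if all((s + t) % n != i % n for s in S1 for t in S2):
                missed += 1
    return Fraction(missed, total)

def miss_cross_formula(n, a, b):
    """C(n - b, a) / C(n, a), which is Equation (51) with a = m - k and b = k."""
    return Fraction(comb(n - b, a), comb(n, a))

print(miss_cross_brute(5, 2, 2, 0), miss_cross_formula(5, 2, 2))
```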

We now have all the necessary parts to prove Theorem 1.9. Simple combinatorics yields

[A has k flips]=(nk)(nmk)(2nm).\mathbb{P}[\text{$A$ has $k$ flips}]\ =\ \frac{{\binom{n}{k}}{\binom{n}{m-k}}}{{\binom{2n}{m}}}. (52)

Combining this with Lemmas 4.1, 4.3, and 4.4 yields the result after some simplification. See Appendix C for complete details.
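The way these pieces fit together can be sketched numerically. The code below does not reproduce the closed form of Theorem 1.9 (which is derived in Appendix C); it simply sums $1-\mathbb{P}[x\notin A-A]$ over $x\in D_{2n}$ using Equations (44), (46), (50), (51), and (52), and compares the result with exhaustive enumeration. It assumes $n$ is prime and $m\geq 1$, counts the identity as always present in $A-A$, and uses hypothetical function names.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def nonadjacent_on_cycle(n, t):
    """g(n, t) from Equation (49), with out-of-range binomials taken as 0."""
    if t == 0:
        return 1
    top = n - 1 - t
    return 0 if top < 0 else n * comb(top, t - 1) // t

def no_diff_prob(n, t):
    """P[i not in S - S] for |S| = t and nonzero i, n prime (Equation (50))."""
    return Fraction(nonadjacent_on_cycle(n, t), comb(n, t))

def expected_difference_set_size(n, m):
    """E|A - A| for a uniform m-subset A of D_2n, n prime, m >= 1."""
    miss_rot = miss_flip = Fraction(0)
    for k in range(max(0, m - n), min(m, n) + 1):
        p_k = Fraction(comb(n, k) * comb(n, m - k), comb(2 * n, m))  # Equation (52)
        miss_rot += p_k * no_diff_prob(n, m - k) * no_diff_prob(n, k)  # Equation (44)
        miss_flip += p_k * Fraction(comb(n - k, m - k), comb(n, m - k))  # Equations (46), (51)
    return 1 + (n - 1) * (1 - miss_rot) + n * (1 - miss_flip)

def diff(x, y, n):
    """x * y^{-1} in D_2n, with (z, g) standing for r^g s^z."""
    (z1, g1), (z2, g2) = x, y
    y_inv = y if z2 else (0, (-g2) % n)
    sign = -1 if z1 else 1
    return ((z1 + y_inv[0]) % 2, (g1 + sign * y_inv[1]) % n)

def expected_difference_set_size_brute(n, m):
    """Average of |A - A| over all m-subsets of D_2n."""
    D = [(z, g) for z in (0, 1) for g in range(n)]
    total = count = 0
    for A in combinations(D, m):
        total += len({diff(x, y, n) for x in A for y in A})
        count += 1
    return Fraction(total, count)

print(expected_difference_set_size(5, 3), expected_difference_set_size_brute(5, 3))
```

Exact rational arithmetic is used so that the two computations can be compared for equality rather than up to rounding.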

By plotting $\mathbb{E}[|A-A|]$ against $m$ for certain large values of $n$, we can see evidence for our intuition from the beginning of this section that $\mathbb{E}[|A-A|]$ quickly becomes close to $2n$ for $m$ on the order of magnitude of $\sqrt{n}$ (Figure 1). Numerical evidence from many large primes $n$ suggests that the value of $m$ such that $\mathbb{E}[|A-A|]=n$ is about $1.3875\sqrt{n}$. (Footnote 2: Some code written for this project can be found at https://github.com/ZeusDM/MSTD-experiments.)

Figure 1. A plot of 𝔼[|AA|]\mathbb{E}[|A-A|] versus mm for A𝒮2n,mA\in\mathcal{S}_{2n,m} with n=10007n=10007. Note that 𝔼[|AA|]\mathbb{E}[|A-A|] rapidly approaches 2n=200142n=20014 as mm approaches a small multiple of n100\sqrt{n}\approx 100.

5. Future Work

An immediate future direction of research is to extend the bounds on mm to show Conjecture 1.7 for all mm and nn. However, it is unlikely that the methods we have used in this paper will be useful for larger values of mm. This is because our approach showed that for values of mm that we considered, the majority of subsets of the generalized dihedral group with size mm were MSTD. But this is simply not the case for larger mm: numerical evidence shows that for any mnm\gg\sqrt{n}, the vast majority of the sets are balanced. A new approach is required to show that out of the sets that are not balanced, more are MSTD.

To this end, it would also be productive to more carefully analyze how many elements of the group DD are not in A+AA+A (or not in AAA-A) for |A||A| closer to nn. This follows the approach of [HM09] for the “slow decay” case. Such an analysis could be used to find explicit formulas for the expectation values of |A+A||A+A| and |AA||A-A|. In this paper we found a formula for 𝔼[|AA|]\mathbb{E}[|A-A|], but only when nn is prime. Further, to use these results, we would also need to bound the variance of |A+A||A+A| and |AA||A-A|. Depending on the results, this could be enough to prove Conjecture 1.6 if we can deduce an upper bound MM_{\ell} on the number of MDTS sets and a lower bound MsM_{s} on the number of MSTD sets such that M<MsM_{\ell}<M_{s}.

Yet another possible approach to prove Conjecture 1.6 is to construct an injective map from MDTS sets to MSTD sets in the group. Such an approach has proven to be difficult, but has the potential advantage of working for both large and small values of mm.

Appendix A Proof of Lemmas 2.1 and 2.2

See Lemma 2.1.

Proof.

There are only 3 possible cases to consider for AA.

  • AA contains two flip elements: in this case, adding and subtracting flips is an identical operation as each flip element has order 2. Thus, AA will necessarily be sum-difference balanced.

  • $A$ contains one flip and one rotation element: let $r^{i},r^{j}s\in A$. Here, $A+A=\{1,r^{2i},r^{i+j}s,r^{j-i}s\}$. However, $A-A=\{1,r^{i+j}s\}$. When $r^{2i}\neq 1$ or $r^{j-i}s\neq r^{i+j}s$, $A$ will be MSTD. In the case where both of these are actually equalities ($A+A=\{1,r^{i+j}s\}$), $A$ will simply be balanced.

  • $A$ contains two rotation elements: let $r^{i},r^{j}\in A$. Here, $A+A=\{r^{2i},r^{i+j},r^{2j}\}$. However, $A-A=\{1,r^{i-j},r^{j-i}\}$. Note that $i\neq j$. Both the sum set and the difference set have 3 elements except in one special case. Suppose that $r^{2i}=r^{2j}$. Then we have $r^{i}r^{-j}=r^{j}r^{-i}$, that is, $r^{i-j}=r^{j-i}$, so both sets have 2 elements. Thus, when $A$ contains two rotation elements only, $A$ is always balanced.

Therefore, there are strictly more MSTD sets than MDTS sets in $\mathcal{S}_{2n,2}$. ∎
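The case analysis above can be checked by brute force for small $n$. The following Python sketch is an illustration and not part of the original proof: it encodes $r^{i}s^{f}$ as the pair `(i, f)`, and the helper names `mul`, `inv`, and `classify_pairs` are our own. It takes $A+A=\{ab\}$ and $A-A=\{ab^{-1}\}$, as in the computations above.

```python
from itertools import combinations


def mul(a, b, n):
    """Product in D_2n of r^i s^f and r^j s^g, using s r^j = r^(-j) s."""
    (i, f), (j, g) = a, b
    return ((i + (j if f == 0 else -j)) % n, f ^ g)


def inv(a, n):
    """Inverse in D_2n: r^i inverts to r^(-i); every flip is its own inverse."""
    i, f = a
    return ((-i) % n, 0) if f == 0 else (i, 1)


def classify_pairs(n):
    """Count MSTD, MDTS and balanced 2-element subsets of D_2n by enumeration."""
    group = [(i, f) for f in (0, 1) for i in range(n)]
    counts = {"MSTD": 0, "MDTS": 0, "balanced": 0}
    for A in combinations(group, 2):
        sums = {mul(a, b, n) for a in A for b in A}
        diffs = {mul(a, inv(b, n), n) for a in A for b in A}
        if len(sums) > len(diffs):
            counts["MSTD"] += 1
        elif len(sums) < len(diffs):
            counts["MDTS"] += 1
        else:
            counts["balanced"] += 1
    return counts


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, classify_pairs(n))
```

In each run the MDTS count is zero, and the MSTD sets are exactly the sets with one flip and one rotation $r^{i}$ satisfying $r^{2i}\neq 1$, in line with the second case of the proof.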

See Lemma 2.2.

Proof.

There are 4 possible cases to consider for $A$.

  • $A$ contains three flip elements: just as in the $m=2$ case, since addition and subtraction of two flips are equivalent, $A$ is balanced.

  • $A$ contains two flip elements and one rotation element: let $A=\{r^{i},r^{j}s,r^{k}s\}$ for $j\neq k$. Then the sumset is

    \[ A+A=\{r^{2i},r^{i+j}s,r^{i+k}s,r^{j-k},r^{k-j},1,r^{j-i}s,r^{k-i}s\}, \tag{53} \]

    while the difference set is

    \[ A-A=\{r^{i+j}s,r^{i+k}s,r^{j-k},r^{k-j},1\}, \tag{54} \]

    so $A-A\subseteq A+A$. Moreover, $A$ is MSTD precisely when at least one of $r^{j-i}s$ and $r^{k-i}s$ is distinct from each of the flips in $A+A$ that lie in $A-A$. We enumerate all the ways that this could fail to occur:

    We have $r^{i+j}s=r^{j-i}s$ and $r^{i+k}s=r^{k-i}s$ if and only if $i\equiv -i\pmod{n}$. This happens if and only if $i=0$, or $n$ is even and $i=n/2$, and in each of these cases we have $\binom{n}{2}=\frac{1}{2}(n^{2}-n)$ choices for $j$ and $k$. On the other hand, we have $r^{i+j}s=r^{k-i}s$ and $r^{i+k}s=r^{j-i}s$ if and only if $2i\equiv k-j\equiv j-k\pmod{n}$, which implies that $2j\equiv 2k\pmod{n}$. Because $j\neq k$, this forces $n$ to be even and $k-j\equiv n/2\pmod{n}$, so we also require $2i\equiv n/2\pmod{n}$; these equations can hold if and only if $n$ is divisible by $4$, $i=n/4$ or $3n/4$, and $k-j\equiv n/2\pmod{n}$. We therefore have $2$ choices for $i$ and, for each $i$, $\frac{1}{2}n$ choices for the unordered pair $\{j,k\}$. In both families $r^{2i}$ equals $1$ or $r^{k-j}$, so it already lies in $A-A$ and these sets are balanced. Finally, note that we cannot have $r^{j-i}s=r^{k-i}s$ because $j\neq k$, so this completes the enumeration of all such sets $A$ that fail to be MSTD.

    There are $n\binom{n}{2}=\frac{1}{2}(n^{3}-n^{2})$ total sets $A$ containing exactly $2$ flips. Subtracting the balanced sets that we just enumerated (for odd $n$ only $i=0$ occurs in the first family, which gives $\binom{n}{2}=\frac{1}{2}n(n-1)$ balanced sets), we obtain the following numbers of MSTD sets:

    \[ \begin{cases}\frac{1}{2}(n^{3}-n^{2})-\frac{1}{2}n(n-1) & \text{if } n\equiv 1\pmod{2},\\ \frac{1}{2}(n^{3}-n^{2})-n(n-1) & \text{if } n\equiv 2\pmod{4},\\ \frac{1}{2}(n^{3}-n^{2})-n(n-1)-n & \text{if } n\equiv 0\pmod{4}.\end{cases} \tag{55} \]

  • $A$ contains one flip element and two rotation elements: let $A=\{r^{i},r^{j},r^{k}s\}$ for $i\neq j$. Then the sumset is

    \[ A+A=\{r^{i+j},r^{2i},r^{2j},r^{i+k}s,r^{j+k}s,r^{k-i}s,r^{k-j}s,1\}, \tag{56} \]

    while the difference set is

    \[ A-A=\{r^{i-j},r^{j-i},r^{i+k}s,r^{j+k}s,1\}. \tag{57} \]

    We show that $A$ cannot be MDTS by separately comparing the flips and the rotations in $A+A$ and $A-A$. The flips in $A-A$ are contained in $A+A$, so this comparison is trivial. The rotations in $A-A$ are $r^{i-j}$, $r^{j-i}$, and $1$; we will show that $A+A$ has at least as many rotations. Consider the rotations in $A+A$: note that $r^{i+j}$ is different from $r^{2i}$ and from $r^{2j}$ since $i\neq j$, so there are at least $2$ distinct rotations in $A+A$. If $r^{2i}\neq r^{2j}$, then in fact $A+A$ has at least three distinct rotations, and we have $|A+A|\geq|A-A|$. On the other hand, if $r^{2i}=r^{2j}$, then $r^{i-j}=r^{j-i}$, so there are at most $2$ distinct rotations in $A-A$, and again $|A+A|\geq|A-A|$.

  • $A$ contains three rotation elements: there are $\binom{n}{3}$ such subsets. We compare this count directly to the last case of Equation (55), the case that yields the least possible number of MSTD sets in $\mathcal{S}_{2n,3}$. Here we have

    \[ \frac{1}{2}(n^{3}-n^{2})-n(n-1)-n>\frac{1}{6}n(n-1)(n-2) \tag{58} \]

    or equivalently,

    \[ n^{3}-3n^{2}-n>0, \tag{59} \]

    which is true if $n\geq 4$. When $n=3$, which is odd, we instead compare $\binom{3}{3}=1$ to the count for odd $n$ in Equation (55), and the inequality holds once again.

Therefore, even if all of the subsets with three rotation elements are MDTS, which is never true to begin with, we still have strictly more MSTD subsets of size 3 than MDTS subsets of size 3. ∎

Appendix B Proof of Lemma 3.9

For any given $(x,y,F)\in\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$, we calculate the number of pairs $a,b\in\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ with $a-b=(x,y,F)$. Similarly, we calculate the number of pairs $a,b\in\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ with $a+b=(x,y,F)$. Both counts reduce to counting the pairs in $\mathbb{Z}_{n}^{2}$ that result in $(x,y)$, as follows:

Case 1, $F=0$

  • Case 1.1, $a-b=(x,y,0)$

    • Case 1.1.1 $a$ and $b$ are not flips.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}-b_{1}\equiv x\pmod{n}$ and $a_{2}-b_{2}\equiv y\pmod{n}$.

    • Case 1.1.2 $a$ and $b$ are both flips.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}-b_{1}\equiv x\pmod{n}$ and $a_{2}-b_{2}\equiv y\pmod{n}$.

  • Case 1.2, $a+b=(x,y,0)$

    • Case 1.2.1 $a$ and $b$ are not flips.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}+b_{1}\equiv x\pmod{n}$ and $a_{2}+b_{2}\equiv y\pmod{n}$.

    • Case 1.2.2 $a$ and $b$ are both flips.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}-b_{1}\equiv x\pmod{n}$ and $a_{2}-b_{2}\equiv y\pmod{n}$.

Case 2, $F=1$

  • Case 2.1, $a-b=(x,y,1)$

    • Case 2.1.1 $a$ is a flip.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}+b_{1}\equiv x\pmod{n}$ and $a_{2}+b_{2}\equiv y\pmod{n}$.

    • Case 2.1.2 $b$ is a flip.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}+b_{1}\equiv x\pmod{n}$ and $a_{2}+b_{2}\equiv y\pmod{n}$.

  • Case 2.2, $a+b=(x,y,1)$

    • Case 2.2.1 $a$ is a flip.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}-b_{1}\equiv x\pmod{n}$ and $a_{2}-b_{2}\equiv y\pmod{n}$.

    • Case 2.2.2 $b$ is a flip.

      We get that this is equivalent to counting pairs $(a_{1},a_{2}),(b_{1},b_{2})\in\mathbb{Z}_{n}^{2}$ where $a_{1}+b_{1}\equiv x\pmod{n}$ and $a_{2}+b_{2}\equiv y\pmod{n}$.

We get a similar result with $a,b\in\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n^{2}}$ where $a-b=(xn+y,F)$ or $a+b=(xn+y,F)$, as follows:

Case 1, $F=0$

  • Case 1.1, $a-b=(xn+y,0)$

    • Case 1.1.1 $a$ and $b$ are not flips.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})-(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

    • Case 1.1.2 $a$ and $b$ are both flips.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})-(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

  • Case 1.2, $a+b=(xn+y,0)$

    • Case 1.2.1 $a$ and $b$ are not flips.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})+(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

    • Case 1.2.2 $a$ and $b$ are both flips.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})-(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

Case 2, $F=1$

  • Case 2.1, $a-b=(xn+y,1)$

    • Case 2.1.1 $a$ is a flip.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})+(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

    • Case 2.1.2 $b$ is a flip.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})+(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

  • Case 2.2, $a+b=(xn+y,1)$

    • Case 2.2.1 $a$ is a flip.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})-(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

    • Case 2.2.2 $b$ is a flip.

      We get that this is equivalent to counting pairs $a_{1}n+a_{2},b_{1}n+b_{2}\in\mathbb{Z}_{n^{2}}$ where $(a_{1}n+a_{2})+(b_{1}n+b_{2})\equiv xn+y\pmod{n^{2}}$.

Note that the signs are the same between corresponding cases in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n^{2}}$ and $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$. Thus there are only four distinct cases we need to count, and it suffices to compare the number of 4-tuple collisions over all $(x,y)\in\mathbb{Z}_{n}^{2}$ with the number of 4-tuple collisions over all $xn+y\in\mathbb{Z}_{n^{2}}$.
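The sign pattern in the case lists can be checked mechanically. The Python sketch below is an illustration, not part of the original argument: it models $\mathbb{Z}_{2}\ltimes G$ for $G=\mathbb{Z}_{m_{1}}\times\dots\times\mathbb{Z}_{m_{k}}$ with elements `(v, f)`, where `f = 1` marks a flip, and it assumes $a+b$ means the product $ab$ and $a-b$ means $ab^{-1}$. All function names are our own.

```python
from itertools import product


def elements(moduli):
    """All elements (v, f) of the generalized dihedral group over Z_m1 x ... x Z_mk."""
    return [(v, f) for f in (0, 1) for v in product(*(range(m) for m in moduli))]


def mul(a, b, moduli):
    """Group product: a flip on the left inverts the G-part of the right factor."""
    (v, f), (w, g) = a, b
    sign = -1 if f else 1
    return (tuple((x + sign * y) % m for x, y, m in zip(v, w, moduli)), f ^ g)


def inv(a, moduli):
    """Inverse: flips are involutions, rotations negate their G-part."""
    v, f = a
    return a if f else (tuple((-x) % m for x, m in zip(v, moduli)), 0)


def expected_sign(op, fa, fb):
    """Sign e with (G-part of a op b) = a_G + e * b_G, as read off the case lists."""
    if op == "+":
        return -1 if fa else 1           # Cases 1.2.2 and 2.2.1 use a minus sign
    return 1 if (fa ^ fb) else -1        # Cases 2.1.1 and 2.1.2 use a plus sign


def signs_match(moduli):
    """Check every pair (a, b) against the sign predicted by the case lists."""
    for a in elements(moduli):
        for b in elements(moduli):
            for op in "+-":
                result = mul(a, b if op == "+" else inv(b, moduli), moduli)
                e = expected_sign(op, a[1], b[1])
                want = tuple((x + e * y) % m for x, y, m in zip(a[0], b[0], moduli))
                if result != (want, a[1] ^ b[1]):
                    return False
    return True


if __name__ == "__main__":
    print(signs_match((3, 3)), signs_match((9,)))
```

Because `expected_sign` depends only on which of $a$ and $b$ are flips, the same table applies to $\mathbb{Z}_{n}^{2}$ and to $\mathbb{Z}_{n^{2}}$, which is the observation used above.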

We will prove the main result by counting the collisions of each of the 3 types. When $n$ is odd, it is simple to see from Lemma 3.10, Lemma 3.11, and Lemma 3.12 that the number of collisions for each $(x,y)\in\mathbb{Z}_{n}^{2}$ is the same as for the corresponding $xn+y\in\mathbb{Z}_{n^{2}}$, and thus the total numbers of collisions are equal. We now assume that $n$ is even.

  • $a+b=c+d$
    Suppose $y$ is odd. From Lemma 3.11 we have $(\frac{n^{2}}{2})^{2}$ pairs for each $(x,y)$, for a total of $\frac{n^{5}}{4}$ collisions over the $n$ values of $x$. From Lemma 3.12, we also get $(\frac{n^{2}}{2})^{2}$ pairs for each $xn+y$, for an equal total of $\frac{n^{5}}{4}$ collisions.

    If $y$ is even, half of the $(x,y)$ yield $(\frac{n^{2}}{2})^{2}$ pairs and the other half yield $(\frac{n^{2}+4}{2})^{2}$ pairs, for a total of $\frac{n}{2}\cdot\frac{n^{4}+n^{4}+8n^{2}+16}{4}=\frac{n^{5}+4n^{3}+8n}{4}$. Meanwhile, in $\mathbb{Z}_{n^{2}}$ each $xn+y$ yields $(\frac{n^{2}+2}{2})^{2}$ pairs, for a total of $\frac{n^{5}+4n^{3}+4n}{4}$. Thus, there are strictly more collisions in $\mathbb{Z}_{n}^{2}$ in this case.

  • $a+b=c-d$
    Suppose $y$ is odd. We have $\frac{n^{2}}{2}$ choices for $a+b$ and $n^{2}$ choices for $c-d$, for a total of $\frac{n^{5}}{2}$. We get the same result for $\mathbb{Z}_{n^{2}}$.

    If $y$ is even, half of the $(x,y)$ yield $\frac{n^{2}}{2}\cdot n^{2}$ pairs and the other half yield $\frac{n^{2}+4}{2}\cdot n^{2}$ pairs, for a total of $\frac{n}{2}\cdot\frac{n^{4}+n^{4}+4n^{2}}{2}=\frac{n^{5}+2n^{3}}{2}$. For $\mathbb{Z}_{n^{2}}$, each $xn+y$ yields $\frac{n^{2}+2}{2}\cdot n^{2}$ collisions, for the same total of $\frac{n^{5}+2n^{3}}{2}$.

  • $a-b=c-d$
    Both groups have the same number of collisions, namely $n^{4}$, for each $(x,y)$ and the corresponding $xn+y$. Thus, the total number of collisions in this case is $n^{5}$.

Thus, over all the cases where $n$ is even, either there are strictly more collisions in $\mathbb{Z}_{n}^{2}$ or the numbers of collisions are equal, and the inequality is strict in at least one case. We conclude that there are strictly more collisions in $\mathbb{Z}_{2}\ltimes\mathbb{Z}_{n}^{2}$ when $n$ is even.
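The comparison for collisions of type $a+b=c+d$ can be reproduced numerically for small $n$. The Python sketch below rests on an assumption about the counting convention: it reads "pairs" as unordered pairs $\{a,b\}$ with $a=b$ allowed, which is the reading under which the per-element counts $\frac{n^{2}}{2}$, $\frac{n^{2}+2}{2}$ and $\frac{n^{2}+4}{2}$ quoted above come out. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement, product


def sum_pair_counts(elements, add):
    """Map each t to the number of unordered pairs {a, b} (a = b allowed) with a + b = t."""
    return Counter(add(a, b) for a, b in combinations_with_replacement(elements, 2))


def sum_sum_collisions(n):
    """Totals of collisions a + b = c + d in Z_n^2 and in Z_{n^2}, in that order."""
    square = list(product(range(n), repeat=2))
    cyclic = list(range(n * n))
    in_square = sum_pair_counts(
        square, lambda a, b: ((a[0] + b[0]) % n, (a[1] + b[1]) % n)
    )
    in_cyclic = sum_pair_counts(cyclic, lambda a, b: (a + b) % (n * n))
    # A collision is a choice of two pairs with the same sum, so square each count.
    return (
        sum(c * c for c in in_square.values()),
        sum(c * c for c in in_cyclic.values()),
    )


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, sum_sum_collisions(n))
```

Under this convention the two totals agree for odd $n$ and the total for $\mathbb{Z}_{n}^{2}$ is larger for even $n$; for $n=4$ the gap is $8$, matching $2\cdot\frac{4n}{4}$ from the two even values of $y$.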

Appendix C Proof of Theorem 1.9

We have

\begin{align*}
\mathbb{E}[|A-A|]\ &=\ |D_{2n}|-\sum_{g\in D_{2n}}\mathbb{P}[g\notin A-A]\\
&=\ 2n-\mathbb{P}[1\notin A-A]-\left[\sum_{i=1}^{n-1}\mathbb{P}[r^{i}\notin A-A]\right]-\left[\sum_{i=0}^{n-1}\mathbb{P}[r^{i}s\notin A-A]\right]\\
&=\ 2n-0-\left[\sum_{i=1}^{n-1}\sum_{k=0}^{m}\mathbb{P}[\text{$A$ has $k$ flips}]\cdot\mathbb{P}[i\notin S-S\mid S\subseteq\mathbb{Z}_{n},\ |S|=m-k]\cdot\mathbb{P}[i\notin S-S\mid S\subseteq\mathbb{Z}_{n},\ |S|=k]\right]\\
&\qquad-\left[\sum_{i=0}^{n-1}\sum_{k=0}^{m}\mathbb{P}[\text{$A$ has $k$ flips}]\cdot\mathbb{P}[i\notin S_{1}+S_{2}\mid S_{1},S_{2}\subseteq\mathbb{Z}_{n},\ |S_{1}|=m-k,\ |S_{2}|=k]\right]\\
&=\ 2n-\left[\sum_{i=1}^{n-1}\sum_{k=0}^{m}\frac{\binom{n}{k}\binom{n}{m-k}}{\binom{2n}{m}}\cdot\frac{g(n,k)}{\binom{n}{k}}\cdot\frac{g(n,m-k)}{\binom{n}{m-k}}\right]\\
&\qquad-\left[\sum_{i=0}^{n-1}\sum_{k=0}^{m}\frac{\binom{n}{k}\binom{n}{m-k}}{\binom{2n}{m}}\cdot\frac{\binom{n-k}{m-k}}{\binom{n}{m-k}}\right]\\
&=\ 2n-\left[(n-1)\sum_{k=0}^{m}\frac{g(n,k)\cdot g(n,m-k)}{\binom{2n}{m}}\right]-\left[n\sum_{k=0}^{m}\frac{\binom{n}{k}\binom{n-k}{m-k}}{\binom{2n}{m}}\right]\\
&=\ 2n-\left[2(n-1)\frac{g(n,0)\cdot g(n,m)}{\binom{2n}{m}}+(n-1)\sum_{k=1}^{m-1}\frac{\frac{n\binom{n-1-k}{k-1}}{k}\cdot\frac{n\binom{n-1-m+k}{m-k-1}}{m-k}}{\binom{2n}{m}}\right]\\
&\qquad-\left[\frac{n}{\binom{2n}{m}}\sum_{k=0}^{m}\binom{n}{k}\binom{n-k}{m-k}\right]\\
&=\ 2n-\frac{nm2^{m}\binom{n}{m}+2n(n-1)\binom{n-m-1}{m-1}}{m\binom{2n}{m}}-\frac{n^{2}(n-1)}{\binom{2n}{m}}\sum_{k=1}^{m-1}\frac{\binom{n+k-m-1}{m-k-1}\binom{n-k-1}{k-1}}{k(m-k)}, \tag{60}
\end{align*}

by the combinatorial identity

\[ \sum_{k=0}^{m}\binom{n}{k}\binom{n-k}{m-k}\ =\ 2^{m}\binom{n}{m}. \tag{61} \]
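Identity (61) follows from $\binom{n}{k}\binom{n-k}{m-k}=\binom{n}{m}\binom{m}{k}$ together with $\sum_{k}\binom{m}{k}=2^{m}$. Both (61) and the closed form (60) can also be checked numerically. The Python sketch below is an illustration with hypothetical function names; it evaluates the last line of (60) in exact arithmetic and compares it with the average of $|A-A|$ over all $m$-subsets of $D_{2n}$ for a small prime $n$. It assumes the convention that a binomial coefficient with out-of-range arguments is $0$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def binom(a, b):
    """Binomial coefficient, taken to be 0 outside 0 <= b <= a (assumed convention)."""
    return comb(a, b) if 0 <= b <= a else 0


def identity_holds(n, m):
    """Check identity (61) for one pair (n, m)."""
    lhs = sum(binom(n, k) * binom(n - k, m - k) for k in range(m + 1))
    return lhs == 2**m * binom(n, m)


def closed_form(n, m):
    """Last line of (60), evaluated exactly; intended for prime n and m >= 1."""
    total = binom(2 * n, m)
    first = Fraction(
        n * m * 2**m * binom(n, m) + 2 * n * (n - 1) * binom(n - m - 1, m - 1),
        m * total,
    )
    tail = sum(
        Fraction(binom(n + k - m - 1, m - k - 1) * binom(n - k - 1, k - 1), k * (m - k))
        for k in range(1, m)
    )
    return 2 * n - first - Fraction(n * n * (n - 1), total) * tail


def brute_force(n, m):
    """Average of |A - A| over all m-subsets A of D_2n, by direct enumeration."""
    group = [(i, f) for f in (0, 1) for i in range(n)]

    def diff(a, b):
        # a * b^(-1) for a = r^i s^f and b = r^j s^g
        (i, f), (j, g) = a, b
        return ((i - j) % n, 0) if f == g else ((i + j) % n, 1)

    sizes = [len({diff(a, b) for a in A for b in A}) for A in combinations(group, m)]
    return Fraction(sum(sizes), len(sizes))


if __name__ == "__main__":
    print(closed_form(5, 2), brute_force(5, 2))
    print(closed_form(5, 3), brute_force(5, 3))
```

For $n=5$ the two computations agree: both give $22/9$ for $m=2$ and $5$ for $m=3$.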

References

  • [DKMMWW15] Thao Do et al. “Sets characterized by missing sums and differences in dilating polytopes” In J. Number Theory 157, 2015, pp. 123–153 DOI: 10.1016/j.jnt.2015.04.027
  • [Heg07] Peter V. Hegarty “Some explicit constructions of sets with more sums than differences” In Acta Arith. 130.1 Instytut Matematyczny Polskiej Akademii Nauk, 2007, pp. 61–77 DOI: 10.4064/aa130-1-4
  • [HKLLMT20] John Haviland et al. “More Sums Than Differences Sets in Finite Non-Abelian Groups”, Presentation slides, Young Mathematicians Conference, The Ohio State University, 2020 URL: http://www-personal.umich.edu/~havijw/resources/talks/mstd_finite_nonabelian.pdf
  • [HM09] Peter V. Hegarty and Steven J. Miller “When almost all sets are difference dominated” In Random Structures Algorithms 35.1 Wiley Online Library, 2009, pp. 118–136 DOI: 10.1002/rsa.20268
  • [HM13] Virginia Hogan and Steven J. Miller “When Generalized Sumsets are Difference Dominated”, 2013 arXiv:1301.5703 [math.NT]
  • [MO07] Greg Martin and Kevin O’Bryant “Many sets have more sums than differences” In Additive combinatorics 43, CRM Proc. Lecture Notes Amer. Math. Soc., Providence, RI, 2007, pp. 287–305 DOI: 10.1090/crmp/043/16
  • [MOS10] Steven J. Miller, Brooke Orosz and Daniel Scheinerman “Explicit constructions of infinite families of MSTD sets” In J. Number Theory 130.5, 2010, pp. 1221–1233 DOI: 10.1016/j.jnt.2009.09.003
  • [MV14] Steven J. Miller and Kevin Vissuet “Most subsets are balanced in finite groups” In Combinatorial and additive number theory 101, Springer Proc. Math. Stat. Springer, 2014, pp. 147–157 DOI: 10.1007/978-1-4939-1601-6_11
  • [Nat07] Melvyn B. Nathanson “Problems in additive number theory. I” In Additive combinatorics 43, CRM Proc. Lecture Notes Amer. Math. Soc., Providence, RI, 2007, pp. 263–270 DOI: 10.1090/crmp/043/13
  • [Nat07a] Melvyn B. Nathanson “Sets with more sums than differences” In Integers 7, 2007, pp. A5, 24 URL: https://www.emis.de/journals/INTEGERS/papers/h5/h5.pdf
  • [Zha10] Yufei Zhao “Constructing MSTD sets using bidirectional ballot sequences” In J. Number Theory 130.5, 2010, pp. 1212–1220 DOI: 10.1016/j.jnt.2009.11.005
  • [Zha10a] Yufei Zhao “Counting MSTD sets in finite abelian groups” In J. Number Theory 130.10 Elsevier, 2010, pp. 2308–2322 DOI: 10.1016/j.jnt.2010.06.001
  • [Zha11] Yufei Zhao “Sets characterized by missing sums and differences” In J. Number Theory 131.11, 2011, pp. 2107–2134 DOI: 10.1016/j.jnt.2011.05.003