
Well, as usual I’ve not kept up enough with the blog. Déménagement has taken priority over the last few weeks. As it happens, while dusting off a suitcase that hasn’t been used for years, I found this item inside:

I went back to old haunts in 2011 to collect my MMath, and found that a bookstore I was rather fond of was gone. Not quite Martin Blank finding his old home turned into a convenience store, but it still made me a touch maudlin.

Ah well. Tempus fugit, and all that; you can’t cling on to auld lang syne forever, even if marketed nostalgia is one of the staple products of our culture. I still wish that they’d kept more 2nd-hand bookstores and had fewer plastic bars/shops, though.

Following on, in a sense, from the previous post: soon I shall be rid of this turbulent priest, erm, I mean, teaching calculus to 1st year North American students.

(This post brought to you from the Department of Procrastination.)

The post title is from W. H. Auden’s Leap Before You Look (full text at this page).

Years ago it was pointed out to me that the rhyme scheme is

abab bbaa baab abba aabb baba

illustrating rather neatly that “4 choose 2 equals 6”. Note also that the last word of each stanza alternates between “leap” and “disappear”, and that there is a kind of “reflectional symmetry” in the order of the stanzas. Specifically, the transposition of a and b has the effect of reversing the order of the 6 4-tuples.
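For the sceptical, here is a small sanity check of both observations. It is only a sketch: the stanza schemes are copied from the line above, and the names `stanzas`, `all_schemes` and `swapped` are my own.

```python
from itertools import permutations

# Rhyme schemes of the six stanzas, in the order quoted above.
stanzas = "abab bbaa baab abba aabb baba".split()

# Every distinct arrangement of two a's and two b's.
all_schemes = {"".join(p) for p in permutations("aabb")}

# Swap the letters a and b in each stanza's scheme.
swap = str.maketrans("ab", "ba")
swapped = [s.translate(swap) for s in stanzas]

print(len(all_schemes))            # number of arrangements, i.e. 4 choose 2
print(set(stanzas) == all_schemes) # each arrangement is used exactly once
print(swapped == stanzas[::-1])    # swapping a and b reverses the stanza order
```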

Hmm, maybe I should try this as an example if I get to teach a course introducing people to finite groups…

Well, that break was longer than intended…

In the last post, we claimed that $\displaystyle {\rm AM}_{\rm Z}(G)>1$ for every finite, non-abelian group G. It turns out that the easiest way to prove this goes via a certain minorant for ${\rm AM}_{\rm Z}(G)$ which we will work with in some subsequent posts. In this post, we’ll introduce this minorant, give an explicit lower bound, and then briefly indicate how it allows us to show the stronger result that

$\displaystyle \inf \{ {\rm AM}_{\rm Z}(G) \colon G \mbox{ finite and non-abelian} \} > 1\;.$

## 1. Recap

Recall that

$\displaystyle {\rm AM}_{\rm Z}(G) = \sum_{C,D\in{\rm Conj}(G)} |C|\ |D| \left\vert \sum_{\phi\in {\rm Irr}(G)} \frac{1}{|G|^2} \phi(e)^2\phi(C)\phi(D) \right\vert \;.$

We can rewrite this in a cosmetic but suggestive way. Observe that the inversion map on G, which sends each element to its inverse, maps conjugacy classes to conjugacy classes. It follows that for each D in Conj(G), the set

$\displaystyle \overline{D} = \{ x^{-1} \colon x\in D \}$

also belongs to Conj(G). Moreover, the map ${D \mapsto \overline{D}}$ is an involution, in particular is bijective. Therefore, since ${\phi(\overline{D})=\overline{\phi(D)}}$ for every character ${\phi}$, we obtain

$\displaystyle {\rm AM}_{\rm Z}(G) = \sum_{C,D\in{\rm Conj}(G)} |C|\ |D| \left\vert \sum_{\phi\in {\rm Irr}(G)} \frac{1}{|G|^2} \phi(e)^2\phi(C)\overline{\phi(D)} \right\vert \;.$

We already saw this idea, in a special case, when we looked at ${{\rm AM}_{\rm Z}(G)}$ for abelian groups. There, the point of this small change was that it made the expression look more like an inner product, so that one could apply Schur orthogonality relations; a similar idea was applied in a recent paper of Alaghmandan, Samei and myself (arXiv 1302.1929) to handle certain groups which are close to the abelian case in some sense.

## 2. A remark on normalized versus unnormalized counting measure

First, I need to clear up an issue of normalization conventions, which I omitted to deal with before. In our series of posts, we have always been working on the complex group algebra equipped with the ${\ell^1}$-norm. That is, we are looking at ${L^1(G,\lambda)}$ where ${\lambda}$ denotes counting measure on the finite set G.

On the other hand, the paper of Azimifard–Samei–Spronk (henceforth referred to as [ASS09]), where the amenability constant of the centre of the group algebra was first studied, considers ${L^1(G,\mu)}$ where G is a compact group and ${\mu}$ denotes uniform probability measure on G.

However, there is no serious conflict. For if G is a finite group, let A denote ${\ell^1(G)}$ equipped with counting measure ${\lambda}$ and equipped with convolution using ${\lambda}$, and let B denote ${\ell^1(G)}$ equipped with uniform probability measure ${\mu}$ and equipped with convolution using ${\mu}$. Then a direct calculation shows that the obvious isometric rescaling map from A to B is in fact an isomorphism of Banach algebras. In particular, the centres of A and B have the same amenability constant. Thus, our formula for ${{\rm AM}_{\rm Z}(G)}$ coincides with the formula in [ASS09] for the amenability constant of the centre of ${L^1(G,\mu)}$.
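To make the "direct calculation" concrete, here is a minimal sketch for the cyclic group of order 4. I am assuming that the rescaling map in question is $f \mapsto |G| f$, which is my reading of "obvious"; the helper names `conv`, `norm` and `T` and the two test functions are invented for illustration.

```python
from fractions import Fraction

def conv(f, g, n, weight):
    """Convolution on Z_n with respect to `weight` times counting measure."""
    return [weight * sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]

def norm(f, weight):
    """L^1 norm with respect to `weight` times counting measure."""
    return weight * sum(abs(v) for v in f)

n = 4
lam, mu = Fraction(1), Fraction(1, n)   # counting measure, uniform probability measure

def T(h):
    """Assumed rescaling map from A to B: multiply by |G| = n."""
    return [n * v for v in h]

f = [Fraction(v) for v in (1, -2, 0, 3)]
g = [Fraction(v) for v in (2, 1, -1, 0)]

lhs = T(conv(f, g, n, lam))             # T(f * g), convolution taken in A
rhs = conv(T(f), T(g), n, mu)           # T(f) * T(g), convolution taken in B
print(lhs == rhs, norm(T(f), mu) == norm(f, lam))
```

Exact rational arithmetic is used so that the two equalities can be tested without rounding error.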

## 3. A minorant for ${{\rm AM}_{\rm Z}(G)}$

At a naive level (but not a completely facile one) we might say that the difficulty in getting non-trivial lower bounds on ${{\rm AM}_{\rm Z}(G)}$ is due to the fact that one takes the modulus of a sum of different terms, inside which there might be significant cancellation. Indeed, this is exactly what happens in the case of an abelian group: see the previous post for details.

One situation where we can avoid cancellation is where the terms in the sum are all non-negative, so that the modulus is just the sum itself. Looking at the revised formula for ${{\rm AM}_{\rm Z}(G)}$, we see that this happens whenever C=D (it may also happen for some other choices of C and D, but let us ignore that for now). Moreover, if we only want a lower bound on ${{\rm AM}_{\rm Z}(G)}$ and not its precise value, we are free to discard terms indexed by particular C and D. Thus, as observed in [ASS09], ${{\rm AM}_{\rm Z}(G)}$ is bounded below by the following quantity

$\displaystyle \begin{aligned} \alpha(G) & := \sum_{C\in{\rm Conj}(G)} |C|^2 \left\vert \sum_{\phi\in {\rm Irr}(G)} \frac{1}{|G|^2} \phi(e)^2\phi(C)\overline{\phi(C)} \right\vert \\ & = |G|^{-2} \sum_{C\in{\rm Conj}(G)} \sum_{\phi\in {\rm Irr}(G)} \phi(e)^2 |\phi(C)|^2 |C|^2 \end{aligned} \ \ \ \ \ (1)$

(The paper [ASS09] does not give this quantity a specific symbol, but in subsequent posts it will appear frequently enough that some extra notation seems warranted.)

In the previous post, we claimed that if G is a non-abelian finite group then we have ${\rm AM}_{\rm Z}(G)$ > 1. We can now give a sharper statement. (The calculation in [ASS09] does not give the explicit bound that we do, but it is implicit in their work.)

Proposition 1 (Azimifard–Samei–Spronk, 2009) Let G be a finite, non-abelian group, and let

$\displaystyle \begin{aligned} s & =\min \{ |C| \colon C\in {\rm Conj}(G), |C|\neq 1 \} \\ & \equiv \min \{ |\mbox{conj. class of } x | \colon x \in G\setminus Z(G) \}.\end{aligned}$

Then

$\displaystyle \alpha(G) \geq 1 + (s^2-s)|G|^{-2} > 1 \;.$

Proof: Compare the formula (1) which defines $\alpha(G)$ with

$\displaystyle |G|^{-2} \sum_{C\in{\rm Conj}(G)} \sum_{\phi\in {\rm Irr}(G)} \phi(e)^2 |\phi(C)|^2 |C| \ \ \ \ \ (2)$

Rearranging the sum and using the Schur row and column orthogonality relations, we see that (2) is equal to

$\displaystyle |G|^{-2} \sum_{\phi\in {\rm Irr}(G)} \phi(e)^2\sum_{C\in{\rm Conj}(G)} |\phi(C)|^2 |C| = |G|^{-1} \sum_{\phi\in {\rm Irr}(G)} \phi(e)^2 = 1.$

Hence

$\displaystyle \alpha(G)-1 = |G|^{-2} \sum_{C\in{\rm Conj}(G)} \sum_{\phi\in {\rm Irr}(G)} \phi(e)^2 |\phi(C)|^2 (|C|^2-|C|) \;.$

Now all of the terms on the right hand side are non-negative. Some of them may be zero (for instance, whenever C consists of just a single point, or whenever ${\phi(C)=0}$) but we can identify at least one strictly positive term. Namely, let ${C_0}$ be a conjugacy class of size s, and consider the trivial character ${\varepsilon}$ which takes the value 1 everywhere. Then

$\displaystyle \varepsilon(e)^2 |\varepsilon(C_0)|^2 (|C_0|^2-|C_0|) = s^2-s \geq 2,$

which gives us the lower bound that was claimed. $\hfill\Box$
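As a sanity check, one can evaluate everything by hand or by machine for the smallest non-abelian group, the symmetric group on 3 letters. The sketch below hard-codes the standard character table of that group (all values happen to be real, so complex conjugation is omitted); the variable names are mine and are not taken from [ASS09].

```python
from fractions import Fraction

# Character table of S_3. Columns: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
characters = [
    [1, 1, 1],    # trivial character
    [1, -1, 1],   # sign character
    [2, 0, -1],   # 2-dimensional character
]
order = sum(class_sizes)  # |G| = 6

def kernel(i, j):
    """Sum over phi of phi(e)^2 phi(C_i) phi(C_j); real-valued for S_3."""
    return sum(chi[0] ** 2 * chi[i] * chi[j] for chi in characters)

idx = range(len(class_sizes))

# AM_Z(G): sum over all pairs of classes, with the modulus inside.
am_z = Fraction(sum(class_sizes[i] * class_sizes[j] * abs(kernel(i, j))
                    for i in idx for j in idx), order ** 2)

# alpha(G): keep only the diagonal terms C = D, as in formula (1).
alpha = Fraction(sum(class_sizes[i] ** 2 * kernel(i, i) for i in idx), order ** 2)

# Lower bound from Proposition 1, with s the smallest non-trivial class size.
s = min(c for c in class_sizes if c != 1)
lower_bound = 1 + Fraction(s * s - s, order ** 2)

print(am_z, alpha, lower_bound)
```

For this group the three quantities come out as 7/3, 5/3 and 19/18 respectively, so the explicit bound in Proposition 1 is far from sharp here.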

Note that our lower bound “gets worse” as G gets bigger. Indeed, I believe the following question is still open.

Question. Is the infimum of $\alpha(G)$ over all finite non-abelian groups G strictly greater than 1?

Nevertheless, as mentioned in the first post of this series, we can do better when it comes to ${{\rm AM}_{\rm Z}(G)}$, which is the original quantity of interest. This was done in [ASS09] by appealing to a hard result of D. A. Rider, which tells us that the norms of central idempotents have “a gap at 1”.

Theorem 2 (Rider, 1973) Let K be a compact group, let E be a finite subset of Irr(K), and let ${f=\sum_{\phi\in E} \phi(e)\phi \in L^1(K)}$. (The orthogonality relations for irreducible characters imply that ${f}$ is a central idempotent in ${L^1(K)}$, and all central idempotents in ${L^1(K)}$ arise this way.) If ${\Vert f\Vert_1 > 1}$, then ${\Vert f \Vert_1 \geq 301/300}$.

Now let G be a finite, non-abelian group. Since ${\rm AM}_{\rm Z}(G)\geq \alpha(G)$, Proposition 1 immediately implies that ${{\rm AM}_{\rm Z}(G) > 1}$. Now ${{\rm AM}_{\rm Z}(G)=\Vert \Delta_G \Vert}$, where ${\Delta_G}$ is a central idempotent in ${L^1(G\times G)}$. Applying Rider’s theorem to ${\Delta_G}$ we deduce, as in [ASS09], that ${{\rm AM}_{\rm Z}(G)\geq 301/300}$.

Rider’s proof is rather long and technical and we will not present the details here. The constant 301/300 is somewhat arbitrary, resulting from choices made in chains of estimates, and can be improved slightly by repeating Rider’s arguments with more nit-picking. However, it seems that a significant improvement in the constant would require new ideas.

In the next post, we will see that with a more careful use of the Schur orthogonality relations, one can improve the lower bound in Proposition 1 to a constant that does not depend on |G|, provided that G has trivial centre. To do this we will need a new ingredient, not available in [ASS09], which ensures that a group which has an irreducible character of “surprisingly large” degree cannot have any small conjugacy classes except for elements of the centre.

Edited 2013-06-17: corrected some typos/omissions.

OK, back to the story of the central amenability constant. I’ll take the opportunity to re-tread some of the ground from the first post.

## 1. Review/recap

Given a finite group G, ${{\mathbb C} G}$ denotes the usual complex group algebra: we think of it as the vector space ${{\mathbb C}^G}$ equipped with a suitable multiplication. This has a canonical basis as a vector space, indexed by group elements: we denote the basis vector corresponding to an element x of G by ${\delta_x}$. Thus for any function ${\psi:G\rightarrow{\mathbb C}}$, we have ${\psi = \sum_{x\in G} \psi(x)\delta_x}$.

(Aside: this is not really the correct “natural” way to think of the group algebra if one generalizes from finite groups to infinite groups; one has to be more careful about whether one is thinking “covariantly or contravariantly”. ${{\mathbb C}^G}$ is naturally a contravariant object as G varies, but the group algebra should be covariant as G varies. However, our approach allows us to view characters on G as elements of the group algebra, which is a very convenient elision.)

The centre of ${{\mathbb C} G}$, henceforth denoted by ${{\rm Z\mathbb C} G}$, is commutative and spanned by its minimal idempotents, which are all of the form

$\displaystyle p_\phi = \frac{\phi(e)}{|G|}\phi \equiv \frac{\phi(e)}{|G|}\sum_{x\in G} \phi(x)\delta_x$

for some irreducible character ${\phi:G\rightarrow{\mathbb C}}$. Moreover, ${\phi\mapsto p_\phi}$ is a bijection between the set of irreducible characters and the set of minimal idempotents in ${{\rm Z\mathbb C} G}$.

We define

$\displaystyle {\bf m}_G = \sum_{\phi\in {\rm Irr}(G)} p_\phi \otimes p_\phi \in {\rm Z\mathbb C} G \otimes {\rm Z\mathbb C} G \equiv {\rm Z\mathbb C} (G\times G )$

and, equipping ${{\mathbb C}(G\times G )}$ with the natural ${\ell^1}$-norm, defined by

$\displaystyle \Vert f\Vert = \sum_{(x,y) \in G\times G } |f(x,y)|,$

we define ${{\rm AM}_{\rm Z}(G)}$ to be ${\Vert {\bf m}_G \Vert}$. Explicitly, if we use the convention that the value of a class function ${\psi}$ on any element of a conjugacy class C is denoted by ${\psi(C)}$, we have

$\displaystyle {\rm AM}_{\rm Z}(G) = \sum_{C,D\in{\rm Conj}(G)} |C|\ |D| \left\vert \sum_{\phi\in {\rm Irr}(G)} \frac{1}{|G|^2} \phi(e)^2\phi(C)\phi(D) \right\vert \;,$

the formula stated in the first post of this series.

## 2. Moving onwards

Remark 1
As I am writing these things up, it occurs to me that “philosophically speaking”, perhaps one should regard ${{\bf m}_G}$ as an element of the group algebra ${{\mathbb C}(G\times G^{\rm op})}$, where ${G^{\rm op}}$ denotes the group whose underlying set is that of G but equipped with the reverse multiplication. It is easily checked that a function on ${G\times G}$ is central as an element of ${{\mathbb C}(G\times G )}$ if and only if it is central as an element of the algebra ${{\mathbb C}(G\times G^{\rm op})}$, so we can get away with the definition chosen here. Nevertheless, I have a suspicion that the ${G\times G^{\rm op}}$ picture is somehow the “right” one to adopt, if one wants to put the study of ${{\bf m}_G}$ into a wider algebraic context.

${{\bf m}_G}$ is a non-zero idempotent in a Banach algebra, so it follows from submultiplicativity of the norm that ${{\rm AM}_{\rm Z}(G)=\Vert {\bf m}_G \Vert \geq 1}$. When do we have equality?

Theorem 2 (Azimifard–Samei–Spronk) ${{\rm AM}_{\rm Z}(G)=1}$ if and only if G is abelian.

The proof of necessity (that is, the “only if” direction) will go in the next post. In the remainder of this post, I will give two proofs of sufficiency (that is, the “if” direction).

In the paper of Azimifard–Samei–Spronk (MR 2490229; see also arXiv 0805.3685) where I first learned of ${{\rm AM}_{\rm Z}(G)}$, this direction is glossed over quickly, since it follows from more general facts in the theory of amenable Banach algebras. I will return later, in Section 2.2, to an exposition of how this works for the case in hand. First, let us see how we can approach the problem more directly.

#### 2.1. Proof of sufficiency: direct version

Suppose G is abelian, and let ${n=|G|}$. Then G has exactly n irreducible characters, all of which are linear (i.e. one-dimensional representations, a.k.a. multiplicative functionals). Denoting these characters by ${\phi_1,\dots,\phi_n}$, we have

$\displaystyle {\bf m}_G = \sum_{j=1}^n \frac{1}{n}\phi_j \otimes \frac{1}{n}\phi_j$

so that

$\displaystyle {\rm AM}_{\rm Z}(G) = \sum_{x,y\in G} \left\vert \sum_{j=1}^n \frac{1}{n^2}\phi_j(x)\phi_j(y)\right\vert$

This sum can be evaluated explicitly using some Fourier analysis — or, in the present context, the Schur column orthogonality relations. To make this a bit more transparent, recall that ${\phi(y^{-1})=\overline{\phi(y)}}$ for all characters ${\phi}$ and all y in G. Hence by a change of variables in the previous equation, we get

$\displaystyle {\rm AM}_{\rm Z}(G) = \frac{1}{n^2} \sum_{x,y\in G} \left\vert \sum_{j=1}^n \phi_j(x)\overline{\phi_j(y)} \right\vert$

For a fixed element x in G, the n-tuple ${(\phi_1(x), \dots, \phi_n(x) )}$ is a column in the character table of G. We know by general character theory for finite groups that distinct columns of the character table, viewed as column vectors with complex entries, are orthogonal with respect to the standard inner product. Hence most terms in the expression above vanish, and we are left with

$\displaystyle \begin{aligned} {\rm AM}_{\rm Z}(G) & = \frac{1}{n^2} \sum_{x\in G} \left\vert \sum_{j=1}^n \phi_j(x)\overline{\phi_j(x)} \right\vert \\ & = \frac{1}{n^2} \sum_{x\in G} \sum_{j=1}^n \vert\phi_j(x)\vert^2 \end{aligned}$

which equals ${1}$, since each ${\phi_j}$ takes values in ${\mathbb T}$. This completes the proof.
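For readers who like to see numbers, here is a rough numerical check of the calculation above in the simplest abelian case, the cyclic group of order n, whose characters are $x\mapsto e^{2\pi i jx/n}$. This is only a floating-point sketch, and the function name `am_z_cyclic` is my own invention.

```python
import cmath

def am_z_cyclic(n):
    """Evaluate (1/n^2) * sum_{x,y} | sum_j phi_j(x) phi_j(y) | for the cyclic group Z_n."""
    # chars[j][x] is the value of the j-th character at x.
    chars = [[cmath.exp(2j * cmath.pi * j * x / n) for x in range(n)]
             for j in range(n)]
    total = 0.0
    for x in range(n):
        for y in range(n):
            # The inner sum is n when x + y = 0 in Z_n and (numerically) 0 otherwise.
            total += abs(sum(chi[x] * chi[y] for chi in chars))
    return total / n ** 2

print(am_z_cyclic(5), am_z_cyclic(6))
```

Up to rounding error, both values printed should equal 1.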

#### 2.2. Proof of sufficiency: slick version

The following argument is an expanded version of the one that is outlined, or alluded to, in the paper of Azimifard–Samei–Spronk. It is part of the folklore in Banach algebras — for given values of “folk” — but really the argument goes back to the study of “separable algebras” in the sense of ring theory.

Lemma 3 Let A be an associative, commutative algebra, with identity element ${1_A}$. Let ${\Delta: A\otimes A \rightarrow A}$ be the linear map defined by ${\Delta(a\otimes b)=ab}$. Then there is at most one element m in ${A\otimes A}$ that simultaneously satisfies ${\Delta(m)=1_A}$ and ${a\cdot m = m\cdot a}$ for all a in A.

Proof: Let us first omit the assumption that A is commutative, and work merely with an associative algebra that has an identity.

Define the following multiplication on ${A\otimes A}$:

$\displaystyle (a\otimes b) \odot (c\otimes d) := ac \otimes db .$

Then ${(A\otimes A, \odot)}$ is an associative algebra — the so-called enveloping algebra of A. If m satisfies the conditions mentioned in the lemma, then

$\displaystyle (a\otimes b) \odot m = a\cdot m \cdot b = ab\cdot m = \Delta(a\otimes b)\cdot m \;;$

and so, by taking linear combinations, ${w\odot m = \Delta(w)\cdot m}$ for every w in ${A\otimes A}$. If n is another element of ${A\otimes A}$ satisfying the conditions of the lemma, we therefore have ${n\odot m=\Delta(n)\cdot m=m}$, and by symmetry, ${m\odot n=n}$.

Now we use the assumption that A is commutative. From this assumption, we see that ${(A\otimes A,\odot)}$ is also commutative. Therefore

$\displaystyle m = n\odot m = m\odot n = n$

as required. $\Box$

Now let G be a finite group and let A= ${{\rm Z\mathbb C} G}$. Because A is spanned by its minimal idempotents ${p_\phi}$, and because minimal idempotents in a commutative algebra are mutually orthogonal, ${{\bf m}_G = \sum_\phi p_\phi \otimes p_\phi}$ satisfies the two conditions mentioned in Lemma 3. On the other hand, if G is abelian, consider

$\displaystyle {\bf n}_G := \frac{1}{|G|} \sum_{x\in G} \delta_x \otimes \delta_{x^{-1}} \in {\mathbb C} G \otimes {\mathbb C} G = {\rm Z\mathbb C} G \otimes {\rm Z\mathbb C} G = A\otimes A.$

Clearly ${\Delta({\bf n}_G)=1_A}$, and a direct calculation shows that ${\delta_g\cdot {\bf n}_G = {\bf n}_G\cdot \delta_g}$ for all g in G, so by linearity ${{\bf n}_G}$ also satisfies both conditions mentioned in Lemma 3. Applying the lemma tells us that ${{\bf m}_G= {\bf n}_G}$, and in particular

$\displaystyle {\rm AM}_{\rm Z}(G) = \Vert {\bf n}_G \Vert = 1$

as required.

I am a bit surprised and disappointed to see that the online maths communities I lurk around seem largely oblivious to this recent preprint, arXiv 1306.3969. Here is the abstract: the added emphasis is mine.

We use the method of interlacing families of polynomials to prove Weaver’s conjecture KS2, which is known to imply a positive solution to the Kadison-Singer problem via Anderson’s Paving Conjecture. Our proof goes through an analysis of the largest roots of a family of polynomials that we call the “mixed characteristic polynomials” of a collection of matrices.

(A few years ago, the 2nd and 3rd authors of that preprint made a dramatic improvement in our understanding of a theorem of Bourgain and Tzafriri, see arXiv 0911.1114. So this paper is certainly worth taking seriously at the very least.)

Over on G+, Willie Wong quite sensibly asked for some brief explanation of what the problem said, and why people care(d). I must confess that the full background to the Kadison-Singer conjecture/problem is well outside my area of technical expertise, possibly outside my area of competence. Nevertheless, I can at least link to this article by Casazza and Tremain, which mentions some other conjectures in functional analysis that are known to be equivalent to the Paving Conjecture, and hence (by work of Anderson) to the Kadison-Singer conjecture.

P. G. Casazza, J. C. Tremain. The Kadison–Singer Problem in mathematics and engineering. PNAS vol. 103 (2006) no. 7, 2032–2039

Here is a link to some web material for an AIM workshop on the Kadison-Singer problem, which may give the general audience some idea of work in recent years.

The paper of Weaver which the preprint refers to is:

MR2035401 (2004k:46093) N. Weaver. The Kadison-Singer problem in discrepancy theory. Discrete Math. 278 (2004), no. 1-3, 227–239.
arXiv math/0209078

The MathReview of Weaver’s paper, by P. J. Stacey, is short enough that it can be reproduced here:

In [Amer. J. Math. 81 (1959), 383–400; MR0123922 (23 #A1243)], R. V. Kadison and I. M. Singer asked if every pure state on an atomic maximal abelian subalgebra of B(H), the algebra of bounded operators on a separable Hilbert space H, extends uniquely to a pure state on B(H). Developing the approach in [C. A. Akemann and J. Anderson, Mem. Amer. Math. Soc. 94 (1991), no. 458, iv+88 pp.; MR1086563 (92e:46113)], the author formulates a combinatorial version of the Kadison-Singer problem, in terms of unit vectors in ${{\mathbb C}^k}$. Some positive partial results are then obtained using discrepancy theory.

Perhaps I will keep this blog post updated with some more links, if anyone has suggestions. Though really it should be left to the operator theorists, operator algebraists, and combinatorists to write some expositions in the weeks to come.

### Update 2013-06-20

I see there is some attention now that Terence Tao has mentioned this on G+ and thence on the Selected Papers Network. (I admit that when I mentioned the paper on G+, I didn’t tag it with #spnetwork, mainly because I didn’t feel I had anything intelligent to say at the time; and if this #spnetwork is to become useful to the community of research mathematicians, it needs less noise from spectators, and more commentary from people who understand some ideas in the papers under discussion!)

Gil Kalai has a blogpost which says a little more about how the paper of Marcus, Spielman and Srivastava relates to the previous results of Bourgain and Tzafriri, and mentions that Spielman and Srivastava had previously given a new proof – an improved proof? – of Bourgain-Tzafriri’s restricted invertibility theorem.

Orr Shalit has also picked up on this, and offers some thoughts from the perspective of an operator algebraist/operator theorist.

As promised in the previous blogpost, here is some finite group theory. (Those of you familiar with the British TV show “Faking It” will appreciate that you don’t have to fool all of the people all of the time, just some of the people at the right moments.)

Some preliminary terminology is useful, since we will need it in later posts. We say that H is (isomorphic to) a proper quotient of a group G if there is a homomorphism from G onto H which has non-trivial kernel. A group G is said to be just non-abelian (or JNA for short) if it is non-abelian, yet every proper quotient is abelian. A little thought shows that this is equivalent to the condition that G is non-abelian and the derived subgroup [G,G] is contained in each non-trivial normal subgroup of G.

(I don’t remember explicitly hearing about JNA groups in past courses or talks, but about seven years ago in Newcastle I heard a visiting speaker talk about families of groups that were “just-${{\mathcal P}}$” for some property ${{\mathcal P}}$.)

Doing some digging in the literature: in the case where the group G is JNA and [G,G] is abelian, there is a classification or structure theorem available, through work of M. F. Newman in two papers that appear in the Proceedings of the London Mathematical Society (both in volume 10, 1960). Later on, when we resume the story of the central amenability constant, JNA groups will turn up very naturally. However, none of that is needed to follow the proof of the following result:

Theorem 1. Let G be a finite JNA group which has trivial centre and which has a conjugacy class of size 2. Then G contains an involution ${t}$ and an element ${r}$ of odd prime order p, such that G has order 2p and ${rtr=t}$.

Up to isomorphism, the only group satisfying the conclusions of the theorem is the dihedral group of order 2p. (Checking that this group actually is JNA is not too difficult, but I won’t go into the details here.)
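For anyone who would rather let a machine do the checking, here is a brute-force sketch for small dihedral groups. The encoding of elements as pairs `(a, e)` standing for $r^a t^e$, and all function names, are my own choices. The JNA test uses the reformulation above: every non-trivial normal subgroup contains the normal closure of some non-identity element, so it is enough to check that each such normal closure contains [G,G].

```python
def dihedral(n):
    """Dihedral group of order 2n; (a, e) stands for r^a t^e, with t r t^{-1} = r^{-1}."""
    elems = [(a, e) for a in range(n) for e in (0, 1)]
    def mul(x, y):
        (a, e), (b, f) = x, y
        return ((a + (-b if e else b)) % n, (e + f) % 2)
    def inv(x):
        a, e = x
        return (a, 1) if e else ((-a) % n, 0)   # every r^a t is an involution
    return elems, mul, inv

def conj_class(x, elems, mul, inv):
    return {mul(mul(g, x), inv(g)) for g in elems}

def generated(gens, mul, identity):
    """Subgroup generated by gens; in a finite group, closing under products suffices."""
    sub, frontier = {identity}, set(gens)
    while frontier:
        sub |= frontier
        frontier = {mul(a, b) for a in sub for b in sub} - sub
    return sub

def check(n):
    """Return (trivial centre?, has a class of size 2?, JNA?) for the dihedral group of order 2n."""
    elems, mul, inv = dihedral(n)
    e = (0, 0)
    centre = [z for z in elems if all(mul(z, g) == mul(g, z) for g in elems)]
    commutators = {mul(mul(x, y), mul(inv(x), inv(y))) for x in elems for y in elems}
    derived = generated(commutators, mul, e)
    has_class_of_size_2 = any(len(conj_class(x, elems, mul, inv)) == 2 for x in elems)
    jna = len(derived) > 1 and all(
        derived <= generated(conj_class(x, elems, mul, inv), mul, e)
        for x in elems if x != e)
    return centre == [e], has_class_of_size_2, jna

print(check(5), check(9))
```

The order 18 case is included as a contrast: that group has trivial centre and a conjugacy class of size 2, but the subgroup generated by $r^3$ is normal and does not contain [G,G], so the JNA condition fails, consistent with Theorem 1.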

Remark. In any finite group with a conjugacy class of size 2, both elements in the conjugacy class have the same centralizer — this point will be reiterated below in a little more detail — and this centralizer has index 2 in the parent group, hence is normal. This is encouraging in our setting, since the JNA condition now gives us extra information.

## A leisurely proof

The purpose of this post is to provide a proof of Theorem 1 that uses only basic facts of finite group theory, such as may be found in a first course that covers notions such as conjugacy classes and normal subgroups.

We recall that the derived subgroup of G, sometimes called the commutator subgroup of G, is the subgroup of G generated by all elements of the form ${xyx^{-1}y^{-1}}$ as ${x,y}$ vary over G. (For those who prefer to think categorically, [G,G] is uniquely determined by the property that it is contained in the kernel of every homomorphism from G to any abelian group, i.e. it is the kernel of the abelianization homomorphism.)

Now in Theorem 1, the hypotheses on G are as follows:

1. Z(G)${=\{e\}}$
2. there exists ${r\in}$ G with exactly one other conjugate, ${\overline{r}\neq r}$
3. If N is a non-trivial normal subgroup of G, then every commutator ${[x,y]:=xyx^{-1}y^{-1}}$ belongs to N.

It is immediate from Condition 2 that any given ${x\in}$ G either centralizes both ${r}$ and ${\overline{r}}$, or else swaps them (this is true for any group action on any 2-point set, of course). More formally:

Lemma 2. Let ${x\in}$ G. Either ${xrx^{-1}=r}$ and ${x\overline{r} x^{-1}=\overline{r}}$, or ${xrx^{-1}=\overline{r}}$ and ${x\overline{r} x^{-1} = r}$.

In particular, this lemma implies that ${r\overline{r} r^{-1}=\overline{r}}$, i.e. ${r}$ and ${\overline{r}}$ commute. Applying the lemma to ${r\overline{r}}$ shows that ${x(r\overline{r})x^{-1}=r\overline{r}}$ for all ${x}$ in G, and so by Condition 1 we must have

$\displaystyle r\overline{r}=e. \ \ \ \ \ (1)$

Let N be the order of ${r}$: this is an integer ${\geq 2}$. If N were even, say N=2m, then ${r^m = r^{-m} = (\overline{r})^m \neq e}$, and applying Lemma 2 we would find that ${r^m\in}$ Z(G), which contradicts Condition 1. Therefore N is odd.

Now we fix ${t}$ in G such that ${\overline{r}=trt^{-1}}$. (It will turn out that ${t}$ has to be an involution, but the proof is somewhat indirect.) By Lemma 2, ${t\overline{r} t^{-1} = r}$, and since ${\overline{r}=r^{-1}}$ (Equation (1)) this implies

$\displaystyle r^2 = tr^{-1}t^{-1}r \in [G,G].$

Because ${r}$ has odd order, this implies that ${r\in [G,G]}$, and so [G,G] contains the subgroup of G generated by ${r}$, which we denote by ${\langle r\rangle}$. On the other hand, observe that since ${\overline{r}\in \langle r\rangle}$, Lemma 2 implies that ${\langle r\rangle}$ is a normal subgroup of G. Therefore, by Condition 3,

$\displaystyle [G,G] = \langle r\rangle. \ \ \ \ \ (2)$

Let H be the centralizer in G of the element ${r}$ (and hence, as remarked above, of the element ${\overline{r}}$). We know that H has index 2 in G (by applying the orbit-stabilizer theorem to the conjugation action of G on ${\{r,\overline{r}\}}$). At this point we could now invoke the general fact that index 2 subgroups of any group are necessarily normal subgroups. To keep things self-contained, we instead use some ad hoc arguments (although this admittedly leaves out the bigger picture which motivates our calculations).

Lemma 3. Let ${x\in}$ G. Then either ${x}$ or ${xt}$ belongs to H.

Proof: If ${x}$ belongs to G but not H, then ${x^{-1}}$ does not lie in H, and so ${x^{-1}rx=\overline{r}}$ by Lemma 2. So ${x^{-1}rx= trt^{-1}}$ and rearranging shows that ${xt}$ centralizes ${r}$. $\Box$

Moreover, by definition H contains ${r}$. Hence it contains [G,G], by Equation (2). The next step in our argument is to show that in fact H=[G,G].

Digression. At this point, if we just wanted to proceed as quickly as possible, we could appeal to Theorem 3.4 of

M. F. Newman, On a class of metabelian groups. Proc. London Math. Soc. (3) 10 1960 354–364. MR0117293 (22 #8074)

Indeed, this point is where I’d originally got stuck on my first attempt to prove the theorem, and I had to resort to Newman’s paper to check that what I was hoping to prove was actually true. Nevertheless, it seems worth giving an ad hoc argument using only “bare-hands techniques”, since we are in a much more specialized setting than that covered by Newman’s theorem. What follows is my own argument, found by some trial and error after I had used Newman’s paper to check I was on the right lines.

### Proof that H=[G,G]

Consider the function θ : H${\rightarrow}$ [G,G] defined by ${\theta(h)=tht^{-1}h^{-1}}$. Since [G,G]=${\langle r\rangle}$ is contained in Z(H), for all ${h,k\in}$ H we have

$\displaystyle t(hk)t^{-1} = (tht^{-1})(tkt^{-1}) = \theta(h)h\theta(k)k=\theta(h)\theta(k)hk.$

Thus

$\displaystyle \theta(hk)=\theta(h)\theta(k)\quad\mbox{for all }h,k\in H; \ \ \ \ \ (3)$

in other words, θ is a homomorphism. Our goal is to show θ is a bijection, which will force H and [G,G] to have the same cardinality. Since we already know that [G,G] is contained in H, we can conclude that [G,G]=H as required.

First, we show θ is surjective. Note that ${\theta(r^{-1}) = tr^{-1}t^{-1}r = r^2}$, so if we let m=(N+1)/2 and k be any integer, induction (or the fact θ is a homomorphism) implies that ${\theta(r^{-mk}) = r^k}$. Since [G,G]=${\langle r\rangle}$, this shows θ(H)=[G,G].

Secondly, we show θ is injective. Let K=${\ker(\theta)}$, which is a normal subgroup of H (being the kernel of a homomorphism). Note that K=${\{h\in H \colon ht=th\}}$. It now follows from Lemma 3 that K is normal as a subgroup of G. Suppose K is not the trivial subgroup; then by Condition 3, K contains [G,G], and in particular ${r}$ belongs to K. But since ${trt^{-1}=\overline{r}=r^{-1}}$ (Equation (1)) we have

$\displaystyle \theta(r) = trt^{-1}r^{-1} = r^{-2} \neq e$

(since ${r\neq\overline{r}}$) and we get a contradiction. Therefore K is the trivial subgroup, and θ:H${\rightarrow}$[G,G] is indeed injective.
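To see the map θ in action, here is a small sketch in the dihedral group of order 2n with n odd, which is where the theorem says we must end up anyway. The pair `(a, e)` encodes $r^a t^e$; this encoding and the helper names are assumptions of mine, chosen purely for illustration.

```python
def mul(x, y, n):
    """Product in the dihedral group of order 2n, using t r t^{-1} = r^{-1}."""
    (a, e), (b, f) = x, y
    return ((a + (-b if e else b)) % n, (e + f) % 2)

def inv(x, n):
    a, e = x
    return (a, 1) if e else ((-a) % n, 0)

def theta_table(n):
    """Return the centralizer H of r, and theta(h) = t h t^{-1} h^{-1} for each h in H."""
    elems = [(a, e) for a in range(n) for e in (0, 1)]
    r, t = (1, 0), (0, 1)
    H = [h for h in elems if mul(h, r, n) == mul(r, h, n)]
    theta = {h: mul(mul(t, h, n), mul(inv(t, n), inv(h, n), n), n) for h in H}
    return H, theta

H, theta = theta_table(7)
print(len(H), theta[(1, 0)])                 # size of H, and theta(r) written as a pair
print(sorted(theta.values()) == sorted(H))   # theta permutes H, so it is a bijection
```

In this example H consists exactly of the powers of r, and θ sends $r^a$ to $r^{-2a}$, which is a bijection because n is odd.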

### Continuing the proof of Theorem 1

Let us take stock. We have shown that there exist elements ${r,t\in}$G such that:

• ${trt^{-1}=r^{-1}\neq r}$ (this follows from Equation (1) and the choice of ${t}$);
• ${r}$ has odd order, say N (this follows from the remarks after Equation (1));
• ${\langle r\rangle}$ is an index 2 subgroup in G (this follows from Equation (2) and the result just proved).

This is close to what is needed: it only remains to show that ${t^2=e}$ and that N is prime.

We will show that ${t^2}$ belongs to the centre of G, which combined with Condition 1 forces ${t^2=e}$. To do this, first observe that by Lemma 2, ${t^2}$ centralizes ${r}$, and hence centralizes ${\langle r\rangle}$. But since ${\langle r\rangle}$ is an index 2 subgroup of G and ${t\notin\langle r\rangle}$, every element of G belongs to either ${\langle r\rangle}$ or ${t\langle r\rangle}$, from which it follows that ${t^2}$ centralizes everything in G, as required.

Remark. The argument just given seems the quickest way to do things, given what we have already shown to date. However, it relies on knowing that ${\langle r\rangle}$ has index 2 in G. It may be of some interest to note that one can show ${t^2}$ centralizes G more directly, using only Equations (1) and (2). Thus, let ${x\in}$G. By Equation (2), there exists some integer k such that

$\displaystyle t^{-1}xtx^{-1} = r^k.$

Since ${trt^{-1}=r^{-1}}$, it follows that

$\displaystyle xtx^{-1}t^{-1} = t(r^k)t^{-1} = r^{-k}$

and so ${xtx^{-1}t^{-1} = (t^{-1}xtx^{-1})^{-1} = xt^{-1}x^{-1}t}$. Rearranging gives

$\displaystyle tx^{-1}t^{-1} = t^{-1}x^{-1}t$

so that ${t^2 x^{-1} = x^{-1}t^2}$. Since ${x}$ is arbitrary in G, ${t^2}$ belongs to the centre of G.

Finally, to finish things off, it suffices to show that N is prime. Let p be a prime factor of N. Then ${\langle r^p\rangle}$ is a proper subgroup of ${\langle r\rangle=[G,G]}$, and it is normal in G by Equation (1), since ${r^{-p}\in \langle r^p\rangle}$. Condition 3 therefore forces ${r^p=e}$, and so N=p. This completes the proof of Theorem 1.