0, our generalization does not presume that >is transitive. 4 (G) and j e { l , . . . , n } let there be an edge (e, j) in D(G4') with i( e j ) = (i e , j ) and i( e j ) = (i e , (j>(e)j). The natural projection p : G* —> G is a covering. Let A be a finite group. An ordinary voltage assignment (or, A-voltage assignment) of G is a function A with the property that 4>{e~l) = 0(e)" 1 for each e £ D(G). The G and q : G^ —> G to be isomorphic, we assume that they are isomorphic by a covering isomorphism <£ : G* —> G1''. Then $ | p - i ^ ) : p _1 (w) —> q~x(v) is a bijection between the n vertices {ui,i>2, • • • ,vn} for all v € V(G). Now, we define / : V(G) -> Sn by f(v) = $| p -i ( „) for all v € V(G). For an edge uv G D(G), if (u, h) is joined to (v, k) in G*, then 4>(uv)(h) = k and (u, f(u)(h)) is joined to (v, f(v)(k)) in G^ for any h. Thus, we have ip(uv)f(u) = f(v)<j>(uv), or ip(uv) = f(v)(j>(uv)f(u)~1 for all uv £ D(G). The authors showed that the converse is also true. Theorem 4 (24) Two n-fold coverings p : G& —» G and q : G^ —* G are isomorphic if and only if there exists a function f : V(G) —> Sn such that ip(uv) = f(y)4>(uv)f(u)~1 for eachuv € D(G). Moreover, if(t>,ijj S C\(G\n), then it is equivalent to say that there exists a permutation a £ Sn such that tjj(uv) = aip(uv)a~1 for each uv £ D(G) — D(T). By labeling the positively directed edges in D(G) — D(T) as ei,e2, ..., /3(G)i a normalized permutation voltage assignment can be identified as a /3(G)-tuple of permutations in Sn, and the set C^(G;n) can be identified as e {gsE) = gi if gse G Ugi. Then the function [4>(giSe)s-e] = g{. G is connected, then the group A becomes the covering transformation group. Let Iso (G;A) (resp. Isoc (G;A)) denote the number of nonisomorphic (resp. connected) regular *4-coverings. We use Iso (G;n) to denote the number of nonisomorphic regular n-fold coverings regardless of the group A involved. Similarly, we define Isoc (G;n). 
The algebraic characterization of two isomorphic graph coverings given in Theorem 4 can be rephrased for regular coverings as follows. Theorem 9 ( 16 ) Let G are connected, then they are isomorphic if and only if there exists a group isomorphism a : A —» B such that tp(uv) — a(<j>(uv)) for alluv£ D{G)-D{T). In particular, if two voltages 4> and \j) in G^(G;A) derive connected coverings, then their derived coverings are isomorphic if and only if there exists a group automorphism a € Aut(.4) such that ip(uv) = a((f>(uv)) for all uv £ D(G) - D{T). As the case of the set Cj.{G\ n), the set C^{G\ A) of A- voltage assignments of G can be identified as C^(G; A) = A X A x • • • x A, Sn be a permutation voltage assignment. Then the natural covering projection p,p : G^ —* G can be extended to a branched n-fold covering p^ : §* —> S that has at most one branch point inside each face. If the net voltage on a face R has cycle structure (ci,C2,. . • ,cn), then the projection p § is an n-fold branched covering such that each face of the embedding i : G —» § has at most one branch point interior of it, and no branch points in G, then there exists a permutation voltage assignment Sn such that the branched covering p^ : S* —» S is isomorphic to the given branched covering p : S —> S. (2) Let A be a finite group and let S and p^, : S^ —> S are isomorphic. We proved the following. Theorem 17 Let G be a graph 2-cell embedded in § and let cf> and ip be two voltage assignments of G. Then two branched coverings p$ : S^ —* S p^, : S^ —> S are isomorphic as surface branched coverings if and only if the two coverings p^ : G* —> G and p^, : G^ —> G are isomorphic as graph coverings. § — B;n), two branched n-fold surface coverings p^ : S^ —> S and p^p : S1^ —» S are isomorphic if and only if two graph coverings p Sjtjn). For a voltage assignment cj> in C1(*8m <—» §fc;n), the monodromy group of the corresponding covering is nothing but the subgroup < of Sn generated by the image of (m2). 
Since 1 1 2 showed that for triangulated graphs (graphs in which every cycle of length greater than 3 has a chord) and for either r-initial sets or fc-multiple of s sets T, the greedy algorithm obtains SPT{G) when applied to the reverse of a so-called perfect elimination ordering. SB is a conjugacy of irreducible shifts of finite type then for every b € SA there is an isomorphism of TT(SA, b) to Tc(SB,4>(b)) preserving the positivity structure. If the SFT are mixing we may take as the basepoints any two periodic points of sufficiently large, equal periods.
MULTIPLE ATTRIBUTES
Problems of judgment or choice whose objects have many dimensions or factors that contribute to their values or utilities are often referred to as multiple criteria or multiattribute problems. In such cases, $\succsim$ is applied to a subset $X$ of the Cartesian product set $X_1 \times X_2 \times \cdots \times X_n$, where $X_i$ is a set of levels of the $i$th attribute, criterion or factor. We say that the attributes are preferentially independent if, whenever $x, y, z, w \in X$, $x_i = z_i$, $y_i = w_i$, and $x_j = y_j$ and $z_j = w_j$ for all $j \neq i$, it is true that $x \succsim y \Leftrightarrow z \succsim w$. When this holds, there is a binary relation $\succsim_i$ on $X_i$ for each $i$ such that, whenever $x$ and $y$ are identical on all factors other than the $i$th, $x \succsim y \Leftrightarrow x_i \succsim_i y_i$. If attributes are preferentially independent and there is an importance hierarchy among factors such that factor $i$ is overwhelmingly more important than factor $j$ when $i < j$, holistic preferences will be lexicographic with $x \succ y$ if not$(x_i \sim_i y_i)$ for some $i$, and $x_i \succ_i y_i$ for the smallest such $i$. A more common situation under preferential independence arises when there are tradeoffs between factors, such as $(x_1, x_2) \sim (y_1, y_2)$ with $x_1 \succ_1 y_1$ and $y_2 \succ_2 x_2$. In such cases the additive conjoint measurement or additive utility model that represents the utility of an object as the sum of utilities for its factor levels may apply:

$$(x_1, \ldots, x_n) \succsim (y_1, \ldots, y_n) \Leftrightarrow \sum_{i=1}^{n} u_i(x_i) \ge \sum_{i=1}^{n} u_i(y_i),$$
where $u_i$ is a real-valued function on $X_i$. The additive utility model presumes weak order, preferential independence, and more involved independence conditions. The general independence or cancellation condition says that if $m \ge 2$, if $x^1, \ldots, x^m, y^1, \ldots, y^m \in X$ are such that $x_i^1, \ldots, x_i^m$ is a permutation of $y_i^1, \ldots, y_i^m$ for $i = 1, \ldots, n$, and if $x^j \succsim y^j$ for $j = 1, \ldots, m - 1$, then not$(x^m \succ y^m)$. If $X$ has sufficiently rich algebraic or topological structure, it may suffice to use only the $m = 2$ or $m = 3$ part of the general condition, and the $u_i$ may be unique up to similar positive affine transformations, which means that $(v_1, \ldots, v_n)$ satisfies the representation in place of $(u_1, \ldots, u_n)$ if and only if there are real numbers $\alpha > 0$ and $\beta_1, \ldots, \beta_n$ such that $v_i = \alpha u_i + \beta_i$ for every $i$. Other algebraic combinations of factor utilities that do not necessarily presume preferential independence can be considered. An example is the two-factor multiplicative model

$$(x_1, x_2) \succsim (y_1, y_2) \Leftrightarrow u_1(x_1)\,u_2(x_2) \ge u_1(y_1)\,u_2(y_2)$$
in which each $u_i$ can have negative as well as positive values. Other types of interdependence among factors that involve some degree of utility decomposition have also been investigated. A generalization of the additive utility model that does not presume transitivity is

$$\sum_{i=1}^{n} [\ldots]$$

where [...]
TIME STREAMS
The multiattribute formulation with $x = (x_1, x_2, \ldots, x_n)$ can be cast as a time-dependent process when $i$ indexes successive time periods and $x_i$ is the outcome for period $i$ in time stream $x$. When the number of periods is unbounded, we write $x = (x_1, x_2, \ldots)$. Utility representations in the preceding section apply to finite-periods cases and can be extended to denumerable-periods cases. An example with $X_1 = X_2 = \cdots$ is the weighted additive form

$$u(x_1, x_2, \ldots) = \sum_{i=1}^{\infty} \lambda_i u(x_i)$$

in which $u$ is a one-period utility function, $\lambda_i > 0$ is a weighting constant for period $i$, and both $u$ and $\sum \lambda_i$ are bounded to ensure convergence. If
$\lambda_1 > \lambda_2 > \lambda_3 > \cdots$, the future is discounted, and it is discounted at a constant rate if $\lambda_i = \lambda^i$ for some $0 < \lambda < 1$. But other patterns are plausible, as when the $\lambda_i$ increase for a time and then decrease toward zero. Although the $X_i$ need not be the same in a time-stream formulation, identical outcome sets for the periods allow comparisons associated with notions of persistence and impatience. Suppose the $X_i$ are identical, preferential independence applies, and $\succsim_i$ is the marginal preference relation on $X_0 = X_i$ for each $i$. We say that preferences are persistent or stationary if $\succsim_i \, = \, \succsim_j$ for all $i$ and $j$, and that they exhibit impatience if, whenever outcome $a$ is preferred to $b$, $(\ldots, a, \ldots, b, \ldots) \succ (\ldots, b, \ldots, a, \ldots)$, where the two streams are identical in all other periods. One cause of preference interdependence among periods is a desire for variety, for example in one's diet. If preference in period $i + 1$ depends only on the outcome in period $i$, we might consider the additive first-order Markovian model in which

$$u(x_1, x_2, \ldots) = u_1(x_1) + u_2(x_2, x_1) + u_3(x_3, x_2) + \cdots .$$

This can be specialized under relaxed notions of persistence and related aspects to $u_i(a, b) = \lambda_i u(a, b)$, $\lambda_i > 0$, for each $i \ge 2$. Comparable preference differences of section 3 can be considered in the time-stream formulation as a way of enriching the structure. For example, the utility difference representation coupled with a straightforward notion of persistent preference differences in the periods leads to the additive representation $u(x_1, \ldots, x_n) = \sum_i u_i(x_i)$ in the finite-periods setting.

6
CHOICE FUNCTIONS
We now consider a choice function $C$ defined on a family $\mathcal{A}$ of nonempty subsets of $X$ as the primitive construct with

$$\emptyset \subset C(A) \subseteq A \quad \text{for every } A \in \mathcal{A}.$$
A prominent theme for choice functions is their ability to be characterized by maximal elements of weak orders on $X$. For any such weak order $\succsim$ let

$$M(A, \succsim) = \{x \in A : x \succsim y \text{ for all } y \in A\}.$$

We say that $C$ is weak-order representable if there is a weak order $\succsim$ on $X$ such that $M(A, \succsim)$ is a nonempty subset of $C(A)$ for every $A \in \mathcal{A}$, and that $C$ is exactly representable if $C(A) = M(A, \succsim)$ for some weak order and all $A \in \mathcal{A}$.
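The two notions of representability can be illustrated on a small finite example. The sketch below is only an illustration under stated assumptions: the weak order is encoded by a hypothetical numeric `rank` (higher is better, ties mean indifference), a choice function is a dictionary from sets to chosen subsets, and all names are invented for the example.

```python
from itertools import combinations

def maximal_elements(A, rank):
    """M(A, >=): the elements of A weakly preferred to every element of A.

    `rank` maps each alternative to a number; higher means more preferred and
    equal numbers mean indifference, so `rank` encodes a weak order.
    """
    best = max(rank[x] for x in A)
    return {x for x in A if rank[x] == best}

def is_exactly_representable(C, rank):
    """True if C(A) equals M(A, >=) for every A in the domain of C."""
    return all(set(chosen) == maximal_elements(A, rank) for A, chosen in C.items())

def is_weak_order_representable(C, rank):
    """True if M(A, >=) is a (nonempty) subset of C(A) for every A."""
    return all(maximal_elements(A, rank) <= set(chosen) for A, chosen in C.items())

# Hypothetical example: X = {a, b, c}, with a and b indifferent and both above c.
rank = {"a": 2, "b": 2, "c": 1}
X = ["a", "b", "c"]
family = [frozenset(s) for r in (1, 2, 3) for s in combinations(X, r)]

# A choice function that picks exactly the maximal elements.
C_exact = {A: maximal_elements(A, rank) for A in family}

# A looser choice function that also admits c from the full set.
C_loose = dict(C_exact)
C_loose[frozenset(X)] = {"a", "b", "c"}
```

Here `C_exact` is exactly representable by `rank`, while `C_loose` is only weak-order representable by it, since its choice set from the full set strictly contains the maximal elements.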
Supposing that $\mathcal{A}$ contains every nonempty finite subset of $X$, the inclusion condition

$$[A \subseteq B \text{ and } A \cap C(B) \neq \emptyset] \Rightarrow C(A) = A \cap C(B)$$

implies that $C$ is exactly representable under the weak order $\succsim$ whose strict part is defined by

$$x \succ y \quad \text{if} \quad x \neq y \text{ and } C(\{x, y\}) = \{x\}.$$
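For a finite $X$ the inclusion condition can be checked by brute force, and the strict part can be read off the two-element choice sets. The following is a minimal sketch, assuming a choice function stored as a dictionary from frozensets to frozensets; the helper names and the two example choice functions are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(X):
    """All nonempty subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(s) for r in range(1, len(items) + 1)
            for s in combinations(items, r)]

def satisfies_inclusion(C):
    """Check: [A subset of B and A & C(B) nonempty] implies C(A) == A & C(B)."""
    return all(C[A] == A & C[B]
               for A in C for B in C
               if A <= B and A & C[B])

def strict_part(C, X):
    """Pairs (x, y) with x != y and C({x, y}) == {x}."""
    return {(x, y) for x in X for y in X
            if x != y and C[frozenset({x, y})] == frozenset({x})}

X = {"a", "b", "c"}
rank = {"a": 3, "b": 2, "c": 1}  # hypothetical strict ranking a > b > c

# Choice function that always picks the top-ranked element.
C_good = {A: frozenset(x for x in A if rank[x] == max(rank[y] for y in A))
          for A in nonempty_subsets(X)}

# Same, except that c is chosen from {a, c}, which creates a pairwise cycle.
C_cyclic = dict(C_good)
C_cyclic[frozenset({"a", "c"})] = frozenset({"c"})
```

With `A = {a, c}` and `B = {a, b, c}`, the cyclic example has `A & C(B) = {a}` but `C(A) = {c}`, so the inclusion condition fails for it.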
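When no special structure is presumed for $\mathcal{A}$, we define a revealed preference-or-indifference relation $\succsim_0$ on $X$ by

$$x \succsim_0 y \quad \text{if} \quad x \in C(A) \text{ and } y \in A \text{ for some } A \in \mathcal{A},$$

and consider its transitive closure $\succsim_0^{t}$ in the following condition:

$$[x \in A \in \mathcal{A},\ y \in C(A),\ x \succsim_0^{t} y] \Rightarrow x \in C(A).$$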
This condition, known as Richter's congruence axiom, is necessary and sufficient for exact representability. For every nonempty $\mathcal{B} \subseteq \mathcal{A}$, define $C(\mathcal{B})$ as the set of all $x$ in $\bigcup_{B \in \mathcal{B}} B$ such that $x \in C(B)$ for all $B \in \mathcal{B}$ that contain $x$. The modified congruence axiom $C(\mathcal{B}) \neq \emptyset$ for every nonempty finite $\mathcal{B} \subseteq \mathcal{A}$ implies that $C$ is weak-order representable when $C(A)$ is finite for every $A \in \mathcal{A}$. The implication can fail, however, if some choice sets are infinite. Many other conditions on choice functions have been proposed. An interesting example is Plott's path independence condition $C(A \cup B) = C[C(A) \cup C(B)]$. This says that choices from larger sets can be based on choices among choices from smaller sets that cover the larger sets. Under suitable structure for $\mathcal{A}$, path independence implies that $\succ$ based on $C(\{x, y\}) = \{x\}$ is a partial order. Other conditions, referred to as axioms of revealed preference, have been used in consumer economics as an alternative to utility maximization to explain choices of budget-restricted consumption bundles.

7
SOCIAL CHOICE FUNCTIONS
A social choice function is a mapping $F$ from a set $\mathcal{A} \times \mathcal{D}$, where $\mathcal{A}$ is as in the preceding section, into nonempty subsets of $X$ such that $F(\cdot, D)$ is a choice function for every $D \in \mathcal{D}$. Each $D$ is a data set that describes preferences or potential choices of a set of individuals or voters with respect to $X$ or $\mathcal{A}$, and $\mathcal{D}$ is a collection of such data sets. Members of $X$ are often referred to as candidates or alternatives, and data sets in $\mathcal{D}$ are sometimes called voter preference profiles. The social choice set $F(A, D)$ can be viewed as the candidates in $A$ most acceptable to the voters when $A$ is the feasible set of candidates and $D$ is the voter preference profile. The premier result (and challenge!) in social choice theory is Arrow's impossibility theorem [15]. Suppose that $F$ is a social choice function on $\mathcal{A} \times \mathcal{D}$ and that $X$ has at least three candidates, $\mathcal{A}$ contains every two-element subset of $X$, and $\mathcal{D}$ is the set of all $n$-tuples $D = (\succsim_1, \succsim_2, \ldots, \succsim_n)$ of weak orders on $X$. For each profile $D$, define $\succ_D$ on $X$ by

$$x \succ_D y \quad \text{if} \quad x \neq y \text{ and } F(\{x, y\}, D) = \{x\},$$
and let $x \succsim_D y$ denote not$(y \succ_D x)$. Arrow's theorem says that $F$ cannot simultaneously satisfy four apparently reasonable conditions:

1 (Pareto). For all $D$ and $\{x, y\}$, $x \succ_D y$ if $x \succ_i y$ for all $i$;
2 (Binariness). For all $D$, $D'$ and $\{x, y\}$, if $D$ and $D'$ are the same on $\{x, y\}$, then $\succ_D$ and $\succ_{D'}$ are the same on $\{x, y\}$;
3 (Social order). Every $\succsim_D$ is a weak order;
4 (No dictator). No $i$ is a dictator in the sense that for all $D$ and all $\{x, y\}$, $x \succ_i y \Rightarrow x \succ_D y$.

Arrow's impossibility theorem gave rise to a few dozen other theorems for $\mathcal{A} \times \mathcal{D}$ structures that identify conditions for $F$ that are mutually inconsistent. All have roots in an old observation known as Condorcet's paradox of cyclical majorities. Its simplest example uses three candidates and three voters with transitive preferences $x \succ_1 y \succ_1 z$, $z \succ_2 x \succ_2 y$ and $y \succ_3 z \succ_3 x$. Let $a \succ_m b$ mean that more voters prefer $a$ to $b$ than $b$ to $a$. Then $x \succ_m y \succ_m z \succ_m x$, so the simple majority relation $\succ_m$ is cyclic. Many other voting anomalies for elections with three or more candidates have been noted. For example, some widely used procedures have the property that a potentially victorious candidate turns into a loser after a profile is unambiguously changed in its favor. Moreover, virtually all election procedures are vulnerable to strategic misrepresentation whereby voters can secure the election of a preferred candidate by lying about their preferences. Together, impossibility theorems, voting paradoxes, and strategic misrepresentation indicate that there is no such thing as a fully acceptable election procedure when three or more candidates compete. In consequence, new procedures continue to be proposed and old ones rediscovered in attempts to find a better voting system. A recent example is approval voting, where each voter votes for a subset of candidates without ranking and the winner is the candidate with the most votes. This deceptively simple system has many nice features and has been adopted by several professional societies. It has also been promoted for party primary elections but has encountered strong resistance in that political sphere from people with vested interests who are averse to change.

8
SUBSET RANKING AND CHOICE
A popular construct for subset comparisons is a comparative probability relation $\succsim$ on a set $\mathcal{E}$ of events, or subsets of a state space $S$, that contains the empty event $\emptyset$ and the universal event $S$. We say that $(\mathcal{E}, \succsim)$ agrees with a probability measure $\mu$ on $\mathcal{E}$ if, for all $A, B \in \mathcal{E}$,

$$A \succsim B \Leftrightarrow \mu(A) \ge \mu(B).$$

Agreement entails weak order, $S \succ \emptyset$, $A \succsim \emptyset$ for all $A \in \mathcal{E}$, and independence or cancellation conditions similar to those of additive conjoint measurement. When $\mathcal{E}$ is infinite, an Archimedean axiom is also needed. The simplest cancellation condition,

$$A \succ A' \Leftrightarrow A \cup B \succ A' \cup B, \quad \text{provided } (A \cup A') \cap B = \emptyset,$$
suffices when a suitably strong Archimedean axiom is used. A variety of weaker representations have been proposed for comparative probability, including several that use intervals bounded by lower and upper measures. The $(\mathcal{E}, \succsim)$ formulation has also been used to address event ambiguity, a notion concerned with the difficulty in assessing probability. One axiom here is $A \sim (S \setminus A)$, which asserts that an event and its complement are equally ambiguous. A different subset concern is preference or choice among objects formulated as subsets such as committees, option packages, or meals. One approach considers relationships between preferences on single items and on subsets of items. A simple axiom in this case is: if $x \succ y$ and $A \cap \{x, y\} = \emptyset$ then $A \cup \{x\} \succ A \cup \{y\}$. However, interdependencies arising from substitutabilities, complementarities, and desires for variety or representativeness often invalidate such axioms and force consideration of interactions among items in viable evaluations of subsets. The subset comparison problem has given rise to the notion of a signed order as a more informative basis than preferences between single items in extending those preferences to subsets. Let $X$ denote the set of single items, and let $X^*$ denote a disjoint copy of $X$. We can think of $x^*$ as the negation or denial of $x$, with $(x^*)^* = x$ so that $X \cup X^*$ is closed under the $*$ operation. A signed order is a binary relation $\succsim$ on $X \cup X^*$ that satisfies $a \succsim b \Leftrightarrow b^* \succsim a^*$ for all $a, b \in X \cup X^*$. In the committee selection setting, $x \succ y$ indicates that you would rather have $x$ than $y$ on the committee, $x^* \succ y$ means that you would rather have $x$ not on than to have $y$ on, and so forth. Another recent notion related to subsets is that of joint receipt. This has been considered primarily for sums of money where there is a natural addition operation, but it applies to other entities as well. The question arises in the monetary setting as to whether an individual who receives distinct amounts $x$ and $y$, which could be gains or losses, evaluates them separately and then aggregates their values, or evaluates the package holistically after forming the sum $x + y$. With $\oplus$ denoting joint receipt, the related utility question is whether $u(x \oplus y) = f(u(x), u(y))$ or $u(x \oplus y) = u(x + y)$. A hedonic editing rule has been proposed to the effect that $u(x \oplus y) = \max\{u(x + y), u(x) + u(y)\}$, in which case the utility of $x \oplus y$ is the larger of the utility of the sum $x + y$ and the sum of the utilities of $x$ and $y$ considered separately.

9
LOTTERIES AND RISK
A lottery on a set $X$ is a probability distribution $p$ on $X$ for which $p(A) = 1$ for some finite $A \subseteq X$. Members of $X$ could be wealth levels, gains and losses, consumption bundles, multiattribute outcomes, time streams, candidates, or pure strategies. We consider a preference relation $\succsim$ on a set $P$ of lotteries on $X$ that is closed under convex combinations so that $\lambda p + (1 - \lambda) q$ is in $P$ when $p, q \in P$ and $0 < \lambda < 1$. A function $u$ on $P$ is linear if, for all $p, q \in P$ and all $0 < \lambda < 1$,

$$u(\lambda p + (1 - \lambda) q) = \lambda u(p) + (1 - \lambda) u(q).$$

When $P$ contains all degenerate lotteries and $u$ on $X$ is defined from $u$ on $P$ by $u(x) = u(p)$ when $p(x) = 1$, linearity implies expectation:

$$u(p) = \sum_{x} p(x)\,u(x) \quad \text{for all } p \in P.$$
It should be noted that our lottery structure with convex combinations disguises an empirically important issue of how people think of such combinations. For instance, in the example given below for axiom 2, ($4000 with pr. 0.2, $0 otherwise) can be viewed as a one-shot gamble, or as a two-stage lottery in which one receives ($4000 with pr. 0.8, $0 otherwise) with probability 1/4 or $0 with probability 3/4 in the first stage.
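The identification of the two-stage lottery with the one-shot gamble, and the linearity of the expectation form, can be verified with exact arithmetic. This is a minimal sketch: lotteries are represented as dictionaries from outcomes to probabilities, and the utility values in `u_table` are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction as Fr

def mix(lam, p, q):
    """Convex combination lam*p + (1 - lam)*q of two finite lotteries (dicts)."""
    out = {}
    for x in set(p) | set(q):
        out[x] = lam * p.get(x, 0) + (1 - lam) * q.get(x, 0)
    return {x: pr for x, pr in out.items() if pr != 0}

def expected_utility(p, u):
    """u(p) = sum over x of p(x) * u(x)."""
    return sum(pr * u[x] for x, pr in p.items())

# ($4000 with pr. 0.8, $0 otherwise) and the degenerate lottery on $0.
p_80 = {4000: Fr(4, 5), 0: Fr(1, 5)}
zero = {0: Fr(1)}

# Two-stage lottery: p_80 with probability 1/4, $0 with probability 3/4.
two_stage = mix(Fr(1, 4), p_80, zero)
one_shot = {4000: Fr(1, 5), 0: Fr(4, 5)}

# Hypothetical utilities on outcomes (an assumption, not from the text).
u_table = {0: Fr(0), 3000: Fr(9, 10), 4000: Fr(1)}
sure_3000 = {3000: Fr(1)}
lam = Fr(1, 3)
lhs = expected_utility(mix(lam, p_80, sure_3000), u_table)
rhs = lam * expected_utility(p_80, u_table) + (1 - lam) * expected_utility(sure_3000, u_table)
```

As probability distributions the two descriptions coincide, which is why the formal structure cannot distinguish them even if people perceive them differently.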
The fundamental theorem of expected utility identifies conditions on $(P, \succsim)$ that are necessary and sufficient for the existence of a real-valued linear order-preserving $u$ on $P$. They are, for all $p, q, r \in P$ and all $0 < \lambda < 1$:

1. Weak order;
2. $p \succ q \Rightarrow \lambda p + (1 - \lambda) r \succ \lambda q + (1 - \lambda) r$;
3. $p \succ q \succ r \Rightarrow \alpha p + (1 - \alpha) r \succ q \succ \beta p + (1 - \beta) r$ for some $\alpha, \beta \in (0, 1)$.
When these hold, $u$ is unique up to a positive affine transformation $v = au + b$ with $a > 0$. Additional axioms are needed to extend the expected utility form to $u(p) = \int u(x)\,dp(x)$ when $P$ is a set of probability measures on an algebra of subsets of $X$. There are also weaker versions of the fundamental theorem that replace weak order by partial order for $p \succ q \Rightarrow u(p) > u(q)$, or that omit the Archimedean axiom 3, in which case $u$ maps $P$ linearly into a multidimensional vector space ordered lexicographically. Axiom 2 is the notorious independence axiom that is often violated by expressed preferences. An example is (\$3000 with pr. 1) $\succ$ (\$4000 with pr. 0.8, \$0 otherwise) and (\$4000 with pr. 0.2, \$0 otherwise) $\succ$ (\$3000 with pr. 0.25, \$0 otherwise). Theories that weaken axiom 2 to accommodate such violations have been developed. One example has the weighted linear representation

$$p \succ q \Leftrightarrow \frac{u(p)}{w(p)} > \frac{u(q)}{w(q)},$$

where both $u$ and $w$ are linear and $w$ is positive. Another is the rank-dependent form: $p \succ q \Leftrightarrow u(p) > u(q)$, with $f$ an increasing map from $[0, 1]$ into $[0, 1]$ and

$$u(p) = u(x_1) + \sum_{j=2}^{n} [u(x_j) - u(x_{j-1})]\, f\!\left(\sum_{i=j}^{n} p(x_i)\right)$$

when $p$ has positive probabilities for $x_1, x_2, \ldots, x_n$, which are ordered by increasing preference. A third relaxation of the linear theory that allows preference cycles as well as violations of independence has

$$p \succ q \Leftrightarrow \varphi(p, q) > 0,$$

where $\varphi$ is a real-valued skew-symmetric function on $P \times P$ that is linear separately in each argument. The function $\varphi$ is often referred to as an SSB (skew-symmetric bilinear) utility function.
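To see how the rank-dependent form can accommodate the pattern in the \$3000/\$4000 example, note first that no expected utility function can: the first preference requires $u(3000) > 0.8\,u(4000) + 0.2\,u(0)$, while the second, multiplied through by 4, requires $0.8\,u(4000) + 0.2\,u(0) > u(3000)$. The sketch below evaluates the rank-dependent formula with two assumed ingredients that are not from the text: a linear outcome utility $u(x) = x$ and the hypothetical weighting function $f(t) = t/(2 - t)$, which is increasing with $f(0) = 0$ and $f(1) = 1$.

```python
from fractions import Fraction as Fr

def rank_dependent_utility(p, u, f):
    """u(x_1) + sum_{j>=2} [u(x_j) - u(x_{j-1})] * f(P(outcome is x_j or better)).

    `p` maps outcomes to positive probabilities; outcomes are ordered by
    increasing utility, which is assumed here to match increasing preference.
    """
    xs = sorted((x for x, pr in p.items() if pr > 0), key=u)
    total = u(xs[0])
    for j in range(1, len(xs)):
        tail = sum(p[x] for x in xs[j:])  # probability of x_j or better
        total += (u(xs[j]) - u(xs[j - 1])) * f(tail)
    return total

def u(x):
    """Hypothetical linear utility of money."""
    return Fr(x)

def f(t):
    """Hypothetical probability weighting function, f(t) = t / (2 - t)."""
    return t / (2 - t)

def identity(t):
    """With f the identity, the formula reduces to expected utility."""
    return t

A = {3000: Fr(1)}
B = {4000: Fr(4, 5), 0: Fr(1, 5)}
C = {4000: Fr(1, 5), 0: Fr(4, 5)}
D = {3000: Fr(1, 4), 0: Fr(3, 4)}
```

Under these assumptions the values are 3000 for A, 8000/3 for B, 4000/9 for C and 3000/7 for D, so A is ranked above B and C above D, matching the expressed preferences.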
Risk attitudes typically refer to curvature properties of an increasing and differentiable function $u$ on wealth or changes in wealth in the expected utility setting, but can also be formulated for nonlinear representations. Risk aversion applies when $u$ is concave, or when the expected value of a nondegenerate lottery is preferred to the lottery or its certainty equivalent. Risk seeking describes the opposite behavior. It is often observed that people are risk averse in gains and risk seeking in losses. The theory of stochastic dominance associates classes of utility functions on wealth with comparisons between the cumulative distribution functions of lotteries. When $p \neq q$, $p$ first-degree stochastically dominates $q$ if the cumulative of $p$ at $x$ is no greater than the cumulative of $q$ at $x$, for all $x$, and this is true if and only if the expected utility of $p$ is greater than the expected utility of $q$ for all increasing $u$. And $p$ second-degree stochastically dominates $q$ if the left-partial integral of the cumulative of $p$ is uniformly no greater than the left-partial integral of the cumulative of $q$, and this is true if and only if $p$'s expected utility exceeds $q$'s for all increasing, concave $u$. Notions of stochastic dominance have also been developed for multivariate distribution functions. When $X \subseteq X_1 \times X_2 \times \cdots \times X_n$, special conditions on $(P, \succsim)$ allow $u$ on $X$ for the expected utility model to be decomposed into functions $u_i$ on $X_i$ for each attribute. If $p \sim q$ whenever $p$ and $q$ have the same marginal distribution on $X_i$ for each $i$, $u$ has an additive decomposition $u(x_1, x_2, \ldots, x_n) = \sum_i u_i(x_i)$. If the preference order for each $i$ induced over marginal distributions on $X_i$ when levels of other attributes are fixed is independent of those fixed levels, then $u$ has a multiplicative if not additive decomposition. Decompositions have also been investigated for SSB functions and other nonlinear forms.

10
UNCERTAINTY
Our final category generalizes decision under risk to decision under uncertainty. Each potential decision in the uncertainty case is an act that assigns a consequence in $X$ to each state in $S$. The act set $F$ is a subset of $X^S$, the set of all maps from $S$ into $X$. The decision maker is uncertain about which state is the true state and cannot affect its occurrence by the act taken. Consequences in $X$ are the primary objects of value to the decision maker. Savage's theory [16] assumes that $F = X^S$ and $\mathcal{E}$ is the set of all subsets of $S$. It uses seven axioms for $(F, \succsim)$ to imply the existence of a unique subjective probability measure $\mu$ on $\mathcal{E}$ and a bounded utility function $u$ on $X$ such that, for all $f, g \in F$,

$$f \succsim g \Leftrightarrow \int_S u(f(s))\,d\mu(s) \ge \int_S u(g(s))\,d\mu(s).$$
The axioms imply that $S$ is infinite and that $u$ is unique up to a positive affine transformation. They include weak order, independence axioms, and an Archimedean condition. Numerous alternatives to Savage's SEU (subjective expected utility) theory have been developed. One group replaces consequences by lotteries in $P$, which facilitates derivation of the SEU model for finite $S$. If $S = \{1, 2, \ldots, n\}$ and lottery act $f$ assigns lottery $p_i$ to state $i$, we obtain the SEU form

$$u(f) = \sum_{i=1}^{n} \mu_i\, u(p_i)$$

where the $\mu_i$ are subjective probabilities and $u$ is a linear function on $P$. This approach has been used with a weak Archimedean condition to obtain a lexicographic SEU model in which subjective probabilities are matrices and utilities are multidimensional vectors ordered lexicographically. The lottery-act formulation also facilitates derivation of decompositional forms for multiattribute utilities. Two examples motivate theories that weaken the SEU representation but retain some of its key ideas. The first considers acts $f$ and $g$ for payoffs that depend on which face of a die comes up on one roll:

face   1       2      3      4      5      6
f      $1000   $900   $800   $700   $600   $500
g      $900    $800   $700   $600   $500   $1000

Even if an individual believes the die is balanced with probability $1/6$ for each face, the correlations between payoffs under each state may lead to $f \succ g$, or perhaps $g \succ f$. SEU theory requires $f \sim g$. The generalization of Savage's representation with

$$f \succ g \Leftrightarrow \int_S \varphi(f(s), g(s))\,d\mu(s) > 0,$$

where $\varphi$ is a skew-symmetric function on $X \times X$, accommodates such preferences. Its axioms are similar to Savage's once weak order has been relaxed to allow preference cycles. Ellsberg's famous urn example suggests a different form for $\mu$ rather than $u$. One ball is to be drawn randomly from an urn containing 90 balls: 30 are red (R) and 60 are black (B) and yellow (Y) in unknown proportion. Consider acts

$f$: win \$10,000 if R drawn, nothing otherwise
$g$: win \$10,000 if B drawn, nothing otherwise
$f'$: win \$10,000 if R or Y drawn, nothing otherwise
$g'$: win \$10,000 if B or Y drawn, nothing otherwise.
Many people prefer $f$ to $g$ and $g'$ to $f'$ for reasons of specificity. However, these preferences violate the SEU principle which says that if the only difference between $(f, g)$ and $(f', g')$ is that for some event $E$

$f(s) = g(s) = x$ for all $s \in E$
$f'(s) = g'(s) = y$ for all $s \in E$,

then $f \succ g \Leftrightarrow f' \succ g'$. When $f \succ g$ and $g' \succ f'$, subjective probabilities that reflect preferences in an obvious way cannot be additive. Nonadditive but monotonic subjective probabilities are involved in the representation

$$f \succsim g \Leftrightarrow \int_S u(f(s))\,d\sigma(s) \ge \int_S u(g(s))\,d\sigma(s),$$

where $\sigma$ is a capacity, i.e. a monotonic but not necessarily additive probability measure on $\mathcal{E}$, and integration is Choquet integration defined by

$$\int_S w(s)\,d\sigma(s) = \int_{t=0}^{\infty} \sigma(\{s : w(s) > t\})\,dt - \int_{t=-\infty}^{0} [1 - \sigma(\{s : w(s) > t\})]\,dt.$$
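On a finite state space the Choquet integral reduces to a rank-ordered sum, which makes the Ellsberg preferences easy to reproduce. The sketch below rests on assumptions that are not in the text: utilities $u(\$10{,}000) = 1$ and $u(\$0) = 0$, and a hypothetical capacity `sigma` that gives each event the smallest probability consistent with the description of the urn.

```python
from fractions import Fraction as Fr
from itertools import combinations

def choquet(w, sigma):
    """Choquet integral of w (dict: state -> number) against a capacity sigma
    (dict: frozenset of states -> number in [0, 1]) with sigma(S) = 1.

    For sorted distinct values v_1 < ... < v_k of w the integral equals
    v_1 + sum_{j>=2} (v_j - v_{j-1}) * sigma({s : w(s) >= v_j}).
    """
    values = sorted(set(w.values()))
    total = values[0]
    for prev, cur in zip(values, values[1:]):
        upper = frozenset(s for s in w if w[s] >= cur)
        total += (cur - prev) * sigma[upper]
    return total

states = ("R", "B", "Y")
events = [frozenset(c) for r in range(4) for c in combinations(states, r)]

# Hypothetical capacity: lower bound on each event's probability given
# 30 red balls and 60 black-or-yellow balls in unknown proportion.
sigma = {
    frozenset(): Fr(0),
    frozenset("R"): Fr(1, 3), frozenset("B"): Fr(0), frozenset("Y"): Fr(0),
    frozenset("RB"): Fr(1, 3), frozenset("RY"): Fr(1, 3), frozenset("BY"): Fr(2, 3),
    frozenset("RBY"): Fr(1),
}

# An additive measure with probability 1/3 on each colour, for comparison.
mu = {E: Fr(len(E), 3) for E in events}

# Acts as utility profiles over states (u = 1 for the prize, 0 otherwise).
f = {"R": 1, "B": 0, "Y": 0}
g = {"R": 0, "B": 1, "Y": 0}
f2 = {"R": 1, "B": 0, "Y": 1}   # f' in the text
g2 = {"R": 0, "B": 1, "Y": 1}   # g' in the text
```

With `sigma` the values are 1/3 for $f$, 0 for $g$, 1/3 for $f'$ and 2/3 for $g'$, so the representation ranks $f$ above $g$ and $g'$ above $f'$; with the additive `mu` the first pair ties and the second pair ties.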
This representation is referred to as Choquet expected utility, or CEU. Sufficiently strong structural assumptions imply that $\sigma$ is unique and $u$ is unique up to a positive affine transformation. It is instructive to note that, when $S$ is finite, the preceding integral turns into integration of a step function with a form similar to that of the preceding section for rank-dependent utility. Thus, CEU in the setting of uncertainty is analogous to rank-dependent utility in the lottery-based risk setting.

References

1. P. C. Fishburn, Stochastic utility. In: S. Barbera, P. J. Hammond, C. Seidl, eds. Handbook of Utility Theory, volume 1. New York: Kluwer, 1998, pp. 273-319.
2. D. H. Krantz, R. D. Luce, P. Suppes and A. Tversky, Foundations of Measurement, volume 1. New York: Academic Press, 1971.
3. F. S. Roberts, Measurement Theory. Reading, Massachusetts: Addison-Wesley, 1979.
4. P. C. Fishburn, Utility Theory for Decision Making. New York: Wiley, 1970.
5. P. C. Fishburn, Nonlinear Preference and Utility Theory. Baltimore, Maryland: Johns Hopkins University Press, 1988.
6. P. C. Fishburn, Utility and subjective probability. In: R. J. Aumann, S. Hart, eds. Handbook of Game Theory, volume 2. Amsterdam: Elsevier, 1994, pp. 1397-1435.
7. E. Karni and D. Schmeidler, Utility theory with uncertainty. In: W. Hildenbrand and H. Sonnenschein, eds. Handbook of Mathematical Economics, volume 4. Amsterdam: Elsevier, 1991, pp. 1763-1831.
8. C. Camerer and M. Weber, Recent developments in modeling preferences: uncertainty and ambiguity, J. of Risk and Uncertainty, 5: 325-370 (1992).
9. P. C. Fishburn, Lexicographic orders, utilities and decision rules: a survey, Management Sci., 20: 1442-1471 (1974).
10. C. R. Plott, Axiomatic social choice theory: an overview and interpretation, Amer. J. Pol. Sci., 20: 511-596 (1976).
11. A. K. Sen, Social choice theory: a re-examination, Econometrica, 45: 53-89 (1977).
12. A. Blais, The debate over electoral systems, Internat. Pol. Sci. Rev., 12: 239-260 (1991).
13. R. L. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Wiley, 1976.
14. P. P. Wakker, Additive Representations of Preferences. New York: Kluwer, 1989.
15. K. J. Arrow, Social Choice and Individual Values, second edition. New York: Wiley, 1963.
16. L. J. Savage, The Foundations of Statistics. New York: Wiley, 1954.
COMBINATORIAL ASPECTS OF MATHEMATICAL SOCIAL SCIENCE

K.H. KIM
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A. and Fellow, Korean Academy of Science and Technology (KAST)

F.W. ROUSH
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A.
We survey briefly mathematical methods in a number of social sciences, and discuss the most famous result in the theory of social welfare functions from the viewpoint of Boolean matrices, to demonstrate that the theory is both combinatorial and discrete. Lastly, we provide important open problems.
1
INTRODUCTION
The purpose of this paper is to indicate the general nature of mathematical social sciences for those new to this area, and then to go into somewhat more depth on one important topic. What is mathematical social science? Mathematical social science is an application of mathematics to social science problems. Currently, active areas of research include (1) mathematical economics; (2) mathematical psychology; (3) mathematical sociology and (4) game theory. These disciplines each publish their own research journals, and there exist many related journals. There is also a journal, Mathematical Social Sciences, published since 1981, which encompasses the above-mentioned areas. The theory of binary relations or Boolean matrices is an effective tool for attacking various areas of social science and in fact serves as one common thread among these areas.

2
LIST OF APPLICABLE MATHEMATICS IN SOCIAL SCIENCE

2.1
Archaeology
Mathematical methods of seriation have been used to date artifacts in archaeology. These methods involve studying similarities between artifacts and then finding an order of the artifacts which results in the least total change between one date and the next. The mathematics can involve matrix theory. This has some similarities with classification theory but is not the same, since the end product is an ordering on the different items, not a grouping of them into disjoint subsets.

2.2
Demography
In the nineteenth century, Thomas Malthus gave a mathematical theory of population. Demography is the forecasting of population by mathematical methods, based on birth and death rates for various age and class segments of the population. The basic methodology is finite difference equations, which can often be solved by matrix methods.

2.3
Economics and Econometrics
In the nineteenth century, David Ricardo translated some of Adam Smith's work into mathematics, such as the theory of comparative advantage in international trade. Cournot developed a mathematical theory of duopoly, 2 agent economic competition. That is, if AT& T were considered as competing against MCI only, we would have a duopoly. Bentham invented the idea of a mathematical utility function and its role in computing a welfare for society as a whole. Walras worked out aspects of modern competitive economics in detail. In particular he formulated the idea of a competitive equilibrium with a number of individuals. Economics is the most quantitative of the social sciences, and Walras's theory of competitive equilibrium is central to it. The basic ideas involved in this are, a utility function, or, equivalently, a preference function for each agent, for consumption (would he prefer a used car, or a vacation , a computer, and a country club membership), and a specification of the goods the agents can produce. An equilibrium is a set of prices at which supply for all goods equals demand for all goods, given that each individual can pay for what he buys out of what he sells. A variety of mathematical methods are used in this area: multidimensional calculus, linear programming, fixed point theorems, differential topology. Methods have been developed for computing equilibrium based on simplicial approximation. As we will discuss in greater length later, individual preferences are represented by numbers, utilities, the value of a situation to a certain person; the more he or she likes or desires or prefers the situation, the higher the utility. In market economies, it is expected that the outcome will be the result of a game played by the individual agents seeking to maximize their own utilities (again, to some degree). Thus, in these situations, one is interested in some
kind of equilibrium rather than an optimum. Before discussing the general theory of games, we mention some other areas of mathematical economics. Mathematical economists have produced models of nearly every conceivable situation bearing on economic life. Some are taken as qualitative, some are quantitatively fitted to reality using statistical methods. In the latter class belong the econometric models which are used to predict the performance of national economies, using a large number of aggregated economic quantities such as incomes in various sectors and aggregated demands estimated from the past behavior of consumers. These have been quite successful in predicting short-run behavior. This involves primarily statistics, which in turn involves a lot of linear algebra. Practical econometric models involve hundreds of variables, but a simplified example of an econometric model is the use of least squares to find a formula for the increase of wages over time. Some consider econometrics a separate discipline from mathematical economics. In long-term behavior it has been shown that many economic models are chaotic in the sense of the mathematical theory of dynamical systems. If this reflects the reality, then it may never be possible to uniformly predict economic behavior long in advance, as with the weather, unless government is able to control this behavior. It has been stated that no matter how powerful computers become, or how accurate weather measurements, it will never be possible to predict the weather in detail for more than 2 weeks in advance, due to intrinsic chaos in the mathematical sense. This involves chaos theory. Questions of motivation, incentives, and mechanisms have become very important in economic theory. One theme is, how can we design an economic system so that the individual agents, acting by self-interest, will act so as to benefit the group. 
For instance, one would want a situation in which "honesty is the best policy", efficiency pays off, and there is a basic fairness of the resulting distribution. To some degree the benefits of a free market are that it has many of these properties. Brams and Taylor's theory of fair division 5 has become well-known in the nineties. It deals with questions like the best way to divide goods among several players when the goods have different values to different players. In addition, is there some mechanism which under self-interest, will lead players to make fair divisions, like the classic method where one player cuts a cake and the other chooses which piece he wants. The mathematics involved is typically elementary algebra and probability.
2.4 Game Theory
Game theory analyzes situations of conflict between individuals, groups, or countries, who are viewed as competitors or players in the game. It was invented by von Neumann in 1928. Game theory is in part a branch of mathematics, insofar as its concepts are given and the question is to deduce their consequences; but it ties in to many mathematical social sciences in its attempts to define concepts that accurately reflect human behavior. Games are structures made up of a sequence of plays, in which each of a set of players chooses from a specified set of moves, given some information about previous moves, and obtains specified payoffs as a result. Each player has the ultimate goal of maximizing his or her utility in some sense, usually maximizing expected utility, but occasionally minimizing losses. The classical games of perfect information like chess have a complexity which is not much simplified by mathematics, and are not much related to most human situations. A class of games much more relevant to human interactions is that of n-person games consisting of one stage of play, in which each person chooses a strategy with no knowledge of what strategy the others choose. Thus, these are not games of perfect information, because the players do not know what moves the other players have made. A strategy refers to some action which the player can take: a poker player can bluff, a general can invade a country. To each n-tuple of plays there corresponds an n-tuple of expected utility payoffs. EXAMPLE. Probably the most famous 2-person game is the prisoner's dilemma, with payoff matrix something like
\[
\begin{pmatrix} (-1,-1) & (-10,0) \\ (0,-10) & (-9,-9) \end{pmatrix}
\]
where the rows are the first prisoner's strategies (stay silent, testify), the columns are the second prisoner's strategies in the same order, and each entry lists the payoffs (first prisoner, second prisoner).
Each of two prisoners (who collaborated on a crime) is held in isolation, and is offered immunity if he testifies against the other and the other does not testify against him. In that case the other receives a 10-year sentence. If both testify against each other, both get 9 years; but if neither talks, they can only be convicted of a lesser charge resulting in 1 year in prison. Whatever strategy the other prisoner chooses, each player acting in self-interest alone will choose to testify, with the result that both lose. It has been argued that this situation is at the heart of many real-life problems such as damage to the environment, where self-interest is at odds with a common interest which one player alone cannot secure.
As is, this is an example of a noncooperative game, where players are not allowed to communicate and form binding agreements. A noncooperative game can be like war, where the opposite sides do not meet together and jointly plan what they will do. The most important solution concept for noncooperative games is the Nash equilibrium. An n-tuple of strategies is in Nash equilibrium if and only if no single player by himself can achieve a higher payoff when all the other players keep their strategies the same. A game is said to be 0-sum if the sum of the players' utilities is always zero, meaning that whatever one player gains, another will lose. The number of players makes some difference in game theory; when there are only 2 players, game theory is simpler. The above prisoner's dilemma outcome is a Nash equilibrium, but Nash equilibrium solutions are much more convincing in some other situations like 2-person, 0-sum games. A cooperative n-person game is one in which the players can communicate and make binding agreements. This is like peace negotiations after a war, or labor negotiations, or political give-and-take within a country's government. If we extend the concept of Nash equilibrium to coalitions, that is, sets of players, by saying that no subset of the players, by choosing different strategies, can strictly improve the payoffs for all their members, we have a strong Nash equilibrium. This is one solution concept for cooperative n-person games. However, in a great many classes of games, existence of a strong Nash equilibrium is very unlikely, and all sorts of complicated schemes have been devised, such as the kernel, the nucleolus, and von Neumann-Morgenstern solutions. Game theory is important in modern mathematical economics in relation to the theory of incentive-compatible mechanisms, that is, social structures such that players have an incentive to act in a way which is socially beneficial.
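The Nash equilibrium condition can be checked mechanically for a small game. The following is a minimal sketch in Python, assuming the prisoner's dilemma payoffs described above with strategy 0 meaning "stay silent" and 1 meaning "testify"; the names `PAYOFFS` and `pure_nash_equilibria` are illustrative and do not come from the text.

```python
from itertools import product

# Payoffs as (first prisoner, second prisoner), indexed by
# (strategy of first, strategy of second); 0 = stay silent, 1 = testify.
PAYOFFS = {
    (0, 0): (-1, -1),
    (0, 1): (-10, 0),
    (1, 0): (0, -10),
    (1, 1): (-9, -9),
}

def pure_nash_equilibria(payoffs, n_strategies=2):
    """Return the strategy pairs from which no single player gains by deviating."""
    equilibria = []
    for s1, s2 in product(range(n_strategies), repeat=2):
        u1, u2 = payoffs[(s1, s2)]
        # Player 1 cannot improve by changing s1 while s2 stays fixed.
        best1 = all(payoffs[(t, s2)][0] <= u1 for t in range(n_strategies))
        # Player 2 cannot improve by changing s2 while s1 stays fixed.
        best2 = all(payoffs[(s1, t)][1] <= u2 for t in range(n_strategies))
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria

print(pure_nash_equilibria(PAYOFFS))
```

With these payoffs the only pair returned is mutual testimony, even though both players would be better off if both stayed silent.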
Most of game theory involves only algebra and probability theory, but some uses linear programming, convex sets, topology, fixed point theorems, even measure theory.
2.5 Political science
The theory of voting methods and their properties has a great overlap with the theory of social welfare functions, which has a more economic origin. The goal of this theory is to describe which voting methods, in elections with at least 3 candidates, are to be preferred to others under various conditions. For example, one common method is plurality-runoff, in which if no candidate receives a majority on the first vote, the two candidates who receive the largest numbers of votes are compared with each other in a 2 way vote. This has some advantages over the straight plurality method, but other methods
such as the Borda method (the ranks of candidates by voters are added and compared), approval voting 3, and cumulative voting have strong advocates. Two other areas of political science that have received a lot of mathematical study are the theory of parliamentary coalitions, an application of cooperative game theory (see, for instance, 4), and the theory of power indices such as the Shapley value. These have been used in legal settlements. These involve primarily combinatorial probability. Statistical methods are also widely used in political science.
2.6 Psychology
In the nineteenth century, Galton and Fechner introduced mathematical methods into psychology. Psychology is in a sense the foundational social science, as the individual is the unit of society. Quantification of perceptions, attitudes, values, preferences leads to measurement theory, which in general is the theory of characterizing various types of scales in terms of qualitative properties. A very simple example is that any weak order on a finite set can be represented by a utility function. A less elementary one involves the theory of additive conjoint measurement, in which axioms about how pairs or triples of quantities are related can give rise to an additive function relating various scales. Measurement theory involves the theories of ordered groups and n-ary relations. This is only the briefest mention of a large body of theory; for more see 7, 9, 18, 19, 25, 30. The first of these in particular is highly combinatorial. In a sense game theory is relevant to psychology as a theory of individual behavior.
2.7 Sociology
In the nineteenth century, Quetelet applied mathematics, specifically statistics, to the study of sociology. Mathematical sociologists have used the theory of clustering and related methods to analyze social groups. For some limited group of people one considers a number of binary relations on it, such as friendship, spending time together, requesting assistance, or confrontation. All these binary relations are represented as matrices, where an (i,j) entry may denote the strength of this relationship between person i and person j . Then the group is divided into subgroups (clusters) based on the analysis of these matrices. The mathematics is primarily matrix theory and graph theory. This kind of analysis has recently been extended to a theory of social networks n .
3 SOCIAL WELFARE FUNCTIONS (SWF)
A social welfare function, given a set of alternatives, expresses which choices are better than others for a given group of individuals. It is derived from knowing which alternatives are preferred to which others by the members of the group. A more combinatorial and discrete aspect of economics is the theory of social welfare functions 22, 23. Its most famous result won Kenneth Arrow a Nobel prize in 1972 for work done in 1951. Here we shall treat s.w.f. in terms of Boolean matrices rather than the usual relation-theoretic approach. For the latter, see Sen 31. Sen also won a Nobel prize in 1998 for his excellent contributions to welfare economics including social choice theory, welfare and poverty indices, and studies of famine. In 1994, John Harsanyi won a Nobel prize for his game theory solutions, and Reinhard Selten won a Nobel prize for his perfect equilibrium concept in game theory; John Nash won the same year for his solution concepts in game theory such as Nash equilibrium. We consider a group of m individuals (consumers, competitors, voters), denoted $M = \{1, 2, \dots, m\}$, who are faced with a group choice between n alternatives (candidates, goods, proposals, choices) $X = \{1, 2, \dots, n\}$. DEFINITION. The two-element Boolean algebra $\beta = \{0,1\}$ is as follows:
\[
\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 1 \end{array}
\qquad
\begin{array}{c|cc} \cdot & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array}
\]
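A matrix over $\beta$ is called a Boolean matrix. Let $B_n$ denote the set of all $n \times n$ Boolean matrices over $\beta$. For both algebraic and combinatorial properties of Boolean matrices see 14. Let $B_X$ denote the set of all binary relations on $X$. For $R \in B_X$ and $A \in B_n$, let
\[
a_{ij} = \begin{cases} 1 & \text{if } (i,j) \in R, \\ 0 & \text{otherwise.} \end{cases}
\]
This defines an isomorphism from $B_X$ to $B_n$ under which union and intersection correspond to Boolean addition and the elementwise product $\odot$. There is also an isomorphism from $B_n$ to $D_X$, where $D_X$ denotes the set of all directed graphs on $X$. EXAMPLE. Let $X = \{1,2,3\}$. Let $R = \{(1,2), (1,3), (2,3)\}$. Then
\[
A = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.
\]
Let $R = \{(1,2)\}$. Then
\[
A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]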
Most workers in this area use binary relations, since the basic datum that an individual prefers one alternative to another is a binary relation. However Boolean matrices have simple mathematical properties which enable us to give a simple proof of Arrow's theorem, and are closer to the main body of mathematics such as linear algebra, than are binary relations. Graphs are not much used by workers in social welfare functions but are often used in mathematical sociology, perhaps because they give a simple picture of social networks. Another, geometric approach is given in 27 - 28 .
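The correspondence between binary relations and Boolean matrices can be illustrated with a short Python sketch. The helper names below are hypothetical, and alternatives are assumed to be numbered 1 to n as in the text. The sketch builds the matrix of a relation and checks on one example that union and intersection correspond to Boolean addition and the elementwise product.

```python
def relation_to_matrix(relation, n):
    """Boolean matrix A with a_ij = 1 exactly when (i, j) is in the relation."""
    a = [[0] * n for _ in range(n)]
    for i, j in relation:
        a[i - 1][j - 1] = 1  # alternatives are numbered from 1
    return a

def boolean_add(a, b):
    """Entrywise Boolean sum (logical OR)."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def elementwise_product(a, b):
    """Entrywise Boolean product (logical AND)."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

R = {(1, 2), (1, 3), (2, 3)}
S = {(1, 2), (3, 1)}
A, B = relation_to_matrix(R, 3), relation_to_matrix(S, 3)

print(A)
print(boolean_add(A, B) == relation_to_matrix(R | S, 3))
print(elementwise_product(A, B) == relation_to_matrix(R & S, 3))
```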
Since we are going to work with Boolean matrices, we will restate the various order relations in terms of Boolean matrices.
(a) Reflexive relation, $\forall a \in X,\ (a,a) \in R$: $I \le A$, where $I$ is the identity matrix.
(b) Symmetric relation, $\forall a, b \in X,\ (a,b) \in R \Rightarrow (b,a) \in R$: $A = A^T$.
(c) Antisymmetric relation, $\forall a, b \in X,\ [(a,b) \in R \wedge (b,a) \in R] \Rightarrow a = b$: $A \odot A^T \le I$.
(d) Transitive relation, $\forall a, b, c \in X,\ [(a,b) \in R \wedge (b,c) \in R] \Rightarrow (a,c) \in R$: $A^2 \le A$.
(e) Complete relation, $\forall a, b \in X,\ (a,b) \in R \vee (b,a) \in R$: $A + A^T = J$, where $J$ is the matrix each of whose entries is 1.
(f) Weak (pre-) order relation (complete, transitive binary relation): $A + A^T = J$, $A^2 \le A$.
(g) Linear (total) order relation (complete, antisymmetric, transitive binary relation): $A \odot A^T \le I$, $A + A^T = J$, $A^2 \le A$.
(h) Quasiorder relation (reflexive, transitive binary relation): $I \le A$, $A^2 \le A$.
(i) Partial order relation (reflexive, antisymmetric, transitive binary relation): $I \le A$, $A \odot A^T \le I$, $A^2 \le A$.
Let $W_X$ denote the set of all weak orders on $X$. Let $L_X$ denote the set of all linear orders on $X$. Let $W_n$ be the set of all $n \times n$ matrices corresponding to $W_X$. Likewise, let $L_n$ denote the set of all $n \times n$ matrices corresponding to $L_X$. Then (1) $|L_n| = n!$, (2) $|W_n| = \sum_{k=1}^{n} S(n,k)\,k!$, where $S(n,k)$, the Stirling number of the second kind, is the number of equivalence relations on a set of n elements having k equivalence classes.
  n     |L_n|      |W_n|
  1     1          1
  2     2          3
  3     6          13
  4     24         75
  5     120        541
  6     720        4683
  7     5040       47293
  8     40320      545835
  9     362880     7087261
  10    3628800    102247563
DEFINITION. A profile is an m-tuple of linear orders, that is, an element of $(L_n)^m$.
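The counts of linear and weak orders can be cross-checked for small n by brute force. The sketch below is only an illustration, with hypothetical function names: it enumerates all n x n Boolean matrices, tests the matrix conditions for completeness, transitivity and antisymmetry, and compares the number of weak orders with the Stirling-number formula. The enumeration grows as 2^(n^2), so it is only practical for n up to about 4.

```python
from functools import lru_cache
from itertools import product
from math import factorial

def is_complete(a):
    n = len(a)
    return all(a[i][j] or a[j][i] for i in range(n) for j in range(n))

def is_transitive(a):
    # A^2 <= A: whenever a_ik = a_kj = 1 we also need a_ij = 1.
    n = len(a)
    return all(a[i][j] for i in range(n) for k in range(n) for j in range(n)
               if a[i][k] and a[k][j])

def is_antisymmetric(a):
    n = len(a)
    return all(not (a[i][j] and a[j][i])
               for i in range(n) for j in range(n) if i != j)

def count_orders(n):
    """Return (number of linear orders, number of weak orders) on n elements."""
    linear = weak = 0
    for bits in product((0, 1), repeat=n * n):
        a = [bits[r * n:(r + 1) * n] for r in range(n)]
        if is_complete(a) and is_transitive(a):
            weak += 1
            if is_antisymmetric(a):
                linear += 1
    return linear, weak

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def weak_order_formula(n):
    return sum(stirling2(n, k) * factorial(k) for k in range(1, n + 1))

for n in range(1, 4):
    print(n, count_orders(n), factorial(n), weak_order_formula(n))
```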
EXAMPLE. Suppose there are 3 committee members, John, Joe, and Smitty. They have to choose between 3 plans a, b, c for investing the committee's money. John prefers a to b and b to c. Joe prefers b to c and c to a. Smitty prefers a to c and c to b. The profile would be made up of these three linear orders, which as Boolean matrices (rows and columns in the order a, b, c) are as follows:
\[
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}.
\]
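The three matrices of this example can be generated from the rankings themselves. The following Python sketch assumes that a ranking is written as a string from most preferred to least preferred, and that the matrix entry for (x, y) is 1 when x is ranked at least as high as y; the function name `linear_order_matrix` is illustrative.

```python
def linear_order_matrix(ranking, alternatives="abc"):
    """Boolean matrix of a linear order given as a best-to-worst ranking."""
    position = {x: r for r, x in enumerate(ranking)}
    return [[1 if position[x] <= position[y] else 0 for y in alternatives]
            for x in alternatives]

# Rankings of John, Joe and Smitty, best alternative first.
profile = {"John": "abc", "Joe": "bca", "Smitty": "acb"}
for name, ranking in profile.items():
    print(name, linear_order_matrix(ranking))
```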
We will denote the elements of a profile as $R(1), \dots, R(m)$. In Boolean matrix terms, we write them as $A(1), \dots, A(m)$. The linear orders can be called preference relations in the sense that they express the preferences of the individual voters. On the other hand they are linear orders also. Linear order is a more general concept insofar as it can apply to many other situations than individual or social choice, such as the linear order on the real numbers. Preference relation is restricted to linear order in this article to make possible a treatment by Boolean matrices, but other authors consider preference relations as being weak orders.
DEFINITION. A social welfare function (SWF) is a function $F : (L_n)^m \to W_n$.
Thus in this definition, any function is allowed (the range being $W_n$ implicitly gives transitivity and completeness). Later additional restrictions are considered. The notation $(L_n)^m$ refers to the set of m-tuples $(R(1), \dots, R(m))$, so that we can write a social welfare function in the form $F(R(1), \dots, R(m))$. EXAMPLE. Let $F(A(1), \dots, A(m)) = A(1)$. Then the group preference is the same as that of individual 1. If we take the profile in the last example, then we get
\[
F\left(
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
\right) =
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.
\]
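EXAMPLE. Let $F(A(1), \dots, A(m)) = J$. Then the group is indifferent between any two alternatives. If we apply this to the profile in the last example, we will get
\[
F\left(
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
\right) =
\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}.
\]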
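EXAMPLE. Suppose $m = n = 2$. A social welfare function (chosen at random) can be written out as a table giving, for each of the four profiles $(A(1), A(2))$ in $(L_2)^2$, a value in $W_2$. Here
\[
L_2 = \left\{ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\ \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \right\},
\qquad
W_2 = L_2 \cup \left\{ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \right\}.
\]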
There are many other social welfare functions which are somewhat more realistic, but they are much more complicated to define. The total number of SWF is
\[
|W_n|^{|L_n|^m} = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m}.
\]
The number of dictatorial SWF is m, so the number of nondictatorial SWF is
\[
|W_n|^{|L_n|^m} - m = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m} - m.
\]
It is difficult to count the number of Pareto optimal SWF, but a related condition, unanimity, would suffice for Arrow's theorem: whenever all voters have the same linear order, this is also the group preference. The number of unanimous SWF is
\[
|W_n|^{|L_n|^m - |L_n|} = \Bigl( \sum_{k=1}^{n} S(n,k)\,k! \Bigr)^{(n!)^m - n!}.
\]
This is because a unanimous SWF is completely determined when it is specified on the set of all profiles where not all voters have the same linear order; in this count we remove the $n!$ profiles where all voters have the same linear order. The number of SWF which satisfy Independence of Irrelevant Alternatives is much smaller and will be stated in the main theorem.
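To get a feel for how quickly these counts grow, the formulas above can be evaluated directly. The sketch below is a plain transcription of the three counts (total, nondictatorial, unanimous) into Python; the function names are hypothetical, and the number of weak orders is computed from the Stirling-number sum.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def weak_orders(n):
    """|W_n|: number of weak orders on n alternatives."""
    return sum(stirling2(n, k) * factorial(k) for k in range(1, n + 1))

def swf_counts(m, n):
    """Counts of SWF for m individuals and n alternatives."""
    profiles = factorial(n) ** m          # |L_n|^m
    total = weak_orders(n) ** profiles    # one weak order per profile
    return {
        "total": total,
        "nondictatorial": total - m,
        "unanimous": weak_orders(n) ** (profiles - factorial(n)),
    }

print(swf_counts(2, 2))
```

Already for m = 2 and n = 3 there are 13^36 social welfare functions, which is why the later conditions are needed to single out interesting ones.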
Since individuals in a free society think and behave differently, individuals demonstrate different degrees of various attributes and characteristics. The following three properties safeguard the basic human rights of individuals in the group and fairness to them. For the benefit of the readers, we will first state these attributes in terms of relations, then state them in terms of Boolean matrices. There are many other attributes. For a thorough treatment, see 31 .
($\alpha$) Pareto Optimality (PO). DEFINITION. A SWF is P.O. if whenever everybody in the group strictly prefers a to b, then the group always strictly prefers a to b (a, b here are only two out of n alternatives in X). Therefore a SWF is P.O. if and only if
\[
\forall i,j \in X,\ [\forall k \in M,\ R(k) \in L_X \wedge (i,j) \in R(k) \wedge (j,i) \notin R(k)]
\Rightarrow (i,j) \in F(R(1),\dots,R(m)) \wedge (j,i) \notin F(R(1),\dots,R(m)).
\]
Equivalently a SWF is P.O. if and only if
\[
\forall i,j \in X,\ [\forall k \in M,\ A(k) \in L_n \wedge a(k)_{ij} = 1 \wedge a(k)_{ji} = 0]
\Rightarrow F(A(1),\dots,A(m))_{ij} = 1 \wedge F(A(1),\dots,A(m))_{ji} = 0.
\]
EXAMPLE. The Borda social welfare function is computed by having each individual i assign a rank $r_{ij}$ to alternative j, where for each i, the $r_{ij}$ are $1, 2, \dots, n$ in some order. Then the group ranks the alternatives according to the magnitude of $\sum_i r_{ij}$.
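A minimal Python sketch of the Borda computation follows. It assumes, as a convention not fixed by the text, that rank 1 is the most preferred alternative, so that a smaller rank sum means a higher group position; ties in the rank sums give indifference, so the result is a weak order. The function name `borda` is illustrative.

```python
def borda(profile, alternatives="abc"):
    """Return (rank sums, Boolean matrix of the resulting weak order)."""
    totals = {x: 0 for x in alternatives}
    for ranking in profile:                      # ranking is best-to-worst
        for rank, x in enumerate(ranking, start=1):
            totals[x] += rank
    # Entry (x, y) is 1 when x has a rank sum no larger than that of y.
    matrix = [[1 if totals[x] <= totals[y] else 0 for y in alternatives]
              for x in alternatives]
    return totals, matrix

# The committee profile used earlier: John, Joe, Smitty.
totals, matrix = borda(["abc", "bca", "acb"])
print(totals)
print(matrix)
```

For this profile the rank sums are 5, 6 and 7 for a, b and c, so under the stated convention the group order is a, then b, then c.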
($\beta$) Nondictatorial Condition (NC). DEFINITION. A SWF is nondictatorial if and only if there is no individual in the group such that if he prefers a to b, then the group always prefers a to b. Therefore a SWF is nondictatorial if and only if (where $k_1$ denotes the potential dictator)
\[
\forall k_1 \in M\ \exists (R(k)) \in (L_X)^m\ \exists i,j \in X :
(i,j) \notin R(k_1) \wedge (j,i) \in R(k_1) \wedge [(j,i) \notin F(R(1),\dots,R(m)) \vee (i,j) \in F(R(1),\dots,R(m))].
\]
Equivalently, a SWF is nondictatorial if and only if
\[
\forall k_1 \in M\ \exists (A(k)) \in L_n^m\ \exists i,j \in X :
a(k_1)_{ij} = 0 \wedge a(k_1)_{ji} = 1 \wedge [F(A(1),\dots,A(m))_{ji} = 0 \vee F(A(1),\dots,A(m))_{ij} = 1].
\]
EXAMPLE. The Borda social welfare function is also nondictatorial.
($\gamma$) Independence of Irrelevant Alternatives (IIA). DEFINITION. A SWF is IIA if and only if the group choice between alternatives a and b depends only on how the individuals feel about a and b, not on their choices regarding other alternatives. A SWF is IIA if and only if
\[
\forall i,j \in X,\ \bigl[\forall k \in M,\ R(k) \in L_X \wedge S(k) \in L_X \wedge [(i,j) \in R(k) \Leftrightarrow (i,j) \in S(k)] \wedge [(j,i) \in R(k) \Leftrightarrow (j,i) \in S(k)]\bigr]
\Rightarrow \bigl[(i,j) \in F(R(1),\dots,R(m)) \Leftrightarrow (i,j) \in F(S(1),\dots,S(m))\bigr].
\]
Equivalently, a SWF is IIA if and only if
\[
\forall i,j \in X,\ \bigl[\forall k \in M,\ A(k) \in L_n \wedge B(k) \in L_n \wedge [a(k)_{ij} = b(k)_{ij}] \wedge [a(k)_{ji} = b(k)_{ji}]\bigr]
\Rightarrow F(A(1),\dots,A(m))_{ij} = F(B(1),\dots,B(m))_{ij}.
\]
EXAMPLE. The Borda social welfare function is not independent of irrelevant alternatives. We will show that very few social welfare functions are independent of irrelevant alternatives. Dictatorial social welfare functions, antidictatorial ones (which reverse the preferences of a fixed individual), and constant social welfare functions satisfy ($\gamma$).
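A concrete failure of IIA for a Borda-type rule can be exhibited with two voters. The Python sketch below uses two hypothetical profiles chosen for illustration, and again assumes that rank 1 is best so that a smaller rank sum is better. In both profiles the first voter prefers a to b and the second prefers b to a, yet the group ranking of a against b differs because the position of c changes.

```python
def borda_totals(profile):
    """Rank sums for each alternative; rankings are best-to-worst strings."""
    totals = {}
    for ranking in profile:
        for rank, x in enumerate(ranking, start=1):
            totals[x] = totals.get(x, 0) + rank
    return totals

def group_prefers(profile, x, y):
    """True when the Borda rule ranks x strictly above y."""
    totals = borda_totals(profile)
    return totals[x] < totals[y]

def same_pairwise_views(p, q, x, y):
    """True when every voter ranks x against y the same way in p and in q."""
    return all((r.index(x) < r.index(y)) == (s.index(x) < s.index(y))
               for r, s in zip(p, q))

P = ["abc", "bca"]
Q = ["acb", "bac"]
print(same_pairwise_views(P, Q, "a", "b"))
print(group_prefers(P, "b", "a"), group_prefers(Q, "a", "b"))
```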
Arrow originally proved his theorem on the domain $W_X$. For a simple Boolean matrix treatment, it is necessary to deal with the subset $L_X$. However, to some degree this can be considered a stronger theorem since the assumptions on the SWF are not as strict. It could be shown that a dictator on $L_X$ must be a dictator on $W_X$ also.
ARROW'S IMPOSSIBILITY THEOREM. No SWF has attributes ($\alpha$), ($\beta$), ($\gamma$) for $|X| \ge 3$.
Characterization Theorem. All SWF satisfying ($\gamma$) are specified as follows. There is a certain fixed weak order on X. It determines group preferences between blocks. Within each equivalence class of this weak order, there is a dictator, an antidictator, or the s.w.f. is constantly J. The dictator or antidictator must be the same in each block, and in blocks below a constant block there is always a constant block.
Proof: Let $A(p)$ denote the linear order of individual p. Note that by antisymmetry of linear orders, for any $A(p) \in L_n$ and $i, j \in X$ with $i \ne j$, $a(p)_{ij} = a(p)_{ji}^{c}$, where c denotes the Boolean complement. Independence of irrelevant alternatives ($\gamma$) on the domain of linear orders means that the (i, j)-entry of the social welfare function is some function $f_{ij}(v)$ where $v = (a(1)_{ij}, \dots, a(m)_{ij})$ is the vector of (i, j)-entries in the individual preference matrices $A(1), \dots, A(m)$. That is, the group choice between i, j depends only on all individual preferences between i, j, and this vector specifies those. Completeness of the social welfare function means $f_{ij}(v) + f_{ji}(v^c) = 1$ always, where $v^c$ denotes the vector whose entries are the complements of the entries of v. This follows from the Boolean matrix formulation of completeness above. The definition of s.w.f. implies completeness and transitivity (weak order). Take any three alternatives i, j, k, using our assumption that $|X| \ge 3$, and let u, v, w be the vectors of individual preferences for i to j; j to k; i to k. That is,
\[
u = (a(1)_{ij}, \dots, a(m)_{ij}),\qquad
v = (a(1)_{jk}, \dots, a(m)_{jk}),\qquad
w = (a(1)_{ik}, \dots, a(m)_{ik}).
\]
We wish to consider exactly what the possibilities are for u, v, w. Transitivity says that if the (i, j) and (j, k) entries are one, so is the (i, k) entry. So $w \ge uv$. It also says that if the (i, j) and (j, k) entries are zero, so that an individual prefers j to i and k to j, then he prefers k to i, so the (i, k) entry is zero. Therefore $w \le u + v$. We check, for any individual linear order $R(s)$, that the 6 linear orders on i, j, k produce exactly the 6 possibilities for $u_s, v_s, w_s$ that satisfy $uv \le w \le u + v$. Specifically:
  Preference order   u_s   v_s   w_s
  ijk                1     1     1
  ikj                1     0     1
  jik                0     1     1
  jki                0     1     0
  kij                1     0     0
  kji                0     0     0
Then the Boolean matrix interpretation of transitivity says that
\[
f_{ij}(u)\, f_{jk}(v) \le f_{ik}(w) \qquad (1)
\]
whenever the vectors u, v, w satisfy $uv \le w \le u + v$. Pareto optimality says that $f_{ij}(0, \dots, 0) = 0$, $f_{ij}(1, \dots, 1) = 1$. If we let $w = u$, $v = \mathbf{1}$, where $\mathbf{1} = (1, 1, \dots, 1)$, inequality (1) gives $f_{ij}(u) \le f_{ik}(u)$ for every u. By symmetry $f_{ik}(u) = f_{ij}(u)$. Likewise $f_{ki}(u) = f_{kj}(u)$. It follows for a domain of at least 3 alternatives that all the $f_{ij}$ are equal, and we can write them as $f$. Transitivity then says
\[
uv \le w \le u + v \Rightarrow f(u)\, f(v) \le f(w) \qquad (2).
\]
Therefore $f(u) = f(v) = 1 \Rightarrow f(uv) = 1$ (set $w = uv$). And $u \le w \Rightarrow f(u) \le f(w)$ (set $v = \mathbf{1}$). Completeness says that for all v, either $f(v) = 1$ or $f(v^c) = 1$. In set-theoretic terms, $S^* = \{ z \in V_m \mid f(z) = 1 \}$ is an ultrafilter. It is nonempty since it contains $\mathbf{1}$ by ($\alpha$). Therefore it has a minimal element w. By PO ($\alpha$), $w > 0$. Suppose w has at least two one entries. Let $0 < z < w$. Then $f(z) = 0$, $f(z^c) = 1$. Therefore $f(w z^c) = 1$, $w z^c < w$. This contradicts w being minimal. So w has a single 1 entry, say in location s, and $z \ge w \Rightarrow f(z) = 1$. Thus $z_s = 1 \Rightarrow f(z) = 1$. Since the product of any two minimal vectors of $S^*$ is less than either, the minimal element of $S^*$ is unique, and $f(z) = 1 \Leftrightarrow z_s = 1$. Therefore for all $i, j \in X$, $z \in V_m$, $f_{ij}(z) = f(z) = z_s$. This says individual s is a dictator. This contradicts ($\beta$). We briefly outline the arguments needed instead to prove the characterization theorem when we assume IIA only. We define a fixed weak order
$W(L)$ by saying that $w(L)_{ij} = 0$ if and only if
\[
\forall R(1), \dots, R(m) \in L_X^m,\quad F(R(1), \dots, R(m))_{ij} = 0.
\]
The transitivity relation (1) above still holds, and it implies that if $w(L)_{ij} = 1$ then
\[
\forall k \in X,\quad f_{ik} \ge f_{jk},\quad f_{ki} \le f_{kj} \qquad (3).
\]
This means within each equivalence class, all the functions f are equal. On this class, completeness gives 3 possibilities: $f(0) = 1, f(\mathbf{1}) = 0$; $f(0) = 0, f(\mathbf{1}) = 1$; $f(\mathbf{1}) = f(0) = 1$. By the arguments above these give an antidictator, a dictator, or a constant J within the matrix block represented by that equivalence class. Completeness implies that $\forall i, j \in X$, $w(L)_{ij} = 0 \Rightarrow w(L)_{ji} = 1$. The inequalities (3) imply that below a dictator or antidictator there can be only a constant or the same dictator or antidictator. Below a constant block there can be only a constant block. $\Box$
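The table of six preference orders used in the proof of Arrow's theorem above can be verified by enumeration. In the Python sketch below (names are illustrative), each linear order on i, j, k is encoded as a best-to-worst string, the bits u, v, w record whether i beats j, j beats k, and i beats k, and the resulting triples are compared with the Boolean triples satisfying uv <= w <= u + v.

```python
from itertools import permutations, product

def preference_bits(order):
    """Bits (u, v, w) for one individual's linear order on i, j, k."""
    position = {x: r for r, x in enumerate(order)}
    u = int(position["i"] < position["j"])   # i preferred to j
    v = int(position["j"] < position["k"])   # j preferred to k
    w = int(position["i"] < position["k"])   # i preferred to k
    return u, v, w

from_orders = {preference_bits(p) for p in permutations("ijk")}

# Boolean product is AND, Boolean sum is OR.
allowed = {(u, v, w) for u, v, w in product((0, 1), repeat=3)
           if (u and v) <= w <= (u or v)}

print(sorted(from_orders))
print(from_orders == allowed)
```

The two excluded triples are (1, 1, 0) and (0, 0, 1), which are exactly the intransitive patterns.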
COROLLARY. The number of s.w.f. satisfying IIA is
\[
\sum_{k=1}^{n} \Bigl( \sum_{r=0}^{k} S(n,k,r)\, k!\, (2mr + 1) \Bigr).
\]
Here $S(n,k,r)$ denotes the number of equivalence relations with k equivalence classes among which r equivalence classes have at least 2 members. Proof: For each weak order chosen as $W(L)$, and each value of j from 1 to k, we can choose that at or above level j there is a dictator or antidictator from m possible individuals and below is a constant function. This gives $2mr$ choices for each weak order of this type. Then we add in the case when all equivalence classes are constant. The special case $r = 0$ also fits the formula.
A number of other attributes have been defined and studied for social welfare functions. DEFINITION. A social welfare function F is anonymous if the following holds, where $S_m$ denotes the symmetric group:
\[
\forall \pi \in S_m\ \forall R(1), \dots, R(m) \in L_X^m,\quad F(R(1), \dots, R(m)) = F(R(\pi(1)), \dots, R(\pi(m))).
\]
This means that the identity of the individuals makes no difference in the social choice, that everyone plays a symmetrical role. DEFINITION. A social welfare function F is neutral if the following holds, where for any binary relation R and permutation $\pi$ of X, $R^{\pi}$ denotes the binary relation such that $\forall x, y \in X$, $\pi(x) R^{\pi} \pi(y) \Leftrightarrow xRy$:
\[
\forall \pi \in S_n,\ R(1), \dots, R(m) \in L_X^m,\quad F(R(1), \dots, R(m))^{\pi} = F(R(1)^{\pi}, \dots, R(m)^{\pi}).
\]
Being neutral means that every alternative is treated in an equal and symmetrical way, so that the ordering of the alternatives makes no difference. DEFINITION. An s.w.f. F has positive responsiveness provided that whenever $(R(1), \dots, R(m))$ and $(S(1), \dots, S(m))$ are two profiles and $x \in X$ is such that
\[
\forall y, z \in X,\ z \ne x,\ \forall j \in M,\quad yR(j)z \Rightarrow yS(j)z,
\]
we have the following:
\[
\forall y \in X,\quad xF(R(1), \dots, R(m))y \Rightarrow xF(S(1), \dots, S(m))y,
\]
\[
\forall y \in X,\quad yF(S(1), \dots, S(m))x \Rightarrow yF(R(1), \dots, R(m))x.
\]
The meaning of this is that if we change the preferences so as to increase the position of x, keeping other relative positions the same, the position of x is not decreased for the group choice. This hypothesis is more or less equivalent to ($\alpha$), ($\beta$), and also implies strategy-proofness, so it is very strict. In fact it is so strict that on the full domain $L_X^m$, $n > 2$, it implies that the function is dictatorial. This corresponds to the concept of monotonicity for social choice functions. EXAMPLE. The Borda method has positive responsiveness, as do methods of this general rank-sum type.
EXAMPLE. An s.w.f. can be defined with positive responsiveness which is not strictly a rank-sum method by considering pairwise majority votes.
Define a relation $C_0$ (Condorcet) by $x C_0 y$ if and only if a majority of voters would prefer x to y in a contest between those two only. Now rank the alternatives according to the number of other alternatives which they would defeat in pairwise contests, i.e. $|\{ y \mid x C_0 y \}|$.
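The pairwise-majority ranking just described is easy to sketch in Python. The function names are hypothetical, rankings are assumed to be best-to-worst strings, and "majority" is taken to mean strictly more than half of the voters. The committee profile from the earlier example is reused as input.

```python
def majority_prefers(profile, x, y):
    """True when strictly more than half of the voters rank x above y."""
    votes_for_x = sum(1 for r in profile if r.index(x) < r.index(y))
    return votes_for_x > len(profile) - votes_for_x

def condorcet_scores(profile, alternatives="abc"):
    """Number of other alternatives each alternative defeats pairwise."""
    return {x: sum(1 for y in alternatives
                   if y != x and majority_prefers(profile, x, y))
            for x in alternatives}

# John, Joe and Smitty from the committee example.
print(condorcet_scores(["abc", "bca", "acb"]))
```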
EXAMPLE. Plurality-runoff does not have positive responsiveness (or monotonicity) in general. Suppose we have 3 candidates a, b, c and that 102 voters rank them as abc, 99 voters rank them as bca, 100 voters rank them as cab, 5 voters rank them as bac. Then b, a receive more first-place votes than c, and in a contest between them, a majority of the voters prefer a to b, so that a will be chosen. But if all bac voters were to raise the position of a by voting abc, then alternatives a, c have the largest number of first place votes, and now c defeats a in a contest between the two.
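The arithmetic of this example can be replayed in a few lines of Python. The sketch below is a simplified plurality-runoff (the function name is illustrative, and ties are not handled, which is an assumption that is harmless for these vote counts): the two candidates with the most first-place votes advance, and the one preferred by more voters wins.

```python
from collections import Counter

def plurality_runoff(ballots):
    """ballots maps a best-to-worst ranking string to its number of voters."""
    first_place = Counter()
    for ranking, count in ballots.items():
        first_place[ranking[0]] += count
    # The two candidates with the most first-place votes go to the runoff.
    (x, _), (y, _) = first_place.most_common(2)
    votes_x = sum(c for r, c in ballots.items() if r.index(x) < r.index(y))
    votes_y = sum(ballots.values()) - votes_x
    return x if votes_x > votes_y else y

original = {"abc": 102, "bca": 99, "cab": 100, "bac": 5}
# The five bac voters raise a to first place and vote abc instead.
modified = {"abc": 107, "bca": 99, "cab": 100}

print(plurality_runoff(original), plurality_runoff(modified))
```

Raising a on five ballots changes the winner from a to c, which is the failure of monotonicity described above.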
We have mentioned the Borda rule above, which adds up points for the different ranks for different voters. It has been extensively studied by D. Saari, who has argued that it is the best social welfare function. Indeed it does have many positive properties. However there are some other social welfare functions which have desirable properties which it lacks. Individual rationality is the same concept as strategy-proofness, mentioned elsewhere. It means that if the voters vote as if they were playing a game, so that their votes are not necessarily their actual preferences, then the outcome from the s.w.f. is a Nash equilibrium. In order for this to make sense for an s.w.f., some method of tie-breaking must be selected. Then individual rationality means that no individual, by voting as if his preferences were different, can obtain a result which is better according to his true preferences. Gibbard 12 and Satterthwaite 29 proved that if at least 3 alternatives are in the range of a social choice function and it is individually rational, then it must be dictatorial. The references by Pattanaik and Moulin study this topic in more detail. Some modified types of social welfare functions have been considered. DEFINITION. A binary relation R on X is acyclic if and only if
\[
\forall k \ge 1\ \forall x_1, \dots, x_k \in X,\quad
\neg [ (x_1, x_2) \in R \wedge (x_2, x_3) \in R \wedge \dots \wedge (x_{k-1}, x_k) \in R \wedge (x_k, x_1) \in R ].
\]
A binary relation on X is acyclic if and only if its matrix satisfies $A^n = 0$ where $n = |X|$. Let $T_X$ denote the set of acyclic binary relations on X, and $T_n$ the corresponding set of Boolean matrices. EXAMPLE. Any strictly subtriangular Boolean matrix represents an acyclic binary relation.
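The matrix criterion for acyclicity translates directly into code. The Python sketch below (with illustrative function names) computes the n-th Boolean power of the matrix and tests whether it is zero; a strictly lower triangular matrix passes, while a 3-cycle and a loop do not.

```python
def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is OR over k of a_ik AND b_kj."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def is_acyclic(a):
    """True when the n-th Boolean power of the n x n matrix a is zero."""
    power = a
    for _ in range(len(a) - 1):      # after the loop, power equals A^n
        power = boolean_product(power, a)
    return all(entry == 0 for row in power for entry in row)

strictly_lower = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
three_cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

print(is_acyclic(strictly_lower), is_acyclic(three_cycle))
```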
A social decision function is a function $L_n^m \to T_n$. Social decision functions are not a major topic in welfare economics. Their primary advantage is that in contrast to Arrovian social welfare functions, they do exist. Their primary disadvantage is that unless an issue has overwhelming support, the group may not have a preference. This is like the case of two-thirds majority in the example above. Most legislative bills do not have enough support to pass by a two-thirds majority. Social welfare functions on general domains of partial orders were studied by Barthelemy 2, who proved a version of Arrow's theorem where the conclusion involves an oligarchy instead of solely a dictatorship. He gives other references to related results. Another variation, which is the subject of a large literature, is social choice functions.
DEFINITION. A social choice function is a function
\[
C : (2^X \setminus \{\emptyset\}) \times L_X^m \to 2^X \setminus \{\emptyset\}
\]
such that for any $S \subseteq X$, $P \in L_X^m$, $C(S) \subseteq S$.
In other words, a social choice function, given any set of voter preferences and any subset S of the alternatives, chooses a subset $C(S) \subseteq S$ of the alternatives which it collectively prefers. Every social welfare function gives rise to a social choice function, which selects the set of alternatives which are maximal within the weak order. EXAMPLE. The Pareto set for preferences $R(1), \dots, R(m)$ is
\[
\{ y \in X \mid \neg \exists z \in X : \forall i \in M,\ zR(i)y \wedge \neg yR(i)z \}.
\]
That is, an alternative is Pareto optimal if there is no other alternative preferred to it by every voter. A well-known theorem of Plott 24 characterizes choice functions which arise from binary relations. Strategy-proofness is usually studied in terms of some form of social choice function. Every
social welfare function gives rise to a social choice function: this function selects those members of S which are optimal in the weak order defined by the social welfare function. The study of special forms of strategy-proofness involves combinatorics, for instance systems of distinct representatives, statistical strategy-proofness, and the study of restricted domains where strategy-proofness holds. Social choice theory has been exhaustively studied since Arrow's work 1. Social welfare functions have been extended to continuous domains of interest in economics and have been characterized in various ways. For example John Harsanyi 13 characterized a version of Bentham's social utility. In general, to have a nontrivial social welfare function of this type it is necessary to allow some degree of interpersonal comparison of utilities, that is, one must be able to say that the value of alternative a for person i is greater than the value of alternative b for person j. Harsanyi's social utility function is defined when X is a subspace of $R^k$ and each individual's preferences are given, not by a binary relation, but by a utility function $U_i(x_1, \dots, x_k)$. It represents the social utility simply as the sum
\[
\sum_{i \in M} U_i(x_1, \dots, x_k).
\]
The leximin social welfare function proposed by John Rawls 26 and others uses the same domain, but its range is a weak order.
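DEFINITION. Let v, w be vectors in $R^n$. In the leximin sense, v is preferred to w if for some $x \in R$, $|\{i : v_i \le x\}| < |\{i : w_i \le x\}|$ and, for all $y < x$ in R, $|\{i : v_i \le y\}| = |\{i : w_i \le y\}|$.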
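The leximin comparison can be sketched in code in two ways that should agree: by counting how many coordinates lie at or below each threshold, and by comparing the vectors after sorting them worst-off first. This is an illustrative sketch; the function names and the sample vectors are assumptions, not part of the text.

```python
def leximin_prefers_by_counts(v, w):
    """v is preferred to w if, at the first threshold x where the counts
    |{i : v_i <= x}| and |{i : w_i <= x}| differ, v has the smaller count."""
    for x in sorted(set(v) | set(w)):   # counts only change at these values
        count_v = sum(1 for t in v if t <= x)
        count_w = sum(1 for t in w if t <= x)
        if count_v != count_w:
            return count_v < count_w
    return False                        # same multiset of utilities: no preference


def leximin_prefers_sorted(v, w):
    """Equivalent check for equal-length vectors: compare worst-off first."""
    return sorted(v) > sorted(w)


v = [2, 5, 5]   # hypothetical utilities under alternative a
w = [1, 9, 9]   # hypothetical utilities under alternative b
print(leximin_prefers_by_counts(v, w), leximin_prefers_sorted(v, w))
print(sum(v) > sum(w))  # a plain sum of utilities ranks them the other way
```

The sample vectors are chosen so that leximin prefers v (its worst-off coordinate is higher) even though w has the larger total.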
This means, in a sense, that the fewest number of people intensely dislike, or would be severely harmed by, this alternative. In the leximin social welfare function, one alternative (choice) a is preferred to another alternative (choice) b if and only if the vector of utilities $u_i(a)$ is leximin preferred to the vector of utilities $u_i(b)$. Deschamps and Gevers proved that given such cardinal utility, the only social welfare functions having properties of anonymity, strong Pareto, independence of irrelevant alternatives, independence of unconcerned votes, and invariance under joint utility transformations are the leximin, utilitarian, and leximax (reverse of leximin) social welfare functions. We proposed a social arbitration function 16 which can represent intermediate values between these two. We first need to define the Nash bargaining
solution. Usually the set X of alternatives is allowed to be infinite in this theory. DEFINITION. A utility function is a function u : X -> R, that is, it assigns a real number to each alternative.
We assume each individual has a utility function. Let $u_i$ denote the utility function of individual i. DEFINITION. Suppose a disagreement point $d \in R^m$ is chosen, such that $\exists x \in X\ \forall i \in M,\ u_i(x) > d_i$. The Nash bargaining solution is $\{x \in X \mid \prod_{i \in M} (u_i(x) - d_i) = \max_{y \in X} \prod_{i \in M} (u_i(y) - d_i)\}$.
The Nash bargaining solution is characterized by the following axioms, if the set $S_u$ of m-tuples of utilities arising from the outcomes is a compact, convex set in Euclidean m-space. AXIOM 1. The Nash bargaining solution is unchanged if the individuals, and their utility functions, are interchanged. AXIOM 2. The Nash bargaining solution is unchanged if any utility function $u_i(x)$ is replaced by $a u_i(x) + b$ with $a > 0$. (The reason is that this will not change the preferences of individual i between alternatives.) AXIOM 3. The Nash bargaining solution is Pareto efficient, that is, there is no other outcome preferred by every voter. AXIOM 4. The Nash bargaining solution is independent of irrelevant alternatives: if an outcome x is optimal for a set R of utility m-tuples, then it is optimal for any $S \subseteq R$ such that $x \in S$. In effect this means that $\prod_{i \in M} (u_i(x) - d_i)$ is chosen as a utility function for the entire society.
DEFINITION. The social arbitration function for a given group of individuals and their utility functions is $\{x \in X \mid \prod_{i \in M} (u_i(x) - d_i) = \max_{y \in X} \prod_{i \in M} (u_i(y) - d_i)\}$.
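For a finite set of alternatives, the definition above reduces to maximizing a product. The sketch below is illustrative only: the function name, the sample utilities, and the restriction to alternatives with $u_i(x) > d_i$ for every i (a common convention, assumed here so that all factors are positive) are not taken from the text.

```python
import math


def social_arbitration(utilities, d):
    """Return the alternatives maximizing prod_i (u_i(x) - d_i).

    utilities: dict mapping each alternative to its tuple of utilities.
    d: disagreement point, one entry per individual.
    Assumption: only alternatives with u_i(x) > d_i for all i are considered.
    """
    feasible = {x: u for x, u in utilities.items()
                if all(ui > di for ui, di in zip(u, d))}
    if not feasible:
        raise ValueError("no alternative improves on the disagreement point")

    def nash_product(u):
        return math.prod(ui - di for ui, di in zip(u, d))

    best = max(nash_product(u) for u in feasible.values())
    # Exact comparison is safe here because the sample data are integers.
    return [x for x, u in feasible.items() if nash_product(u) == best]


# Hypothetical utilities of three alternatives for two individuals.
utilities = {"x": (10, 1), "y": (5, 5), "z": (7, 2)}
print(social_arbitration(utilities, (0, 0)))        # equal split y wins
print(social_arbitration(utilities, (-100, -100)))  # largest total x wins
```

Running the same data with two disagreement points shows how the choice of d changes the selected alternative.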
In the interpretation of this function, the point d is assumed to be chosen by a political process, which can accommodate more or less egalitarianism. It should consist of equal utilities to all individuals. That is, if the disagreement point were at a Pareto optimum, then all payoffs would be equal. If the disagreement value is set at a very low negative value $-c$, then we are in effect maximizing $\sum_{i \in M} \log(u_i(x) + c) = m \log(c) + \sum_{i \in M} \log(1 + u_i(x)/c)$, which is quite close to $m \log c + \sum_{i \in M} u_i(x)/c$, equivalent to the utilitarian social welfare function. The use of a bargaining solution means that social arbitration functions have some degree of incentive compatibility. That is, if the individuals were to bargain instead of being assigned its optimal value, they might arrive at the same result. M. Kaneko considered a somewhat similar use of the Nash bargaining solution to make social choices. One can make a parallel of the ideas of ordinal vs. cardinal (and interpersonal) utility to ideas of majority voting and free markets vs. welfare planning. A. Tangian has pointed out that Arrow's theorem in a sense says the two approaches cannot be combined.
4 PROSPECTS
The basic units of social science are human beings and their attributes, but currently existing mathematics cannot treat some attributes. For example, Boolean algebra cannot express the degree of intensity, but both fuzzy algebra 17 and incline algebra 6 can handle the degree of intensity. But both of these algebras are unable to express attributes involving inhibition or negative influence. Therefore, we believe new mathematics is needed to tackle other human attributes and emotions. See also 19. We have mentioned utilitarian social welfare functions 13, leximin social welfare functions 26, and social arbitration functions, all of which are defined in terms of given numerical utility for each individual. Since the utilitarian
social welfare function is an average, an increase in one person's utility will balance a decrease in someone else's. It can be objected that this is contrary to justice and equality, in that in some sense, one person can be trampled on to benefit another. The leximin function is the ultimate in egalitarianism. Insofar as there are really tradeoffs, the best social state would be one of complete equality at the highest level where this is possible. But where it is impossible, that is, where some benefits to one person do not reduce someone else's condition, they would remain. The social arbitration function 16 is a compromise between these two which can be axiomatically based on the Nash bargaining solution. The Nash bargaining solution is intended to predict the solution of bargaining between several parties when they start from some disagreement point and are faced with a range of possibilities, but every individual must agree to attain a favorable possibility. It is based on maximizing the product of the increases in utilities that would result from a desirable outcome, over the disagreement outcome. Thus, for example, it is useful in situations like the prisoner's dilemma. In the social arbitration function the choice of a disagreement point can allow the social welfare function in effect to vary continuously between something like a utilitarian social welfare function and something like a leximin social welfare function. This seems more realistic than either a situation of complete equality or a method which ignores the criteria of justice and equality. There are many social problems today which have an aspect due to many conflicting interest groups that are not able to reach a consensus. Problems of this nature can be analyzed mathematically by methods of game theory and social welfare theory, and such analysis can contribute to an understanding of the difficulties.
In addition, mathematical methods can reduce large amounts of qualitative data to fairly simple quantitative models in the areas of measurement theory, cluster analysis, and seriation. Matrix theory, graph theory, and combinatorics play important roles in this analysis. The methods of dynamical systems can help us understand why it is so difficult to predict social behavior, especially over significant time periods.
5 OPEN PROBLEMS
1. When a strong Nash equilibrium exists, it is usually a convincing solution concept for cooperative n-person games, n > 2. However, it is rare that n-person games have a strong Nash equilibrium, and we would like to predict behavior of players in any game. Von Neumann and Morgenstern found a much weaker solution concept but even it does not always exist. There are a great variety of solution concepts in n-person game theory (see 20 for the earlier ones) but they do not agree with each other and most do not always exist. Find a satisfactory solution concept for cooperative n-person games, or a proof that no such solution concept exists. Defining "satisfactory" is part of the problem but it should always exist, coincide with known solution concepts in special cases where they are universally agreed-upon, and should correspond to individually rational behavior.
2. The reason that majority voting does not give social welfare functions is that it is not in general transitive. However, it is known that on subsets of the linear orders it will be transitive, such as all linear orders which correspond to a utility function which increases to its maximum and then decreases. It is known that there are other domains which are larger than this one but in some 30 years of effort, no one has been able to discover what is the largest domain having transitivity. One reason for being interested in this question is that it may help decide in what situations majority voting is a good method of social choice. See 10 for an up-to-date survey on this. Find the largest domain of linear orders on a set of n elements such that majority voting is always transitive on them.
3. Is there some analogue of the methods of modern physics, such as relativity theory or statistical mechanics, which can be applied to psychology or other behavioral sciences?
4. Enumerate acyclic binary relations on a set of n > 3 elements. This is equivalent to enumerating Boolean matrices A such that $A^n = 0$. This is mainly just a combinatorial problem but has some relationship to social science by means of social decision functions.
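For very small n, the equivalence in problem 4 can be explored by brute force: list every n-by-n Boolean matrix and keep those whose n-th Boolean power is zero. The sketch below is purely illustrative (the function names are assumptions, and the search over $2^{n^2}$ matrices is only feasible for tiny n); it says nothing about the general enumeration problem.

```python
from itertools import product


def bool_mat_mul(a, b, n):
    """Boolean matrix product: entry (i, j) is OR over k of a[i][k] AND b[k][j]."""
    return tuple(
        tuple(any(a[i][k] and b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def is_nilpotent(a, n):
    """True if the n-th Boolean power of a is the zero matrix."""
    power = a
    for _ in range(n - 1):
        power = bool_mat_mul(power, a, n)
    return not any(any(row) for row in power)


def count_acyclic_relations(n):
    """Count Boolean n x n matrices A with A^n = 0 by exhaustive search."""
    count = 0
    for bits in product((False, True), repeat=n * n):
        a = tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))
        if is_nilpotent(a, n):
            count += 1
    return count


print([count_acyclic_relations(n) for n in range(1, 4)])
```

For n = 1, 2, 3 the counts agree with the known numbers of labeled acyclic digraphs on n vertices.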
The authors believe that problems 1 and 3 are quite significant, and that a good enough answer to either one of them would be as important as previous Nobel prize work and would probably win a Nobel prize in economics.
Acknowledgment. This paper has benefitted from comments by R. D. Luce and A. Tangian.
References
1. K. Arrow, Social Choice and Individual Values, Wiley, New York, 1963.
2. J.-P. Barthelemy, Arrow's theorem: unusual domains and codomains, Mathematical Social Sciences 3(1982), 1-98.
3. S. Brams and P. Fishburn, Approval voting in scientific and engineering societies, Group Decision and Negotiation 1(1992), 41-55.
4. S. Brams and P. Fishburn, Yes-no voting, Social Choice and Welfare 10(1993), 35-50.
5. S. Brams and A. Taylor, Fair Division: From Cake-Cutting to Dispute Resolution, Cambridge Univ. Press, Cambridge, 1996.
6. Z.-Q. Cao, K. H. Kim, and F. W. Roush, Incline Algebra and Applications, John Wiley, New York, 1984.
7. J.-P. Doignon and J.-C. Falmagne, Knowledge Spaces, Springer, 1999.
8. J.-C. Falmagne, Elements of Psychophysical Theory, Oxford, 1985.
9. P. Fishburn, Decision theory and discrete mathematics, Discrete Applied Mathematics 68(1996), 209-221.
10. P. Fishburn, Acyclic sets of linear orders: a progress report, preprint, AT&T Shannon Laboratory, 2000.
11. B. Ganter and R. Wille, Formal Concept Analysis, Springer, 1999.
12. A. Gibbard, Manipulation of voting schemes: a general result, Econometrica 41(1973), 587-601.
13. J. Harsanyi, Cardinal welfare, individualistic ethics, and interpersonal comparison of utility, Journal of Political Economy 63, 1955.
14. K. H. Kim, Boolean Matrix Theory and Applications, Marcel Dekker, New York, 1982.
15. K. H. Kim and F. W. Roush, Mathematical Consensus Theory, Marcel Dekker, New York, 1980.
16. K. H. Kim and F. W. Roush, Competitive Economics: Equilibrium and Arbitration, North Holland, New York, 1983.
17. K. H. Kim and F. W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems 4(1980), 293-315.
18. D. H. Krantz, R. D. Luce, P. Suppes, A. Tversky, Foundations of Measurement, vol. I, 1971, II, 1989, III, Springer, 1990.
19. R. D. Luce, Utility of Gains and Losses, Erlbaum, 2000.
20. R. D. Luce and H. Raiffa, Games and Decisions, Wiley, New York, 1957.
21. R. D. McKelvey, General conditions for global intransitivities in formal voting models, Econometrica 47(1979), 1085-1111.
22. H. Moulin, The Strategy of Social Choice, North Holland, New York, 1983.
23. P. K. Pattanaik, Strategy and Group Choice, North Holland, New York, 1978.
24. C. Plott, Path independence, rationality, and social choice, Econometrica 41(1973), 1075-1091.
25. J. Quiggin, Generalized Expected Utility Theory, Kluwer, 1993.
26. J. Rawls, Theory of Justice, Belknap Press, Cambridge MA, 1972.
27. D. G. Saari, The Geometry of Voting, Springer, 1994.
28. D. G. Saari, Basic Geometry of Voting, Springer, 1995.
29. M. A. Satterthwaite, Strategy-proofness and Arrow's conditions, Journal of Economic Theory 10(1975), 187-217.
30. L. J. Savage, Foundations of Statistics, 1954.
31. A. Sen, Collective Choice and Social Welfare, North-Holland, New York, 1970.
TWELVE VIEWS OF MATROID THEORY
JOSEPH P.S. KUNG
Department of Mathematics, University of North Texas, Denton, TX 76203, U.S.A.
INTRODUCTION
At first sight, matroid theory is a forbidding subject. There are many axiom systems, all equivalent to each other, which one must master. Except possibly for the theory of topological spaces, this proliferation of axiom systems is unique in mathematics. There is a reason for this. Matroid theory is the distillation of many subjects and the different axiom systems acknowledge the different sources from which matroid theory is derived. In this introduction to matroid theory, we present the subject by viewing it from six of its sources: linear algebra or projective geometry, invariant theory or the theory of determinants, lattice theory, graph theory, combinatorial optimization, and matching theory. Our introduction is intended as a guide to some research areas in matroid theory. It is selective and not exhaustive. In general, we emphasize background, motivation, and possible areas of research rather than proofs or technical refinements. In the last four sections, however, new results are presented and proofs are provided. The new material includes an approach to the representation of matroid unions using the multiple Laplace identity, a connection between the common basis problem and the Cauchy-Binet identity, an extension of the matrix-tree theorem, and an identity for a generic rank-generating polynomial. Comprehensive lists of books and survey papers are given in the references for the reader who wishes to go further into the subject.
1 LINEAR DEPENDENCE WITHOUT SCALARS
According to Hassler Whitney 71, the founder of the subject, matroid theory is the study of the "abstract properties of linear dependence". Take a set S of vectors in a finite-dimensional vector space over a field or division ring F. If A is a subset of S, we define the rank of A to be the maximum size of a linearly independent set in A. Considering S as a set of column vectors, the rank of A is the rank of the matrix formed by putting the column vectors in A side by side. The rank of A is also the dimension of the subspace lin(A) spanned by A. Dimensions of subspaces satisfy Grassmann's identity: if X and Y are subspaces, then $\dim(X) + \dim(Y) = \dim(X \vee Y) + \dim(X \cap Y)$, where $X \vee Y$ is the subspace spanned by X and Y. If we just have a subset of vectors and not necessarily the entire vector space, then $\mathrm{lin}(A) \vee \mathrm{lin}(B)$ equals $\mathrm{lin}(A \cup B)$, but, because $A \cap B$ might not contain a spanning set of $\mathrm{lin}(A) \cap \mathrm{lin}(B)$, $\mathrm{lin}(A \cap B)$ might be a proper subset of $\mathrm{lin}(A) \cap \mathrm{lin}(B)$. Thus, for ranks of subsets, Grassmann's identity weakens to the submodular inequality:
(R3) $\mathrm{rank}(A) + \mathrm{rank}(B) \ge \mathrm{rank}(A \cup B) + \mathrm{rank}(A \cap B)$.
Using the submodular inequality, we can formally define a matroid. A matroid G on the set S is specified by a rank function defined from subsets of S to the non-negative integers satisfying the normalization axiom:
(R1) $\mathrm{rank}(\emptyset) = 0$,
the unit-increase axiom: for any element a of S,
(R2) $\mathrm{rank}(A) \le \mathrm{rank}(A \cup \{a\}) \le \mathrm{rank}(A) + 1$,
and the submodular inequality (R3). Note that the rank of the entire set S is a finite integer. Thus, we are restricting our attention to finite-rank matroids. We can also define a closure $A \mapsto \overline{A}$, $2^S \to 2^S$, on the set S of vectors by defining the closure $\overline{A}$ to be the subset of vectors in S in the linear span of A, that is, $\overline{A} = \mathrm{lin}(A) \cap S$. This closure satisfies the three defining properties of a closure, that is,
(Cl1) $A \subseteq \overline{A}$,
(Cl2) $A \subseteq B \Rightarrow \overline{A} \subseteq \overline{B}$,
(Cl3) $\overline{\overline{A}} = \overline{A}$.
It also satisfies two additional properties. The first is the (Mac Lane-Steinitz) exchange property: if $a, b \notin \overline{A}$, then $a \in \overline{A \cup \{b\}} \iff b \in \overline{A \cup \{a\}}$. The second is the finite basis property: for every subset $A \subseteq S$, there exists a finite subset $A_0 \subseteq A$ such that $\overline{A_0} = \overline{A}$.
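The rank axioms (R1) to (R3) and the exchange property can be checked by brute force on a small set of vectors. The following is a sketch under stated assumptions: vectors are taken over GF(2) and encoded as integer bitmasks, subsets of S are sets of indices, and all function names and the sample set S are illustrative rather than part of the text.

```python
from itertools import combinations


def gf2_rank(vectors):
    """Rank over GF(2) of bitmask vectors, via an XOR elimination basis."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # clears the leading bit of b from v if present
        if v:
            basis.append(v)
    return len(basis)


def rk(S, A):
    return gf2_rank(S[i] for i in A)


def closure(S, A):
    """All indices whose vector lies in the span of the vectors indexed by A."""
    r = rk(S, A)
    return frozenset(i for i in range(len(S)) if rk(S, set(A) | {i}) == r)


def subsets(n):
    for k in range(n + 1):
        for c in combinations(range(n), k):
            yield frozenset(c)


def check_rank_axioms(S):
    n = len(S)
    all_sets = list(subsets(n))
    ok = rk(S, frozenset()) == 0                                        # (R1)
    for A in all_sets:
        for a in range(n):
            ok &= rk(S, A) <= rk(S, A | {a}) <= rk(S, A) + 1            # (R2)
        for B in all_sets:
            ok &= rk(S, A) + rk(S, B) >= rk(S, A | B) + rk(S, A & B)    # (R3)
    return ok


def check_exchange(S):
    n = len(S)
    for A in subsets(n):
        cl = closure(S, A)
        outside = [a for a in range(n) if a not in cl]
        for a in outside:
            for b in outside:
                if (a in closure(S, A | {b})) != (b in closure(S, A | {a})):
                    return False
    return True


# Hypothetical example: e1, e2, e1 + e2, e3 in GF(2)^3.
S = [0b001, 0b010, 0b011, 0b100]
print(check_rank_axioms(S), check_exchange(S), sorted(closure(S, {0, 1})))
```

In the example, the closure of {e1, e2} also contains e1 + e2, which is how a set of two vectors can span a flat with three elements.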
Abstracting this, we have the closure axioms for a matroid: a matroid G is specified by a closure $A \mapsto \overline{A}$ satisfying the exchange property and the finite basis property. A set A is said to be closed if $\overline{A} = A$. Closed sets are also called flats. Adapting terminology from geometry, rank-1 flats are called points, rank-2 flats are called lines, rank-3 flats are called planes, and so on. Going down, rank-(n - 1) flats in a rank-n matroid are called copoints or hyperplanes. The rank axiomatization and closure axiomatization are equivalent or, to use a term coined by Birkhoff [B1, p. 154], cryptomorphic. Closure can be defined in terms of rank by defining the closure $\overline{A}$ of a set A to be the set $\{a : \mathrm{rank}(A \cup \{a\}) = \mathrm{rank}(A)\}$. In the other direction, rank can be defined in two steps: $\mathrm{rank}(A) = \mathrm{rank}(\overline{A})$ and the rank of a closed set A is the maximum length r of a strict chain of closed sets $\overline{\emptyset} = A_0 \subset A_1 \subset A_2 \subset \cdots \subset A_r = A$. The exchange property is equivalent to the property that the relation
$a \sim b$ if $a \in \overline{A \cup \{b\}}$
defined on the complement $S \setminus \overline{A}$ is symmetric. The properties (Cl1), (Cl2), and (Cl3) imply that the relation $\sim$ is reflexive and transitive. Hence, $\sim$ is an equivalence relation. Thus, for closures, the exchange property is equivalent to the partition property: Let A be a closed set. Then, the subsets $X \setminus A$, where X is a closed set of the form $\overline{A \cup \{a\}}$ for some element $a \notin A$, partition the complement $S \setminus A$ of A. The unit-increase axiom implies that $\mathrm{rank}(A) \le |A|$ for all subsets $A \subseteq S$. The subsets I for which $\mathrm{rank}(I) = |I|$ are said to be independent. There are many axiom systems for matroids using independent sets. A lesser known one is the following. A matroid G on the set S is specified by a collection of finite subsets of S satisfying:
(I1) $\emptyset$ is independent.
(I2) If I is independent and $J \subseteq I$, then J is independent.
(I3) Every maximal independent set in a subset A of S has the same size.
(I4) There are no infinite ascending chains $I_0 \subset I_1 \subset I_2 \subset \cdots$ of independent sets.
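On a finite ground set, axiom (I4) holds automatically, so (I1) to (I3) can be tested exhaustively. The sketch below is illustrative: the function name and the two sample families (the sets of size at most 2 in a 4-element set, and a small family that fails (I3)) are assumptions chosen for the example.

```python
from itertools import combinations


def satisfies_independence_axioms(ground, independent):
    ind = {frozenset(i) for i in independent}
    if frozenset() not in ind:                       # (I1)
        return False
    for i in ind:                                    # (I2): all subsets present
        for k in range(len(i)):
            if any(frozenset(j) not in ind for j in combinations(i, k)):
                return False
    elems = list(ground)
    for k in range(len(elems) + 1):                  # (I3): check every subset A
        for a in combinations(elems, k):
            a = frozenset(a)
            inside = [i for i in ind if i <= a]
            maximal = [i for i in inside if not any(i < j for j in inside)]
            if len({len(i) for i in maximal}) > 1:
                return False
    return True


# All subsets of size at most 2 of a 4-element set.
uniform_2_4 = [c for k in range(3) for c in combinations(range(4), k)]
# A family violating (I3): within {1, 2, 3}, both {1, 2} and {3} are maximal.
bad = [(), (1,), (2,), (3,), (1, 2)]

print(satisfies_independence_axioms(range(4), uniform_2_4))
print(satisfies_independence_axioms({1, 2, 3}, bad))
```

The second family is closed under taking subsets, so it passes (I1) and (I2); it fails only because its maximal independent sets have different sizes.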
One can prove that the independent set axiomatization given here is equivalent to two axiomatizations introduced earlier (see, for example, [B11]). An element a is a loop if r({a}) = 0 or, equivalently, $a \in \overline{\emptyset}$. If the elements a and b are not loops, we say that they are parallel if r({a, b}) = 1 or, equivalently, $a \in \overline{\{b\}}$. A matroid is simple if $\overline{\emptyset} = \emptyset$ and $\overline{\{a\}} = \{a\}$ for every element a in S, or, equivalently, if it has no loops or pairs of parallel elements. Simple matroids are also called combinatorial geometries [B4]. The vector space $F^d$ is not a simple matroid. It can be made into a simple matroid by the following classical construction. Remove the zero vector. On the set of non-zero vectors, impose the equivalence relation $u \sim v$ if and only if for some non-zero scalar $\alpha$ in F, $u = \alpha v$. An equivalence class is called a point and the set of points is in one-to-one correspondence with the 1-dimensional subspaces of $F^d$. The set of points is the projective geometry PG(d - 1, F) of dimension d - 1 over the field F. For most purposes, we can work with points in PG(
$\{a_1, b_1, a_2, b_2\}$, $\{a_1, b_1, a_3, b_3\}$, $\{a_1, b_1, a_4, b_4\}$, $\{a_2, b_2, a_3, b_3\}$, $\{a_2, b_2, a_4, b_4\}$, and $\{a_3, b_3, a_4, b_4\}$.
All the other planes contain three points. If we "relax" the condition that the points $a_3, b_3, a_4, b_4$ are coplanar, we obtain the Vámos matroid $V_8$ on 8 points with the first five 4-point planes. The matroid $V_8$ is not representable over any field because it does not satisfy the bundle theorem. The set $\{a_3, b_3, a_4, b_4\}$ is both a circuit (that is, a minimal dependent set) and a hyperplane; thus, $V_8$ is an example of a circuit-hyperplane relaxation. Kahn 32 shows that relaxing a circuit which is also a hyperplane in a matroid always yields another matroid. Besides $V_8$, there are two other 8-point rank-4 matroids not representable over any field. All three matroids are obtained by relaxation. See [B11, p. 508 and 509]. The idea of relaxation is useful in many areas of matroid theory. For example, Lovász 47 used a similar method to construct a large family of matroids to show that the matroid parity problem cannot be solved by a polynomial-time algorithm. Circuit-hyperplane relaxation shows that most theorems of projective geometry in the plane do not hold for all matroids. The situation is less clear in higher dimensions. For example, the rank-4 Desargues' theorem holds as a theorem for matroids (see [S7, p. 60]).
1.1. DESARGUES' THEOREM FOR MATROIDS. Let $e_i$, $1 \le i \le 4$, and $e_{ij}$, $1 \le i < j \le 4$, be ten points in a matroid (having rank at least 4). Suppose that (1) $e_1, e_2, e_3, e_4$ are independent, and (2) for all pairs i and j, the point $e_{ij}$ is on the line $e_i \vee e_j$ spanned by $e_i$ and $e_j$. If the three triples $\{e_{12}, e_{13}, e_{23}\}$, $\{e_{12}, e_{14}, e_{24}\}$, and $\{e_{13}, e_{14}, e_{34}\}$ are collinear, then the fourth triple $\{e_{23}, e_{24}, e_{34}\}$ is also collinear.
In particular, if the fourth triple is declared independent, then the resulting failed rank-4 Desargues configuration is not a matroid. However, in the rank-3 Desargues configuration (obtained by projecting the rank-4 Desargues configuration from a point in general position), the fourth triple is a circuit-hyperplane and it can be relaxed to obtain a non-representable rank-3 matroid. Thus, the situation in matroid theory is the same as in projective geometry: there are non-Desarguesian rank-3 matroids and projective planes, but all matroids of rank at least 4 and all projective spaces of dimension at least 3 satisfy Desargues' theorem. (The earliest construction of a non-Desarguesian projective plane appears to be in the 1894 paper of Peano 54. For a historical study of Desargues' theorem, which had its origins in the theory of perspective, see 24.)
2 BASIS EXCHANGE PROPERTIES
Do the axioms of matroid theory capture the characteristic or essential properties of linear dependence? One possible answer to this question is given by classical invariant theory 57. According to Felix Klein's Erlanger Programm, projective geometry over a field F is the study of "algebraic" properties of an F-vector space which are invariant under changes of coordinate systems or, equivalently, action of the general linear group. The generally agreed way to express algebraic properties is to use polynomials. Let V be a vector space over a field F of dimension d and fix a basis of V. A polynomial $p(x_1, x_2, \ldots, x_m)$ in the vector variables $x_i$ defined on V is a function from the m-fold cartesian product $V^m$ to the field F which can be written as a polynomial in the coordinate variables $x_{ij}$, where $x_{ij}$ is the j-th coordinate of the vector variable $x_i$ relative to the fixed basis. The property of being a polynomial is not dependent on the choice of basis. A property P of vectors is said to be algebraic if it can be expressed in terms of the vanishing or non-vanishing of a set of polynomials, that is to say, if there exists a (first order) sentence $\sigma$ made up from atomic formulas of the form $p_i(x_1, x_2, \ldots, x_m) = 0$, $q_j(x_1, x_2, \ldots, x_m) \ne 0$ such that for any m-tuple of vectors $v_i$, the property P holds if and only if the sentence $\sigma$ is true under the substitution $x_i = v_i$. A property P is invariant (or "has invariant significance," in nineteenth century phraseology) if for any non-singular linear transformation $\alpha$, the property P is true for the vectors $v_i$ if and only if P is true for $\alpha v_i$. What are the invariant algebraic properties? By "Gram's principle," every invariant algebraic property can be written as a sentence in which the polynomials $p_i$ and $q_j$ are relative invariants of the general linear group. A polynomial p is a relative invariant of the general linear group if and only if for every non-singular linear transformation $\alpha$,
$p(\alpha x_1, \alpha x_2, \ldots, \alpha x_m) = \chi(\alpha)\, p(x_1, x_2, \ldots, x_m)$,
where $\chi(\alpha)$ is the determinant of $\alpha$ raised to some non-negative integer power. Thus, the first step to finding all invariant properties is to find all relative invariants. The first fundamental theorem of projective invariant theory states that every relative invariant can be written as a polynomial in the brackets
$[x_{i_1}, x_{i_2}, \ldots, x_{i_d}] = \det(x_{i_t, j})_{1 \le t, j \le d}$,
where $x_{i_t, j}$ is the j-th coordinate variable of the vector variable $x_{i_t}$. Polynomials of brackets can be interpreted geometrically. A monomial or product
$M = [x_1, x_2, \ldots, x_d][y_1, y_2, \ldots, y_d] \cdots [z_1, z_2, \ldots, z_d]$
of brackets is nonzero if and only if all the underlying sets
$\{x_1, x_2, \ldots, x_d\}, \{y_1, y_2, \ldots, y_d\}, \ldots, \{z_1, z_2, \ldots, z_d\}$
in M are bases. If this is the case, we say that M is basic. This interpretation is faithful in that all the information about being bases contained algebraically in the monomial is preserved. Unfortunately, there is no faithful interpretation for sums or linear combinations. If a linear combination of monomials $M_1 + M_2 + \cdots + M_r$ is nonzero, then all one can say is that at least one of the monomials is basic. If we have an identity $M_1 + M_2 + \cdots + M_r = 0$, then all one can say is that either none of the monomials is basic, or at least two of them are basic. We have now a language of bases for expressing abstract linear dependence. This language is the best possible combinatorial translation of the first fundamental theorem. What is the grammar or syntax of this language? Appropriately enough, the second fundamental theorem provides an answer. This theorem says that all the algebraic relations between brackets can be deduced from the (single) Laplace expansion
$[x_1, x_2, \ldots, x_n][y_1, y_2, \ldots, y_n] = \sum_{i=1}^{n} [y_i, x_2, \ldots, x_n][y_1, y_2, \ldots, y_{i-1}, x_1, y_{i+1}, \ldots, y_n]$.
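The single Laplace expansion above can be spot-checked numerically. The sketch below assumes integer vectors (so that determinants are exact) and uses a naive cofactor expansion; the helper names `det` and `laplace_rhs` and the random test data are illustrative and not from the text.

```python
import random


def det(cols):
    """Determinant of a square matrix given as a list of column vectors."""
    if not cols:
        return 1
    total = 0
    for j, col in enumerate(cols):
        # Expand along the first row: drop row 0 and column j for the minor.
        minor = [c[1:] for k, c in enumerate(cols) if k != j]
        total += (-1) ** j * col[0] * det(minor)
    return total


def laplace_rhs(x, y):
    """Sum over i of [y_i, x_2..x_n] * [y_1..y_{i-1}, x_1, y_{i+1}..y_n]."""
    total = 0
    for i in range(len(x)):
        left = [y[i]] + x[1:]
        right = y[:i] + [x[0]] + y[i + 1:]
        total += det(left) * det(right)
    return total


rng = random.Random(0)
n = 3
x = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
y = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
print(det(x) * det(y) == laplace_rhs(x, y))
```

Each term on the right-hand side swaps $x_1$ with one $y_i$, which is the algebraic shadow of exchanging one element between two bases.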
Translated into the language of bases, this yields the basis exchange axiom: (B2) If $B_1$ and $B_2$ are bases and x is any element of $B_1$, then there exists an element y in $B_2$ such that $(B_1 \setminus \{x\}) \cup \{y\}$ and $(B_2 \setminus \{y\}) \cup \{x\}$ are bases. We can now define a matroid in terms of its bases. A matroid G on the set S is specified by a non-empty collection of finite subsets of S called bases satisfying the axiom
(B1) If $B_1$ and $B_2$ are bases, then $B_1 \not\subset B_2$,
and the basis exchange axiom (B2). Bases are maximal independent sets. Using the basis exchange axiom, one can prove that all bases have the same size. One can also prove that the basis axiomatization is equivalent to the three axiomatizations given in Section 1. An apparent weakening of the basis exchange property is the basis replacement axiom: If $B_1$ and $B_2$ are bases and x is any element of $B_1$, then there exists an element y in $B_2$ such that $(B_1 \setminus \{x\}) \cup \{y\}$ is a basis. Given axiom (B1), it can be shown that the basis exchange axiom and the basis replacement axiom are equivalent. There are many determinantal identities one can deduce algebraically from the single Laplace expansion. Some of them, like the multiple Laplace expansion, translate to valid basis exchange properties for matroids. Some do not. 2.1. PROBLEM. Describe explicitly those determinantal identities which translate to valid basis exchange properties for matroids. A "solution" to Problem 2.1 was given by Bukowski and de Oliveira in 10. It is perhaps more accurate to say that they showed the problem is decidable by finding an algorithm (using commutative algebra and Gröbner bases) for deciding whether a given determinantal identity yields a valid basis exchange property. The fundamental theorems of invariant theory hold over an infinite field of any characteristic. This is proved using a combinatorial method known as the straightening formula 13, 16. (Other non-combinatorial proofs exist for the fundamental theorems, but they require more severe technical hypotheses on the field.) Over finite fields, there are other invariants besides the brackets. These invariants should have geometric interpretations and some of them
might have matroid-theoretic significance. Almost nothing is known about projective invariant theory over finite fields. 2.2. RESEARCH AREA. Study the invariants of the general and special linear group over a finite field from an algebraic, combinatorial, and geometric point of view. The process of deriving the basis exchange axiom from the Laplace expansion is similar to the process called "non-parametrization" in statistics. It consists of replacing a numerical quantity by a combinatorial or discrete property. As in statistics, non-parametrization makes the subject more "robust," that is, less sensitive to technical assumptions. In the case of matroids, we replace the numerical value of the bracket $[x_1, x_2, \ldots, x_n]$ by the binary digit or bit, whether $[x_1, x_2, \ldots, x_n]$ is zero or non-zero. If we work over the real numbers, we can retain the trit or ternary digit of information, whether $[x_1, x_2, \ldots, x_n]$ is negative, zero, or positive, or, equivalently, the sign or orientation of the signed volume of the parallelepiped with sides $x_1, x_2, \ldots, x_n$. Doing so, we obtain the chirotope definition of an oriented matroid. A rank-n oriented matroid on the set S is specified by a sign function $\varphi$, defined on n-tuples of elements of S and taking the values $-$, 0, or $+$, satisfying:
(Or1) $\varphi$ is not identically zero.
(Or2) Alternation. For any permutation $\sigma$, $\varphi(x_1, x_2, \ldots, x_n) = \mathrm{sgn}(\sigma)\, \varphi(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)})$, where $\mathrm{sgn}(\sigma)$ equals $+$ or $-$ depending on whether $\sigma$ is even or odd.
(Or3) Signed basis exchange. If $\varphi(x_1, x_2, \ldots, x_n)\, \varphi(y_1, y_2, \ldots, y_n) = -$, then there exists i such that $\varphi(y_i, x_2, \ldots, x_n)\, \varphi(y_1, y_2, \ldots, y_{i-1}, x_1, y_{i+1}, \ldots, y_n) = -$.
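For vectors with real (here integer) coordinates, the sign of the bracket gives a concrete sign function on which (Or1) to (Or3) can be tested exhaustively. This is a sketch under stated assumptions: the four sample vectors in the plane, the encoding of signs as -1, 0, +1, and all function names are illustrative.

```python
from itertools import permutations, product


def det(cols):
    """Determinant by cofactor expansion; cols is a list of column vectors."""
    if not cols:
        return 1
    return sum((-1) ** j * col[0] * det([c[1:] for k, c in enumerate(cols) if k != j])
               for j, col in enumerate(cols))


def sign(t):
    return (t > 0) - (t < 0)


def perm_sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def make_phi(vectors):
    """phi(i_1, ..., i_n) = sign of the bracket of the chosen vectors."""
    return lambda *idx: sign(det([vectors[i] for i in idx]))


def check_chirotope(vectors, n):
    phi = make_phi(vectors)
    tuples = list(product(range(len(vectors)), repeat=n))
    if all(phi(*x) == 0 for x in tuples):                        # (Or1)
        return False
    for x in tuples:                                             # (Or2)
        for p in permutations(range(n)):
            if phi(*x) != perm_sign(p) * phi(*(x[k] for k in p)):
                return False
    for x in tuples:                                             # (Or3)
        for y in tuples:
            if phi(*x) * phi(*y) == -1:
                if not any(phi(y[i], *x[1:]) * phi(*(y[:i] + (x[0],) + y[i + 1:])) == -1
                           for i in range(n)):
                    return False
    return True


vectors = [(1, 0), (0, 1), (1, 1), (1, -1)]   # hypothetical points in the plane
print(check_chirotope(vectors, 2))
```

The check passes because a negative left-hand side in the Laplace expansion forces at least one negative term on the right.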
The standard reference on oriented matroids is [B3]. Oriented matroids have topological representations. They provide a combinatorial version of linear programming. The "robustness" of oriented matroids has proved useful in algebraic topology and differential geometry, particularly in the theory of combinatorial differential manifolds, where they have been used in a combinatorial reformulation of Chern classes (see 1 and 2). There are other non-parametrizations of the Laplace expansion. For example, if we work over the complex numbers and replace the value of $[x_1, x_2, \ldots, x_n]$ by the ordered pair $(t, t')$ of trits, where t is the sign of the real part of $[x_1, x_2, \ldots, x_n]$ and $t'$ is the sign of the imaginary part of $[x_1, x_2, \ldots, x_n]$, we obtain complex (oriented) matroids. See 73. If one has a taste for such things, one might define and study quaternionic matroids. This is not as frivolous or tedious as it might seem. For example, studying quaternionic matroids may shed light on determinants over skew fields. One might also study octonionic matroids. However, since the octonions or Cayley numbers are non-associative, octonionic linear algebra is unknown territory. It is inevitable that someone would use the invariant theory of other classical groups to derive analogues of matroids. One such analogue is a bimatroid, which abstracts the properties of non-singularity of square submatrices of a matrix (37 and 43). A bimatroid between the sets S and T is defined by a collection $\mathcal{N}$ of pairs (X, U) where $X \subseteq S$ and $U \subseteq T$ are finite subsets with the same cardinality. The pairs in $\mathcal{N}$ are called non-singular minors and they satisfy the following axioms:
(BM1) The empty pair $(\emptyset, \emptyset)$ is a non-singular minor.
(BM2) The exchange-augmentation axiom. If (X, U) and (Y, V) are non-singular minors and $y \in Y$, then at least one of the following holds:
Exchange. There exists $x \in X$ such that the pairs $((X \setminus \{x\}) \cup \{y\}, U)$ and $((Y \setminus \{y\}) \cup \{x\}, V)$ are non-singular minors, or
Augmentation. There exists $v \in V$ such that $(X \cup \{y\}, U \cup \{v\})$ and $(Y \setminus \{y\}, V \setminus \{v\})$ are non-singular minors.
(BM2') The analogous exchange-augmentation axiom, with the roles of (X, Y) and (U, V) interchanged, holds for any element $v \in V$.
Non-parametrizing the invariant theory of the symplectic group allows us to put a symplectic or Pfaffian structure on a matroid. See 37 and 44. (We note that Proposition 6.6 in 44 should read "The maximum number of maximal isotropic subspaces U intersecting a given maximal isotropic subspace at {0} is i?".) Oriented versions of bimatroids and Pfaffian structures can be obtained in a straightforward way using the method in the chirotope definition of oriented matroids.
3 GEOMETRIC LATTICES
The closed sets or flats of a matroid G form a lattice L(G), called the lattice of flats, under set containment. Because intersections of closed sets are closed sets, the meet of two closed sets A and B is their set-theoretic intersection $A \cap B$. The join $A \vee B$ can be described in two different ways: it is the closure of the union $A \cup B$ or it is the intersection of all the closed sets containing both A and B. The lattice L(G) has a minimum $\overline{\emptyset}$ and a maximum S. Lattices of flats of matroids satisfy three characteristic properties:
(GL1) Semimodularity. If X and Y cover the meet $X \wedge Y$, then the join $X \vee Y$ covers X and Y.
(GL2) Atomicity. Every element in L is a join of points.
(GL3) Finite rank. Every chain from the minimum $\hat{0}$ to the maximum $\hat{1}$ is finite.
Abstracting this, we define a geometric lattice to be a lattice satisfying (GL1), (GL2), and (GL3). Birkhoff showed in 1935 that the theory of simple matroids and the theory of geometric lattices are equivalent. 3.1. BIRKHOFF'S THEOREM. The lattice $L(G)$ of flats of a matroid is a geometric lattice. Conversely, a geometric lattice $L$ defines a simple matroid $G$ on its set of points such that $L(G)$ is isomorphic to $L$. A flat $X$ in a geometric lattice $L$ is modular if for every flat $Y$ in $L$, $\mathrm{rank}(X) + \mathrm{rank}(Y) = \mathrm{rank}(X \vee Y) + \mathrm{rank}(X \cap Y)$. Intuitively, a modular flat contains enough points of intersection so that relative to the points in the matroid, it behaves as if it were a subspace in a projective geometry. In particular, a copoint is modular if and only if every line not in the copoint intersects the copoint at a point. A geometric lattice is said to be modular if every flat is modular. Birkhoff proved that every finite-rank atomic modular lattice is a direct product of lattices of flats of lines (or rank-2 matroids) and lattices of subspaces of vector spaces or projective planes. Going from a matroid to its geometric lattice of flats is a generalization of going from a vector space to its modular lattice of subspaces. This is the first step in von Neumann's program of "pointless" geometry. Taking the metric completion of a profinite limit of lattices of subspaces of $F^n$ for a fixed field $F$, von Neumann constructed a complete complemented modular lattice with a "dimension function" satisfying Grassmann's identity. This dimension function is "continuous" in the sense that there are subspaces of dimension $r$ for any real number $r$ in the unit interval $[0,1]$ and maximal chains are homeomorphic to $[0,1]$. This modular lattice is called the continuous geometry over $F$. See 53 and [B1], pp. 237-239.
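The definition of a modular flat can be checked by brute force on a small example. The following is a minimal sketch, assuming the matroid is given by a rank oracle on a small ground set; the helper names (`uniform_rank`, `closure`, `flats`, `is_modular`) are hypothetical and the enumeration is exponential in the size of the ground set. The example is four points in general position in a plane.

```python
from itertools import combinations

def uniform_rank(k):
    """Rank function of a uniform matroid: rank(A) = min(|A|, k)."""
    return lambda A: min(len(A), k)

def closure(rank, ground, A):
    """Closure of A: all elements whose addition does not raise the rank."""
    A = frozenset(A)
    return frozenset(e for e in ground if rank(A | {e}) == rank(A))

def flats(rank, ground):
    """All closed sets, found by closing every subset (exponential time)."""
    subsets = (frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(sorted(ground), r))
    return {closure(rank, ground, A) for A in subsets}

def is_modular(rank, ground, X):
    """Check rank(X) + rank(Y) = rank(X v Y) + rank(X n Y) for every flat Y."""
    X = frozenset(X)
    return all(
        rank(X) + rank(Y) == rank(closure(rank, ground, X | Y)) + rank(X & Y)
        for Y in flats(rank, ground)
    )

# Four points in general position in a plane (rank 3).
ground = frozenset("abcd")
rank = uniform_rank(3)
number_of_flats = len(flats(rank, ground))
point_is_modular = is_modular(rank, ground, {"a"})
line_is_modular = is_modular(rank, ground, {"a", "b"})
```

In this example every point is modular, while the line through a and b is not, because it misses the line through c and d: two coplanar lines that fail to meet in a point.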
In a geometric lattice, points play a special role. In algebraic geometry, points correspond to maximal ideals in the coordinate ring. The set of maximal ideals forms an antichain under set containment. However, the prime spectrum, the set of prime ideals partially ordered by inclusion, is more useful. Motivated by this, Rota has proposed studying analogues of matroids or geometric lattices in which the antichain of points is replaced by a partially ordered set of "generalized" points. It is not clear what kind (or kinds) of lattices play the role of geometric lattices in this setting. Semimodular lattices seem not to have sufficient structure for this. Two interesting relatives of geometric lattices are consistent lattices and meet-distributive lattices. Let $L$ be a lattice. An element $j$ in $L$ is join-irreducible if $j$ is not the minimum 0 and $j = a \vee b$ implies $j = a$ or $j = b$. A meet-irreducible is a join-irreducible in the order dual. In a lattice satisfying the descending chain condition, every element $x$ can be decomposed into a join $x = j_1 \vee j_2 \vee \cdots \vee j_n$ of a finite number of join-irreducibles. Such a decomposition is irredundant if for any $k$, $x \ne j_1 \vee j_2 \vee \cdots \vee j_{k-1} \vee j_{k+1} \vee \cdots \vee j_n$. The Kurosh-Ore replacement property states that if $h_1 \vee h_2 \vee \cdots \vee h_m$ and $j_1 \vee j_2 \vee \cdots \vee j_n$ are two irredundant decompositions for an element $x$, then at least one of the decompositions $h_1 \vee h_2 \vee \cdots \vee h_{k-1} \vee j_1 \vee h_{k+1} \vee \cdots \vee h_m$, $1 \le k \le m$, is an irredundant decomposition for $x$. In a geometric lattice, every join-irreducible is a point. Hence, the Kurosh-Ore replacement property is a generalization of the basis replacement axiom and lattices satisfying the Kurosh-Ore replacement property are generalizations of geometric lattices. A lattice $L$ is consistent if for every join-irreducible $j$ and every element $x$, $j \vee x = x$ or $j \vee x$ is a join-irreducible in the upper interval $[x, 1]$. Consistent lattices were defined in 42 for combinatorial reasons. For example, they satisfy the following combinatorial inequality: in a finite consistent lattice, the number of join-irreducibles is less than or equal to the number of meet-irreducibles. Thus, the following theorem of Reuter 55 is somewhat surprising. 3.2. THEOREM. A finite-rank lattice satisfies the Kurosh-Ore replacement property if and only if it is consistent.
For a generalization of Theorem 3.2 to individual join-irreducibles, see 27. Consistent semimodular lattices and consistent dually semimodular lattices share many properties with geometric lattices. See 23, 27 and [B13]. The following conjecture would unify several extremal results in combinatorial lattice theory. 3.3. CONJECTURE. Let $L$ be a finite consistent lattice. Suppose that the number of join-irreducibles equals the number of meet-irreducibles. Then the order dual of $L$ is consistent. Going in the opposite direction in the exchange property, we get the antiexchange property: if $a$ and $b$ are distinct elements not in the closure of a set $A$, then
$a \in \overline{A \cup \{b\}} \implies b \notin \overline{A \cup \{a\}}$.
An antiexchange closure is a closure satisfying the antiexchange property. An antiexchange closure satisfying the finite basis property on a set $S$ defines an antimatroid on $S$. An example of an antiexchange closure is convex closure for subsets of real $n$-dimensional Euclidean space $\mathbb{R}^n$. If $a$ is in the convex closure of $A \cup \{b\}$, then $a$ is "inside" the convex set $A \cup \{b\}$ and hence, $b$ is "outside" the convex set $A \cup \{a\}$, that is, $b$ is not in the convex closure of $A \cup \{a\}$. Thus, convex closure over the reals (or any ordered field) is an antiexchange closure. An axiomatic characterization of the lattice of convex sets in $\mathbb{R}^n$ can be found in 3. It might be of interest to remark that there is a characterization of compact convex sets in $\mathbb{R}^n$ not using the order relation. If $A$ and $B$ are subsets of $\mathbb{R}^n$, then the Minkowski sum $A + B$ is defined by
$A + B = \{a + b : a \in A \text{ and } b \in B\}$.
A subset $C$ is $m$-divisible if there exists a set $C'$ such that $C = C' + C' + \cdots + C'$, where there are $m$ copies of $C'$ in the Minkowski sum. The set $C$ is infinitely divisible if it is $m$-divisible for every positive integer $m$. It is known (50, p. 22) that a compact subset $C$ in $\mathbb{R}^n$ is convex if and only if it is infinitely divisible. It would be interesting to explore the idea of infinitely divisible sets over other fields, such as the $p$-adics. Over fields of prime characteristic $p$, one cannot divide by $p$ and a more useful definition is $[p]$-divisibility, where we require $m$-divisibility only for integers $m$ not divisible by $p$. The $[p]$-divisible sets over the finite vector space $[\mathrm{GF}(p^e)]^d$ have been characterized 65: they are exactly the sets closed under addition. In particular, over $[\mathrm{GF}(p)]^d$, a set is $[p]$-divisible if and only if it is a subspace. Let $L$ be a lattice of finite rank. If $x$ is an element in $L$, the element $x_*$ is the meet of all the elements covered by $x$. The lattice $L$ is said to be locally lower distributive or meet-distributive if for every element $x$ in $L$, the interval $[x_*, x]$ is a distributive lattice. The following theorem (combining results due to Dilworth 14, Edelman 18, and probably others) describes the lattice structure of the lattice of closed sets of antiexchange closures. 3.4. THEOREM. Let $L$ be a lattice of finite rank. Then the following are equivalent:
(1) $L$ is the lattice of closed sets of an antimatroid.
(2) $L$ is locally lower distributive.
(3) Every element in $L$ has a unique decomposition into a join of join-irreducibles.
Condition (3) is a finite version of the Krein-Milman theorem, which says that every compact convex set in $\mathbb{R}^n$ is the convex closure of its extreme points. It follows from Theorems 3.2 and 3.4 that locally lower distributive lattices are consistent. For more about antiexchange closures and locally lower distributive lattices, see 19 and 52. We end this section with a discussion of subobjects and morphisms of matroids. If $G$ is a matroid on the set $S$ and $T \subseteq S$, then the submatroid $G|T$ is the matroid on $T$ with the rank function of $G$ restricted to subsets of $T$. The matroid $G|T$ is also described as the matroid obtained by deleting the complement $S \setminus T$ from $G$. Note that the lattice of flats of a submatroid of $G$ is usually not a sublattice of the lattice $L(G)$. If $a$ is an element in $S$, the deletion $G \setminus \{a\}$ is often written simply as $G \setminus a$. If $U \subseteq S$, the contraction $G/U$ of $G$ by $U$ is the matroid on the complement $S \setminus U$ with rank function $\mathrm{rank}_{G/U}(A) = \mathrm{rank}_G(A \cup U) - \mathrm{rank}_G(U)$ for a subset $A$ in $S \setminus U$. The lattice of flats of $G/U$ is isomorphic to the upper interval $[\overline{U}, 1]$ in $L(G)$. As for deletions, the contraction $G/\{a\}$ is often written as $G/a$. The lattice of flats of $G/a$ is the upper interval $[\{a\}, 1]$. Thus, the simplification of $G/a$ is the matroid defined on the lines of $G$ containing $a$ with lattice of flats $[\{a\}, 1]$. In particular, contraction by a point $a$ corresponds to the classical geometric operation of projection from the point $a$.
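The rank formulas for submatroids and contractions translate directly into code. The following is a minimal sketch, assuming a matroid is given only by a rank oracle on subsets of its ground set; the function names are hypothetical. The example is the three-point line on {a, b, c}.

```python
def restriction_rank(rank, T):
    """Rank function of the submatroid G|T: the rank of G restricted to subsets of T."""
    T = frozenset(T)
    def r(A):
        A = frozenset(A)
        if not A <= T:
            raise ValueError("A must be a subset of T")
        return rank(A)
    return r

def contraction_rank(rank, U):
    """Rank function of G/U: rank(A u U) - rank(U) for A disjoint from U."""
    U = frozenset(U)
    def r(A):
        A = frozenset(A)
        if A & U:
            raise ValueError("A must be disjoint from U")
        return rank(A | U) - rank(U)
    return r

# The three-point line on {a, b, c}: rank(A) = min(|A|, 2).
rank_G = lambda A: min(len(A), 2)

rank_del = restriction_rank(rank_G, {"b", "c"})   # G\a, a two-point line
rank_con = contraction_rank(rank_G, {"a"})        # G/a, on the set {b, c}

deleted_rank = rank_del({"b", "c"})
contracted_point_rank = rank_con({"b"})
contracted_pair_rank = rank_con({"b", "c"})
```

In G/a the elements b and c each have rank 1 and together still have rank 1, so they are parallel. The simplification of G/a is therefore a single point, matching the single line of G through a.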
Contractions and deletions commute. A matroid $H$ is a minor of $G$ if it can be obtained from $G$ by a sequence of contractions and restrictions. Minors are subobjects when the morphisms are strong maps. There are two other categories of matroids: weak maps or specializations, and comaps. See 40 and 41 for more information on this. If $G$ and $H$ are matroids on disjoint sets $S$ and $T$, then their direct sum $G \oplus H$ is the matroid on the union $S \cup T$ with rank function $\mathrm{rank}(A) = \mathrm{rank}_G(A \cap S) + \mathrm{rank}_H(A \cap T)$. Taking the direct sum corresponds to putting the matroids $G$ and $H$ in the most general position possible, that is, in different dimensions. The lattice $L(G \oplus H)$ is the (cartesian) product $L(G) \times L(H)$. A matroid is connected if it is not the direct sum of two proper submatroids. An element $a$ in a matroid $G$ is an isthmus if $\mathrm{rank}(\{a\}) = 1$ and $G$ equals the direct sum $(G \setminus a) \oplus G|\{a\}$.
4. GRAPH THEORY WITHOUT VERTICES
Matroids can also be axiomatized using circuits. Circuits are abstractions of cycles in graphs and minimal linearly dependent sets in vector spaces. Matroid theorists usually allow graphs to have loops and multiple edges. A matroid on the set $S$ can be specified by a collection of non-empty subsets of $S$ called circuits satisfying the following axioms.
(C1) If $C_1$ and $C_2$ are circuits, then $C_1 \not\subset C_2$ and $C_2 \not\subset C_1$.
(C2) The circuit elimination axiom. If $C_1$ and $C_2$ are circuits and $x \in C_1 \cap C_2$, then there exists a third circuit $C_3$ such that $C_3 \subseteq (C_1 \cup C_2) \setminus \{x\}$.
The cycles of a graph $\Gamma$ satisfy the circuit axioms and specify a matroid on the edge set $E$ of $\Gamma$. This matroid is called the cycle matroid $M(\Gamma)$ of the graph $\Gamma$. The bases of $M(\Gamma)$ are the spanning forests of $\Gamma$. The rank of a set $A$ of edges is given by the formula $\mathrm{rank}(A) = |V| - c(A)$, where $V$ is the vertex set and $c(A)$ is the number of connected components of the edge-subgraph $\Gamma|A$. A matroid $G$ is said to be graphic if there is a graph $\Gamma$ such that $G = M(\Gamma)$. The cycle matroid $M(\Gamma)$ of a graph can be represented over any field by an oriented vertex-edge incidence matrix. This matrix has rows indexed by the vertex set and columns indexed by the edge set. The column indexed by the edge $e$ is defined as follows. If $e$ is a loop (that is, an edge joining a vertex to itself), then the column is the zero column vector. If $e$ joins distinct vertices $u$ and $v$, then one of the coordinates, say, the $u$-th coordinate is 1, the other coordinate, the $v$-th coordinate, is $-1$, and all other coordinates are zero. One of the motivations for introducing matroids was to extend the construction of the dual graph of a planar graph to arbitrary graphs 70. Let $\Gamma$ be a planar graph drawn on the surface of a sphere in real 3-dimensional space. Then the drawing of $\Gamma$ divides the sphere into connected components, called faces. Each edge $e$ in the drawing is on the boundary of two faces, except when $e$ is an isthmus, that is, when removing $e$ disconnects the graph. In this case, $e$ is on the "boundary" of one face. The dual graph $\Gamma^\perp$ is the graph defined as follows. The vertex set is the set of faces of $\Gamma$. If $e$ is an edge in $\Gamma$ on the boundary of two faces $f_1$ and $f_2$, then the vertices $f_1$ and $f_2$ are joined by an edge in $\Gamma^\perp$. If $e$ is an isthmus and it is on the boundary of the face $f$, then there is a loop on the vertex $f$ in $\Gamma^\perp$. In particular, there is a one-to-one correspondence between the edges in $\Gamma$ and $\Gamma^\perp$ and we can label the corresponding edges by the same label. Whitney observed that a subset $C$ of edges is a cycle in $\Gamma$ if and only if it is a minimal cutset in $\Gamma^\perp$. Using this, he defined the dual of a graph $\Gamma$ to be the matroid $M^\perp(\Gamma)$ whose circuits are the minimal cutsets of $\Gamma$. In general, the matroid $M^\perp(\Gamma)$ is not graphic. Whitney 70 proved the following theorem, which is equivalent to Kuratowski's theorem for planar graphs. 4.1. THEOREM. A graph is planar if and only if its dual matroid is graphic.
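The rank formula for cycle matroids, rank(A) = |V| - c(A), is easy to compute with a union-find structure. The following is a minimal sketch; the function name and the encoding of edges as vertex pairs are assumptions made for illustration. Loops and multiple edges are allowed, as in the text.

```python
def cycle_matroid_rank(vertices, edges):
    """rank(A) = |V| - c(A), where c(A) counts the connected components of the
    subgraph with all vertices of V and only the edges in A."""
    parent = {v: v for v in vertices}

    def find(v):
        # Find the component representative, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:          # the edge joins two components
            parent[ru] = rv
            components -= 1
    return len(parent) - components

V = [1, 2, 3, 4]
K4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

rank_K4 = cycle_matroid_rank(V, K4)                               # spanning tree size
rank_triangle = cycle_matroid_rank(V, [(1, 2), (2, 3), (1, 3)])   # a cycle is dependent
rank_loop = cycle_matroid_rank(V, [(1, 1)])                       # a loop has rank 0
```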
A set of edges is a minimal cutset in the cycle matroid $M(\Gamma)$ if and only if it is the complement of a copoint. Abstracting this, we can define the (orthogonal) dual $G^\perp$ of an arbitrary matroid $G$ by specifying that $C$ is a circuit in $G^\perp$ if and only if $C$ is the complement of a copoint in $G$. For this reason, complements of copoints are called cocircuits. An equivalent way to define the dual is to specify that $B$ is a basis in $G^\perp$ if and only if the complement $S \setminus B$ is a basis in $G$. In particular, the rank of $G^\perp$ equals $|S| - \mathrm{rank}(G)$, the nullity of $G$. The basis description is perhaps the easiest way to see that the dual is indeed a matroid. If a rank-$n$ matroid $G$ is represented by an $n \times s$ matrix $M$, then its dual $G^\perp$ is represented by any $(s - n) \times s$ matrix $M'$ such that $M'$ has rank $s - n$ and every row of $M'$ is perpendicular (under the usual dot product) to every row of $M$. This accounts for our notation and terminology. A survey of orthogonal duality can be found in 12.
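One concrete way to produce such a matrix $M'$ can be sketched as follows, under the additional assumption (not required by the text) that $M$ is given in the standard form $[I_n \mid A]$. In that case $M' = [-A^T \mid I_{s-n}]$ has rank $s - n$ and rows perpendicular to those of $M$. The helper names and the sample block `A` are hypothetical.

```python
def with_identity(A):
    """Build M = [I_n | A] from an n x (s - n) block A."""
    n = len(A)
    return [[1 if t == i else 0 for t in range(n)] + list(A[i]) for i in range(n)]

def dual_standard_form(A):
    """Build M' = [-A^T | I_(s-n)], whose rows are orthogonal to the rows of [I_n | A]."""
    n, m = len(A), len(A[0])
    return [[-A[i][j] for i in range(n)] + [1 if t == j else 0 for t in range(m)]
            for j in range(m)]

# A hypothetical 2 x 3 block, so n = 2 and s = 5.
A = [[1, 1, 0],
     [1, 0, 1]]
M = with_identity(A)            # 2 x 5, represents a rank-2 matroid
M_dual = dual_standard_form(A)  # 3 x 5, represents its dual

# Every row of M_dual is perpendicular to every row of M.
dots = [sum(x * y for x, y in zip(row, drow)) for row in M for drow in M_dual]
```

The identity block on the right of `M_dual` guarantees that it has rank $s - n = 3$.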
A matroid $G$ is said to be cographic if there is a graph $\Gamma$ such that $G = M^\perp(\Gamma)$. Cographic matroids share many properties with cycle matroids of planar graphs. For example, Tutte 67 showed that a matroid is cographic if and only if it does not contain a minor isomorphic to the 4-point line, the Fano plane $F_7$, the dual of the Fano plane $F_7^\perp$, the cycle matroid $M(K_5)$ of the complete graph $K_5$, or the cycle matroid $M(K_{3,3})$ of the complete bipartite graph $K_{3,3}$. Many results about graphs can be extended to results about matroids. For treatments of matroid theory emphasizing the graph-theoretic connections, see [B11], [S8], and [S9].
5. GRAPH THEORY AND LEAN LINEAR ALGEBRA
In this section, we present a view of graph theory as "lean" linear algebra. The Hamming weight (relative to a given basis) of a vector in $F^n$ is the number of non-zero coordinates in it. A vector is lean if it has Hamming weight at most 2. Using an oriented vertex-edge incidence matrix, the cycle matroid of a graph can be represented by lean vectors. This suggests that lean vectors behave similarly to edges of graphs. This idea is one of the motivations behind Dowling group matroids (15, 17). In this section, we describe a construction of Dowling group matroids which first appeared as part of a proof in 33. Our description will make explicit some of the techniques in 33. Let $A$ be a group and let $p_1, p_2, \ldots, p_n$ be a set of "basis points" called joints. For every group element $\alpha$ and pair of indices $i$ and $j$ such that $i < j$, let $\alpha_{ij}$ be a point called an internal point. The point $\alpha_{ij}$ is interpreted as the weight-2 linear combination $p_i - \alpha p_j$. These points are thought of as points in a projective space. Thus, $p_j - \alpha^{-1} p_i$ should be the same point as $\alpha_{ij}$. Accordingly, we define $\alpha_{ji}$ to be $(\alpha^{-1})_{ij}$ when $i < j$. The points $p_i - \alpha p_j$ are on the line $p_i \vee p_j$ spanned by $p_i$ and $p_j$. Thus, 3-element subsets of the following form should be circuits:
(5.1) $\{p_i, \alpha_{ij}, p_j\}$, $\{p_i, \alpha_{ij}, \beta_{ij}\}$, $\{\alpha_{ij}, \beta_{ij}, \gamma_{ij}\}$.
In addition, the three points $\alpha_{ij}$, $\beta_{jk}$, and $(\alpha\beta)_{ik}$ satisfy the linear relation $(p_i - \alpha p_j) + \alpha(p_j - \beta p_k) - (p_i - (\alpha\beta) p_k) = 0$. Hence, 3-element subsets of the form
(5.2) $\{\alpha_{ij}, \beta_{jk}, (\alpha\beta)_{ik}\}$
should also be circuits. The 3-element circuits of the form (5.1) or (5.2) are called atomic circuits. In the case when $A$ is the multiplicative group of a field, all the circuits of the linear matroid on the vectors $\{p_i : 1 \le i \le n\} \cup \{\alpha_{ij} : \alpha \in A,\ 1 \le i < j \le n\}$ can be deduced, by the circuit elimination axiom, from the atomic circuits. An example should make this clear. From the atomic circuits $\{p_1, \alpha_{12}, p_2\}$ and $\{\alpha_{12}, \beta_{23}, (\alpha\beta)_{13}\}$, we can eliminate $\alpha_{12}$ and conclude that $\{p_1, p_2, \beta_{23}, (\alpha\beta)_{13}\}$ contains a circuit. Since this set does not contain an atomic circuit, it is in fact a circuit and we have deduced the circuit $\{p_1, p_2, \beta_{23}, (\alpha\beta)_{13}\}$ from the circuits $\{p_1, \alpha_{12}, p_2\}$ and $\{\alpha_{12}, \beta_{23}, (\alpha\beta)_{13}\}$ using the circuit elimination axiom. One can show, by induction, that this deduction process is unambiguous and produces exactly the circuits of the linear matroid $Q_n(A)$. Circuit elimination for lean vectors does not use the additive structure of the field. Thus, when $A$ is an arbitrary group, one can define the Dowling group matroid $Q_n(A)$ of rank $n$ based on the group $A$ to be the matroid on the set $\{p_i : 1 \le i \le n\} \cup \{\alpha_{ij} : \alpha \in A,\ 1 \le i < j \le n\}$ with circuits all the circuits deducible from the atomic circuits (5.1) and (5.2) by circuit elimination. If the group $A$ is finite, then $Q_n(A)$ has $n + |A|\binom{n}{2}$
points. There are three kinds of lines in $Q_n(A)$: coordinate lines $p_i \vee p_j$ with $|A| + 2$ points, transversal lines $\{\alpha_{ij}, \beta_{jk}, (\alpha\beta)_{ik}\}$ with three points, and two-point lines. Zaslavsky 72 has associated a complete gain graph with the Dowling group matroid $Q_n(A)$. The vertices of the graph are the integers $i$, $1 \le i \le n$. On each vertex $i$ is a half-edge representing the joint $p_i$. Between two vertices $i$ and $j$, there are $|A|$ edges labelled with group elements, with the edge labelled by $\alpha$ representing the internal point $\alpha_{ij}$ when $i < j$. Using the complete gain graph, Zaslavsky has described explicitly the circuits of $Q_n(A)$. A balanced cycle is a cycle $\alpha_{hi}, \beta_{ij}, \gamma_{jk}, \ldots, \delta_{lh}$ such that the product $\alpha\beta\gamma\cdots\delta$ equals the identity in the group $A$. An unbalanced cycle is either a half-edge (thought of as a cycle of length one) or a cycle $\alpha_{hi}, \beta_{ij}, \gamma_{jk}, \ldots, \delta_{lh}$ such that the product $\alpha\beta\gamma\cdots\delta$ does not equal the identity. 5.1. LEMMA. A circuit of $Q_n(A)$ is a balanced cycle, a handcuff, that is, two disjoint unbalanced cycles connected by a non-empty path, a figure-of-eight, that is, two unbalanced cycles meeting at exactly one vertex, or a theta-graph, that is, a union of two unbalanced cycles intersecting at a (connected) non-empty subpath. Using this lemma, one can visualize "graphically" the circuits in $Q_n(A)$. For example, it is not hard to "see" the following lemma. 5.2. LEMMA. Every closed set $X$ has a "natural" basis consisting of a set $\{p_i : i \in I\}$ of joints and a set $B$ of internal points such that in the complete gain graph, no edge in $B$ is incident on a vertex in $I$ and the edges in $B$ form a forest. From Lemma 5.2, it is immediate that every closed submatroid of $Q_n(A)$ is isomorphic to a direct sum $Q_{t_0}(A) \oplus M(K_{t_1+1}) \oplus M(K_{t_2+1}) \oplus \cdots \oplus M(K_{t_r+1})$, where $t_i \ge 0$ and $t_0 + t_1 + \cdots + t_r = n$. From this, it follows that the simplification of any contraction of $Q_n(A)$ by a rank-$k$ flat is isomorphic to $Q_{n-k}(A)$. The closed sets can also be interpreted as "$A$-labelled partitions". Dowling's original definition of $Q_n(A)$ in 17 is in terms of $A$-labelled partitions, but this is perhaps not the easiest way to think about $Q_n(A)$. Special cases of Dowling group matroids occur in graph theory and algebra. The Dowling group matroid $Q_n(\{1\})$ over the group of order 1 is isomorphic to the cycle matroid $M(K_{n+1})$ of the complete graph $K_{n+1}$. It is also the matroid of the root system or hyperplane arrangement $A_{n+1}$. The Dowling group matroid $Q_n(\{+1, -1\})$ over the group of signs of order 2 is the matroid of the root system $B_n$. More generally, if $A$ is the group of $k$-th roots of unity, $Q_n(A)$ is the matroid of a complex reflection arrangement. For more about hyperplane arrangements and root systems, see [B10]. From our discussion, it follows that the matroid structure of $Q_n(A)$ is determined by its 3-element circuits. 5.3. THEOREM. Let $A$ be a group and let $G$ be a rank-$n$ simple matroid. Suppose that the points in $G$ can be labelled by the set $\{p_i : 1 \le i \le n\} \cup \{\alpha_{ij} : \alpha \in A,\ 1 \le i < j \le n\}$
so that $\{p_1, p_2, \ldots, p_n\}$ is a basis and 3-element subsets of the form (5.1) or (5.2) are circuits. Then $G$ is isomorphic to $Q_n(A)$. In particular, if $G$ is a rank-$n$ simple matroid with $\binom{n+1}{2}$ points and the points can be labelled by $\{p_i : 1 \le i \le n\} \cup \{\varepsilon_{ij} : 1 \le i < j \le n\}$ so that $\{p_1, p_2, \ldots, p_n\}$ is a basis and 3-element subsets of the form $\{p_i, \varepsilon_{ij}, p_j\}$ and $\{\varepsilon_{ij}, \varepsilon_{jk}, \varepsilon_{ik}\}$ are circuits, then $G$ is isomorphic to the cycle matroid $M(K_{n+1})$ of the complete graph $K_{n+1}$.
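The construction of $Q_n(A)$ by lean vectors can be checked numerically when $A$ is the multiplicative group of a prime field. The following is a minimal sketch, assuming $A = \mathrm{GF}(5)^*$ and $n = 3$; all function names are hypothetical and indices are 0-based in the code. It builds the vectors $p_i$ and $p_i - \alpha p_j$, counts the points, checks that triples of the form (5.2) are dependent, and confirms the 4-element circuit deduced in the worked example earlier in this section.

```python
from itertools import combinations, product

def rank_mod_p(rows, p):
    """Rank of a list of integer vectors over GF(p), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]   # work on a copy
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)       # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def dowling_points(n, p):
    """Joints p_i (unit vectors) and internal points alpha_ij = p_i - alpha * p_j."""
    pts = {("p", i): [1 if t == i else 0 for t in range(n)] for i in range(n)}
    for i, j in combinations(range(n), 2):
        for alpha in range(1, p):
            v = [0] * n
            v[i], v[j] = 1, (-alpha) % p
            pts[(alpha, i, j)] = v
    return pts

n, p = 3, 5
pts = dowling_points(n, p)

# Each vector has leading coordinate 1, so distinct vectors are distinct points.
num_points = len({tuple(v) for v in pts.values()})

# Triples of the form (5.2) span only a line, so they are 3-element circuits.
atomic_52_ok = all(
    rank_mod_p([pts[(a, 0, 1)], pts[(b, 1, 2)], pts[((a * b) % p, 0, 2)]], p) == 2
    for a, b in product(range(1, p), repeat=2)
)

# The deduced set {p_1, p_2, beta_23, (alpha beta)_13} is dependent, while each
# of its 3-element subsets is independent, so it is a circuit.
a, b = 2, 3
four = [pts[("p", 0)], pts[("p", 1)], pts[(b, 1, 2)], pts[((a * b) % p, 0, 2)]]
four_set_is_circuit = (
    rank_mod_p(four, p) == 3
    and all(rank_mod_p(list(t), p) == 3 for t in combinations(four, 3))
)
```

Here the point count is $n + |A|\binom{n}{2} = 3 + 4 \cdot 3 = 15$.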
The method in 33 also shows that every circuit $C$ in $Q_n(A)$ has a chord, that is, an element $a$ such that for some partition $C_1 \cup C_2$ of $C$, both $C_1 \cup \{a\}$ and $C_2 \cup \{a\}$ are circuits. Hence, closure using only 3-element circuits coincides with closure in $Q_n(A)$, that is, $Q_n(A)$ is 2-closed or line-closed. (Crapo defined and studied $k$-closure in u. The special case of 2-closure or line-closure was rediscovered by Halsey 30.) When $n = 3$, we do not need associativity in the group $A$ to define $Q_n(A)$. Hence, rank-3 Dowling matroids can be defined using a quasi-group or, more or less equivalently, a Latin square. However, when the rank is greater than 3, Desargues' theorem holds and $A$ must be a group for $Q_n(A)$ to be definable. The idea of lean linear algebra is developed further in [S7, Section 4.5], where abstract linear functionals are defined and applied to the critical problem on Dowling group matroids. We close this section with an intriguing but somewhat technical problem. 5.4. PROBLEM. Let $A$ be a finite group. Find the maximum number $r(n, A)$ of points in a rank-$n$ simple matroid containing $Q_n(A)$ as a submatroid but not containing a $(|A| + 3)$-point line as a minor. If $A$ is the cyclic group of order $q - 1$, where $q$ is a prime power, then $Q_n(A)$ is a submatroid of $PG(n - 1, q)$. Hence, $r(n, A)$ equals $(q^n - 1)/(q - 1)$ in this case. However, for other groups $A$, almost nothing is known.
6. VARIETIES OF FINITE MATROIDS
Graph theorists have an advantage over matroid theorists in that there is a "biggest simple graph" on $n$ vertices, the complete graph $K_n$, and every simple graph on $n$ vertices is an edge subgraph of $K_n$. This allows, for example, a natural model for a random graph $\Gamma$ on $n$ vertices: a given edge of $K_n$ is in $\Gamma$ with a certain probability. Projective geometers have a similar advantage in that they consider sets of points inside the ambient space $PG(n - 1, F)$. For example, if the point of intersection of two coplanar lines is not in the set, it can always be added. In addition, such points can be added consistently. We can abstract the notion of ambient space in the following way. A hereditary class of matroids is a class $\mathcal{C}$ of matroids satisfying the following conditions.
(H1) If $G$ and $H$ are matroids having isomorphic lattices of flats (or, equivalently, isomorphic simplifications) and $G$ is in $\mathcal{C}$, then $H$ is also in $\mathcal{C}$.
(H2) If $H$ is isomorphic to a minor of $G$ and $G$ is in $\mathcal{C}$, then $H$ is also in $\mathcal{C}$.
(H3) If $G$ and $H$ are in $\mathcal{C}$, then their direct sum $G \oplus H$ is in $\mathcal{C}$.
Thus, we can perform within a hereditary class the basic operations in projective geometry: restrictions to a subset of points, projections, and direct sums. A class of matroids satisfying (H1) and (H2) is said to be minor-closed. See [S4] and [S10] for more about minor-closed classes. A sequence $T_n$, $0 \le n < \infty$, of simple matroids is a sequence of universal models for the hereditary class $\mathcal{C}$ if
(UC1) the matroid $T_n$ has rank $n$, and
(UC2) every rank-$n$ simple matroid in $\mathcal{C}$ is isomorphic to a submatroid of $T_n$.
Universal models, if they exist, serve as ambient spaces for the hereditary class $\mathcal{C}$. A variety is a hereditary class with a sequence of universal models. The possibility of taking direct sums, which seems a weak condition, is in fact very strong when combined with the existence of a sequence of universal models. Rather remarkably, there are only two families of non-degenerate sequences of finite universal models, projective geometries and Dowling group matroids 33. To state this result precisely, we need to define three other "degenerate" sequences of universal models. The rank-$n$ free matroid $U_{n,n}$ is the rank-$n$ matroid with exactly $n$ points. Its lattice of flats is the Boolean algebra of all subsets of an $n$-element set. If $n$ is even, the rank-$n$ matchstick matroid $M_n(q)$ of order $q$ is the direct sum of $n/2$ lines with $q + 1$ points; if $n$ is odd, it is the direct sum of $M_{n-1}(q)$ and a single point. The rank-$n$ origami matroid $O_n(q)$ of order $q$ is the matroid on $n + (n - 1)(q - 1)$ points constructed as follows. Start with a basis $e_1, e_2, \ldots, e_n$ and add $q - 1$ points on each of the lines $e_1 \vee e_2, e_2 \vee e_3, \ldots, e_{n-1} \vee e_n$. 6.1. THEOREM.
Let $T_n$ be a sequence of universal models of a hereditary class of finite matroids. Then $T_n$ is one of the following sequences:
(1) the free matroids $U_{n,n}$,
(2) the matchstick matroids $M_n(q)$ of order $q$ for some positive integer $q$,
(3) the origami matroids $O_n(q)$ of order $q$ for some positive integer $q$,
(4) the Dowling group matroids $Q_n(A)$ over a finite group $A$,
(5) the projective geometries $PG(n - 1, q)$ over the finite field of order $q$ for some prime power $q$.
Theorem 6.1 says that projective geometries and Dowling group matroids are the only finite non-degenerate ambient spaces for matroids. Philosophically, this theorem says that matroid theory is the intersection of projective geometry and (generalized) graph theory. An element $y$ in a lattice $L$ with maximum 1 and minimum 0 is a complement of the element $x$ if $x \wedge y = 0$ and $x \vee y = 1$. A geometric lattice is modularly complemented if every flat has a complement which is a modular flat. Using the methods in 33, we can describe the simple matroids with modularly complemented lattices of flats 34. 6.2. THEOREM. Let $G$ be a connected simple matroid having rank at least 4. Then $G$ has a modularly complemented lattice of flats if and only if $G$ is a Dowling group matroid or a submatroid $PG(n - 1, F) \setminus T$ of a projective geometry, where $T$ is a subset of points all having maximum Hamming weight $n$. In particular, for $n \ge 4$, there are exactly three rank-$n$ connected binary simple matroids with modularly complemented lattices of flats: the cycle matroid $M(K_{n+1})$, the projective geometry $PG(n - 1, 2)$, and the punctured projective geometry $PG(n - 1, 2) \setminus \{o\}$, obtained by removing a point from $PG(n - 1, 2)$. From Theorem 6.2, we derive the following corollary. $M(K_{n+1})$ is the minimal rank-$n$ connected simple matroid with a modularly complemented lattice of flats, in the sense that its lattice of flats is modularly complemented and every rank-$n$ connected simple matroid with a modularly complemented lattice of flats contains it as a submatroid. The lattice of flats of $M(K_{n+1})$ is the rank-$n$ lattice of partitions of an $(n + 1)$-element set ordered by reverse refinement. There has been much interest, inspired partly by work of Whitman (see p. 235 ff. in [B6]), in finding lattice-theoretic characterizations of partition lattices. From Theorem 6.2, we obtain the following characterization. 6.3. THEOREM. A connected rank-$n$ geometric lattice is the rank-$n$ partition lattice if and only if it is modularly complemented and contains exactly $\binom{n+1}{2}$ points.
We end this section with two suggestions for further research. 6.4. PROBLEM. Classify varieties of infinite matroids.
Examples in 33 show that varieties of infinite matroids are much more difficult to study than varieties of finite matroids. Groh 29 has studied varieties of "topological" matroids. However, his conditions rule out all universal models except for certain infinite projective geometries. Any satisfactory theory should include as universal models Dowling matroids based on infinite groups and algebraic matroids, that is, matroids on elements of a field extension defined by algebraic rather than linear dependence. 6.5. RESEARCH AREA. Develop a theory of random matroids. A theory of random sets of points in a finite projective geometry has been developed (35, 36) but a theory of random matroids or perhaps, random submodular functions, is yet to be discovered.
7. SECRET-SHARING MATROIDS
Suppose a secret number $c$ is to be "shared" among a set of people so that any subset of $k$ people can pool their information and compute $c$, but any subset of fewer than $k$ people cannot derive any information about $c$ by pooling their information. An elegant way to do this is the Shamir threshold scheme. In this scheme, the number $c$ is the constant term of a polynomial $p(x)$ of degree $k - 1$. Each person is given a different point $(a, p(a))$ on the graph of $p(x)$. By the Lagrange interpolation formula, $p(x)$, and hence $c$, can be reconstructed using any $k$ points, but knowing $k - 1$ or fewer points does not give any information about $c$. Secret-sharing matroids arose out of this circle of ideas. For more on secret-sharing schemes, see Chapter 11 of 63. The best way to describe a secret-sharing matrix is with the motivating example. Let $G$ be a matroid represented as a set $S$ of vectors in the $d$-dimensional vector space $[\mathrm{GF}(q)]^d$ over the finite field $\mathrm{GF}(q)$ of order $q$. Then we can construct a $q^d \times |S|$ matrix $M$ as follows. The rows are labelled by the $q^d$ linear functionals on $[\mathrm{GF}(q)]^d$, the columns are labelled by the vectors in $S$, and the $(L, a)$-entry is the value $L(a)$. Consider a subset $A \subseteq S$ of columns, a column $b$ not in $A$, and a linear functional $L$. Let
$n(L, b, A) = \{L'(b) : L'(a) = L(a) \text{ for all } a \in A\}$,
that is, let $n(L, b, A)$ be the set of $\mathrm{GF}(q)$-elements of the form $L'(b)$, where $L'$ is a linear functional such that $L' = L$ when restricted to $A$. If $b$ is in the linear span of $A$, then the value $L(b)$ is determined by the values $L(a)$, $a \in A$. Hence, $n(L, b, A)$ is a single-element set. On the other hand, if $b$ is not in the linear span of $A$, then the value $L'(b)$ can be prescribed independently of the values of $L'$ on $A$. Hence, when $L'$ ranges over all linear functionals such that $L' = L$ when restricted to $A$, we get all possible elements in $\mathrm{GF}(q)$ and hence $n(L, b, A) = \mathrm{GF}(q)$. Recall from linear algebra that a vector $b$ is in the linear span of $A$ if and only if for every linear functional $L$, the value $L(b)$ is determined by the values of $L$ on the vectors in $A$. Therefore, we have the following numerical characterization of closure in the matroid $G$:
(7.1) $b \in \overline{A} \iff |n(L, b, A)| = 1$ for all $L$.
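The motivating example can be carried out by machine for a small case. The following is a minimal sketch, assuming a prime field GF(p) so that arithmetic is ordinary arithmetic modulo p; the function names and the sample vector set `S` are hypothetical. Linear functionals are encoded as coefficient tuples, and columns are referred to by their positions in `S`.

```python
from itertools import product

def secret_sharing_matrix(vectors, p):
    """Rows: all p^d linear functionals on GF(p)^d, as coefficient tuples.
    Columns: the given vectors. The (L, a)-entry is L(a)."""
    d = len(vectors[0])
    functionals = product(range(p), repeat=d)
    return [[sum(c * x for c, x in zip(L, v)) % p for v in vectors]
            for L in functionals]

def n_set(M, i, b, A):
    """n(i, b, A): entries of column b in the rows that agree with row i on A."""
    return {row[b] for row in M if all(row[c] == M[i][c] for c in A)}

def in_closure(M, b, A):
    """Condition (7.1): b is in the closure of A iff |n(i, b, A)| = 1 for all rows i."""
    return all(len(n_set(M, i, b, A)) == 1 for i in range(len(M)))

p = 3
S = [(1, 0), (0, 1), (1, 1), (2, 2)]   # columns 0, 1, 2, 3
M = secret_sharing_matrix(S, p)        # a 9 x 4 matrix

spanned = in_closure(M, 2, [0, 1])     # (1,1) lies in the span of (1,0), (0,1)
parallel = in_closure(M, 3, [2])       # (2,2) is a multiple of (1,1)
independent = in_closure(M, 1, [0])    # (0,1) is not in the span of (1,0)

# When b is outside the span, every row sees all p field elements.
sizes_outside_span = {len(n_set(M, i, 1, [0])) for i in range(len(M))}
```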
Abstracting this situation, Brickell 6 defines a secret-sharing matrix. Let $M = [m_{ia}]$ be a matrix with rows indexed by the set $I$, columns indexed by the set $S$, and entries in an alphabet with $q$ symbols. If $i$ is a row, $A \subseteq S$ is a set of columns, and $b$ is a column not in $A$, then $n(i, b, A) = \{m_{jb} : j \in I \text{ and } m_{jc} = m_{ic} \text{ for all } c \in A\}$. A matrix $M$ is a secret-sharing matrix if for every subset $A \subseteq S$ and every column $b \notin A$, either $|n(i, b, A)| = 1$ for all rows $i$, or $|n(i, b, A)| = q$ for all rows $i$. A secret-sharing matrix defines the closure of a matroid by condition (7.1). Matroids which can be "represented" by secret-sharing matrices are called secret-sharing matroids. Any Latin square defines a secret-sharing matrix, which, in turn, defines a rank-1 matroid on the column set. Seymour 62 has shown that the Vámos matroid (described in Section 1) is not secret-sharing. Matúš 51 has found many other non-representable matroids which are not secret-sharing. The question remains whether Dowling group matroids are secret-sharing. Despite the fact that one can define abstract linear functionals on Dowling group matroids (see [S7], Section 4.5), one would conjecture that most Dowling group matroids are not secret-sharing. 7.1. CONJECTURE. If $A$ is not a cyclic group and $n \ge 3$, then the Dowling group matroid $Q_n(A)$ is not a secret-sharing matroid.
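The Shamir threshold scheme described at the start of this section can be sketched in a few lines. This is an illustration only, under several assumptions not made in the text: the secret is an integer modulo a fixed prime `P`, the share points are a = 1, 2, ..., n, and the coefficients come from Python's `random` module, which is not cryptographically secure. All names are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; an arbitrary choice of field size

def make_shares(secret, k, n, rng=random):
    """Pick a polynomial p(x) of degree at most k - 1 with constant term
    `secret` and return the n points (a, p(a)) for a = 1, ..., n."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(k - 1)]

    def p(x):
        acc = 0
        for c in reversed(coeffs):   # Horner's rule modulo P
            acc = (acc * x + c) % P
        return acc

    return [(a, p(a)) for a in range(1, n + 1)]

def reconstruct(shares):
    """Recover p(0) from k shares by Lagrange interpolation modulo P."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

# Illustration only: a seeded generator makes the run repeatable.
shares = make_shares(secret=123456789, k=3, n=5, rng=random.Random(0))
recovered = reconstruct(shares[:3])
recovered_other = reconstruct([shares[0], shares[2], shares[4]])
```

Any three of the five shares determine the polynomial, so both reconstructions return the secret. A real deployment would draw coefficients from a cryptographic source such as the standard `secrets` module.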
8. GREEDY ALGORITHMS, MATROID INTERSECTION, AND MATROID PARTITION
Matroids also occur prominently in the theory of combinatorial optimization. A reason for this is that collections of independent sets of matroids are set systems on which the greedy algorithm always works. Let $S$ be a finite set and $\mathcal{I}$ a collection of subsets of $S$ containing the empty set $\emptyset$. Let $w : S \to \mathbb{R}^+$ be a non-negative real-valued "weight" function on $S$. If $J$ is a subset of $S$, its weight $w(J)$ is defined to be the sum of the weights of its elements, that is,

$$w(J) = \sum_{a \in J} w(a).$$
The greedy algorithm attempts to find a subset of maximum weight in $\mathcal{I}$ in the following way: Start with $I = \emptyset$. Suppose that $I$ has been chosen. Amongst all the elements $a$ not in $I$ such that $I \cup \{a\}$ is in the collection $\mathcal{I}$, choose one for which $w(a)$ is maximum. Replace $I$ by $I \cup \{a\}$. When $I$ is a maximal subset in $\mathcal{I}$, stop and output $I$ as the subset having maximum weight. Edmonds 22 discovered that the independent sets of a finite matroid can be characterized using the greedy algorithm. Specifically, he showed that a collection $\mathcal{I}$ of subsets of a finite set $S$ is the collection of independent sets of a matroid on $S$ if and only if the following axioms hold.

(Gr1) $\emptyset \in \mathcal{I}$.

(Gr2) If $I \in \mathcal{I}$ and $J \subseteq I$, then $J \in \mathcal{I}$.
(Gr3) For every non-negative real-valued weight function on $S$, the greedy algorithm outputs a subset in $\mathcal{I}$ having maximum weight.

The greedy algorithm applied to the cycle matroid of a graph is Kruskal's algorithm for finding the maximum-weight spanning tree in a graph. A history of the greedy algorithm for trees in graphs can be found in 28. There is a similar axiomatization in which maximum-weight subsets are replaced by lexicographically-greatest subsets. The axiomatization using the greedy algorithm led to an interesting generalization of a matroid. A greedoid on a finite set $S$ is defined by a collection $\mathcal{I}$ of subsets of $S$ called feasible sets satisfying (Gr1), (Gr3), and the following weakening of (Gr2).

Accessibility. If $I$ is a non-empty feasible set, then there exists an element $a \in I$ such that $I \setminus \{a\}$ is feasible.
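The greedy algorithm for a matroid can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's own formulation: `greedy` takes a hypothetical independence oracle `is_independent`, and it scans the elements once in order of decreasing weight, which agrees with the step-by-step description only when the collection is closed under taking subsets, as in (Gr2). The oracle `is_forest` models the cycle matroid of a graph, so the run is Kruskal's algorithm for a maximum-weight spanning tree on a small made-up graph.

```python
from itertools import combinations

def greedy(ground, weight, is_independent):
    """Scan elements by decreasing weight, keeping each one that preserves independence."""
    chosen = []
    for a in sorted(ground, key=weight, reverse=True):
        if is_independent(chosen + [a]):
            chosen.append(a)
    return chosen

def is_forest(edges):
    """Independence oracle of the cycle matroid: True if the edges contain no cycle."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

# Illustrative weighted graph on the vertices a, b, c, d.
w = {("a", "b"): 4, ("b", "c"): 3, ("a", "c"): 5, ("c", "d"): 2, ("b", "d"): 1}
tree = greedy(w, w.get, is_forest)
print(tree, sum(w[e] for e in tree))  # [('a', 'c'), ('a', 'b'), ('c', 'd')] 11

# Brute-force comparison over all forests, feasible only for tiny examples.
best = max(sum(w[e] for e in F)
           for r in range(len(w) + 1)
           for F in combinations(w, r) if is_forest(F))
print(best)  # 11
```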
Introductions to greedoids can be found in [B7] and 4. The point of view of combinatorial optimization also led to the study of polytopes associated with matroids. The classic paper in this area is 21. Perhaps the deepest results in matroid theory coming out of combinatorial optimization are the matroid partition theorem and the matroid intersection theorem. Both theorems are due to Edmonds 20, 21. The matroid partition theorem is similar to the marriage theorem in matching theory: both assert that an "obviously" necessary condition is also sufficient.

8.1. MATROID PARTITION THEOREM. For $1 \le i \le m$, let $G_i$ be a matroid with rank function $\mathrm{rank}_i$ on the finite set $S$. Then there exist subsets $S_1, S_2, \ldots, S_m$ such that $S_i$ is independent in $G_i$ and $S_1 \cup S_2 \cup \cdots \cup S_m = S$ if and only if for every subset $A \subseteq S$,

$$\sum_{i=1}^{m} \mathrm{rank}_i(A) \ge |A|.$$
In particular, if $G$ is a matroid on $S$, then $S$ can be partitioned into $m$ independent sets if and only if for every subset $A \subseteq S$, $m \, \mathrm{rank}(A) \ge |A|$. Applying the matroid partition theorem to the matroid $G$ and the dual of the matroid $H$, we obtain the matroid intersection theorem.

8.2. MATROID INTERSECTION THEOREM. Let $G$ and $H$ be matroids with rank functions $\mathrm{rank}_G$ and $\mathrm{rank}_H$ on the same finite set $S$. Then the maximum size of a subset independent in both $G$ and $H$ equals $\min\{\mathrm{rank}_G(A) + \mathrm{rank}_H(B) : A \cup B = S\}$.

There are polynomial-time algorithms to find partitions into independent sets and maximum-sized common independent sets. These algorithms also give the clearest proofs of Theorems 8.1 and 8.2. For more about matroid algorithms (in particular, the matroid matching problem), see [B9], [S1], [S11], and 47.
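The min-max formula of the matroid intersection theorem can be verified by exhaustive search on a tiny example. This is only a sketch under assumptions: it is exponential in $|S|$ and is not one of the polynomial-time algorithms mentioned above; the helper names and the two partition matroids are made up for illustration; and the minimum is taken over pairs $(A, S \setminus A)$, which suffices because rank functions are monotone.

```python
from itertools import combinations

def subsets(S):
    """All subsets of S as frozensets."""
    S = sorted(S)
    for r in range(len(S) + 1):
        for c in combinations(S, r):
            yield frozenset(c)

def rank(indep, A):
    """Rank of A: the size of a largest independent subset of A."""
    return max(len(I) for I in subsets(A) if indep(I))

def partition_matroid(blocks):
    """Independence oracle: at most cap elements from each block; blocks is a list of (set, cap)."""
    def indep(I):
        return all(len(I & block) <= cap for block, cap in blocks)
    return indep

S = frozenset(range(4))
G = partition_matroid([({0, 1, 2}, 1), ({3}, 1)])                 # rank 2
H = partition_matroid([({0}, 1), ({1}, 1), ({2}, 1), ({3}, 0)])   # rank 3, element 3 is a loop

max_common = max(len(I) for I in subsets(S) if G(I) and H(I))
min_bound = min(rank(G, A) + rank(H, S - A) for A in subsets(S))
print(max_common, min_bound)  # 1 1
```

Here the minimum is attained at $A = \{0, 1, 2\}$, where $G$ has rank 1 and the complement $\{3\}$ has rank 0 in $H$.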
9. MATROID UNIONS AND TRANSVERSAL MATROIDS
The idea of partitioning into independent sets of different matroids leads to the following construction. If $G_1$ and $G_2$ are matroids on the same set $S$, then the matroid union $G_1 \vee G_2$ is the matroid whose independent sets are sets of the form $I_1 \cup I_2$, where $I_1$ is independent in $G_1$ and $I_2$ is independent in $G_2$. There is no known elementary proof that $G_1 \vee G_2$, as defined, is actually a matroid. See [B11, p. 403] for a proof. A comprehensive survey of matroid unions can be found in [B11, Chapter 12]. Roughly speaking, matroid union corresponds to putting one representation matrix on top of another. To make this intuition precise, we need a way to remove "accidental" linear dependences among the columns of a matrix. Let $G$ be an $F$-representable matroid on the set $S$ and let $M$ be a representation matrix of $G$ over $F$. Let $x_a$, $a \in S$, be indeterminates, one for each element of $S$, thought of as transcendental elements over an extension of the field $F$. The generic diagonal matrix $D$ on the set $S$ is the diagonal matrix with rows and columns indexed by $S$ whose $(a,a)$-entry on the diagonal is $x_a$. Then the product $MD$ is the matrix obtained from $M$ by multiplying the column indexed by $a$ by the indeterminate $x_a$. In particular, right multiplication by $D$ makes the columns of $M$ "algebraically independent".

9.1. THEOREM. Let $G_1$ and $G_2$ be $F$-representable matroids on the same set $S$ with representation matrices $M_1$ and $M_2$. Then the matrix $M$ defined by

$$M = \begin{pmatrix} M_1 \\ M_2 D \end{pmatrix}$$
is a representation matrix for the matroid union $G_1 \vee G_2$ over a transcendental extension of $F$. The proof uses the multiple Laplace expansion for a single matrix. If $M$ is a matrix, $I$ is a subset of row indices, and $J$ is a subset of column indices, then $M[I|J]$ is the $|I| \times |J|$ submatrix of $M$ obtained by restricting $M$ to the rows and columns indexed by $I$ and $J$ (keeping the same order as in $M$).

9.2. MULTIPLE LAPLACE EXPANSION. Let $N$ be an $l \times l$ square matrix, with rows and columns indexed by the integers $1, 2, \ldots, l$. Then

$$\det N = \sum (-1)^{i_1 + i_2 + \cdots + i_k + k(k+1)/2} \det N[1,2,\ldots,k \,|\, i_1,i_2,\ldots,i_k] \, \det N[k+1,k+2,\ldots,l \,|\, j_{k+1},j_{k+2},\ldots,j_l],$$
where the sum ranges over all $k$-element subsets $\{i_1, i_2, \ldots, i_k\}$ of $\{1, 2, \ldots, l\}$ and $\{j_{k+1}, j_{k+2}, \ldots, j_l\}$ is the complement of $\{i_1, i_2, \ldots, i_k\}$ in $\{1, 2, \ldots, l\}$. Let $M_2'$ be any representation matrix for $G_2$ over any field containing the field $F$, let $M$ be the matrix

$$M = \begin{pmatrix} M_1 \\ M_2' \end{pmatrix},$$

and let $H$ be the linear matroid defined by the matrix $M$. We shall prove that every $H$-independent set is the union of a $G_1$-independent set and a $G_2$-independent set. Suppose $I$ is $H$-independent. Then there exists an $|I|$-element subset $J$ of rows such that the square submatrix $M[J|I]$ with rows indexed by $J$ and columns indexed by $I$ is non-singular. Let $J_1$ be the subset of rows in $J$ from the matrix $M_1$ and let $J_2$ be the subset of rows in $J$ from $M_2'$. Expanding the determinant of $M[J|I]$ according to the Laplace expansion and observing that $\det M[J|I] \ne 0$, we conclude that one of the summands in the expansion, say, $\pm \det M_1[J_1|I_1] \det M_2'[J_2|I_2]$, where $I$ is the disjoint union of $I_1$ and $I_2$, is non-zero. Because both subdeterminants are non-zero, $I_1$ is independent in $G_1$ and $I_2$ is independent in $G_2$. Hence, $I$ is the union of a $G_1$-independent set and a $G_2$-independent set.

Conversely, suppose that $I$ is independent in $G_1 \vee G_2$. Then $I = I_1 \cup I_2$, where $I_1$ is $G_1$-independent and $I_2$ is $G_2$-independent. Removing elements from $I_1$ or $I_2$ if necessary, we may suppose that $I_1$ and $I_2$ are disjoint. Choose a subset $J_1$ of row indices so that $M_1[J_1|I_1]$ is non-singular and a subset $J_2$ so that $M_2[J_2|I_2]$ is non-singular. In the Laplace expansion of $\det M[J_1 \cup J_2 \,|\, I_1 \cup I_2]$, the term

$$\pm \det M_1[J_1|I_1] \, \det M_2[J_2|I_2] \Big( \prod_{a \in I_2} x_a \Big)$$
is non-zero. The monomial $\prod x_a$ comes from the indeterminates in the generic matrix $D$. Because every summand in the Laplace expansion has a different monomial and the indeterminates are algebraically independent, there are no algebraic relations between the summands. Hence, $\det M[J_1 \cup J_2 \,|\, I_1 \cup I_2]$ is non-zero and the columns indexed by $I_1 \cup I_2$ are linearly independent. This completes the proof of Theorem 9.1.

Important examples of matroid unions are transversal matroids. Let $R$ be a relation between the set $S$ and the set $\{1, 2, \ldots, m\}$. A subset $I$ in $S$ is said to have a partial matching if there is an injection $\iota : I \to \{1, 2, \ldots, m\}$ such that for every element $a$ in $I$, $a$ is related to $\iota(a)$. The sets with partial matchings form the independent sets of a matroid on $S$ called the transversal matroid $T(R)$ of the relation $R$. Transversal matroids are matroid unions of rank-1 matroids. If $G$ is a rank-1 matroid on the set $S$, then $S$ is the disjoint union of $\overline{\emptyset}$, the set of loops, and $S \setminus \overline{\emptyset}$, the set of elements having rank 1. Then the transversal matroid $T(R)$ is the matroid union $G_1 \vee G_2 \vee \cdots \vee G_m$, where $G_i$ is the rank-1 matroid whose set of rank-1 elements is $R^{-1}(i)$, the set $\{a : aRi\}$ of elements in $S$ related to the element $i$. Let $F$ be a field and let $x_{i,a}$ be indeterminates, one for each pair $(i,a)$ such that $a$ is related to $i$ in the relation $R$. Then, using Theorem 9.1 repeatedly, we conclude that the transversal matroid $T(R)$ can be represented over an extension of $F$ by the $m \times |S|$ matrix whose $(i,a)$-entry is $x_{i,a}$ if $aRi$ and $0$ otherwise. This matrix is called the free matrix of the relation. Frobenius 25 was the first to study free matrices. For a historical study, see 59. The following research area, proposed by Rota, may be of interest to the philosophically-minded.

9.3. RESEARCH AREA. Develop matching theory using free matrices, linear algebra, and the theory of determinants.

Some work in this direction can be found in 16, 31, 38. Many theorems in matching theory are special cases of matroid theorems. As one might expect, the marriage theorem is a special case of the matroid partition theorem. The matroid intersection theorem yields a necessary and sufficient condition for the existence of a common transversal for two relations and Rado's theorem for independent transversals. For more on matching theory, see 31 and 48. There are many unsolved problems in the area of matroid unions.
The best known one, posed by Welsh, is to characterize the union-irreducible matroids, that is, those matroids which cannot be expressed as the matroid union of two matroids, both having strictly smaller rank. See [B11, p. 474].
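Independence in a transversal matroid amounts to the existence of a partial matching, which can be tested with the standard augmenting-path method for bipartite matching. The sketch below is illustrative only: the function name `has_partial_matching`, the dictionary `related` (mapping each element of $S$ to the indices it is related to), and the small relation are all assumptions made for the example.

```python
from itertools import combinations

def has_partial_matching(I, related):
    """True if the elements of I can be sent injectively to indices they are related to."""
    match = {}  # index -> element currently matched to it

    def augment(a, seen):
        # Try to match a, re-routing previously matched elements along an augmenting path.
        for i in related[a]:
            if i in seen:
                continue
            seen.add(i)
            if i not in match or augment(match[i], seen):
                match[i] = a
                return True
        return False

    return all(augment(a, set()) for a in I)

# Illustrative relation between S = {a, b, c, d} and the indices {1, 2}.
related = {"a": {1}, "b": {1}, "c": {1, 2}, "d": set()}

print(has_partial_matching(["a", "c"], related))  # True
print(has_partial_matching(["a", "b"], related))  # False: a and b compete for index 1
print(has_partial_matching(["d"], related))       # False: d is a loop

# Bases of this transversal matroid: the independent sets of maximum size.
bases = [set(B) for B in combinations(sorted(related), 2) if has_partial_matching(B, related)]
print(bases)
```

In this example the bases are $\{a, c\}$ and $\{b, c\}$, so the matroid has rank 2, with $a$ and $b$ parallel and $d$ a loop.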
10. MATRIX MULTIPLICATION AND THE CAUCHY-BINET IDENTITY
Yet another useful determinantal identity in matroid theory is the Cauchy-Binet identity, which is a generalization of the homomorphism or multiplicative property $\det(MN) = \det M \det N$ of determinants.

10.1. THE CAUCHY-BINET IDENTITY. Let $M$ be an $n \times s$ matrix and $N$ be an $s \times n$ matrix with the columns of $M$ and the rows of $N$ labelled by the same set $K$. Then,
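$$\det(MN) = \sum_{I} \det M[I] \, \det N[I],$$

where the sum is over all $n$-element subsets $I$ of $K$, $M[I]$ is the $n \times n$ matrix obtained by restricting $M$ to the columns indexed by $I$, and $N[I]$ is the $n \times n$ matrix obtained by restricting $N$ to the rows indexed by $I$. For example,

$$\det\left( \begin{pmatrix} 1 & 4 & 0 \\ 2 & 3 & -1 \end{pmatrix} \begin{pmatrix} 1 & 3 \\ 0 & 7 \\ 6 & 9 \end{pmatrix} \right)$$

equals the sum

$$\begin{vmatrix} 1 & 4 \\ 2 & 3 \end{vmatrix} \begin{vmatrix} 1 & 3 \\ 0 & 7 \end{vmatrix} + \begin{vmatrix} 4 & 0 \\ 3 & -1 \end{vmatrix} \begin{vmatrix} 0 & 7 \\ 6 & 9 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 2 & -1 \end{vmatrix} \begin{vmatrix} 1 & 3 \\ 6 & 9 \end{vmatrix}.$$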
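The Cauchy-Binet identity is easy to confirm numerically. The following minimal sketch uses exact integer arithmetic on a small $2 \times 3$ and $3 \times 2$ pair of matrices; the helper names `det` and `matmul` are chosen here for illustration, and the cofactor expansion is only suitable for very small matrices.

```python
from itertools import combinations

def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

M = [[1, 4, 0], [2, 3, -1]]      # n x s with n = 2, s = 3
N = [[1, 3], [0, 7], [6, 9]]     # s x n

lhs = det(matmul(M, N))
# Sum over 2-element index sets I: columns of M indexed by I times rows of N indexed by I.
rhs = sum(det([[row[k] for k in I] for row in M]) * det([N[k] for k in I])
          for I in combinations(range(3), 2))
print(lhs, rhs)  # 142 142
```

The three summands are $-35$, $9$, and $168$ for the index sets $\{1,2\}$, $\{1,3\}$, and $\{2,3\}$ respectively.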
The Cauchy-Binet identity sheds some light on a special case of matroid intersection, the common basis problem: given two matroids $M$ and $N$ having the same rank, does there exist a subset which is a basis in both $M$ and $N$? When the matroids $M$ and $N$ are representable over the same field, the common basis problem is equivalent to determining whether a generic matrix product is non-singular.

10.2. THEOREM. Let $M$ and $N$ be rank-$n$ $n \times s$ matrices with columns indexed by the same set $S$. Then the linear matroids on $S$ defined by $M$ and $N$ have a common basis if and only if the matrix $MDN^t$ is non-singular. Here, $D$ is the generic diagonal matrix on $S$ defined in Section 9.
Theorem 10.2 follows from the identity

(10.1) $$\det(MDN^t) = \sum_{B} \det M[B] \, \det N[B] \Big( \prod_{a \in B} x_a \Big).$$
Since the indeterminates $x_a$ are algebraically independent, $\det(MDN^t) \ne 0$ if and only if one of the summands on the right-hand side is non-zero. The set $B$ indexing that term is a common basis of the linear matroids defined by $M$ and $N$. We can also use the Cauchy-Binet identity to define bimatroid multiplication. Let $B$ be a bimatroid between $S$ and $E$ and $C$ be a bimatroid between $E$ and $T$. Then the product bimatroid $C \circ B$ between $S$ and $T$ is defined by: $(X,U)$ is a non-singular minor in $C \circ B$ if and only if there exists $A \subseteq E$ such that $(X,A)$ is a non-singular minor in $B$ and $(A,U)$ is a non-singular minor in $C$. It is not easy to prove that the bimatroid product is in fact a bimatroid. In fact, the known proof uses the matroid intersection theorem. Because many constructions (such as strong maps, matroid unions, and matroid induction) can be modelled by matrix multiplication, bimatroid multiplication is a unifying idea in the study of matroid constructions. A product of oriented bimatroids can be defined, at the cost of introducing a lexicographic order on the subsets of C. (A similar idea is used in 44 to define oriented matroid-union.) A more natural product for oriented bimatroids (if it exists) would be very useful in combinatorial differential topology (see 1 and 2).

11. BASIS GENERATING FUNCTIONS AND THE MATRIX-TREE THEOREM
As in the previous two sections, let $\{x_a, a \in S\}$ be a set of indeterminates. The basis generating function $B(G;x)$ of the matroid $G$ on the set $S$ is the polynomial defined by

$$B(G;x) = \sum_{B} \prod_{a \in B} x_a,$$
where the sum ranges over all bases B of G. The basis generating function encodes in an algebraic form the description of the matroid G in terms of its bases. The basis generating function satisfies the following recursions.
If $a$ is neither a loop nor an isthmus, $B(G;x) = B(G \setminus a; x) + x_a B(G/a; x)$.

If $a$ is a loop, $B(G;x) = B(G \setminus a; x)$.

If $a$ is an isthmus, $B(G;x) = x_a B(G \setminus a; x)$.
For certain matroids, the basis generating function can be expressed as a determinant. Let $M$ be an $n \times s$ matrix representing the rank-$n$ matroid $G$ with $s$ elements. Then, by equation (10.1),

$$\det(MDM^t) = \sum_{B} (\det M[B])^2 \Big( \prod_{a \in B} x_a \Big).$$

Setting all the variables $x_a$ to 1, we have

$$\det(MM^t) = \sum_{B} (\det M[B])^2.$$
Hence, we have the following theorem.

11.1. THEOREM. Let $M$ be an $n \times s$ representation matrix for the rank-$n$ matroid $G$. If all the $n \times n$ square subdeterminants of $M$ have values $-1$, $0$ or $1$, then

$$B(G;x) = \det(MDM^t)$$

and the number of bases in $G$ equals $\det(MM^t)$.

A class of matroids to which the theorem can be applied is the class of regular matroids. A regular matroid is a matroid which is representable over all fields. It can be shown that a regular matroid $G$ can be represented by a totally unimodular matrix, that is, a matrix all of whose subdeterminants are $-1$, $0$, or $1$. The decomposition theorem of Seymour 61 says that every regular matroid can be built using simple operations from graphic matroids, cographic matroids, and a "sporadic" 10-point rank-5 matroid $R_{10}$. The oriented vertex-edge incidence matrix $M$ of a connected graph $\Gamma$ with one row deleted is a totally unimodular matrix representing the cycle matroid $M(\Gamma)$. Applying Theorem 11.1 to this case, one obtains the classical matrix-tree theorem of graph theory. Seymour's decomposition theorem indicates that Theorem 11.1 for regular matroids is almost the same as the matrix-tree theorem for graphs.
12. GENERIC RANK-GENERATING POLYNOMIALS
The basis generating function satisfies all except one of the contraction-and-deletion relations. Let $f$ be a function defined on matroids on finite sets. The function $f$ satisfies the contraction-and-deletion relations if

(CD1) $f(M) = f(N)$ if $M$ is isomorphic to $N$, and

(CD2) for every matroid $G$ and every element $a$ in $G$, $f(G) = f(G \setminus a) + f(G/a)$ if $a$ is neither an isthmus nor a loop, and $f(G) = f(G|\{a\}) f(G \setminus a)$ otherwise.

The rank-generating polynomial $R(G;x,\lambda)$ is the polynomial in the variables $x$ and $\lambda$ defined by

$$R(G;x,\lambda) = \sum_{A \subseteq S} \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)} x^{|A| - \mathrm{rank}(A)},$$

where the sum ranges over all subsets $A$ of $S$. The exponent of the variable $\lambda$ is the corank of the set $A$. Tutte 66 (in the special case of graphs) and Brylawski 8 showed that every function on matroids satisfying the contraction-and-deletion relations is a specialization of the rank-generating polynomial. Setting $x = -1$ in the rank-generating polynomial, we obtain the characteristic polynomial $\chi(G;\lambda)$ of $G$. Indeed, if $G$ has no loops, then
$$\chi(G;\lambda) = \sum_{X : X \in L(G)} \mu(\hat{0}, X) \, \lambda^{\mathrm{rank}(S) - \mathrm{rank}(X)},$$
where $\mu$ is the Möbius function 56 of the lattice $L(G)$; if $G$ has loops, then $\chi(G;\lambda) = 0$. Uninverting the Möbius inversion, we obtain the following identity which will be used later:

(12.1) $$\sum_{X : A \subseteq X} \chi(G/X;\lambda) = \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)}.$$
The rank-generating polynomial and its close relative, the Tutte polynomial, occur all over mathematics. A comprehensive survey to about 1991 can be found in 9. Some new applications can be found in 45, 69. Tutte polynomials for graphs with weights on their edges have been studied 64. (See also 5 and the references in there.) Many of the theorems for weighted graphs generalize in a straightforward way to weighted matroids. Let $G$ be a matroid on the set $S$, let $\{x_a, a \in S\}$ be a set of variables, and let $\lambda$ be a new variable. The Tugger polynomial $R(G;x,\lambda)$ is the polynomial in the variables $x_a$ and $\lambda$ defined by

$$R(G;x,\lambda) = \sum_{A \subseteq S} \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)} \Big( \prod_{a \in A} x_a \Big),$$
where the sum ranges over all subsets $A$ of $S$. For example, the Tugger polynomial of $U_{2,3}$, the 3-point line with points $a$, $b$, and $c$, is

$$\lambda^2 + \lambda(x_a + x_b + x_c) + (x_a x_b + x_a x_c + x_b x_c + x_a x_b x_c).$$
The Tugger polynomial is a generic rank-generating function of the matroid $G$. It contains a complete description of the rank function of $G$ in an algebraic form. The Tugger polynomial satisfies an identity generalizing an identity of Tutte 68. Let $R(G; x-1, \lambda)$ be the polynomial obtained from $R(G;x,\lambda)$ by replacing every variable $x_a$ by $x_a - 1$.

12.1. THEOREM.

$$R(G; x-1, \lambda) = \sum_{X : X \in L(G)} \chi(G/X;\lambda) \Big( \prod_{a \in X} x_a \Big),$$
where the sum ranges over all the closed sets of $G$. To prove Theorem 12.1, we need the following elementary identity:

(12.2) $$\prod_{a \in A} x_a = \sum_{B : B \subseteq A} \Big( \prod_{b \in B} (x_b - 1) \Big).$$
For example, $x_a x_b = (x_a - 1)(x_b - 1) + (x_a - 1) + (x_b - 1) + 1$. Consider the right-hand side of the identity in the theorem. Note that since $\chi(G/X;\lambda) = 0$ when $X$ is not closed, we can take the sum to range over all subsets $X$ in $S$. Thus, using equations (12.1) and (12.2) and changing the order of summation, we have
$$\begin{aligned}
\sum_{X : X \subseteq S} \chi(G/X;\lambda) \Big( \prod_{a \in X} x_a \Big)
&= \sum_{X : X \subseteq S} \chi(G/X;\lambda) \Big( \sum_{A \subseteq X} \Big( \prod_{a \in A} (x_a - 1) \Big) \Big) \\
&= \sum_{A : A \subseteq S} \Big( \prod_{a \in A} (x_a - 1) \Big) \Big( \sum_{X : A \subseteq X} \chi(G/X;\lambda) \Big) \\
&= \sum_{A : A \subseteq S} \Big( \prod_{a \in A} (x_a - 1) \Big) \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)} \\
&= R(G; x-1, \lambda).
\end{aligned}$$

This completes the proof of Theorem 12.1. The striking thing about Theorem 12.1 is that it allows us to obtain the family of closed sets of $G$ from the rank function of $G$ using a simple (albeit time-consuming) algebraic operation.

12.2. PROBLEM. Find algebraic relations between other descriptions of matroids.

Such relations would allow matroid computations to be done using computer algebra.
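Theorem 12.1 can be tested numerically on the 3-point line $U_{2,3}$ by evaluating both sides at rational values. The sketch below rests on two assumptions that are not spelled out in the text: the characteristic polynomial is computed from the standard subset expansion $\chi(G;\lambda) = \sum_{A \subseteq S} (-1)^{|A|} \lambda^{\mathrm{rank}(S) - \mathrm{rank}(A)}$, and the contraction $G/X$ has rank function $A \mapsto \mathrm{rank}(A \cup X) - \mathrm{rank}(X)$. All function names and the chosen values of $x_a$ and $\lambda$ are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def subsets(S):
    """All subsets of S as frozensets."""
    S = sorted(S)
    for r in range(len(S) + 1):
        for c in combinations(S, r):
            yield frozenset(c)

def rank_u23(A):
    """Rank function of the 3-point line U_{2,3}: rank(A) = min(|A|, 2)."""
    return min(len(A), 2)

def char_poly_contraction(rank, S, X, lam):
    """chi(G/X; lam) by the subset expansion; the exponent is rank(S) - rank(A | X)."""
    return sum((-1) ** len(A) * lam ** (rank(S) - rank(A | X)) for A in subsets(S - X))

def lhs(rank, S, x, lam):
    """R(G; x - 1, lam): the Tugger polynomial with every x_a replaced by x_a - 1."""
    return sum(lam ** (rank(S) - rank(A)) * prod(x[a] - 1 for a in A) for A in subsets(S))

def rhs(rank, S, x, lam):
    """Sum of chi(G/X; lam) * prod x_a over all subsets X; non-closed X contribute 0."""
    return sum(char_poly_contraction(rank, S, X, lam) * prod(x[a] for a in X)
               for X in subsets(S))

S = frozenset("abc")
x = {"a": Fraction(2), "b": Fraction(3), "c": Fraction(5)}
lam = Fraction(7)
print(lhs(rank_u23, S, x, lam), rhs(rank_u23, S, x, lam))  # 120 120
```

The contribution of a non-closed set such as $\{a, b\}$ is zero, since the contraction then has a loop.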
ACKNOWLEDGEMENT

I would like to thank Joseph Bonin for his comments on several drafts of this paper. I would also like to thank the National Security Agency for supporting my research under Grant MDA 90498-1-0025.

Books on matroid theory and related areas

B1. G. Birkhoff, Lattice theory, 3rd edition, Amer. Math. Soc., Providence, Rhode Island, 1967.
B2. J. E. Bonin, J. G. Oxley, and B. Servatius, eds., Matroid theory, Amer. Math. Soc., Providence, Rhode Island, 1996.
B3. A. Björner, M. Las Vergnas, B. Sturmfels, N. L. White, and G. M. Ziegler, Oriented matroids, Cambridge Univ. Press, Cambridge, 1993.
B4. H. H. Crapo and G.-C. Rota, On the foundations of combinatorial theory: Combinatorial geometries, Preliminary edition, M. I. T. Press, Cambridge, Massachusetts, 1970.
B5. P. Crawley and R. P. Dilworth, Algebraic theory of lattices, Prentice-Hall, Englewood Cliffs, New Jersey, 1973.
B6. G. Grätzer, General lattice theory, 2nd edition, Birkhäuser, Basel, 1998.
B7. B. Korte, L. Lovász, and R. Schrader, Greedoids, Springer-Verlag, Berlin and New York, 1991.
B8. J. P. S. Kung, ed., A sourcebook in matroid theory, Birkhäuser, Boston and Basel, 1986.
B9. E. L. Lawler, Combinatorial optimization: Networks and matroids, Holt, Rinehart and Winston, New York, 1976.
B10. P. Orlik and H. Terao, Arrangements of hyperplanes, Springer-Verlag, Berlin and New York, 1992.
B11. J. G. Oxley, Matroid theory, Oxford Univ. Press, Oxford, 1992.
B12. B. Polster, A geometrical picture book, Springer-Verlag, Berlin and New York, 1998.
B13. M. Stern, Semimodular lattices, Cambridge Univ. Press, Cambridge, 1999.
B14. W. T. Tutte, Graph theory as I have known it, Oxford Univ. Press, Oxford, 1998.
B15. D. J. A. Welsh, Matroid theory, Academic Press, London and New York, 1976.
B16. N. L. White, ed., Theory of matroids, Cambridge Univ. Press, Cambridge, 1986.
B17. N. L. White, ed., Combinatorial geometries, Cambridge Univ. Press, Cambridge, 1987.
B18. N. L. White, ed., Matroid applications, Cambridge Univ. Press, Cambridge, 1992.

Introductory or survey papers on matroid theory

S1. R. E. Bixby and W. H. Cunningham, Matroid optimization and algorithms, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 551-609.
S2. A. Delandtsheer, Dimensional linear spaces, in Handbook of incidence geometry, F. Buekenhout, ed., Elsevier North-Holland, Amsterdam, 1995, pp. 193-294.
S3. A. W. Ingleton, Representations of matroids, in Combinatorial mathematics and its applications, D. J. A. Welsh, ed., Academic Press, London and New York, 1971, pp. 149-169.
S4. J. P. S. Kung, Extremal matroid theory, in Graph structure theory, N. Robertson and P. D. Seymour, eds., Amer. Math. Soc., Providence, RI, 1992, pp. 21-62.
S5. J. P. S. Kung, The geometric approach to matroid theory, in Gian-Carlo Rota on combinatorics, J. P. S. Kung, ed., Birkhäuser, Boston, 1995, pp. 604-622.
S6. J. P. S. Kung, Matroids, in Handbook of algebra, Volume 1, M. Hazewinkel, ed., Elsevier North-Holland, Amsterdam, 1996, pp. 157-184.
S7. J. P. S. Kung, Critical problems, in [B2], pp. 1-127.
S8. J. G. Oxley, Matroid structure and connectivity, in [B2], pp. 129-170.
S9. J. G. Oxley, Matroids, in Graph connections, L. W. Beineke and R. J. Wilson, eds., Oxford Univ. Press, Oxford, 1987, pp. 110-115.
S10. P. D. Seymour, Matroid minors, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 527-550.
S11. D. J. A. Welsh, Matroids and their applications, in Selected topics in graph theory 3, L. W. Beineke and R. J. Wilson, eds., Academic Press, London and San Diego, 1988, pp. 43-70.
S12. D. J. A. Welsh, Matroids: Fundamental concepts, in Handbook of combinatorics, R. L. Graham, M. Grötschel, and L. Lovász, eds., Elsevier North-Holland, Amsterdam, 1995, pp. 481-609.
References 1. L. Anderson, Topology of combinatorial differentiable manifolds, Topology, 38(1999), 197-221. 2. L. Anderson, Matroid bundles, in New perspective in algebraic combinatorics, L. J. Billera et al, ed., Cambridge Univ. Press, Cambridge, 1999, pp. 1-21. 3. M. K. Bennett, Convexity closure operators, Algebra Universalis, 10(1980), 345-354. 4. A. Bjorner and G. M. Ziegler, Introduction to greedoids, in [B18], pp. 284-357. 5. B. Bollobas and 0 . Riordan, A Tutte polynomial for coloured graphs, Combin. Probab. Comput., 8(1999), 45-93. 6. E. Brickell and D. M. Davenport, On the classification of ideal secret sharing schemes, J. Cryptology, 4 (1991), 123-134. 7. R. Brualdi and H. J. Ryser, Combinatorial matrix theory, Cambridge Univ. Press, Cambridge, 1991. 8. T. Brylawski, A decomposition for combinatorial geometries, Trans. Amer. Math. Soc, 171(1971), 235-282. 9. T. Brylawski and J. G. Oxley, The Tutte polynomial and its applications, in [B18], pp. 123-225. 10. J. Bukowski and A. G. de Oliveira, Invariant theory-like theorems for matroids and oriented matroids, Adv. Math., 109(1994), 34-44. 11. H. H. Crapo, Erecting geometries, in Proceedings of the Second Chapel Hill Conference on Combinatorics and its Applications, Univ. North Carolina, Chapel Hill, NC, 1970, pp. 74-99. 12. H. H. Crapo, Orthogonality, in [Bll], pp. 76-96. 13. J. Desarmenien, J. P. S. Kung and G.-C. Rota, Invariant theory, Young bitableaux, and combinatorics, Adv. Math., 27(1978), 63-92. 14. R. P. Dilworth, Lattices with unique irreducible decompositions, Ann. Math., 41(1940), 771-777. 15. P. Doubilet, G.-C. Rota and R. Stanley, On the foundations of combinatorial theory. VI. The idea of generating function, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and its Applications, Vol. II (Probability theory), Univ. California Press, Berkeley, CA, 1972, pp. 267-318. 16. P. Doubilet, G.-C. Rota and J. Stein, On the foundations of combinatorial theory. IX. 
Combinatorial methods in invariant theory, Stud. Appl. Math., 53(1974), 185-216. 17. T. A. Dowling, A class of geometric lattices based on finite groups, J.
Combin. Theory Ser. B, 14(1973), 61-86; erratum, ibid., 15(1973), 211. 18. P. H. Edelman, Meet-distributive lattices and the anti-exchange closure, Algebra Universalis, 10(1980), 290-299. 19. P. H. Edelman and R. E. Jamison, The theory of convex geometries, Geometriae Dedicata, 19(1985), 247-270. 20. J. Edmonds, Minimal partition of a matroid into independent sets, J. Res. Nat. Bur. Standards Sect. B, 69B(1965), 67-77. 21. J. Edmonds, Submodular functions, matroids and certain polyhedra, in Combinatorial structures and their applications, 1970, Gordon and Breach, New York, 69-87. 22. J. Edmonds, Matroids and the greedy algorithm, Math. Programming, 1(1971), 127-136 23. U. Faigle, Geometries on partially ordered sets, J. Combin. Theory Ser. B, 28(1980), 26-51. 24. J. V. Field and J. J. Gray, The geometrical work of Girard Desargues, Springer-Verlag, Berlin and New York, 1987. 25. G. Frobenius, Uber zerlegbare determinanten, Sitzber. Preuss. Akad. Wiss., 1917, 274-277. 26. J. D. Golic, On matroid characterization of ideal secret sharing schemes, J. Cryptology, 11(1998), 75-86. 27. K. M. Gragg and J. P. S. Kung, Consistent dually semimodular lattices, J. Combin. Theory Ser. A, 60(1992), 246-263; erratum, ibid., 71(1995), 173. 28. R. L. Graham and P. Hell, On the history of the minimum spanning tree problem, Ann. Hist. Comput., 7(1985), 43-57. 29. H.-J. Groh, Varieties of topological geometries, Trans. Amer. Math. Soc, 337(1993), 691-702. 30. M. Halsey, Line-closed combinatorial geometries, Discrete Math., 65(1987), 245-248. 31. L. H. Harper and G.-C. Rota, Matching theory, an introduction, in Advances in Probability, Vol. 1, P. Ney, ed., Marcel Dekker, New York, 1971, pp. 171-215. 32. J. Kahn, A problem of P. Seymour on non-binary matroids, Combinatorica, 5(1985), 319-323. 33. J. Kahn and J. P. S. Kung, Varieties of combinatorial geometries, Trans. Amer. Math. Soc, 271(1982), 485-499. 34. J. Kahn and J. P. S. 
Kung, A classification of modularly complemented geometric lattices, European J. Combin., 7(1986), 243-248. 35. D. G. Kelly and J. G. Oxley, Asymptotic properties of random subsets of projective spaces, Math. Proc. Cambridge Philos. Soc, 91(1982),
119-130. 36. D. G. Kelly and J. G. Oxley, Threshold functions for some properties of random subsets of projective spaces, Quart. J. Math. Oxford (2), 33(1982), 463-469. 37. J. P. S. Kung, Bimatroids and invariants, Adv. Math., 30(1978), 238-249. 38. J. P. S. Kung, Jacobi's identity and the Konig-Egervary theorem, Discrete Math., 49(1984), 75-77. 39. J. P. S. Kung, Basis exchange properties, in [B16], pp. 62-75. 40. J. P. S. Kung, Strong maps, in [B16], pp. 224-253. 41. J. P. S. Kung, Weak maps, in [B16], pp. 256-271. 42. J. P. S. Kung, Matchings and Radon transforms in lattices. I. Consistent lattices, Order, 2(1985), 105-112. 43. J. P. S. Kung, Pfaffian structures and critical problems in finite symplectic spaces, Ann. Combin., 1(1997), 159-172. 44. J. Lawrence and L. Weinberg, Unions of oriented matroids, Linear Algebra Appl., 41(1981), 183-200. 45. C. M. Lopez, Chip firing and the Tutte polynomial, Ann. Combin., 1(1997), 253-259. 46. L. Lovasz, Matroid matching and some applications, J. Combin. Theory Ser. B, 28(1980), 208-236. 47. L. Lovasz, The matroid matching problem, in Algebraic methods in graph theory, Vol. I, II (Szeged, 1978), 1981, North-Holland, Amsterdam, pp. 495-517. 48. L. Lovasz and M. D. Plummer, Matching theory, North-Holland, Amsterdam, 1986. 49. S. Mac Lane, Some interpretation of abstract linear dependence in terms of projective geometry, Amer. J. Math., 58(1936), 236-240. 50. G. Matheron, Random sets and integral geometry, Wiley, New York, 1974. 51. F. Matus, Matroids represented by partitions, Discrete Math., 203(1999), 169-194. 52. B. Monjardet, The consequences of Dilworth's work on lattices with unique irreducible decompositions, in The Dilworth theorems, K. Bogart, R. Freese, and J. Kung, eds., Birkhauser, Boston, 1990, pp. 192-201. 53. J. von Neumann, Continuous geometry, I. Halperin, ed., Princeton Univ. Press, Princeton NJ, 1960. 54. G. Peano, Sui fondamenti della geometria, Rivista di Matematica, 4(1894), 73. 55. K. 
Reuter, The Kurosh-Ore exchange property, Acta Math. Hungar., 53(1989), 119-127. 56. G.-C. Rota, On the foundations of combinatorial theory. I. Theory
of Möbius functions, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 2(1964), 340-368. 57. G.-C. Rota, Combinatorial theory and invariant theory, Notes taken by L. Guibas from the National Science Foundation Seminar in Combinatorial Theory, Bowdoin College, Maine, 1971, unpublished typescript. 58. E. R. Scheinerman and D. H. Ullman, Fractional graph theory, Wiley, New York, 1997. 59. H. Schneider, The concepts of irreducibility and full indecomposability of a matrix in the works of Frobenius, König and Markov, Linear Algebra Appl., 18(1977), 139-162. 60. A. Schrijver, Matroids and linking systems, J. Combin. Theory Ser. B, 26(1979), 349-369. 61. P. D. Seymour, Decomposition of regular matroids, J. Combin. Theory Ser. B, 28(1980), 305-359. 62. P. D. Seymour, On secret-sharing matroids, J. Combin. Theory Ser. B, 56(1992), 69-73. 63. D. R. Stinson, Cryptography: theory and practice, CRC Press, Boca Raton, FL, 1995. 64. L. Traldi, A dichromatic polynomial for weighted graphs and link polynomials, Proc. Amer. Math. Soc., 106(1989), 279-286. 65. R. T. Tugger, Convexity without order, manuscript, May 2000. 66. W. T. Tutte, A ring in graph theory, Proc. Cambridge Philos. Soc., 43(1947), 26-40. 67. W. T. Tutte, Matroids and graphs, Trans. Amer. Math. Soc., 90(1959), 527-552. 68. W. T. Tutte, On dichromatic polynomials, J. Combin. Theory, 2(1967), 301-320. 69. D. J. A. Welsh and G. P. Whittle, Arrangements, channel assignments, and associated polynomials, Adv. in Appl. Math., 23(1999), 375-406. 70. H. Whitney, Non-separable and planar graphs, Trans. Amer. Math. Soc., 34(1932), 339-362. 71. H. Whitney, On the abstract properties of linear dependence, Amer. J. Math., 57(1935), 509-533. 72. T. Zaslavsky, Biased graphs. I. Bias, balance, and gains, J. Combin. Theory Ser. B, 47(1989), 32-52. 73. G. Ziegler, What is a complex matroid? Discrete and Comput. Geom., 10(1993), 313-348.

ENUMERATION OF GRAPH COVERINGS, SURFACE BRANCHED COVERINGS AND RELATED GROUP THEORY*

JIN HO KWAK
Combinatorial and Computational Mathematics Center, Pohang University of Science and Technology, Pohang 790-784, Korea

JAEUN LEE
Mathematics, Yeungnam University, Kyongsan 712-749, Korea

Many graphs having a symmetry property can be described as coverings of simpler graphs. In this manuscript, we examine several enumeration problems for various types of nonisomorphic graph coverings of a graph and some of their applications to group theory or to surface theory. This manuscript is organized as follows. In section 1, we introduce basic concepts. In section 2, by using the covering graph construction, we count the positive isomorphism classes of cycle permutation graphs, the number of which is equal to the number of double cosets of the dihedral group $D_n$ in the symmetric group $S_n$ on $n$ elements. In section 3, we count nonisomorphic (connected) coverings of a graph and, as an application, we obtain another recursive formula for the number of conjugacy classes of subgroups of given index of a finitely generated free group. In section 4, we count nonisomorphic regular coverings of a graph whose covering transformation groups are abelian and, as an application, we count subgroups of given index of free abelian groups. The same work is done in section 5 for regular coverings having dihedral voltage groups. In section 6, we discuss a general counting formula for regular coverings having any finite voltage group. In section 7, after discussing a combinatorial proof of the Hurwitz theorem for surface branched coverings, we consider the number of subgroups of surface groups. Finally, in section 8, we discuss a distribution of branched surface coverings of surfaces and some related topological properties including a generalization of the classical Alexander theorem.
1 Definitions and Notations
Let $G$ be a connected finite simple graph with vertex set $V(G)$ and edge set $E(G)$. The neighborhood of a vertex $v \in V(G)$, denoted by $N(v)$, is the set of vertices adjacent to $v$. We use $|X|$ for the cardinality of a set $X$. The number $\beta(G) = |E(G)| - |V(G)| + 1$ is equal to the number of independent cycles in $G$ and it is referred to as the Betti number of $G$. Two graphs $G$ and $H$ are isomorphic if there exists a one-to-one correspondence between their vertex sets which preserves adjacency, and such a correspondence is called an isomorphism between $G$ and $H$. An automorphism of a graph $G$ is an isomorphism of $G$ onto itself. Thus, an automorphism of $G$ is a permutation of the vertex set $V(G)$ which preserves adjacency. Obviously, a composition of two automorphisms is also an automorphism. Hence the automorphisms of $G$ form a permutation group, $\mathrm{Aut}(G)$, which acts on the vertex set $V(G)$.

*This work is partially supported by Com²MaC-KOSEF, Korea.

A graph $\tilde G$ is called a covering of $G$ with projection $p : \tilde G \to G$ if there is a surjection $p : V(\tilde G) \to V(G)$ such that $p|_{N(\tilde v)} : N(\tilde v) \to N(v)$ is a bijection for any vertex $v \in V(G)$ and $\tilde v \in p^{-1}(v)$. We also say that the projection $p : \tilde G \to G$ is an n-fold covering of $G$ if $p$ is n-to-one. A covering $p : \tilde G \to G$ is said to be regular (simply, $\mathcal{A}$-covering) if there is a subgroup $\mathcal{A}$ of the automorphism group $\mathrm{Aut}(\tilde G)$ of $\tilde G$ acting freely on $\tilde G$ so that the graph $G$ is isomorphic to the quotient graph $\tilde G/\mathcal{A}$, say by $h$, and the quotient map $\tilde G \to \tilde G/\mathcal{A}$ is the composition $h \circ p$ of $p$ and $h$. The fibre of an edge or a vertex is its preimage under $p$. Two coverings $p_i : \tilde G_i \to G$, $i = 1, 2$, are said to be isomorphic (or, equivalent) if there exists a graph isomorphism $\Phi : \tilde G_1 \to \tilde G_2$ such that the diagram
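commutes, that is, $p_2 \circ \Phi = p_1$. Such a $\Phi$ is called a covering isomorphism. In particular, when $p_1 = p_2$ (say, $= p$) with $\tilde G_1 = \tilde G_2$ (say, $= \tilde G$), it is called a covering transformation of $p$, and the set of all covering transformations forms a group under composition, called the covering transformation group of the covering $p : \tilde G \to G$. Every edge of a graph $G$ gives rise to a pair of oppositely directed edges. By $e^{-1} = vu$, we mean the reverse edge to a directed edge $e = uv$. We denote the set of directed edges of $G$ by $D(G)$. Each directed edge $e$ has an initial vertex $i_e$ and a terminal vertex $t_e$. Following [4], a permutation voltage assignment $\phi$ on a graph $G$ is a map $\phi : D(G) \to S_n$ with the property that $\phi(e^{-1}) = \phi(e)^{-1}$ for each $e \in D(G)$. The permutation derived graph $G^\phi$ has vertex set $V(G) \times \{1, \ldots, n\}$, and for each edge $e \in D(G)$ and $j \in \{1, \ldots, n\}$ there is an edge $(e, j)$ in $D(G^\phi)$ with $i_{(e,j)} = (i_e, j)$ and $t_{(e,j)} = (t_e, \phi(e)j)$. The natural projection $p : G^\phi \to G$ is a covering. Let $\mathcal{A}$ be a finite group. An ordinary voltage assignment (or, $\mathcal{A}$-voltage assignment) of $G$ is a function $\phi : D(G) \to \mathcal{A}$ with the property that $\phi(e^{-1}) = \phi(e)^{-1}$ for each $e \in D(G)$. The values of $\phi$ are called voltages, and $\mathcal{A}$ is called the voltage group. The ordinary derived graph $G \times_\phi \mathcal{A}$ derived from an ordinary voltage assignment $\phi : D(G) \to \mathcal{A}$ has as its vertex set $V(G) \times \mathcal{A}$ and as its edge set $E(G) \times \mathcal{A}$, so that an edge $(e, g)$ of $G \times_\phi \mathcal{A}$ joins a vertex $(u, g)$ to (v,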
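For a finite group $\mathcal{A}$ acting on a finite set $X$, Burnside's Lemma gives the number of orbits as

$$|X/\mathcal{A}| = \frac{1}{|\mathcal{A}|} \sum_{g \in \mathcal{A}} |\mathrm{Fix}(g)| = \frac{1}{|\mathcal{A}|} \Big( |X| + \sum_{g \in \mathcal{A},\, g \neq 1} |\mathrm{Fix}(g)| \Big),$$

where $\mathrm{Fix}(g) = \{x \in X \mid gx = x\}$ is the set of elements fixed by $g$. Consider another $\mathcal{A}$-action on a set $Y$. Two $\mathcal{A}$-actions are called mutually orthogonal if each non-identity element $g$ of $\mathcal{A}$ has a fixed element in at most one action, that is, $g$ cannot have a fixed element in both $X$ and $Y$. Let $\mathcal{A}_x = \{g \in \mathcal{A} \mid gx = x\}$ denote the stabilizer of $x \in X$. Then, it follows from Burnside's Lemma that

$$|Y/\mathcal{A}_x| = \frac{|Y|}{|\mathcal{A}_x|} \quad \text{and} \quad |X/\mathcal{A}_y| = \frac{|X|}{|\mathcal{A}_y|}$$

for any $x \in X$ and any $y \in Y$.

2 Cycle permutation graphs and the double cosets of $D_n$ in $S_n$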
Throughout this section, let $S_n$ denote the symmetric group on n elements $\{1, 2, \ldots, n\}$ and let $D_n$ denote the dihedral subgroup of $S_n$ containing the n-cycle $\rho = (1\ 2\ \cdots\ n)$, so that $|D_n| = 2n$. An n-cycle permutation graph $P_\alpha(C_n)$ consists of two copies of an n-cycle $C_n$, say $C_x$ and $C_y$, with vertex sets $V(C_x) = \{x_1, x_2, \ldots, x_n\}$ and $V(C_y) = \{y_1, y_2, \ldots, y_n\}$, along with edges $x_i y_{\alpha(i)}$ for some $\alpha \in S_n$. The edges $x_i y_{\alpha(i)}$ are called the permutation edges of a cycle permutation graph $P_\alpha(C_n)$.

Figure 1. The dumbbell graph
Let $G$ denote the dumbbell graph with two vertices $x, y$, an edge $e = xy$ and two loops $e_x = xx$, $e_y = yy$ as illustrated in Figure 1. The permutation derived graph $G^\phi$ with the voltage assignment $\phi$ defined by $\phi(e_x) = \phi(e_y) = \rho$ and $\phi(e) = \alpha$, $\alpha \in S_n$, is clearly the cycle permutation graph $P_\alpha(C_n)$. Moreover, with a suitable relabelling of the vertices of the inner cycle $C_y$ of $P_\alpha(C_n)$, we can assume that the permutation edges are $x_i y_i$, $i = 1, 2, \ldots, n$. It is not difficult to show the following theorem.

Theorem 1 A cycle permutation graph $P_\alpha(C_n)$ is isomorphic to the permutation derived graph $G^\psi$ with voltage assignment $\psi$ defined by $\psi(e_x) = \rho$, $\psi(e) =$ the identity in $S_n$ and $\psi(e_y) = \alpha^{-1}\rho\alpha$ (or $\psi(e_y) = \alpha^{-1}\rho^{-1}\alpha$) over the dumbbell graph $G$.

Note that the permutations $\alpha^{-1}\rho\alpha$ and $\alpha^{-1}\rho^{-1}\alpha$, $\alpha \in S_n$, have the same cycle type as the cycle $\rho$. Let $E_n$ denote the conjugacy class of $\rho = (1\ 2\ \cdots\ n)$ in $S_n$, i.e., $E_n$ is the set of all n-cycles in $S_n$. From the isomorphic identification in Theorem 1, it is enough to consider a permutation derived graph with a permutation voltage assignment which assigns the identity on the edge $e$, $\rho = (1\ 2\ \cdots\ n)$ on the loop $e_x$ and $\sigma \in E_n$ on the loop $e_y$ of the dumbbell graph $G$ for a cycle permutation graph. Hence, the set $E_n$ can be identified with the set of all n-cycle permutation graphs. Two n-cycle permutation graphs $P_\alpha(C_n)$ and $P_\beta(C_n)$ are said to be isomorphic by a positive natural isomorphism $\Theta$ if $\Theta : P_\alpha(C_n) \to P_\beta(C_n)$ is an isomorphism satisfying $\Theta(C_x) = C_x$ and $\Theta(C_y) = C_y$. The following theorem gives a group-theoretic characterization of two cycle permutation graphs being isomorphic by a positive natural isomorphism.

Theorem 2 Let $\alpha$ and $\beta$ be two permutations in $S_n$. Then the cycle permutation graphs $P_\alpha(C_n)$ and $P_\beta(C_n)$ are isomorphic by a positive natural isomorphism if and only if there exists $d \in D_n$ such that

$$\beta^{-1}\rho\beta = d(\alpha^{-1}\rho\alpha)d^{-1} \quad \text{or} \quad \beta^{-1}\rho\beta = d(\alpha^{-1}\rho\alpha)^{-1}d^{-1}.$$
It is also equivalent to say that $\beta \in D_n \alpha D_n$, that is, the permutations $\alpha$ and $\beta$ belong to the same double coset of $D_n$ in $S_n$.

Proof: Use the identification $P_\alpha(C_n) = G^\phi$ and $P_\beta(C_n) = G^\psi$ given in Theorem 1. If $G^\phi$ and $G^\psi$ are isomorphic by a positive natural isomorphism, say $\Theta$, then $\Theta$ maps the outer cycle of $G^\phi$ to the outer cycle of $G^\psi$ isomorphically, which induces an element $d$ in $D_n$. (Note $\mathrm{Aut}(C_n) = D_n$.) Then it follows that the path $x_i\, y_i\, y_{\alpha^{-1}\rho\alpha(i)}\, x_{\alpha^{-1}\rho\alpha(i)}$ (or $x_i\, y_i\, y_{\alpha^{-1}\rho^{-1}\alpha(i)}\, x_{\alpha^{-1}\rho^{-1}\alpha(i)}$, depending on the orientation of $e_y$) in $G^\phi$ is mapped to the path

$$x_{d(i)}\, y_{d(i)}\, y_{\beta^{-1}\rho\beta d(i)}\, x_{\beta^{-1}\rho\beta d(i)} \quad \text{or} \quad x_{d(i)}\, y_{d(i)}\, y_{\beta^{-1}\rho^{-1}\beta d(i)}\, x_{\beta^{-1}\rho^{-1}\beta d(i)},$$

depending on the orientation of $e_y$ in $G^\psi$. In either case, we have $\beta^{-1}\rho\beta = d(\alpha^{-1}\rho\alpha)d^{-1}$ or $\beta^{-1}\rho\beta = d(\alpha^{-1}\rho\alpha)^{-1}d^{-1}$ from the construction of the derived covering $G^\psi$. Also, it gives

$$\rho = (\beta d\alpha^{-1})\rho(\alpha d^{-1}\beta^{-1}) = (\beta d\alpha^{-1})\rho(\beta d\alpha^{-1})^{-1}, \quad \text{or} \quad \rho^{-1} = (\beta d\alpha^{-1})\rho(\beta d\alpha^{-1})^{-1}$$

for some $d \in D_n$. Hence, $\beta d\alpha^{-1}$ is contained in the normalizer $N(\rho, \rho^{-1})$ of $\{\rho, \rho^{-1}\}$ in $S_n$. But $N(\rho, \rho^{-1}) = D_n$. Therefore, $\beta \in D_n \alpha D_n$. Conversely, if $\beta = d_1 \alpha d_2$ for some $d_1, d_2 \in D_n$, then $\beta^{-1}\rho\beta = d_2^{-1}\alpha^{-1}d_1^{-1}\rho\, d_1\alpha d_2$. Then the element $d_2$ in $D_n$ induces an automorphism of the n-cycle $C_n$, and hence an isomorphism from the outer cycle of $G^\phi$ to the outer cycle of $G^\psi$. It is also easily extended to a positive natural isomorphism from $G^\phi$ to $G^\psi$ by the condition. □

So far, we have shown that the number of double cosets of the dihedral group $D_n$ in the symmetric group $S_n$ is equal to the number $\mathrm{Iso}_P(C_n)$ of positive natural isomorphism classes of n-cycle permutation graphs. Also, every n-cycle permutation graph can be constructed as an n-fold covering graph of the dumbbell graph. To count the number $\mathrm{Iso}_P(C_n)$, let $\tau : S_n \to S_n$ be the map defined by $\tau(\sigma) = \sigma^{-1}$ for all $\sigma \in S_n$ and denote $\Gamma = D_n \times \{1, \tau\}$. Define a group
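action $\Gamma \times E_n \to E_n$ by $(d, 1)(\sigma) = d\sigma d^{-1}$ and $(d, \tau)(\sigma) = d\sigma^{-1}d^{-1}$. Then, by Theorem 2 and Burnside's Lemma, we get

$$\mathrm{Iso}_P(C_n) = |E_n/\Gamma| = \frac{1}{4n} \sum_{\gamma \in \Gamma} |\mathrm{Fix}(\gamma)|,$$

where $\mathrm{Fix}(\gamma) = \{\sigma \in E_n : \gamma\sigma = \sigma\}$. Evaluating the sum gives

$$\mathrm{Iso}_P(C_n) = \frac{1}{4n} \sum_{d \mid n} \varphi(d)^2 \Big(\frac{n}{d} - 1\Big)!\, d^{\frac{n}{d} - 1} + \begin{cases} \big(3 + \frac{n}{2}\big)\, 2^{\frac{n}{2} - 4} \big(\frac{n}{2} - 1\big)! & \text{if } n \text{ is even}, \\ 2^{\frac{n-5}{2}} \big(\frac{n-1}{2}\big)! & \text{if } n \text{ is odd}, \end{cases}$$

where $\varphi$ is the Euler phi-function. In particular, for an odd prime $q$,

$$\mathrm{Iso}_P(C_q) = \frac{1}{4q} \big[ (q-1)! + (q-1)^2 \big] + 2^{\frac{q-5}{2}} \Big(\frac{q-1}{2}\Big)!.$$

A short calculation gives Table 1 for $\mathrm{Iso}_P(C_n)$.

n             3   4   5   6    7    8     9      10     11
Iso_P(C_n)    1   2   4   12   39   202   1219   9468   83435

Table 1. The number $\mathrm{Iso}_P(C_n)$ for small n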
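The double-coset count can be checked by brute force for small n. The following is only an illustrative sketch, not the authors' method: it assumes permutations are stored as tuples on $\{0, \ldots, n-1\}$, and the helper names (`compose`, `dihedral_group`, `count_double_cosets`) are hypothetical. It enumerates $S_n$ and marks each double coset $D_n \alpha D_n$ as it is found.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(p)))


def dihedral_group(n):
    """All 2n elements of D_n acting on 0..n-1: n rotations and n reflections."""
    rotations = [tuple((i + k) % n for i in range(n)) for k in range(n)]
    reflections = [tuple((k - i) % n for i in range(n)) for k in range(n)]
    return rotations + reflections


def count_double_cosets(n):
    """Count the double cosets D_n * a * D_n in S_n by exhaustive marking."""
    dn = dihedral_group(n)
    seen = set()
    count = 0
    for a in permutations(range(n)):
        if a in seen:
            continue
        count += 1  # a starts a double coset that has not been visited yet
        for d1 in dn:
            d1a = compose(d1, a)
            for d2 in dn:
                seen.add(compose(d1a, d2))
    return count


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, count_double_cosets(n))
```

The enumeration visits all n! permutations, so it is only practical for small n; the closed formula is what scales.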
Question: Find an algorithm to list representatives $\sigma$ in $E_n$ of the double cosets of $D_n$ in $S_n$. This shows how to draw all positively nonisomorphic cycle permutation graphs. Compute the size of each double coset of $D_n$ in $S_n$. This shows how many permutations in $S_n$ present the same cycle permutation graph.
REMARK According to J. M. Montesinos [41], any closed orientable 3-dimensional manifold can be obtained as a finite sheeted covering of the 3-dimensional sphere $S^3$ branched over the dumbbell graph (i.e., over the Hopf link with a bridge). Hence, the number $\mathrm{Iso}_P(C_n)$ of positive natural isomorphism classes of n-cycle permutation graphs is equal to the number of closed orientable n-fold coverings of the sphere $S^3$ branched over the dumbbell graph.

3 Graph coverings and subgroups of free groups
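Let $G$ be a connected graph and let $T$ be a fixed spanning tree of $G$. A permutation voltage assignment $\phi$ is said to be normalized if it assigns the identity to every directed edge of $T$. By labeling the positively directed edges in $D(G) - D(T)$ as $e_1, e_2, \ldots, e_{\beta(G)}$, a normalized permutation voltage assignment can be identified as a $\beta(G)$-tuple of permutations in $S_n$, and the set $C^1_T(G; n)$ of normalized permutation voltage assignments can be identified as

$$S_n \times S_n \times \cdots \times S_n \quad (\beta(G) \text{ times}).$$

With an $S_n$-action on the set $C^1_T(G; n)$ defined by simultaneous coordinatewise conjugacy: for any $g \in S_n$ and any $(\sigma_1, \ldots, \sigma_{\beta(G)}) \in C^1_T(G; n)$,

$$g(\sigma_1, \sigma_2, \ldots, \sigma_{\beta(G)}) = (g\sigma_1 g^{-1}, g\sigma_2 g^{-1}, \ldots, g\sigma_{\beta(G)} g^{-1}),$$

it follows from Theorem 4 that two normalized permutation voltage assignments $\phi, \psi$ in $C^1_T(G; n)$ derive isomorphic coverings of $G$ if and only if they belong to the same orbit under the $S_n$-action. That is, each $\beta(G)$-tuple of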
permutations $(\sigma_1, \ldots, \sigma_{\beta(G)})$, $\sigma_i \in S_n$, is identified with a normalized permutation voltage assignment. Two $\beta(G)$-tuples $(\sigma_1, \ldots, \sigma_{\beta(G)})$ and $(\sigma'_1, \ldots, \sigma'_{\beta(G)})$ are called similar if there exists $g \in S_n$ such that $\sigma'_i = g\sigma_i g^{-1}$ for $i = 1, 2, \ldots, \beta$. If we can find such a $g \in S_n$ that leaves fixed some $k$ in $\{1, 2, \ldots, n\}$, then the tuples are said to be k-similar. By Theorem 4, there is a one-to-one correspondence between the similarity classes of $\beta(G)$-tuples of permutations in $S_n$ and the isomorphism classes of n-fold coverings of the graph $G$. We denote by $\mathrm{Iso}(G; n)$ the number of such isomorphism classes of n-fold coverings of $G$. To count $\mathrm{Iso}(G; n)$ by Burnside's Lemma, we first count $\mathrm{Fix}(g)$ for each $g \in S_n$. Let $C(g)$ and $Z(g)$ denote the conjugacy class containing $g$ and the centralizer of $g$ in the symmetric group $S_n$, respectively.

Lemma 2 Under the $S_n$-action on $C^1_T(G; n) = S_n \times S_n \times \cdots \times S_n$, we have
(1) if $g_1$ and $g_2$ are conjugate, then $|\mathrm{Fix}(g_1)| = |\mathrm{Fix}(g_2)|$,
(2) for each $g \in S_n$, $\mathrm{Fix}(g) = Z(g) \times Z(g) \times \cdots \times Z(g)$ ($\beta(G)$ times),
(3) $|C(g)|\,|Z(g)| = n!$ for any $g \in S_n$.

By using Lemma 2 and Burnside's Lemma, we have

Theorem 5 ([24]) The number of isomorphism classes of n-fold coverings of $G$ is

$$\mathrm{Iso}(G; n) = \sum_{\ell_1 + 2\ell_2 + \cdots + n\ell_n = n} \big( \ell_1!\, 2^{\ell_2} \ell_2! \cdots n^{\ell_n} \ell_n! \big)^{\beta(G) - 1}.$$

β(G)     1    2     3       4         5           6
n = 1    1    1     1       1         1           1
n = 2    2    4     8       16        32          64
n = 3    3    11    49      251       1393        8051
n = 4    5    43    681     14491     336465      7997683
n = 5    7    161   14721   1730861   207388305   24883501301

Table 2. The number $\mathrm{Iso}(G; n)$ for small n and small $\beta(G)$
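The formula of Theorem 5 is a sum over cycle types of $S_n$ of a power of the centralizer order, so it is easy to evaluate directly. The following is a minimal sketch under that reading; the function names (`partitions`, `centralizer_order`, `iso`) are illustrative choices, not notation from the text.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing lists of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest


def centralizer_order(partition):
    """Order of the centralizer in S_n of a permutation with this cycle type:
    the product over k of k**l_k * l_k!, where l_k counts the k-cycles."""
    order = 1
    for k in set(partition):
        l_k = partition.count(k)
        order *= k ** l_k * factorial(l_k)
    return order


def iso(beta, n):
    """Iso(G; n) for a graph G with Betti number beta, following Theorem 5."""
    return sum(centralizer_order(p) ** (beta - 1) for p in partitions(n))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [iso(beta, n) for beta in range(1, 7)])
```

For $\beta(G) = 1$ every term equals 1, so the sum reduces to the number of partitions of n, which matches the first column of Table 2.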
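Next, we aim to compute the number $\mathrm{Isoc}(G; n)$ of isomorphism classes of connected n-fold coverings of $G$. Let $p : \tilde G \to G$ be an n-fold covering and let $\tilde G_1, \tilde G_2, \ldots, \tilde G_\ell$ be the components of $\tilde G$. Then $p_i = p|_{\tilde G_i} : \tilde G_i \to G$ is a connected covering of $G$ for each $i = 1, 2, \ldots, \ell$. Let $n_i$ be the fold number of the connected covering $p_i : \tilde G_i \to G$. Then $n_i \geq 1$ and $n_1 + \cdots + n_\ell = n$. In this case, the ordered sequence $[n_1 n_2 \cdots n_\ell]$ with $n_1 \leq n_2 \leq \cdots \leq n_\ell$ is just a partition of n, denoted by $\mathfrak{p}[n]$ or simply by $\mathfrak{p}$. Also, we say that a covering $p : \tilde G \to G$ has the component type of partition $\mathfrak{p}[n] = [n_1 n_2 \cdots n_\ell]$. Clearly, any two isomorphic n-fold coverings have the same component type. A partition $\mathfrak{p}$ of n is denoted by $[[k; m]]$ if it has $m$ terms and every term of $\mathfrak{p}$ is $k$. Note that $[[k; m]]$ denotes the partition of the natural number $km$ each of whose terms is $k$. For a partition $\mathfrak{p}$ of n, let $j_k(\mathfrak{p})$ denote the multiplicity of $k$ in the partition $\mathfrak{p}$, so that $j_1(\mathfrak{p}) + 2j_2(\mathfrak{p}) + \cdots + nj_n(\mathfrak{p}) = n$. For convenience, let $\mathfrak{P}(n)$ denote the set of all partitions of a natural number n. For a partition $\mathfrak{p}$ of n, let $\mathrm{Iso}(G; \mathfrak{p})$ denote the number of nonisomorphic n-fold coverings of $G$ having the component type $\mathfrak{p}$. Clearly,

$$\mathrm{Iso}(G; [[n; 1]]) = \mathrm{Isoc}(G; n), \qquad \mathrm{Iso}(G; [[1; n]]) = 1, \qquad \text{and} \qquad \mathrm{Iso}(G; n) = \sum_{\mathfrak{p} \in \mathfrak{P}(n)} \mathrm{Iso}(G; \mathfrak{p}).$$

This gives a recursive formula for calculating the number $\mathrm{Isoc}(G; n)$ as follows.

Theorem 6 ([28]) For $n \geq 2$, the number of nonisomorphic connected n-fold coverings of $G$ is

$$\begin{aligned} \mathrm{Isoc}(G; n) = {} & \sum_{\ell_1 + 2\ell_2 + \cdots + (n-1)\ell_{n-1} = n-1} \big( (\ell_1 + 1)^{\beta(G) - 1} - 1 \big) \big( \ell_1!\, 2^{\ell_2} \ell_2! \cdots (n-1)^{\ell_{n-1}} \ell_{n-1}! \big)^{\beta(G) - 1} \\ & + \sum_{2\ell_2 + 3\ell_3 + \cdots + n\ell_n = n} \big( 2^{\ell_2} \ell_2!\, 3^{\ell_3} \ell_3! \cdots n^{\ell_n} \ell_n! \big)^{\beta(G) - 1} \\ & - \sum_{\substack{\mathfrak{p} \in \mathfrak{P}(n) - \{[[n;1]]\} \\ j_1(\mathfrak{p}) = 0}} \ \prod_{j_k(\mathfrak{p}) \neq 0} \Bigg( \frac{1}{j_k(\mathfrak{p})!} \prod_{\ell = 0}^{j_k(\mathfrak{p}) - 1} \big( \mathrm{Isoc}(G; k) + \ell \big) \Bigg), \end{aligned}$$

where the summation over the empty index set is defined to be 0.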
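Proof: Since an n-fold covering of $G$ having the component type $[[n; 1]]$ is connected, we have $\mathrm{Iso}(G; [[n; 1]]) = \mathrm{Isoc}(G; n)$ and

$$\mathrm{Isoc}(G; n) = \mathrm{Iso}(G; n) - \sum_{\mathfrak{p} \in \mathfrak{P}(n) - \{[[n;1]]\}} \mathrm{Iso}(G; \mathfrak{p}),$$

where the summation over the empty index set is defined to be 0. Let $\mathfrak{p} \in \mathfrak{P}(n)$ with $j_1(\mathfrak{p}) \neq 0$ and let $p : \tilde G \to G$ be a covering having the component type $\mathfrak{p}$. Then $\tilde G$ has $j_1(\mathfrak{p})$ components which are isomorphic to $G$, and the restriction of $p : \tilde G \to G$ to the complement of one of such components in $\tilde G$ is an $(n-1)$-fold covering of $G$. Hence, we get

$$\sum_{\mathfrak{p} \in \mathfrak{P}(n),\ j_1(\mathfrak{p}) \neq 0} \mathrm{Iso}(G; \mathfrak{p}) = \mathrm{Iso}(G; n-1).$$

It implies that

$$\begin{aligned} \mathrm{Isoc}(G; n) = {} & \mathrm{Iso}(G; n) - \mathrm{Iso}(G; n-1) - \sum_{\substack{\mathfrak{p} \in \mathfrak{P}(n) - \{[[n;1]]\} \\ j_1(\mathfrak{p}) = 0}} \mathrm{Iso}(G; \mathfrak{p}) \\ = {} & \sum_{\ell_1 + 2\ell_2 + \cdots + (n-1)\ell_{n-1} = n-1} \big( (\ell_1 + 1)^{\beta(G) - 1} - 1 \big) \big( \ell_1!\, 2^{\ell_2} \ell_2! \cdots (n-1)^{\ell_{n-1}} \ell_{n-1}! \big)^{\beta(G) - 1} \\ & + \sum_{2\ell_2 + 3\ell_3 + \cdots + n\ell_n = n} \big( 2^{\ell_2} \ell_2! \cdots n^{\ell_n} \ell_n! \big)^{\beta(G) - 1} - \sum_{\substack{\mathfrak{p} \in \mathfrak{P}(n) - \{[[n;1]]\} \\ j_1(\mathfrak{p}) = 0}} \mathrm{Iso}(G; \mathfrak{p}), \end{aligned}$$

where $\mathrm{Iso}(G; 0) = 0$ by definition and the summation over the empty index set is defined to be 0. Since $\mathrm{Iso}(G; 1) = \mathrm{Isoc}(G; 1) = \mathrm{Iso}(G; [[1; m]]) = 1$ for any natural number $m$, we have

$$\mathrm{Iso}(G; \mathfrak{p}) = \prod_{j_k(\mathfrak{p}) \neq 0} \mathrm{Iso}(G; [[k; j_k(\mathfrak{p})]])$$

for any partition $\mathfrak{p} \in \mathfrak{P}(n) - \{[[n; 1]]\}$ with $j_1(\mathfrak{p}) = 0$. Now, to complete the proof, we need to compute the number $\mathrm{Iso}(G; [[s; t]])$ for any natural numbers $s$ and $t$. Let $p : \tilde G \to G$ be a covering having the component type $[[s; t]]$. Then $\tilde G$ has exactly $t$ components and the restriction of $p : \tilde G \to G$ to each of these $t$ components is a connected s-fold covering of $G$. Hence, a covering isomorphism between coverings having the component type $[[s; t]]$ must permute the $t$ components so that each component maps onto its isomorphic copy. It implies that the number $\mathrm{Iso}(G; [[s; t]])$ is equal to the number of selections with repetition of $t$ objects chosen from $\mathrm{Isoc}(G; s)$ types of objects, i.e.,

$$\mathrm{Iso}(G; [[s; t]]) = \binom{\mathrm{Isoc}(G; s) + t - 1}{t} = \frac{1}{t!} \prod_{i=0}^{t-1} \big( \mathrm{Isoc}(G; s) + i \big).$$

This completes the proof. □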
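Corollary 1 Let $\mathfrak{p}$ be a partition of a natural number n. Then

$$\mathrm{Iso}(G; \mathfrak{p}) = \prod_{j_k(\mathfrak{p}) \neq 0} \Bigg( \frac{1}{j_k(\mathfrak{p})!} \prod_{\ell = 0}^{j_k(\mathfrak{p}) - 1} \big( \mathrm{Isoc}(G; k) + \ell \big) \Bigg).$$

In particular, if $j_k(\mathfrak{p}) = 0$ or $1$ for each $k = 1, 2, \ldots, n$, then

$$\mathrm{Iso}(G; \mathfrak{p}) = \prod_{j_k(\mathfrak{p}) = 1} \mathrm{Isoc}(G; k).$$

In fact, Liskovets [31] computed the number $\mathrm{Isoc}(G; n)$ in terms of the Möbius function and the numbers $S_{\mathcal{F}}(m)$:

$$\mathrm{Isoc}(G; n) = \frac{1}{n} \sum_{m \mid n} S_{\mathcal{F}}(m) \sum_{d \mid \frac{n}{m}} \mu\Big(\frac{n}{md}\Big)\, d^{(\beta(G) - 1)m + 1},$$

where $\mu(k)$ is the number-theoretic Möbius function and $S_{\mathcal{F}}(m)$ denotes the number of subgroups of index $m$ in the free group $\mathcal{F} = \mathcal{F}_{\beta(G)}$ generated by $\beta(G)$ elements.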
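EXAMPLE 1 By applying Theorem 6 with $\beta = \beta(G)$, we have

$$\mathrm{Isoc}(G; 2) = (2^{\beta - 1} - 1) + 2^{\beta - 1} = 2^{\beta} - 1,$$

$$\mathrm{Isoc}(G; 3) = (3^{\beta - 1} - 1)\, 2^{\beta - 1} + 3^{\beta - 1} = 6^{\beta - 1} + 3^{\beta - 1} - 2^{\beta - 1},$$

and

$$\begin{aligned} \mathrm{Isoc}(G; 4) &= (4^{\beta - 1} - 1)\, 6^{\beta - 1} + (2^{\beta - 1} - 1)\, 2^{\beta - 1} + 8^{\beta - 1} + 4^{\beta - 1} - \mathrm{Iso}(G; [[2; 2]]) \\ &= 24^{\beta - 1} + 8^{\beta - 1} + (2^{\beta} - 1)\, 2^{\beta - 1} - 6^{\beta - 1} - (2^{\beta} - 1)\, 2^{\beta - 1} \\ &= 24^{\beta - 1} + 8^{\beta - 1} - 6^{\beta - 1}. \end{aligned}$$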
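The recursion behind Theorem 6 can be sketched in code by using the relation from its proof, $\mathrm{Isoc}(G; n) = \mathrm{Iso}(G; n) - \sum \mathrm{Iso}(G; \mathfrak{p})$ over all partitions other than $[[n; 1]]$, with each $\mathrm{Iso}(G; \mathfrak{p})$ evaluated as a product of binomial coefficients. This is a simplified, unoptimized restatement rather than the expanded formula of the theorem, and the function names are hypothetical.

```python
from functools import lru_cache
from math import comb, factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing lists of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest


def iso(beta, n):
    """Iso(G; n) from Theorem 5: sum of |Z(g)|**(beta - 1) over cycle types."""
    total = 0
    for p in partitions(n):
        order = 1
        for k in set(p):
            order *= k ** p.count(k) * factorial(p.count(k))
        total += order ** (beta - 1)
    return total


@lru_cache(maxsize=None)
def isoc(beta, n):
    """Isoc(G; n): subtract from Iso(G; n) the coverings of every
    disconnected component type, each counted as in Corollary 1."""
    total = iso(beta, n)
    for p in partitions(n):
        if p == [n]:
            continue  # the component type [[n; 1]] is the connected case
        term = 1
        for k in set(p):
            j_k = p.count(k)
            # multisets of size j_k drawn from Isoc(G; k) connected types
            term *= comb(isoc(beta, k) + j_k - 1, j_k)
        total -= term
    return total


if __name__ == "__main__":
    for beta in range(1, 5):
        print(beta, [isoc(beta, n) for n in range(1, 5)])
```

The closed forms of Example 1 give a convenient check: for instance $\beta(G) = 2$ should yield 3, 7 and 26 for n = 2, 3, 4.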
It is well-known (e.g., see [35]) in topology that the fundamental group of a graph $G$ is a free group of rank $\beta(G)$, and there exists a one-to-one correspondence between the isomorphism classes of connected n-fold coverings of $G$ and the conjugacy classes of subgroups of index n of the fundamental group of $G$. Thus, by using the enumeration formula for $\mathrm{Isoc}(G; n)$ in Theorem 6, we can compute the number of conjugacy classes of subgroups of index n of any finitely generated free group. Notice that the number $\mathrm{Iso}(G; n)$ of nonisomorphic n-fold coverings of $G$ can be expressed (in terms of $\mathrm{Isoc}(G; k)$) as follows.

$$\mathrm{Iso}(G; n) = \sum_{\mathfrak{p} \in \mathfrak{P}(n)} \ \prod_{j_k(\mathfrak{p}) \neq 0} \Bigg( \frac{1}{j_k(\mathfrak{p})!} \prod_{\ell = 0}^{j_k(\mathfrak{p}) - 1} \big( \mathrm{Isoc}(G; k) + \ell \big) \Bigg).$$
REMARK An enumeration of the nonisomorphic n-fold coverings and of the n-fold connected coverings of a graph was also done independently by Hofmeister ([10, 14]). Liskovets ([31]) also enumerated those connected coverings by counting the conjugacy classes of subgroups of a finitely generated free group in terms of the Möbius function.
In contrast to the combinatorial computation of the number $\mathrm{Isoc}(G; n)$ of nonisomorphic connected n-fold coverings of $G$ in Theorem 6, there is another, group-theoretic computation of it with Burnside's Lemma. A $\beta(G)$-tuple of permutations $(\sigma_1, \ldots, \sigma_{\beta(G)})$, $\sigma_i \in S_n$, is called transitive if the permutation group $\langle \sigma_1, \ldots, \sigma_{\beta(G)} \rangle$ generated by them acts transitively on the set $\{1, 2, \ldots, n\}$. Let $\mathfrak{G}(n; \beta)$ denote the set of all transitive $\beta$-tuples of permutations in $S_n$.

Lemma 3 The following are equivalent for a voltage assignment $\phi = (\sigma_1, \ldots, \sigma_{\beta(G)})$ in $C^1_T(G; n)$.
(1) It is transitive, i.e., $\langle \sigma_1, \ldots, \sigma_{\beta(G)} \rangle$ acts transitively on $\{1, 2, \ldots, n\}$.
(2) The associated transition graph with $\{1, 2, \ldots, n\}$ as its vertex set and with pairs $\{i, \sigma_k(i)\}$ as its edges is connected.
(3) The derived graph $G^\phi$ is connected.

The following is a direct consequence of Lemma 3 and Theorem 4.

Lemma 4 ([32]) There is a one-to-one correspondence among the following sets:
(1) The set of similarity classes of transitive $\beta(G)$-tuples of permutations in $S_n$.
(2) The set of nonisomorphic connected n-fold coverings of $G$.
(3) The set of conjugacy classes of subgroups of index n in the free group generated by $\beta(G)$ elements.
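The correspondence in Lemma 4 can be illustrated by brute force for very small n and $\beta$: enumerate all $\beta$-tuples in $S_n$, keep the transitive ones, and count their orbits under simultaneous conjugation. This is only a sketch with hypothetical helper names, assuming permutations are tuples on $\{0, \ldots, n-1\}$; it is far too slow beyond tiny cases.

```python
from itertools import permutations, product


def is_transitive(perms, n):
    """True if the group generated by perms acts transitively on 0..n-1.
    In a finite group the forward orbit of 0 under the generators is the
    whole orbit, so a simple graph search suffices."""
    reached = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for p in perms:
            j = p[i]
            if j not in reached:
                reached.add(j)
                stack.append(j)
    return len(reached) == n


def conjugate(g, p):
    """Return g p g^{-1}: it sends g(i) to g(p(i))."""
    result = [0] * len(p)
    for i in range(len(p)):
        result[g[i]] = g[p[i]]
    return tuple(result)


def count_transitive_classes(beta, n):
    """Number of similarity classes of transitive beta-tuples in S_n."""
    sn = list(permutations(range(n)))
    seen = set()
    count = 0
    for tup in product(sn, repeat=beta):
        if tup in seen or not is_transitive(tup, n):
            continue
        count += 1
        for g in sn:  # mark the whole orbit under simultaneous conjugation
            seen.add(tuple(conjugate(g, p) for p in tup))
    return count


if __name__ == "__main__":
    print([count_transitive_classes(2, n) for n in range(1, 5)])
```

By Lemma 4 these counts should agree with $\mathrm{Isoc}(G; n)$ for a graph with $\beta(G) = \beta$, for example 3, 7, 26 for $\beta = 2$ and n = 2, 3, 4.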
Liskovets ([31]) used Burnside's Lemma to compute the number $\mathrm{Isoc}(G; n)$ of the conjugacy classes of subgroups of index n in the free group generated by $\beta = \beta(G)$ elements:

$$\mathrm{Isoc}(G; n) = |\mathfrak{G}(n; \beta)/S_n| = \frac{1}{n!} \sum_{g \in S_n} |\mathrm{Fix}(g)| = \frac{1}{n} \sum_{m \mid n} S_{\mathcal{F}}(m) \sum_{d \mid \frac{n}{m}} \mu\Big(\frac{n}{md}\Big)\, d^{(\beta - 1)m + 1},$$

where $\beta = \beta(G)$, $\mu(n)$ is the number-theoretic Möbius function and $S_{\mathcal{F}}(m)$ denotes the number of subgroups of index $m$ in the free group $\mathcal{F}$ generated by $\beta$ elements. Before stating Liskovets' method for computing the number $\mathrm{Isoc}(G; n)$, we introduce Hall's formula to count the number of subgroups of index n in a finitely generated free group. Let $\mathcal{F}$ be the free group of rank $\beta$ generated by $Y = \{s_1, s_2, \ldots, s_\beta\}$. Let $U$ be a subgroup of index n in $\mathcal{F}$ with a left coset representation:

$$\mathcal{F} = U \cdot 1 + Ug_2 + \cdots + Ug_n = U + Ug_2 + \cdots + Ug_n.$$
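Here, we can assume that the representatives $g_i$'s with $g_1 = 1$, the identity, are selected to be a Schreier system, even though it is not unique in general. Define a function $\phi$ on the set $\{g s^\epsilon \mid g \in \{g_i\},\ s \in Y,\ \epsilon = \pm 1\}$ so that
(i) if $g_i s^\epsilon \in \{g_i\}$, then $\phi(g_i s^\epsilon) = g_i s^\epsilon$,
(ii) if $g_i s^\epsilon \notin \{g_i\}$, then $\phi(g_i s^\epsilon) = g_j$ if $g_i s^\epsilon \in U g_j$.
Then the function $\phi$ satisfies
(iii) $\phi[\phi(g_i s^\epsilon) s^{-\epsilon}] = g_i$.

For a subgroup $U$ of a free group $\mathcal{F}$ generated by $Y$, the pair $\mathcal{U} = U[\{g_i\}, \phi]$ of a Schreier system $\{g_i\}$ of coset representatives and a function $\phi$ on the set $\{g s^\epsilon\}$ satisfying the three conditions (i)-(iii) listed above is called the standard representation for the subgroup $U$. For a standard representation $\mathcal{U} = U[\{g_i\}, \phi]$ for $U$, it is known that the elements $g s\, \phi(gs)^{-1}$, where $g$ runs over the representatives $g_i$'s and $s$ over the generating set $Y$, generate the subgroup $U$. In particular, the subgroup $U$ is also finitely generated. The following lemma gives a criterion for recognizing different representations of the same subgroup.

Lemma 5 ([7]) Let $\mathcal{U}_1 = U_1[\{g_i^{(1)}\}, \phi_1]$ and $\mathcal{U}_2 = U_2[\{g_j^{(2)}\}, \phi_2]$ be standard representations for the subgroups $U_1$ and $U_2$, respectively. Then $U_1 = U_2$ if and only if there is a one-to-one correspondence $\{g_i^{(1)}\} \leftrightarrow \{g_j^{(2)}\}$ between the representative sets mapping the identity onto itself such that if $g_i^{(1)} \leftrightarrow g_j^{(2)}$, then $\phi_1(g_i^{(1)} s^\epsilon) \leftrightarrow \phi_2(g_j^{(2)} s^\epsilon)$.

Proof: The necessity is clear. Conversely, suppose that there is a one-to-one correspondence with $g_1^{(1)} = g_1^{(2)} = 1$ such that if $g_i^{(1)} \leftrightarrow g_j^{(2)}$, then $\phi_1(g_i^{(1)} s^\epsilon) \leftrightarrow \phi_2(g_j^{(2)} s^\epsilon)$. We show by induction on the length $\ell(f)$ of an element $f$ that $f$ lies in corresponding cosets. This is true for $\ell(f) = 0$, since $f = 1$ and $1 \leftrightarrow 1$. And if $f$ is in the corresponding cosets $U_1 g_i^{(1)} \leftrightarrow U_2 g_j^{(2)}$, then $f s^\epsilon$ is in the corresponding cosets $U_1 \phi_1(g_i^{(1)} s^\epsilon) \leftrightarrow U_2 \phi_2(g_j^{(2)} s^\epsilon)$. Hence, the corresponding cosets are the same, and in particular $U_1 = U_2$. □

For the next lemma, let $U$ be a subgroup of a (free or not) group $\mathcal{F}$ generated by $Y = \{s_1, \ldots, s_i, \ldots\}$. For an $s_i \in Y$, by multiplying $s_i$ on the right of each (left) coset of $U$ in $\mathcal{F}$, we have a permutation $\pi_i = \pi(s_i)$ on the left cosets. Since $Y = \{s_i\}$ generates $\mathcal{F}$, the $\pi_i$'s generate a group which is transitive on the set of cosets. The following is a kind of converse for free groups.

Lemma 6 Given a free group $\mathcal{F}$ generated by $Y = \{s_1, \ldots, s_i, \ldots\}$, and a set of indices $I = \{1, \ldots, i, \ldots\}$. With each generator $s_i$, associate a permutation $\pi_i$ on the indices $I$. Suppose $J = \{1, \ldots, j, \ldots\}$ is the transitive constituent of $I$ containing 1. Then in $\mathcal{F}$, there is a Schreier system $\{g_1 = 1, g_2, \ldots, g_j, \ldots\}$ indexed by $J$ and a function $\phi$ such that $U[\{g_j\}, \phi]$ is a standard representation for a subgroup $U$ of $\mathcal{F}$, and $\phi(g_j s_i^\epsilon) = g_k$ if and only if $\pi_i^\epsilon(j) = k$.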
Proof: The permutations $\pi_i$ generate a permutation group $P$ of indices. Let $E$ be the subgroup of $P$ consisting of permutations $\pi$ which fix the index 1. The mapping $s_i \mapsto \pi_i$ determines an epimorphism $\mathcal{F} \to P$, and let $U$ be the subgroup of $\mathcal{F}$ mapped onto $E$; $U \to E$. Now, choose coset representatives $\{g\}$ of $U$ in $\mathcal{F}$ as a Schreier system:

$$\mathcal{F} = U \cdot 1 + Ug_2 + \cdots + Ug_j + \cdots,$$

and let $\phi(g_j s^\epsilon)$ be the representative of the coset containing $g_j s^\epsilon$, as before. If $g \mapsto \pi$, then $Ug \mapsto E\pi$, i.e., the epimorphism $\mathcal{F} \to P$ preserves the left coset representative. If $\pi$ maps the index 1 to $j$, we write $(1)\pi = j$, and assign the index $j$ to $g$, putting $g = g_j$. Hence, the Schreier system $\{g\}$ is indexed by $J$, in which if $g_j \mapsto \pi$ and $s_i^\epsilon \mapsto \pi_i^\epsilon$ then $g_j s_i^\epsilon \mapsto \pi\pi_i^\epsilon$. Now, if $(j)\pi_i^\epsilon = k$, then $(1)\pi\pi_i^\epsilon = k$ and $\pi\pi_i^\epsilon$ belongs to the left coset $E\eta$ of $E$ consisting of those permutations $\tau$ of $P$ which map 1 to $k$, $(1)\tau = k$. Here, $Ug_k \to E\eta$. Hence, $g_j s_i^\epsilon$ belongs to $Ug_k$, or $\phi(g_j s_i^\epsilon) = g_k$. □

With each element $f$ of $\mathcal{F}$ generated by $\{s_1, \ldots, s_\beta\}$, say $f = s_{i_1} \cdots s_{i_t}$, the associated permutation $\pi(f) = \pi(s_{i_1}) \cdots \pi(s_{i_t})$ defines a transitive n-degree permutation representation of the group $\mathcal{F}$, and those elements $f$ such that $\pi(f)$ fixes 1 form the subgroup $U$. Conversely, any transitive n-degree permutation representation of the group $\mathcal{F}$ determines a subgroup of index n
in the group $\mathcal{F}$, by Lemma 6. And, by Lemma 5, two such representations derived by a Schreier system $\{g_i\}$ and functions $\{\phi(g_i s)\}$ determine the same subgroup $U$ if and only if they are equivalent via a permutation $\sigma$ on $\{1, \ldots, n\}$ which fixes 1. Thus we have the following.

Lemma 7 There exists a one-to-one correspondence between the subgroups of index n of the free group $\mathcal{F}$ and the 1-similarity classes of transitive $\beta$-tuples of permutations in $S_n$.

So far, we have used some group-theoretic terminology like Schreier systems or the standard representations for subgroups to obtain Lemma 7. But we can give a simpler proof by using graph coverings and the theory of fundamental groups as follows: Consider the free group $\mathcal{F}$ as the fundamental group $\pi_1(G, v)$ of a connected graph $G$ with base vertex $v$. We assume that $G$ has $\beta = \beta(G)$ cotree edges. It is well-known that every subgroup of index n of $\mathcal{F} = \pi_1(G, v)$ is expressed as the image $p_*(\pi_1(G^\phi, v_i))$ of the fundamental group of a connected covering $p : G^\phi \to G$, where $\phi = (\sigma_1, \ldots, \sigma_{\beta(G)})$ is a transitive permutation voltage assignment in $C^1_T(G; n)$, and $v_i$ is a vertex in the fibre of $v$. Furthermore, for any two transitive permutation voltage assignments $\phi, \psi$ in $C^1_T(G; n)$, the two image subgroups coincide if and only if the two coverings are isomorphic by a covering isomorphism $\Phi$ which preserves the base point. Hence, we can say that in Theorem 4, the permutation $\sigma$ leaves fixed 1, by relabeling of vertices in the fibre $p^{-1}(v)$ if necessary.

REMARK As a generalization of Lemma 7, the connection between subgroups of any finitely presented group and its transitive permutation representations (see [8] or [33]) can be formulated as follows: Given a finitely presented group

$$\mathcal{A} = \langle x_1, x_2, \ldots, x_r : f_1 = 1, f_2 = 1, \ldots \rangle,$$

there is a one-to-one correspondence between the subgroups of index $n > 1$ in $\mathcal{A}$ and the root-similarity classes of transitive r-tuples $(x_1, x_2, \ldots, x_r)$ in $S_n$ that satisfy the defining relations $\{f_j = 1\}$, $j = 1, 2, \ldots$.

Now, we may enumerate recursively the number of subgroups of index n in the free group $\mathcal{F}$.

Theorem 7 ([7]) The number $S_{\mathcal{F}}(n)$ of subgroups of index n in the free group $\mathcal{F}$ generated by $\beta$ elements is given as

$$S_{\mathcal{F}}(n) = n\,(n!)^{\beta - 1} - \sum_{t=1}^{n-1} \big( (n-t)! \big)^{\beta - 1} S_{\mathcal{F}}(t) \quad \text{with} \quad S_{\mathcal{F}}(1) = 1.$$
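Hall's recursion in Theorem 7 translates directly into a short memoized function. The sketch below assumes exactly the recursion as stated above; the name `subgroups_of_index` is an illustrative choice.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def subgroups_of_index(beta, n):
    """Number of subgroups of index n in the free group of rank beta,
    computed by Hall's recursion (Theorem 7)."""
    if n == 1:
        return 1
    correction = sum(
        factorial(n - t) ** (beta - 1) * subgroups_of_index(beta, t)
        for t in range(1, n)
    )
    return n * factorial(n) ** (beta - 1) - correction


if __name__ == "__main__":
    print([subgroups_of_index(2, n) for n in range(1, 7)])
```

For $\beta = 1$ the free group is infinite cyclic, and the recursion returns 1 for every index, as expected.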
Proof: Clear for $n = 1$. Choose $\beta$ permutations $P_1, \ldots, P_\beta$ on symbols $\{1, g_2, \ldots, g_n\}$. In general, $P_1, \ldots, P_\beta$ need not generate a group transitive on all of $1, g_2, \ldots, g_n$. Let the transitive constituent including 1 be $1, b_2, \ldots, b_t$. Disregarding the remaining letters, we may take as $\pi(s_1), \ldots, \pi(s_\beta)$ the permutations on $1, b_2, \ldots, b_t$, and these will determine a unique subgroup of index $t$. The remaining $n - t$ letters could occur in $P_1, \ldots, P_\beta$ in $[(n-t)!]^\beta$ ways. In addition, by Lemma 5, the same group will be determined if we replace $1, b_2, \ldots, b_t$ by any other combination $1, c_2, \ldots, c_t$ in the symbols $\{1, g_2, \ldots, g_n\}$, and the remaining $n - t$ letters in an arbitrary way. Also, the symbols $b_2, \ldots, b_t$ can be replaced by $c_2, \ldots, c_t$ from $g_2, \ldots, g_n$ in $(n-1)(n-2) \cdots (n-t+1)$ different ways. Thus a total of

$$(n-1)(n-2) \cdots (n-t+1)\,[(n-t)!]^{\beta} = (n-1)!\,[(n-t)!]^{\beta - 1}$$

different choices of the permutations $P_1, \ldots, P_\beta$ may be associated with the same subgroup of index $t$, and $(n-1)!\,[(n-t)!]^{\beta - 1} S_{\mathcal{F}}(t)$ choices are associated with the subgroups of index $t$. Hence, we get

$$\sum_{t=1}^{n} (n-1)!\,[(n-t)!]^{\beta - 1} S_{\mathcal{F}}(t) = (n!)^{\beta}.$$

Dividing by $(n-1)!$, we get the desired formula. □

The symmetric group $S_n$ acts naturally on the set $\{1, 2, \ldots, n\}$, and also acts on the set $\mathfrak{G}(n; \beta)$ by simultaneous coordinatewise conjugacy. These two actions are mutually orthogonal, because any $\beta$-tuple in the set $\mathfrak{G}(n; \beta)$ is transitive. Hence, the group $S_{n-1}$, as the subgroup of $S_n$ consisting of permutations $\sigma$ fixing 1, acts freely on $\mathfrak{G}(n; \beta)$, and by Lemma 7 its orbits correspond to the subgroups of index n, i.e.,

$$|\mathfrak{G}(n; \beta)| = (n-1)!\, S_{\mathcal{F}}(n),$$

where $\mathcal{F}$ is the free group generated by $\beta$ elements. Now, we go back to Liskovets' method for computing the number $\mathrm{Isoc}(G; n)$. It is already known that

$$\mathrm{Isoc}(G; n) = |\mathfrak{G}(n; \beta)/S_n| = \frac{1}{n!} \sum_{g \in S_n} |\mathrm{Fix}(g)|$$
and $\mathrm{Fix}(g) = \mathfrak{G}(n; \beta) \cap (Z(g) \times \cdots \times Z(g))$, where $\beta = \beta(G)$. If $\mathrm{Fix}(g) \neq \emptyset$ and $\phi = (\sigma_1, \ldots, \sigma_\beta)$ belongs to $\mathrm{Fix}(g)$, then $g$ commutes with the group $\langle \sigma_1, \ldots, \sigma_\beta \rangle$, which is transitive on the set $\{1, 2, \ldots, n\}$. Hence, $g$ must be a regular permutation, i.e., it consists of independent cycles of the same length $\ell$. For each $\ell m = n$, there exist $n!/(m!\,\ell^m)$ regular permutations $g$ in $S_n$ consisting of $m$ cycles of length $\ell$, and $|\mathrm{Fix}(g)|$ are equal for all such regular $g$. We denote this value by $|\mathrm{Fix}((\ell^m))|$, and call such $g$ a permutation of type $(\ell^m)$. Hence, we get

$$\mathrm{Isoc}(G; n) = \frac{1}{n!} \sum_{g \in S_n} |\mathrm{Fix}(g)| = \sum_{\ell \mid n,\ \ell m = n} \frac{|\mathrm{Fix}((\ell^m))|}{m!\,\ell^m}.$$
The following lemma is well-known and an elementary exercise in group theory.

Lemma 8 Let $g_0$ be the permutation in $S_n$ of type $(\ell^m)$:

$$g_0 = (1\ 2\ \cdots\ \ell)(\ell + 1\ \cdots\ 2\ell) \cdots ((m-1)\ell + 1\ \cdots\ n).$$

Then, the centralizer $Z(g_0)$ of $g_0$ is a wreath product $Z_\ell \wr S_m$, where $Z_\ell$ is the cyclic group generated by the $\ell$-cycle $(1\ 2\ \cdots\ \ell)$. An element of the wreath product $Z_\ell \wr S_m$ is of the form $a = (c_1, \ldots, c_m; \bar a)$, where $c_i \in Z_\ell$ and $\bar a \in S_m$. The element $a = (c_1, \ldots, c_m; \bar a)$ represents a permutation in $S_n$ acting on the set $\{1, \ldots, n\}$ as follows. Notice that each element in $\{1, \ldots, n\}$ is of the form $k = (s-1)\ell + t = t_s$ ($1 \leq s \leq m$, $1 \leq t \leq \ell$). Then

$$(k)a = (t_s)a = ((s)\bar a - 1)\ell + (t)c_s = ((t)c_s)_{(s)\bar a},$$

that is, first perform a cyclic transposition by the s-th cycle $c_s$ of the permutation $g_0$ for all $s = 1, \ldots, m$, and then shift through the action of the permutation $\bar a$ in $S_m$. If $b = (d_1, \ldots, d_m; \bar b) \in Z_\ell \wr S_m$, then

$$a \cdot b = (c_1 + d_{(1)\bar a}, \ldots, c_m + d_{(m)\bar a};\ \bar a \bar b) = (c_1 d_{(1)\bar a}, \ldots, c_m d_{(m)\bar a};\ \bar a \bar b),$$

where $(s)\bar a \bar b = ((s)\bar a)\bar b$ for all $s \in \{1, \ldots, m\}$.

Proof: Let $g_0 = (1\ 2\ \cdots\ \ell)(\ell + 1\ \cdots\ 2\ell) \cdots ((m-1)\ell + 1\ \cdots\ n)$. Then $g_0$ can be identified with the element $(1, \ldots, 1; 1)$ in $Z_\ell \wr S_m$. Then, for each $g = (c_1, \ldots, c_m; \bar a)$ in $Z_\ell \wr S_m$, we have $g g_0 = (1 + c_1, \ldots, 1 + c_m; \bar a) = g_0 g$. It implies that $Z_\ell \wr S_m$ is a subgroup of the centralizer $Z(g_0)$ of $g_0$. Let $C(g_0)$ be the conjugacy class of $g_0$. Then $|C(g_0)| = n!/(m!\,\ell^m)$. Since $|S_n| = |Z(g_0)|\,|C(g_0)|$, we have $|Z(g_0)| = m!\,\ell^m = |Z_\ell \wr S_m|$ and hence $Z_\ell \wr S_m$ is the centralizer $Z(g_0)$ of $g_0$. □

From the notations, we have $|\mathrm{Fix}((\ell^m))| = |\mathrm{Fix}(g_0)|$ and
$$\mathrm{Fix}(g_0) = \{ (a_1, \ldots, a_\beta) \in (Z_\ell \wr S_m)^\beta \mid \langle a_1, \ldots, a_\beta \rangle \text{ is transitive in } \{1, \ldots, n = \ell m\} \}.$$

We set

$$F(\ell^m) = \{ (a_1, \ldots, a_\beta) \in (Z_\ell \wr S_m)^\beta \mid \langle \bar a_1, \ldots, \bar a_\beta \rangle \text{ is transitive in } \{1, \ldots, m\} \}.$$

The following lemma is due to Liskovets [31].

Lemma 9 For any $n = \ell m$ and any $\beta$, we have

(1) $\displaystyle |F(\ell^m)| = \sum_{k \mid \ell,\ kd = \ell} k^{m-1}\, |\mathrm{Fix}((d^m))|$.

(2) $|F(\ell^m)| = \ell^{\beta m}\, |\mathfrak{G}(m; \beta)|$.

Proof: Let $\ell = kd$ and set $S = \{1, 1 + k, \ldots, 1 + (d-1)k\}$. For each $2 \leq s \leq m$ and $0 \leq h_s \leq k - 1$, let

$$B(0, h_2, \ldots, h_m)_0 = S \cup \Big( \bigcup_{s=2}^{m} \big( S + (s-1)\ell + h_s \big) \Big)$$

and $B(0, h_2, \ldots, h_m)_t = B(0, h_2, \ldots, h_m)_0 + t$ for each $t = 1, 2, \ldots, k - 1$, where all arithmetic is done modulo $\ell$. It is not hard to show that every element in $F(\ell^m)$ is transitive on each of the sets $B(0, h_2, \ldots, h_m)_0, \ldots, B(0, h_2, \ldots, h_m)_{k-1}$ for some $k$ and some $h_s = 0, 1, \ldots, k - 1$, $s = 2, \ldots, m$. Notice that $\mathrm{Fix}((d^m))$ can be identified with the set of all elements in $(Z_\ell \wr S_m)^\beta$ which are transitive on each of the sets $B(0, 0, \ldots, 0)_0, \ldots, B(0, 0, \ldots, 0)_{k-1}$. Moreover, $(0, h_2, \ldots, h_m; 1)\, \mathrm{Fix}((d^m))\, (0, h_2, \ldots, h_m; 1)^{-1}$ is the set of all elements in $(Z_\ell \wr S_m)^\beta$ which are transitive on each of the sets $B(0, h_2, \ldots, h_m)_0, \ldots, B(0, h_2, \ldots, h_m)_{k-1}$. Hence, we have (1). Now, we aim to show (2). For each $\beta$-tuple $(\bar a_1, \ldots, \bar a_\beta)$ which is transitive in $\{1, \ldots, m\}$, there exist $\ell^{\beta m}$ elements $(b_1, \ldots, b_\beta)$ in $F(\ell^m)$ such that $(\bar b_1, \ldots, \bar b_\beta) = (\bar a_1, \ldots, \bar a_\beta)$. It implies (2). □

Theorem 8 ([31]) The number $\mathrm{Isoc}(G; n)$ of the conjugacy classes of subgroups of index n in the free group generated by $\beta$ elements is given by the formula
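$$\mathrm{Isoc}(G; n) = \frac{1}{n} \sum_{m \mid n} S_{\mathcal{F}}(m) \sum_{d \mid \frac{n}{m}} \mu\Big(\frac{n}{md}\Big)\, d^{(\beta - 1)m + 1},$$

where $\beta = \beta(G)$ and $\mu(n)$ is the number-theoretic Möbius function.

Proof: Let $A(n) = \dfrac{|F(n^m)|}{n^{m-1}}$ and $a(n) = \dfrac{|\mathrm{Fix}((n^m))|}{n^{m-1}}$. Then, by Lemma 9 (1),

$$A(n) = \frac{1}{n^{m-1}} \sum_{d \mid n} \Big(\frac{n}{d}\Big)^{m-1} |\mathrm{Fix}((d^m))| = \sum_{d \mid n} \frac{|\mathrm{Fix}((d^m))|}{d^{m-1}} = \sum_{d \mid n} a(d).$$

By the Möbius inversion formula, we can see that

$$a(\ell) = \sum_{d \mid \ell} A(d)\, \mu\Big(\frac{\ell}{d}\Big) = \sum_{d \mid \ell} \mu\Big(\frac{\ell}{d}\Big) \frac{|F(d^m)|}{d^{m-1}}.$$

By the definition of the function $a(n)$, we can get

$$|\mathrm{Fix}((\ell^m))| = \sum_{d \mid \ell} \mu\Big(\frac{\ell}{d}\Big) \Big(\frac{\ell}{d}\Big)^{m-1} |F(d^m)|.$$

Recall that

$$\mathrm{Isoc}(G; n) = \frac{1}{n!} \sum_{g \in S_n} |\mathrm{Fix}(g)| = \sum_{\ell \mid n,\ \ell m = n} \frac{|\mathrm{Fix}((\ell^m))|}{m!\,\ell^m}$$

and $|\mathfrak{G}(n; \beta)| = (n-1)!\, S_{\mathcal{F}}(n)$. Now, by using these facts and Lemma 9 (2) together with an elementary computation, we have the theorem. □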
The number $\mathrm{Isoc}(G; n)$ for small n and $\beta(G)$ is listed in Table 3.

Question: What are the relations between the two different formulas for $\mathrm{Isoc}(G; n)$?

β(G)     1    2     3        4           5              6
n = 1    1    1     1        1           1              1
n = 2    1    3     7        15          31             63
n = 3    1    7     41       235         1361           7987
n = 4    1    26    604      14120       334576         7987616
n = 5    1    97    13753    1712845     207009649      24875000437
n = 6    1    624   504243   371515454   268530771271   193466859054994

Table 3. The number $\mathrm{Isoc}(G; n)$ for small n and small $\beta(G)$
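Liskovets' formula (Theorem 8) can be evaluated with Hall's recursion supplying the numbers $S_{\mathcal{F}}(m)$. The following sketch assumes the formula exactly as stated in Theorem 8; the helper names (`mobius`, `divisors`, `subgroups_of_index`, `isoc`) are hypothetical.

```python
from functools import lru_cache
from math import factorial


def mobius(n):
    """Number-theoretic Moebius function, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


@lru_cache(maxsize=None)
def subgroups_of_index(beta, n):
    """S_F(n) for the free group of rank beta, by Hall's recursion (Theorem 7)."""
    if n == 1:
        return 1
    return n * factorial(n) ** (beta - 1) - sum(
        factorial(n - t) ** (beta - 1) * subgroups_of_index(beta, t)
        for t in range(1, n)
    )


def isoc(beta, n):
    """Isoc(G; n) for Betti number beta, by Liskovets' formula (Theorem 8)."""
    total = 0
    for m in divisors(n):
        inner = sum(
            mobius(n // (m * d)) * d ** ((beta - 1) * m + 1)
            for d in divisors(n // m)
        )
        total += subgroups_of_index(beta, m) * inner
    quotient, remainder = divmod(total, n)
    if remainder != 0:
        raise ArithmeticError("sum is not divisible by n")
    return quotient


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, [isoc(beta, n) for beta in range(1, 7)])
```

Running it for small arguments gives a way to cross-check entries of Table 3 against the closed forms of Example 1.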
REMARK The fundamental group of any (connected) graph $G$ is a free group generated by $\beta(G)$ elements, and the conjugacy classes of its subgroups of index n are in one-to-one correspondence with the nonisomorphic connected n-fold coverings of $G$. Such a correspondence is established via the monomorphic image of the fundamental group of a connected covering of $G$. Since any covering of a graph is also a graph, every subgroup of a free group is also a free group. Moreover, any subgroup of index n in the free group generated by $\beta = \beta(G)$ elements is a monomorphic image of the fundamental group of an n-fold connected covering of $G$. Hence, it must be a free group generated by $1 + n(\beta(G) - 1)$ elements, because this number must be equal to the Betti number of a connected n-fold covering of $G$. Related to the construction problem for all nonisomorphic connected n-fold coverings of a graph, one can ask the following two questions: (1) find a (minimal) generating set for each subgroup of a given index of a finitely generated free group $\mathcal{F}$ and (2) find all possible lists of a (minimal) generating set for each of those subgroups. The first question can be answered by the Reidemeister-Schreier method. The second one can be done by the description of $\mathrm{Aut}(\mathcal{F})$ (see [34]).

4 Regular coverings with abelian voltage groups and subgroups of free abelian groups
Let $\mathcal{A}$ be a finite group and let $S_{\mathcal{A}}$ denote the symmetric group on the group elements of $\mathcal{A}$. It gives the (left) regular representation $\mathcal{A} \to S_{\mathcal{A}}$ via $g \mapsto L(g)$, the left multiplication by $g$ on $\mathcal{A}$. Clearly, this representation is faithful and the group $\mathcal{A}$ can be identified with the group of left translations $L(g)$: $\mathcal{A} = \{L(g) \mid g \in \mathcal{A}\}$ (Cayley Theorem). Notice that a permutation voltage assignment $\phi : D(G) \to S_{\mathcal{A}}$ having its images in $\mathcal{A}$ is nothing but an $\mathcal{A}$-voltage assignment of $G$, and for such a voltage assignment $\phi$, the permutation derived graph $G^{\phi}$ is just the ordinary derived graph
$G \times_{\phi} \mathcal{A}$. Let $C^1_T(G;\mathcal{A})$ denote the set of all normalized $\mathcal{A}$-voltage assignments of $G$. Recall (4) that any regular $n$-fold covering of $G$ is isomorphic to an ordinary derived graph $G \times$
($\beta(G)$ times),
that is, an $\mathcal{A}$-voltage assignment $\phi$ of $G$ can be identified as a $\beta(G)$-tuple $(g_1, \ldots, g_{\beta(G)})$ of group elements $g_i \in \mathcal{A}$. Moreover, such a $\beta(G)$-tuple of $g_i$'s derives a connected covering if and only if it is transitive. By definition, this means that the subgroup $\langle g_1, \ldots, g_{\beta(G)} \rangle$ generated by them acts transitively on the group $\mathcal{A}$ (under the left translation on $\mathcal{A}$), or equivalently that $\{g_1, g_2, \ldots, g_{\beta(G)}\}$ generates $\mathcal{A}$. Under the coordinatewise $\mathrm{Aut}(\mathcal{A})$-action on the set of transitive $\beta(G)$-tuples of group elements $g_i \in \mathcal{A}$, any two transitive $\beta(G)$-tuples of elements in $\mathcal{A}$ belong to the same orbit if and only if they derive (connected) isomorphic $\mathcal{A}$-coverings, by Theorem 9.
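Clearly, the $\mathrm{Aut}(\mathcal{A})$-action on the set of transitive $\beta(G)$-tuples of group elements $g_i \in \mathcal{A}$ is free (no nonidentity automorphism fixes a transitive tuple), from which Burnside's Lemma gives an enumeration formula for $\mathrm{Isoc}(G;\mathcal{A})$ as follows. Theorem 10 For any finite group $\mathcal{A}$,
$$\mathrm{Isoc}(G;\mathcal{A}) = \frac{|\mathfrak{G}(\mathcal{A};\beta(G))|}{|\mathrm{Aut}(\mathcal{A})|},$$
where $\mathfrak{G}(\mathcal{A};\beta) = \{(g_1, g_2, \ldots, g_\beta) \in \mathcal{A}^{\beta} \mid \{g_1, g_2, \ldots, g_\beta\} \text{ generates } \mathcal{A}\}$.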
Note that the set $\mathfrak{G}(\mathcal{A};\beta(G))$ can be identified with the set of epimorphisms from the free group generated by $\beta(G)$ elements onto the group $\mathcal{A}$. This identification will be reviewed again in section 6. It is not difficult to show that the components of any regular covering $G \times_{\phi} \mathcal{A} \to G$ are isomorphic to each other as coverings of $G$, and any two connected isomorphic regular coverings of $G$ must have isomorphic covering transformation groups. To describe a component of the covering graph $G \times$
(G;d).
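(2) For any natural number $n$, $\mathrm{Isoc}^R(G;n) = \sum_{\mathcal{A}} \mathrm{Isoc}(G;\mathcal{A})$, where $\mathcal{A}$ runs over all nonisomorphic groups of order $n$.
(3) For any finite group $\mathcal{A}$, $\mathrm{Iso}(G;\mathcal{A}) = \sum_{\mathcal{S}} \mathrm{Isoc}(G;\mathcal{S})$, where $\mathcal{S}$ runs over all nonisomorphic subgroups of $\mathcal{A}$.
(4) For any finite groups $\mathcal{A}$ and $\mathcal{B}$ with $(|\mathcal{A}|, |\mathcal{B}|) = 1$, $\mathrm{Iso}(G;\mathcal{A} \oplus \mathcal{B}) = \mathrm{Iso}(G;\mathcal{A})\,\mathrm{Iso}(G;\mathcal{B})$ and $\mathrm{Isoc}(G;\mathcal{A} \oplus \mathcal{B}) = \mathrm{Isoc}(G;\mathcal{A})\,\mathrm{Isoc}(G;\mathcal{B})$.
(5) For any two relatively prime numbers $m$ and $n$, $\mathrm{Iso}^R(G;mn) \ge \mathrm{Iso}^R(G;m)\,\mathrm{Iso}^R(G;n)$.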
NOTE The number $\mathrm{Iso}^R(G;mn)$ can be strictly greater than the number $\mathrm{Iso}^R(G;m)\,\mathrm{Iso}^R(G;n)$, even if $m$ and $n$ are distinct primes. For example, if $\beta(G) \ge 2$, $m = 2$ and $n = 3$, then $\mathrm{Iso}^R(G;6) > \mathrm{Iso}^R(G;2)\,\mathrm{Iso}^R(G;3)$, because
$$\mathrm{Iso}^R(G;6) = \mathrm{Isoc}(G;\mathbb{Z}_6) + \mathrm{Isoc}(G;\mathbb{D}_3) + \mathrm{Isoc}(G;\mathbb{Z}_3) + \mathrm{Isoc}(G;\mathbb{Z}_2) + 1 = \mathrm{Iso}(G;\mathbb{Z}_6) + \mathrm{Isoc}(G;\mathbb{D}_3),$$
and $\mathrm{Iso}(G;\mathbb{Z}_6) = \mathrm{Iso}(G;\mathbb{Z}_2)\,\mathrm{Iso}(G;\mathbb{Z}_3) = \mathrm{Iso}^R(G;2)\,\mathrm{Iso}^R(G;3)$.
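EXAMPLE 2 Let $\mathbb{Z}_{p^m}$ be the cyclic group of order $p^m$, $p$ prime. Then $\mathrm{Aut}(\mathbb{Z}_{p^m})$ can be identified with the set of all elements of $\mathbb{Z}_{p^m}$ which are relatively prime to $p^m$, that is, the set $\{\lambda \in \mathbb{Z}_{p^m} : (\lambda, p^m) = 1\}$, and
$$\mathfrak{G}(\mathbb{Z}_{p^m};\beta(G)) = \{(g_1, g_2, \ldots, g_{\beta(G)}) \in (\mathbb{Z}_{p^m})^{\beta(G)} \mid \text{at least one of the } g_i\text{'s generates } \mathbb{Z}_{p^m}\}.$$
It implies that
$$|\mathrm{Aut}(\mathbb{Z}_{p^m})| = p^{m-1}(p-1) \quad\text{and}\quad |\mathfrak{G}(\mathbb{Z}_{p^m};\beta(G))| = p^{\beta(G)m} - p^{\beta(G)(m-1)}.$$
Then, by Theorem 10,
$$\mathrm{Isoc}(G;\mathbb{Z}_{p^m}) = \frac{p^{\beta(G)m} - p^{\beta(G)(m-1)}}{p^{m-1}(p-1)} = p^{(\beta(G)-1)(m-1)}\,\frac{p^{\beta(G)}-1}{p-1}$$
for $m > 0$. Now, by Theorem 11(3) and the lattice structure of subgroups of $\mathbb{Z}_{p^m}$, we have
$$\mathrm{Iso}(G;\mathbb{Z}_{p^m}) = 1 + \sum_{h=1}^{m} p^{(\beta(G)-1)(h-1)}\,\frac{p^{\beta(G)}-1}{p-1} = 1 + \frac{p^{\beta(G)}-1}{p-1}\cdot\frac{p^{m(\beta(G)-1)}-1}{p^{\beta(G)-1}-1},$$
where the closed form on the right requires $\beta(G) \ge 2$ (for $\beta(G) = 1$ the sum equals $m$).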
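From Example 2 and Theorem 11(4), we can get Theorem 12 (16, 22) For any $n = p_1^{m_1} p_2^{m_2} \cdots p_\ell^{m_\ell} > 1$ (a prime factorization), the number of isomorphism classes of connected $\mathbb{Z}_n$-coverings of $G$ is
$$\mathrm{Isoc}(G;\mathbb{Z}_n) = \begin{cases} 0 & \text{if } \beta(G) = 0, \\ \displaystyle\prod_{i=1}^{\ell} p_i^{(m_i-1)(\beta(G)-1)}\,\frac{p_i^{\beta(G)}-1}{p_i-1} & \text{if } \beta(G) \ge 1. \end{cases}$$
And, the number of nonisomorphic $\mathbb{Z}_n$-coverings of $G$ is
$$\mathrm{Iso}(G;\mathbb{Z}_n) = \begin{cases} 1 & \text{if } \beta(G) = 0, \\ \displaystyle\prod_{i=1}^{\ell} (m_i + 1) & \text{if } \beta(G) = 1, \\ \displaystyle\prod_{i=1}^{\ell} \left(1 + \frac{(p_i^{\beta(G)}-1)(p_i^{m_i(\beta(G)-1)}-1)}{(p_i-1)(p_i^{\beta(G)-1}-1)}\right) & \text{if } \beta(G) \ge 2. \end{cases}$$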
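As a sanity check on Theorem 12, the following sketch compares the closed form for $\mathrm{Isoc}(G;\mathbb{Z}_n)$ with a direct count based on Theorem 10 (generating $\beta(G)$-tuples of $\mathbb{Z}_n$ divided by $|\mathrm{Aut}(\mathbb{Z}_n)| = \varphi(n)$), and obtains $\mathrm{Iso}(G;\mathbb{Z}_n)$ by summing over the subgroups $\mathbb{Z}_m$, $m \mid n$, as in Theorem 11(3). All function names are hypothetical helpers, and the trivial group $\mathbb{Z}_1$ is assumed to contribute exactly one (trivial) covering.

```python
from functools import reduce
from itertools import product
from math import gcd


def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def isoc_cyclic(beta, n):
    """Closed form of Theorem 12 (Z_1 is counted as one trivial covering)."""
    if n == 1:
        return 1
    if beta == 0:
        return 0
    result = 1
    for p, m in factorize(n).items():
        result *= p ** ((m - 1) * (beta - 1)) * (p ** beta - 1) // (p - 1)
    return result


def isoc_cyclic_bruteforce(beta, n):
    """Theorem 10: generating beta-tuples of Z_n divided by phi(n)."""
    # A tuple generates Z_n exactly when gcd(t_1, ..., t_beta, n) = 1.
    generating = sum(
        1 for t in product(range(n), repeat=beta) if reduce(gcd, t, n) == 1
    )
    phi = sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
    return generating // phi


def iso_cyclic(beta, n):
    """Theorem 11(3): sum Isoc(G; Z_m) over the subgroups Z_m, m | n."""
    return sum(isoc_cyclic(beta, m) for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    print(isoc_cyclic(2, 6), isoc_cyclic_bruteforce(2, 6), iso_cyclic(2, 9))
```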
For the remainder of this section, we aim to describe the enumeration of nonisomorphic regular coverings having a finite abelian voltage group. By the classification of finite abelian groups, any finite abelian group $\mathcal{A}$ is isomorphic to a direct sum of finite cyclic groups of prime power order. In order to compute the number $\mathrm{Iso}(G;\mathcal{A})$, it suffices, by Theorem 11((3),(4)), to compute the number $\mathrm{Iso}(G;\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}})$ or the number $\mathrm{Isoc}(G;\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}})$ for a prime $p$. To do this, we first introduce the following lemma. Lemma 10 (22) (1) For any natural numbers $m$ and $n$ with $m \le n$, and a prime $p$, we have
$$|\mathfrak{G}(m\mathbb{Z}_p;n)| = p^{\frac{m(m-1)}{2}}(p^n - 1)(p^{n-1} - 1)\cdots(p^{n-m+1} - 1),$$
and
$$|\mathrm{Aut}(m\mathbb{Z}_p)| = |\mathfrak{G}(m\mathbb{Z}_p;m)| = p^{\frac{m(m-1)}{2}}(p^m - 1)(p^{m-1} - 1)\cdots(p - 1).$$
(2) For any natural number $s \ge 1$, we have
$$|\mathfrak{G}(m\mathbb{Z}_{p^s};n)| = p^{(s-1)mn}\,|\mathfrak{G}(m\mathbb{Z}_p;n)| \quad\text{and}\quad |\mathrm{Aut}(m\mathbb{Z}_{p^s})| = p^{(s-1)m^2}\,|\mathrm{Aut}(m\mathbb{Z}_p)|.$$
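By Theorems 11(3), 10 and Lemma 10, we have Corollary 2 (22) For any $m$, the number of nonisomorphic connected $m\mathbb{Z}_p$-coverings of $G$ is
$$\mathrm{Isoc}(G;m\mathbb{Z}_p) = \frac{(p^{\beta(G)} - 1)(p^{\beta(G)-1} - 1)\cdots(p^{\beta(G)-m+1} - 1)}{(p^m - 1)(p^{m-1} - 1)\cdots(p - 1)}.$$
The number of nonisomorphic $m\mathbb{Z}_p$-coverings of $G$ is
$$\mathrm{Iso}(G;m\mathbb{Z}_p) = 1 + \sum_{h=1}^{m} \frac{(p^{\beta(G)} - 1)(p^{\beta(G)-1} - 1)\cdots(p^{\beta(G)-h+1} - 1)}{(p^h - 1)(p^{h-1} - 1)\cdots(p - 1)}.$$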
This formula for the number $\mathrm{Iso}(G;m\mathbb{Z}_p)$ in Corollary 2 is much more explicit than that of Hofmeister in 12.
REMARK It is well-known (see 46) that the number of the $m$-dimensional subspaces of the $n$-dimensional vector space $n\mathbb{Z}_p$ over the field $\mathbb{Z}_p$ is equal to the Gaussian coefficient
$$\begin{bmatrix} n \\ m \end{bmatrix}_p = \frac{\prod_{i=n-m+1}^{n} (p^i - 1)}{\prod_{i=1}^{m} (p^i - 1)}.$$
Hence, we can say that the number of nonisomorphic connected $m\mathbb{Z}_p$-coverings of a connected graph $G$ is equal to the number of the $m$-dimensional subspaces of the $\beta(G)$-dimensional vector space $\beta(G)\mathbb{Z}_p$. Let $m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}}$ be the direct sum of two abelian groups $m_1\mathbb{Z}_{p^{s_1}}$ and $m_2\mathbb{Z}_{p^{s_2}}$ (say, $s_2 < s_1$) and let $g_1 = (g_{11}, g_{12}), \ldots, g_n = (g_{n1}, g_{n2}) \in m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}}$. Then $\{g_1, \ldots, g_n\}$ generates $m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}}$ if and only if $\{(p^{s_1-1}g_{11}, p^{s_2-1}g_{12}), \ldots, (p^{s_1-1}g_{n1}, p^{s_2-1}g_{n2})\}$ generates $(m_1+m_2)\mathbb{Z}_p$. An argument analogous to the proof of Lemma 10 gives
$$|\mathfrak{G}(m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}};n)| = p^{n(m_1(s_1-1)+m_2(s_2-1))}\,|\mathfrak{G}((m_1+m_2)\mathbb{Z}_p;n)|.$$
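The identification of $\mathrm{Isoc}(G;m\mathbb{Z}_p)$ with a Gaussian coefficient can be checked numerically for small parameters. The sketch below (hypothetical helper names, not from the text) counts generating $n$-tuples of $m\mathbb{Z}_p$ by explicitly closing each tuple under addition, divides by $|\mathrm{Aut}(m\mathbb{Z}_p)| = |GL_m(p)|$ as in Theorem 10, and compares the result with the Gaussian coefficient; here $n$ plays the role of $\beta(G)$.

```python
from itertools import product


def gaussian_binomial(n, m, p):
    """Number of m-dimensional subspaces of an n-dimensional space over Z_p."""
    num = den = 1
    for i in range(1, m + 1):
        num *= p ** (n - m + i) - 1
        den *= p ** i - 1
    return num // den


def span_size(vectors, m, p):
    """Size of the subgroup of (Z_p)^m generated by `vectors`."""
    zero = (0,) * m
    seen, frontier = {zero}, [zero]
    while frontier:
        v = frontier.pop()
        for g in vectors:
            w = tuple((a + b) % p for a, b in zip(v, g))
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return len(seen)


def count_generating_tuples(m, n, p):
    """Brute-force |G(mZ_p; n)|: n-tuples of vectors spanning (Z_p)^m."""
    vecs = list(product(range(p), repeat=m))
    return sum(
        1 for t in product(vecs, repeat=n) if span_size(t, m, p) == p ** m
    )


def order_gl(m, p):
    """|GL_m(p)| = |Aut(mZ_p)|, the product of (p^m - p^i) for i < m."""
    result = 1
    for i in range(m):
        result *= p ** m - p ** i
    return result


if __name__ == "__main__":
    m, n, p = 2, 3, 2
    print(count_generating_tuples(m, n, p) // order_gl(m, p),
          gaussian_binomial(n, m, p))
```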
But, in general, $|\mathrm{Aut}(m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}})| \ne |\mathfrak{G}(m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}};m_1+m_2)|$. Note that the group $m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}}$ is not an elementary abelian $p$-group (since $s_1 > 1$), so its automorphism group cannot be identified with the group of nonsingular linear transformations of a vector space over $\mathbb{Z}_p$. Now, an elementary exercise gives
$$|\mathrm{Aut}(m_1\mathbb{Z}_{p^{s_1}} \oplus m_2\mathbb{Z}_{p^{s_2}})| = p^{g(m_i,s_i)} \prod_{i=1}^{2}\prod_{h=1}^{m_i} (p^{m_i-h+1} - 1),$$
where
$$g(m_i,s_i) = m\left(\sum_{i=1}^{2} m_i(s_i - 1)\right) - m_1 m_2 (s_1 - s_2 - 1) + \frac{m(m-1)}{2}$$
with $m = m_1 + m_2$ and $s_2 < s_1$. In general, we have the following. Lemma 11 Let $m_1, \ldots, m_\ell$ and $s_1, \ldots, s_\ell$ be natural numbers with $s_\ell < \cdots < s_1$. Let $p$ be a prime number. Then we have
(1) $|\mathfrak{G}(\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}};n)| = p^{n(m_1(s_1-1)+\cdots+m_\ell(s_\ell-1))}\,|\mathfrak{G}((m_1+\cdots+m_\ell)\mathbb{Z}_p;n)|$.
(2) $|\mathrm{Aut}(\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}})| = p^{g(m_i,s_i)} \prod_{i=1}^{\ell}\prod_{h=1}^{m_i} (p^{m_i-h+1} - 1)$, where
$$g(m_i,s_i) = m\left(\sum_{i=1}^{\ell} m_i(s_i - 1)\right) - \sum_{i=1}^{\ell-1} m_i\left(\sum_{j=i+1}^{\ell} m_j(s_i - s_j - 1)\right) + \frac{m(m-1)}{2}$$
with $m = m_1 + \cdots + m_\ell$. Now, the following comes from Theorem 10 and Lemma 11.
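Theorem 13 Let $m_1, \ldots, m_\ell$ and $s_1, \ldots, s_\ell$ be natural numbers with $s_\ell < \cdots < s_1$. Then the number of nonisomorphic connected $\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}}$-coverings of $G$ is
$$\mathrm{Isoc}(G;\oplus_{h=1}^{\ell} m_h\mathbb{Z}_{p^{s_h}}) = p^{f(\beta(G),m_i,s_i)}\,\frac{\prod_{i=1}^{m} (p^{\beta(G)-i+1} - 1)}{\prod_{i=1}^{\ell}\prod_{h=1}^{m_i} (p^{m_i-h+1} - 1)},$$
where $m = m_1 + \cdots + m_\ell$, $p$ is prime and
$$f(\beta(G),m_i,s_i) = (\beta(G) - m)\sum_{i=1}^{\ell} m_i(s_i - 1) + \sum_{i=1}^{\ell-1} m_i\left(\sum_{j=i+1}^{\ell} m_j(s_i - s_j - 1)\right).$$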
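Now, we can compute the number $\mathrm{Iso}(G;\mathcal{A})$ for any finite abelian group $\mathcal{A}$ by using Theorems 11((3),(4)) and 13 repeatedly if necessary. For example, if $p$ and $q$ are two distinct prime numbers, then
$$\mathrm{Iso}(G;\mathbb{Z}_{p^3} \oplus \mathbb{Z}_p \oplus \mathbb{Z}_{q^2}) = \mathrm{Iso}(G;\mathbb{Z}_{p^3} \oplus \mathbb{Z}_p)\,\mathrm{Iso}(G;\mathbb{Z}_{q^2})$$
$$= \left(1 + \sum_{i=1}^{3} \mathrm{Isoc}(G;\mathbb{Z}_{p^i}) + \sum_{i=1}^{3} \mathrm{Isoc}(G;\mathbb{Z}_{p^i} \oplus \mathbb{Z}_p)\right) \times \left(1 + \sum_{i=1}^{2} \mathrm{Isoc}(G;\mathbb{Z}_{q^i})\right),$$
where, by Example 2, the second factor equals
$$1 + \frac{(q^{\beta(G)} - 1)(q^{\beta(G)-1} + 1)}{q - 1}.$$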
For some abelian groups $\mathcal{A}$ and small $\beta(G)$, the numbers $\mathrm{Isoc}(G;\mathcal{A})$ and $\mathrm{Iso}(G;\mathcal{A})$ are listed in table 4. In the table, A1 = Z_{p^3} ⊕ Z_p, A2 = Z_{q^2} and A3 = Z_{p^3} ⊕ Z_p ⊕ Z_{q^2}.
β(G) (p,q)   Isoc(G;A1)   Isoc(G;A2)   Isoc(G;A3)   Iso(G;A1)   Iso(G;A2)    Iso(G;A3)
1 (2,3)               0            1            0           4           3           12
2 (2,5)               6           30          180          32          37         1184
3 (3,5)            1404          775      1088100        2757         807      2224899
4 (3,7)          126360       137200  17336592000      161451      137601  22215819051
Table 4. The number Isoc(G;A) and Iso(G;A) for some A and small β(G)
β        1   2    3     4      5      6       7        8         9         10
n = 1    1   1    1     1      1      1       1        1         1          1
n = 2    1   3    7    15     31     63     127      255       511       1023
n = 3    1   4   13    40    121    364    1093     3280      9841      29524
n = 4    1   7   35   155    651   2667   10795    43435    174251     698027
n = 5    1   6   31   156    781   3906   19531    97656    488281    2441406
n = 6    1  12   91   600   3751  22932  138811   836400   5028751   30203052
n = 7    1   8   57   400   2801  19608  137257   960800   6725601   47079208
n = 8    1  15  155  1395  11811  97155  788035  6347715  50955971  408345795
Table 5. The number of subgroups of index n in the free abelian group generated by β elements
REMARK For a connected $\mathcal{A}$-covering $p : \tilde{G} \to G$, the image $p_*(\pi_1(\tilde{G}))$ of the fundamental group of the covering graph $\tilde{G}$ is a normal subgroup of the fundamental group $\pi_1(G)$ of the base graph $G$, and the quotient group $\pi_1(G)/p_*(\pi_1(\tilde{G}))$ is isomorphic to $\mathcal{A}$. If $\mathcal{A}$ is abelian, then $p_*(\pi_1(\tilde{G}))$ contains the commutator subgroup $[\pi_1(G),\pi_1(G)]$ of the free group $\pi_1(G)$. Since $[\pi_1(G),\pi_1(G)]$ is a normal subgroup of $\pi_1(G)$, the natural homomorphism $q : \pi_1(G) \to \pi_1(G)/[\pi_1(G),\pi_1(G)]$ induces a one-to-one correspondence between the set of all subgroups of $\pi_1(G)$ containing $[\pi_1(G),\pi_1(G)]$ and the set of all subgroups of the quotient group $\pi_1(G)/[\pi_1(G),\pi_1(G)]$. Notice that $\pi_1(G)/[\pi_1(G),\pi_1(G)]$ is the free abelian group generated by $\beta(G)$ elements. Now, from a well-known classification theorem for regular coverings of a topological space, it follows that the number
$$\sum_{\mathcal{A}} \mathrm{Isoc}(G;\mathcal{A}) = \sum_{\mathcal{A}} \frac{|\mathfrak{G}(\mathcal{A};\beta(G))|}{|\mathrm{Aut}(\mathcal{A})|},$$
where $\mathcal{A}$ runs over all nonisomorphic abelian groups of order $n$, is equal to the number of subgroups of index $n$ of the free abelian group $\mathbb{Z} \times \mathbb{Z} \times \cdots \times \mathbb{Z}$ generated by $\beta(G)$ elements. For small $n$ and small $\beta$, these numbers are listed in table 5.
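The entries of Table 5 can be reproduced independently of the covering-theoretic sum above. The sketch below does not use the text's formula; it swaps in a standard counting argument via Hermite normal forms, under which the number $a_r(n)$ of index-$n$ subgroups of $\mathbb{Z}^r$ satisfies $a_r(n) = \sum_{d \mid n} a_{r-1}(n/d)\, d^{\,r-1}$ with $a_0(1) = 1$. The function name is illustrative.

```python
def divisors(n):
    """All positive divisors of n (naive, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def count_index_n_subgroups(rank, n):
    """Number of subgroups of index n in the free abelian group Z^rank.

    Uses the Hermite-normal-form recursion
        a_r(n) = sum over d | n of a_{r-1}(n / d) * d^(r-1),
    where d is the last diagonal entry and d^(r-1) counts the reduced
    entries that share its column.
    """
    if rank == 0:
        return 1 if n == 1 else 0
    return sum(
        count_index_n_subgroups(rank - 1, n // d) * d ** (rank - 1)
        for d in divisors(n)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, [count_index_n_subgroups(r, n) for r in range(1, 6)])
```

For rank 2 the recursion reduces to the sum-of-divisors function, which matches the second column of Table 5.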
5
Regular coverings having dihedral voltage groups
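In this section, we consider a dihedral group as a nonabelian voltage group, and aim to compute the number of nonisomorphic regular coverings having a dihedral voltage group. Recall that the dihedral group of order $2n$ can be presented as follows: $\mathbb{D}_n = \langle a, b : a^2 = 1 = b^n,\ aba = b^{-1} \rangle$. Note that $\mathbb{D}_1 = \mathbb{Z}_2$, $\mathbb{D}_2 = \mathbb{Z}_2 \oplus \mathbb{Z}_2$, $\mathbb{D}_n$ is not abelian for $n \ge 3$ with $\langle a \rangle = \mathbb{Z}_2$ and $\langle b \rangle = \mathbb{Z}_n$, and every element of $\mathbb{D}_n$ is of the form $b^i$ or $ab^i$ for $i = 0, 1, \ldots, n-1$. Notice that any subgroup of the dihedral group $\mathbb{D}_n$ is isomorphic to one of $\mathbb{D}_i$ ($i$ is a divisor of $n$) or $\mathbb{Z}_j$ ($j$ is a divisor of $n$), where $\mathbb{Z}_1 = \{\text{identity}\}$. It follows from Theorem 11(3) that for any $n \ge 3$
$$\mathrm{Iso}(G;\mathbb{D}_n) = \begin{cases} \displaystyle\sum_{m \mid n} \mathrm{Isoc}(G;\mathbb{Z}_m) + \sum_{m \mid n} \mathrm{Isoc}(G;\mathbb{D}_m) & \text{if } n \text{ is odd}, \\ \displaystyle\sum_{m \mid n} \mathrm{Isoc}(G;\mathbb{Z}_m) + \sum_{m \mid n,\ m \ne 1} \mathrm{Isoc}(G;\mathbb{D}_m) & \text{if } n \text{ is even} \end{cases}$$
$$= \begin{cases} \displaystyle\mathrm{Iso}(G;\mathbb{Z}_n) + \sum_{m \mid n} \mathrm{Isoc}(G;\mathbb{D}_m) & \text{if } n \text{ is odd}, \\ \displaystyle\mathrm{Iso}(G;\mathbb{Z}_n) + \sum_{m \mid n,\ m \ne 1} \mathrm{Isoc}(G;\mathbb{D}_m) & \text{if } n \text{ is even}. \end{cases}$$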
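To compute the number $\mathrm{Isoc}(G;\mathbb{D}_n)$, we first compute $|\mathrm{Aut}(\mathbb{D}_n)|$ and $|\mathfrak{G}(\mathbb{D}_n;r)|$. Lemma 12 Let $n$ be a natural number with prime decomposition $p_1^{m_1} \cdots p_\ell^{m_\ell}$. If $n \ge 3$, then
(1) $|\mathrm{Aut}(\mathbb{D}_n)| = n\,\varphi(n) = n\,p_1^{m_1-1}(p_1 - 1) \cdots p_\ell^{m_\ell-1}(p_\ell - 1)$.
(2) For any natural number $r$, $|\mathfrak{G}(\mathbb{D}_n;r)| = (2^r - 1) \prod_{i=1}^{\ell} p_i^{(m_i-1)r+1}(p_i^{r-1} - 1)$.
Proof: It is not hard to show that $\mathrm{Aut}(\mathbb{D}_n) = \{\sigma_{ij} : \sigma_{ij}(a) = ab^i,\ \sigma_{ij}(b) = b^j,\ 0 \le i, j \le n-1,\ (n,j) = 1\}$.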
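It implies that $|\mathrm{Aut}(\mathbb{D}_n)| = n\,\varphi(n) = n\,p_1^{m_1-1}(p_1 - 1) \cdots p_\ell^{m_\ell-1}(p_\ell - 1)$. Next, we compute the number $|\mathfrak{G}(\mathbb{D}_n;r)|$. Since the prime decomposition of $n$ is $p_1^{m_1} \cdots p_\ell^{m_\ell}$, $\mathbb{Z}_n = \langle b \rangle$ is isomorphic to $\oplus_{i=1}^{\ell} \mathbb{Z}_{p_i^{m_i}}$, where $\mathbb{Z}_{p_i^{m_i}} = \langle b_i \rangle$ with $b = b_1 \cdots b_\ell$. Note that $\mathbb{D}_n = \mathbb{Z}_n \cup a\mathbb{Z}_n$, a disjoint union. It is clear that if $(g_1, \ldots, g_r) \in \mathfrak{G}(\mathbb{D}_n;r)$ then there exists at least one $j$ ($1 \le j \le r$) such that $g_j \in a\mathbb{Z}_n = \{ab^i \mid i = 1, \ldots, n\}$. Given any nonempty subset $S$ of $\{1, 2, \ldots, r\}$, let $\mathfrak{G}[S]$ denote the set $\{(g_1, \ldots, g_r) \in \mathfrak{G}(\mathbb{D}_n;r) : g_j \in a\mathbb{Z}_n \text{ for } j \in S, \text{ and } g_j \in \mathbb{Z}_n \text{ for } j \notin S\}$. Then
$$\bigcup_{\emptyset \ne S \subset \{1,2,\ldots,r\}} \mathfrak{G}[S] = \mathfrak{G}(\mathbb{D}_n;r).$$
Moreover, $\mathfrak{G}[S]$ and $\mathfrak{G}[T]$ are disjoint for any two distinct nonempty subsets $S$ and $T$ of $\{1, 2, \ldots, r\}$. It implies that
$$|\mathfrak{G}(\mathbb{D}_n;r)| = \Big|\bigcup_{\emptyset \ne S \subset \{1,2,\ldots,r\}} \mathfrak{G}[S]\Big| = \sum_{\emptyset \ne S \subset \{1,2,\ldots,r\}} |\mathfrak{G}[S]|.$$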
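For convenience, for each $g \in \mathbb{D}_n$, let
$$g = \begin{cases} (g_1, \ldots, g_\ell) & \text{if } g \in \mathbb{Z}_n = \oplus_{i=1}^{\ell} \mathbb{Z}_{p_i^{m_i}}, \\ a(g_1, \ldots, g_\ell) & \text{if } g \in a\mathbb{Z}_n = a \oplus_{i=1}^{\ell} \mathbb{Z}_{p_i^{m_i}}. \end{cases}$$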
Let S be a nonempty subset of { 1 , . . . , r} and (#i, • • •,
w rfjenv -Pu f n V'-1 * n*?v«t=i fe=o y^s * jes where Z m r i is the subgroup of Z ">i generated by 6?\ It implies that for any nonempty subset S of { 1 , 2 , . . . , r } ,
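$$|\mathfrak{G}[S]| = \prod_{i=1}^{\ell} \left(p_i^{m_i r} - p_i^{(m_i-1)|S|} \cdot p_i \cdot p_i^{(m_i-1)(r-|S|)}\right) = \prod_{i=1}^{\ell} p_i^{(m_i-1)r+1}(p_i^{r-1} - 1),$$
which does not depend on the set $S$. Now, the cardinality $|\mathfrak{G}(\mathbb{D}_n;r)|$ of the set $\mathfrak{G}(\mathbb{D}_n;r)$ is
$$\sum_{\emptyset \ne S \subset \{1,2,\ldots,r\}} |\mathfrak{G}[S]| = (2^r - 1) \prod_{i=1}^{\ell} p_i^{(m_i-1)r+1}(p_i^{r-1} - 1).$$
□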
Now, the next theorem follows from Theorem 10 and Lemma 12.
Theorem 14 (22) For any $n \ge 3$, the number of nonisomorphic connected $\mathbb{D}_n$-coverings of $G$ is
$$\mathrm{Isoc}(G;\mathbb{D}_n) = (2^{\beta(G)} - 1) \prod_{i=1}^{\ell} p_i^{(m_i-1)(\beta(G)-2)}\,\frac{p_i^{\beta(G)-1} - 1}{p_i - 1},$$
where $p_1^{m_1} \cdots p_\ell^{m_\ell}$ is the prime decomposition of $n$.
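Theorem 14 can be verified on small cases by counting generating tuples of $\mathbb{D}_n$ directly, as Theorem 10 prescribes. In the sketch below, an element $a^e b^k$ is modelled as a pair `(e, k)`; this encoding, the helper names, and the special-casing of $\beta(G) < 2$ (where no tuple can generate a nonabelian group) are choices made for illustration only.

```python
from itertools import product
from math import gcd


def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def dihedral_mul(x, y, n):
    """Multiply a^e1 b^k1 by a^e2 b^k2 in D_n, using b^k a = a b^(-k)."""
    (e1, k1), (e2, k2) = x, y
    k = (-k1 if e2 else k1) + k2
    return ((e1 + e2) % 2, k % n)


def generates_dihedral(gens, n):
    """True if the elements in `gens` generate all 2n elements of D_n."""
    identity = (0, 0)
    seen, stack = {identity}, [identity]
    while stack:
        x = stack.pop()
        for g in gens:
            y = dihedral_mul(x, g, n)
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == 2 * n


def isoc_dihedral_bruteforce(beta, n):
    """Theorem 10 for D_n (n >= 3): generating tuples over n * phi(n)."""
    elements = [(e, k) for e in (0, 1) for k in range(n)]
    generating = sum(
        1 for t in product(elements, repeat=beta) if generates_dihedral(t, n)
    )
    phi = sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
    return generating // (n * phi)


def isoc_dihedral_formula(beta, n):
    """Closed form of Theorem 14 for n >= 3."""
    if beta < 2:
        return 0  # D_n with n >= 3 needs at least two generators
    result = 2 ** beta - 1
    for p, m in factorize(n).items():
        result *= p ** ((m - 1) * (beta - 2)) * (p ** (beta - 1) - 1) // (p - 1)
    return result


if __name__ == "__main__":
    for beta, n in [(2, 3), (3, 3), (3, 4), (2, 6)]:
        print(beta, n, isoc_dihedral_bruteforce(beta, n),
              isoc_dihedral_formula(beta, n))
```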
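For any edge $e$ in the cotree $G - T$, we have $\beta(G - e) = \beta(G) - 1$. By Example 2, Theorems 13 and 14, we have
$$\mathrm{Isoc}(G;\mathbb{D}_n) = (2^{\beta(G)} - 1)\,\mathrm{Isoc}(G - e;\mathbb{Z}_n)$$
for any $n \ge 3$. Thus, if $n$ is odd, then
$$\sum_{m \mid n} \mathrm{Isoc}(G;\mathbb{D}_m) = (2^{\beta(G)} - 1) \sum_{m \mid n} \mathrm{Isoc}(G - e;\mathbb{Z}_m) = (2^{\beta(G)} - 1)\,\mathrm{Iso}(G - e;\mathbb{Z}_n).$$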
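If $n$ is even, then
$$\sum_{m \mid n,\ m \ne 1} \mathrm{Isoc}(G;\mathbb{D}_m) = \sum_{m \mid n,\ m \ge 3} \mathrm{Isoc}(G;\mathbb{D}_m) + \mathrm{Isoc}(G;\mathbb{D}_2)$$
$$= (2^{\beta(G)} - 1)\left(\sum_{m \mid n} \mathrm{Isoc}(G - e;\mathbb{Z}_m) - [1 + \mathrm{Isoc}(G - e;\mathbb{Z}_2)]\right) + \mathrm{Isoc}(G;\mathbb{D}_2)$$
$$= (2^{\beta(G)} - 1)\,\mathrm{Iso}(G - e;\mathbb{Z}_n) - (2^{\beta(G)} - 1)\,2^{\beta(G)-1} + \frac{(2^{\beta(G)} - 1)(2^{\beta(G)-1} - 1)}{3}$$
$$= (2^{\beta(G)} - 1)\,\mathrm{Iso}(G - e;\mathbb{Z}_n) - \frac{1}{3}(4^{\beta(G)} - 1).$$
We summarize our discussion as follows. Theorem 15 (16, 22) For any $n \ge 3$, the number of nonisomorphic $\mathbb{D}_n$-coverings of $G$ is
$$\mathrm{Iso}(G;\mathbb{D}_n) = \begin{cases} \mathrm{Iso}(G;\mathbb{Z}_n) + (2^{\beta(G)} - 1)\,\mathrm{Iso}(G - e;\mathbb{Z}_n) & \text{if } n \text{ is odd}, \\ \mathrm{Iso}(G;\mathbb{Z}_n) + (2^{\beta(G)} - 1)\,\mathrm{Iso}(G - e;\mathbb{Z}_n) - \dfrac{4^{\beta(G)} - 1}{3} & \text{if } n \text{ is even}, \end{cases}$$
where $e$ is an edge in the cotree $G - T$. Recall that the number $\mathrm{Iso}(G;\mathbb{Z}_n)$ was computed in Theorem 12. The numbers $\mathrm{Isoc}(G;\mathbb{D}_n)$ and $\mathrm{Iso}(G;\mathbb{D}_n)$ for small $n$ and $\beta(G)$ are listed in tables 6 and 7. Let $p$ be a prime number. Then every group of order $p$ or $p^2$ is abelian. Hence, there is only one group of order $p$ up to isomorphism; it is the cyclic group $\mathbb{Z}_p$, and there are only two groups of order $p^2$ up to isomorphism; they are $\mathbb{Z}_{p^2}$ and $\mathbb{Z}_p \oplus \mathbb{Z}_p$. Let $p$ and $q$ be distinct primes. If $p < q$ and $p \nmid (q - 1)$, then there is only one group of order $pq$ up to isomorphism; it is the cyclic group $\mathbb{Z}_{pq}$, which is isomorphic to $\mathbb{Z}_p \oplus \mathbb{Z}_q$. If $p < q$ and $p \mid (q - 1)$, then there are only two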
β(G)   n = 3  n = 4  n = 5  n = 6  n = 7  n = 8  n = 9  n = 10  n = 11
1          0      0      0      0      0      0      0       0       0
2          3      3      3      3      3      3      3       3       3
3         28     42     42     84     56     84     84     126      84
4        195    420    465   1365    855   1680   1755    3255    1995
5       1240   3720   4836  18600  12400  29760  33480   72540   45384
Table 6. The number Isoc(G;D_n) for small n and small β(G)

β(G)   n = 3  n = 4  n = 5  n = 6  n = 7  n = 8  n = 9  n = 10  n = 11
1          3      3      3      4      3      4      4       4       3
2         11     14     13     27     15     29     26      35      19
3         49     85     81    231    121    281    250     431     225
4        251    591    637   2251   1271   3231   3086    6267    3475
5       1393   4403   5649  23899  15233  42099  44674  102555   61521
Table 7. The number Iso(G;D_n) for small n and small β(G)
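nonisomorphic groups of order $pq$; one of them is the cyclic group $\mathbb{Z}_{pq}$ and the other is a nonabelian group $\mathcal{K}$ generated by two elements $a$ and $b$ such that $|\langle a \rangle| = p$, $ab = b^s a$, $|\langle b \rangle| = q$, where $s \not\equiv 1$ and $s^p \equiv 1 \pmod{q}$. More on the classification of finite groups needed in this manuscript can be found in 44. The following come from the classification of finite groups and Theorem 11(2). For primes $p < q$, the numbers of $p$-, $p^2$-, $pq$- or $p^3$-fold nonisomorphic connected regular coverings of $G$ are (writing $\beta = \beta(G)$)
$$\mathrm{Isoc}^R(G;p) = \frac{p^{\beta} - 1}{p - 1},$$
$$\mathrm{Isoc}^R(G;p^2) = \frac{(p^{\beta} - 1)(p^{\beta-1} - 1)}{(p^2 - 1)(p - 1)} + p^{\beta-1}\,\frac{p^{\beta} - 1}{p - 1},$$
$$\mathrm{Isoc}^R(G;pq) = \begin{cases} \dfrac{p^{\beta} - 1}{p - 1} \cdot \dfrac{q^{\beta} - 1}{q - 1} & \text{if } p \nmid (q - 1), \\ \dfrac{p^{\beta} - 1}{p - 1} \cdot \dfrac{q^{\beta} - 1}{q - 1} + \dfrac{(p^{\beta} - 1)(q^{\beta-1} - 1)}{q - 1} & \text{if } p \mid (q - 1), \end{cases}$$
$$\mathrm{Isoc}^R(G;p^3) = \frac{(p^{\beta} - 1)(p^{\beta-1} - 1)(p^{\beta-2} - 1)}{(p^3 - 1)(p^2 - 1)(p - 1)} + \frac{(p + 2)\,p^{\beta-2}(p^{\beta} - 1)(p^{\beta-1} - 1)}{(p^2 - 1)(p - 1)} + \frac{p^{\beta-2}(p^{\beta} - 1)(p^{\beta-1} - 1)}{p - 1} + p^{2(\beta-1)}\,\frac{p^{\beta} - 1}{p - 1}.$$
Now, by using Theorem 11(1), we have
$$\mathrm{Iso}^R(G;p) = \frac{p^{\beta} - 1}{p - 1} + 1,$$
$$\mathrm{Iso}^R(G;p^2) = \frac{(p^{\beta} - 1)(p^{\beta-1} - 1)}{(p^2 - 1)(p - 1)} + \frac{p^{\beta} - 1}{p - 1}\,(p^{\beta-1} + 1) + 1,$$
$$\mathrm{Iso}^R(G;pq) = \begin{cases} \dfrac{p^{\beta} + p - 2}{p - 1} \cdot \dfrac{q^{\beta} + q - 2}{q - 1} & \text{if } p < q,\ p \nmid (q - 1), \\ \dfrac{p^{\beta} + p - 2}{p - 1} \cdot \dfrac{q^{\beta} + q - 2}{q - 1} + \dfrac{(p^{\beta} - 1)(q^{\beta-1} - 1)}{q - 1} & \text{if } p < q,\ p \mid (q - 1), \end{cases}$$
$$\mathrm{Iso}^R(G;p^3) = \frac{(p^{\beta} - 1)(p^{\beta-1} - 1)(p^{\beta-2} - 1)}{(p^3 - 1)(p^2 - 1)(p - 1)} + \frac{(p^{\beta} - 1)(p^{\beta-1} - 1)}{(p^2 - 1)(p - 1)}\,(p^{\beta} + p^{\beta-1} + p^{\beta-2} + 1) + \frac{p^{\beta} - 1}{p - 1}\,(p^{2(\beta-1)} + p^{\beta-1} + 1) + 1.$$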
REMARK Further enumerations of graph coverings with additional properties, such as concrete or bipartite coverings, were studied subsequently. Hofmeister (11, 13) introduced the notion of a concrete (resp. concrete regular) covering of a graph $G$ and gave formulas for enumerating the isomorphism classes of concrete (resp. concrete regular) coverings of $G$. An $n$-fold covering $p : \tilde{G} \to G$ is said to be concrete if it is accompanied by an explicit partition $\mathcal{P} = \{P_1, \ldots, P_n\}$ of $V(\tilde{G})$ such that every partition set $P_i$ meets every vertex fiber exactly once. The partition sets $P_i$ are the sheets of the covering $p$. A concrete regular covering is a concrete covering $p : \tilde{G} \to G$ which is regular
β(G)  n = 1   2    3    4    5     6     7      8      9     10     11
1         1   1    1    1    1     1     1      1      1      1      1
2         1   3    4    7    6    15     8     19     13     21     12
3         1   7   13   35   31   119    57    211    130    259    133
4         1  15   40  155  156   795   400   1955   1210   2805   1464
5         1  31  121  651  781  4991  2801  16771  11011  29047  16105
Table 8. The number Isoc^R(G;n) for small n and small β(G)

β(G)  n = 1   2    3    4    5     6     7      8      9     10     11
1         1   2    2    3    2     4     2      4      3      4      2
2         1   4    5   11    7    23     9     30     18     31     13
3         1   8   14   43   32   140    58    254    144    298    134
4         1  16   41  171  157   851   401   2126   1251   2977   1465
5         1  32  122  683  782  5144  2802  17454  11133  29860  16106
Table 9. The number Iso^R(G;n) for small n and small β(G)
and every covering transformation of $\tilde{G}$ preserves the sheets. Later, R. Feng et al. 3 showed that the number of nonisomorphic $n$-fold concrete (resp. concrete regular) coverings of $G$ is equal to that of nonisomorphic $n$-fold (resp. regular) coverings of the join $G + \infty$ of $G$ and an extra vertex $\infty$. As a consequence, the isomorphism classes of concrete (resp. concrete regular) coverings of a graph can be enumerated by using known formulas for enumerating the isomorphism classes of coverings (resp. regular coverings) of a graph. It also gives a new formula to compute the number of isomorphism classes of graphs with $n$ vertices, because the number of nonisomorphic concrete double coverings of the complete graph on $n$ vertices is equal to the number of nonisomorphic graphs with $n$ vertices. For enumeration of bipartite coverings, see 2 and 17.
6
Regular coverings; A general case
In this section, we introduce a general formula to enumerate $\mathcal{A}$-coverings of a graph $G$ for any finite group $\mathcal{A}$ in terms of the Möbius function defined on the subgroup lattice of $\mathcal{A}$ by P. Hall in 6. G. Jones 20, 21 used such a Möbius function to find a method for counting normal subgroups of a surface group or a crystallographic group, and applied it to count some covering surfaces. To apply Jones' method to the graph covering case, first recall that the set
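$C^1_T(G;\mathcal{A})$ of $\mathcal{A}$-voltage assignments of $G$ can be identified as
$$C^1_T(G;\mathcal{A}) = \mathcal{A} \times \mathcal{A} \times \cdots \times \mathcal{A} \quad (\beta(G) \text{ times}),$$
from which every $\mathcal{A}$-covering of the graph $G$ can be derived. Let $\mathcal{F}_\beta$ denote the free group generated by $\beta$ elements, where $\beta = \beta(G)$. Then, the $\mathcal{A}$-voltage assignments in $C^1_T(G;\mathcal{A})$ correspond bijectively to homomorphisms from the free group $\mathcal{F}_\beta$ to the voltage group $\mathcal{A}$, thus $|C^1_T(G;\mathcal{A})| = |\mathrm{Hom}(\mathcal{F}_\beta,\mathcal{A})| = |\mathcal{A}|^{\beta}$. Also, it can be written as
$$|C^1_T(G;\mathcal{A})| = |\mathrm{Hom}(\mathcal{F}_\beta,\mathcal{A})| = \sum_{K \le \mathcal{A}} |\mathrm{Epi}(\mathcal{F}_\beta,K)|,$$
the sum of the numbers of epimorphisms from the free group $\mathcal{F}_\beta$ onto subgroups $K$ of the group $\mathcal{A}$, and such epimorphisms correspond bijectively to transitive $K$-voltage assignments in $\mathfrak{G}(K;\beta)$. It follows that $|\mathrm{Epi}(\mathcal{F}_\beta,K)| = |\mathfrak{G}(K;\beta)|$. Now, one can invert the equation
$$|\mathrm{Hom}(\mathcal{F}_\beta,\mathcal{A})| = \sum_{K \le \mathcal{A}} |\mathrm{Epi}(\mathcal{F}_\beta,K)|$$
to count epimorphisms in terms of homomorphisms, by introducing the Möbius function for $\mathcal{A}$. This assigns an integer $\mu(K)$ to each subgroup $K$ of $\mathcal{A}$ by the recursive formula
$$\sum_{H \ge K} \mu(H) = \delta_{K,\mathcal{A}} = \begin{cases} 1 & \text{if } K = \mathcal{A}, \\ 0 & \text{if } K < \mathcal{A}. \end{cases}$$
The equation
$$|\mathrm{Epi}(\mathcal{F}_\beta,\mathcal{A})| = \sum_{K \le \mathcal{A}} \mu(K)\,|\mathrm{Hom}(\mathcal{F}_\beta,K)|$$
is then easily deduced, and Theorem 10 gives
$$\mathrm{Isoc}(G;\mathcal{A}) = \frac{1}{|\mathrm{Aut}(\mathcal{A})|} \sum_{K \le \mathcal{A}} \mu(K)\,|\mathrm{Hom}(\mathcal{F}_\beta,K)| = \frac{1}{|\mathrm{Aut}(\mathcal{A})|} \sum_{K \le \mathcal{A}} \mu(K)\,|K|^{\beta(G)}.$$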
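The inversion step can be illustrated in the simplest setting, a cyclic group $\mathbb{Z}_n$, where the subgroup lattice is the divisor lattice of $n$ and the lattice Möbius function of the subgroup $\mathbb{Z}_m$ is the classical $\mu(n/m)$. The sketch below (illustrative helper names) checks both directions: summing epimorphism counts over all subgroups recovers $|\mathrm{Hom}| = n^{\beta}$, and the Möbius sum agrees with a brute-force count of generating tuples.

```python
from functools import reduce
from itertools import product
from math import gcd


def mobius(n):
    """Classical Moebius function of elementary number theory."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime divides the argument
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def epi_count(beta, n):
    """|Epi(F_beta, Z_n)| by Moebius inversion over the subgroups Z_m, m | n."""
    return sum(mobius(n // m) * m ** beta for m in divisors(n))


def epi_count_bruteforce(beta, n):
    """Count beta-tuples in Z_n whose entries generate Z_n."""
    return sum(
        1 for t in product(range(n), repeat=beta) if reduce(gcd, t, n) == 1
    )


if __name__ == "__main__":
    beta, n = 2, 6
    print(epi_count(beta, n), epi_count_bruteforce(beta, n))
    # Summing epimorphisms over all subgroups gives back |Hom| = n^beta.
    print(sum(epi_count(beta, m) for m in divisors(n)), n ** beta)
```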
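EXAMPLE 3 (1) The cyclic group $\mathcal{A} = \mathbb{Z}_n$ has a unique subgroup $\mathbb{Z}_m$ for each $m$ dividing $n$, and has no other subgroups. The Möbius function on the subgroup lattice is $\mu(\mathbb{Z}_m) = \mu(n/m)$ (the Möbius function of elementary number theory) and $|\mathrm{Aut}(\mathbb{Z}_n)| = \varphi(n)$ (the Euler phi-function), so it implies that
$$\mathrm{Isoc}(G;\mathbb{Z}_n) = \frac{1}{\varphi(n)} \sum_{m \mid n} \mu\!\left(\frac{n}{m}\right) m^{\beta(G)}.$$
This coincides with the formula given in Theorem 12. (2) Let $\mathcal{A} = \mathbb{D}_n = \langle a, b : a^2 = 1 = b^n,\ aba = b^{-1} \rangle$ be the dihedral group of order $2n$. For convenience, let $\mathbb{Z}_m = \langle b^{n/m} \rangle$ and let $\mathbb{D}_m^i = \mathbb{Z}_m \cup a(\mathbb{Z}_m b^i)$ for $i = 0, \ldots, \frac{n}{m} - 1$. Then each subgroup of $\mathbb{D}_n$ is one of $\mathbb{Z}_m$ or $\mathbb{D}_m^i$ for some $m$ dividing $n$. Now, consider the lattice induced by the subgroups of $\mathbb{D}_n$. Then, for each subgroup $S$ of $\mathbb{D}_n$, we have
$$\mu(S) = \begin{cases} \mu\!\left(\dfrac{n}{m}\right) & \text{if } S = \mathbb{D}_m^i \text{ for some } i = 0, \ldots, \frac{n}{m} - 1, \\ -\dfrac{n}{m}\,\mu\!\left(\dfrac{n}{m}\right) & \text{if } S = \mathbb{Z}_m. \end{cases}$$
Since $|\mathrm{Aut}(\mathbb{D}_n)| = n\,\varphi(n)$ for $n \ge 3$, we have
$$\mathrm{Isoc}(G;\mathbb{D}_n) = \frac{1}{n\,\varphi(n)} \left(\sum_{m \mid n} \frac{n}{m}\,\mu\!\left(\frac{n}{m}\right)(2m)^{\beta(G)} - \sum_{m \mid n} \frac{n}{m}\,\mu\!\left(\frac{n}{m}\right) m^{\beta(G)}\right) = \frac{2^{\beta(G)} - 1}{\varphi(n)} \sum_{m \mid n} \mu\!\left(\frac{n}{m}\right) m^{\beta(G)-1}$$
for $n \ge 3$. This coincides with the formula given in Theorem 14.
7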
New classifications of branched coverings and the number of subgroups of a surface group
A surface $\mathbb{S}$ is a compact connected 2-manifold without boundary. By the classification of surfaces, a surface $\mathbb{S}$ is homeomorphic to one of the following surfaces $\mathbb{S}_k$: the orientable surface with $k$ handles if $k > 0$, the sphere $\mathbb{S}^2$ if $k = 0$, the nonorientable surface with $-k$ crosscaps if $k < 0$. A continuous surjective map $p : \tilde{\mathbb{S}} \to \mathbb{S}$ is a branched covering if $p|_{\tilde{\mathbb{S}} - p^{-1}(B)} : \tilde{\mathbb{S}} - p^{-1}(B) \to \mathbb{S} - B$ is a covering for a finite subset $B$ of $\mathbb{S}$. The branch set
$B$ of a branched covering $p : \tilde{\mathbb{S}} \to \mathbb{S}$ is the collection of points $x \in \mathbb{S}$ which have the property that $x$ has no neighborhood $N_x$ such that each component of $p^{-1}(N_x)$ is mapped homeomorphically onto $N_x$ by $p$. A branched covering $p : \tilde{\mathbb{S}} \to \mathbb{S}$ is regular (or an $\mathcal{A}$-covering) if $p|_{\tilde{\mathbb{S}} - p^{-1}(B)} : \tilde{\mathbb{S}} - p^{-1}(B) \to \mathbb{S} - B$ is a regular covering (with the covering transformation group $\mathcal{A}$). Two branched coverings $p_i : \tilde{\mathbb{S}}_i \to \mathbb{S}$ ($i = 1, 2$) are isomorphic (or equivalent) if there exists a homeomorphism $h : \tilde{\mathbb{S}}_1 \to \tilde{\mathbb{S}}_2$ such that $p_2 \circ h = p_1$. A (branched) covering of a surface is closely related to a graph covering which is embeddable into it. To see this relation, we first review graph embeddings. An embedding of a graph $G$ into a surface $\mathbb{S}$ is a homeomorphism $\imath : G \to \mathbb{S}$ of $G$ into $\mathbb{S}$. If every component of $\mathbb{S} - \imath(G)$, called a region, is homeomorphic to an open disk, then the embedding $\imath : G \to \mathbb{S}$ is called a 2-cell embedding, and the regions are called faces of the embedding. When a graph $G$ is 2-cell embedded into a surface, every boundary walk of a face induces a walk in the graph $G$ of the same length. A face of a 2-cell embedding of a graph $G$ into a surface is said to be $n$-sided if its boundary walk is of length $n$. Note that if $G$ is disconnected, no embedding of $G$ into a surface $\mathbb{S}$ will be a 2-cell embedding. An embedding scheme $(\rho, \lambda)$ for a graph $G$ consists of a rotation scheme $\rho$ which assigns a cyclic permutation $\rho_v$ on $N(v) = \{e \in D(G) : i_e = v\}$ to each $v \in V(G)$ and a voltage assignment $\lambda$ which assigns a value $\lambda(e)$ in $\mathbb{Z}_2 = \{1, -1\}$ to each $e \in E(G)$. Stahl 43 showed that every embedding scheme for a graph $G$ determines a 2-cell embedding of $G$ into a surface $\mathbb{S}$, and every 2-cell embedding of $G$ into a surface $\mathbb{S}$ is determined by such a scheme. To see the relation between an embedding scheme for a graph and its 2-cell embedding into a surface, we give the following example.
EXAMPLE 4 Let $G$ be a figure eight having a vertex $v$ and two loops $\ell_1$ and $\ell_2$, and let $(\rho, \lambda)$ be an embedding scheme defined by $\rho_v = (\ell_1\,\ell_2\,\ell_1^{-1}\,\ell_2^{-1})$, $\lambda(\ell_1) = 1$ and $\lambda(\ell_2) = -1$. In a geometric presentation of $G$ in $\mathbb{R}^3$ with directed loops initiating at $v$ in counterclockwise order according to the rotation scheme $\rho_v$ as in Figure 2 (b), we attach a closed disk at the vertex $v$ and 1-bands along loops $\ell_1$ and $\ell_2$, where a 1-band is twisted if $\lambda(\ell_i) = -1$ and untwisted if $\lambda(\ell_i) = 1$ as in Figure 2 (c). Finally, we attach a closed disk along each boundary of the graph with 1-bands. Note that there exists only one component of the boundary of the graph with 1-bands in this example, and we get a 2-cell embedding of the figure eight into the Klein bottle with only one face as in
Figure 2. An embedding scheme for a figure eight embedded into the Klein bottle
Figure 2 (d). Conversely, if there exists such an embedding as in Figure 2 (d), it induces an embedding scheme $(\rho, \lambda)$ as described above. The orientability of the surface $\mathbb{S}$ can be detected by looking at the voltage assignments of cycles of $G$. In fact, $\mathbb{S}$ is orientable if and only if every cycle of $G$ is $\lambda$-trivial, that is, the number of edges $e$ with $\lambda(e) = -1$ is even in every cycle of $G$. In particular, every 2-cell embedding of $G$ into an orientable surface can be determined by an embedding scheme $(\rho, \lambda)$ with $\lambda(e) = 1$ for each $e \in E(G)$. Let $\imath : G \to \mathbb{S}$ be a 2-cell embedding and $(\rho, \lambda)$ the associated embedding scheme. Let
137 G* P
——-
S* p0
commutes. Moreover, if $G^{\phi}$ is connected, then $\mathbb{S}^{\phi}$ is also connected. Gross and Tucker 4 showed the following relation between branched coverings of a surface and coverings of a graph. Theorem 16 Let $(\rho, \lambda)$ be an embedding scheme for a graph $G$ which induces a 2-cell embedding $\imath : G \to \mathbb{S}$.
(1) Let
Notice that a 2-cell embedding of a graph $G$ into a surface $\mathbb{S}$ determines a cell decomposition of the surface having the graph $G$ as its 1-skeleton. Let $\phi$ be a voltage assignment of $G$ and let $G$ be 2-cell embedded in a surface $\mathbb{S}$. Then the lifted embedding scheme determines the cell decomposition of the surface $\mathbb{S}^{\phi}$ having the covering graph $G^{\phi}$ as its 1-skeleton. Moreover, the branched covering map $\tilde{p}_{\phi} : \mathbb{S}^{\phi} \to \mathbb{S}$ preserves cells, that is, it maps $i$-cells to $i$-cells for each $i = 0, 1, 2$, and the restriction of $\tilde{p}_{\phi}$ to its 1-skeleton is just the covering $p_{\phi} : G^{\phi} \to G$. It implies that if two branched coverings $\tilde{p}_{\phi} : \mathbb{S}^{\phi} \to \mathbb{S}$ and $\tilde{p}_{\psi} : \mathbb{S}^{\psi} \to \mathbb{S}$ are isomorphic, then the two coverings $p_{\phi} : G^{\phi} \to G$ and $p_{\psi} : G^{\psi} \to G$ are isomorphic as graph coverings. Conversely, if two coverings $p_{\phi} : G^{\phi} \to G$ and $p_{\psi} : G^{\psi} \to G$ are isomorphic, then, by Theorem 4, there exists a function $f : V(G) \to S_n$ such that $\psi(uv) = f(v)\phi(uv)f(u)^{-1}$ for each $uv$ in $D(G)$. Notice that the map $\Phi : G^{\phi} \to G^{\psi}$ defined by $\Phi(u_g) = u_{f(u)(g)}$ is a covering isomorphism. Let $(uv)_g = u_g v_{\phi(uv)(g)}$ map to $(uw)_g = u_g w_{\phi(uw)(g)}$ by the induced rotation system $(\rho^{\phi})_{u_g}$. By the definition of $\Phi$, $\Phi((uv)_g) = u_{f(u)(g)} v_{f(v)\phi(uv)(g)} = u_{f(u)(g)} v_{\psi(uv)f(u)(g)}$ and $\Phi((uw)_g) = u_{f(u)(g)} w_{\psi(uw)f(u)(g)}$. So, $\Phi \rho^{\phi} = \rho^{\psi} \Phi$. Now, by combining this fact with $\lambda^{\psi}(\Phi(e_g)) = \lambda(e) = \lambda^{\phi}(e_g)$, we can show that $\Phi$ is extended to a cell preserving homeomorphism $h$ from $\mathbb{S}^{\phi}$ to $\mathbb{S}^{\psi}$ such that $\tilde{p}_{\psi} \circ h = \tilde{p}_{\phi}$
Let $\mathfrak{B}_m$ be the graph consisting of one vertex and $m$ self loops, say $\ell_1, \ldots, \ell_m$. We call it the bouquet of $m$ circles or simply, a bouquet. Clearly, $\mathfrak{B}_m$ is irreducible (i.e., having no vertices of degree 2) if $m \ge 2$. A surface $\mathbb{S}_k$ can be represented by a $4k$-gon with identification data $\prod_{s=1}^{k} a_s b_s$
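$$\left\langle a_1, \ldots, a_k, b_1, \ldots, b_k, c_1, \ldots, c_{|B|}\ ;\ \prod_{s=1}^{k} a_s b_s a_s^{-1} b_s^{-1} \prod_{t=1}^{|B|} c_t = 1 \right\rangle \quad \text{if } k > 0;$$
$$\left\langle a_1, \ldots, a_{-k}, c_1, \ldots, c_{|B|}\ ;\ \prod_{s=1}^{-k} a_s a_s \prod_{t=1}^{|B|} c_t = 1 \right\rangle \quad \text{if } k < 0;$$
$$\left\langle c_1, \ldots, c_{|B|}\ ;\ \prod_{t=1}^{|B|} c_t = 1 \right\rangle \quad \text{if } k = 0.$$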
We call this the standard presentation of the fundamental group $\pi_1(\mathbb{S}_k - B, *)$. For each $t = 1, 2, \ldots, |B|$, we take a simple closed curve based at $*$ lying in the face determined by the polygonal representation of the surface $\mathbb{S}_k$ so that it represents the homotopy class of the generator $c_t$. Then, it induces a 2-cell embedding of a bouquet of $m$ circles into the surface $\mathbb{S}_k$ such that the embedding has $|B|$ 1-sided regions and one $(|B| + 4k)$-sided region if $k > 0$; $|B|$ 1-sided regions and one $(|B| - 2k)$-sided region if $k < 0$; and $|B|$ 1-sided regions and one $|B|$-sided region if $k = 0$, where $m$ is the number of generators of the corresponding fundamental group. We call this embedding $\imath : \mathfrak{B}_m \to \mathbb{S}_k$ the standard embedding, denoted by $\mathfrak{B}_m \hookrightarrow \mathbb{S}_k - B$. For example, Figure 3 illustrates the standard embeddings of bouquets with $|B| = 3$. Figure 3 (a) represents the standard embedding $\mathfrak{B}_7 \hookrightarrow \mathbb{S}_2 - B$ and (b) the standard embedding $\mathfrak{B}_6 \hookrightarrow \mathbb{S}_{-3} - B$. For convenience, let $a_k = 2k$ if $k > 0$, and $a_k = -k$ if $k < 0$. Let $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; n)$ (resp. $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; \mathcal{A})$) denote the subset of $(S_n)^{a_k+|B|}$ (resp. of $\mathcal{A}^{a_k+|B|}$) consisting of all $(a_k + |B|)$-tuples $(\sigma_1, \ldots, \sigma_{a_k+|B|})$ which satisfy the following three conditions:
(C1) The subgroup $\langle \sigma_1, \ldots, \sigma_{a_k+|B|} \rangle$ generated by $\{\sigma_1, \ldots, \sigma_{a_k+|B|}\}$ is transitive on $\{1, 2, \ldots, n\}$ (resp. is the full group $\mathcal{A}$), and
(C2) (i) if $k \ge 0$, then $\prod_{i=1}^{k} \sigma_i \sigma_{k+i} \sigma_i^{-1} \sigma_{k+i}^{-1} \prod_{i=1}^{|B|} \sigma_{2k+i} = 1$,
(ii) if $k < 0$, then $\prod_{i=1}^{-k} \sigma_i \sigma_i \prod_{i=1}^{|B|} \sigma_{-k+i} = 1$.
Figure 3. Two examples of standard embeddings
(C3) $\sigma_i \ne 1$ for each $i = a_k + 1, \ldots, a_k + |B|$. Note that condition (C1) guarantees that the surface $\mathbb{S}^{\phi}$ is connected, and conditions (C2) and (C3) guarantee that the set $B$ is the same as the branch set of the branched covering $\tilde{p}_{\phi} : \mathbb{S}^{\phi} \to \mathbb{S}$. By using a method similar to that in 23, we can obtain the following theorem. Theorem 18 (Existence and classification of branched coverings) Every permutation voltage assignment in $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; n)$ induces a connected branched $n$-fold covering of $\mathbb{S}_k$ with branch set $B$. Conversely, every connected branched $n$-fold covering of $\mathbb{S}_k$ with branch set $B$ can be derived from a voltage assignment in $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; n)$. Moreover, for any given two permutation voltage assignments
a^ejo--1
where $a_k = 2k$ if $k > 0$, and $a_k = -k$ if $k < 0$.
□
For a finite group $\mathcal{A}$, let $S_{\mathcal{A}}$ denote the symmetric group on the group elements of $\mathcal{A}$. It gives the (left) regular representation $\mathcal{A} \to S_{\mathcal{A}}$ of $\mathcal{A}$ via $g \mapsto L(g)$, the left translation by $g$ on $\mathcal{A}$. Clearly, this representation is faithful and the group $\mathcal{A}$ can be identified with the group of left translations $L(g)$: $\mathcal{A} = \{L(g) \mid g \in \mathcal{A}\}$ (Cayley Theorem). Notice that a permutation voltage assignment $\phi : D(G) \to S_{\mathcal{A}}$ having its images in $\mathcal{A}$ can be considered as an $\mathcal{A}$-voltage assignment of $G$, and for such a voltage assignment $\phi$, the permutation derived graph $G^{\phi}$ is nothing but the ordinary derived graph $G \times_{\phi} \mathcal{A}$. By using this fact, Kwak et al. showed the following. Theorem 19 (23) (Existence and classification of regular branched coverings) Every ordinary voltage assignment in $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; \mathcal{A})$ induces a connected branched $\mathcal{A}$-covering of $\mathbb{S}_k$ with branch set $B$. Conversely, every connected branched $\mathcal{A}$-covering of $\mathbb{S}_k$ with branch set $B$ can be derived from a voltage assignment in $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; \mathcal{A})$. Moreover, for any given two voltage assignments $\phi, \psi \in C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; \mathcal{A})$, two branched $\mathcal{A}$-coverings $\tilde{p}_{\phi} : \mathbb{S}^{\phi} \to \mathbb{S}$ and $\tilde{p}_{\psi} : \mathbb{S}^{\psi} \to \mathbb{S}$ are isomorphic if and only if the two graph coverings $p_{\phi} : \mathfrak{B}_{a_k+|B|} \times_{\phi} \mathcal{A} \to \mathfrak{B}_{a_k+|B|}$ and $p_{\psi} : \mathfrak{B}_{a_k+|B|} \times_{\psi} \mathcal{A} \to \mathfrak{B}_{a_k+|B|}$ are isomorphic. Equivalently, there exists a group automorphism $\sigma$ of $\mathcal{A}$ such that
$$\psi(\ell_i) = \sigma(\phi(\ell_i))$$
for all $\ell_i \in D(\mathfrak{B}_{a_k+|B|})$, where $a_k = 2k$ if $k > 0$, and $a_k = -k$ if $k < 0$.
□
There are two classical Hurwitz theorems: the existence and the classification theorems of surface branched coverings. Let $p : \tilde{\mathbb{S}} \to \mathbb{S}$ be an $n$-fold surface branched covering, where $\tilde{\mathbb{S}}$ is possibly disconnected. Hurwitz 18 introduced a system, called a Hurwitz system, for $p$ as follows: Consider the associated covering $p|_{\tilde{\mathbb{S}} - p^{-1}(B)} : \tilde{\mathbb{S}} - p^{-1}(B) \to \mathbb{S} - B$ of $p$. A Hurwitz system is a representation $H_p : \pi_1(\mathbb{S} - B, *) \to S_n$, which is determined by choosing a one-to-one correspondence $p^{-1}(*) \leftrightarrow \{1, 2, \ldots, n\}$ and assigning to a loop $\alpha$ in $\mathbb{S} - B$ based at $*$ the permutation of $\{1, 2, \ldots, n\}$ induced by the liftings of $\alpha$. For any finite set $B$ of points in $\mathbb{S}$ and a representation $H : \pi_1(\mathbb{S} - B, *) \to S_n$, there exists an $n$-fold branched covering $p : \tilde{\mathbb{S}} \to \mathbb{S}$, where $\tilde{\mathbb{S}}$ is perhaps not connected, with branch set contained in $B$ and $H_p = H$ (Hurwitz existence theorem). Two $n$-fold branched coverings $p_i : \tilde{\mathbb{S}}_i \to \mathbb{S}$, $i = 1, 2$, are isomorphic if and only if $H_{p_2} = H_{p_1}$ modulo inner automorphisms of $S_n$ (Hurwitz classification theorem). Every group homomorphism from $\pi_1(\mathbb{S} - B, *)$ to $S_n$ is uniquely determined by its values on the generator set $\{a_s, b_s, c_t\}$ of $\pi_1(\mathbb{S} - B, *)$, and these values must preserve the corresponding relation in the standard presentation of $\pi_1(\mathbb{S} - B, *)$. Hence, a Hurwitz system $H_p : \pi_1(\mathbb{S} - B, *) \to S_n$ for a branched $n$-fold covering $p : \tilde{\mathbb{S}} \to \mathbb{S}$ is nothing but a voltage assignment in $C^1(\mathfrak{B}_m; n)$ which satisfies
142
the conditions (C2) and (C3), and that of a connected branched n-fold covering p : S —» § is nothing but a voltage assignment in C1(SBm <-> S — B;n). So, Theorems 18 and 19 are new combinatorial statements of the Hurwitz existence and classification theorems for branched coverings and for branched regular coverings, respectively. REMARK Let p : S —> S be an n-fold connected unbranched covering and let * £ S. The monodromy representation of 7Ti(§, *) is a homomorphism Hp : 7Ti(§, *) —> Sn determined by choosing a one-to-one correspondence 2>_1(*) <-» {1, 2 , . . . , n} and assigning to a loop a in 7rj(S, *) the permutation of {1,2, . . . , n } induced by the liftings of a. This permutation maps * € p~1(*) to the terminal point of the lifting a having * as an initial point, i.e., Hp(a)(*) = 3(1). The image of the monodromy representation is a subgroup of Sn and is called the monodromy group. Its element is called a monodromy map. Notice that the monodromy representation for a surface covering is equal to the Hurwitz system for a surface covering and hence it can be identified with a voltage assignment
€ C 1 (?B m ;n),
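The group $S_n$ acts on such voltage assignments by simultaneous coordinatewise conjugation,
\[ g \cdot (\sigma_1, \dots, \sigma_m) = (g\sigma_1 g^{-1}, \dots, g\sigma_m g^{-1}). \]
It follows from Theorem 18 that two voltage assignments in $C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; n)$ derive isomorphic branched coverings of $\mathbb{S}_k$ if and only if they belong to the same orbit under the $S_n$-action. Hence we have the following.

Lemma 13 Let $k$ be any integer and let $B$ be a finite subset of the surface $\mathbb{S}_k$. Then the number of isomorphism classes of connected $n$-fold branched coverings of the surface $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}(\mathbb{S}_k, B; n) = \left| C^1(\mathfrak{B}_{a_k+|B|} \hookrightarrow \mathbb{S}_k - B; n) / S_n \right|. \]
□

Now, we aim to express the number $\mathrm{Isoc}(\mathbb{S}_k, B; n)$ in terms of known parameters.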
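Before a closed formula is derived, the orbit count of Lemma 13 can be checked by brute force for very small parameters. The sketch below is only an illustration under stated assumptions: conditions (C1), (C2), (C3) are defined outside this section, and they are taken here to mean, respectively, that the voltages generate a transitive group, that they satisfy the standard surface relation ($\prod [a_s, b_s] \prod c_t = 1$ for $k \ge 0$, $\prod a_s^2 \prod c_t = 1$ for $k < 0$), and that every branch voltage $c_t$ is not the identity. The function names are hypothetical.

```python
from itertools import permutations, product

def compose(p, q):
    """Return the permutation x -> p[q[x]] (apply q first, then p)."""
    return tuple(p[x] for x in q)

def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def is_transitive(gens, n):
    """Assumed (C1): the group generated by gens has one orbit on range(n)."""
    seen, stack = {0}, [0]
    while stack:
        x = stack.pop()
        for g in gens:
            if g[x] not in seen:
                seen.add(g[x])
                stack.append(g[x])
    return len(seen) == n

def relator(k, phi, n):
    """Assumed (C2): evaluate the standard surface relator on the tuple phi."""
    a_k = 2 * k if k >= 0 else -k
    word = tuple(range(n))
    if k >= 0:                       # orientable: product of commutators
        for s in range(k):
            a, b = phi[2 * s], phi[2 * s + 1]
            for factor in (a, b, inverse(a), inverse(b)):
                word = compose(word, factor)
    else:                            # nonorientable: product of squares
        for a in phi[:a_k]:
            word = compose(compose(word, a), a)
    for c in phi[a_k:]:              # branch voltages
        word = compose(word, c)
    return word

def isoc_branched(k, b, n):
    """Count S_n-orbits (simultaneous conjugation) of admissible voltage tuples."""
    sym = list(permutations(range(n)))
    identity = tuple(range(n))
    a_k = 2 * k if k >= 0 else -k
    classes = set()
    for phi in product(sym, repeat=a_k + b):
        if any(c == identity for c in phi[a_k:]):      # assumed (C3)
            continue
        if relator(k, phi, n) != identity:
            continue
        if not is_transitive(phi, n):
            continue
        # canonical representative of the conjugation orbit of phi
        classes.add(min(tuple(compose(compose(g, s), inverse(g)) for s in phi)
                        for g in sym))
    return len(classes)
```

For instance, `isoc_branched(1, 0, 2)` counts the connected unbranched double coverings of the torus, and `isoc_branched(0, 2, 2)` the connected double coverings of the sphere with two branch points. The enumeration grows like $(n!)^{a_k+b}$, so it is only meant as a sanity check.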
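Let $\mathcal{E}(\mathfrak{B}_m; n)$ denote the set of all $m$-tuples $(\sigma_1, \dots, \sigma_m)$ in $(S_n)^m$ such that the group $\langle \sigma_1, \dots, \sigma_m \rangle$ generated by $\{\sigma_1, \dots, \sigma_m\}$ is transitive on $\{1, 2, \dots, n\}$, that is,
\[ \mathcal{E}(\mathfrak{B}_m; n) = \{(\sigma_1, \sigma_2, \dots, \sigma_m) \in (S_n)^m : \langle \sigma_1, \sigma_2, \dots, \sigma_m \rangle \text{ is transitive on } \{1, 2, \dots, n\}\}. \]
Then $\mathcal{E}(\mathfrak{B}_m; n)$ contains all representatives of connected $n$-fold coverings of the bouquet of $m$ circles $\mathfrak{B}_m$, and the number $\mathrm{Isoc}(\mathfrak{B}_m; n)$ of isomorphism classes of connected $n$-fold coverings of $\mathfrak{B}_m$ is equal to $|\mathcal{E}(\mathfrak{B}_m; n)/S_n|$, where the $S_n$-action on $\mathcal{E}(\mathfrak{B}_m; n)$ is the simultaneous coordinatewise conjugation.

Lemma 14 For $0 \le t \le b$, let
\[ S(k, b, t) = \{\phi \in (S_n)^{a_k+b} : \phi \text{ satisfies (C1), (C2) and } \sigma_i = 1 \text{ for all } i = a_k+1, \dots, a_k+t\}, \]
where $\phi = (\sigma_1, \sigma_2, \dots, \sigma_{a_k+b})$. If $t = b$, then the set $S(k, b, b)$ is equal to the set $C^1(\mathfrak{B}_{a_k} \hookrightarrow \mathbb{S}_k; n)$, and if $t \ne b$, then there is a one-to-one correspondence between the sets $S(k, b, t)$ and $\mathcal{E}(\mathfrak{B}_{a_k+b-t-1}; n)$. Moreover, the correspondence preserves the $S_n$-actions on both sets, which are defined by simultaneous coordinatewise conjugation.

Proof: The case of $t = b$ is clear. Assume that $t \ne b$. Then every element in $S(k, b, t)$ is of the form $(\sigma_1, \dots, \sigma_{a_k}, 1, \dots, 1, \sigma_{a_k+t+1}, \dots, \sigma_{a_k+b})$, and the function $f : S(k, b, t) \to \mathcal{E}(\mathfrak{B}_{a_k+b-t-1}; n)$ defined by
\[ f(\sigma_1, \dots, \sigma_{a_k}, 1, \dots, 1, \sigma_{a_k+t+1}, \dots, \sigma_{a_k+b}) = (\sigma_1, \dots, \sigma_{a_k}, \sigma_{a_k+t+1}, \dots, \sigma_{a_k+b-1}) \]
is well-defined and bijective (note that the function $f$ is defined by deleting the 1's and the last coordinate). This completes the proof. □

Theorem 20 Let $k$ be any integer and let $B$ be a $b$-subset of the surface $\mathbb{S}_k$. Then the number of connected $n$-fold branched coverings of the surface $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}(\mathbb{S}_k, B; n) = (-1)^b\, \mathrm{Isoc}(\mathbb{S}_k, \emptyset; n) + \sum_{t=0}^{b-1} (-1)^t \binom{b}{t} \mathrm{Isoc}(\mathfrak{B}_{a_k+b-t-1}; n), \]
where $\mathfrak{B}_m$ is a bouquet of $m$ circles, $a_k = 2k$ if $k \ge 0$, and $a_k = -k$ if $k < 0$.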
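As a quick plausibility check of Theorem 20, the alternating sum can be evaluated for double coverings ($n = 2$), where both ingredients have simple closed forms. The sketch assumes $\mathrm{Isoc}(\mathfrak{B}_m; 2) = 2^m - 1$ (the case $p = 2$ of the count of connected $\mathbb{Z}_p$-coverings of a bouquet) and $\mathrm{Isoc}(\mathbb{S}_k, \emptyset; 2) = 2^{a_k} - 1$; the function names are hypothetical.

```python
from math import comb

def a(k):
    """a_k = 2k for k >= 0 (orientable), -k for k < 0 (nonorientable)."""
    return 2 * k if k >= 0 else -k

def isoc_bouquet_double(m):
    """Assumed closed form: connected double coverings of the bouquet of m circles."""
    return 2 ** m - 1

def isoc_unbranched_double(k):
    """Assumed closed form: connected unbranched double coverings of S_k."""
    return 2 ** a(k) - 1

def theorem20_double(k, b):
    """Right-hand side of Theorem 20 for n = 2 and a branch set of size b."""
    total = (-1) ** b * isoc_unbranched_double(k)
    for t in range(b):
        total += (-1) ** t * comb(b, t) * isoc_bouquet_double(a(k) + b - t - 1)
    return total
```

Under these assumptions the sum collapses to $2^{a_k}$ for even $b \ge 2$ and to $0$ for odd $b$, which matches the rows $(|B|, n) = (1,2)$ and $(2,2)$ of Table 10.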
Proof: For each $i = a_k+1, \dots, a_k+b$, let $\mathcal{P}_i$ be the property that the $i$-th coordinate of an element of $(S_n)^{a_k+b}$ is the identity. For each subset $S$ of $\{a_k+1, \dots, a_k+b\}$, let $N(\mathcal{P}_S)$ be the number of elements in the product $(S_n)^{a_k+b}$ which satisfy conditions (C1), (C2) and the properties $\mathcal{P}_i$ for all $i \in S$. Notice that $N(\mathcal{P}_{\emptyset})$ is the number of all elements in the product $(S_n)^{a_k+b}$ which satisfy conditions (C1) and (C2), and that the set $C^1(\mathfrak{B}_{a_k+b} \hookrightarrow \mathbb{S}_k - B; n)$ is equal to the set of elements of $(S_n)^{a_k+b}$ which satisfy conditions (C1) and (C2), but none of the properties $\mathcal{P}_i$ for $i = a_k+1, \dots, a_k+b$. It follows from the principle of inclusion and exclusion that
\[ \left| C^1(\mathfrak{B}_{a_k+b} \hookrightarrow \mathbb{S}_k - B; n) \right| = \sum_{t=0}^{b} (-1)^t \Bigg( \sum_{\substack{S \subseteq \{a_k+1, \dots, a_k+b\} \\ |S| = t}} N(\mathcal{P}_S) \Bigg). \]
Since $N(\mathcal{P}_S) = N(\mathcal{P}_{S'})$ for any two subsets $S, S'$ of $\{a_k+1, \dots, a_k+b\}$ with the same cardinality, we have
\[ \sum_{\substack{S \subseteq \{a_k+1, \dots, a_k+b\} \\ |S| = t}} N(\mathcal{P}_S) = \binom{b}{t} \left| \{\phi \in (S_n)^{a_k+b} : \phi \text{ satisfies (C1), (C2) and } \sigma_i = 1 \ \forall i = a_k+1, \dots, a_k+t\} \right|. \]
Now, it follows from Lemma 14 that
\[ \left| C^1(\mathfrak{B}_{a_k+b} \hookrightarrow \mathbb{S}_k - B; n) \right| = \sum_{t=0}^{b-1} (-1)^t \binom{b}{t} \left| \mathcal{E}(\mathfrak{B}_{a_k+b-t-1}; n) \right| + (-1)^b \left| C^1(\mathfrak{B}_{a_k} \hookrightarrow \mathbb{S}_k; n) \right|. \]
By taking the $S_n$-action on the underlying sets of both sides of this equation, we have
\[ \mathrm{Isoc}(\mathbb{S}_k, B; n) = (-1)^b\, \mathrm{Isoc}(\mathbb{S}_k, \emptyset; n) + \sum_{t=0}^{b-1} (-1)^t \binom{b}{t} \mathrm{Isoc}(\mathfrak{B}_{a_k+b-t-1}; n). \]
□

By using Burnside's Lemma, Mednykh (37, 38) counted the number of subgroups in the fundamental group $\pi_1(\mathbb{S}_k, *)$ of an orientable surface $\mathbb{S}_k$ and the number of conjugacy classes of subgroups in $\pi_1(\mathbb{S}_k, *)$. The same problem for a nonorientable surface was solved by A. Mednykh and G. Pozdnyakova in (40).
(|B|,n)   k = -4   k = -3   k = -2   k = -1   k = 0   k = 1   k = 2   k = 3    k = 4
(0,2)     15       7        3        1        0       3       15      63       255
(0,3)     90       18       4        1        0       4       100     2884     96104
(1,2)     0        0        0        0        0       0       0       0        0
(1,3)     145      23       3        0        0       3       135     5103     185895
(2,2)     16       8        4        2        1       4       16      64       256
(2,3)     981      171      31       6        1       31      991     34231    1218031

Table 10. The number Isoc(S_k, B; n) for small k, n and small |B|

Theorem 21 (37, 38, 40)
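(1) The number of subgroups of index $n$ in the fundamental group $\pi_1(\mathbb{S}_k, *)$ of an (orientable or nonorientable) surface $\mathbb{S}_k$ of genus $k$ is
\[ s_k(n) = s_{\pi_1(\mathbb{S}_k, *)}(n) = n \sum_{s=1}^{n} \frac{(-1)^{s+1}}{s} \sum_{\substack{i_1 + i_2 + \cdots + i_s = n \\ i_1, i_2, \dots, i_s \ge 1}} \beta_{i_1} \beta_{i_2} \cdots \beta_{i_s}, \]
where
\[ \beta_h = \sum_{\lambda \in D_h} \left( \frac{h!}{f^{(\lambda)}} \right)^{\nu}, \qquad \nu = \begin{cases} 2k - 2 & \text{if } k \ge 0, \\ -k - 2 & \text{if } k < 0, \end{cases} \]
$D_h$ is the set of all irreducible representations of the group $S_h$, and $f^{(\lambda)}$ is the degree of the representation $\lambda$.

(2) The number of nonisomorphic connected $n$-fold unbranched coverings of a surface $\mathbb{S}_k$ of genus $k$ is
\[ \mathrm{Isoc}(\mathbb{S}_k, \emptyset; n) = \begin{cases} \dfrac{1}{n} \displaystyle\sum_{m \mid n} s_k(m) \sum_{d \mid \frac{n}{m}} \mu\!\left( \frac{n}{md} \right) d^{(2k-2)m+2} & \text{if } k \ge 0, \\[2ex] \dfrac{1}{n} \displaystyle\sum_{m \mid n} \sum_{d \mid \frac{n}{m}} \mu\!\left( \frac{n}{md} \right) d^{(-k-2)m+1} \left[ (2,d)\, s_k^{-}(m) + d\, s_k^{+}(m) \right] & \text{if } k < 0, \end{cases} \]
where $\mu(m)$ is the Möbius function, $s_k^{+}(m) = 0$ if $m$ is odd, and $s_k^{+}(m) = s_{-k-1}(\frac{m}{2})$ if $m$ is even, $s_k^{-}(m) = s_k(m) - s_k^{+}(m)$, and $(2,d)$ denotes the greatest common divisor of 2 and $d$.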
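Part (1) of Theorem 21 is easy to evaluate for small $n$. The following sketch is an illustration, not the authors' code: it obtains the degrees of the irreducible representations of $S_h$ from the hook length formula (a standard fact that the text does not spell out) and uses exact rational arithmetic, because the exponent $2k-2$ or $-k-2$ can be negative. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def partitions(h, largest=None):
    """Yield the partitions of h as non-increasing tuples."""
    if largest is None:
        largest = h
    if h == 0:
        yield ()
        return
    for first in range(min(h, largest), 0, -1):
        for rest in partitions(h - first, first):
            yield (first,) + rest

def degree(shape):
    """Degree of the irreducible representation of S_h labelled by shape (hook length formula)."""
    conj = [sum(1 for row in shape if row > c) for c in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            hooks *= (row - j - 1) + (conj[j] - i - 1) + 1   # arm + leg + 1
    return factorial(sum(shape)) // hooks

def beta(h, nu):
    return sum(Fraction(factorial(h), degree(lam)) ** nu for lam in partitions(h))

def subgroups(k, n):
    """Number of index-n subgroups of pi_1(S_k), following Theorem 21 (1)."""
    nu = 2 * k - 2 if k >= 0 else -k - 2
    b = [Fraction(0)] + [beta(h, nu) for h in range(1, n + 1)]
    # power[i] = sum of beta-products over compositions of i into s positive parts
    power = [Fraction(1)] + [Fraction(0)] * n
    total = Fraction(0)
    for s in range(1, n + 1):
        power = [sum(power[j] * b[i - j] for j in range(i)) for i in range(n + 1)]
        total += Fraction((-1) ** (s + 1), s) * power[n]
    result = n * total
    assert result.denominator == 1      # the count must be an integer
    return int(result)
```

For index 2 this reproduces the counts $4^k - 1$ (orientable) and $2^{-k} - 1$ (nonorientable) of the row $(0,2)$ of Table 10, since every subgroup of index 2 is normal.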
Next, we aim to compute the number $\mathrm{Isoc}^R(\mathbb{S}_k, B; n)$ of nonisomorphic connected regular $n$-fold branched coverings of the surface $\mathbb{S}_k$ with branch set $B$. Clearly, any two connected regular branched coverings are not isomorphic if their covering transformation groups (or voltage groups) are not isomorphic. Since every connected regular $n$-fold branched covering is isomorphic to a connected branched $\mathcal{A}$-covering for some group $\mathcal{A}$ of order $n$, we have
\[ \mathrm{Isoc}^R(\mathbb{S}_k, B; n) = \sum_{\mathcal{A}} \mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A}), \]
where $\mathcal{A}$ runs over all representatives of isomorphism classes of groups of order $n$. Recall that the number $\mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A})$ of nonisomorphic connected $\mathcal{A}$-coverings of the surface $\mathbb{S}_k$ with branch set $B$ is equal to the number of orbits of the coordinatewise $\mathrm{Aut}(\mathcal{A})$-action on the set $\mathcal{G}(k, B; \mathcal{A})$. Note that this $\mathrm{Aut}(\mathcal{A})$-action on the set $\mathcal{G}(k, B; \mathcal{A})$ is free because $\sigma_1, \dots, \sigma_m$ generate $\mathcal{A}$. Now, by the Burnside Lemma, we have
\[ \mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A}) = \frac{|\mathcal{G}(k, B; \mathcal{A})|}{|\mathrm{Aut}(\mathcal{A})|}, \]
where $m = 2k + |B|$ if $k \ge 0$, and $m = -k + |B|$ if $k < 0$. We summarize our discussion as follows.

Theorem 22 Let $k$ be any integer and let $B$ be a $b$-subset of the surface $\mathbb{S}_k$. Then we have
(1) the number of nonisomorphic connected regular $n$-fold branched coverings of the surface $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}^R(\mathbb{S}_k, B; n) = \sum_{\mathcal{A}} \mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A}), \]
where $\mathcal{A}$ runs over all representatives of isomorphism classes of groups of order $n$, and
(2) the number of nonisomorphic connected regular $\mathcal{A}$-coverings of the surface $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A}) = \frac{|\mathcal{G}(k, B; \mathcal{A})|}{|\mathrm{Aut}(\mathcal{A})|}, \]
where $m = 2k + |B|$ if $k \ge 0$, and $m = -k + |B|$ if $k < 0$. □

By Theorem 22, we now need to compute the number $\mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A})$ for each finite group $\mathcal{A}$ of order $n$. By using a method similar to the proof of Theorem 20, we obtain the following theorem.
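Theorem 23 Let $k$ be any integer and let $B$ be a $b$-subset of the surface $\mathbb{S}_k$. Then, for any finite group $\mathcal{A}$, the number of branched connected $\mathcal{A}$-coverings of the surface $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}(\mathbb{S}_k, B; \mathcal{A}) = (-1)^b\, \mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A}) + \sum_{t=0}^{b-1} (-1)^t \binom{b}{t} \mathrm{Isoc}(\mathfrak{B}_{a_k+b-t-1}; \mathcal{A}), \]
where $\mathfrak{B}_m$ is the bouquet of $m$ circles, $a_k = 2k$ if $k \ge 0$, and $a_k = -k$ if $k < 0$.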
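□

Recall that an explicit computation of the number $\mathrm{Isoc}(\mathfrak{B}_m; \mathcal{A})$ was done for any $m$ and any finite abelian group $\mathcal{A}$ or dihedral group $\mathbb{D}_n$ of order $2n$ (see Sections 4-6). But the number $\mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A})$ is known if $\mathcal{A}$ is $\mathbb{Z}_p$ or $\mathbb{D}_p$ (see (23), (29) or the next Section 8). As a final discussion of this section, we introduce a formula for computing the number $\mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A})$ when $\mathcal{A}$ is abelian. If $\mathcal{A}$ is an abelian group and $\mathbb{S}_k$ is an orientable surface, then the number $\mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A})$ of connected $\mathcal{A}$-coverings of the surface $\mathbb{S}_k$ is equal to the number $\mathrm{Isoc}(\mathfrak{B}_{2k}; \mathcal{A})$ of connected $\mathcal{A}$-coverings of the bouquet of $2k$ circles $\mathfrak{B}_{2k}$. In this case, we computed this number in Section 4. By the classification theorem of finite abelian groups, we can express a finite abelian group $\mathcal{A}$ as follows:
\[ \mathcal{A} = \mathcal{A}_o \oplus \mathcal{A}_e = \Big( \bigoplus_{i=1}^{\ell} \bigoplus_{j=1}^{k_i} m_{ij}\, \mathbb{Z}_{p_i^{s_{ij}}} \Big) \oplus \Big( \bigoplus_{h=1}^{r} n_h\, \mathbb{Z}_{2^{t_h}} \Big), \]
where the $p_i$ are odd primes and $p_i \ne p_{i'}$ if $i \ne i'$. Let $\theta(\mathcal{A})$ denote the number of direct summands of $\mathcal{A}$ whose order is a multiple of 4 and $\omega(\mathcal{A})$ denote the number of direct summands of $\mathcal{A}$ whose order is 2. For example, $\mathbb{Z}_6 \oplus \mathbb{Z}_8 = \mathbb{Z}_3 \oplus \mathbb{Z}_2 \oplus \mathbb{Z}_8$, $\theta(\mathbb{Z}_6 \oplus \mathbb{Z}_8) = 1$ and $\omega(\mathbb{Z}_6 \oplus \mathbb{Z}_8) = 1$.

Lemma 15 Let $k$ be any integer and let $B$ be a $b$-subset of the surface $\mathbb{S}_k$. Let $\mathcal{A}$ be an abelian group. Then we have the following.
(1) If $k \ge 0$, then $\mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A}) = \mathrm{Isoc}(\mathfrak{B}_{2k}; \mathcal{A})$.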
(|B|,p)   k = -4   k = -3   k = -2   k = -1   k = 0   k = 1   k = 2   k = 3   k = 4
(0,2)     15       7        3        1        0       3       15      63      255
(0,3)     13       4        1        0        0       4       40      364     3280
(1,2)     0        0        0        0        0       0       0       0       0
(1,3)     27       9        3        1        0       0       0       0       0
(2,2)     16       8        4        2        1       4       16      64      256
(2,3)     54       18       6        2        1       9       81      729     6561

Table 11. The number Isoc(S_k, B; Z_p) for small k, p and small |B|
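(2) If $k < 0$, then
\[ \mathrm{Isoc}(\mathbb{S}_k, \emptyset; \mathcal{A}) = \begin{cases} 2^{\theta(\mathcal{A})}\, \dfrac{2^{-k-\theta(\mathcal{A})} - 1}{2^{-k-(\theta(\mathcal{A})+\omega(\mathcal{A}))} - 1}\, \mathrm{Isoc}(\mathfrak{B}_{-k-1}; \mathcal{A}) & \text{if } \theta(\mathcal{A}) + \omega(\mathcal{A}) < -k, \\[2ex] \text{[expression not recoverable from the source]} & \text{if } \theta(\mathcal{A}) + \omega(\mathcal{A}) = -k \text{ and } \theta(\mathcal{A}) \ne -k, \\[1ex] 0 & \text{otherwise,} \end{cases} \]
where $\mathcal{A} = \mathcal{A}_o \oplus \mathcal{A}_e = \Big( \bigoplus_{i=1}^{\ell} \bigoplus_{j=1}^{k_i} m_{ij}\, \mathbb{Z}_{p_i^{s_{ij}}} \Big) \oplus \Big( \bigoplus_{h=1}^{r} n_h\, \mathbb{Z}_{2^{t_h}} \Big)$. □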
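As an illustration, we compute $\mathrm{Isoc}^R(\mathbb{S}_k, B; p)$ for any prime $p$. Recall that $\mathrm{Isoc}(\mathfrak{B}_m; \mathbb{Z}_p) = \frac{p^m - 1}{p - 1}$ for a prime number $p$. Since every group of order $p$ is isomorphic to the cyclic group $\mathbb{Z}_p$, it follows from Theorem 22 that $\mathrm{Isoc}^R(\mathbb{S}_k, B; p) = \mathrm{Isoc}(\mathbb{S}_k, B; \mathbb{Z}_p)$ for any $k$ and any $B \subset \mathbb{S}_k$. Now, by applying Theorem 23 and Lemma 15, we have the following.

Theorem 24 Let $B$ be a $b$-subset of the surface $\mathbb{S}_k$ and let $p$ be a prime. Then the number $\mathrm{Isoc}^R(\mathbb{S}_k, B; p)$ of nonisomorphic regular connected branched $p$-fold coverings of $\mathbb{S}_k$ with branch set $B$ is
\[ \mathrm{Isoc}^R(\mathbb{S}_k, B; p) = \begin{cases} \dfrac{p^{2k} - 1}{p - 1} & \text{if } k \ge 0 \text{ and } b = 0, \\[1.5ex] p^{2k-1} \left( (p-1)^{b-1} + (-1)^b \right) & \text{if } k \ge 0 \text{ and } b \ne 0, \\[1ex] 2^{-k} - 1 & \text{if } k < 0,\ b = 0 \text{ and } p = 2, \\[1ex] 2^{-k-1} \left( 1 + (-1)^b \right) & \text{if } k < 0,\ b \ne 0 \text{ and } p = 2, \\[1ex] \dfrac{p^{-k-1} - 1}{p - 1} & \text{if } k < 0,\ b = 0 \text{ and } p \ne 2, \\[1.5ex] p^{-k-1} (p-1)^{b-1} & \text{if } k < 0,\ b \ne 0 \text{ and } p \ne 2. \end{cases} \]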
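The closed form of Theorem 24 can be compared with a direct count based on Theorem 22 (2): enumerate the voltage tuples in $\mathbb{Z}_p$, keep those that satisfy the surface relation, have nonzero branch voltages and generate $\mathbb{Z}_p$, and divide by $|\mathrm{Aut}(\mathbb{Z}_p)| = p - 1$. This is a minimal sketch under the assumption that, written additively in the abelian group $\mathbb{Z}_p$, the relation reads $\sum c_t = 0$ for $k \ge 0$ (commutators vanish) and $2\sum a_s + \sum c_t = 0$ for $k < 0$; the function names are hypothetical.

```python
from itertools import product

def a(k):
    """a_k = 2k for k >= 0, -k for k < 0."""
    return 2 * k if k >= 0 else -k

def isoc_zp_bruteforce(k, b, p):
    """Count admissible Z_p-voltage tuples and divide by |Aut(Z_p)| = p - 1."""
    count = 0
    for phi in product(range(p), repeat=a(k) + b):
        handles, branch = phi[:a(k)], phi[a(k):]
        if any(c == 0 for c in branch):          # branch voltages must be nontrivial
            continue
        relator = sum(branch) if k >= 0 else 2 * sum(handles) + sum(branch)
        if relator % p != 0:                     # assumed surface relation in Z_p
            continue
        if all(x == 0 for x in phi):             # the tuple must generate Z_p
            continue
        count += 1
    return count // (p - 1)

def isoc_zp_formula(k, b, p):
    """Closed form stated in Theorem 24."""
    if k >= 0:
        if b == 0:
            return (p ** (2 * k) - 1) // (p - 1)
        return p ** (2 * k) * ((p - 1) ** (b - 1) + (-1) ** b) // p
    if p == 2:
        return 2 ** (-k) - 1 if b == 0 else 2 ** (-k - 1) * (1 + (-1) ** b)
    if b == 0:
        return (p ** (-k - 1) - 1) // (p - 1)
    return p ** (-k - 1) * (p - 1) ** (b - 1)
```

The division by $p - 1$ is exact because the automorphism action on generating tuples is free, as noted before Theorem 22.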
In fact, $\mathrm{Isoc}^R(\mathbb{S}_k, B; p)$ for $k \ge 0$ was computed by Mednykh in (36, 37). In (30), one can also find explicit formulas for computing the numbers $\mathrm{Isoc}^R(\mathbb{S}_k, B; 2p)$ and $\mathrm{Isoc}^R(\mathbb{S}_k, B; p^2)$ when $p$ is a prime number. This kind of enumeration of regular coverings will be continued in the next section.
8 Distributions of branched surface coverings
A well-known theorem of Alexander (1) says that every orientable surface is a branched covering of the sphere $\mathbb{S}^2$, and every nonorientable surface is a branched covering of the projective plane. In the study of surface branched coverings, a natural generalization of Alexander's theorem is the question: In how many different ways can a given surface be a branched covering of another given surface? To give a systematic answer to this question, we define two polynomials, called branched covering distribution polynomials.

(i) For each $i \in \mathbb{Z}$, let $a_i(\mathbb{S}, B; n)$ denote the number of equivalence classes of branched $n$-fold coverings $p : \mathbb{S}_i \to \mathbb{S}$ with branch set $B$, and let
\[ R_{(\mathbb{S}, B; n)}(x) = \sum_{i=-\infty}^{\infty} a_i(\mathbb{S}, B; n)\, x^i. \]

(ii) For each $i \in \mathbb{Z}$, let $a_i(\mathbb{S}, B; \mathcal{A})$ denote the number of equivalence classes of branched $\mathcal{A}$-coverings $p : \mathbb{S}_i \to \mathbb{S}$ with branch set $B$, and let
\[ R_{(\mathbb{S}, B; \mathcal{A})}(x) = \sum_{i=-\infty}^{\infty} a_i(\mathbb{S}, B; \mathcal{A})\, x^i. \]

These two polynomials can have at most finitely many nonzero terms by the Riemann-Hurwitz equation:
\[ \chi(\tilde{\mathbb{S}}) = n\, \chi(\mathbb{S}) - \sum_{b \in B} \mathrm{def}(b), \]
where $\mathrm{def}(b) = n - |p^{-1}(b)|$ and $\chi$ denotes the Euler characteristic. (Here, $n = |\mathcal{A}|$ for an $\mathcal{A}$-covering.)

REMARK From the covering distribution polynomials, we see that the number
\[ R_{(\mathbb{S}, B; n)}(1) = \sum_{i=-\infty}^{\infty} a_i(\mathbb{S}, B; n) \]
is equal to the total number $\mathrm{Isoc}(\mathbb{S}, B; n)$ of nonequivalent branched $n$-fold coverings of the (orientable or nonorientable) surface $\mathbb{S}$ with branch set $B$. In particular, the total number $R_{(\mathbb{S}, \emptyset; n)}(1)$ of nonequivalent unbranched $n$-fold coverings of the surface $\mathbb{S}$ is equal to the number of conjugacy classes of subgroups of index $n$ of the fundamental group $\pi_1(\mathbb{S}, *)$. Also, for the regular coverings, the number
\[ R_{(\mathbb{S}, B; \mathcal{A})}(1) = \sum_{i=-\infty}^{\infty} a_i(\mathbb{S}, B; \mathcal{A}) \]
is equal to the total number $\mathrm{Isoc}(\mathbb{S}, B; \mathcal{A})$ of nonequivalent branched $\mathcal{A}$-coverings of the surface $\mathbb{S}$ with branch set $B$. The total number $\mathrm{Isoc}(\mathbb{S}, \emptyset; \mathcal{A}) = R_{(\mathbb{S}, \emptyset; \mathcal{A})}(1)$ of nonequivalent unbranched $\mathcal{A}$-coverings of the surface $\mathbb{S}$ is equal to the number of normal subgroups $H$ of the fundamental group $\pi_1(\mathbb{S}, *)$ such that the quotient group $\pi_1(\mathbb{S}, *)/H$ is isomorphic to $\mathcal{A}$.

Now, we are interested in the number $R_{(\mathbb{S}, B; \mathcal{A})}(1)$ and in the polynomial $R_{(\mathbb{S}, B; \mathcal{A})}(x)$. In Section 7, the number $R_{(\mathbb{S}, B; \mathcal{A})}(1)$ was discussed and computed when $\mathcal{A}$ is an abelian group. Notice that the computation of the polynomial $R_{(\mathbb{S}, B; \mathcal{A})}(x)$ is harder than the computation of the number $R_{(\mathbb{S}, B; \mathcal{A})}(1)$. The polynomial $R_{(\mathbb{S}, B; \mathcal{A})}(x)$ is known for the case when $\mathcal{A}$ is the cyclic group $\mathbb{Z}_p$ of prime order $p$ or the dihedral group $\mathbb{D}_p$ of order $2p$ (see (23), (29)). By Theorem 24 and the Riemann-Hurwitz equation, we can obtain the following, which can also be found in (23).
|B|   p = 2   p = 3    p = 5       p = 7         p = 11          p = 13           p = 17
0     0       0        0           0             0               0                0
1     0       0        0           0             0               0                0
2     1       1        1           1             1               1                1
3     0       x        3x^2        5x^3          9x^5            11x^6            15x^8
4     x       3x^2     13x^4       31x^6         91x^10          133x^12          241x^16
5     0       5x^3     51x^6       185x^9        909x^15         1595x^18         3855x^24
6     x^2     11x^4    205x^8      1111x^12      9091x^20        19141x^24        61681x^32
7     0       21x^5    819x^10     6665x^15      90909x^25       229691x^30       986895x^40
8     x^3     43x^6    3277x^12    39991x^18     909091x^30      2756293x^36      15790321x^48
9     0       85x^7    13107x^14   239945x^21    9090909x^35     33075515x^42     252645135x^56

Table 12. The polynomial R_{(S,B;Z_p)}(x) for the sphere S_0

Theorem 25 (23) Let $\mathcal{A} = \mathbb{Z}_p$ and $p$ be a prime.
(1) Let $B$ be a finite set of points in an orientable surface $\mathbb{S}_k$ ($k \ge 0$) and let $b = |B|$. Then we have
\[ a_i(\mathbb{S}_k, B; \mathbb{Z}_p) = \begin{cases} \dfrac{p^{2k} - 1}{p - 1} & \text{if } i = 1 + p(k-1),\ b = 0, \\[1.5ex] p^{2k-1} \left( (p-1)^{b-1} + (-1)^b \right) & \text{if } i = pk + \dfrac{p-1}{2}(b-2) \text{ and } b \ne 0, \\[1ex] 0 & \text{otherwise.} \end{cases} \]
(2) Let $B$ be a finite set of points in a nonorientable surface $\mathbb{S}_k$ ($k < 0$) and let $b = |B|$. Then we have
\[ a_i(\mathbb{S}_k, B; \mathbb{Z}_2) = \begin{cases} 1 & \text{if } i = -k-1,\ b = 0, \\ 2^{-k} - 2 & \text{if } i = 2(k+1),\ k \ne -1,\ b = 0, \\ 2^{-k} & \text{if } i = 2(k+1) - b,\ b \ne 0,\ b \text{ even}, \\ 0 & \text{otherwise.} \end{cases} \]
(3) Let $B$ be a finite set of points in a nonorientable surface $\mathbb{S}_k$ ($k < 0$) and let $b = |B|$. Then, for each odd prime $p$, we have
\[ a_i(\mathbb{S}_k, B; \mathbb{Z}_p) = \begin{cases} \dfrac{p^{-k-1} - 1}{p - 1} & \text{if } i = p(k+2) - 2,\ b = 0, \\[1.5ex] p^{-k-1} (p-1)^{b-1} & \text{if } i = p(k+2) - b(p-1) - 2,\ b \ne 0, \\[1ex] 0 & \text{otherwise.} \end{cases} \]

|B|   p = 2   p = 3     p = 5        p = 7          p = 11            p = 13
0     3x      4x        6x           8x             12x               14x
1     0       0         0            0              0                 0
2     4x^2    9x^3      25x^5        49x^7          121x^11           169x^13
3     0       9x^4      75x^7        245x^10        1089x^16          1859x^19
4     4x^3    27x^5     325x^9       1519x^13       11011x^21         22477x^25
5     0       45x^6     1275x^11     9065x^16       109989x^26        269555x^31
6     4x^4    99x^7     5125x^13     54439x^19      1100011x^31       3234829x^37
7     0       189x^8    20475x^15    326585x^22     10999989x^36      38817779x^43
8     4x^5    387x^9    81925x^17    1959559x^25    110000011x^41     465813517x^49
9     0       765x^10   327675x^19   11757305x^28   1099999989x^46    5589762035x^55

Table 13. The polynomial R_{(S,B;Z_p)}(x) for the torus S_1

|B|   p = 2    p = 3      p = 5        p = 7          p = 11            p = 13
0     1        0          0            0              0                 0
1     0        x^-1       x^-1         x^-1           x^-1              x^-1
2     2x^-2    2x^-3      4x^-5        6x^-7          10x^-11           12x^-13
3     0        4x^-5      16x^-9       36x^-13        100x^-21          144x^-25
4     2x^-4    8x^-7      64x^-13      216x^-19       1000x^-31         1728x^-37
5     0        16x^-9     256x^-17     1296x^-25      10000x^-41        20736x^-49
6     2x^-6    32x^-11    1024x^-21    7776x^-31      100000x^-51       248832x^-61
7     0        64x^-13    4096x^-25    46656x^-37     1000000x^-61      2985984x^-73
8     2x^-8    128x^-15   16384x^-29   279936x^-43    10000000x^-71     35831808x^-85
9     0        256x^-17   65536x^-33   1679616x^-49   100000000x^-81    429981696x^-97

Table 14. The polynomial R_{(S,B;Z_p)}(x) for the projective plane S_{-1}
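The entries of Tables 12-14 follow mechanically from the three cases of Theorem 25. A minimal sketch, with a hypothetical function name, that returns the distribution polynomial as a dictionary mapping the exponent $i$ to the coefficient $a_i(\mathbb{S}_k, B; \mathbb{Z}_p)$:

```python
def zp_distribution(k, b, p):
    """Return {i: a_i(S_k, B; Z_p)} for |B| = b, dropping zero coefficients."""
    terms = {}

    def add(i, coefficient):
        if coefficient:
            terms[i] = terms.get(i, 0) + coefficient

    if k >= 0:                                   # Theorem 25 (1)
        if b == 0:
            add(1 + p * (k - 1), (p ** (2 * k) - 1) // (p - 1))
        else:
            add(p * k + (p - 1) * (b - 2) // 2,
                p ** (2 * k) * ((p - 1) ** (b - 1) + (-1) ** b) // p)
    elif p == 2:                                 # Theorem 25 (2)
        if b == 0:
            add(-k - 1, 1)
            add(2 * (k + 1), 2 ** (-k) - 2)
        elif b % 2 == 0:
            add(2 * (k + 1) - b, 2 ** (-k))
    else:                                        # Theorem 25 (3)
        if b == 0:
            add(p * (k + 2) - 2, (p ** (-k - 1) - 1) // (p - 1))
        else:
            add(p * (k + 2) - b * (p - 1) - 2, p ** (-k - 1) * (p - 1) ** (b - 1))
    return terms
```

For example, `zp_distribution(0, 4, 5)` gives the single term $13x^4$ of Table 12, and `zp_distribution(-1, 3, 3)` the term $4x^{-5}$ of Table 14. Summing the coefficients recovers the totals of Theorem 24.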
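The following can be found in (29).

Theorem 26 (29) Let $\mathcal{A} = \mathbb{D}_p$ and $p$ be an odd prime.
(1) Let $B$ be a finite subset of the sphere $\mathbb{S}_0$ and let $b = |B|$. Then we have
\[ a_i(\mathbb{S}_0, B; \mathbb{D}_p) = \begin{cases} \dbinom{b}{2s} p^{2s-2} (p-1)^{b-2s-1} & \text{if } i = b(p-1) + 1 - s(p-2) - 2p \text{ for } 1 \le s \le \left\lfloor \frac{b-1}{2} \right\rfloor,\ b \ge 3, \\[2ex] \dfrac{p^{b-2} - 1}{p - 1} & \text{if } i = p\left( \dfrac{b-4}{2} \right) + 1,\ b\ (\ge 3) \text{ is even}, \\[1.5ex] 0 & \text{otherwise.} \end{cases} \]
(2) Let $B$ be a finite subset of an orientable surface $\mathbb{S}_k$ ($k > 0$) and let $b = |B|$. Then we have
\[ a_i(\mathbb{S}_k, B; \mathbb{D}_p) = \begin{cases} (4^k - 1)\, \dfrac{p^{2k-2} - 1}{p - 1} & \text{if } i = 2p(k-1) + 1,\ b = 0, \\[1.5ex] (4^k - 1)\, p^{2k-2} (p-1)^{b-1} & \text{if } i = 2p(k-1) + b(p-1) + 1,\ b \ne 0, \\[1ex] 4^k \dbinom{b}{2s} p^{2k+2s-2} (p-1)^{b-2s-1} & \text{if } i = 2p(k-1) + b(p-1) + 1 - s(p-2) \text{ for } 1 \le s \le \left\lfloor \frac{b-1}{2} \right\rfloor,\ b \ne 0, \\[2ex] 4^k\, \dfrac{p^{2k+b-2} - 1}{p - 1} & \text{if } i = p\left( 2(k-1) + \dfrac{b}{2} \right) + 1,\ b\ (\ne 0) \text{ is even}, \\[1.5ex] 0 & \text{otherwise.} \end{cases} \]
(3) Let $B$ be a finite set of points in a nonorientable surface $\mathbb{S}_k$ ($k < 0$) and let $b = |B|$. Then we have
\[ a_i(\mathbb{S}_k, B; \mathbb{D}_p) = \begin{cases} \dfrac{p^{-k-1} - 1}{p - 1} & \text{if } i = 1 + p(-k-2),\ b = 0, \\[1.5ex] (2^{-k} - 2)\, \dfrac{p^{-k-2} - 1}{p - 1} & \text{if } i = 2p(k+2) - 2,\ b = 0, \\[1.5ex] p^{-k-2} \left( (p-1)^{b-1} + (-1)^b \right) & \text{if } i = p(-k-2) + b(p-1) + 1,\ b \ne 0, \\[1ex] (2^{-k} - 2)\, p^{-k-2} (p-1)^{b-1} & \text{if } i = 2p(k+2) - 2b(p-1) - 2,\ b \ne 0, \\[1ex] 2^{-k} \dbinom{b}{2s} p^{-k+2s-2} (p-1)^{b-2s-1} & \text{if } i = 2p(k+2) - 2b(p-1) + 2s(p-2) - 2 \text{ for } 1 \le s \le \left\lfloor \frac{b-1}{2} \right\rfloor,\ b \ne 0, \\[2ex] 2^{-k}\, \dfrac{p^{-k+b-2} - 1}{p - 1} & \text{if } i = p(2k + 4 - b) - 2,\ b\ (\ne 0) \text{ is even}, \\[1.5ex] 0 & \text{otherwise.} \end{cases} \]

From Theorem 26, we have the following.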
|B|   p = 3                                     p = 5
0     0                                         0
1     0                                         0
2     0                                         0
3     3                                         3
4     4x + 12x^2                                6x + 24x^4
5     45x^3 + 40x^4                             125x^5 + 160x^8
6     40x^4 + 270x^5 + 120x^6                   156x^6 + 1500x^9 + 960x^12
7     567x^6 + 1260x^7 + 336x^8                 4375x^10 + 14000x^13 + 5376x^16
8     364x^7 + 4536x^8 + 5040x^9 + 896x^10      3906x^11 + 70000x^14 + 112000x^17 + 28672x^20

|B|   p = 7                                             p = 11
0     0                                                 0
1     0                                                 0
2     0                                                 0
3     3                                                 3
4     8x + 36x^6                                        12x + 60x^10
5     245x^7 + 360x^12                                  605x^11 + 1000x^20
6     400x^8 + 4410x^13 + 3240x^18                      1464x^12 + 18150x^21 + 15000x^30
7     16807x^14 + 61740x^19 + 27216x^24                 102487x^22 + 423500x^31 + 210000x^40
8     19608x^15 + 403368x^20 + 740880x^25 + 217728x^30

Table 15. The polynomial R_{(S,B;D_p)}(x) for the sphere S_0

|B|   p = 3                                          p = 5
0     0                                              0
1     3x^3                                           3x^5
2     16x^4 + 6x^5                                   24x^6 + 12x^9
3     108x^6 + 12x^7                                 300x^10 + 48x^13
4     160x^7 + 432x^8 + 24x^9                        624x^11 + 2400x^14 + 192x^17
5     1620x^9 + 1440x^10 + 48x^11                    12500x^15 + 16000x^18 + 768x^21
6     1456x^10 + 9720x^11 + 4320x^12 + 96x^13        15624x^16 + 150000x^19 + 96000x^22 + 3072x^25

|B|   p = 7                                              p = 11
0     0                                                  0
1     3x^7                                               3x^11
2     32x^8 + 18x^13                                     48x^12 + 30x^21
3     588x^14 + 108x^19                                  1452x^22 + 300x^31
4     1600x^15 + 7056x^20 + 648x^25                      5856x^23 + 29040x^32 + 3000x^41
5     48020x^21 + 70560x^26 + 3888x^31                   292820x^33 + 484000x^42 + 30000x^51
6     78432x^22 + 864360x^27 + 635040x^32 + 23328x^37

Table 16. The polynomial R_{(S,B;D_p)}(x) for the torus S_1

|B|   p = 3                                             p = 5
0     0                                                 0
1     0                                                 0
2     2x^-2 + x^2                                       2x^-2 + x^4
3     18x^-6 + x^4                                      30x^-10 + 3x^8
4     72x^-10 + 26x^-8 + 3x^6                           240x^-18 + 62x^-12 + 13x^12
5     240x^-14 + 270x^-12 + 5x^8                        1600x^-26 + 1250x^-20 + 51x^16
6     720x^-18 + 1620x^-16 + 242x^-14 + 11x^10          9600x^-34 + 15000x^-28 + 1562x^-22 + 205x^20

|B|   p = 7                                             p = 11
0     0                                                 0
1     0                                                 0
2     2x^-2 + x^6                                       2x^-2 + x^10
3     42x^-14 + 5x^12                                   66x^-22 + 9x^20
4     504x^-26 + 114x^-16 + 31x^18                      1320x^-42 + 266x^-24 + 91x^30
5     5040x^-38 + 3430x^-28 + 185x^24                   22000x^-62 + 13310x^-44 + 909x^40
6     45360x^-50 + 61740x^-40 + 5602x^-30 + 1111x^30

Table 17. The polynomial R_{(S,B;D_p)}(x) for the projective plane S_{-1}

Corollary 3 Let $\mathcal{A} = \mathbb{D}_p$ and $p$ be an odd prime.
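(1) Let $B$ be a finite set of points in the sphere $\mathbb{S}_0$ and let $b = |B|$. Then
\[ \mathrm{Isoc}(\mathbb{S}_0, B; \mathbb{D}_p) = \begin{cases} \displaystyle\sum_{s=1}^{\lfloor \frac{b-1}{2} \rfloor} \binom{b}{2s} p^{2s-2} (p-1)^{b-2s-1} + \frac{p^{b-2} - 1}{p - 1} \cdot \frac{1 + (-1)^b}{2} & \text{if } b \ge 3, \\[2ex] 0 & \text{otherwise.} \end{cases} \]
(2) Let $B$ be a finite set of points in an orientable surface $\mathbb{S}_k$ ($k > 0$) and let $b = |B|$. Then we have
\[ \mathrm{Isoc}(\mathbb{S}_k, B; \mathbb{D}_p) = \begin{cases} (4^k - 1)\, \dfrac{p^{2k-2} - 1}{p - 1} & \text{if } b = 0, \\[2ex] (4^k - 1)\, p^{2k-2} (p-1)^{b-1} + 4^k \displaystyle\sum_{s=1}^{\lfloor \frac{b-1}{2} \rfloor} \binom{b}{2s} p^{2k+2s-2} (p-1)^{b-2s-1} + 4^k\, \frac{p^{2k+b-2} - 1}{p - 1} \cdot \frac{1 + (-1)^b}{2} & \text{if } b \ne 0. \end{cases} \]
(3) Let $B$ be a finite set of points in a nonorientable surface $\mathbb{S}_k$ ($k < 0$) and let $b = |B|$. Then we have
\[ \mathrm{Isoc}(\mathbb{S}_k, B; \mathbb{D}_p) = \begin{cases} \dfrac{p^{-k-1} - 1}{p - 1} + (2^{-k} - 2)\, \dfrac{p^{-k-2} - 1}{p - 1} & \text{if } b = 0, \\[2ex] p^{-k-2} \left( (p-1)^{b-1} + (-1)^b \right) + p^{-k-2} (2^{-k} - 2)(p-1)^{b-1} + \displaystyle\sum_{s=1}^{\lfloor \frac{b-1}{2} \rfloor} \binom{b}{2s} 2^{-k} p^{-k+2s-2} (p-1)^{b-2s-1} + 2^{-k}\, \frac{p^{-k+b-2} - 1}{p - 1} \cdot \frac{1 + (-1)^b}{2} & \text{if } b \ne 0. \end{cases} \]

(|B|,p)   k = -4   k = -3   k = -2   k = -1   k = 0   k = 1   k = 2   k = 3     k = 4
(0,3)     69       10       1        0        0       0       60      2520      92820
(0,5)     115      12       1        0        0       0       90      9828      996030
(1,3)     0        0        0        0        0       3       135     5103      185895
(1,5)     0        0        0        0        0       3       375     39375     3984375
(2,3)     919      149      23       3        0       22      910     33502     1211470
(2,5)     4021     393      37       3        0       36      3996    407484    40937436

Table 18. The number Isoc(S_k, B; D_p) for small k, p and small |B|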
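The sphere case of Theorem 26 is short enough to evaluate directly, and summing the coefficients gives the count of Corollary 3 (1). The following sketch (hypothetical function name) returns the polynomial $R_{(\mathbb{S}_0, B; \mathbb{D}_p)}(x)$ as a dictionary from exponents to coefficients.

```python
from math import comb

def dp_sphere_distribution(b, p):
    """Return {i: a_i(S_0, B; D_p)} for |B| = b and an odd prime p."""
    terms = {}
    if b < 3:
        return terms
    # 2s branch points carry a reflection voltage, the remaining b - 2s a rotation
    for s in range(1, (b - 1) // 2 + 1):
        i = b * (p - 1) + 1 - s * (p - 2) - 2 * p
        terms[i] = terms.get(i, 0) + comb(b, 2 * s) * p ** (2 * s - 2) * (p - 1) ** (b - 2 * s - 1)
    if b % 2 == 0:
        i = p * (b - 4) // 2 + 1
        terms[i] = terms.get(i, 0) + (p ** (b - 2) - 1) // (p - 1)
    return terms
```

For instance, `dp_sphere_distribution(6, 3)` reproduces the entry $40x^4 + 270x^5 + 120x^6$ of Table 15, and `sum(dp_sphere_distribution(b, p).values())` is the number given in Corollary 3 (1).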
From Theorems 25 and 26, we can make Tables 12-17, and derive many interesting topological properties of branched regular surface coverings. We list some of them in the following. A group $\mathcal{A}$ action on a surface $\mathbb{S}$ is pseudofree if the number of fixed points of the action is finite, i.e., the cardinality of the set $\{x \in \mathbb{S} \mid gx = x \text{ for some } g \ne \mathrm{id} \text{ in } \mathcal{A}\}$ is finite. A group action on a surface is spherical if the quotient surface of the action is homeomorphic to the sphere.

1. For any $k > 0$, there are exactly $4^k - 1$ nonequivalent connected unbranched double coverings of $\mathbb{S}_k$, and all of their covering surfaces are $\mathbb{S}_{2k-1}$.
2. For any surface $\mathbb{S}$, there does not exist a connected branched double covering of $\mathbb{S}$ with an odd number of branch points.
3. For any $k \ge 0$ and even number $2b$, $b \ge 1$, there are exactly $4^k$ nonequivalent connected branched double coverings of $\mathbb{S}_k$ having given $2b$ branch points, and all of their covering surfaces are $\mathbb{S}_{2k+b-1}$.
4. There exists a unique connected unbranched double covering of the projective plane $\mathbb{S}_{-1}$ up to equivalence, and its covering surface is the sphere. For any $k \le -2$, there exist $2^{-k} - 1$ connected unbranched double coverings of $\mathbb{S}_k$ up to equivalence, and one of their covering surfaces is the orientable surface $\mathbb{S}_{-k-1}$ and all others are the nonorientable surface $\mathbb{S}_{2(k+1)}$.
5. For any $k \le -1$ and even number $2b$, $b \ge 1$, there are exactly $2^{-k}$ nonequivalent connected branched double coverings of $\mathbb{S}_k$ having given $2b$ branch points, and all of their covering surfaces are the nonorientable surface $\mathbb{S}_{2(k-b+1)}$.
6. Every orientable surface is a branched double covering of the sphere $\mathbb{S}^2$. Every nonorientable surface is a branched double or triple covering of the projective plane $\mathbb{S}_{-1}$ (this is Alexander's theorem).
7. Let $p$ be a prime $> 2$. Then the dihedral group $\mathbb{D}_p$ can act freely on the surface $\mathbb{S}_k$ if and only if either $k > 1$ and $k - 1 \equiv 0 \pmod{p}$ or $k < -3$ and $k + 2 \equiv 0 \pmod{2p}$. Moreover,
   i. if $k > 1$, $k - 1 \equiv 0 \pmod{p}$ and $k - 1 \not\equiv 0 \pmod{2p}$, then $\mathbb{S}_k/\mathbb{D}_p$ is the nonorientable surface $\mathbb{S}_{\frac{1-k}{p} - 2}$;
   ii. if $k > 2$ and $k - 1 \equiv 0 \pmod{2p}$, then $\mathbb{S}_k/\mathbb{D}_p$ is either the orientable surface $\mathbb{S}_{\frac{k-1}{2p} + 1}$ or the nonorientable surface $\mathbb{S}_{\frac{1-k}{p} - 2}$;
   iii. if $k < -3$ and $k + 2 \equiv 0 \pmod{2p}$, then $\mathbb{S}_k/\mathbb{D}_p$ is the nonorientable surface $\mathbb{S}_{\frac{k+2}{2p} - 2}$.
8. For any prime $p > 2$, a surface $\mathbb{S}_k$ has a spherical pseudofree $\mathbb{D}_p$-action if and only if $k = (p-1)m + n$, where $m, n > 0$ and $m + 1 > n$. Moreover, for such a $k = (p-1)m + n$, the number of branch points of the $\mathbb{D}_p$-covering $p : \mathbb{S}_k \to \mathbb{S}_k/\mathbb{D}_p = \mathbb{S}_0$ is $m + n + 3$.
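The covering surfaces named in items 1-5 can be recovered from the Riemann-Hurwitz equation alone. The sketch below assumes the indexing convention used throughout this section, namely that $\mathbb{S}_k$ is the orientable surface of genus $k$ for $k \ge 0$ (so $\chi = 2 - 2k$) and the nonorientable surface with $-k$ crosscaps for $k < 0$ (so $\chi = 2 + k$); the function names are hypothetical.

```python
def euler_characteristic(k):
    """chi(S_k) under the assumed convention: 2 - 2k for k >= 0, 2 + k for k < 0."""
    return 2 - 2 * k if k >= 0 else 2 + k

def covering_euler_characteristic(k, n, fibre_sizes):
    """Riemann-Hurwitz: chi of an n-fold covering of S_k.

    fibre_sizes lists |p^{-1}(b)| for every branch point b, so that
    def(b) = n - |p^{-1}(b)|.
    """
    return n * euler_characteristic(k) - sum(n - size for size in fibre_sizes)

def surface_index(chi, orientable):
    """Index i of the closed connected surface S_i with Euler characteristic chi."""
    if orientable:
        assert chi % 2 == 0
        return (2 - chi) // 2
    return chi - 2
```

For a double covering every branch point has a single preimage, so `fibre_sizes` is a list of ones. With $k = 1$ and four branch points, `covering_euler_characteristic(1, 2, [1, 1, 1, 1])` equals $-4$, which is the orientable surface $\mathbb{S}_3 = \mathbb{S}_{2k+b-1}$ with $b = 2$, as stated in item 3.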
Acknowledgement

The authors are grateful to Professor A.D. Mednykh for his careful remarks and valuable suggestions on a draft version of this manuscript.

References

1. J.W. Alexander, Note on Riemann spaces, Bull. Amer. Math. Soc. 26 (1920) 370-372.
2. D. Archdeacon, J.H. Kwak, J. Lee and M.Y. Sohn, Bipartite covering graphs, Discrete Math. 214 (2000) 51-63.
3. R. Feng, J.H. Kwak, J. Kim and J. Lee, Isomorphism classes of concrete graph coverings, SIAM J. Discrete Math. 11 (1998) 265-272.
4. J.L. Gross and T.W. Tucker, Generating all graph coverings by permutation voltage assignments, Discrete Math. 18 (1977) 273-283.
5. J.L. Gross and T.W. Tucker, Topological Graph Theory, Wiley, New York (1987).
6. P. Hall, The Euclidean functions of a group, Quart. J. Math. Oxford 7 (1936) 134-151.
7. M. Hall, Jr., Subgroups of finite index in free groups, Canadian J. Math. 1 (1949) 187-190.
8. M. Hall, Jr., The Theory of Groups, Macmillan, New York (1959).
9. M. Hofmeister, Counting double covers of graphs, J. Graph Theory 12 (1988) 437-444.
10. M. Hofmeister, Isomorphisms and automorphisms of coverings, Discrete Math. 98 (1991) 175-183.
11. M. Hofmeister, Concrete graph covering projections, Ars Combin. 32 (1991) 121-127.
12. M. Hofmeister, Graph covering projections arising from finite vector spaces over finite fields, Discrete Math. 143 (1995) 87-97.
13. M. Hofmeister, Enumeration of concrete regular covering projections, SIAM J. Discrete Math. 8 (1995) 51-61.
14. M. Hofmeister, A note on counting connected graph covering projections, SIAM J. Discrete Math. 11 (1998) 286-292.
15. S. Hong and J.H. Kwak, Regular fourfold coverings with respect to the identity automorphism, J. Graph Theory 15 (1993) 621-627.
16. S. Hong, J.H. Kwak and J. Lee, Regular graph coverings whose covering transformation groups have the isomorphism extension property, Discrete Math. 148 (1996) 85-105.
17. S. Hong, J.H. Kwak and J. Lee, Bipartite graph bundles with connected fibers, Bull. Austral. Math. Soc. 59 (1999) 153-161.
18. A. Hurwitz, Über Riemann'sche Flächen mit gegebenen Verzweigungspunkten, Math. Ann. 39 (1891) 1-61.
19. A. Hurwitz, Über die Anzahl der Riemann'sche Flächen mit gegebenen Verzweigungspunkten, Math. Ann. 55 (1902) 53-66.
20. G.A. Jones, Enumeration of homomorphisms and surface-coverings, Quart. J. Math. Oxford (2) 46 (1995) 485-507.
21. G.A. Jones, Counting subgroups of non-Euclidean crystallographic groups, Math. Scand. 84 (1999) 23-39.
22. J.H. Kwak, J. Chun and J. Lee, Enumeration of regular graph coverings having finite abelian covering transformation groups, SIAM J. Discrete Math. 11 (1998) 273-285.
23. J.H. Kwak, S. Kim and J. Lee, Distributions of regular branched prime-fold coverings of surfaces, Discrete Math. 156 (1996) 141-170.
24. J.H. Kwak and J. Lee, Isomorphism classes of graph bundles, Canad. J. Math. XLII (1990) 747-761.
25. J.H. Kwak and J. Lee, Counting some finite-fold coverings of a graph, Graphs and Combinatorics 8 (1992) 277-285.
26. J.H. Kwak and J. Lee, Isomorphism classes of cycle permutation graphs, Discrete Math. 105 (1992) 131-142.
27. J.H. Kwak and J. Lee, Enumeration of graph coverings and its applications, Graph Theory, Combinatorics, Algorithms, and Applications; Proceedings of the 7th quadrennial international conference on the theory and applications of graphs (Y. Alavi et al., eds), Wiley (1995) 649-659.
28. J.H. Kwak and J. Lee, Enumeration of connected graph coverings, J. Graph Theory 23 (1996) 105-109.
29. J.H. Kwak and J. Lee, Distributions of branched Dp-coverings of surfaces, Discrete Math. 183 (1998) 193-212.
30. J.H. Kwak, A.D. Mednykh and J. Lee, Enumerating surface branched coverings, manuscript.
31. V. Liskovets, Towards the enumeration of subgroups of the free group, Dokl. Akad. Nauk BSSR 15 (1971) 6-9 (in Russian).
32. V. Liskovets, Reductive enumeration under mutually orthogonal group actions, Acta Applicandae Mathematicae 52 (1998) 91-120.
33. A. Lubotzky, Counting finite index subgroups, Proc. Conf. "Groups'93 Galway/St Andrews", London Math. Soc. Lect. Note Ser. 212 (1995) 368-404.
34. W. Magnus, A. Karrass and D. Solitar, Combinatorial Group Theory, Dover, New York (1976).
35. W.S. Massey, A basic course in algebraic topology, Springer-Verlag, New York (1991).
36. A.D. Mednykh, Determination of the number of nonequivalent coverings over a compact Riemann surface, Soviet Math. Dokl. 19 (1978) 318-320.
37. A.D. Mednykh, On unramified coverings of compact Riemann surfaces, Soviet Math. Dokl. 20 (1979) 85-88.
38. A.D. Mednykh, Hurwitz problem on the number of nonequivalent coverings of a compact Riemann surface, Siber. Math. J. 23 (1982) 415-420.
39. A.D. Mednykh, On the number of subgroups in the fundamental group of a closed surface, Communications in Algebra 16 (1988) 2137-2148.
40. A.D. Mednykh and G.G. Pozdnyakova, Number of nonequivalent coverings over a nonorientable compact surface, Siber. Math. J. 27 (1986) 99-106.
41. J.M. Montesinos, Representing 3-manifolds by a universal branching set, Math. Proc. Cambridge Phil. Soc. 94 (1983) 109-123.
42. I. Sato, Isomorphisms of some coverings, Discrete Math. 128 (1994) 317-326.
43. S. Stahl, Generalized embedding schemes, J. Graph Theory 2 (1978) 41-52.
44. M. Suzuki, Group Theory I, Springer-Verlag, New York (1982).
45. M. Suzuki, Group Theory II, Springer-Verlag, New York (1986).
46. V.D. Tonchev, Combinatorial Configurations: Designs, Codes, Graphs, English version, Wiley, New York (1988).
AN OVERVIEW OF THE POSET OF IRREDUCIBLES

GEORGE MARKOWSKY
Computer Science Department, University of Maine, Orono, ME 04469-5752, U.S.A.
1 Introduction
An interesting fact that is of great practical importance is that finite lattices have an associated poset, called the poset of irreducibles, that acts much like the basis of a vector space. The poset of irreducibles of a finite lattice provides a compact representation of the lattice from which many of the properties of the lattice can be deduced easily. This paper is dedicated to explaining the poset of irreducibles and providing some examples of its usefulness. Proofs are omitted except for the very simple ones, but all results can be found in the references located at the end of this paper. These results can be extended to infinite lattices, but we will not discuss such extensions here. The interested reader is invited to read (4) and (5) for more details. (12) provides additional historical and motivational material, which might be of interest to the reader.

The poset of irreducibles generalizes the construction used by Garrett Birkhoff (2) to provide a representation for finite distributive lattices. Birkhoff proved that a distributive lattice, L, is isomorphic to the lattice of all closed from below subsets of the poset consisting of the join-irreducible elements of L in the induced order. An interesting extension of this result is that the connected components of the poset of join-irreducibles correspond to the posets of join-irreducibles of the Cartesian factors of a lattice and that the automorphism group of the poset of join-irreducibles is isomorphic to the automorphism group of the lattice. Since in a distributive lattice the poset of meet-irreducibles is isomorphic to the poset of join-irreducibles, it is sufficient to work with either the join-irreducibles or the meet-irreducibles. Figure 1 illustrates Birkhoff's Theorem.

Definition 1 A join-irreducible element, j, of a lattice, L, has the property that j = sup S, where S is a subset of L, implies that j is in S. □

The bottom element of a lattice is never join-irreducible since it is the join of the empty set. There is a dual definition of meet-irreducible. The focus on irreducibles is a key aspect of a combinatorial approach to lattice theory. In this approach, one focuses on things such as the Hasse diagram of a lattice rather than on algebraic identities satisfied by elements of the lattice. Especially satisfying are results that connect algebraic properties to combinatorial properties.
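Definition 1 and Birkhoff's representation can be tried out on a small example. The sketch below is an illustration only: a finite lattice is represented by a list of elements together with an order predicate `leq` (both hypothetical names), the join-irreducibles are found by testing whether an element is the supremum of the elements strictly below it, and the closed from below subsets of the join-irreducibles are then enumerated. The example lattice, the divisors of 12 under divisibility, is a choice made here and does not come from the text.

```python
from itertools import combinations

def join_irreducibles(elements, leq):
    """Elements j that are not the supremum of the elements strictly below j."""
    def sup(subset):
        uppers = [u for u in elements if all(leq(x, u) for x in subset)]
        # the least upper bound exists because the input is assumed to be a lattice
        return next(u for u in uppers if all(leq(u, v) for v in uppers))
    return [j for j in elements
            if sup([x for x in elements if leq(x, j) and x != j]) != j]

def down_sets(poset, leq):
    """All subsets of poset that are closed from below."""
    result = []
    for size in range(len(poset) + 1):
        for subset in combinations(poset, size):
            if all(x in subset for y in subset for x in poset if leq(x, y)):
                result.append(set(subset))
    return result

# Example: the divisors of 12 ordered by divisibility form a distributive lattice.
divisors = [1, 2, 3, 4, 6, 12]
divides = lambda x, y: y % x == 0
irreducibles = join_irreducibles(divisors, divides)
```

Here the bottom element 1 is excluded automatically, because the supremum of the empty set is the bottom element. The join-irreducibles are 2, 3 and 4, and their poset has exactly six closed from below subsets, one for each element of the lattice, as Birkhoff's theorem predicts for a distributive lattice.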
Figure 1. An Illustration of Birkhoff's Theorem (a distributive lattice shown as a lattice of subsets, its factors, and its poset of join-irreducibles)
One example of such a theorem is a result that I discovered in my thesis (4), but which I later found had been discovered a decade earlier by Avann (1). The result is the following:

Theorem 1 (Avann, Markowsky) A finite lattice is distributive if and only if
1. The number of meet-irreducibles equals the length of the lattice.
2. The number of join-irreducibles equals the length of the lattice.
3. The lattice satisfies the Jordan-Dedekind chain condition.
□
Figure 2 shows three simple lattices that illustrate the graphical test for distributivity. In the first case, the lattice satisfies the Jordan-Dedekind chain condition, but both the number of join-irreducibles and the number of meet-irreducibles are greater than the length of the lattice. In the second case, the lattice has more join-irreducibles than either the number of meet-irreducibles or the length of the lattice. In the third case, the lattice satisfies all the requirements and is distributive.
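The graphical test of Theorem 1 is straightforward to mechanize. The following sketch is an illustration under stated assumptions rather than a reference implementation: a finite lattice is given by its elements and an order predicate, join-irreducibles and meet-irreducibles are identified as the elements with exactly one lower, respectively upper, cover in the Hasse diagram, and the Jordan-Dedekind chain condition is checked by comparing the lengths of all maximal chains from bottom to top. The three example lattices (the divisors of 12, M3 and the pentagon N5) are choices made here and are not necessarily the lattices drawn in Figure 2.

```python
def upper_covers(elements, leq):
    """Map each element to the list of elements covering it in the Hasse diagram."""
    def lt(x, y):
        return x != y and leq(x, y)
    return {x: [y for y in elements
                if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)]
            for x in elements}

def graphical_test(elements, leq):
    """Check the three conditions of Theorem 1 on a finite lattice."""
    up = upper_covers(elements, leq)
    down = {y: [x for x in elements if y in up[x]] for y in elements}
    join_irr = sum(1 for x in elements if len(down[x]) == 1)
    meet_irr = sum(1 for x in elements if len(up[x]) == 1)
    bottom = next(x for x in elements if not down[x])

    def chain_lengths(x):
        """Lengths of all maximal chains from x up to the top element."""
        if not up[x]:
            return {0}
        return {1 + n for y in up[x] for n in chain_lengths(y)}

    lengths = chain_lengths(bottom)
    length = max(lengths)
    return meet_irr == length and join_irr == length and len(lengths) == 1

divisors = [1, 2, 3, 4, 6, 12]
is_distributive = graphical_test(divisors, lambda x, y: y % x == 0)

m3 = ["0", "a", "b", "c", "1"]
m3_passes = graphical_test(m3, lambda x, y: x == y or x == "0" or y == "1")

n5 = ["0", "a", "b", "c", "1"]
n5_passes = graphical_test(
    n5, lambda x, y: x == y or x == "0" or y == "1" or (x, y) == ("a", "b"))
```

M3 fails because it has three join-irreducibles but length 2, while N5 has the right numbers of irreducibles but violates the chain condition, so all three conditions are needed.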
Figure 2. Examples of the Graphical Test for Distributivity (panel labels: "NO! Too Short."; "NO! Too Many Join-Irreducibles."; "YES!")
2 The Poset of Irreducibles
It seems clear that for general lattices both the join-irreducible and meet-irreducible elements need to be considered. Since elements can be both join-irreducible and meet-irreducible, it seems reasonable to consider a bipartite graph where an element can appear twice if necessary. One natural construction is to put the meet-irreducibles in a row over a row of join-irreducibles and connect an element in the top row to an element in the bottom row if the top element is $\geq$ the element in the bottom row. Interestingly enough, a more useful construction is to connect the top element to the bottom element iff the top element is $\not\geq$ the bottom element. The big advantage of the second construction over the first is that the Cartesian factors of a lattice can be read directly from the associated poset, because the connected components of the poset (when the poset is considered a graph) correspond to the Cartesian factors of the lattice. Figure 3 shows the lattice M3, the induced order on the irreducibles, a bipartite graph using the induced order to relate the two rows of irreducibles, and finally the complementary order on the irreducibles. The induced order has the undesirable property that it splits into 3 connected components while the lattice does not have direct factors. The complementary order, on the other hand, has only a single connected component. Figure 4 shows the same constructions applied to the Boolean algebra with 3 atoms. Note that in this case the lattice has 3 direct factors, while
Figure 3. Illustration of the Poset of Irreducibles and Related Constructions (panel labels: Lattice; Induced Order; Extended Induced Order; Complementary Extended Induced Order)
the bipartite directed graph (bidigraph for short) that uses the induced order consists of only one connected component. On the other hand, the bidigraph derived from the complementary order has 3 connected components.
Definition 2 Given a finite lattice L, the poset of irreducibles, P(L), is the poset formed by putting all the join-irreducibles of L in a row and then placing all the meet-irreducibles in a row above the join-irreducibles, and ordering them as follows. In P(L), a meet-irreducible element, m, is above a join-irreducible element, j, iff $m \not\geq j$ in L.
The Poset of Irreducibles was introduced in my thesis in 1972-73 [4], and developed in a series of papers published from 1973 through 1994. In 1982 Wille [14], in a paper entitled Restructuring Lattice Theory, introduced the terms concept lattices and context. A context is the same thing as the bipartite poset of irreducibles discussed above, but with the induced order. As noted earlier, this construction does not make evident the Cartesian factorization of a lattice, but is simply the dual of the Poset of Irreducibles construction. Even though Wille and his school have been aware of my work on the Poset of Irreducibles since 1973, they have not acknowledged it in their work. The technique for recovering a lattice from its poset of irreducibles is
Figure 4. Illustration of the Poset of Irreducibles and Related Constructions (panel labels: Lattice; Induced Order; Extended Induced Order; Complementary Extended Induced Order)
fairly straightforward:
1. For each element on the bottom row, form the set of all elements in the top row that are connected to it.
2. The set of unions of all such sets (we include the empty set as the empty union) ordered by set inclusion is isomorphic to the original lattice.
Figure 5 shows the basic reconstruction process. Note that we use the abbreviation Rep to represent the set of meet-irreducibles linked to a particular join-irreducible on the bottom row. The calculation of the three Reps in Figure 5 is straightforward, as is the construction of all unions. It is easy to see that we recover the original lattice in this way. Figure 6 shows the same construction for the 3-atom Boolean algebra. Reconstruction can also be done from different perspectives such as Galois connections and the lattice of maximal antichains. For details see [6]. There has been some interesting work done in the area of lattice reconstruction by Morvan and Nourine [13], and by Jourdan, Rampon and Jard [3].
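The union construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reconstruct_lattice` and the dictionary encoding of the Reps are choices made here, and the two inputs are the Reps for M3 and for the 3-atom Boolean algebra as they can be read from Figures 5 and 6 (the labels for the Boolean algebra are a best reading of the figure).

```python
def reconstruct_lattice(rep):
    """Return every union of the sets in rep.values(), including the empty union.

    rep maps each bottom-row (join-irreducible) label to the set of top-row
    (meet-irreducible) labels linked to it. Ordered by set inclusion, the
    returned family of sets is the reconstructed lattice.
    """
    generators = [frozenset(s) for s in rep.values()]
    family = {frozenset()}            # the empty union is the bottom element
    frontier = [frozenset()]
    while frontier:
        current = frontier.pop()
        for g in generators:
            candidate = current | g
            if candidate not in family:
                family.add(candidate)
                frontier.append(candidate)
    return family


# Reps for M3 (assumed encoding of Figure 5).
m3 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
# Reps for the Boolean algebra with 3 atoms (assumed encoding of Figure 6).
b3 = {"a": {"f"}, "b": {"e"}, "c": {"d"}}

for name, rep in (("M3", m3), ("B3", b3)):
    lattice = reconstruct_lattice(rep)
    print(name, len(lattice), sorted(sorted(s) for s in lattice))
```

For M3 the unions are the empty set, the three Reps, and {a, b, c}, which gives back the five elements of the lattice; for the Boolean algebra all eight subsets of {d, e, f} appear.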
Rep(a) = {b, c}, Rep(b) = {a, c}, Rep(c) = {a, b}
Unions: {a, b, c}; {b, c}, {a, b}, {a, c}; {}
Figure 5. Reconstructing the Lattice
Rep(a) = {f}, Rep(b) = {e}, Rep(c) = {d}
Figure 6. Reconstructing the Boolean Algebra
To summarize the preceding discussion, we note that the Poset of Irreducibles of a lattice L, denoted by P(L), is a potentially compact representation of the lattice. In my thesis I proved a generalized form of the following result, which can be used with some infinite lattices.
Theorem 2 (Markowsky [4] and [6]) Let L be a finite lattice. Its poset of irreducibles has the following properties:
1. L can be easily reconstructed from P(L) using the union construction.
2. The connected components of P(L) are the posets of irreducibles of the direct factor lattices whose product is L.
3. The group of all order preserving automorphisms of L is isomorphic to the group of all order preserving automorphisms of P(L).
In general, P(L) is significantly smaller than L. In the case of Boolean algebras, P(L) is exponentially smaller than L. Throughout this paper we will use J(L) to denote the set of join-irreducibles of a lattice L, and M(L) to denote the set of meet-irreducibles of L. Let's consider one more example. Figure 7 shows that P(L) makes it easy to spot the fact that a lattice can be factored directly and that the factors can be computed directly from the components of P(L).
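Item 2 of Theorem 2 suggests a simple computational check: count the connected components of P(L) when it is viewed as an undirected graph. The sketch below is an illustration only; the function name `components`, the tagging of nodes as ("J", label) and ("M", label), and the two sample inputs (the Reps of M3 and of the 3-atom Boolean algebra, as read from Figures 5 and 6) are assumptions made here.

```python
def components(rep):
    """Connected components of the bidigraph, viewed as an undirected graph.

    rep maps each join-irreducible label to the set of meet-irreducible
    labels linked to it. Nodes are tagged ("J", label) for the bottom row
    and ("M", label) for the top row, because one lattice element may
    appear in both rows.
    """
    neighbours = {}
    for j, tops in rep.items():
        neighbours.setdefault(("J", j), set())
        for m in tops:
            neighbours[("J", j)].add(("M", m))
            neighbours.setdefault(("M", m), set()).add(("J", j))

    seen, result = set(), []
    for start in neighbours:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                      # depth-first search from start
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


m3 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}   # assumed from Figure 5
b3 = {"a": {"f"}, "b": {"e"}, "c": {"d"}}                  # assumed from Figure 6

print(len(components(m3)))  # M3 has no direct factors
print(len(components(b3)))  # the Boolean algebra has three 2-element factors
```

Each component returned for the Boolean algebra is itself the poset of irreducibles of a two-element chain, which matches the factorization described in the text.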
Figure 7. A more complicated example of the Poset of Irreducibles (panel label: Factors)
Notice that since the poset of irreducibles of each Cartesian component of the lattice is given by a connected component of the poset of irreducibles of the lattice, the factors are themselves not further reducible. Another interesting question to consider is which bidigraphs (bipartite digraphs) can be P(L) for some lattice L. To give this condition we just need to extend the definition of Rep to any bidigraph. In particular, if S is a set of nodes of the bidigraph G = (X, Y, Arcs), let Rep(S) = all nodes that are linked in G to some node in S. Now we can characterize which bidigraphs are P(L) for some lattice L.
Theorem 3 (Markowsky [4], &) A bidigraph G is P(L) for some lattice L iff the following condition holds for each node n: Rep(n) = Rep(S) can only happen if n is in S.
3 Applications
Looking at lattices from the point of view of their posets of irreducibles provides another approach to solving problems and better understanding the features of the lattices in question. The following subsections briefly sketch some of the instances where focusing on the poset of irreducibles has led to some key insights.
3.1 Locally Distributive Lattices
Earlier, a simple test for distributivity was mentioned. A slight modification of this result provides a simple characterization of locally distributive lattices. In particular, we get the following theorem.
Theorem 4 (Avann [1], Greene and Markowsky [10]) A finite lattice is upper (lower) locally distributive iff
1. It is Jordan-Dedekind.
2. Its meet-rank (join-rank) = its length.
Note that the meet-rank (join-rank) of a lattice is simply the number of meet-irreducible (join-irreducible) elements in the lattice.
3.2 Factor-Union Representation
Associated with the poset of irreducibles are some theorems that provide information about mappings between lattices. A very fundamental theorem is the following.
Theorem 5 (Markowsky [4], [5], [6], [9])
1. If $f : L_1 \to L_2$ is join-preserving then $|M(L_1)| \le |M(L_2)|$.
2. The map $f : L \to 2^{M(L)}$ given by $f(a) = \{m \in M(L) \mid a \not\leq m\}$ preserves sups.
3. The mapping f in the preceding item is optimal in the sense that no smaller Boolean algebra can be found to represent L with unions representing sups.
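The map in item 2 can be checked mechanically on a small lattice. The following sketch uses M3 as an illustrative example (an assumption, since the theorem is stated for any finite lattice) and reads the defining condition as "a is not below m", which is the reading under which sups turn into unions. All function names are hypothetical.

```python
ELEMENTS = ["0", "a", "b", "c", "1"]


def leq(x, y):
    """Order relation of M3: 0 is below everything, 1 is above everything."""
    return x == y or x == "0" or y == "1"


def join(x, y):
    """Least upper bound of x and y."""
    upper = [z for z in ELEMENTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def meet_irreducibles():
    """Elements with exactly one upper cover."""
    def lt(x, y):
        return x != y and leq(x, y)

    result = set()
    for x in ELEMENTS:
        above = [y for y in ELEMENTS if lt(x, y)]
        upper_covers = [y for y in above if not any(lt(z, y) for z in above)]
        if len(upper_covers) == 1:
            result.add(x)
    return result


M = meet_irreducibles()


def f(a):
    """Set of meet-irreducibles m such that a is NOT below m."""
    return {m for m in M if not leq(a, m)}


for x in ELEMENTS:
    print(x, sorted(f(x)))

# Sups are represented by unions: f(x v y) = f(x) | f(y) for every pair.
print(all(f(join(x, y)) == f(x) | f(y) for x in ELEMENTS for y in ELEMENTS))
```

On M3 the images of a, b and c are exactly the three Reps of Figure 5, so this map reproduces the union representation obtained from the poset of irreducibles.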
The last part of the preceding theorem demonstrates that the poset of irreducibles is the smallest construction that can be used to represent the lattice using unions of sets. The preceding theorem can be used to develop an algorithm for determining if a genetic system can be described as a union of different factors. Systems that can be described in such a manner are called factor-union systems.
Suppose that a group of individuals displays genetic variations, and you would like to understand how genes can carry traits. One simple model of such behavior represents traits as being made from unions of simpler traits. For example, consider a simplified eye-color model in which there are only blue eyes and brown eyes. Further, suppose that brown eye genes are dominant over blue eye genes. Recall that a phenotype is a type that can be objectively recognized, such as brown-eye vs. blue-eye. Also, a genotype is a particular combination of genes. In general, multiple genotypes might produce the same phenotype. In the system under discussion the phenotype of having brown eyes consists of the 3 genotypes: (brown, brown), (brown, blue), and (blue, brown). On the other hand, the genotype (blue, blue) is the only one in the blue eye phenotype. The ordered pairs represent the combination of genes that the individual gets from each parent.
A simple factor-union model for this eye color system is the following: assume that having blue eyes is the default and requires no particular trait, whereas a brown eye-color gene contains a single factor x, which colors eyes brown if it is present. In this case, the three genotypes (brown, brown), (brown, blue) and (blue, brown) will produce an individual with brown eyes, while (blue, blue) will produce a blue-eyed individual. To determine whether a factor-union representation is possible for some system of phenotypes, we must order phenotypes based on the assumption that the system in question is a factor-union system.
If it is indeed a factor-union system, then the algorithm will eventually produce a lattice in which unions represent sups. If the system in question is not a factor-union system, the algorithm eventually produces a cycle of distinct elements such as $a \le b \le a$, which is impossible in a poset. For the details of this construction see [9]. If no cycles appear while the order is being completed, then the algorithm produces a lattice and any join-representation of that lattice is a factor-union representation of the original system. By the preceding theorem, a minimal factor-union representation is constructed using the meet-irreducibles of the generated lattice. Note that the minimal representation need not be the correct biological model. In fact, just because a factor-union representation can be found for a genetic system one cannot simply assume that it is the correct explanation. This determination must be made on a biological basis, but the results here suggest some starting points. The results in [9] apply to multi-locus systems as well as single-locus systems.
3.3 Subprojective Lattices and Projective Geometry
A variety of people have developed axiom systems for projective geometries. Initially, all of the axiom systems proposed contained a numerical parameter, and hence were unsuitable for infinite dimensional projective geometries. Basing their work on the poset of irreducibles, Markowsky and Petrich [7] produced a purely point and hyperplane, numerical-parameter-free, self-dual axiomatization of subprojective lattices. In finite dimensions, subprojective lattices are also projective, so this axiomatization gives a parameter-free axiomatization of finite dimensional projective geometries. This work also provided conditions under which subprojective lattices become projective.
3.4 Extremal Lattices
The Jordan-Dedekind chain condition is a strong condition to require. As a result of the prompting of Garrett Birkhoff, I investigated what can be said in the absence of the Jordan-Dedekind chain condition, and focused in particular on the case where the length of a lattice matched its join-rank and/or its meet-rank. It is clear that every element that covers another must have at least one additional join-irreducible below it and at least one fewer meet-irreducible above it than the element it covers. Thus, for any lattice L, $\mathrm{length}(L) \le |J(L)|, |M(L)|$.
Definition 3
1. A lattice, L, is called join-extremal iff length(L) = |J(L)|.
2. A lattice, L, is called meet-extremal iff length(L) = |M(L)|.
3. A lattice, L, is called extremal iff length(L) = |J(L)| = |M(L)|.
The various types of extremal lattices have many interesting properties. One simple property is given by the following theorem.
Theorem 6 (Markowsky [10]) A Cartesian product of lattices is (join-, meet-) extremal iff each factor is (join-, meet-) extremal.
The term p-extremal (p = the empty string, "join", "meet") is used to refer to any of the three types of extremal lattices. Just be sure to make the same substitution for the prefix p in the same context. P-extremal lattices have many interesting properties and generalize decompositions of finite Boolean algebras. An interesting fact is that ideals of join-extremal lattices are join-extremal, and dual ideals of meet-extremal lattices are meet-extremal. Another interesting fact about p-extremal lattices is that they cannot be categorized algebraically. Furthermore, the family of p-extremal lattices includes many interesting lattice families including distributive lattices, locally distributive lattices, and Tamari associativity lattices (see below). The first step to proving some of these results is to characterize the posets of irreducibles of extremal lattices (we can also do this for p-extremal lattices in general).
Theorem 7 (Markowsky [10]) A bidigraph (X, Y, Arcs) is P(L) for an extremal lattice iff:
1. |X| = |Y| = n.
2. You can number X and Y from 1 to n such that
(a) $(x_i, y_i)$ is an arc for all i.
(b) if $(x_i, y_j)$ is an arc, then $i \le j$.
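The numbering condition of Theorem 7 is easy to test by brute force on small bidigraphs. The sketch below is illustrative only: the function names are hypothetical, X is taken to be the row of join-irreducibles and Y the row of meet-irreducibles (an assumption about orientation), condition (b) is read as $i \le j$ so that it is compatible with condition (a), and the two arc sets were derived here for the pentagon N5 and for M3 using the rule that m is linked to j iff m is not above j.

```python
from itertools import permutations


def is_extremal_numbering(arcs, xs, ys):
    """Check conditions (a) and (b) for the numbering x_1..x_n = xs, y_1..y_n = ys."""
    index_x = {x: i for i, x in enumerate(xs)}
    index_y = {y: j for j, y in enumerate(ys)}
    diagonal = all((x, y) in arcs for x, y in zip(xs, ys))             # (a)
    upper = all(index_x[x] <= index_y[y] for (x, y) in arcs)           # (b)
    return diagonal and upper


def find_extremal_numbering(X, Y, arcs):
    """Search all numberings; only practical for very small bidigraphs."""
    if len(X) != len(Y):
        return None
    for xs in permutations(sorted(X)):
        for ys in permutations(sorted(Y)):
            if is_extremal_numbering(arcs, xs, ys):
                return xs, ys
    return None


# P(N5), with N5 = {0 < a < b < 1, 0 < c < 1}: arcs (j, m) with m not above j.
n5_arcs = {("a", "c"), ("b", "a"), ("b", "c"), ("c", "a"), ("c", "b")}
# P(M3): every join-irreducible is linked to the other two atoms.
m3_arcs = {(j, m) for j in "abc" for m in "abc" if j != m}

print(find_extremal_numbering("abc", "abc", n5_arcs))
print(find_extremal_numbering("abc", "abc", m3_arcs))
```

A numbering exists for N5, which has length 3 and three irreducibles of each kind, while none exists for M3, whose length is 2.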
Using the characterization of P(L) for extremal L leads immediately to the following results.
Theorem 8 (Markowsky [10]) Any finite lattice is isomorphic to an interval of some finite extremal lattice.
Corollary 9 (Markowsky [10]) Extremal lattices cannot be characterized algebraically.
Of special importance when working with p-extremal lattices are the coprime and prime elements.
Definition 4
1. An element $a \neq 0$ in L is called coprime if for all x and y in L, $x \vee y \geq a$ implies that $x \geq a$ or $y \geq a$.
2. An element $a \neq 1$ in L is called prime if for all x and y in L, $x \wedge y \leq a$ implies that $x \leq a$ or $y \leq a$.
Coprimes are special kinds of join-irreducibles, while primes are special kinds of meet-irreducibles. The following result is a straightforward consequence of the above definitions and is found in [10].
Theorem 10 The following are equivalent:
1. L is distributive.
2. All join-irreducibles are coprime.
3. All meet-irreducibles are prime.
Figure 8. Some Posets of Irreducibles of Extremal Lattices Numbered
The next result is a bit surprising and is a generalization of the fact that in a distributive lattice the poset of join-irreducibles in the induced order is isomorphic to the poset of meet-irreducibles.
Theorem 11 (Markowsky [10]) In any lattice the subposet of coprimes is isomorphic to the subposet of primes.
Corollary 12 In a distributive lattice J(L) is isomorphic to M(L).
The existence of primes and coprimes in lattices is of great significance because it permits you to decompose the lattice into simpler lattices. Of crucial importance is the fact that extremal lattices must contain at least one prime and at least one coprime, which can be used to decompose them. Details on these mappings can be found in [10] and [12]. A key result that summarizes the decomposition properties is the following theorem.
Theorem 13 (Markowsky [10]) Let L be an extremal lattice. Then L has an atom a that is coprime and a matching prime b such that the intervals A = [a, 1] and B = [0, b] partition L. Let the mapping $g : B \to A$ be given by $g(x) = x \vee a$. Then the following are true:
1. g is injective and for all x, g(x) covers x.
2. A is extremal.
3. M(L) - M(A) = b.
4. Length(A) = Length(L) - 1.
5. $J(A) \subseteq J(L) \cup (a \vee J(L))$.
6. B is join-extremal.
It is relatively easy to compute P(A) and P(B) from P(L). Of course, a dual theorem holds for a coatom that is prime. The numbering that exists for extremal lattices can be used to derive an alternative characterization of distributive lattices. For details see [10].
3.5 Tamari Associativity Lattices
Tamari associativity lattices are the lattices that result when you take expressions in n + 1 variables and a single binary operator * and you order them as follows. One expression covers another if it can be derived by moving parentheses to the left using associativity. Thus, (a * b) * c -< a * (b * c). It is a non-trivial fact that this covering relation puts a lattice structure on the expressions. The family of lattices that results for all n is called the family of Tamari associativity lattices. Figure 9 shows some of the smaller Tamari lattices and indicates various coprime/prime decompositions and some of the relations between consecutive members in the family. M. K. Bennett and G. Birkhoff determined the posets of irreducibles for the Tamari lattices. It is natural to consider the posets of irreducibles of Tamari lattices. The results are summarized in the following theorem.
Theorem 14 (Markowsky [10])
1. Tamari lattices are extremal.
2. The coprimes are exactly the atoms, the primes are exactly the coatoms.
3. The longest maximal chain has length n(n-1)/2 and the shortest has length n-1.
4. Tamari lattices are self-dual.
5. $T_n$ has a coprime/prime decomposition such that B is isomorphic to $T_{n-1}$, and the corresponding A is extremal.
6. $B_n$, the Boolean algebra on n atoms, is a retract of $T_{n+1}$.
7. Every distributive lattice of length n is a sublattice of $T_{n+1}$.
Figure 9. The Decomposition of Tamari Lattices
3.6 Permutation Lattices
Permutations can be ordered using a covering relationship similar to the one described for the Tamari lattices. [11] presents the structure of $P(S_n)$, where $S_n$ is the permutation group on n elements. The key results on the structure of $P(S_n)$ are summarized in the following theorem.
Theorem 15 (Markowsky [11])
1. The join-irreducibles of $S_n$ correspond to pairs of subsets of {1, ..., n}, (A, B), such that A and B are complements and A is not of the form {1, ..., i} for any i.
2. The meet-irreducibles of $S_n$ correspond to pairs of subsets of {1, ..., n}, (C, D), such that C and D are complements and D is not of the form {1, ..., i} for any i.
3. A join-irreducible, (A, B), of $S_n$ is connected to a meet-irreducible, (C, D), of $S_n$ in $P(S_n)$ iff $\max(A \cap D) > \min(B \cap C)$.
There are many additional properties of $S_n$ that are derived in [11] that we will not discuss further here.
3.7 Additional Applications
There are many additional applications for the ideas presented in this paper. One application is to check a poset to see whether it is a lattice or not. The idea is to assume that it is a lattice and to construct its poset of irreducibles. One can test whether the resulting bidigraph satisfies the conditions for being a poset of irreducibles of a lattice and whether one reconstructs the original poset from the supposed poset of irreducibles. Many results about concept lattices are results about the poset of irreducibles of a lattice, and the results in that area provide an example of the power of this approach. Furthermore, as noted earlier, there are many decompositions and representations that can be applied to lattices. This is an area that deserves further study. For some ideas in this area see [8].
References
1. S.P. Avann, Application of the join-irreducible excess function to semimodular lattices, Math. Annalen 142, pp. 345-354.
2. G. Birkhoff, Lattice Theory, Amer. Math. Soc., Providence, RI, 1967.
3. Guy-Vincent Jourdan, Jean-Xavier Rampon, and Claude Jard, Computing On-Line the Lattice of Maximal Antichains of Posets, Order (11), 1994, pp. 197-210.
4. G. Markowsky, Combinatorial Aspects of Lattice Theory with Applications to the Enumeration of Free Distributive Lattices, done under the direction of Professor Garrett Birkhoff. Ph.D. Thesis, Harvard University, 1973.
5. G. Markowsky, Some Combinatorial Aspects of Lattice Theory, Proc. Univ. of Houston Lattice Theory Conf., 1973, 36-68.
6. G. Markowsky, The Factorization and Representation of Lattices, Trans. Am. Math. Soc. 203, 1975, 185-200.
7. G. Markowsky and Mario Petrich, Subprojective Lattices and Projective Geometry, J. Algebra 48, 1977, 305-320.
8. G. Markowsky, The Representation of Posets and Lattices by Sets, Algebra Univ., 11, 1980, 173-92.
9. G. Markowsky, Necessary and Sufficient Conditions for a Phenotype System to Have a Factor Union Representation, Math. Biosciences, 66 (1983), 115-128.
10. G. Markowsky, Primes, Irreducibles and Extremal Lattices, Order, 9 (1992), 265-290.
11. G. Markowsky, Permutation Lattices Revisited, Mathematical Social Sciences, 27 (1994), 59-72.
12. G. Markowsky, The Poset of Irreducibles: A Basis for Lattice Theory, to appear.
13. Michel Morvan and Lhouari Nourine, Simplicial Elimination Schemes, Extremal Lattices and Maximal Antichain Lattices, Order (13), 1996, pp. 159-173.
14. R. Wille, Restructuring Lattice Theory: an Approach Based on Hierarchies of Concepts, pp. 445-470 in Ordered Sets, ed. Ivan Rival, D. Reidel, Dordrecht, 1982.
NUMBER THEORY AND PUBLIC-KEY CRYPTOGRAPHY
DAVID POINTCHEVAL
LIENS - CNRS, Ecole Normale Supérieure, 45 rue d'Ulm, 75230 Paris Cedex 05, France. E-mail: [email protected] http://www.di.ens.fr/~pointche
For a long time, cryptology had been a mystic art more than a science, solving the confidentiality concerns with secret and private techniques. Automatic machines, electronics and, in particular, computers modified the environment and the basic requirements. The main difference was the need for public mechanisms that allow large-scale communications with just a small secret shared between the interlocutors, and that furthermore resist adversaries with more powerful computers. Unfortunately, the security remained heuristic, with a permanent fight between designers (the cryptographers) and breakers (the cryptanalysts). In 1976, Diffie and Hellman claimed the possibility of achieving confidentiality between two people without any common secret information. However, they needed quite new objects: (trapdoor) one-way functions. Fortunately, mathematics, with algorithmic number theory, turned out to provide such objects. A new direction in cryptography was under investigation: asymmetric cryptography and provable security. In this paper we review the main problems that cryptography tries to solve, and how it achieves these goals thanks to algorithmic number theory. After a brief history of ancient and conventional cryptography, we review the Diffie-Hellman suggestion with its apparent paradox. Then, we survey the solutions based on integer factorization or the discrete logarithm, two problems that nobody knows how to solve efficiently.
1 Introduction
The need for confidentiality has always existed. Thus, one can find some "cryptographic" techniques in the fairly distant past, even before the ancient Greek civilization. But confidentiality, at that period, relied only on the secrecy of the techniques, which were various. Some of those techniques are reviewed in the main books about cryptography [65, 76, 41].
1.1 Brief History of Ancient Cryptography
The Lacedemonian scytale is one of the oldest techniques, used during the 5th century B.C. It consists in rolling up a papyrus around a piece of wood, then writing the message and removing the papyrus from that piece of wood. The resulting message contains all the letters, but scrambled according to the diameter of the piece of wood. Of course, the security completely vanishes against an adversary who knows the technique, even if he does not exactly know the diameter of the piece of wood. Another well-known cipher, the shift cipher, a.k.a. Caesar's cipher, was used by Julius Caesar during the 1st century B.C. It simply consisted in shifting the letters through a constant number k of characters: i.e. with k = 3, 'A' was replaced by 'D', 'B' by 'E', 'C' by 'F', etc. Once again, the knowledge of the general technique is enough to break the confidentiality. However, Caesar's cipher gives a good taste of the cryptographic techniques used during the two following millennia (except the last two decades). Indeed, up to the recent past, confidentiality was achieved using more or less intricate combinations of permutations and substitutions. The conventional cryptography currently in use is still in the same vein.
1.1.1 Transpositions/Permutations
For a transposition/permutation cipher of length $\ell$, a message is split into blocks of size $\ell$, on which one applies the same permutation $\pi$ of the indices between 1 and $\ell$: for a block $m = m_1 \ldots m_\ell$, the ciphertext is $c = m_{\pi(1)} \ldots m_{\pi(\ell)}$. The permutation $\pi$ has to be kept secret between interlocutors. However, such a cipher is weak because it preserves the frequency distribution of each character. The Lacedemonian scytale provides an example of a permutation cipher.
1.1.2 Substitutions
Another technique mostly used to design block ciphers is the substitution of blocks of characters. If the substituted blocks are of size 1, one talks about a mono-alphabetic substitution. Caesar's cipher is the most famous example, which replaces each letter $m_i$ of the message $m = m_1 m_2 \ldots m_n$ by the letter $c_i = m_i + k \bmod 26$, where k is the shift parameter. According to folklore, Caesar used k = 3. The affine cipher generalizes this latter cipher to $c_i = a m_i + b \bmod 26$, where (a, b) is the secret parameter, also called the secret key. But the substitution may involve larger blocks. For example, the Hill cipher encodes blocks of size $\ell$ as vectors $M \in \mathbb{Z}_{26}^{\ell}$. Then one encrypts by multiplying the plaintext vector M by an invertible square matrix K, C = KM.
More intricate variants appeared using many substitution mappings, instead of only one as in the previous examples. The Vigenere cipher is a classical poly-alphabetic substitution. It involves several distinct shifting ciphers according to the position of the character: $c_i = m_i + k_{i \bmod \ell} \bmod 26$, where $k = k_0 \ldots k_{\ell-1}$ is a keyword of length $\ell$ (the secret key). Beaufort designed a variant of Vigenere, $c_i = k_{i \bmod \ell} - m_i \bmod 26$, which is its own inverse. Unfortunately, when one uses such techniques to encrypt messages in a natural language, the high redundancy may help an adversary to recover the keyword. Indeed, some statistics of the plaintext are preserved in the ciphertext. Therefore, Kasiski provided a general technique for cryptanalyzing poly-alphabetic ciphers with repeated keywords, such as the Vigenere cipher: if the period between two identical mappings is not too large, one can recover that period as well as each mapping, which simply consists of a mono-alphabetic substitution.
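The substitution ciphers of this subsection can be sketched directly from their formulas. The code below is a minimal illustration, not a recommended implementation: the function names are hypothetical, messages are assumed to consist of the upper-case letters A-Z only (with A = 0, ..., Z = 25), and decryption of the affine cipher additionally assumes that a is invertible modulo 26, which the text leaves implicit.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' = 0, ..., 'Z' = 25


def affine_encrypt(message, a, b):
    """c_i = a * m_i + b mod 26."""
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % 26] for ch in message)


def affine_decrypt(ciphertext, a, b):
    """Inverse map; needs gcd(a, 26) = 1. pow(a, -1, 26) requires Python 3.8+."""
    a_inv = pow(a, -1, 26)
    return "".join(ALPHABET[(a_inv * (ALPHABET.index(ch) - b)) % 26]
                   for ch in ciphertext)


def shift_encrypt(message, k):
    """Caesar / shift cipher: the affine cipher with a = 1, b = k."""
    return affine_encrypt(message, 1, k)


def vigenere_encrypt(message, keyword):
    """c_i = m_i + k_(i mod l) mod 26, where l is the keyword length."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(keyword[i % len(keyword)])) % 26]
        for i, ch in enumerate(message))


def vigenere_decrypt(ciphertext, keyword):
    """m_i = c_i - k_(i mod l) mod 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - ALPHABET.index(keyword[i % len(keyword)])) % 26]
        for i, ch in enumerate(ciphertext))


def beaufort(text, keyword):
    """c_i = k_(i mod l) - m_i mod 26; applying it twice gives back the text."""
    return "".join(
        ALPHABET[(ALPHABET.index(keyword[i % len(keyword)]) - ALPHABET.index(ch)) % 26]
        for i, ch in enumerate(text))


print(shift_encrypt("ABC", 3))
print(vigenere_encrypt("ATTACKATDAWN", "LEMON"))
print(beaufort(beaufort("ATTACKATDAWN", "LEMON"), "LEMON"))
```

Because each function applies a fixed map per position, letter statistics survive in the ciphertext, which is the weakness that frequency analysis and Kasiski's method exploit.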
1.1.3 Specialized Devices and Rotor-Machines
In the 18th century, dedicated devices appeared to encrypt and decrypt more efficiently. The most famous is the Jefferson cylinder, which implements a poly-alphabetic substitution at no computational cost for the parties: it consists of 36 disks on which the 26 letters A-Z are written in a random ordering, distinct for each disk. All the disks rotate around the same axis. The sender rotates the wheels so that the plaintext appears along a reference line, along the cylinder's length. Each of the 25 other line positions defines a ciphertext. Then, with the same cylinder device, the recipient rotates the wheels to obtain the ciphertext written in a line, and the plaintext is the only intelligible message among the 25 other lines. Alternatively, they also may agree on a common offset to uniquely define the plaintext.
Then appeared the dominant technique of World War II: the rotor-based machines. A plaintext is encrypted through the successive rotors, each of which performs a mono-alphabetic substitution, for a fixed position. Therefore, for fixed rotor positions, the rotors implement a mono-alphabetic substitution which is the composition of the substitutions defined by individual rotors. To provide a poly-alphabetic substitution, after any encipherment of a character, rotors move, which therefore provides a new mapping. The most important property of the rotor-based machines is the long period between two identical mappings, which avoids Kasiski attacks. As said above, many rotor-based machines were implemented during World War II, such as the famous German cipher device, Enigma.
1.1.4 Kerckhoffs' Principles
Unfortunately, none of those ciphers resisted adversaries who knew exactly the mechanism of the transformation. Indeed, the breaking of Enigma by Alan Turing was helped by the capture of a machine from a submarine. However, in 1883, Kerckhoffs claimed that the security of a cryptosystem should not rely on the secrecy of the system itself, but just on the secrecy of a small parameter, the secret key. Nevertheless, until the recent past, cryptographic applications were limited to the military. Even if they claim not to require the secrecy of the schemes for security concerns, they still keep them secret. But for such a specific class of people, it is possible to assume the secrecy of the mechanisms. However, for several decades, many people have become aware of the importance of and the need for secrecy and authentication, not only for military applications. First, industrial and commercial people wanted to be able to exchange critical data secretly. And now, everybody would like to be able to sell and buy over the Internet, or simply converse while preserving their privacy.
1.2 Conventional Cryptography
In the mid-1970s, the American government addressed this industrial need for confidentiality and asked for an encryption standard. Such a standard was developed in cooperation with IBM. This "Data Encryption Standard" [50], the well-known DES, is the first commercial-grade algorithm, officially defined by the American standard FIPS 46-2, with openly and fully specified implementation details, as required by Kerckhoffs' principles. Confidentiality thus relies on a 56-bit secret key shared between the two parties. Then followed FEAL (Fast Data Encipherment Algorithm) [71], the Japanese alternative to DES, IDEA (International Data Encryption Algorithm) [34], SAFER (Secure and Fast Encryption Routine) [38], etc. Unfortunately, no formal security can really be proven, even if a theory is beginning to capture some kinds of attacks (e.g. the decorrelation technique [78]).

Such a conventional encryption can be modelled as presented in Figure 1, where $\mathcal{E}$ is the encryption device, $\mathcal{D}$ the decryption device and $k$ the common secret shared between both parties.

[Figure 1. Conventional Encryption]

The basic security requirement is that nobody who does not know the secret key $k$ can recover the plaintext $m$ from the ciphertext $c$. More formally, one would like the ciphertext $c$ to leak no information about the plaintext $m$. Unfortunately, Shannon [70] showed that such a security level can only be reached if one uses a secret key $k$ as long as the message to encrypt. Furthermore, this key has to be renewed for each new message to be encrypted. Such a perfect encryption scheme exists, and is called the Vernam cipher [79], a.k.a. the one-time pad:
• let $m \in \{0,1\}^n$ be an $n$-bit message to be encrypted;
• the two parties share a common secret $k \in \{0,1\}^n$ of the same length as the message $m$;
• the ciphertext is simply $c = m \oplus k$;
• it can easily be decrypted by the recipient since $m = c \oplus k$.
However, that perfect security is not practical, and cannot be used for large-scale communications.

1.3 Practical/Provable Security: the Limit of Mathematics
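Before discussing practical security, the Vernam cipher described at the end of the previous section can be made concrete with a short sketch. Python is our choice here, not the paper's; the function names `otp_keygen` and `otp_xor` and the sample message are illustrative assumptions. The sketch works on bytes rather than single bits, which is the same XOR applied eight bits at a time.

```python
import secrets

def otp_keygen(n_bytes: int) -> bytes:
    """Draw a fresh uniformly random key, as long as the message."""
    return secrets.token_bytes(n_bytes)

def otp_xor(data: bytes, key: bytes) -> bytes:
    """Compute data XOR key; the same function encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("the one-time pad needs a key as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = otp_keygen(len(message))         # must never be reused for another message
ciphertext = otp_xor(message, key)     # c = m XOR k
recovered = otp_xor(ciphertext, key)   # m = c XOR k
```

The length check makes the practical drawback visible: every message consumes as much fresh key material as it has bytes.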
Fortunately, this impossibility of practical perfect secrecy does not put an end to cryptographic research: adversaries are not all-powerful, but limited in both computational power and time. Therefore, we can consider practical security, which prevents attacks from real adversaries. With provable security, one would like to prevent, at least, any kind of attack, known or unknown, that an adversary could perform in "reasonable" time. Unfortunately, as said above, no general security analysis can be carried out for conventional cryptographic schemes: one can only rule out some restricted classes of attacks. Therefore, nothing guarantees that no attack will ever be found against a scheme. This limitation is mainly due to mathematics. Indeed, mathematics is the main tool for analyzing the security of cryptographic schemes, by studying the probability distributions of the ciphertexts, the plaintexts and the keys. But such analyses cannot take advantage of any limitation on the running time of the adversary.
[Figure 2. Public-Key Encryption]

1.4 Asymmetric Cryptography: on the Importance of Mathematics
In 1976, Diffie and Hellman [15] suggested extending Kerckhoffs' principles, with the remark that in an encryption scheme, one just wants to protect the decryption process. Why should the encryption phase be secret or use secret information? Let us follow that suggestion with the encryption model described in Figure 2. It consists of
• an encryption phase $\mathcal{E}$, which allows anybody to transform a plaintext $m$ into an unintelligible ciphertext $c$;
• a decryption phase $\mathcal{D}$, which allows the owner of the secret data $k_s$ to recover the plaintext from the ciphertext $c$.
Of course, the encryption phase $\mathcal{E}$ has to be related to the secret data $k_s$, since any encryption process has to be specific to the recipient, but in a public way. Therefore, it uses public data $k_p$. The pair $(k_p, k_s)$ is a pair of matching public/secret keys, associated with a specific user; hence the name public-key cryptography. However, whereas $k_s$ and $k_p$ have to be related, it should be impossible to recover $k_s$ from $k_p$: this impossibility can be guaranteed by a computational problem, i.e. a problem that is difficult to solve in practice.

1.5 Outline of the Paper
In this paper, we first develop Diffie and Hellman's suggestion of public-key cryptography while making the new requirements precise. Then, we show how mathematics, and particularly algorithmic number theory, helped to actually provide public-key cryptosystems.

2 Public-Key Cryptography
In public-key cryptography, each user, say Alice, owns a pair of matching public and secret keys $(k_p, k_s)$, where $k_p$ has to be widely published as belonging to Alice, while $k_s$ has to be kept secret by Alice. Thanks to these keys, one hopes to achieve confidentiality with an encryption scheme and authentication with a digital signature scheme.

2.1 Public-Key Encryption
The aim of a public-key encryption scheme is to allow anybody who knows Alice's public key $k_p$ to send her a message that only she will be able to recover, thanks to her private key $k_s$.

2.1.1 Definition
A public-key encryption scheme can be formally defined by the following three algorithms (as depicted in Figure 2):
• The key generation algorithm $G$. On input $1^k$, where $k$ is the security parameter, the algorithm $G$ produces a pair $(k_p, k_s)$ of matching public and secret keys. Algorithm $G$ is probabilistic.
• The encryption algorithm $\mathcal{E}$. Given a message $m$ and a public key $k_p$, $\mathcal{E}$ produces a ciphertext $c$ of $m$. This algorithm may be probabilistic.
• The decryption algorithm $\mathcal{D}$. Given a ciphertext $c$ and the secret key $k_s$, $\mathcal{D}$ gives back the plaintext $m$. This algorithm is necessarily deterministic.

2.1.2 Basic Security
Informally, one would like nobody except the designated recipient to be able to recover the whole plaintext from a ciphertext. But in some cases, that security notion is not enough: if we know that the ciphertext encrypts either "yes" or "no" (or "sell"/"buy"), then a single bit of information about the plaintext reveals the whole plaintext. Therefore, one could furthermore require that nobody can get any information about the plaintext from a ciphertext. Both security notions will be more formally defined later, under the names of one-wayness and semantic security (a.k.a. indistinguishability of encryptions or polynomial security [24]) respectively. On the other hand, the adversary may have access to some additional information. In a public-key setting, she can get the encryption of any plaintext of her choice. But one can furthermore assume that the adversary can get the decryption of some ciphertexts of her choice. The first scenario is called the Chosen-Plaintext Attack while the second one is named the Chosen-Ciphertext Attack [48, 61].

[Figure 3. Digital Signature]

2.2 Digital Signature
Digital signature schemes are the electronic version of handwritten signatures for digital documents: a user's signature on a message $m$ is a string which depends on $m$ and the secret key $k_s$ of the user, in such a way that anyone can check the validity of the signature by using the public key $k_p$ only.

2.2.1 Definition
A signature scheme is usually defined by the following three algorithms (as depicted in Figure 3):
• The key generation algorithm $G$. On input $1^k$, where $k$ is the security parameter, the algorithm $G$ produces a pair $(k_p, k_s)$ of matching public and secret keys. Algorithm $G$ is probabilistic.
• The signing algorithm $\mathcal{S}$. Given a message $m$ and a pair of matching public and secret keys $(k_p, k_s)$, $\mathcal{S}$ produces a signature $\sigma$. The signing algorithm might be probabilistic.
• The verification algorithm $\mathcal{V}$. Given a signature $\sigma$, a message $m$ and a public key $k_p$, $\mathcal{V}$ tests whether $\sigma$ is a valid signature of $m$ with respect to $k_p$. In general, the verification algorithm need not be probabilistic.

2.2.2 Basic Security
As above, everybody has an intuition about the security notion that a signature scheme should satisfy. First, one would like only the owner of the secret key $k_s$ related to the public key $k_p$ to be able to produce an accepted signature $\sigma$ for a message $m$. But depending on how the message $m$ is chosen, many kinds of security notions have been defined. Furthermore, depending on the additional information the adversary may have (access or not to a signature oracle), many attack scenarios have been formalized [27], as we will see later.

3 New Requirements
The above descriptions just follow Diffie and Hellman's suggestion but do not give any solution. Now that we have given a more precise description of public-key cryptography, we can focus on the new tools on which any actual realization is based: one-way functions and trapdoor one-way functions.

3.1 One-Way Functions
The pairs of matching public/secret keys have to be related: the public key $k_p$ is derived from the secret key $k_s$. But since $k_p$ is thereafter published, $k_s$ has to be computationally unrecoverable from $k_p$ in order to remain secret.

Definition 1 (One-Way Functions) A function $f$ is said to be one-way if, for any $x$, one can easily compute $y = f(x)$, but for a given $y = f(x)$, nobody can recover any $z$ such that $y = f(z)$.

But what do "easy" and "difficult" mean? Mathematics only studies the existence of a pre-image but does not care about the means to get it. Fortunately, complexity theory addresses this problem with the complexity classes $\mathcal{P}$ and $\mathcal{NP}$, and in particular the $\mathcal{NP}$-complete problems. Indeed, those classes can informally be seen as follows:
• the class $\mathcal{P}$ contains the problems which can be solved in polynomial time (in the size of the data);
• the class $\mathcal{NP}$ contains the problems for which a solution can be checked in polynomial time;
• the $\mathcal{NP}$-complete problems are the hardest problems in the class $\mathcal{NP}$.
More precisely, any problem in $\mathcal{NP}$ can be polynomially reduced to any $\mathcal{NP}$-complete problem: if one $\mathcal{NP}$-complete problem can be solved in polynomial time, then any problem in $\mathcal{NP}$ can be solved in polynomial time, and therefore $\mathcal{NP} = \mathcal{P}$. However, whether this equality holds is the most important open problem of complexity theory. Both classes are widely believed to be distinct, in which case the $\mathcal{NP}$-complete problems cannot be solved in polynomial time. Furthermore, no algorithm better than exponential is known for solving any $\mathcal{NP}$-complete problem.

Such an $\mathcal{NP}$-complete problem [22] seems a good candidate for a one-way function. Indeed, in general, given a solution $x$, it is easy to define an instance $y$ which admits $x$ as a solution. However, given that instance $y$, $\mathcal{NP}$-completeness seems to imply that there is no efficient algorithm to find a solution. Unfortunately, $\mathcal{NP}$-completeness only deals with the worst case, whereas the recovery of the secret key $k_s$ should always be difficult, except maybe for a negligible fraction of cases.

3.2 Trapdoor One-Way Functions
Let us come back to encryption, and more precisely to Diffie and Hellman's suggestion [15]. They claimed that the encryption phase should be available to anybody: $c = \mathcal{E}(k_p, m)$, where $\mathcal{E}$ is the public encryption process and $k_p$ the public key of the recipient. Decryption, on the other hand, can only be performed by the designated recipient: $m = \mathcal{D}(k_s, c)$, where $\mathcal{D}$ is the public decryption algorithm, which requires the secret key $k_s$ to correctly decrypt $c$. For such an application, one needs a function $f(\cdot) = \mathcal{E}(k_p, \cdot)$ that anybody can easily compute, but whose inversion is impossible for anybody except the one who knows $k_s$, a trapdoor.

Definition 2 (Trapdoor One-Way Functions) A function $f$ is said to be trapdoor one-way if it is a one-way function, except for those who know a trapdoor information $t$: knowing $t$, for any given $y = f(x)$, one can easily compute a $z = g(t, y)$ which satisfies $y = f(z)$.

As we have previously seen, complexity theory provides convenient definitions and classes to capture "easy" and "difficult" tasks. But not all $\mathcal{NP}$-complete problems can be used. Firstly, complexity theory analyzes problems in an asymptotic framework, so a problem may become difficult in practice only for very large instances, which cannot be used in actual cryptographic protocols. Secondly, $\mathcal{NP}$-completeness only says that the worst cases cannot be efficiently solved. But the worst cases may be rare, whereas one would like a problem for which almost all instances are difficult to solve. Nevertheless, some convenient $\mathcal{NP}$-complete problems have been identified: the problem of decoding an arbitrary linear code [40], the Knapsack or Subset Sum problem [43], the Permuted Kernel Problem [68], Syndrome Decoding [75] and the Permuted Perceptrons Problem [53]. However, their application to cryptography is mainly restricted to interactive authentication (by zero-knowledge proofs of knowledge [25]); they cannot be efficiently used for encryption or signature.

4 The Algorithmic Number Theory
Fortunately, even if mathematics cannot help to analyze the security of cryptographic schemes, it provides candidate one-way and trapdoor one-way problems, namely integer factorization and the discrete logarithm problem.

4.1 The Integer Factorization
A first simple candidate that may come to mind is the factorization of integers: while it is easy to multiply two prime integers $p$ and $q$ to get the product $n = p \cdot q$, it is not simple to factorize $n$ into its prime factors $p$ and $q$. Indeed, the multiplication of two integers $p$ and $q$, both of size $k$, just requires a quadratic amount of time in $k$. However, the factorization of an integer $n$, which consists of writing $n$ as a product of prime powers $n = \prod p_i^{v_i}$, is a little more intricate. First, it is a well-known result of arithmetic that any integer $n \geq 2$ has a factorization as a product of prime powers $n = \prod p_i^{v_i}$, where the $p_i$ are distinct primes (which can only be divided by 1 and themselves) and the $v_i$ are the valuations, represented by positive integers. Furthermore, this factorization is unique up to a permutation of the factors.

4.1.1 Generic Techniques
Many methods for factorizing integers have long been known, from trial division to Pollard's methods [57]. But their complexity is very bad. Let us look at some of them for an integer $n$ whose smallest prime factor is $p$:
• trial division requires $p$ divisions to find the first prime factor, and up to $\sqrt{n}$ divisions to fully factorize $n$;
• Pollard's $\rho$ method [57], later improved by Brent [11], finds $p$ after $O(\sqrt{p})$ iterations and fully factorizes $n$ in $O(n^{0.106})$, on average;
• some methods are dedicated to special integers, such as the $p - 1$ method, which is quite fast when $p - 1$ is smooth, and the $p + 1$ method [80].
In any case, all these methods, which essentially try to divide $n$ by many primes, require an exponential amount of time with respect to the size $k$ of $n$ ($k = \ln n$), typically $\exp(k/2)$ or $\exp(k/4)$ in the special case where $n = pq$ is the product of two primes of similar size. Indeed, those numbers seem to be the hardest to factorize, which is why they are used for cryptographic purposes.

4.1.2 Improved Techniques
More recently, new methods have appeared:
• First, observing that the $p - 1$ method is quite general, one can apply it to any group related to the prime factors of the integer $n$ to factorize, such as an elliptic curve modulo $n$ [46]. On an elliptic curve modulo $n$, we use the addition law as if $n$ were prime. At some point, however, this addition becomes impossible, and such an accident reveals a factor of $n$. This method is called ECM and finds a prime factor $p$ of $n$ in $O(L(p)^{\sqrt{2}})$, where $L(x) = \exp(\sqrt{\ln x \ln\ln x})$.
• Then come the methods based on congruence relations which are likely to lead to a factor of $n$, such as $x^2 = y^2 \bmod n$. The first algorithm using such relations is CFRAC (Continued Fraction Algorithm [47, 60]), which exploits some properties of the continued fraction expansion of $\sqrt{n}$. This method factorizes $n$ in $O(L(n)^{\sqrt{2}})$. It was used to set the first record: $F_7$, a 39-digit number.
• Other mechanisms, based on sieving, have been used to collect such relations. First, the quadratic sieve [59] was proposed, with many optimizations (Multiple Polynomial QS, Large Prime Variation). That technique, with a time complexity in $O(L(n))$, has been used to establish many records, up to a 129-digit number in 1994 [1]. Currently, the most efficient algorithm is based on sieving over number fields. The Number Field Sieve (NFS) method [36] has a complexity in $O(\exp((1.923 + o(1))(\ln n)^{1/3}(\ln\ln n)^{2/3}))$. In practice, it becomes more efficient than the quadratic sieve for 130-digit numbers. It was used to establish the latest record, in August 1999, by factorizing a 155-digit integer, the product of two 78-digit primes [12]. The factorized number, denoted RSA-155, was taken from the "RSA Challenge List", which is used as a yardstick for the security of the RSA cryptosystem (see later), which is used extensively in hardware and software to protect electronic data traffic, such as in the international version of the SSL (Secure Sockets Layer) Handshake Protocol. This latest record is very important since 155 digits correspond to 512 bits, which is the size in use in almost all implementations of the RSA cryptosystem (namely for actual implementations of SSL on the Internet).

RSA-155 = 1094173864157052742180970732204035761200373294544920\
          5990913842131476349984288934784717997257891267332497\
          625752899781833797076537244027146743531593354333897
        = 102639592829741105772054196573991675900\
          716567808038066803341933521790711307779
        * 106603488380168454820927220360012878679\
          207958575989291522270608237193062808643
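The arithmetic behind this record is easy to check, in contrast with the effort needed to find the factors. The following sketch, in Python (our choice of language), copies the digits from the display above and checks the product, the number of decimal digits and the bit length. The variable names are illustrative assumptions.

```python
# Digits copied from the RSA-155 display above; line breaks are only for layout.
n = int(
    "1094173864157052742180970732204035761200373294544920"
    "5990913842131476349984288934784717997257891267332497"
    "625752899781833797076537244027146743531593354333897"
)
p = int(
    "102639592829741105772054196573991675900"
    "716567808038066803341933521790711307779"
)
q = int(
    "106603488380168454820927220360012878679"
    "207958575989291522270608237193062808643"
)

product_matches = (p * q == n)              # multiplying back is immediate
decimal_digits = len(str(n))                # size of n in decimal digits
bit_length = n.bit_length()                 # size of n in bits
factor_digits = (len(str(p)), len(str(q)))  # sizes of the two prime factors
```

Multiplying two 78-digit numbers takes a negligible amount of time, whereas recovering them from the product required the Number Field Sieve: this asymmetry is exactly what makes the multiplication of primes a candidate one-way function.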
Therefore, from the current state of the science, factorization is believed to be a difficult problem, especially for products of two primes of similar sizes larger than 384 bits.

4.2 The Discrete Logarithm Problem
Let $\mathcal{G}$ be a cyclic group of order $q$, with an internal law denoted multiplicatively. This means that
$$\mathcal{G} = \langle g \rangle = \{1, g, g^2, \ldots, g^{q-1}\}$$
for some $g$, called a generator of the cyclic group $\mathcal{G}$. Therefore, for any $y \in \mathcal{G}$, there exists at least one $x \in \mathbb{Z}$ such that $y = g^x$. One defines
$$\log_g y = \min\{x \geq 0 \mid y = g^x\}.$$
Thanks to the square-and-multiply technique [30], for any integer $x$, it is easy to compute $g^x$: if $x = \sum_i x_i 2^i$ with $x_i \in \{0, 1\}$, then $g^x = \prod_i g_i^{x_i}$, where $g_0 = g$ and $g_i = g_{i-1}^2 \; (= g^{2^i})$. Indeed, if $\ell = |x|$ is the bit length of $x$, this method requires $\ell$ squarings and fewer than $\ell$ multiplications (but just $\ell/2$ on average). However, how do we compute $\log_g y$ for a given $y \in \mathcal{G}$?

4.2.1 Generic Techniques
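Before turning to algorithms for the discrete logarithm itself, the square-and-multiply technique recalled above can be sketched as follows. This is a minimal illustration in Python for the group of integers modulo a number; the function name and the sample values are assumptions made for the example.

```python
def square_and_multiply(g: int, x: int, modulus: int) -> int:
    """Compute g**x mod modulus by scanning the bits x_i of x, least significant first."""
    if x < 0:
        raise ValueError("the exponent must be non-negative")
    result = 1 % modulus
    g_i = g % modulus                  # g_0 = g
    while x > 0:
        if x & 1:                      # bit x_i is 1: multiply g_i into the product
            result = (result * g_i) % modulus
        g_i = (g_i * g_i) % modulus    # g_{i+1} = g_i^2
        x >>= 1
    return result

value = square_and_multiply(5, 117, 1019)
```

Python's built-in `pow(g, x, modulus)` computes the same value and is what one would use in practice; the loop is spelled out here only to mirror the formula.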
Many methods are known for computing discrete logarithms (the reader is referred to a recent review [77] for more details), such as the Baby Steps, Giant Steps technique [69]: $x = \log_g y$ is known to belong to $\{0, \ldots, q - 1\}$, and therefore there exist $a, b \in \{0, \ldots, s\}$, where $s = \lceil \sqrt{q} \rceil$, such that $x = a + bs$. Thus, building the two sets
$$S_1 = \{g^a \mid a \in \{0, \ldots, s\}\}, \qquad S_2 = \{y\,(g^{-s})^b \mid b \in \{0, \ldots, s\}\},$$
after a sort one can easily find a collision $c \in S_1 \cap S_2$: $c = g^a = y g^{-sb}$ for some pair $(a, b)$, which leads to $y = g^{a+sb}$. However, the time complexity of this method is in $O(\sqrt{q} \ln \sqrt{q})$, because of the sort. Furthermore, the space complexity is also in $O(\sqrt{q})$. The latter, which quickly becomes huge, makes this technique impractical as soon as $q$ is over a hundred bits.

Pollard's $\rho$ and $\lambda$ methods [58] avoid this large storage and the sort. The $\rho$ method is well known and provides a (heuristic) time complexity in $O(\sqrt{q})$. The $\lambda$ method is less known (a.k.a. the method for catching kangaroos). Both are still impractical as soon as $q$ is over a hundred and twenty bits.

When the order of $g$ is not prime, a divide-and-conquer technique over the subgroups whose orders are the factors of $q$ can be applied: the Pohlig-Hellman decomposition [52]. For example, let us assume that $q = \prod_{i=1}^{\ell} q_i$ with distinct primes $q_i$ (this can be extended to greater valuations). One computes $x_i = \log_{g_i} y_i$, where $g_i = g^{q/q_i}$ and $y_i = y^{q/q_i}$, for $i = 1, \ldots, \ell$. The Chinese Remainder Theorem (which will be recalled later) gives a solution of the simultaneous congruences $x = x_i \bmod q_i$, for $i = 1, \ldots, \ell$. The overall complexity is dominated by the cost of finding the discrete logarithm for the largest prime factor.

4.2.2 Suitable Groups
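To see why the group order must be large, it helps to run the Baby Steps, Giant Steps technique of the previous subsection on a toy instance. The sketch below is in Python and rests on assumptions of ours: the group is the subgroup of order $q = 509$ generated by $g = 4$ in $\mathbb{Z}_{1019}^*$, and a dictionary lookup replaces the sort, which is a common variant rather than the exact procedure described above.

```python
from math import isqrt

def baby_step_giant_step(g: int, y: int, q: int, p: int) -> int:
    """Return x in [0, q) with g**x % p == y, where g has order q modulo the prime p."""
    s = isqrt(q) + 1
    # Baby steps: store g^a for a in {0, ..., s}.
    baby = {}
    value = 1
    for a in range(s + 1):
        baby.setdefault(value, a)
        value = value * g % p
    # Giant steps: look for y * (g^-s)^b among the baby steps.
    giant_factor = pow(g, -s, p)       # modular inverse power, Python 3.8+
    gamma = y % p
    for b in range(s + 1):
        if gamma in baby:
            return (baby[gamma] + b * s) % q
        gamma = gamma * giant_factor % p
    raise ValueError("y is not in the subgroup generated by g")

p, q, g = 1019, 509, 4                 # toy parameters: p - 1 = 2 * 509
y = pow(g, 321, p)
x_found = baby_step_giant_step(g, y, q, p)
```

With roughly $\sqrt{q}$ stored values and as many lookups, the toy instance is solved instantly; the same cost for a 160-bit $q$ would be about $2^{80}$ operations and as many stored elements.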
This discrete logarithm problem needs a suitable cyclic group $\mathcal{G}$, whose order has at least one large prime factor. The first groups used in cryptography were cyclic subgroups of multiplicative groups of finite fields, $\langle g \rangle \subset \mathbb{Z}_p^*$, where $p$ is a large prime such that $p - 1$ admits a large prime factor $q$. More recently, other groups have been introduced, such as the algebraic varieties of (hyper)-elliptic curves and some ideals in number fields [29]. Indeed, in 1985, N. Koblitz [3] and V. Miller [45] proposed using elliptic curves in cryptography, where the underlying problem is the discrete logarithm over the points $(x, y) \in \mathbb{F}^2$ which satisfy an equation of the form
$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \qquad (a_1, a_2, a_3, a_4, a_6 \in \mathbb{F}),$$
equipped with an Abelian group structure. In 1988, N. Koblitz was the first to suggest using hyperelliptic curves [32, 33].

4.2.3 Specific Techniques
Of course, specific techniques have been designed to exploit the particularities of the underlying group. In the particular case of subgroups of $\mathbb{Z}_p^*$, the quadratic and number field sieves can be used. The latter provides a time complexity in
$$O(\exp((1.923 + o(1))(\ln p)^{1/3}(\ln\ln p)^{2/3})).$$
However, the quadratic sieve [35] is still the most efficient for current records, and was used by R. Lercier and A. Joux (CELAR, France) to establish the latest one: discrete logarithms modulo a 90-digit prime. More precisely,

p = \lfloor 10^{89} \pi \rfloor + 156137
  = 3141592653589793238462643383279502884197169399\
    37510582097494459230781640628620899862959619
g = 2
y = \lfloor 10^{89} e \rfloor
  = 27182818284590452353602874713526624977572470936\
    9995957496696762772407663035354759457138217
  = g^176713807211421696273204823407162027230205795\
    2449914157493844716677918658538374188101093

As for elliptic curves, for a long time only the generic techniques were known, but some new algorithms have appeared for particular cases: for anomalous elliptic curves [64]; for supersingular curves, where the discrete logarithm problem can be reduced to the finite field setting because the Frobenius map has trace zero [42]; for curves of trace one [74]; and, more recently, for curves with many automorphisms [17]. However, the generic $\rho$ method with distinguished points was used to establish the latest record, by R. Harley (INRIA, France). This record solves the so-called ECC2K-108 challenge from the list provided by the Canadian company Certicom. This challenge can be defined as follows:
• Let the curve $C$ be $y^2 + xy = x^3 + x^2 + 1$ over $\mathbb{F}_{2^{109}}$.
• Represent $\mathbb{F}_{2^{109}}$ as $\mathbb{F}_2[t]/(f(t))$, where $f(t) = t^{109} + t^9 + t^2 + t + 1$ is irreducible over $\mathbb{F}_2$.
• Then the following two points:
P = (0x0478C46CC96338CED91565E17257, 0x1E7965E4A3AFB73A48FC9AB790E9)
Q = (0x1FF0CE5EC61893F2119C3077C59E, 0x1F20E9B010AC691C9B87B438241D)
are on $C$, where the coordinates have been written as hexadecimal integers by reducing modulo $f$ and setting $t$ to 2.
The problem is to find the logarithm of $Q$ to the base $P$. The problem takes place in the subgroup of order $q$ = 324518553658426701487448656461467, which is a prime. The solution is 47455661896223045299748316018941 mod $q$.

Hyper-elliptic curves (or more precisely their Jacobians) have been proposed in the hope of using smaller fields, thanks to a high genus which increases the number of points, so that both the computational load and the size of the data are smaller. However, new methods [17, 23] have recently been proposed to compute discrete logarithms more efficiently in those groups.

Therefore, given the current state of the art of discrete logarithm algorithms, one uses subgroups $\langle g \rangle$ of either
• the multiplicative group $\mathbb{Z}_p^*$, where $p$ is a 512-bit prime such that $p - 1$ admits a 160-bit prime factor, which is the order of $g$; or
• an elliptic curve over the field $\mathbb{F}_{p^k}$, where $p$ is a prime and $k \times |p| \approx 160$, such that the cardinality admits a large prime factor.

5 Trapdoor One-Way Problems
Unfortunately, both problems, integer factorization and the discrete logarithm problem, are just one-way: no additional information makes them easier. However, some algebraic structures are based on the factorization of an integer $n$, where some computations are difficult without the factorization of $n$ but easy with it: the finite ring $\mathbb{Z}_n$, which is isomorphic to the product ring $\mathbb{Z}_p \times \mathbb{Z}_q$ if $n = p \cdot q$. As for the discrete logarithm, it helps to solve the so-called Diffie-Hellman problem [15], the first to have been proposed for cryptographic purposes.

5.1 Finite Rings: the RSA problem
Thanks to the equivalence relation of congruence modulo $n$, for any integer $n \in \mathbb{Z}$, one can define the quotient ring $\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$, where $a, b \in \mathbb{Z}$ are in the same class if $a = b \bmod n$. In the following, we always identify a class of $\mathbb{Z}_n$ with any integer of this class.

5.1.1 Addition, Subtraction, Multiplication, Division
Many results are known about this quite rich structure. First, one can efficiently check equality, and add, subtract or multiply two elements. The following results show that division by any invertible element is also easy.

Theorem 1 (Bezout) Let $a$ and $b$ be two integers in $\mathbb{Z}$. Then
$$(\exists u, v \in \mathbb{Z})(au + bv = 1) \iff \gcd(a, b) = 1.$$
More generally,

Theorem 2 Let $a$ and $b$ be two integers in $\mathbb{Z}$. Then
$$(\exists u, v \in \mathbb{Z})(au + bv = d) \iff \gcd(a, b) \mid d,$$
where $x \mid y$ means that there exists $z$ such that $y = xz$.

In particular, for any pair $(a, b) \in \mathbb{Z}^2$ there exists a pair $(u, v) \in \mathbb{Z}^2$ such that $au + bv = \gcd(a, b)$. Furthermore, the (extended) Euclidean algorithm is an efficient algorithm which, given such a pair $(a, b)$, provides both $\gcd(a, b)$ and the pair $(u, v)$ such that $au + bv = \gcd(a, b)$. Its time complexity is linear in the size of the input, i.e. in $O(|a| + |b|)$. As a consequence, for any $x \in \mathbb{Z}_n$, $x$ is invertible (lies in $\mathbb{Z}_n^*$) if and only if
$$(\exists y \in \mathbb{Z}_n)\; xy = 1 \bmod n \iff (\exists y \in \mathbb{Z}_n)(\exists k \in \mathbb{Z})\; xy + kn = 1 \iff \gcd(x, n) = 1.$$
Therefore, the extended Euclidean algorithm efficiently provides the inverse of any $x \in \mathbb{Z}_n^*$, which allows an efficient division. Furthermore, we get the following corollary:

Corollary 1 $p$ is a prime $\iff$ $\mathbb{Z}_p$ is a field.

5.1.2 Powers and Roots modulo n
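Computing roots modulo $n$ relies on modular inverses, so it may help to see the extended Euclidean algorithm of the previous subsection in code first. The following Python sketch is one standard iterative formulation, not necessarily the variant the authors have in mind; the function names `extended_gcd` and `mod_inverse` are illustrative assumptions.

```python
def extended_gcd(a: int, b: int):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for non-negative a and b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r     # usual Euclidean remainder step
        old_u, u = u, old_u - quotient * u     # keep the invariant old_r = a*old_u + b*old_v
        old_v, v = v, old_v - quotient * v
    return old_r, old_u, old_v

def mod_inverse(x: int, n: int) -> int:
    """Return the inverse of x modulo n, which exists if and only if gcd(x, n) = 1."""
    g, u, _ = extended_gcd(x % n, n)
    if g != 1:
        raise ValueError("x is not invertible modulo n")
    return u % n

g, u, v = extended_gcd(240, 46)
inverse_of_7 = mod_inverse(7, 40)
```

The error raised for non-invertible elements corresponds to the equivalence above: an inverse exists exactly when $\gcd(x, n) = 1$.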
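When $n$ is not a prime, $\mathbb{Z}_n$ is not a field, but the Chinese Remainder Theorem provides the explicit structure of $\mathbb{Z}_n^*$:

Theorem 3 (Chinese Remainder Theorem) Let $n = m_1 m_2$ be a composite integer, where $\gcd(m_1, m_2) = 1$. Then the ring $\mathbb{Z}_n$ is isomorphic to the product ring $\mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2}$, and the following morphisms,
$$f : \mathbb{Z}_n \to \mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2}, \quad x \mapsto (x \bmod m_1,\; x \bmod m_2)$$
$$g : \mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2} \to \mathbb{Z}_n, \quad (a, b) \mapsto a u m_2 + b v m_1 \bmod n,$$
where $u = m_2^{-1} \bmod m_1$ and $v = m_1^{-1} \bmod m_2$, are isomorphisms of rings, with $f \circ g = \mathrm{Id}_{\mathbb{Z}_{m_1} \times \mathbb{Z}_{m_2}}$ and $g \circ f = \mathrm{Id}_{\mathbb{Z}_n}$.

About the multiplicative group, one gets the following corollary:

Corollary 2 Let $n = m_1 m_2$ be a composite integer, where $\gcd(m_1, m_2) = 1$. Then
$$(\mathbb{Z}_n, +, \times) \simeq (\mathbb{Z}_{m_1}, +, \times) \times (\mathbb{Z}_{m_2}, +, \times),$$
and thus
$$(\mathbb{Z}_n^*, \times) \simeq (\mathbb{Z}_{m_1}^*, \times) \times (\mathbb{Z}_{m_2}^*, \times).$$
When we have a group, it is useful to know its cardinality in order to apply Lagrange's theorem.

Theorem 4 (Lagrange's Theorem) Let $\mathcal{G}$ be a group denoted multiplicatively. If we denote by $c$ its cardinality, then for any element $x \in \mathcal{G}$, $x^c = 1$.

Therefore, we denote by $\varphi(n)$ the cardinality of the multiplicative group $\mathbb{Z}_n^*$:
$$\varphi(n) = \mathrm{Card}(\mathbb{Z}_n^*).$$
Thanks to the Chinese Remainder Theorem, this function is weakly multiplicative:
$$\gcd(m_1, m_2) = 1 \implies \varphi(m_1 m_2) = \varphi(m_1)\varphi(m_2).$$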
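Euler applied Lagrange's theorem to the particular situation of the multiplicative group $\mathbb{Z}_n^*$:

Theorem 6 (Euler's Theorem) Let $n$ be any integer. For any element $x \in \mathbb{Z}_n^*$, $x^{\varphi(n)} = 1 \bmod n$.

Therefore,
• let $e$ be any integer relatively prime to $\varphi(n)$; by Bezout's theorem there exist integers $d$ and $k$ such that $ed + k\varphi(n) = 1$, and then, for any $x \in \mathbb{Z}_n^*$,
$$(x^e)^d = x^{ed} = x^{1 - k\varphi(n)} = x \cdot (x^{\varphi(n)})^{-k} = x \bmod n.$$
As previously seen, the $e$-th power can be easily computed using the square-and-multiply method. The above relation makes it easy to compute $e$-th roots, by computing $d$-th powers, where $ed = 1 \bmod \varphi(n)$.

5.1.3 The RSA Problem and the RSA Assumption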
In 1978, Ronald Rivest, Adi Shamir and Leonard Adleman [63] defined the following problem.

Definition 4 (The RSA Problem) Let $n = pq$ be the product of two large primes and $e$ an integer relatively prime to $\varphi(n)$. For a given $y \in \mathbb{Z}_n^*$, compute $x \in \mathbb{Z}_n^*$ such that $x^e = y \bmod n$.

We have seen above that with the factorization of $n$ (the trapdoor), this problem can be easily solved. However, nobody knows whether the factorization is required, and nobody knows how to do without it either:

Definition 5 (The RSA Assumption) For any product of two large enough primes, $n = pq$, the RSA problem is intractable (presumably as hard as the factorization of $n$).

5.2 The Diffie-Hellman Problem
In 1976, when Whitfield Diffie and Martin Hellman [15] suggested asymmetric cryptography, they proposed the following problem:

Definition 6 (The Diffie-Hellman Problem) Let $\langle g \rangle \subset \mathcal{G}$ be a cyclic group. Given $A = g^a$, $B = g^b \in \langle g \rangle$, compute $C = g^{ab}$.
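To make Definition 6 concrete, the following Python sketch builds a toy instance and checks that whoever knows one of the exponents obtains $C$ with a single exponentiation. The parameters are assumptions chosen for illustration only ($p = 1019$, with $g = 4$ generating the subgroup of prime order $q = 509$) and are far too small to offer any security.

```python
import secrets

p = 1019   # toy prime modulus, p - 1 = 2 * 509
q = 509    # prime order of the subgroup generated by g
g = 4      # generator of the order-q subgroup of Z_p^*

a = secrets.randbelow(q - 1) + 1   # secret exponent behind A
b = secrets.randbelow(q - 1) + 1   # secret exponent behind B
A = pow(g, a, p)                   # public value A = g^a
B = pow(g, b, p)                   # public value B = g^b

C_from_a = pow(B, a, p)            # knowing a: C = B^a
C_from_b = pow(A, b, p)            # knowing b: C = A^b
C = pow(g, a * b, p)               # the target C = g^(ab)
```

An observer who sees only `g`, `A` and `B` has no such shortcut, which is the content of the Diffie-Hellman problem.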
This problem is assumed to be intractable in any suitable group, i.e. any group where the discrete logarithm problem is intractable. However, even though the discrete logarithm $a$ of $A$ in base $g$ makes it easy to compute $C$, which is equal to $B^a$, the Diffie-Hellman problem has neither been proven equivalent to the discrete logarithm problem nor proven strictly easier. Another problem has been defined more recently [10], known as the Decisional Diffie-Hellman Problem:

Definition 7 (The Decisional Diffie-Hellman Problem) Let $\langle g \rangle \subset \mathcal{G}$ be a cyclic group. Given $A = g^a$, $B = g^b$, $C = g^c \in \langle g \rangle$, decide whether $C = g^{ab}$.

6 Recapitulation
Let us make a final review of all the problems we have just seen, splitting them into two families: the factorization-like problems and the discrete logarithm-like ones.

6.1 The Factorization
With the current knowledge from algebraic theory and algorithmics, the factorization problem is believed to be intractable, at least for RSA integers (products of two primes of similar size) of more than 700 bits:

Definition 8 (Factorization - FACT(n)) Given an integer $n = pq$, the product of two large primes $p$ and $q$, find these prime factors $p$ and $q$.

This provides a one-way function. The factorization also provides a trapdoor to the following problem:

Definition 9 (RSA(n, e)-Problem - RSA(n, e)) Given an integer $n = pq$, the product of two large primes $p$ and $q$, an exponent $e$ relatively prime to $\varphi(n)$, and an element $y \in \mathbb{Z}_n^*$, find $x \in \mathbb{Z}_n^*$ such that $x^e = y \bmod n$.

The function
$$f : \mathbb{Z}_n^* \to \mathbb{Z}_n^*, \qquad x \mapsto x^e \bmod n$$
is a permutation of $\mathbb{Z}_n^*$, whose inversion is intractable unless one knows the factorization of $n$. Hence the name trapdoor one-way permutation, which is a very useful property. However, RSA is the sole candidate for that kind of nice object!
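The trapdoor permutation property can be checked exhaustively on a toy modulus. The sketch below, in Python, uses the illustrative parameters $p = 61$, $q = 53$, $e = 17$, which are assumptions of ours and far too small for real use; it derives the inverse exponent $d$ from the factorization and verifies that $x \mapsto x^e \bmod n$ permutes $\mathbb{Z}_n^*$.

```python
from math import gcd

p, q = 61, 53                 # toy primes: the trapdoor
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # cardinality of Z_n^*, computable from p and q
e = 17                        # public exponent, relatively prime to phi
d = pow(e, -1, phi)           # inverse exponent, e*d = 1 mod phi (Python 3.8+)

def f(x: int) -> int:
    """The RSA function x -> x^e mod n, computable by anybody."""
    return pow(x, e, n)

def f_inverse(y: int) -> int:
    """Inversion with the trapdoor: an e-th root is a d-th power."""
    return pow(y, d, n)

units = [x for x in range(1, n) if gcd(x, n) == 1]       # the elements of Z_n^*
is_permutation = {f(x) for x in units} == set(units)
round_trip_ok = all(f_inverse(f(x)) == x for x in units)
```

Here `d` is obtained from `phi`, hence from the factors; without them, no method better than factoring $n$ is known for computing it.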
Since FACT(n) provides a trapdoor to RSA(n, e) for any $e$, clearly we have
$$\mathrm{FACT}(n) \geq \mathrm{RSA}(n, e),$$
where "$A \geq B$" means that a machine that can efficiently solve $A$ can be used to efficiently solve $B$.

6.2 The Discrete Logarithm Problem
Let us consider a suitable cyclic group $\mathcal{G}$ of order $q$ ($\langle g \rangle \subset \mathbb{Z}_p^*$, or a subgroup of an elliptic curve, etc.), with a generator $g$. The discrete logarithm problem in $\mathcal{G}$ is considered intractable:

Definition 10 (The Discrete Logarithm Problem - DL($\mathcal{G}$)) Given $y \in \mathcal{G}$, find $x \in \mathbb{Z}_q$ such that $g^x = y$.

However, it provides a trapdoor to solve the Diffie-Hellman problem:

Definition 11 (The Diffie-Hellman Problem - DH($\mathcal{G}$)) Given $A = g^a$, $B = g^b \in \mathcal{G}$, find $C = g^{ab}$.

It can also be stated as follows: given $A = g^a \in \mathcal{G}$, for any $B = g^b$ find $C = g^{ab}$. This problem is intractable unless one knows $a$, since $C = B^a$. Furthermore, even if one knows a candidate for the solution $C$, one cannot check its correctness:

Definition 12 (The Decisional Diffie-Hellman Problem - DDH($\mathcal{G}$)) Given $A = g^a$, $B = g^b$, $C = g^c \in \mathcal{G}$, decide whether $c = ab \bmod q$.

Once again, the discrete logarithm provides a trapdoor, and therefore
$$\mathrm{DL}(\mathcal{G}) \geq \mathrm{DH}(\mathcal{G}) \geq \mathrm{DDH}(\mathcal{G}).$$

Remark 1 If $\mathcal{G} \subset \mathbb{Z}_n^*$ for some $n = pq$, then
$$\mathrm{DL}(n) \geq \mathrm{FACT}(n) + \mathrm{DL}(p) + \mathrm{DL}(q),$$
where $\mathrm{DL}(k)$ denotes $\mathrm{DL}(G)$ for some group $G \subset \mathbb{Z}_k^*$. Therefore, the discrete logarithm in some subgroup of $\mathbb{Z}_n^*$ is harder than both the factorization of $n$ and the discrete logarithm in the subgroups modulo the prime factors of $n$: it combines the difficulty of both problems.

7 Application to Public Key Cryptography
Since we have one-way problems (factorization and the discrete logarithm) and trapdoor one-way problems (the $e$-th root and the Diffie-Hellman problems), asymmetric cryptography becomes a reality.

7.1 Public Key Encryption
A public key encryption scheme is a triple $(G, \mathcal{E}, \mathcal{D})$ as defined in the second part. Let us more formally define the security notions that an encryption scheme should satisfy.

7.1.1 Security Notions
The first common security notion that one would like an encryption scheme to satisfy is the one-wayness: with just public data, an attacker cannot get back the whole plaintext of a given ciphertext. More formally, this means that for any adversary A, her success in inverting £ without the secret should be negligible, over the choice of m and the random coins of £ and herself: Slices = Yx[A(£y (m)) = m\. m
y
However, many applications require more from an encryption scheme, namely semantic security (a.k.a. polynomial security or indistinguishability of encryptions [24]), as already remarked in the introduction. This security notion requires that it be computationally infeasible to tell which of two messages, chosen by the adversary, has been encrypted, with probability significantly better than one half. Her advantage, defined below, where the adversary is seen as a 2-stage Turing machine $(A_1, A_2)$, should be negligible:

$$\mathsf{Adv}^{\mathsf{ind}}_{\mathcal{A}} = 2 \times \Pr_{b}\left[ (k_p, k_s) \leftarrow \mathcal{G}(1^k),\ (m_0, m_1, s) \leftarrow A_1(k_p),\ c = \mathcal{E}_{k_p}(m_b) : A_2(m_0, m_1, s, c) = b \right] - 1.$$
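To make the two-stage game concrete, here is a minimal Python sketch of it, played against a toy deterministic scheme. Everything in it is an assumption made for illustration: the tiny modulus and exponent, the function names, the fixed challenge messages, and the normalization of the empirical advantage as twice the win rate minus one. It only shows why a deterministic encryption algorithm lets the adversary win every time, by re-encrypting one of her own messages and comparing.

```python
import secrets

# Toy deterministic "encryption": modular exponentiation with tiny, insecure
# parameters (hypothetical values, chosen only so the example runs instantly).
N, E = 3233, 17  # N = 61 * 53


def encrypt(m: int) -> int:
    """Deterministic encryption: the same plaintext always gives the same ciphertext."""
    return pow(m, E, N)


def adversary_stage1():
    """A1: choose two distinct messages and some state s to pass to A2."""
    m0, m1 = 42, 1337
    return m0, m1, None


def adversary_stage2(m0, m1, s, c) -> int:
    """A2: re-encrypt m0 with the public data and compare with the challenge."""
    return 0 if encrypt(m0) == c else 1


def play_game(trials: int = 200) -> float:
    """Run the game `trials` times and return the empirical advantage."""
    wins = 0
    for _ in range(trials):
        b = secrets.randbelow(2)          # hidden bit chosen by the challenger
        m0, m1, s = adversary_stage1()
        c = encrypt((m0, m1)[b])          # challenge ciphertext
        wins += adversary_stage2(m0, m1, s, c) == b
    return 2 * wins / trials - 1


print(play_game())
```

A probabilistic scheme defeats this particular adversary because re-encrypting `m0` no longer reproduces the challenge ciphertext.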
Another notion, the so-called non-malleability [16], was defined later, but it is equivalent to the above one in some specific scenarios [4, 8], so we do not detail it. On the other hand, an attacker can mount many kinds of attacks: she may just have access to public data, and then encrypt any plaintext of her choice (chosen-plaintext attacks), or she may moreover query the decryption algorithm (adaptively or non-adaptively chosen-ciphertext attacks [49, 61]). A general study of these security notions and attacks has recently been carried out [4]; we refer the reader to that paper for more details. Another scenario has recently been studied, which involves many users who receive encryptions (under their respective public keys) of related messages. So many encryptions may reveal some information about the plaintexts to an adversary that intercepts all the ciphertexts. However, recent papers [3, 2] have proven that if the scheme is semantically secure, it does not leak any information even in the multi-user scenario.
7.1.2 Some Examples

7.1.2.1 The RSA Encryption.
When they defined the RSA problem, Rivest-Shamir-Adleman [63] wanted to propose a public-key encryption scheme, thanks to the "trapdoor one-way permutation" property of the RSA function: the key generation algorithm produces a large composite number $n = pq$, a public key $e$, and a secret key $d$ such that $e \cdot d = 1 \bmod \varphi(n)$. The encryption of a message $m$, encoded as an element of $\mathbb{Z}_n^*$, is simply $c = m^e \bmod n$. This ciphertext can be easily decrypted thanks to the knowledge of $d$: $m = c^d \bmod n$. Clearly, this encryption scheme is one-way, under chosen-plaintext attacks, relative to the RSA problem. However, since it is deterministic, it cannot be semantically secure. Therefore, it cannot be proven secure in the multi-user scenario either. Moreover, it is well known to be weak against the broadcast attack when using a small exponent $e$ [28], e.g. $e = 3$. Consider the encryptions of a message $m$ under 3 different public keys $(n_1, e_1 = 3)$, $(n_2, e_2 = 3)$ and $(n_3, e_3 = 3)$, denoted by $c_i = m^3 \bmod n_i$ for $i = 1, 2, 3$, respectively. The Chinese Remainder Theorem provides the solution $c$ of the simultaneous congruences $c = c_i \bmod n_i$. But $m^3 \bmod n_1 n_2 n_3$ is also a solution to this system. The uniqueness proves that $c = m^3$ in $\mathbb{Z}$, since $m < n_i$ for each $i$, which implies that $m^3 < n_1 n_2 n_3$. An easy third root in $\mathbb{Z}$ then recovers $m$.
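The broadcast attack can be traced step by step on small numbers. The following Python sketch is only an illustration under stated assumptions: the three moduli are tiny hypothetical products of two primes (pairwise coprime, each larger than the message), and the helper names `crt` and `integer_cube_root` are introduced here, not taken from the text. It combines the three ciphertexts with the Chinese Remainder Theorem and takes an exact integer cube root.

```python
def integer_cube_root(x: int) -> int:
    """Largest integer r with r**3 <= x, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= x:
        hi *= 2
    # Invariant: lo**3 <= x < hi**3
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid
    return lo


def crt(residues, moduli):
    """Solve x = r_i mod n_i for pairwise coprime moduli; returns x in [0, prod n_i)."""
    x, n = 0, 1
    for r, m in zip(residues, moduli):
        # Lift the current solution x (mod n) to a solution mod n*m.
        t = ((r - x) * pow(n, -1, m)) % m   # modular inverse needs Python 3.8+
        x += n * t
        n *= m
    return x


# Hypothetical toy public moduli n_i = p_i * q_i, all used with e = 3.
moduli = [11 * 17, 23 * 29, 41 * 47]
m = 123                                    # the common plaintext, m < each n_i
ciphertexts = [pow(m, 3, n) for n in moduli]

c = crt(ciphertexts, moduli)               # equals m**3 over the integers
recovered = integer_cube_root(c)
print(recovered)
```

No secret key is used anywhere: the attacker only needs the three public moduli and the three intercepted ciphertexts.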
7.1.2.2 The El Gamal Encryption.
In 1985, El Gamal [18] designed a public-key encryption scheme based on the Diffie-Hellman key exchange protocol [15]: given a large prime $p$, and an element $g$ in $\mathbb{Z}_p^*$ of large prime order $q$, the key generation algorithm produces a random element $x \in \mathbb{Z}_q^*$ as secret key, and a public key $y = g^x \bmod p$. The encryption of a message $m$, encoded as an element of $\langle g \rangle$, is a pair $(c = g^a \bmod p,\ d = y^a m \bmod p)$ for a random $a \in \mathbb{Z}_q$. This ciphertext can be easily decrypted thanks to the knowledge of $x$: $m = d / c^x \bmod p$. This encryption scheme is well known to be one-way, under chosen-plaintext attacks, relative to the Diffie-Hellman problem (DH). It is also semantically secure, under chosen-plaintext attacks, relative to the Decisional Diffie-Hellman problem (DDH). However, we had to wait a long time for an efficient proposal that is provably semantically secure under adaptively chosen-ciphertext attacks. The most famous one was proposed just two years ago [14]; its security is relative to the Decisional Diffie-Hellman problem. Many others have also been proposed [73, 20, 21, 54], but under the additional assumption of the random oracle model [5].
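The key generation, encryption and decryption equations above can be exercised on a toy group. The sketch below is an illustration only: the parameters $p = 23$, $q = 11$, $g = 2$ are hypothetical and far too small for any security, and the function names are assumptions of this example. It checks that decryption inverts encryption for every message in the subgroup generated by $g$.

```python
import secrets

# Hypothetical toy parameters: g = 2 has prime order q = 11 in Z_23^*.
P, Q, G = 23, 11, 2


def keygen():
    """Secret key x in Z_q^*, public key y = g^x mod p."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(y: int, m: int):
    """Ciphertext (c, d) = (g^a, y^a * m) for a fresh random a in Z_q."""
    a = secrets.randbelow(Q)
    return pow(G, a, P), (pow(y, a, P) * m) % P


def decrypt(x: int, c: int, d: int) -> int:
    """Recover m = d / c^x mod p (modular inverse via pow, Python 3.8+)."""
    return (d * pow(pow(c, x, P), -1, P)) % P


x, y = keygen()
subgroup = [pow(G, i, P) for i in range(Q)]   # messages are encoded in <g>
for m in subgroup:
    c, d = encrypt(y, m)
    assert decrypt(x, c, d) == m
print("all", len(subgroup), "subgroup messages decrypt correctly")
```

Because a fresh exponent `a` is drawn for every call, encrypting the same message twice generally produces different ciphertexts, in contrast with the deterministic RSA function.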
7.2 Digital Signature Schemes
As above, a digital signature scheme $(\mathcal{G}, \mathcal{S}, \mathcal{V})$ admits various levels of security.

7.2.1 Forgeries and Attacks
In this subsection, we formalize some security notions [26, 27] which capture the main practical situations. On the one hand, the goals of the adversary may be various:

• total break: disclosing the secret key of the signer.
• universal forgery: constructing an efficient algorithm to sign any message.
• existential forgery: providing a new message-signature pair.

In many cases the latter forgery, the existential forgery, is not dangerous, because the output message is likely to be meaningless. Nevertheless, a signature scheme which is not existentially unforgeable does not guarantee by itself the identity of the signer. For example, it cannot be used to certify random-looking elements, such as keys.

On the other hand, various means can be made available to the adversary to help her produce a forgery. We focus on two specific kinds of attacks against signature schemes: the no-message attacks and the known-message attacks. In the first scenario, the attacker only knows the public key of the signer. In the second one, the attacker has access to a list of valid message-signature pairs. According to the way this list was created, one usually distinguishes many subclasses, but the strongest is the adaptively chosen-message attack, where the attacker can ask the signer to sign any message that she wants. She can therefore adapt her queries according to previous message-signature pairs.

Definition 13 (Secure Signature Scheme) A signature scheme is secure if an existential forgery is computationally impossible, even under an adaptively chosen-message attack.

7.2.2 Some Examples
The first secure signature scheme was proposed by Goldwasser et al. [26] in 1984. It uses the notion of claw-free permutation pairs, and provides polynomial algorithms with a polynomial reduction between the search for a claw and an existential forgery under an adaptively chosen-message attack. However, the scheme is totally impractical. Fortunately, some practical schemes have been proposed.

7.2.2.1 The RSA Signature.
In the same paper as their public key encryption scheme, Rivest-Shamir-Adleman [63] proposed the first signature scheme based on the "trapdoor one-way permutation paradigm", using the RSA function: the key generation algorithm produces a large composite number $n = pq$, a public key $e$, and a secret key $d$ such that $e \cdot d = 1 \bmod \varphi(n)$. The signature of a message $m$, encoded as an element of $\mathbb{Z}_n^*$, is its $e$-th root, $\sigma = m^{1/e} = m^d \bmod n$. The verification algorithm simply checks whether $m = \sigma^e \bmod n$. However, the RSA scheme is not secure by itself since it is subject to existential forgery: it is easy to create a valid message-signature pair, without any help from the signer, by first randomly choosing a certificate $\sigma$ and then getting the signed message $m$ from the public verification relation, $m = \sigma^e \bmod n$. Nevertheless, a classical way to increase security is to use the "hash-and-sign" paradigm: to sign any message $m$, one first hashes it, using any hash function such as MD5 [62] or SHA-1 [51], into $h = H(m)$, encoded as an element of $\mathbb{Z}_n^*$. The signature of $m$ is therefore the $e$-th root of $h$, $\sigma = h^{1/e} = h^d \bmod n$. The verification algorithm simply checks whether $h = \sigma^e \bmod n$. [...] secure.
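Both the existential forgery on the plain RSA signature and the hash-and-sign variant can be illustrated in a few lines. The Python sketch below is hedged in several ways: the key $(n, e, d) = (3233, 17, 2753)$ is a hypothetical toy key, SHA-256 from the standard library stands in for the hash functions named in the text, and reducing the digest modulo a tiny $n$ is only a placeholder for a proper encoding into $\mathbb{Z}_n^*$.

```python
import hashlib

# Hypothetical toy key: n = 61 * 53, e * d = 1 mod phi(n) = 3120.
N, E, D = 3233, 17, 2753


def sign_plain(m: int) -> int:
    """Plain RSA signature: the e-th root of m, computed with the secret d."""
    return pow(m, D, N)


def verify_plain(m: int, sigma: int) -> bool:
    """Check the public relation m = sigma^e mod n."""
    return pow(sigma, E, N) == m % N


# Existential forgery without the secret key: pick sigma first, derive m.
forged_sigma = 1234
forged_m = pow(forged_sigma, E, N)
print(verify_plain(forged_m, forged_sigma))      # the forged pair verifies


def hash_to_zn(message: bytes) -> int:
    """Stand-in for H: SHA-256 digest reduced mod n (assumption of this sketch)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign_hashed(message: bytes) -> int:
    """Hash-and-sign: the e-th root of h = H(m)."""
    return pow(hash_to_zn(message), D, N)


def verify_hashed(message: bytes, sigma: int) -> bool:
    """Check h = sigma^e mod n with h recomputed from the message."""
    return pow(sigma, E, N) == hash_to_zn(message)


print(verify_hashed(b"hello", sign_hashed(b"hello")))
```

With hashing, the forger who picks $\sigma$ first obtains a value $h = \sigma^e \bmod n$ but would still need a message whose hash equals $h$, which is where the properties of $H$ come into play.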
7.2.2.2 The Schnorr Signature.
In 1986 a new paradigm for signature schemes was introduced. It is derived from fair zero-knowledge identification protocols involving a prover and a verifier [25], and uses hash functions in order to create a kind of virtual verifier. The first application was derived from the Fiat-Shamir [19] zero-knowledge identification protocol, based on the hardness of extracting square roots modulo $n$ (which is equivalent to the factorization of $n$), with a brief outline of its security. Another famous identification scheme [66], together with the signature scheme [67], was proposed later by Schnorr, based on that paradigm, under the discrete logarithm problem: the key generation algorithm produces two large primes $p$ and $q$, such that $q \geq 2^k$, where $k$ is the security parameter, and $q \mid p - 1$, as well as an element $g$ of $\mathbb{Z}_p^*$ of order $q$. It also creates a pair of keys, $x \in \mathbb{Z}_q^*$ and $y = g^{-x} \bmod p$. The signer publishes $y$ and keeps $x$ secret. The signature of a message $m$ is a triple $(r, e, s)$, where $r = g^K \bmod p$, with a random $K \in \mathbb{Z}_q^*$, the "challenge" $e = H(m, r) \bmod q$ and $s = K + ex \bmod q$. It satisfies $r = g^s y^e \bmod p$ with $e = H(m, r)$, or simply $e = H(m, g^s y^e \bmod p)$, which is checked by the verification algorithm. The security results for that paradigm were considered folklore for a long time, without any formal validation. This formal validation appeared a few years ago [55, 56], under the random oracle assumption [5].
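The signing and verification equations can be checked mechanically on a toy group. This Python sketch rests on assumptions that are not in the text: the parameters $p = 23$, $q = 11$, $g = 2$ are hypothetical and insecure, SHA-256 stands in for $H$, and the byte encoding of $(m, r)$ is an arbitrary choice. It verifies that $g^s y^e = g^{K + ex} g^{-xe} = g^K = r$ for honest signatures.

```python
import hashlib
import secrets

# Hypothetical toy parameters: q = 11 divides p - 1 = 22, and g = 2 has order q.
P, Q, G = 23, 11, 2


def H(message: bytes, r: int) -> int:
    """Stand-in hash of (m, r), reduced mod q (encoding is an assumption)."""
    data = message + r.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen():
    """Secret x in Z_q^*, public y = g^(-x) mod p (negative exponent: Python 3.8+)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, -x, P)


def sign(x: int, message: bytes):
    """Signature (r, e, s) with r = g^K, e = H(m, r), s = K + e*x mod q."""
    k = secrets.randbelow(Q - 1) + 1
    r = pow(G, k, P)
    e = H(message, r)
    s = (k + e * x) % Q
    return r, e, s


def verify(y: int, message: bytes, signature) -> bool:
    """Recompute g^s * y^e mod p and check it against r and the challenge e."""
    r, e, s = signature
    r_check = (pow(G, s, P) * pow(y, e, P)) % P
    return r_check == r and e == H(message, r_check)


x, y = keygen()
sig = sign(x, b"some message")
print(verify(y, b"some message", sig))
```

Changing `s` by one multiplies the recomputed commitment by `g`, so the tampered triple no longer satisfies the first check.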
8 Conclusion and Open Problems
Although mathematics has failed to help in proving the security of public-key cryptosystems, because of the "computational" point of view, it has turned out to provide useful objects that make the suggestion of Diffie and Hellman practical. Indeed, it provides one-way and trapdoor one-way problems, which are the foundations of public-key cryptography, for solving the confidentiality and authentication concerns as seen above, but also for many other applications [41]. However, many problems remain open.

• About the factorization-like problems:
  - Do there exist algorithms to solve the factorization problem more efficiently than the NFS?
  - What is the exact relation between RSA and FACT? Even if they are likely not equivalent [9], nobody has ever shown any gap.
• About the discrete-logarithm-like problems:
  - Is this problem stronger over an elliptic curve than in $\mathbb{Z}_p^*$ (for similar sizes)? The research has begun only recently, and better algorithms are appearing.
  - Are there other suitable groups?
  - What about the real difficulty of the DH and DDH problems? Some relations are known, but only in particular cases [39, 72].
• Are there other candidates as one-way and trapdoor one-way problems? Such new problems would immediately lead to new cryptosystems thanks to generic conversions [5, 6, 20, 21, 54].

On the other hand, mathematics has provided many tools for breaking cryptosystems, such as LLL [37] or NFS [36]. Maybe it can provide some other tools to cryptographers.
1. D. Atkins, M. Graff, A. K. Lenstra, and P. C. Leyland. THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE. In Asiacrypt '94, LNCS 917, pages 263-277. Springer-Verlag, Berlin, 1995. 2. O. Baudron, D. Pointcheval, and J. Stern. Extended Notions of Security for Multicast Public Key Cryptosystems. In Proc. of the 27th ICALP, LNCS. Springer-Verlag, Berlin, 2000. 3. M. Bellare, A. Boldyreva, and S. Micali. Public-key Encryption in a Multi-User Setting: Security Proofs and Improvements. In Eurocrypt '2000, LNCS. Springer-Verlag, Berlin, 2000. 4. M. Bellare, A. Desai, D. Pointcheval, and P. Rogaway. Relations among Notions of Security for Public-Key Encryption Schemes. In Crypto '98, LNCS 1462, pages 26-45. Springer-Verlag, Berlin, 1998. 5. M. Bellare and P. Rogaway. Random Oracles Are Practical: a Paradigm for Designing Efficient Protocols. In Proc. of the 1st CCS, pages 62-73. ACM Press, New York, 1993. 6. M. Bellare and P. Rogaway. Optimal Asymmetric Encryption - How to Encrypt with RSA. In Eurocrypt '94, LNCS 950, pages 92-111. SpringerVerlag, Berlin, 1995. 7. M. Bellare and P. Rogaway. The Exact Security of Digital Signatures How to Sign with RSA and Rabin. In Eurocrypt '96, LNCS 1070, pages 399-416. Springer-Verlag, Berlin, 1996. 8. M. Bellare and A. Sahai. Non-Malleable Encryption: Equivalence between Two Notions, and an Indistinguishability-Based Characterization. In Crypto '99, LNCS 1666, pages 519-536. Springer-Verlag, Berlin, 1999. 9. D. Boneh and R. Venkatesan. Breaking RSA May Not be Equivalent to Factoring. In Eurocrypt '98, LNCS 1403, pages 59-71. Springer-Verlag, Berlin, 1998. 10. S. A. Brands. An Efficient Off-Line Electronic Cash System Based on the Representation Problem. Technical Report CS-R9323, CWI, Amsterdam, 1993. 11. R. P. Brent. An Improved Monte Carlo Factorization Algorithm. BIT, 20:176-184, 1980. 12. S. Cavallar, B. Dodson, A. K. Lenstra, W. Lioen, P. L. Montgomery, B. Murphy, H. te Riele, K. Aardal, J. Gilchrist, G. Guillerm, P. Leyland, J. 
Marchand, F. Morain, A. Muffett, C. Putnam, C. Putnam, and P. Zimmermann. Factorization of a 512-bit RSA Modulus. In Eurocrypt '2000, LNCS. Springer-Verlag, Berlin, 2000. 13. J. S. Coron. On the Exact Security of Full-Domain-Hash. In Crypto
'2000, LNCS. Springer-Verlag, Berlin, 2000. 14. R. Cramer and V. Shoup. A Practical Public Key Cryptosystem Provably Secure against Adaptive Chosen Ciphertext Attack. In Crypto '98, LNCS 1462, pages 13-25. Springer-Verlag, Berlin, 1998. 15. W. Diffie and M. E. Hellman. New Directions in Cryptography. IEEE Transactions on Information Theory, IT-22(6):644-654, November 1976. 16. D. Dolev, C. Dwork, and M. Naor. Non-Malleable Cryptography. In Proc. of the 23rd STOC. ACM Press, New York, 1991. 17. I. Duursma, P. Gaudry, and F. Morain. Speeding Up the Discrete Log Computation on Curves with Automorphisms. In Asiacrypt '99, LNCS 1716. Springer-Verlag, Berlin, 1999. 18. T. El Gamal. A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory, IT-31(4):469-472, July 1985. 19. A. Fiat and A. Shamir. How to Prove Yourself: Practical Solutions of Identification and Signature Problems. In Crypto '86, LNCS 263, pages 186-194. Springer-Verlag, Berlin, 1987. 20. E. Fujisaki and T. Okamoto. How to Enhance the Security of Public-Key Encryption at Minimum Cost. In PKC '99, LNCS 1560, pages 53-68. Springer-Verlag, Berlin, 1999. 21. E. Fujisaki and T. Okamoto. Secure Integration of Asymmetric and Symmetric Encryption Schemes. In Crypto '99, LNCS 1666, pages 537-554. Springer-Verlag, Berlin, 1999. 22. M. R. Garey and D. S. Johnson. Computers and Intractability, A Guide to the Theory of NP-Completeness. Freeman, San Francisco, CA, 1979. 23. P. Gaudry. An Algorithm for Solving the Discrete Log Problem on Hyperelliptic Curves. In Eurocrypt '2000, LNCS. Springer-Verlag, Berlin, 2000. 24. S. Goldwasser and S. Micali. Probabilistic Encryption. Journal of Computer and System Sciences, 28:270-299, 1984. 25. S. Goldwasser, S. Micali, and C. Rackoff. The Knowledge Complexity of Interactive Proof Systems. In Proc. of the 17th STOC, pages 291-304. ACM Press, New York, 1985. 26. S. Goldwasser, S. Micali, and R. Rivest.
A "Paradoxical" Solution to the Signature Problem. In Proc. of the 25th FOCS, pages 441-448. IEEE, New York, 1984. 27. S. Goldwasser, S. Micali, and R. Rivest. A Digital Signature Scheme Secure Against Adaptative Chosen-Message Attacks. SIAM Journal of Computing, 17(2):281-308, April 1988. 28. J. Hastad. On Using RSA with Low Exponent in a Public Key Network.
In Crypto '85, LNCS 218, pages 403-408. Springer-Verlag, Berlin, 1986. 29. D. Huhnlein, M. J. Jacobson, S. Paulus, and T. Takagi. A Cryptosystem Based on Non-Maximal Imaginary Quadratic Orders with Fast Decryption. In Eurocrypt '98, LNCS 1403, pages 294-307. Springer-Verlag, Berlin, 1998. 30. D. E. Knuth. The Art of Computer Programming ~ Seminumerical Algorithms, volume 2. Addison-Wesley, Readings, Massachusetts, London, 1981. 31. N. Koblitz. Elliptic Curve Cryptosystems. Mathematics of Computation, 48(177) :203-209, January 1987. 32. N. Koblitz. A Family of Jacobians Suitable for Discrete Log Cryptosystems. In Crypto '88, LNCS 403, pages 94-99. Springer-Verlag, Berlin, 1989. 33. N. Koblitz. Hyperelliptic Cryptosystems. Journal of Cryptology, 1:139150, 1989. 34. X. Lai and J. L. Massey. A Proposal for a New Block Encryption Standard. In Eurocrypt '90, LNCS 473, pages 389-404. Springer-Verlag, Berlin, 1991. 35. B. A. LaMacchia and A. M. Odlyzko. Computation of Discrete Logarithms in Prime Fields. Designs, Codes and Cryptography, l(l):47-62, May 1991. 36. A. Lenstra and H. Lenstra. The Development of the Number Field Sieve, volume 1554 of Lecture Notes in Mathematics. Springer-Verlag, 1993. 37. A. K. Lenstra, H. W. Lenstra, and L. Lovasz. Factoring Polynomials with Rational Coefficients. Mathematische Annalen, 261(4):515-534, 1982. 38. J. L. Massey. SAFER K-64: a Byte-Oriented Block Ciphering Algorithm. In Proc. of the 1st FSE, LNCS 809, pages 1-17. Springer-Verlag, Berlin, 1994. 39. U. M. Maurer. Diffie Hellman Oracles. In Crypto '96, LNCS 1109, pages 268-282. Springer-Verlag, Berlin, 1996. 40. R. J. McEliece. A Public-Key Cryptosystem Based on Algebraic Coding Theory. DSN progress report, 42-44:114-116, 1978. Jet Propulsion Laboratories, CALTECH. 41. A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996. Available from http://www.cacr.math.uwaterloo.ca/hac/. 42. A. J. Menezes, T. Okamoto, and S. A. Vanstone. 
Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field. IEEE Transactions on Information Theory, 39:1639-1646, 1993. 43. R. C. Merkle and M. E. Hellman. Hiding Information and Signatures in
Trapdoor Knapsacks. IEEE Transactions on Information Theory, IT-24:525-530, 1978. 44. G. Miller. Riemann's Hypothesis and Tests for Primality. Journal of Computer and System Sciences, 13:300-317, 1976. 45. V. Miller. Uses of Elliptic Curves in Cryptography. In Crypto '85, LNCS 218, pages 417-426. Springer-Verlag, Berlin, 1986. 46. P. L. Montgomery. Speeding the Pollard and Elliptic Curve Methods for Factorization. Mathematics of Computation, 48(177):243-264, January 1987. 47. M. A. Morrison and J. Brillhart. A Method of Factoring and the Factorization of $F_7$. Mathematics of Computation, 29(129):183-205, January 1975. 48. M. Naor and M. Yung. Universal One-Way Hash Functions and Their Cryptographic Applications. In Proc. of the 21st STOC, pages 33-43. ACM Press, New York, 1989. 49. M. Naor and M. Yung. Public-Key Cryptosystems Provably Secure against Chosen Ciphertext Attacks. In Proc. of the 22nd STOC, pages 427-437. ACM Press, New York, 1990. 50. National Bureau of Standard U.S. Data Encryption Standard, 1977. 51. NIST. Secure Hash Standard (SHS). Federal Information Processing Standards Publication 180-1, April 1995. 52. S. C. Pohlig and M. E. Hellman. An Improved Algorithm for Computing Logarithms over GF(p) and its Cryptographic Significance. IEEE Transactions on Information Theory, IT-24(1):106-110, January 1978. 53. D. Pointcheval. A New Identification Scheme Based on the Perceptrons Problem. In Eurocrypt '95, LNCS 921, pages 319-328. Springer-Verlag, Berlin, 1995. 54. D. Pointcheval. Chosen-Ciphertext Security for any One-Way Cryptosystem. In PKC '2000, LNCS 1751, pages 129-146. Springer-Verlag, Berlin, 2000. 55. D. Pointcheval and J. Stern. Security Proofs for Signature Schemes. In Eurocrypt '96, LNCS 1070, pages 387-398. Springer-Verlag, Berlin, 1996. 56. D. Pointcheval and J. Stern. Security Arguments for Digital Signatures and Blind Signatures. Journal of Cryptology, to appear. Available from http://www.di.ens.fr/~pointche. 57. J. M. Pollard. A Monte Carlo Method for Factorization. BIT, 15:331-334, 1975. 58. J. M. Pollard. Monte Carlo Methods for Index Computation (mod p). Mathematics of Computation, 32(143):918-924, July 1978. 59. C. Pomerance. The Quadratic Sieve Algorithm. In Eurocrypt '84, LNCS
209, pages 169-182. Springer-Verlag, Berlin, 1985. 60. C. Pomerance and S. S. Wagstaff. Implementation of the Continued Fraction Integer Factoring Algorithm. Congressus Numerantium, 37:99118, 1983. 61. C. Rackoff and D. R. Simon. Non-Interactive Zero-Knowledge Proof of Knowledge and Chosen Ciphertext Attack. In Crypto '91, LNCS 576, pages 433-444. Springer-Verlag, Berlin, 1992. 62. R. Rivest. The MD5 Message-Digest Algorithm. RFC 1321, The Internet Engineering Task Force, April 1992. 63. R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Communications of the ACM, 21(2):120-126, February 1978. 64. T. Satoh and K. Araki. Fermat Quotients and the Polynomial Time Discrete Log Algorithm for Anomalous Elliptic Curves. Comment. Math. Helv., 47(l):81-92, 1998. 65. B. Schneier. Applied Cryptography, Protocols, Algorithms, and Source Code in C. John Wiley & Sons, Inc, 1994. 66. C. P. Schnorr. Efficient Identification and Signatures for Smart Cards. In Crypto '89, LNCS 435, pages 235-251. Springer-Verlag, Berlin, 1990. 67. C. P. Schnorr. Efficient Signature Generation by Smart Cards. Journal of Cryptology, 4(3):161-174, 1991. 68. A. Shamir. An Efficient Identification Scheme Based on Permuted Kernels. In Crypto '89, LNCS 435, pages 606-609. Springer-Verlag, Berlin, 1990. 69. D. Shanks. Class Number, a Theory of Factorization, and Genera. In Proceedings of the Symposium on Pure Mathematics, volume 20, pages 415-440. AMS, 1971. 70. C.E.Shannon. Communication Theory of Secrecy Systems. Bell System Technical Journal, 28(4):656-715, 1949. 71. A. Shimizu and S. Miyaguchi. Fast Data Encipherment Algorithm FEAL. In Eurocrypt '87, LNCS 304, pages 267-278. Springer-Verlag, Berlin, 1988. 72. V. Shoup. Lower Bounds for Discrete Logarithms and Related Problems. In Eurocrypt '97, LNCS 1233, pages 256-266. Springer-Verlag, Berlin, 1997. 73. V. Shoup and R. Gennaro. Securing Threshold Cryptosystems against Chosen Ciphertext Attack. 
In Eurocrypt '98, LNCS 1403, pages 1-16. Springer-Verlag, Berlin, 1998. 74. N. Smart. The Discrete Logarithm Problem on Elliptic Curves of Trace One. Journal of Cryptology, 12(3):193-196, 1999.
75. J. Stern. A New Identification Scheme Based on Syndrome Decoding. In Crypto '93, LNCS 773, pages 13-21. Springer-Verlag, Berlin, 1994. 76. D. R. Stinson. Cryptography Theory and Practice. CRC Press, 1995. 77. P. C. van Oorschot and M. J. Wiener. On Diffie-Hellman Key Agreement with Short Exponents. In Eurocrypt '96, LNCS 1070, pages 332-343. Springer-Verlag, Berlin, 1996. 78. S. Vaudenay. Provable Security for Block Ciphers by Decorrelation. In STACS '98, LNCS 1373, pages 249-275. Springer-Verlag, Berlin, 1998. 79. G. S. Vernam. Cipher Printing Telegraph Systems for Secret Wire and Radio Telegraphic Communications. Journal of the American Institute of Electrical Engineers, 45:109-115, 1926. 80. H. C. Williams. A p + 1 Method of Factoring. Mathematics of Computation, 39(159):225-234, July 1982.
SOME APPLICATIONS OF GRAPH THEORY*
FRED ROBERTS
Department of Mathematics and DIMACS, Rutgers University
1 Introduction
Graph theory is an old subject, but one that has many fascinating modern applications. These applications in turn have offered important stimulus to the development of the field, leading to generalizations of important graph-theoretical concepts and challenging questions about them. We will illustrate these points with three graph-theoretical concepts: graph coloring, intersection graph, and competition graph. For each, we will mention a variety of applications, concentrating on a few; discuss generalizations related to applications; and describe a few recent results and open questions. We will use the terminology of graph theory from the book [124]. Formally, a graph consists of a set $V$ called the set of vertices or points and a set $E$ called the set of edges and consisting of unordered pairs of vertices.

2 Graph Coloring
A coloring of a graph is an assignment of a color to each vertex so that if two vertices are joined by an edge, they get different colors. We say that a graph is k-colorable if it can be colored in $k$ or fewer colors. The smallest $k$ so that graph $G$ is k-colorable is called the chromatic number of $G$ and is denoted by $\chi(G)$. For an introduction to graph coloring, see [124].

2.1 Applications of Graph Coloring
Graph coloring has many fascinating applications. See [127] for a survey of such applications. In channel assignment, we have a set of transmitters taken to be the vertices of a graph, an edge means that the corresponding transmitters interfere, and the color assigned to a vertex is its assigned channel. The idea is that if two transmitters interfere, they get different channels. The channel assignment problem was formulated graph-theoretically in [6, 60] and has been studied widely. See [166] for an example of a recent paper on the subject.

In traffic phasing, we have a set of individuals (or cars or ...) with requests to use a facility (room, tool, traffic intersection). These are taken to be the vertices of a graph and an edge means their requests interfere. The color assigned to a vertex is the time the facility is assigned to the individual. The idea is that if two individuals have interfering requests, the individuals will be given different times to use the facility. This problem arises in phasing traffic lights, but also in a variety of other scheduling problems. An early paper formulating a graph-theoretical approach to the traffic phasing problem is by Stoffers [144]. Stoffers' method was discussed and generalized in [106, 107, 108, 118, 120, 123].

*Fred Roberts thanks the National Science Foundation for its support under grant NSF-SBR-9709134 to Rutgers University. He thanks the Combinatorial and Computational Mathematics Center at Pohang University of Science and Technology for sponsoring the conference that led to this paper and offers his congratulations and best wishes for Com2MaC's success.
Graph coloring also arises in scheduling meetings of legislative committees. In work connected to the legislature in the State of New York in the USA, various legislative committees are taken to be vertices of a graph and an edge means the committees have a member in common. The color assigned to a vertex is its assigned meeting time and, of course, if two committees have a common member, they must be assigned different meeting times. This problem was studied for the New York State legislature in [9] and an exposition of it can be found in [124]. A similar problem arises in scheduling final exams at a university or jobs in a factory.

Graph coloring has also arisen in assigning schedules to garbage trucks in the City of New York. Here, the vertices of a graph represent "tours" of garbage trucks, schedules of sites they visit on a given day, an edge means they visit a common site, and the color assigned to a vertex is the day of the week the "tour" is scheduled to run. If two tours visit a common site, they must get different days. This graph coloring problem is a subproblem that arises as part of a large and complicated problem in operations research and was discussed in [151]. See also [118, 124] for an exposition of the approach.

In the fleet maintenance problem, we have a set of vehicles (planes, cars, ships) scheduled for regular maintenance. These become the vertices of a graph and an edge means that the corresponding vehicles are in the maintenance facility at overlapping times. The color gives the space assigned to a vehicle and two vehicles scheduled into the facility at overlapping times must get different spaces. This problem first arose in shipyards with the "vehicles" being ships and was studied at IBM by Alan Hoffman and Ellis Johnson (see [50]). For more general discussion of the problem, see [106, 124].
In the task assignment problem, a variety of tasks need to be performed. The tasks become vertices of a graph and an edge means that the corresponding tasks use a common tool or common space or common worker. Then the color is the time assigned to the task. If two tasks use a common tool or space or worker, then they must get different times for these tasks. See [106, 124] for a discussion.

The mobile radio frequency assignment problem is concerned with assigning frequencies to mobile phones in a region. The region is divided into zones, which are the vertices of a graph, and an edge between two zones means that the phones in the two zones interfere. The color assigned to a zone is the frequency to be used by the phones in that zone, and the restriction is that if two zones interfere, they must get different frequencies. This problem was first formulated graph-theoretically by Gilbert [48]. For discussion, see [106, 120, 123].

2.2 Variants of Ordinary Graph Coloring
The problems described in the previous subsection often have complications that make ordinary graph coloring an oversimplified model. These complications have given rise to fascinating new graph coloring concepts.

2.2.1 T-Coloring
One complication is that channels assigned to interfering transmitters might be subject to various distance constraints. We can formalize this by talking about a set $T$ of nonnegative integers. We can think of a channel as a positive integer. Then, the idea is that interfering transmitters cannot get channels which are separated by an integer in the set $T$, thought of as a disallowed separation. Formally, we seek a function $f$ that assigns a positive integer to each vertex of a graph $(V, E)$ so that

$$\{x, y\} \in E \Rightarrow |f(x) - f(y)| \notin T.$$

A function $f$ satisfying this condition is called a T-coloring. The special case of $T = \{0\}$ is ordinary graph coloring. Another simple case is $T = \{0, 1\}$. Here, interfering transmitters get not only different channels, but non-adjacent channels.

Consider for example the case of a complete graph on three vertices and the set $T = \{0, 1, 4, 5\}$. We can color the vertices "greedily," using the lowest acceptable positive integer for each. In that case, we would color the first vertex 1, the second 3, and the third 9. Another T-coloring would use the integers 1, 4, 7. These two T-colorings are comparable in terms of the number of colors (channels) used. However, the second is better in terms of the separation between the largest and smallest colors used. This separation is called the span of the T-coloring and the minimum span over all T-colorings is denoted $\mathrm{sp}_T(G)$.

T-colorings were introduced by Hale [60] and formalized later by Cozzens and Roberts [29]. Since then, there have been dozens of papers written on T-colorings, and five Ph.D. theses [11, 92, 111, 146, 162]. For a survey of the literature on this topic, see [126]. There is much work to be done on T-colorings. For example, we don't even know $\mathrm{sp}_T(K_p)$ for every T-set and every complete graph $K_p$. ($K_p$ has $p$ vertices and all possible edges.) Cozzens and Roberts [29] showed that if $T = \{0, 1, 2, \ldots, r\}$, then

$$\mathrm{sp}_T(G) = (r + 1)[\chi(G) - 1] = \mathrm{sp}_T(K_{\chi(G)}).$$
The result even holds if $T = \{0, 1, 2, \ldots, r\} \cup S$, where $S$ does not contain a multiple of $r + 1$. Such a set $T$ is called an r-initial set. For example, $T = \{0, 1, 2, 5, 7\}$ is 2-initial. Raychaudhuri [111, 112] showed that if $T = \{0, s, 2s, \ldots, ks\}$, then $\mathrm{sp}_T(G) = st + skt - sk - 1$ if $\chi(G) = st$ and $\mathrm{sp}_T(G) = st + skt + p - 1$ if $\chi(G) = st + p$, 0
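The T-coloring example on the complete graph with three vertices can be reproduced with a short script. The following Python sketch is a brute-force illustration, not a method from the literature cited here; the function names and the search bound `max_span` are assumptions of the example. It runs the greedy coloring, checks the alternative coloring 1, 4, 7, and finds the minimum span by exhaustive search on small graphs.

```python
from itertools import product


def is_t_coloring(edges, colors, T) -> bool:
    """True if no edge joins two colors whose separation lies in T."""
    return all(abs(colors[u] - colors[v]) not in T for u, v in edges)


def greedy_t_coloring(n, edges, T):
    """Color vertices 0..n-1 in order with the lowest acceptable positive integer."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    colors = []
    for v in range(n):
        c = 1
        # Only neighbors that are already colored (index < v) constrain v.
        while any(abs(c - colors[u]) in T for u in neighbors[v] if u < v):
            c += 1
        colors.append(c)
    return colors


def span(colors) -> int:
    """Separation between the largest and smallest colors used."""
    return max(colors) - min(colors)


def min_span(n, edges, T, max_span=20):
    """Smallest span of any T-coloring, by exhaustive search (tiny graphs only)."""
    for s in range(max_span + 1):
        for colors in product(range(1, s + 2), repeat=n):
            if is_t_coloring(edges, colors, T):
                return s
    return None


K3 = [(0, 1), (0, 2), (1, 2)]
T = {0, 1, 4, 5}
greedy = greedy_t_coloring(3, K3, T)
print(greedy, span(greedy))                  # greedy coloring and its span
print(is_t_coloring(K3, [1, 4, 7], T), span([1, 4, 7]))
print(min_span(3, K3, T))
```

For $T = \{0, 1\}$ on the same graph, `min_span` agrees with the formula $(r + 1)[\chi(G) - 1]$ with $r = 1$ and $\chi(K_3) = 3$.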
2.2.2 L(2,1)-Coloring
Another complication in channel assignment is that transmitters might interfere at different levels. Then we might have a different distance separation requirement for each level of interference. One formalization of this idea is to have two kinds of edges, e.g., a red edge if two transmitters are within 50 kilometers and a blue edge if they are within 100 kilometers. Then we would also have different sets T for the two colored edges. If {x, y} is a red edge, we would require \f(x) — f(y)\ $. T\ , and if {x, y} is a blue edge, we would require \f(x) — f(y)\ $ T 2 . Another formalization of this is the following: Given a graph G = (V, E), find a coloring of the vertices with positive integers so that if x and y are joined by an edge, they get colors that are not in the set T\ = {0,1}, and if they have distance 2 in G, then they are not in the set T = {0}. Such an assignment of a coloring is called an L (2,1)- coloring of G. L(2, l)-colorings were introduced by Yeh 169 and Griggs and Yeh 56 . They studied the smallest k so that graph G has an L(2, l)-coloring using only integers in {1, 2, ...,k +1}. This number is denoted A(G). Griggs and Yeh showed that, in general, the problem of determining A(G) is NP-complete. On the other hand, they showed that for trees T, A(T) = A(T) + 1 or A(T) + 2, where A(T) is the maximum degree of a vertex. Chang and Kuo 18 gave a polynomial algorithm for determining A(T) for a tree T. For arbitrary graphs, Chang and Kuo showed that A(G) < A 2 (G) + A(G). Tighter bounds on A in terms of A for special families of graphs are established in 18,46,47,56,69,133,167,169 Griggs and Yeh conjectured that A(G) < A 2 (G). This conjecture remains open. More recent work on L(2, l)-colorings is found in the papers 3 6 ' 3 7 and these papers provide a variety of references to the literature. 
The papers 36,37 leave open in general the question of characterizing graphs G such that λ(H) < λ(G) for all proper subgraphs H of G, and the problem of characterizing graphs G for which there is an L(2,1)-coloring using integers in {1, 2, ..., λ(G) + 1} such that each of these integers is used on some vertex.
2.2.3
Set Colorings; n-Tuple Colorings
In all of the problems we have mentioned, it makes sense to speak of assigning a set of colors to a vertex rather than a single color. If S(x) is the set of colors assigned to vertex x, then it is natural to require that
{x, y} ∈ E ⇒ S(x) ∩ S(y) = ∅.
Such an assignment is called a set coloring. Taking |S(x)| = 1 for all x gives an ordinary graph coloring. If each set S(x) has exactly n colors, we call this an n-tuple coloring. The smallest number of colors required for an n-tuple coloring of G is called the n-tuple chromatic number of G and is denoted by χ_n(G). Consider for example the 4-cycle C_4. If we use the color sets {red, blue} and {green, yellow} alternating around the cycle, then we get a 2-tuple coloring. This shows that χ_2(C_4) ≤ 4. Of course, it equals 4, since two adjacent vertices already need four distinct colors between them. n-tuple colorings were introduced by Gilbert 48 in connection with the mobile radio frequency assignment problem.

Here is a useful early result about n-tuple colorings. Given graphs G and H, the lexicographic product G[H] is defined as follows: V(G[H]) = V(G) × V(H) and (a, b) is joined to (c, d) by an edge if and only if {a, c} ∈ E(G), or a = c and {b, d} ∈ E(H). Stahl 142 proved that χ_n(G) = χ(G[K_n]). This result is often more useful theoretically than practically: just try to compute χ_2(C_4) this way! Theoretically, for example, the result is used to prove that if G is a weakly γ-perfect graph, then χ_n(G) = nχ(G). (G is weakly γ-perfect if its chromatic number is equal to the size of its largest clique, i.e., the largest complete subgraph.) This holds for G = C_p for p even. For example, we know that χ_2(C_4) = 4 = 2χ(C_4).

To show how difficult n-tuple coloring can be, we consider the Kneser graph G(m, p): The vertices are the p-element subsets of an m-element set. There is an edge between two subsets if and only if they are disjoint. A sticky open question in the theory of n-tuple colorings is to find χ_n(G(m, p)). This is important because ordinary colorings are "homomorphisms" into complete graphs while n-tuple colorings are homomorphisms into Kneser graphs. Lovász 93 calculated χ(G(m, p)) = χ_1(G(m, p)) in 1978 in the process of settling an important open problem in graph theory by proving Kneser's conjecture.
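Stahl's identity χ_n(G) = χ(G[K_n]) can be tried out on the 4-cycle with a short sketch. The helper names are illustrative assumptions, and the chromatic number is found by plain backtracking, which is only reasonable for graphs with a handful of vertices.

```python
def lexicographic_product(g_adj, h_adj):
    """Adjacency of G[H]: (a, b) ~ (c, d) iff a ~ c in G, or a == c and b ~ d in H."""
    verts = [(a, b) for a in g_adj for b in h_adj]
    return {
        (a, b): {(c, d) for (c, d) in verts
                 if c in g_adj[a] or (a == c and d in h_adj[b])}
        for (a, b) in verts
    }

def chromatic_number(adj):
    """Smallest k admitting a proper k-coloring, found by backtracking."""
    verts = list(adj)

    def extend(f, i, k):
        if i == len(verts):
            return True
        v = verts[i]
        for c in range(k):
            # Unassigned neighbors return None, which never equals a color.
            if all(f.get(u) != c for u in adj[v]):
                f[v] = c
                if extend(f, i + 1, k):
                    return True
                del f[v]
        return False

    k = 1
    while not extend({}, 0, k):
        k += 1
    return k

def cycle(p):
    return {i: {(i - 1) % p, (i + 1) % p} for i in range(p)}

def complete(n):
    return {i: set(range(n)) - {i} for i in range(n)}

c4 = cycle(4)
chi_2_c4 = chromatic_number(lexicographic_product(c4, complete(2)))
print(chi_2_c4, 2 * chromatic_number(c4))
```

The two printed values agree, matching the statement χ_2(C_4) = 4 = 2χ(C_4) for this even cycle.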
(Kneser's conjecture says: If we split the p-element subsets of a (2p + k)-element set into k + 1 classes, one of the classes will contain two disjoint p-element subsets.) A formula for χ_n(G(m, p)) was conjectured by Stahl 142: If n - 1 = qp + r, q ≥ 0, 0 ≤ r < p, then χ_n(G(m, p)) = (q + 1)m - 2(p - r - 1). This formula is known to hold for a variety of values of n, p, m. See for example 40,44,142. However, its truth or falsity remains open in general.

Many sets other than sets of n integers have been studied in connection with set colorings. Among the most important are sets consisting of real intervals or unions of real intervals. These are discussed briefly below, in Sec. 3.5. See 1 ^ for a variety of concrete cases and references.
2.2.4
List Coloring
In many applications, there is an extra complication. There are some acceptable colors for each vertex and the color assigned to a vertex must be chosen from the set of acceptable colors. For instance, in channel assignment, we specify a set of acceptable channels and in traffic phasing a set of acceptable times. Given a graph G, let L(x) be a non-empty set of integers assigned to vertex x. We call L a list assignment for G. We seek a graph coloring f so that for every vertex x, f(x) is in L(x). Such a coloring is called a list coloring for (G, L). If a list coloring exists, we say that (G, L) is list colorable. List colorings were introduced in the 1970s, independently by Vizing 156 and by Erdős, Rubin, and Taylor 32, and there have been a very large number of papers about this subject in the past decade. Some recent survey articles are 3,88,155.
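As a minimal sketch of these definitions, the following backtracking search looks for a list coloring of (G, L). The function name, the three-vertex path, and both list assignments are hypothetical examples chosen for illustration.

```python
def list_coloring(adj, lists):
    """Return f with f[x] in lists[x] and f[x] != f[y] on every edge, or None."""
    verts = list(adj)
    f = {}

    def extend(i):
        if i == len(verts):
            return True
        v = verts[i]
        for c in sorted(lists[v]):
            # Only already-colored neighbors can block color c.
            if all(f.get(u) != c for u in adj[v]):
                f[v] = c
                if extend(i + 1):
                    return True
                del f[v]
        return False

    return dict(f) if extend(0) else None

# Hypothetical example: the path x - y - z with two different list assignments.
path = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
good_lists = {"x": {1}, "y": {1, 2}, "z": {1, 3}}
bad_lists = {"x": {1}, "y": {1, 2}, "z": {2}}
print(list_coloring(path, good_lists))
print(list_coloring(path, bad_lists))
```

The second assignment shows that (G, L) can fail to be list colorable even when G itself is 2-colorable, because the lists force y to clash with one of its neighbors.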
A great deal of emphasis has been placed on the case where each set L(x) has the same fixed number of elements, k. If (G, L) can be list colored for every possible list assignment L in which all |L(x)| = k, we say that G is k-choosable. To give an example, we note that C_5 is not 2-choosable. Use L(x) = {1, 2} on each vertex. If there is a list coloring for this list assignment, then C_5 has an ordinary coloring using two colors, which is impossible for an odd cycle. On the other hand, it is easy to show that C_4 is 2-choosable. Erdős, Rubin, and Taylor 32 characterized 2-choosable graphs. However, the characterization of 3-choosable graphs remains a wide open problem. There have been major results on choosability in the past ten years. For example, Erdős, Rubin, and Taylor 32 conjectured that every planar graph is 5-choosable but that there are planar graphs that are not 4-choosable. Alon and Tarsi 4 proved that bipartite planar graphs are 3-choosable. Voigt 157 showed that not all planar graphs are 4-choosable. Thomassen 147 proved that every planar graph is 5-choosable. In spite of this progress, many open questions about list coloring and other generalizations of ordinary graph coloring remain. See 67 for many such problems.
3
The Second Concept: Intersection Graph
Let F = {S_1, S_2, ..., S_n} be a family of sets. We can build a graph corresponding to F by taking the vertex set to be F and including an edge between S_i and S_j (i ≠ j) if and only if S_i ∩ S_j ≠ ∅. This graph is called the intersection graph of the family of sets. For instance, consider the sets S_1 = {a, b, c}, S_2 = {b, c, d, e}, S_3 = {d}, S_4 = {e}. Then in the intersection graph, the edges join S_2 to each of the other vertices. It is natural to ask: What graphs arise this way? Formally put: Given a graph G = (V, E), can we assign a set S(x) to each vertex x of V so that for all x ≠ y, {x, y} ∈ E ⇔ S(x) ∩ S(y) ≠ ∅? Marczewski 102 proved that every graph is the intersection graph of some family of sets.
3.1
Interval Graphs
The question of what graphs arise as intersection graphs is much more interesting and leads to really important ideas if we restrict the families of sets to sets of certain kinds. We shall concentrate on one important special case. If F is a family of intervals on the real line, then its intersection graph is called an interval graph. We shall return to the question of what graphs are interval graphs. First, we note that this one very simple concept has a large number of fascinating applications. For general references on interval graphs and their applications, see 34,50,118,120,148.
3.2
Applications of Interval Graphs
A variant of the following problem was one of the motivating problems for the notion of interval graph (Hajós 59). A group of students goes to the library and later a book is missing. To find out who might have taken it, we try to reconstruct when people were there. We assume that each student stays in the library for a certain interval of time, then leaves. For each pair of students x and y, we know whether or not x and y saw each other. Can we construct time intervals so that if x and y saw each other, their time intervals in the library overlap, and if they didn't see each other, then their time intervals in the library do not overlap? We define a graph G by letting the vertices be the individuals and an edge between two individuals mean that they saw each other. Then our question is equivalent to the question: Is G an interval graph? Even if the answer is yes, we need a way to find the intervals. There are good algorithms for doing so, as we shall note below. Also, even finding the intervals is just a first step in solving the mystery. However, it is an important first step.

Interval graphs arose independently from Hajós' problem and from a problem in molecular genetics known as Benzer's Problem 7,8. Seymour Benzer was a pioneering geneticist. In 1959, he was studying the "fine structure" of bacterial genes; it was not known whether or not the collection of DNA composing a bacterial gene was linear. He asked: Is the fine structure of the gene linear or is it circular or does it have a different topology? How do we tell if we can't see it? At that time, there were no sophisticated methods (such as gel electrophoresis) for answering questions like this. Benzer observed that we can determine whether or not two connected substructures inside the gene overlap. We do it by gathering mutation data. He then asked: Is the overlap information consistent with a linear structure? Equivalently, the question can be stated this way: Can we assign an "interval" to each substructure so that two substructures overlap if and only if their intervals do? Or: Is the graph of overlaps among substructures an interval graph?

There is no longer active interest in Benzer's problem, but there is a great deal of modern work in molecular biology that involves interval graphs. For example, interval graphs arise in connection with research on restriction maps, which show the location of certain sites (short specific sequences) on a specific DNA (125,163,165). More generally, interval graphs also play an important role in the whole study of physical mapping in molecular biology. See 57,136,164 for a general introduction to this problem and 1,2,10,68,71,72,138,139,170 for examples of research papers involving interval graphs in physical mapping.

Interval graphs also arise in the study of preference and indifference in economics and psychology.
Suppose we consider some alternative purchases about which we do not have a good estimate of the value, such as antiques. We can associate with each alternative u a range of possible values J(u). We would be comfortable in saying that we prefer u to v if the interval J(u) is strictly to the right of the interval J(v), and otherwise not. Suppose that we are indifferent between u and v if and only if we neither prefer u to v nor v to u. Then we are indifferent between u and v if and only if J(u) and J(v) overlap. Let V be the set of alternatives under consideration and define a graph on V by letting edges correspond to indifference. If observed indifference judgments fit our assumptions, then this graph is an interval graph. For further discussion of interval graphs as models of indifference, see 116,118,122.
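The preference and indifference rules above can be sketched in a few lines. The three antiques and their value ranges are made-up illustrative data, and the intervals are assumed to be closed, so that sharing an endpoint counts as overlap.

```python
def prefers(ju, jv):
    """u is preferred to v when interval ju lies strictly to the right of jv."""
    return ju[0] > jv[1]

def indifference_edges(J):
    """Pairs (u, v), u < v, with neither alternative preferred to the other."""
    names = sorted(J)
    return {(u, v)
            for i, u in enumerate(names) for v in names[i + 1:]
            if not prefers(J[u], J[v]) and not prefers(J[v], J[u])}

# Hypothetical value ranges (low, high) for three antiques.
J = {"vase": (100, 200), "clock": (150, 300), "chair": (250, 400)}
edges = indifference_edges(J)
print(sorted(edges))
```

In this example we are indifferent between the vase and the clock and between the clock and the chair, yet we prefer the chair to the vase, so indifference modeled this way need not be transitive.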
Seriation or sequence dating is an important area in archaeology. Here, we study a variety of types of artifacts (e.g., pottery) dug up in archaeological digs. We would like to place them in chronological order, assuming each type of artifact was in use over an interval of time. Let us build a graph with the types of artifacts as vertices. Suppose that an edge means that they were found in common in some dig. If they were found in common in some dig, then it is reasonable to conclude that the time intervals during which the two artifacts were in use overlapped. Assuming enough digs, we can assume that two types of artifacts appear in common in some dig if and only if their time intervals overlapped. Thus, the graph is an interval graph and we can use the corresponding assignment of intervals to find a preliminary chronological order. For more on seriation in archaeology, see 64,73,74,75,76,77,78,118,120,121,140.
An analogous problem arises in seriation in developmental psychology. In studying the development of children, psychologists have noted that traits such as crawling, sitting up, etc. each develop over a certain interval of time over the course of development of the child. There is a stage when a child crawls, a stage that overlaps when the child is starting to sit up, a stage when the child begins to pull itself up to a standing position, etc. It is hypothesized that the pattern of development is common to all children. In studying this hypothesis, we can build a graph whose vertices are the traits and which has an edge between two traits if they are found in common in some child. Assuming we have studied enough children, we expect that two traits were found in common in some child if and only if they developed in overlapping intervals of time. Thus, the graph is an interval graph and we can use the corresponding assignment of intervals to find a preliminary developmental order. For more on seriation in developmental psychology, see 26,118,120.
3.3
What Graphs are Interval Graphs?
There are various theorems that allow us to tell if a graph is an interval graph. Early results were obtained by Lekkerkerker and Boland 91, Gilmore and Hoffman 49, and Fulkerson and Gross 42, and a polynomial algorithm was developed by Booth and Lueker 12. Here is one sample result. A clique in a graph G is a complete subgraph and we say it is maximal if it is not contained in any larger clique. We define the maximal clique-vertex incidence matrix as follows. The rows correspond to the maximal cliques K_1, K_2, ..., K_p and the columns to the vertices x_1, x_2, ..., x_n, with the i,j entry equal to 1 if K_i contains x_j and equal to 0 otherwise. Consider for example the graph H = (V, E) where V = {a, b, c, d, e, f, g} and E = {{a, b}, {b, c}, {a, c}, {d, e}, {f, g}, {e, f}}. Then the maximal cliques are K_1 = {a, b, c}, K_2 = {d, e}, K_3 = {f, g}, K_4 = {e, f} and the maximal clique-vertex incidence matrix is the following matrix:

              a  b  c  d  e  f  g
  K_1 = abc   1  1  1  0  0  0  0
  K_2 = de    0  0  0  1  1  0  0
  K_3 = fg    0  0  0  0  0  1  1
  K_4 = ef    0  0  0  0  1  1  0

In the 4-cycle C_4 with vertices a, b, c, d in order around the cycle, we have the maximal clique-vertex incidence matrix

        a  b  c  d
  ab    1  1  0  0
  bc    0  1  1  0
  cd    0  0  1  1
  ad    1  0  0  1

Fulkerson and Gross 42 showed that a graph is an interval graph if and only if the rows of the maximal clique-vertex incidence matrix can be reordered so that the 1's in each column appear consecutively. We see that this can be done with the graph H defined above. If we reorder the maximal cliques, we get the following matrix:

              a  b  c  d  e  f  g
  K_1 = abc   1  1  1  0  0  0  0
  K_2 = de    0  0  0  1  1  0  0
  K_4 = ef    0  0  0  0  1  1  0
  K_3 = fg    0  0  0  0  0  1  1

However, there is no way to reorder the rows of the maximal clique-vertex incidence matrix for C_4 in order to accomplish the same goal. That shows that C_4 is not an interval graph, as is easy to demonstrate directly. A matrix whose rows can be permuted so that the 1's in each column appear consecutively is said to have the consecutive 1's property (for columns). This property is very useful in a variety of applications. See 28 for a discussion. An ordering of maximal cliques for which the maximal clique-vertex incidence matrix has 1's appearing consecutively in each column is called a consecutive ordering. We shall see that consecutive orderings are very important in applications.
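The Fulkerson-Gross test can be sketched by brute force for graphs as small as H and C_4. The function names are illustrative assumptions, and trying every permutation of the maximal cliques is exponential; it stands in for, and is not, the efficient Booth and Lueker algorithm mentioned above.

```python
from itertools import combinations, permutations

def maximal_cliques(vertices, edges):
    """Enumerate maximal cliques of a small graph, largest candidates first."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = []
    for size in range(len(vertices), 0, -1):
        for cand in combinations(vertices, size):
            s = set(cand)
            is_clique = all(v in adj[u] for u, v in combinations(cand, 2))
            # Any larger clique containing s lies inside a maximal one found earlier.
            if is_clique and not any(s < c for c in cliques):
                cliques.append(s)
    return cliques

def consecutive_ordering(vertices, cliques):
    """Return a row order with consecutive 1's in every column, or None."""
    for order in permutations(cliques):
        ok = True
        for v in vertices:
            rows = [i for i, c in enumerate(order) if v in c]
            if rows and rows[-1] - rows[0] + 1 != len(rows):
                ok = False
                break
        if ok:
            return list(order)
    return None

def is_interval_graph(vertices, edges):
    cliques = maximal_cliques(vertices, edges)
    return consecutive_ordering(vertices, cliques) is not None

h_vertices = list("abcdefg")
h_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("f", "g"), ("e", "f")]
c4_vertices = list("abcd")
c4_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
print(is_interval_graph(h_vertices, h_edges), is_interval_graph(c4_vertices, c4_edges))
```

The search finds a consecutive ordering for H and none for C_4, in line with the two matrices discussed above.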
3.4
Circular-Arc Graphs
Other families of intersection graphs are also of interest. We say that G is a circular-arc graph if it is the intersection graph of a family of arcs on a circle. While C_4 is not an interval graph, it is easy to see that it is a circular-arc graph. It seems natural to conjecture the following: A graph is a circular-arc graph if and only if the rows of the maximal clique-vertex incidence matrix can be reordered so that the 1's in each column appear consecutively in a circular sense (i.e., continuing from bottom to top). However, this is not the case. The graph Γ consisting of a triangle with a pendant edge added to each vertex is a circular-arc graph. However, there is no way to reorder rows of the maximal clique-vertex incidence matrix so that the 1's in each column appear consecutively in a circular sense. (Γ has vertices a, b, c, x, y, z with a, b, c forming a triangle and edges from x to a, y to b, and z to c.)

Technically speaking, one needs to add a condition that the family of circular arcs has the Helly property: Any subfamily of arcs that pairwise overlap has a common point. Then, we can conclude that a graph is the intersection graph of a family of circular arcs with the Helly property if and only if we can reorder the rows of the maximal clique-vertex incidence matrix so that the 1's in each column appear consecutively in a circular sense. Any family of circular arcs whose intersection graph is the graph Γ defined above would have circular arcs corresponding to a, b, c pairwise overlapping, but no one point belonging to all three such arcs. Thus, Γ is not the intersection graph of a family of circular arcs with the Helly property.

The problem of recognizing or characterizing circular-arc graphs is difficult. There is a long literature on this problem. Finally, good algorithms were described for testing whether or not a graph is a circular-arc graph. However, structural understanding of these graphs remains difficult to obtain.
For work on the circular-arc graph recognition problem, see for example 33,87,141,149,150,152,153,154
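The circular version of the consecutive 1's condition can be checked by brute force on the two small examples in this section. The function names are illustrative assumptions, and the maximal cliques of Γ (the triangle with three pendant edges) and of C_4 are entered by hand.

```python
from itertools import permutations

def circularly_consecutive(order, v):
    """True if the rows containing v form at most one block around the cycle."""
    rows = [v in clique for clique in order]
    # rows[i - 1] wraps around to the last row when i == 0.
    starts = sum(1 for i in range(len(rows)) if rows[i] and not rows[i - 1])
    return starts <= 1

def has_circular_ordering(vertices, cliques):
    """Try every row order for circularly consecutive 1's in all columns."""
    return any(all(circularly_consecutive(order, v) for v in vertices)
               for order in permutations(cliques))

gamma_cliques = [{"a", "b", "c"}, {"a", "x"}, {"b", "y"}, {"c", "z"}]
c4_cliques = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "d"}]
print(has_circular_ordering("abcxyz", gamma_cliques))
print(has_circular_ordering("abcd", c4_cliques))
```

For Γ the row abc would have to be circularly adjacent to three different rows, which no ordering of four rows allows, while the natural ordering of the C_4 cliques already works.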
3.5
Connection to Traffic Phasing
The notions of circular-arc graph and interval graph have applications in traffic phasing. We have different streams of traffic approaching an intersection. We would like to put in a traffic light. The light will give a green signal to each traffic stream over an interval of time. (Similar problems arise in scheduling other facilities such as computers, rooms, etc.) We can think of a large circular clock and the traffic streams each receiving an arc of time along that clock for their green time. Some traffic streams are compatible and some are not. We build an incompatibility graph by taking the vertices to be the traffic streams and putting an edge between two traffic streams if they are incompatible. A green light assignment is called feasible if incompatible traffic streams get non-overlapping circular arcs. If (V, E) is the incompatibility graph and J(x) is the circular arc assigned to traffic stream x in a feasible green light assignment, then
{x, y} ∈ E ⇒ J(x) ∩ J(y) = ∅.
If we think of each J(x) as a set of colors or times assigned to a vertex x, we have a set coloring problem, more specifically, what is sometimes called an interval coloring problem. It is useful to consider the compatibility graph instead, the graph (V, E') whose edges join compatible traffic streams. For this graph, we have:
{x, y} ∈ E' ⇐ J(x) ∩ J(y) ≠ ∅.
Note that we do not necessarily have ⇔. The set of all pairs {x, y} with J(x) ∩ J(y) ≠ ∅ is a subset of E'. Thus, a feasible assignment defines a so-called spanning subgraph H of the compatibility graph (spanning meaning that the subgraph has the same vertex set and a subset of the edges). Moreover, since the J(x)'s are circular arcs, H is a circular-arc graph. Thus, feasible traffic light assignments correspond to spanning subgraphs of the compatibility graph that are circular-arc graphs. Unfortunately, circular-arc graphs are not easy to identify. Observe that if the last green light ends before the first starts, then H is an interval graph. Real traffic lights have this property. Thus, feasible traffic light assignments correspond to spanning subgraphs of the compatibility graph that are interval graphs. Fortunately, interval graphs are easy to identify.

Consider for example a traffic intersection with eastbound traffic stream a, westbound traffic stream b, northbound traffic stream c, southbound traffic stream d, left-turning traffic stream e going from south to west, and left-turning traffic stream f going from north to east. (This intersection is studied in 118.) A compatibility graph is given by taking V = {a, b, c, d, e, f} and E' = {{a, b}, {c, d}, {d, f}, {e, f}, {c, e}}.
Note that this is not a circular-arc graph or an interval graph. However, omitting edge {c, e} gives us an interval graph spanning subgraph H and this corresponds to a feasible green light assignment, which can be found by an assignment of intervals (or circular arcs) whose intersection graph is this spanning subgraph. One such assignment is the following: J(a) = (0, 1), J(b) = (0, 1), J(c) = (1, 3), J(d) = (1, 6), J(e) = (6, 10), J(f) = (3, 10).

Of all feasible green light assignments, which ones are the best or most efficient? We can make each green light interval very short, thus almost surely obtaining a feasible assignment. However, we usually specify that each green light interval have a certain minimum length. If the traffic light is at an isolated intersection with no other traffic lights nearby, then it is reasonable to try to find an assignment that minimizes the sum total of waiting times. If the intersection is not isolated, vehicles can be expected to arrive at the intersection at given times. Then, we try to minimize the (weighted) time lags between ideal and realized starting times. We concentrate on the isolated traffic intersection. A good procedure was developed by traffic scientist Karl Stoffers 144. See 118 for an exposition of this procedure. Stoffers' idea was:
• 1. Find all spanning subgraphs of the compatibility graph that are interval graphs.
• 2. For each such spanning subgraph, find the most efficient feasible green light assignment.
• 3. Find the overall most efficient assignment by comparing the solutions in step 2.
How do we find the solution in step 2? We use the Fulkerson-Gross characterization of interval graphs given in Sec. 3.3.
• 2a. Find a consecutive ordering of maximal cliques, i.e., an ordering K_1, K_2, ..., K_p so that the maximal clique-vertex incidence matrix has consecutive 1's. Each clique corresponds to a "phase" during which all of the traffic in that clique gets a green light.
• 2b. For each vertex u, let it get a green light in the first K_i that contains it and continue until the last K_i that contains it. Since the matrix has consecutive 1's, each vertex is in a consecutive set of maximal cliques and so we end up with an interval of green light time.
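Steps 2a and 2b can be sketched as follows for the spanning subgraph H obtained by omitting edge {c, e}. The function names are illustrative assumptions, the maximal cliques of H are entered by hand, and the brute-force search over permutations stands in for a proper consecutive-ordering algorithm.

```python
from itertools import permutations

def consecutive_ordering(cliques):
    """Step 2a by brute force: order cliques so each vertex fills consecutive rows."""
    vertices = set().union(*cliques)
    for order in permutations(cliques):
        rows = {v: [i for i, c in enumerate(order) if v in c] for v in vertices}
        if all(r[-1] - r[0] + 1 == len(r) for r in rows.values()):
            return list(order)
    return None

def green_phases(order):
    """Step 2b: each stream is green from its first to its last phase (1-based)."""
    phases = {}
    for i, clique in enumerate(order, start=1):
        for v in clique:
            first = phases.get(v, (i, i))[0]
            phases[v] = (first, i)
    return phases

def overlap(p, q):
    return p[0] <= q[1] and q[0] <= p[1]

# Edges of the compatibility graph and maximal cliques of H (edge {c, e} omitted).
compatible = [{"a", "b"}, {"c", "d"}, {"d", "f"}, {"e", "f"}, {"c", "e"}]
cliques_h = [{"a", "b"}, {"c", "d"}, {"d", "f"}, {"e", "f"}]

order = consecutive_ordering(cliques_h)
phases = green_phases(order)
# Feasibility: streams that are green in a common phase must be compatible.
feasible = all({u, v} in compatible
               for u in phases for v in phases
               if u < v and overlap(phases[u], phases[v]))
print(phases, feasible)
```

The phase ranges say only which phases a stream is green in; the durations of the phases are a separate question. Applying the same search to all five edges of the compatibility graph returns no ordering, which is consistent with that graph not being an interval graph.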
In the specific example given above, the maximal clique-vertex incidence matrix with rows permuted to give consecutive 1's is the following:

             a  b  c  d  e  f
  K_1 = ab   1  1  0  0  0  0
  K_2 = cd   0  0  1  1  0  0
  K_3 = df   0  0  0  1  0  1
  K_4 = ef   0  0  0  0  1  1

K_1, K_2, K_3, K_4 is a consecutive ordering and corresponds to one phasing. In this phasing, we start with a green light for streams a and b, the eastbound and westbound traffic. We then have a green light for the northbound and southbound traffic. In the third phase, we turn off the green light for the northbound traffic and turn on a green light for left-turning traffic from north to east. Finally, in the fourth phase, we have both left turn arrows on. Another consecutive ordering is K_1, K_4, K_3, K_2. Here, we start with the east-west traffic, then turn on both left-turn arrows, then replace the left-turning traffic from south to west with southbound traffic, and finish with north-south traffic. A different interval graph spanning subgraph would give some entirely different phasings. The reader might wish to experiment to see what phasings can be found if instead of omitting edge {c, e} from the compatibility graph, we omit instead edge {c, d}. The rest of the solution requires us to find the lengths of the intervals (circular arcs).
• 2c. Assign maximal clique K_i a duration d_i.
• 2d. How do we find the values d_i that minimize the total waiting time? This is a linear programming problem. It was formulated in some generality by Opsut and Roberts 107,108.
Note that the method is not at all efficient in a technical sense. Even the problem of finding all maximal cliques of a graph is NP-hard. However, the graphs to which the method is applied are relatively small, so the method can be applied in practice in a reasonably efficient way. For more general discussion of the above procedure, see 106,107,108,123.
3.6
Unit Interval Graphs
The special case of interval graphs where every interval has the same unit length is an important case. Such graphs, called unit interval graphs, were characterized in 115,116 and have found numerous applications in philosophy of science, preference and utility measurement, molecular biology, etc. See 50,120,122 for examples of some of these applications.
3.7
Unions of Two Intervals
A variant of the traffic phasing problem is to allow green light assignments where each "green light set" is a union of two intervals. This problem is difficult in part because it is even an open problem to characterize the double interval graphs, intersection graphs of families of sets each of which is a union of at most two intervals. For some results about double interval graphs, see 55,61,111,113
3.8
Intersection Graphs of "Boxes" in Euclidean Space
Suppose we have boxes in Euclidean k-space, generalized rectangles with sides parallel to the coordinate axes. In 2-space, these are rectangles. What graphs arise as the intersection graphs of boxes in k-space? This is in general an unsolved problem, even for k = 2, and is known to be NP-complete for k > 2 27,168. For k = 1, boxes reduce to intervals, and the intersection graphs of boxes are exactly the interval graphs. It is easy to see that every graph is the intersection graph of boxes in k-space for some sufficiently large k. The smallest such k is called the boxicity of the graph. This concept was introduced in 115,117. It is still fundamentally an open problem to understand the boxicity of large classes of graphs, even though this concept goes back to 1967. Intersection graphs of boxes have a variety of important applications. One of the major applications is in ecology and we turn to this as we discuss the third basic concept of emphasis in this paper, that of competition graph.
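A minimal sketch of the intersection graph of boxes follows. Boxes are assumed to be closed and given as one (low, high) interval per coordinate; the function names and the four rectangles are illustrative choices that realize the 4-cycle in the plane.

```python
from itertools import combinations

def boxes_intersect(b1, b2):
    """Axis-parallel closed boxes meet iff their intervals overlap in every coordinate."""
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(b1, b2))

def intersection_graph(boxes):
    """Edge set of the intersection graph of a dict name -> box."""
    return {frozenset((u, v))
            for u, v in combinations(sorted(boxes), 2)
            if boxes_intersect(boxes[u], boxes[v])}

# Hypothetical rectangles: two horizontal bars (a, c) and two vertical bars (b, d).
rectangles = {
    "a": [(0, 3), (0, 1)],
    "b": [(0, 1), (0, 3)],
    "c": [(0, 3), (2, 3)],
    "d": [(2, 3), (0, 3)],
}
edges = intersection_graph(rectangles)
print(sorted(tuple(sorted(e)) for e in edges))
```

The resulting graph is C_4 with vertices a, b, c, d in cyclic order. Since C_4 is not an interval graph, this representation in the plane shows that its boxicity is exactly 2.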
4
Third Concept: Competition Graph
Suppose D = (V, A) is a digraph. The competition graph C(D) has vertex set V and an edge between x and y if there is u in V so that (x, u) and (y, u) are arcs of A. This notion was introduced by Joel Cohen in 1968 19. We say that G is a competition graph if it is C(D) for some D. In some cases, we require that D be acyclic, but we shall not make that assumption here except in certain cases. Since Cohen's work, a large literature on the subject of competition graphs has arisen. This literature is surveyed in 80,95,125,128.
4.1
Applications of Competition Graphs
Competition graphs arose in the work of Cohen 19 in connection with an application in ecology. Here, D = (V, A) represents a food web for an ecosystem. V is the set of species in the ecosystem and (x, y) ∈ A if x preys on y. Then {x, y} is an edge of C(D) iff x and y compete for a common prey u. Consider for example the following food web ℱ (part of a food web in 16). The species are Bear (B), Deer (D), Fox (F), Grass (G), Hare (H), Owl (O), Raccoon (R), Shrew (S), and Wildcat (W) and the predator-prey relations are given by the ordered pairs (B,D), (B,G), (B,S), (O,G), (D,G), (F,O), (F,H), (F,S), (H,G), (R,O), (W,O), (W,S). The Foxes and Wildcats compete because they both eat owls, and so there is an edge {F,W} in the competition graph. The ecological application of competition graphs was a primary motivation for the paper 20, the book 21 and a large number of papers on the topic.

A second application of competition graphs is in communication over a noisy channel. Here, V is the set of letters in an alphabet and (x, y) ∈ A if when x is sent, y could be received. Then {x, y} is an edge of C(D) iff x and y could be received as the same letter. C(D) is the confusion graph. This graph arises in Shannon's theory of noisy channels and in his definition of the capacity of such a channel (see for example 58,94,104,120,137). These concepts give rise to challenging open questions in graph theory. For example, it took a long time to calculate the so-called Shannon capacity of the simple confusion graph C_5 (see 94), but we still don't know the capacity of all cycles.

Competition graphs also arise in channel assignment. Here, V is a set of transmitting/receiving stations and (x, y) ∈ A if a signal sent at x can be received at y. Then {x, y} is an edge of C(D) iff messages sent by x and y can be received at the same place. C(D) is the conflict graph. The channel assignment problem we have discussed before in Sec. 2.1 is concerned with coloring this graph. The channel assignment application has given rise to a lot of work on competition graphs of undirected graphs. See 98,99,101,114.

A fourth application of competition graphs is in modeling of complex systems. In large-scale computer models, for instance those dealing with energy or economic systems, we often deal with a matrix M representing the constraints of an LP or a similar problem. Here, we can take V to be the set of rows of M and (x, y) ∈ A if M(x, y) ≠ 0. Then {x, y} is an edge of C(D) iff rows x and y have a nonzero entry in the same column. They "link" a common variable. C(D) is known as the row graph of M. The row graph has many uses, including applications to the structure of linear programs. The properties of row graphs are studied in 51,52,53,54 and applications to the structure of linear programs are discussed in 51,109,110.
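All four applications use the same construction, which can be sketched directly from the definition of C(D). The function name is an illustrative assumption; the arcs are those of the food web listed above, written as (predator, prey) pairs.

```python
from itertools import combinations

def competition_graph(arcs):
    """Edges {x, y} such that (x, u) and (y, u) are both arcs for some u."""
    predators_of = {}
    for x, u in arcs:
        predators_of.setdefault(u, set()).add(x)
    edges = set()
    for predators in predators_of.values():
        # Every pair of predators sharing this prey competes.
        edges.update(frozenset(pair) for pair in combinations(sorted(predators), 2))
    return edges

food_web = [("B", "D"), ("B", "G"), ("B", "S"), ("O", "G"), ("D", "G"),
            ("F", "O"), ("F", "H"), ("F", "S"), ("H", "G"), ("R", "O"),
            ("W", "O"), ("W", "S")]
edges = competition_graph(food_web)
species = {v for arc in food_web for v in arc}
isolated = {v for v in species if not any(v in e for e in edges)}
print(sorted("".join(sorted(e)) for e in edges), sorted(isolated))
```

For a confusion graph, conflict graph, or row graph the same function applies once the arcs (x, y) are read as "y could be received when x is sent", "a signal from x reaches y", or "row x has a nonzero entry in column y".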
The problem of phylogenetic tree reconstruction is the following: Given a set S of species and some information about the similarities between pairs of species, reconstruct an evolutionary (phylogenetic) tree, an oriented tree whose vertices are species and which has an arc from vertex u to vertex v if v is a direct ancestor of u, and do so in such a way that two species in S are closer together in the tree iff they are more similar. Usually, we want species in S to be leaves (vertices of indegree 0), but suppose we don't make this assumption. Suppose we also consider the very special case where all similarities are 0 or 1 and build a graph G on the vertex set S by taking an edge between x and y if their similarity is 0. Roberts and Sheng 130 showed that under a particular definition of distance, the problem of reconstructing the phylogenetic tree reduces to the problem of finding an oriented tree T which, after addition of loops at each vertex, has a competition graph that contains G as an induced subgraph. Related papers on competition graphs and phylogenetic tree reconstruction are 129,131. See 128 for a survey paper on this topic.
4.2
What Graphs are Competition Graphs?
A widely studied problem is to identify what graphs are competition graphs. In particular, there has been a great deal of work on the special case of competition graphs of acyclic digraphs. Acyclicity is a natural assumption in the ecological application that motivated the subject. Note that every acyclic digraph has a vertex with no outgoing arcs and therefore every competition graph of such a digraph has an isolated vertex. This is the case in the food web ℱ defined in Sec. 4.1. Vertices G and S are isolated in the competition graph.

We start by discussing competition graphs of acyclic digraphs. It was observed in 119 that given any graph G, G together with sufficiently many isolated vertices is a competition graph of some acyclic digraph. (The proof is straightforward: Add a new "prey" vertex p(x, y) for each edge {x, y} of G and let x and y prey on p(x, y).) We use the notation G ∪ I_r for G together with r isolated vertices. The smallest number r so that G ∪ I_r is the competition graph of an acyclic digraph is called the competition number of G and is denoted by k(G). This concept was introduced in 119, where it was observed that characterization of competition graphs of acyclic digraphs is equivalent to the calculation of competition numbers. Opsut 105 proved that computation of k(G) is an NP-complete problem.

While the problem of characterizing competition graphs is NP-complete, there are some interesting graph-theoretical approaches to it. Competition graphs of acyclic digraphs were characterized by Dutton and Brigham and Lundgren and Maybee 31,96 in terms of edge coverings by cliques. This idea was generalized to characterizations of graphs of competition number at most m by Lundgren and Maybee 96 (corrected by Kim 79), of competition graphs of arbitrary digraphs with loops allowed by Dutton and Brigham 31, and of competition graphs of arbitrary digraphs without loops by Roberts and Steif 132.

An interesting special situation is to characterize competition graphs of special types of digraphs. For instance, motivated by the channel assignment problem, Raychaudhuri and Roberts 114 characterized competition graphs of symmetric digraphs that are unit interval graphs. Other work on competition graphs of symmetric digraphs is found in the papers 98,99 and elsewhere; see the survey 128. Competition graphs of strongly connected and of hamiltonian digraphs are studied in 41,100 and of tournaments in 38,39.

There has been a great deal of research on competition number. We mention here two long-term open problems. Opsut 105 calculated the competition number of a line graph and conjectured that a similar result held for all graphs with the property that two cliques could cover all of the vertices of the open neighborhood of any vertex. In spite of almost 20 years of work, Opsut's conjecture remains open. See 83,158,159,161 for partial results. The second open problem concerns an elimination procedure for computing the competition number. Such a procedure was introduced by Roberts 119, who conjectured that it always led to the competition number. Opsut 105 showed that this was false. A modified elimination procedure due to Kim and Roberts 85 is known to calculate the competition number for a large class of graphs, and it remains open to determine whether or not it always works.
4.3
4.3 Ecological Niches
The competition between species is a major motivating application for the study of competition graphs. In an ecosystem, each species has a "normal, healthy" environment. This environment can be characterized as follows: There is a range (interval) of acceptable values of certain parameters, for instance temperature, humidity, acidity, etc. Suppose there are k such parameters. The set of all points in Euclidean k-space so that on each parameter, the point is within the given range, is called the species' ecological niche. The ecological niche is a "box" in k-space with sides parallel to the coordinate axes. It is an old ecological principle that two species compete if and only if their ecological niches overlap, i.e., iff their boxes overlap.
Joel Cohen [19] suggested we try to find the smallest number of parameters k that are needed to account for observed competitions. His approach was to define competition in a different way, specifically by using a food web and letting its competition graph define competition. The question is: How many parameters are needed so that each species corresponds to an ecological niche in k-space and so that there is an edge between x and y in the competition graph if and only if their ecological niches overlap? In other words, what is the smallest k so that G is the intersection graph of boxes in k-space?
In the case of the food web ℱ defined in Sec. 4.1, there are five maximal cliques: K_1 = {B, D, H, O}, K_2 = {B, F, W}, K_3 = {F, R, W}, K_4 = {G}, K_5 = {S}. It is easy to see that K_1, K_2, K_3, K_4, K_5 is a consecutive ordering of maximal cliques and hence that the competition graph is an interval graph. So, one dimension suffices. What is surprising is that, in practice, one dimension usually suffices. What does that mean? It means that the competition graph of most real world food webs is an interval graph! This was first observed in 1968 by Joel Cohen [19]. For more than thirty years people have been trying to explain why, without complete success. The approaches to this problem have been both mathematical and ecological. This has led to a large scientific literature. We discuss this problem in the next subsection.
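To make the consecutive ordering of the five maximal cliques concrete, one can write out the vertex-versus-clique incidence matrix for the cliques listed above. The layout and the symbol M are ours, chosen only for illustration. In the stated column order the 1's in every row are consecutive, which is exactly what a consecutive ordering of maximal cliques requires.

```latex
% Rows: species (vertices of the competition graph).
% Columns: the maximal cliques in the order K_1, ..., K_5.
\[
M \;=\;
\begin{array}{c|ccccc}
    & K_1 & K_2 & K_3 & K_4 & K_5 \\ \hline
  B & 1 & 1 & 0 & 0 & 0 \\
  D & 1 & 0 & 0 & 0 & 0 \\
  H & 1 & 0 & 0 & 0 & 0 \\
  O & 1 & 0 & 0 & 0 & 0 \\
  F & 0 & 1 & 1 & 0 & 0 \\
  W & 0 & 1 & 1 & 0 & 0 \\
  R & 0 & 0 & 1 & 0 & 0 \\
  G & 0 & 0 & 0 & 1 & 0 \\
  S & 0 & 0 & 0 & 0 & 1
\end{array}
\]
% One-dimensional niches read off from the rows (first to last clique index):
\[
B \mapsto [1,2], \quad F, W \mapsto [2,3], \quad D, H, O \mapsto [1,1], \quad
R \mapsto [3,3], \quad G \mapsto [4,4], \quad S \mapsto [5,5].
\]
```

Two of these intervals overlap exactly when the corresponding species lie in a common clique, so the intervals serve as one-dimensional boxes for this example.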
4.4 What Food Webs Have Competition Graphs that are Interval Graphs?
A fundamental open problem in applied graph theory is to characterize the acyclic digraphs (food webs) whose competition graphs are interval graphs. There are many approaches to this problem. We give some sample results.
From a purely computational point of view, the fundamental open problem is easy to solve: Given a digraph, compute its competition graph (easy to do) and use one of the standard linear-time algorithms for checking if this is an interval graph. (A fundamental algorithm is the Booth-Lueker algorithm that uses PQ-trees. See [12].) However, we seek a solution that highlights the structure and properties of the food web that lead to an interval graph competition graph.
Cohen [21] took a statistical approach to this problem. He generated food webs randomly and determined whether or not their competition graphs were interval graphs. By varying the distributions from which the food webs were generated according to some assumptions corresponding to structure, he tried to find models which gave rise to interval graph competition graphs with high probability. (Other statistical/probabilistic approaches to the structure of competition graphs can be found in the papers [15, 22, 23, 24, 25, 103].)
Lundgren and Maybee [97] found a characterization of acyclic digraphs whose competition graph is an interval graph that makes use of the Fulkerson-Gross characterization of interval graphs discussed in Sec. 3.3 and essentially reduces the problem to determining directly if a competition graph is an interval graph. It only begins to shed light on the structural properties of a digraph required for it to have a competition graph that is an interval graph.
Steif [143] approached the problem from just such a structural point of view. However, he obtained a negative result: There is no forbidden subdigraph characterization of acyclic digraphs with interval graph competition graphs.
Sugihara [145] showed statistically that the frequency with which food webs have interval graph competition graphs could be accounted for by requiring that the competition graph be triangulated, i.e., have no cycles of length greater than 3 as generated subgraphs.
Hefner, Jones, Kim, Lundgren, and Roberts [63] approached the problem by studying digraphs with limited indegrees and outdegrees. They characterized the acyclic digraphs with indegree and outdegree at most 2 at each vertex and which have interval graph competition graphs. This is the most elementary case, but there is evidence that real world food webs tend to have very low indegrees and outdegrees (see [22]), so small cases do give useful insights. The problem remains open for higher bounds on indegree and outdegree. Hefner, et al. also studied the special case where every vertex has the same indegree and every vertex has the same outdegree. This turns out to be closely related to combinatorial designs, in particular block designs. Hefner, et al. showed, for instance, that under certain circumstances, an acyclic digraph where every vertex has indegree 0 or k and outdegree 0 or r has an interval graph competition graph if there is a (b, v, r, k, λ)-design.
A digraph D is an interval digraph if two real intervals J(x) and K(x) can be assigned to each vertex x so that (x, y) is an arc of D iff J(x) ∩ K(y) ≠ ∅. (Interval digraphs are not necessarily acyclic.) Langley, Lundgren, and Merz [89] showed that interval digraphs have interval graph competition graphs and that every interval graph is the competition graph of some interval digraph.
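The purely computational route mentioned in this subsection (compute the competition graph, then test whether it is an interval graph) can be sketched as follows. This is not the Booth-Lueker PQ-tree algorithm. As a stand-in we assume a brute-force search for a consecutive ordering of the maximal cliques, which takes exponential time and is sensible only for very small food webs. All function names are ours, and an arc (x, u) is read as "x preys on u".

```python
from itertools import combinations, permutations


def competition_graph(vertices, arcs):
    """Edges {x, y} such that x and y have a common prey."""
    prey = {v: set() for v in vertices}
    for x, u in arcs:
        prey[x].add(u)
    return {frozenset((x, y))
            for x, y in combinations(vertices, 2)
            if prey[x] & prey[y]}


def maximal_cliques(vertices, edges):
    """Bron-Kerbosch enumeration without pivoting (fine for tiny graphs)."""
    nbrs = {v: set() for v in vertices}
    for e in edges:
        x, y = tuple(e)
        nbrs[x].add(y)
        nbrs[y].add(x)
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & nbrs[v], excluded & nbrs[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(vertices), set())
    return found


def _consecutive(positions):
    """True if the sorted index list has no gaps."""
    return positions[-1] - positions[0] + 1 == len(positions)


def is_interval_graph_bruteforce(vertices, edges):
    """Search all orderings of the maximal cliques for one in which the
    cliques containing each vertex appear consecutively."""
    cliques = maximal_cliques(vertices, edges)
    for order in permutations(cliques):
        if all(_consecutive([i for i, c in enumerate(order) if v in c])
               for v in vertices):
            return True
    return False


# Two small acyclic food webs on the same vertex set (made-up examples).
vertices = ["a", "b", "c", "d", "p1", "p2", "p3", "p4"]
cycle_web = [("a", "p1"), ("b", "p1"), ("b", "p2"), ("c", "p2"),
             ("c", "p3"), ("d", "p3"), ("d", "p4"), ("a", "p4")]
path_web = cycle_web[:6]  # drop the prey shared by d and a

path_cg = competition_graph(vertices, path_web)    # path a-b-c-d
cycle_cg = competition_graph(vertices, cycle_web)  # 4-cycle a-b-c-d-a

print(is_interval_graph_bruteforce(vertices, path_cg))   # True
print(is_interval_graph_bruteforce(vertices, cycle_cg))  # False
```

The second web fails because its competition graph contains a cycle of length 4 as a generated subgraph, so it is not even triangulated.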
4.5 Variants of Competition Graphs
As with the notions of graph coloring and intersection graph, there are many interesting variations of the notion of competition graph. For instance, if we take an edge between x and y iff there is a vertex u with arcs (u, x), (u, y) in D, then we have the common enemy graph of D; if there is an edge between x and y iff there are u and v with arcs (x, u), (y, u) and (v, x), (v, y) in D, then we have the competition-common enemy graph of D; and if there is an edge between x and y iff there is u with arcs (x, u), (y, u) in D or there is v with arcs (v, x), (v, y) in D, then we have the niche graph of D. In these graphs, in the ecological interpretation, we have an edge between two species iff they have a common enemy, or both a common prey and a common enemy, or either a common prey or a common enemy. Common enemy graphs are studied in [90, 145, 160], competition-common enemy graphs in [43, 70, 86, 134, 135], and niche graphs in [5, 13, 14, 17, 35, 45], to give some references. Still another variant arises if we take an edge between x and y iff there are u_1, u_2, ..., u_p with arcs (x, u_1), (y, u_1), (x, u_2), (y, u_2), ..., (x, u_p), (y, u_p) in D. In this case, we speak of the p-competition graph of D. See [65, 66, 81, 82]. For each of these types of graphs, there are characterization problems, notions analogous to competition number, and many open questions.
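The variants defined in this subsection differ only in which neighborhoods of x and y must intersect, so they can all be computed in one pass. The following Python sketch assumes that an arc (x, u) means x preys on u and that the prey u_1, ..., u_p in the p-competition graph are distinct. The function name variant_graphs and the small digraph are ours, chosen for illustration.

```python
from itertools import combinations


def variant_graphs(vertices, arcs, p=2):
    """Return the edge sets of the competition graph and its variants."""
    prey = {v: set() for v in vertices}     # out-neighbors
    enemies = {v: set() for v in vertices}  # in-neighbors
    for x, u in arcs:
        prey[x].add(u)
        enemies[u].add(x)

    names = ("competition", "common_enemy", "competition_common_enemy",
             "niche", "p_competition")
    graphs = {name: set() for name in names}
    for x, y in combinations(vertices, 2):
        shared_prey = prey[x] & prey[y]
        shared_enemies = enemies[x] & enemies[y]
        edge = frozenset((x, y))
        if shared_prey:
            graphs["competition"].add(edge)
        if shared_enemies:
            graphs["common_enemy"].add(edge)
        if shared_prey and shared_enemies:
            graphs["competition_common_enemy"].add(edge)
        if shared_prey or shared_enemies:
            graphs["niche"].add(edge)
        if len(shared_prey) >= p:
            graphs["p_competition"].add(edge)
    return graphs


# A made-up acyclic digraph: e preys on a, b, d; a and b both prey on c and d.
vertices = ["a", "b", "c", "d", "e"]
arcs = [("e", "a"), ("e", "b"), ("e", "d"),
        ("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]
graphs = variant_graphs(vertices, arcs, p=2)

for name, edges in graphs.items():
    print(name, sorted("".join(sorted(e)) for e in edges))
```

In this example a and b are the only pair with two common prey, so the 2-competition graph has the single edge {a, b}, while the ordinary competition graph also joins e to a and to b through the shared prey d.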
5 Closing Comment
Graph theory has found widespread application in numerous fields. In turn, these fields have stimulated the development of many new graph-theoretical concepts and led to many challenging graph theory problems. We can anticipate that the continued interplay between graph theory and many areas of application will lead to important new developments as we begin a new century.
References
1. Alizadeh, F., Karp, R.M., Newberg, L.A., and Weisser, D.K., Physical mapping of chromosomes: A combinatorial problem in molecular biology, Algorithmica, 13 (1995), 52-76.
2. Alizadeh, F., Karp, R.M., Weisser, D.K., and Zweig, G., Physical mapping of chromosomes using unique probes, J. Computational Biology, 2 (1995), 159-184.
3. Alon, N., Restricted colorings of graphs, in K. Walker (ed.), Surveys in Combinatorics, Proc. 14th British Combinatorial Conference, London Math. Soc. Lecture Notes Series, Vol. 187, Cambridge University Press, Cambridge, 1993, pp. 1-33.
4. Alon, N., and Tarsi, M., Colorings and orientations of graphs, Combinatorica, 12 (1992), 125-134.
5. Anderson, C.A., Loop and cyclic niche graphs, Linear Alg. & Applications, 217 (1995), 5-13.
6. Anderson, L.G., A simulation study of some dynamic channel assignment algorithms in a high capacity mobile telecommunications system, IEEE Trans. Commun., 21 (1973), 1294-1301.
7. Benzer, S., On the topology of the genetic fine structure, Proc. Nat. Acad. Sci. USA, 45 (1959), 1607-1620.
8. Benzer, S., The fine structure of the gene, Sci. Amer., 206 (1962), 70-84.
9. Bodin, L.D., and Friedman, A.J., Scheduling of Committees for the New York State Assembly, Tech. Report USE No. 71-9, Urban Science and Engineering, State University of New York, Stony Brook, 1971.
10. Bodlaender, H.L., and de Fluiter, B., On intervalizing k-colored graphs for DNA physical mapping, Discrete Appl. Math., 71 (1996), 55-77.
11. Bonias, I., T-Colorings of Complete Graphs, Ph.D. Thesis, Department of Mathematics, Northeastern University, Boston, MA, 1991.
12. Booth, K.S., and Lueker, G.S., Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms, J. Comput. Syst. Sci., 13 (1976), 335-379.
13. Bowser, S., and Cable, C.A., Some recent results on niche graphs, Discrete Appl. Math., 30 (1991), 101-108.
14. Bowser, S., Cable, C.A., and Lundgren, J.R., Niche graphs and mixed pairs of tournaments, J. Graph Theory, 31 (1999), 319-332.
15. Briand, F., and Cohen, J.E., Community food webs have scale-invariant structure, Nature, Lond., 307 (1984), 264-266.
16. Burnett, R.W., Fisher, H.I., and Zim, H.S., Zoology: An Introduction to the Animal Kingdom, Golden Press, New York, 1958.
17. Cable, C., Jones, K., Lundgren, J.R., and Seager, S., Niche graphs, Discrete Appl. Math., 23 (1989), 231-241.
18. Chang, G.J., and Kuo, D., The L(2,1)-labeling problem on graphs, SIAM J. Discrete Math., 9 (1996), 309-316.
19. Cohen, J.E., Interval graphs and food webs: A finding and a problem, RAND Corporation Document 17696-PR, Santa Monica, CA, 1968.
20. Cohen, J.E., Food webs and the dimensionality of trophic niche space, Proc. Nat. Acad. Sci., 74 (1977), 4533-4536.
21. Cohen, J.E., Food Webs and Niche Space, Princeton University Press, Princeton, NJ, 1978.
22. Cohen, J.E., and Briand, F., Trophic links of community food webs, Proc. Nat. Acad. Sci. USA, 81 (1984), 4105-4109.
23. Cohen, J.E., Briand, F., and Newman, C.M., A stochastic theory of community food webs III. Predicted and observed lengths of food chains, Proc. R. Soc. Lond., B228 (1986), 317-353.
24. Cohen, J.E., and Newman, C.M., A stochastic theory of community food webs I. Models and aggregated data, Proc. R. Soc. Lond., B224 (1985), 421-448.
25. Cohen, J.E., Newman, C.M., and Briand, F., A stochastic theory of community food webs II. Individual webs, Proc. R. Soc. Lond., B224 (1985), 449-461.
26. Coombs, C.H., and Smith, J.E.K., On the detection of structures in attitudes and developmental processes, Psych. Rev., 80 (1973), 337-351.
27. Cozzens, M.B., Higher and Multi-dimensional Analogues of Interval Graphs, Ph.D. thesis, Department of Mathematics, Rutgers University, New Brunswick, NJ, 1981.
28. Cozzens, M.B., and Mahadev, N.V.R., Consecutive one's properties for matrices and graphs including variable diagonal entries, in F.S. Roberts (ed.), Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, Springer-Verlag, New York, 1989, pp. 75-94.
29. Cozzens, M.B., and Roberts, F.S., T-colorings of graphs and the channel assignment problem, Congressus Numerantium, 35 (1982), 191-208.
30. Cozzens, M.B., and Roberts, F.S., Greedy algorithms for T-colorings and the meaningfulness of conclusions about them, J. Comb., Info., & Syst. Sci., 16 (1991), 286-299.
31. Dutton, R.D., and Brigham, R.C., A characterization of competition graphs, Discrete Applied Math., 6 (1983), 315-317.
32. Erdős, P., Rubin, A.L., and Taylor, H., Choosability in graphs, in Proc. West-Coast Conference on Combinatorics, Graph Theory and Computing, Arcata, CA, Congressus Numerantium, 26 (1979), 125-157.
33. Eschen, E.M., and Spinrad, J.P., An O(n^2) algorithm for circular-arc graph recognition, Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, Austin TX, 1993, ACM, New York, 1993, 128-137.
34. Fishburn, P.C., Interval Graphs and Interval Orders, Wiley, New York, 1985.
35. Fishburn, P.C., and Gehrlein, W.V., Niche numbers, J. Graph Theory, 16 (1992), 131-139.
36. Fishburn, P.C., and Roberts, F.S., Full color theorems for L(2,1)-colorings, DIMACS Technical Report 2000-09, 2000.
37. Fishburn, P.C., and Roberts, F.S., Minimal forbidden graphs for L(2,1)-colorings, DIMACS Technical Report, 2000.
38. Fisher, D.C., Lundgren, J.R., Merz, S.K., and Reid, K.B., Domination graphs of tournaments and digraphs, Congr. Numer., 108 (1995), 97-107.
39. Fisher, D.C., Lundgren, J.R., Merz, S.K., and Reid, K.B., The domination and competition graphs of a tournament, in H.H. Cho and S.G. Hahn (eds.), Proc. of Workshops in Pure Mathematics, Vol. 16, Part I, 1997, 27-38.
40. Frankl, P., and Füredi, Z., Extremal problems concerning Kneser graphs, J. Comb. Theory, B40 (1986), 270-284.
41. Fraughnaugh, K.F., Lundgren, J.R., Maybee, J.S., Merz, S.K., and Pullman, N.J., Competition graphs of strongly connected and hamiltonian digraphs, SIAM J. Discr. Math., 8 (1995), 179-185.
42. Fulkerson, D.R., and Gross, O.A., Incidence matrices and interval graphs, Pacific J. Math., 15 (1965), 835-855.
43. Füredi, Z., Competition graphs and clique dimensions, Random Structures and Algorithms, 1 (1990), 183-189.
44. Garey, M.R., and Johnson, D.S., The complexity of near-optimal graph coloring, J. ACM, 23 (1976), 43-49.
45. Gehrlein, W.V., and Fishburn, P.C., The smallest graphs with niche number three, Comput. Math. Appl., 27 (1994), 53-57.
46. Georges, J.P., and Mauro, D.W., On the size of graphs labeled with a condition at distance two, J. Graph Theory, 22 (1996), 47-57.
47. Georges, J.P., Mauro, D.W., and Whittlesey, M.A., Relating path coverings to vertex labelings with a condition at distance two, Discrete Math., 135 (1994), 103-111.
48. Gilbert, E.N., unpublished technical memorandum, Bell Telephone Laboratories, Murray Hill, NJ, 1972.
49. Gilmore, P.C., and Hoffman, A.J., A characterization of comparability graphs and of interval graphs, Canad. J. Math., 16 (1964), 539-548.
50. Golumbic, M.C., Algorithmic Graph Theory and Perfect Graphs, Academic Press, New York, 1980.
51. Greenberg, H.J., Lundgren, J.R., and Maybee, J.S., Graph-theoretic foundations of computer-assisted analysis, in H.J. Greenberg and J.S. Maybee (eds.), Computer-assisted Analysis and Model Simplification, Academic Press, New York, 1981, pp. 481-495.
52. Greenberg, H.J., Lundgren, J.R., and Maybee, J.S., Rectangular matrices and signed graphs, SIAM J. Alg. & Discrete Math., 4 (1983), 50-61.
53. Greenberg, H.J., Lundgren, J.R., and Maybee, J.S., Inverting graphs of rectangular matrices, Discr. Appl. Math., 8 (1984), 255-265.
54. Greenberg, H.J., Lundgren, J.R., and Maybee, J.S., Inverting signed graphs, SIAM J. Alg. & Discr. Meth., 5 (1984), 216-223.
55. Griggs, J.R., and West, D.B., Extremal values of the interval number of a graph, Discrete Math., 28 (1979), 37-47.
56. Griggs, J.R., and Yeh, R.K., Labelling graphs with a condition at distance 2, SIAM J. Discrete Math., 5 (1992), 586-595.
57. Gusfield, D., Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press, Cambridge, UK, 1997.
58. Haemers, W., On some problems of Lovász concerning the Shannon capacity of a graph, IEEE Trans. Inform. Theory, 25 (1979), 231-232.
59. Hajós, G., Über eine Art von Graphen, Internat. Math. Nachr., 47 (1957), 65.
60. Hale, W.K., Frequency assignment: Theory and applications, Proc. IEEE, 68 (1980), 1497-1514.
61. Harary, F., and Trotter, W.T., On double and multiple interval graphs, J. Graph Theory, 3 (1979), 205-211.
62. Hell, P., and Nešetřil, J., On the complexity of H-coloring, J. Comb. Theory, B48 (1990), 92-110.
63. Hefner, K., Jones, K., Kim, S., Lundgren, J.R., and Roberts, F.S., (i,j) competition graphs, Discr. Appl. Math., 32 (1991), 241-262.
64. Hubert, L., Some applications of graph theory and related non-metric techniques to problems of approximate seriation: The case of symmetric proximity measures, British J. Math. Statist. Psychol., 27 (1974), 133-153.
65. Isaak, G., Kim, S., McKee, T., McMorris, F., and Roberts, F.S., 2-competition graphs, SIAM J. Discr. Math., 5 (1992), 524-538.
66. Jacobson, M.S., On the p-edge clique cover numbers of complete bipartite graphs, SIAM J. Discr. Math., 5 (1992), 539-544.
67. Jensen, T.R., and Toft, B., Graph Coloring Problems, Wiley Interscience, New York, 1995.
68. Jiang, T., and Karp, R.M., Mapping clones with a given ordering or interleaving, in Proceedings 8th ACM-SIAM Symp. on Discrete Algs., 1997.
69. Jonas, K., Graph Coloring Analogues with a Condition at Distance Two: L(2,1)-Labelings and List λ-Labelings, Ph.D. thesis, University of South Carolina, 1993.
70. Jones, K., Lundgren, J.R., Roberts, F.S., and Seager, S., Some remarks on the double competition number of a graph, Congr. Numerantium, 60 (1987), 17-24.
71. Kaplan, H., and Shamir, R., Pathwidth, bandwidth and completion problems to proper interval graphs with small cliques, SIAM J. Computing, 25 (1996), 540-561.
72. Kaplan, H., Shamir, R., and Tarjan, R.E., Tractability of parameterized completion problems in chordal and interval graphs: Minimum fill-in and physical mapping, in Proceedings 35th IEEE Symp. Found. Computer Science, 1994, pp. 780-791.
73. Kendall, D.G., A statistical approach to Flinders Petrie's sequence dating, Bull. Internat. Statist. Inst., 40 (1963), 657-680.
74. Kendall, D.G., Incidence matrices, interval graphs, and seriation in archaeology, Pacific J. Math., 28 (1969), 565-570.
75. Kendall, D.G., Some problems and methods in statistical archaeology, World Archaeology, 1 (1969), 61-76.
76. Kendall, D.G., A mathematical approach to seriation, Philos. Trans. Roy. Soc. London Ser. A, 269 (1971), 125-135.
77. Kendall, D.G., Abundance matrices and seriation in archaeology, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 17 (1971), 104-112.
78. Kendall, D.G., Seriation from abundance matrices, in F.R. Hodson, et al. (eds.), Mathematics in the Archaeological and Historical Sciences, Edinburgh University Press, Edinburgh, 1971.
79. Kim, S-R., Competition Graphs and Scientific Laws for Food Webs and Other Systems, Ph.D. thesis, Department of Mathematics, Rutgers University, New Brunswick, NJ, 1988.
80. Kim, S-R., The competition number and its variants, Annals of Discrete Mathematics, 55 (1993), 313-325.
81. Kim, S-R., McKee, T., McMorris, F., and Roberts, F.S., p-competition numbers, Discr. Appl. Math., 46 (1993), 87-92.
82. Kim, S-R., McKee, T., McMorris, F., and Roberts, F.S., p-competition graphs, Linear Alg. & Applications, 217 (1995), 167-178.
83. Kim, S-R., and Roberts, F.S., On Opsut's conjecture for the competition number, Congr. Numer., 71 (1990), 173-176.
84. Kim, S-R., and Roberts, F.S., Competition numbers of graphs with a small number of triangles, Discr. Appl. Math., 78 (1997), 153-162.
85. Kim, S-R., and Roberts, F.S., The elimination algorithm for the competition number, Ars Combinatoria, 50 (1998), 97-113.
86. Kim, S-R., Roberts, F.S., and Seager, S., On 1 0 1-clear (0,1) matrices and the double competition number of bipartite graphs, J. Comb., Info., & Syst. Sci., 17 (1992), 109-143.
87. Klee, V., What are the intersection graphs of arcs in a circle?, Amer. Math. Monthly, 76 (1969), 810-813.
88. Kratochvíl, J., Tuza, Z., and Voigt, M., New trends in the theory of graph colorings: Choosability and list coloring, in R.L. Graham, J. Kratochvíl, J. Nešetřil, and F.S. Roberts (eds.), Contemporary Trends in Discrete Mathematics, DIMACS Series, Vol. 49, American Mathematical Society, Providence, RI, 1999, pp. 183-197.
89. Langley, L., Lundgren, J.R., and Merz, S.K., The competition graphs of interval digraphs, Congr. Numer., 107 (1995), 37-40.
90. Langley, L., Lundgren, J.R., Merz, S.K., and Rasmussen, C., Digraphs with chordal or interval competition and resource graphs, mimeographed, Department of Mathematics, University of Colorado, Denver, 1997.
91. Lekkerkerker, C.B., and Boland, J. Ch., Representation of a finite graph by a set of intervals on the real line, Fund. Math., 51 (1962), 45-64.
92. Liu, D.D., Graph Homomorphisms and the Channel Assignment Problem, Ph.D. Thesis, Department of Mathematics, University of South Carolina, Columbia, SC, 1991.
93. Lovász, L., Kneser's conjecture, chromatic number, and homotopy, J. Combin. Theory, A25 (1978), 319-324.
94. Lovász, L., On the Shannon capacity of a graph, IEEE Trans. Inform. Theory, 25 (1979), 1-7.
95. Lundgren, J.R., Food webs, competition graphs, competition-common enemy graphs, and niche graphs, in F.S. Roberts (ed.), Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, Springer-Verlag, New York, 1989, pp. 221-243.
96. Lundgren, J.R., and Maybee, J.S., A characterization of graphs with competition number m, Discr. Appl. Math., 6 (1983), 319-322.
97. Lundgren, J.R., and Maybee, J.S., Food webs with interval competition graphs, in Graphs and Applications: Proceedings of the First Colorado Symposium on Graph Theory, Wiley, New York, 1984, pp. 231-244.
98. Lundgren, J.R., Maybee, J.S., and Rasmussen, C.W., An application of generalized competition graphs to the channel assignment problem, Congr. Numer., 7 (1990), 217-224.
99. Lundgren, J.R., Maybee, J.S., and Rasmussen, C.W., Interval competition graphs of symmetric digraphs, Discrete Math., 119 (1993), 113-122.
100. Lundgren, J.R., McKenna, P.A., Langley, L., Merz, S.K., and Rasmussen, C.W., The p-competition graphs of strongly connected and Hamiltonian digraphs, Ars Combinatoria, 47 (1997), 161-172.
101. Lundgren, J.R., Merz, S.K., and Rasmussen, C.W., A characterization of graphs with interval squares, Congr. Numer., 98 (1993), 132-142.
102. Marczewski, E., Sur deux propriétés des classes d'ensembles, Fund. Math., 33 (1945), 303-307.
103. Newman, C.M., and Cohen, J.E., A stochastic theory of community food webs IV. Theory of food chain lengths in large webs, Proc. R. Soc. Lond., B228 (1986), 355-377.
104. Nowakowski, R.J., and Rall, D.F., Associative graph products and their independence, domination, and coloring numbers, Discussiones Math. Graph Theory, 16 (1996), 53-79.
105. Opsut, R.J., On the computation of the competition number of a graph, SIAM J. Alg. & Discr. Meth., 3 (1982), 420-428.
106. Opsut, R.J., and Roberts, F.S., On the fleet maintenance, mobile radio frequency, task assignment, and traffic phasing problems, in G. Chartrand, Y. Alavi, D.L. Goldsmith, L. Lesniak-Foster, and D.R. Lick (eds.), The Theory and Applications of Graphs, Wiley, New York, 1981, pp. 479-492.
107. Opsut, R.J., and Roberts, F.S., I-colorings, I-phasings, and I-intersection assignments for graphs, and their applications, Networks, 13 (1983), 327-345.
108. Opsut, R.J., and Roberts, F.S., Optimal I-intersection assignments for graphs: A linear programming approach, Networks, 13 (1983), 317-326.
109. Provan, J.S., Determinacy in linear systems and networks, SIAM J. Discr. Meth., 4 (1983), 262-278.
110. Provan, J.S., and Kydes, A., Correlation and determinacy in network models, BNL Report 51243, Brookhaven National Laboratory, Upton, NY, 1980.
111. Raychaudhuri, A., Intersection Assignments, T-coloring, and Powers of Graphs, Ph.D. Thesis, Department of Mathematics, Rutgers University, New Brunswick, NJ, 1985.
112. Raychaudhuri, A., Further results on T-coloring and frequency assignment problems, SIAM J. Discr. Math., 7 (1994), 605-613.
113. Raychaudhuri, A., Optimal multiple interval assignments in frequency assignment and traffic phasing, Discr. Appl. Math., 40 (1992), 319-332.
114. Raychaudhuri, A., and Roberts, F.S., Generalized competition graphs and their applications, in P. Brucker and A. Pauly (eds.), Methods of Operations Research, Vol. 49, Anton Hein, Königstein, W. Germany, 1985, pp. 295-311.
115. Roberts, F.S., Indifference Graphs, Ph.D. thesis, Department of Mathematics, Stanford University, Stanford, CA, 1968.
116. Roberts, F.S., Indifference graphs, in F. Harary (ed.), Proof Techniques in Graph Theory, Academic Press, New York, 1969, pp. 139-146.
117. Roberts, F.S., On the boxicity and cubicity of a graph, in W.T. Tutte (ed.), Recent Progress in Combinatorics, Academic Press, New York, 1969, pp. 301-310.
118. Roberts, F.S., Discrete Mathematical Models, with Applications to Social, Biological, and Environmental Problems, Prentice-Hall, Englewood Cliffs, NJ, 1976.
119. Roberts, F.S., Food webs, competition graphs, and the boxicity of ecological phase space, in Y. Alavi and D. Lick (eds.), Theory and Applications of Graphs, Springer-Verlag, New York, 1978, pp. 477-490.
120. Roberts, F.S., Graph Theory and its Applications to Problems of Society, CBMS-NSF Monograph No. 29, SIAM, Philadelphia, 1978.
121. Roberts, F.S., Indifference and seriation, in F. Harary (ed.), Advances in Graph Theory, New York Academy of Sciences, 1979, pp. 171-180.
122. Roberts, F.S., Measurement Theory, with Applications to Decisionmaking, Utility, and the Social Sciences, Addison-Wesley, Reading, MA, 1979.
123. Roberts, F.S., On the mobile radio frequency assignment problem and the traffic light phasing problem, Annals, NY Acad. Sci., 319 (1979), 466-483.
124. Roberts, F.S., Applied Combinatorics, Prentice-Hall, Englewood Cliffs, NJ, 1984.
125. Roberts, F.S., Seven fundamental ideas in the application of combinatorics and graph theory in the biological and social sciences, in F.S. Roberts (ed.), Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, Vol. 17 of IMA Volumes in Mathematics and its Applications, Springer-Verlag, New York, 1989, pp. 1-37.
126. Roberts, F.S., T-colorings of graphs: Recent results and open problems, Discr. Math., 93 (1991), 229-245.
127. Roberts, F.S., From garbage to rainbows: Generalizations of graph coloring and their applications, in Y. Alavi, G. Chartrand, O.R. Oellermann, and A.J. Schwenk (eds.), Graph Theory, Combinatorics, and Applications, Vol. 2, Wiley, New York, 1991, pp. 1031-1052.
128. Roberts, F.S., Competition graphs and phylogeny graphs, in L. Lovász (ed.), Graph Theory and Combinatorial Biology, Bolyai Society Mathematical Studies, 7 (1999), 333-362.
129. Roberts, F.S., and Sheng, L., Phylogeny numbers of arbitrary digraphs, in B. Mirkin, F.R. McMorris, F.S. Roberts, and A. Rzhetsky (eds.), Mathematical Hierarchies and Biology, Vol. 37, DIMACS Series, American Math. Society, Providence, RI, 1997, pp. 233-237.
130. Roberts, F.S., and Sheng, L., Phylogeny numbers, Discrete Appl. Math., 87 (1998), 213-228.
131. Roberts, F.S., and Sheng, L., Phylogeny numbers for graphs with two triangles, Discrete Appl. Math., 103 (2000), 191-207.
132. Roberts, F.S., and Steif, J.E., A characterization of competition graphs of arbitrary digraphs, Discrete Appl. Math., 6 (1983), 323-326.
133. Sakai, D., Labeling chordal graphs: Distance two conditions, SIAM J. Discrete Math., 7 (1994), 133-140.
134. Scott, D.D., The competition-common enemy graph of a digraph, Discr. Appl. Math., 17 (1987), 269-280.
135. Seager, S.M., The double competition number of some triangle-free graphs, Discrete Appl. Math., 29 (1990), 265-269.
136. Setubal, J., and Meidanis, J., Introduction to Computational Molecular Biology, PWS Publishing Co., Boston, MA, 1997.
137. Shannon, C.E., The zero-error capacity of a noisy channel, IRE Trans. Inform. Theory, IT-2 (1956), 8-19.
138. Sheng, L., Some Graph Theoretic Approaches to Problems of the Social and Biological Sciences: Social Roles, Phylogenetic Trees, and Physical Mapping, Ph.D. thesis, Rutgers Center for Operations Research, Rutgers University, New Brunswick, NJ, 1998.
139. Sheng, L., Wang, C., and Zhang, P.S., Tagged probe interval graphs, DIMACS Technical Report 98-12, 1998.
140. Skrien, D., Chronological orderings of interval graphs, Discr. Appl. Math., 8 (1984), 69-83.
141. Spinrad, J., Circular-arc graphs with clique cover number two, J. Comb. Theory, B44 (1988), 300-306.
142. Stahl, S., n-tuple colorings and associated graphs, J. Comb. Theory, B20 (1976), 185-203.
143. Steif, J.E., Frame Dimension, Generalized Competition Graphs, and Forbidden Sublist Characterizations, Henry Rutgers Thesis, Department of Mathematics, Rutgers University, New Brunswick, NJ, 1982.
144. Stoffers, K.E., Scheduling of traffic lights - a new approach, Transp. Res., 2 (1968), 199-234.
145. Sugihara, G., Graph theory, homology, and food webs, in S.A. Levin (ed.), Population Biology, Proc. Symposia in Applied Mathematics, Vol. 30, American Mathematical Society, Providence, RI, 1983, pp. 83-101.
146. Tesman, B., T-Colorings, List T-Colorings, and Set T-Colorings of Graphs, Ph.D. Thesis, Department of Mathematics, Rutgers University, New Brunswick, NJ, 1989.
147. Thomassen, C., Every planar graph is 5-choosable, J. Combin. Theory, B62 (1994), 180-181.
148. Trotter, W.T., Interval graphs, interval orders and their generalizations, in R.D. Ringeisen and F.S. Roberts (eds.), Applications of Discrete Mathematics, SIAM, Philadelphia, PA, 1988, pp. 45-58.
149. Tucker, A.C., Characterizing circular-arc graphs, Bull. Amer. Math. Soc., 75 (1970), 1257-1260.
150. Tucker, A.C., Matrix characterizations of circular-arc graphs, Pacific J. Math., 39 (1971), 535-545.
151. Tucker, A.C., Perfect graphs and an application to optimizing municipal services, SIAM Review, 15 (1973), 585-590.
152. Tucker, A.C., Structure theorems for some circular-arc graphs, Discrete Math., 7 (1974), 167-195.
153. Tucker, A.C., Circular arc graphs: New uses and a new algorithm, in Y. Alavi and D. Lick (eds.), Theory and Applications of Graphs, Springer-Verlag Lecture Notes #642, 1978.
154. Tucker, A.C., An efficient test for circular-arc graphs, SIAM J. on Computing, 9 (1980), 1-24.
155. Tuza, Z., Graph colorings with local constraints - a survey, Discussiones Mathematicae - Graph Theory, 17 (1997), 161-228.
156. Vizing, V.G., Coloring the vertices of a graph in prescribed colors, Metody Diskret. Anal. v Teorii Kodov i Schem, 29 (1976), 3-10. (In Russian)
157. Voigt, M., List colorings of planar graphs, Discrete Math., 120 (1993), 215-219.
158. Wang, C., Competition Graphs, Threshold Graphs and Threshold Boolean Functions, Ph.D. thesis, Rutgers Center for Operations Research, Rutgers University, New Brunswick, NJ, 1991.
159. Wang, C., On critical graphs for Opsut's conjecture, Ars Combinatoria, 40 (1992), 183-203.
160. Wang, C., Competition graphs and resource graphs of digraphs, Ars Combinatoria, 40 (1995), 3-48.
161. Wang, C., Competitive inheritance and limitedness of graphs, J. Graph Theory, 19 (1995), 353-366.
162. Wang, D., The Channel Assignment Problem and Closed Neighborhood Containment Graphs, Ph.D. Thesis, Department of Mathematics, Northeastern University, Boston, MA, 1985.
163. Waterman, M.S., Some mathematics for DNA restriction mapping, in F.S. Roberts (ed.), Applications of Combinatorics and Graph Theory in the Biological and Social Sciences, Springer-Verlag, New York, 1989, pp. 337-345.
164. Waterman, M.S., Introduction to Computational Biology: Maps, Sequences and Genomes, Chapman Hall, London, 1995.
165. Waterman, M.S., and Griggs, J.R., Interval graphs and maps of DNA, Bull. Math. Biol., 48 (1986), 189-195.
166. Welsh, D.J.A., and Whittle, G.P., Arrangements, channel assignments, and associated polynomials, Advances in Applied Math., 23 (1999), 375-406.
167. Whittlesey, M.A., Georges, J.P., and Mauro, D.W., On the λ-number of Q_n and related graphs, Discrete Math., 8 (1995), 499-506.
168. Yannakakis, M., The complexity of the partial order dimension problem, SIAM J. Alg. & Discr. Meth., 3 (1982), 351-358.
169. Yeh, R.K., Labeling Graphs with a Condition at Distance Two, Ph.D. thesis, University of South Carolina, Columbia, 1990.
242
170. Zhang, P., Schon, E.A., Fischer, S.F., Cayanis, E., Weiss, J., Kistler, S., and Bourne, P.E., An algorithm based on graph theory for the assembly of contigs in physical mapping of DNA, CABIOS, 10 (1994), 309-317.
DUALITY AND ITS CONSEQUENCES FOR ORDERED COHOMOLOGY OF FINITE TYPE SUBSHIFTS*

K.H. KIM
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A., and Fellow, Korean Academy of Science and Technology (KAST)

P.W. ROUSH
Mathematics Research Group, Alabama State University, Montgomery, AL 36101-0271, U.S.A.

SUSAN G. WILLIAMS
Department of Mathematics and Statistics, University of South Alabama, Mobile, AL 36608, U.S.A.
1 INTRODUCTION
In recent years a concept of ordered cohomology introduced by Y. T. Poon [9] has proved important for zero-dimensional dynamical systems. Here we show that the ordered cohomology group without its order structure corresponds by duality to a homology group, and this in turn arises from a fundamental group. The unordered fundamental groups of all irreducible subshifts of finite type (SFT) of positive entropy are isomorphic, but they also carry some order structure. The fundamental group can be used to classify coverings of irreducible SFT just as with path connected topological spaces.

An SFT $S_A$ represented by an $n \times n$ (0,1)-matrix $A$ is the set of all sequences $(x_i)$ of vertices corresponding to biinfinite paths in the graph with adjacency matrix $A$, where the shift operator shifts coordinates by 1. It is topologized by pointwise convergence (product topology). The cohomology [3] of a dynamical system $(X, s)$ is the quotient of the group of all continuous functions $f$ from $X$ into the integers with the discrete topology, by the subgroup of coboundaries $g(x) - g(sx)$ for continuous $g$. Its order subsemigroup is the subsemigroup generated by nonnegative $f$, and it has a unit, the class of the constant function 1.

Here we explore several issues related to the question of how much of the geometry of an SFT is determined by the ordered cohomology. Ordered cohomology is more directly related to orbit closure equivalence, that is, homeomorphism preserving closures of orbits, than it is to orbit equivalence.

*The first two authors were partially supported by NSF grants DMS 9024813, DMS 9405004, DMS 9900265. The third author was partially supported by NSF grant DMS 0071004.
2 FUNDAMENTAL GROUP AND HOMOLOGY

We refer the reader to [7] for basic definitions and results concerning subshifts.
DEFINITION 2.1. Let $S_A$ be an SFT represented by a (0,1)-matrix $A$ with graph $G$, and let $E^n(G)$ denote the $n$th iterated edge graph; the vertices of $E^n(G)$ are identified with blocks $[x_0, \dots, x_{n-1}]$ corresponding to walks of length $n$ in $G$. Given $b = (b_i) \in S_A$, consider the fundamental group of a geometric realization of $E^n(G)$ with basepoint the vertex $[b_0, \dots, b_{n-1}]$. The natural graph homomorphism from $E^{n+1}(G)$ to $E^n(G)$ taking $[b_0, \dots, b_n]$ to $[b_0, \dots, b_{n-1}]$ induces a map on their fundamental groups. The fundamental group of $S_A$ with basepoint $b$, denoted $\pi(S_A, b)$, is the inverse limit of this sequence of groups. It is a topological group given the inverse limit topology, where the graph fundamental groups have the discrete topology.
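The construction in Definition 2.1 can be sketched computationally. The following minimal example builds the iterated edge graph of a (0,1)-matrix and computes the rank of the free fundamental group of its geometric realization, using the standard fact that a connected graph with $V$ vertices and $E$ edges has free fundamental group of rank $E - V + 1$. The function names and the choice of the full two-shift as input are illustrative assumptions, not part of the paper; vertices of $E^n(G)$ are taken to be the allowed blocks of $n$ vertices, as in the definition.

```python
def blocks(adj, n):
    """All allowed blocks [x_0, ..., x_{n-1}] of n vertices in the graph of adj."""
    size = len(adj)
    result = [(v,) for v in range(size)]
    for _ in range(n - 1):
        # Extend each block by every vertex reachable from its last vertex.
        result = [w + (v,) for w in result for v in range(size) if adj[w[-1]][v]]
    return result


def edge_graph(adj, n):
    """Vertices are allowed n-blocks; each allowed (n+1)-block w gives an edge
    from its prefix w[:-1] to its suffix w[1:]."""
    vertices = blocks(adj, n)
    edges = [(w[:-1], w[1:]) for w in blocks(adj, n + 1)]
    return vertices, edges


def free_rank(vertices, edges):
    """Rank of the fundamental group of a connected graph: |E| - |V| + 1.
    Connectivity is assumed here, not checked."""
    return len(edges) - len(vertices) + 1


# Hypothetical test case: the full two-shift.
full_two_shift = [[1, 1], [1, 1]]
ranks = [free_rank(*edge_graph(full_two_shift, n)) for n in range(1, 5)]
print(ranks)  # [3, 5, 9, 17]
```

For this example the ranks grow without bound as $n$ increases, which illustrates why the inverse limit, rather than any single graph, carries the invariant.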
This concept of fundamental group can be extended to general 0-dimensional systems in the manner of Definition 2.7 below.

PROPOSITION 2.2 A homomorphism $\phi : S_A \to S_B$ between shifts of finite type induces homomorphisms from $\pi(S_A, x)$ to $\pi(S_B, \phi(x))$. If $\phi$ is a conjugacy, the fundamental groups are isomorphic.

Proof: A $k$-block shift homomorphism induces maps from $E^{n+k}(G)$ to $E^n(H)$ and hence maps on their fundamental groups and on the inverse limit system. Since the inverse limit is unchanged by a shift in the subscripts, inverse maps on subshifts give inverse mappings on the fundamental groups. □

For irreducible SFT a change of basepoint yields an isomorphic fundamental group. Indeed, since the graph $E^n(G)$ is connected, a change of base points induces isomorphisms of the fundamental groups, by adding paths between the two base points at each end of a loop. To make this consistent over higher edge graphs we must make the image in $E^n(G)$ of such a path in $E^{n+1}(G)$ homotopic relative to its end points to the path chosen in $E^n(G)$. This can be done by taking any path between the basepoints in $E^{n+1}(G)$ and adding loops at the endpoints to adjust the homotopy class, using the following result.

PROPOSITION 2.3 The fundamental group of a strongly connected directed graph $G$ is generated by homotopy classes of closed directed walks which begin at the base point. The natural mappings from $E^{n+1}(G)$ to $E^n(G)$ always induce epimorphisms on their free fundamental groups.
Proof: Regarding the first statement, by definition the fundamental group is generated by the homotopy classes of undirected loops at the basepoint $b$. If $G$ is connected it has a directed spanning tree with root $b$, and so its fundamental group is generated by the walks $x$ made up by going out on the tree by a path $p_1$ to some vertex $v$, passing by a single edge $e$ to another vertex $w$, and then taking an inverse path $p_2^{-1}$ along the spanning tree to $b$. For, using such undirected paths, we can find a homotopy of any undirected walk starting and ending at $b$ to an undirected walk along the edges of the spanning directed tree. Let $p_3$ be a directed walk from $w$ to $b$. Then $x$ is homotopic to $p_1 e p_3 (p_2 p_3)^{-1}$. This method converts an undirected closed walk to a homotopic product of directed walks and their inverses.

To prove the second statement, it suffices to look at a single graph $G$ and its edge graph $E^1(G)$. Let the basepoint in $E^1(G)$ be a vertex $[ba]$ mapping to the edge $ba$ in $G$, where $b$ is the basepoint of $G$. The images in $G$ of directed closed walks in $E^1(G)$ from $[ba]$ to $[ba]$ will consist of all closed walks starting at vertex $b$ and ending $ba$. Products and inverses of such walks are also in the image. Let $bcP$ be any walk from $b$ to $b$ in $G$, and let $baP_1$ be a walk from $b$ to $b$ that goes first to $a$. Then $(baP_1)^{-1}\, baP_1\, bcP\, baP_1\, (baP_1)^{-1}$ represents the same homotopy class as $bcP$ and lifts to $E^1(G)$. The necessity of such a formula and the way it works can be seen if we consider a graph made up of two cycles only, $bcP$ and $baP_1$, with only vertex $b$ in common. □
If $S_A$ is an irreducible SFT of positive entropy then by the above its fundamental group is an inverse limit of epimorphisms of free groups of unbounded finite rank. With a little group theory ([6], Thm. 3.2, applied to images of generators), this implies that the fundamental group of an irreducible SFT is an inverse limit of a system of free groups $F_n$ on $n$ generators under the map sending the $n$th generator to the identity. Hence all fundamental groups of irreducible subshifts of finite type and positive entropy are isomorphic.

By a covering of a subshift $X$ we will mean a subshift $Y$ together with an epimorphism
Proof: A continuous homomorphism from $\pi(S_A)$ to $\Gamma$ must induce a homomorphism from the fundamental group of some $E^n(G)$ to $\Gamma$. The corresponding covering space of $E^n(G)$ can be regarded as a digraph $G_1$ by taking the orientations on the edges inherited from $E^n(G)$. The SFT represented by $G_1$ is a regular covering of $S_A$ with covering group $\Gamma$. We can see that this construction is consistent with taking higher edge graphs by considering $E^1(G_1)$. There is an induced $\Gamma$ action on $E^1(G_1)$, giving a quotient graph $G_2$ whose vertices correspond to edges of $G_1$ taken modulo $\Gamma$, and edges to paths of length 2 in $G_1$ modulo $\Gamma$. There is a natural quotient map from $G_2$ to $E^1(G)$, which in fact is a graph isomorphism since $\Gamma$ acts freely on $G_1$.

For the last statement we use a lemma of J. Franks (Proposition 2.9 in [5]) saying that any finite group action on a subshift of finite type $S_A$ can be represented by a permutation of vertices on some graph $G$ representing a vertex shift conjugate to $S_A$. A free action can be represented by a free permutation of vertices if we pass to some $E^n(G)$. The quotient of the graph by the group is a topological regular covering, and hence is induced by a quotient of its fundamental group. □
DEFINITION 2.5 The positivity structure on the fundamental group of a subshift of finite type associated with a partially ordered group $G_0$ consists of the set of all homomorphisms from the fundamental group into $G_0$ such that the image of every closed walk starting at the base point and made up of a concatenation of directed walks and cancelling segments $PP^{-1}$, $P^{-1}P$ on any of the iterated edge graphs is nonnegative.
Note that a walk of this kind in $E^n(G)$ projects to such a walk in $E^{n-1}(G)$. Conversely a walk of this kind in $E^{n-1}(G)$ can be lifted, up to homotopy, to a walk of this kind in $E^n(G)$. Such homomorphisms can be constructed by taking a fixed edge graph and ordering its fundamental group so that the directed closed walks define the subsemigroup of nonnegative elements. This is defined just to generalize the ordering for cohomology.

PROPOSITION 2.6 If
Proof: A conjugacy is induced in either direction by a map of graphs which sends directed walks to directed walks. This gives isomorphisms preserving order structure for base points which correspond under the automorphism. The last statement follows from the fact that any two periodic points of sufficiently large and equal period in a mixing SFT can be conjugated by an automorphism [2]. □
When we abelianize this construction we get a concept of ordered homology for SFT which is dual to the ordered cohomology of [3] in a sense that we will describe below. We make our definitions in the setting of general zero-dimensional systems, for which there is a weaker duality result. Recall that the simplicial homology of a graph in dimension 1 is the kernel of a mapping from the free abelian group on its edges into the free abelian group on its vertices which sends each directed edge to the difference of its starting and terminal vertices. Its cohomology is the cokernel of the dual map.
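As a small illustration of the recalled definition, the sketch below forms the boundary map of a directed graph and computes the rank of its kernel, that is, the rank of the first homology group. It works over the rationals, which gives the same rank as over the integers; the helper names and the example graph are assumptions made for illustration only.

```python
from fractions import Fraction


def boundary_matrix(num_vertices, edges):
    """Rows are vertices, columns are directed edges (u, v). The column of (u, v)
    is (start vertex) - (terminal vertex), as in the text; a loop maps to zero."""
    mat = [[0] * len(edges) for _ in range(num_vertices)]
    for col, (u, v) in enumerate(edges):
        mat[u][col] += 1
        mat[v][col] -= 1
    return mat


def rank(mat):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in mat]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def first_betti(num_vertices, edges):
    """Rank of the kernel of the boundary map: |E| - rank(boundary)."""
    return len(edges) - rank(boundary_matrix(num_vertices, edges))


# Hypothetical example: a loop at vertex 0 and a 2-cycle between vertices 0 and 1.
print(first_betti(2, [(0, 0), (0, 1), (1, 0)]))  # 2
```

For a connected graph this agrees with the count $E - V + 1$, here $3 - 2 + 1 = 2$.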
Let $(X, s)$ be a zero-dimensional dynamical system. For each partition $P$ of $X$ into clopen sets we define a graph $G_P$ as follows. The vertices are elements of $P$, and we have a directed edge from the set $C$ to the set $D$ if $s(C) \cap D \neq \emptyset$. If $Q$ is a partition into clopen sets that refines $P$ there is a natural graph homomorphism from $G_Q$ to $G_P$, with the map on vertices given by inclusion.

DEFINITION 2.7 The homology of a zero-dimensional dynamical system $(X, s)$ is the inverse limit of the first homology groups of the graphs $G_P$ defined above.
The relation of ordered cohomology of subshifts to graphs is described in [3]. The homology of a subshift defined by an inverse limit of graphs (as described in [3]) will be the inverse limit of the homology of the defining graphs. The corresponding direct limit of cohomology of graphs is the same as the cohomology group in the ordered cohomology of the system. For subshifts of finite type, this follows by the theory in [3]. For more general systems we define a mapping by sending the characteristic function of an edge $CD$ in $G_P$ to the characteristic function of the clopen set $s(C) \cap D$. We check easily that this is consistent under refinement of partitions. Note that $C$, $D$ need not be disjoint. The coboundary of the characteristic function of vertex $C$ will be, up to identification, $s(C) - C$, and the group these generate passes over to the group of coboundaries in the usual definition of cohomology of dynamical systems.

The homology group of $X$ is a topological group when given the topology of an inverse limit of discrete groups. We define the positive elements to be the subsemigroup generated by elements that project to a sum of (directed) graph cycles in each of the graphs $G_P$. In the SFT case these can be identified with formal sums of periodic points. It is not true that every homology class is a difference of positive elements, but it is a limit of such differences for subshifts of finite type, and in that case this partially ordered homology contains just as much information as the ordered cohomology. Corresponding to the unit in unital ordered cohomology is not a unit, but a sort of trace sending each homology class to its total length. (By the length of a chain $\sum m_i e_i$ in a graph $G$ we mean $\sum m_i$. This is well defined on homology classes, and is preserved by the graph homomorphisms in the inverse limit system associated with $(X, s)$.) For this reason we refer to tracially ordered homology (terminology suggested by M. Boyle).
For an SFT with graph $G$ it is sufficient to take the inverse limit over the sequence of higher edge graphs $E^n(G)$. To get a sequence of graphs that corresponds to a universally refining sequence of partitions, we may take the sequence $E^{2n+1}(G)$ with the graph homomorphisms that take a vertex $[x_{-n}, \dots, x_n]$ to $[x_{-n+1}, \dots, x_{n-1}]$.
DEFINITION 2.8 Take a partition $P$ of a zero-dimensional dynamical system into clopen sets. The evaluation of a cocycle represented as a cochain $c = \sum_i n(i)\chi_{C_i}$, for clopen sets $C_i$ corresponding to edges of some graph $G_P$, on a homology class $z$ which projects to that graph as a sum of cycles $\sum_j m(j) z_j$ is

$$(c, z) = \sum_{i,j} n(i)\, m(j) \sum_{e \in z_j} \chi_{C_i}(e).$$
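For a subshift of finite type, evaluating the characteristic function of a clopen set on a graph cycle amounts to counting the points of the corresponding periodic orbit that lie in the set. The sketch below does this count for a cylinder set given by a word read from coordinate 0. The function name and the sample orbit are illustrative assumptions; the cycle is passed as a finite word read cyclically, and a cycle traversed several times is counted with multiplicity.

```python
def orbit_count(word, cycle):
    """Number of shifts of the periodic point (cycle repeated) that lie in the
    cylinder set x_0 ... x_{k-1} = word, where k = len(word)."""
    period = len(cycle)
    return sum(
        all(cycle[(start + i) % period] == word[i] for i in range(len(word)))
        for start in range(period)
    )


# Hypothetical example: the orbit of the periodic point ...122122122...
cycle = (1, 2, 2)
print(orbit_count((2,), cycle))    # 2
print(orbit_count((1, 2), cycle))  # 1
print(orbit_count((1, 1), cycle))  # 0
```

A general cochain is an integer combination of such characteristic functions, so its evaluation is the corresponding combination of counts.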
We observe that the choice of $P$ does not matter, and that any coboundary evaluates to zero on a homology cycle, which will be a sum of graph cycles. Each graph cycle corresponds to a sum $\sum_i C_i$ such that $s(C_i) \cap C_{i+1} \neq \emptyset$. A coboundary which is defined at this partition is of the form $f - sf$ and $f(s(C_i)) = f(C_{i+1})$, hence the evaluation is zero. The homology (cohomology) of a dynamical system $T$ will be denoted $G_T$ ($G^T$). For a subshift of finite type, the evaluation of the characteristic function of a clopen set on a graph cycle will be the number of periodic points of that cycle which lie in the clopen set. This follows from the defining formula. As in Lemma 3.2 of [3], in computing the homology of graphs as groups we use undirected graphs; the directedness does play a role in determining the directed edge graphs and hence the inverse limits.

THEOREM 2.9 The homology group of a general subshift is isomorphic to the group of homomorphisms from its cohomology group to $\mathbb{Z}$ with the topology of pointwise convergence. Now suppose the system is an SFT. Its cohomology is isomorphic to the group of continuous homomorphisms from homology to $\mathbb{Z}$. For an irreducible SFT, the positive cocycles are those which have positive evaluations on all periodic orbits. Conversely, finite sums of periodic orbits of total period $n$ are those homology cycles which have nonnegative evaluation on all positive cocycles and have length $n$. The unit in ordered cohomology is determined as the element sending every periodic point to its length or trace, and the trace corresponds to the evaluation of a homology class on the unit.
Proof: Evaluation gives a mapping from the homology group of the shift into the group $G_1 = \mathrm{Hom}(G^T, \mathbb{Z})$ of homomorphisms from ordered cohomology to $\mathbb{Z}$. It is continuous on homology, since on a given cocycle the evaluation depends only on some projection $\pi_\alpha$ of the inverse limit to a member $X_\alpha$ of the inverse system. Conversely, if the evaluations of classes $z_n$ on each cocycle converge (pointwise limit) then the homology classes are equal when projected to the homology of each finite graph in the inverse limit. Or, a system of open neighborhoods of the identity in homology is given by all maps of $\mathrm{Hom}(G^T, \mathbb{Z})$ which project to zero to some graph for the system. These are exactly the maps which are zero on a basis of cocycles for the dual graph, which is an open set in the topology of pointwise convergence. This shows that evaluation is an open mapping as well as a continuous mapping.

Suppose that a homology class $z$ is nonzero. Then for some projection $\pi_\alpha$, $\pi_\alpha(z)$ is nonzero and, by the duality between homology and cohomology of simplicial complexes, $(c, \pi_\alpha(z)) \neq 0$ for some cocycle $c$ for the corresponding graph $X_\alpha$. This proves the mapping is one-to-one. Let $h$ be any homomorphism from cohomology to $\mathbb{Z}$. Then $h$ defines a mapping from the geometric cohomology of each $X_\alpha$ to $\mathbb{Z}$, hence a homology class of $X_\alpha$ by duality for finite simplicial complexes, which gives rise to $h$ in turn. These homology classes agree under the mappings between different projections $X_\alpha$, hence give a class in the inverse limit system. This proves the first statement.

For irreducible subshifts of finite type, as indicated above, the mapping from an edge graph to the original graph induces an epimorphism on homology, which is a finitely generated free abelian group. This is what makes possible stronger conclusions in this case.
It means that projections from the inverse limit are also epimorphic, since the homology of each edge graph is the direct sum of a copy of the homology of the graph and of terms projecting to zero in its homology. As above, by evaluation any cohomology class gives a continuous mapping from homology to $\mathbb{Z}$. Suppose that a cohomology class $c$ evaluates to zero on each homology class in the inverse limit. Choose a finite graph $G$ where $c$ is defined. Then the homology of $G$ comes from the inverse limit, therefore $c$ is zero on each homology class of $G$ and is therefore zero. Let $h$ be any continuous homomorphism from homology to $\mathbb{Z}$. To be continuous from the inverse limit into a discrete group, it must arise from a homomorphism on some finite projection which is the homology of some edge graph. By duality for that edge graph, it is given by a cohomology class. (We use Theorem 3.1 and Lemma 3.2 of [3] to go between directed and undirected cycles.) This proves the second statement.

In [9] and in [3], Theorem 3.1(2), it is shown that a cocycle is positive if and only if it has positive evaluation on every (directed) cycle (periodic point) in homology, which is the third statement. Also the evaluation of the unit on any cycle in homology is its length (that is, the total signed length of all closed walks in the graph which occur in the homology cycle).

Let $c$ be a homology class which has total length $L$ and has nonnegative evaluation on each positive cohomology class, and therefore on every positive cochain. That means that if we project to any finite graph of the system, and evaluate on characteristic functions of clopen sets representing vertices, it assigns a nonnegative number to each vertex, and these numbers sum to $L$. It must contain some undirected cycle in the graph, otherwise its support would be a disjoint union of trees, which would prevent it from being a homology cycle. But if we take an $(L+1)$-fold or more iterated edge graph, each undirected minimal cycle of length less than $L$ is in fact a directed cycle: each vertex determines the entire cycle, which is periodic, and reversed edges can only backtrack along this cycle. The evaluation of $c$ will be greater than or equal to that of this cycle, so we can subtract it and obtain a homology cycle of smaller length, still with nonnegative evaluation. This process must cease and express the projected homology cycle as a sum of directed cycles in the graph, of total length $L$.

When the degree of the edge graph exceeds $L$, all cycles of length less than $L$ become disjoint. We cannot have nontrivial relations among such cycles; for instance, if we take a spanning tree to get a basis for the homology of a graph, they will contain separate edges representing linearly independent basis elements. Therefore the cycles of length less than $L$ in a given positive homology class are unique, and in all higher edge graphs we have a fixed set of cycles of fixed lengths, each of which maps to the previous set. They represent unique periodic points of the subshifts, which give the homology class.
The last statement follows from the above statements and the definition of the trace. □

3 REPRESENTATION OF CLOPEN SETS
In this section we will restrict our attention to irreducible subshifts of finite type. Since cohomology classes with nonnegative evaluations on cycles are in effect nonnegative formal sums of clopen sets (i.e. sums of their characteristic functions), it is natural to ask whether a nonnegative cocycle is represented by a single clopen set if its evaluation on each periodic orbit is at most the period of that orbit.

By a well-known theorem of R. Williams [11], every conjugacy of finite type subshifts can be produced by a sequence of state splittings and amalgamations of graphs. Conversely, these induce conjugacies. A state splitting corresponds to splitting a row or column of the adjacency matrix into a sum, then duplicating the corresponding column or row. Amalgamation is the inverse operation. We refer the reader to [7] for details.

PROPOSITION 3.1 In the ordered cohomology of an irreducible SFT, a nonnegative cocycle is cohomologous to a cochain which is the characteristic function of a clopen set if and only if its value on each orbit of period $n$ is at most $n$.
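When the cocycle is presented as a nonnegative integer weighting of the edges of a graph, periodic orbits of period $n$ correspond to closed walks of length $n$, so the hypothesis of Proposition 3.1 says that no closed walk has total weight exceeding its length. The sketch below tests this on a finite weighted graph by giving each edge the cost $1 - w$ and looking for a negative cycle with Bellman-Ford relaxation. This is one standard way to check the condition, not a procedure from the paper, and the function name and sample graphs are hypothetical.

```python
def satisfies_period_bound(num_vertices, weighted_edges):
    """weighted_edges is a list of (u, v, w) with integer weight w.
    Returns True iff every closed walk has total weight <= its length,
    i.e. iff the costs 1 - w admit no negative cycle."""
    # Starting every vertex at distance 0 acts as a virtual source joined to all vertices.
    dist = [0] * num_vertices
    for _ in range(num_vertices):
        changed = False
        for u, v, w in weighted_edges:
            cost = 1 - w
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                changed = True
        if not changed:
            return True  # distances stabilised: no negative cycle
    return False  # still relaxing after |V| passes: a negative cycle exists


# Hypothetical examples on two vertices.
ok = [(0, 1, 2), (1, 0, 0), (1, 1, 1)]   # 2-cycle has weight 2 and length 2
bad = ok + [(0, 0, 2)]                   # loop of weight 2 and length 1
print(satisfies_period_bound(2, ok))   # True
print(satisfies_period_bound(2, bad))  # False
```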
Proof: We will represent cochains as labels or weights on graphs. Any partition into clopen sets is refined by the set of vertices in some edge graph, and we might represent the cochain $f$ by weighting each vertex $j$ with the value $f(j)$. However, in order to consider also coboundaries, it is more convenient to consider the alternative labelling in which edges represent clopen sets and receive the weights. An edge of a graph is the same as a vertex in its edge graph, so this is equivalent to the other labelling, if we are allowed to pass to edge graphs.

We obtain a conjugate subshift if we split the inputs or outputs to any vertex. The image of a cochain is the cochain where all the split edges receive the same weight as the original. In addition to this splitting, we can modify a cocycle by a coboundary of a cochain represented as a function $d$ on the vertices. The effect of such a coboundary, for each vertex $v$, is to add $d(v)$ to all outgoing edges and subtract $d(v)$ from all incoming edges.

We will start with a nonnegative cochain, or weighting of edges. Our goal is to use state splittings and coboundaries to replace it with a weighting by zeros and ones, since this corresponds to the characteristic function of a clopen set. It will be enough to be able, without increasing the maximum weight or introducing negative labels, to reduce by one the number of edges having the maximum weight when the maximum weight is at least two. For then we eventually eliminate all edges of maximum weight, and so reduce this maximum by 1.

By the weight of a path in the graph we mean the sum of the weights of the edges. Since the average weight of any closed walk is less than or equal to 1, there must exist some edge having the maximum weight, say $m$, such that any path going outward from it must hit an edge of weight zero before hitting another edge of weight $m$ (including a return to the original). For if not, then we can keep on taking paths from weight $m$ edges to weight $m$ edges with no weight zero edges until we obtain a closed walk with average weight greater than 1, hence average weight greater than 1 on some cycle.

Choose such an edge $e$ and consider the set of all paths that begin with $e$ and end with the first occurrence of an edge of weight 0. This will involve a finite number of edges: first paths, and then cycles having positive weight on all vertices, which pass through one of the vertices of some previously obtained walk prior to the zero edge. By assumption these walks have no other weight $m$ or weight 0 edges. At each vertex prior to the termini of the zero edges, they include all edges going out from any vertex, because every walk outward can be continued to some weight $m$ edge and hence will hit a 0 edge first.

At all the internal vertices of this set, we split off all the weight 0 inputs. As we do so, we will not increase the set of edges in the given walks, since the new vertices have only zero inputs and so cannot occur in full cycles or in the paths. We may do this in any order, and when we have finished, all the internal vertices of the paths have inputs of weight at least 1. This also does not increase the number of edges labelled $m$.

Now take a coboundary which has value 1 at all these internal vertices and therefore subtracts 1 from their inputs and adds it to their outputs. This must reduce the weight of the initial weight $m$ edge. All other inputs have weight at least 1, so no negative weights are produced. All the outputs from this set of vertices which are not inputs to the same set of vertices (in particular not on full cycles) are into zero edges, so become 1, and all outputs which are also inputs have weights unchanged. So no new edges acquire weights of $m$. □
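The coboundary step used in this proof can be sketched on a small weighted graph. Adding $d(v)$ to every edge leaving $v$ and subtracting it from every edge entering $v$ changes individual edge weights but leaves the total weight of every closed walk unchanged, which is why the cohomology class is preserved. The dictionary representation, function names, and sample weights below are assumptions for illustration only.

```python
def add_coboundary(weights, d):
    """weights maps each directed edge (u, v) to its weight; d maps each vertex
    to an integer. Adds d[u] to edges leaving u and subtracts d[v] from edges
    entering v."""
    return {(u, v): w + d[u] - d[v] for (u, v), w in weights.items()}


def closed_walk_weight(weights, walk):
    """Total weight of the closed walk through the vertices in `walk`,
    returning from the last vertex to the first."""
    n = len(walk)
    return sum(weights[(walk[i], walk[(i + 1) % n])] for i in range(n))


# Hypothetical weighting with maximum weight 2 on the edge (0, 1).
weights = {(0, 1): 2, (1, 0): 0, (1, 1): 1}
# Coboundary with value 1 at vertex 1: its input (0, 1) drops to 1 and its
# output (1, 0) rises to 1; the loop at 1 is both input and output, so it is unchanged.
reduced = add_coboundary(weights, {0: 0, 1: 1})
print(reduced)  # {(0, 1): 1, (1, 0): 1, (1, 1): 1}

for walk in ([0, 1], [1], [0, 1, 1]):
    assert closed_walk_weight(weights, walk) == closed_walk_weight(reduced, walk)
```

In this toy case a single coboundary already yields a weighting by zeros and ones; in general the proof interleaves such steps with state splittings.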
EXAMPLE 3.2 Given a cohomology class $f$ represented by the characteristic function of a clopen set $C$ and another positive cohomology class $g \le f$ that also has average weight less than 1 on every cycle, one might hope to represent $g$ by the characteristic function of a clopen set $D \subset C$. However, this is not always possible. Replacing $f$ by $1 - f$, we see this is equivalent to the problem of representing $g$, $f$ by disjoint clopen sets, provided they are nonnegative and their total weight on any closed walk is at most its length, when the clopen set for $f$ is given. Consider the SFT given by the matrix
(Hi)
and let $f$ be the cohomology class of the characteristic function of the union $C$ of clopen sets corresponding to the edges [12], [22], [31], [32] and [33], while $g$ corresponds to the edge [12]. It can be verified that both cocycles are positive, and together have average weight at most 1 on each cycle, since whenever 12 occurs in a closed walk, 23 must also occur. If we pass to the edge graph $E^{2n+1}(G)$ as described in the paragraph preceding Definition 2.8, each edge corresponds to a path of length $2n+1$ in $G$ and receives the weight of the middle edge of that path. Suppose we could represent $g$ by a clopen set $D$ corresponding to a union of edges in $E^{2n+1}(G)$ that is disjoint from $C$. The edge $[22\ldots233\ldots3]$ must be in $D$, since it is part of the closed walk $22\ldots233\ldots312$ with total weight in $f + g$ equal to its length and every other edge belongs to $C$. On the other hand it cannot be in $D$, because it is part of a closed walk $322\ldots233\ldots3$ with weight 0 in $g$. So it is impossible to represent $g$ by a clopen set disjoint from $C$. In this case, however, we can rearrange the cocycle $f$ to get disjoint clopen sets.
We suspect that this failure is generic rather than special. However it may be possible to characterize it within the ordered cohomology semigroup.

EXAMPLE 3.3 In this example we employ a shorthand that describes a cocycle $c$ that is the characteristic function of a union of edges in the graph $G$ by labeling these edges with $c$. The weight of a path is the product of the labels on its edges. The matrix

$$\begin{pmatrix} 1 & 1 + c \\ d_1 + d_2 & 1 \end{pmatrix}$$

describes a weighted graph, with sums denoting parallel edges. The cohomology classes of the cocycles $c$, $d_1$, $d_2$ satisfy $0 \le [c] \le [d_1] + [d_2]$. We claim it is impossible to find cocycles $c_1$, $c_2$ such that $0 \le [c_i] \le [d_i]$ and $c_1 + c_2$ is cohomologous to $c$. If it were possible, we could realize these classes with edge labelings on some $E^n(G)$. Consider the four possible walks in $E^n(G)$ corresponding to the vertex sequence $1^n 2^n 1$ in $G$. These must have weights $d_1$, $cd_1$, $d_2$ and $cd_2$. However, these walks are the concatenations of paths $P_1$, $P_2$ from the vertex $[1^n]$ to $[2^n]$ with paths $Q_1$, $Q_2$ going back. There is no way to label these paths with $c_1$, $c_2$, $d_1$ and $d_2$ to obtain these weights and satisfy $[c_i] \le [d_i]$. This implies that we cannot represent the $d_i$ by disjoint clopen sets and $c$ by a subclopen set of their union.
PROPOSITION 3.4. In order for $n$ nonnegative cohomology classes $d_i$, $0 \le d_i \le 1$, $i = 1, \dots, n$, to be represented by disjoint clopen sets, we must have that their total evaluation on a length $m$ orbit is at most $m$, and that for any nonnegative cohomology class $z$ less than 1 we can write $z = z_0 + \dots + z_n$ with $0 \le z_i \le d_i$ for $i > 0$ and $0 \le z_0 \le 1 - d_1 - \dots - d_n$.

Proof: Represent the $d_i$ by disjoint sets of vertices and $z$ by some clopen set, or set of vertices in some edge graph. Let $z_i$ be represented by the intersection of the clopen set for $z$ and that for $d_i$, and $z_0$ by the rest of $z$. □
We would like to raise the question: are these conditions also sufficient?

DEFINITION 3.5 The clopen set semigroup of an SFT is the abelian semigroup whose generators are clopen sets and whose defining relations are that a disjoint union of two clopen sets is equal to their sum, and any clopen set is equivalent to its image under the shift.
PROPOSITION 3.6 Two elements are equal in the clopen set semigroup if and only if the clopen sets can be decomposed into finite numbers of disjoint clopen subsets and shifted until the formal sums are identical. The nonnegative cohomology semigroup of an irreducible SFT is isomorphic to the maximal quotient of the clopen set semigroup which is a cancellation semigroup.
Proof: The relation in the first statement is preserved by sums and is therefore a semigroup congruence, and the semigroup so defined satisfies the defining relations of the clopen set semigroup. Therefore there is a homomorphism to it from the clopen set semigroup. But all its relations hold in the clopen set semigroup, so this is an isomorphism.

We may identify the semigroup of formal sums of clopen sets, modulo the relation that a disjoint union of two clopen sets is equal to their sum, with the semigroup of nonnegative cochains via the identification of a clopen set with its characteristic function. This identification induces a homomorphism from the clopen set semigroup onto the nonnegative cohomology semigroup, since images of relators of the second kind are cohomologous to zero. If two nonnegative cochains $f$, $g$ are cohomologous then for some difference $h_1 - h_2$ of nonnegative cochains we have $f(x) - g(x) = h_1(x) - h_2(x) - h_1(sx) + h_2(sx)$, so that $f(x) + h_2(x) + h_1(sx) = g(x) + h_1(x) + h_2(sx)$, a relation which must hold in any quotient semigroup which is cancellation. □
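The final identity of this proof can be checked by hand in a small case. The sketch below takes the full two-shift on symbols 1 and 2, lets $f$ and $g$ be the characteristic functions of the cylinder sets $x_0 x_1 = 12$ and $x_0 x_1 = 21$, and verifies $f(x) + h_2(x) + h_1(sx) = g(x) + h_1(x) + h_2(sx)$ with the choice $h_1 = 0$ and $h_2$ the characteristic function of $x_0 = 2$. This choice of $h_1$, $h_2$ and all function names are assumptions made for illustration. Since every function involved depends only on $x_0$ and $x_1$, it suffices to run over the four possible pairs.

```python
from itertools import product


def f(x):
    """Characteristic function of the cylinder x_0 x_1 = 12."""
    return int(x[0] == 1 and x[1] == 2)


def g(x):
    """Characteristic function of the cylinder x_0 x_1 = 21."""
    return int(x[0] == 2 and x[1] == 1)


def h2(x):
    """Characteristic function of the cylinder x_0 = 2."""
    return int(x[0] == 2)


def shift(x):
    """Shift by one coordinate on a finite window of the point."""
    return x[1:]


def identity_holds():
    """Check f(x) + h2(x) = g(x) + h2(sx) on every window (x_0, x_1); h1 = 0."""
    return all(
        f(x) + h2(x) == g(x) + h2(shift(x))
        for x in product((1, 2), repeat=2)
    )


print(identity_holds())  # True
```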
In what follows we use the notation $*a_0 \ldots a_n*$ for the cylinder set defined by $x_0 = a_0, \ldots, x_n = a_n$.

EXAMPLE 3.7 On the full two-shift, the cocycles given by the clopen sets $*12*$ and $*21*$ are equal, being cohomologous by the coboundary of $*2*$. However, they are not equal in the clopen set semigroup. The point $\ldots111222\ldots$ lies in $*12*$, hence if we could shift pieces of $*12*$ to make $*21*$, some translate of $\ldots111222\ldots$ would lie in $*21*$. But that is false.

4 ORBIT CLOSURE EQUIVALENCE
Recall that two dynamical systems are said to be orbit equivalent if there is a homeomorphism between them that sends orbits to orbits, and flow equivalent if they are sections of a common flow. M. Boyle and D. Handelman [3] have shown that for irreducible subshifts of finite type, orbit equivalence implies flow equivalence, and flow equivalence corresponds exactly to isomorphism of ordered cohomology groups. The relation between orbit equivalence and isomorphism of unital ordered cohomology is unknown. In this section we discuss the connection between cohomology and a weaker form of orbit equivalence.

DEFINITION 4.1. Two dynamical systems are orbit closure equivalent if there is a homeomorphism between them sending all closures of orbits to closures of orbits and vice versa. They are finite orbit equivalent if there is a homeomorphism between them sending all finite orbits to finite orbits.
THEOREM 4.2 The following are equivalent for irreducible subshifts of finite type: finite orbit equivalence, orbit closure equivalence, and isomorphism of ordered cochain groups preserving the set of coboundaries (hence also the quotient map to ordered cohomology) and the unit.
Proof: Suppose we have an isomorphism $h$ on ordered cochain groups preserving coboundaries and units. Then $h$ induces a bijection on the collections of clopen sets, since the characteristic functions of clopen sets are precisely the nonnegative cochains $c$ such that $1 - c$ is also nonnegative. This bijection respects the relations of inclusion and disjointness. The entire subshift is topologically the inverse limit of all finite partitions into clopen sets, therefore $h$ induces a homeomorphism on topological spaces, which in turn induces $h$. By duality (Theorem 2.9) $h$ also induces an isomorphism on ordered tracial homology, and hence a bijection of the sets of orbits of each length. Moreover we can identify which clopen sets contain a given orbit by the evaluation on the corresponding cycle, and their intersection gives the corresponding finite orbit. Therefore we have a finite orbit equivalence.

Conversely, any homeomorphism between spaces induces a bijection of the collections of clopen sets respecting inclusion and disjoint union, and hence gives an isomorphism of ordered cochain groups. A finite orbit equivalence gives in addition a correspondence on finite orbits which preserves the count of points of each orbit in each clopen set. Therefore it preserves the evaluation of any cochain on any cycle. In particular it preserves the set of coboundaries, since a cochain is a coboundary if and only if it evaluates to zero on each cycle. The unit is the characteristic function of the entire space, so it is also preserved.

Orbit closure equivalence implies finite orbit equivalence, since the finite orbits are a subset of all orbit closures, which the homeomorphism must preserve. Conversely suppose $f : S_A \to S_B$ is a finite orbit equivalence. For any sequence $\{O_n\}$ of finite orbits in $S_A$, $f$ must take the set of limit points of $\{O_n\}$ to the set of limit points of the sequence $\{f(O_n)\}$. (By the limit points of $\{O_n\}$ we mean all limits of sequences $\{z_n\}$ with $z_n \in O_n$.) This gives a family of closed invariant subsets, and it will be enough to show that it includes all orbit closures. For the orbit closure of a point $x$ is the minimal element of the family that contains $x$, and hence $f$ must take it to the orbit closure of $f(x)$.
Let w be a point of $S_A$ whose orbit closure is not the entire SFT. We want to construct a sequence of periodic points that include longer and longer blocks w[-n, n] but do not converge when we shift them relative to each other in general ways. By Lemma 2.2 of [5] we can find two blocks (markers) R, S of equal length such that RS and SR occur in $S_A$, and such that no proper initial segment of either block is identical to a terminal segment of either block. In the construction we may take R and S to have a common initial and final vertex v; passing to a higher block presentation if necessary, we can choose v to represent a block that does not occur in w, so R and S are not in the orbit closure of w. We choose bounded length transitional segments $T_n$, $U_n$ from v to w[-n, n] and from w[-n, n] to v respectively. We may make R, S arbitrarily long without changing v, so we can assume that they are more than twice as long as the transitional segments.
Now we take as periodic orbits $O_n$ those generated by
$T_n\, w[-n, n]\, U_n\, RRSRRS \ldots RR$
$T_n\, w[-n, n]\, U_n\, SSRSSR \ldots SS$
for n odd and n even respectively. Then w is a limit of the sequence $\{O_n\}$. However, consider any convergent sequence $\{z_n\}$, $z_n \in O_n$. The sequences RRSRR will never occur in the $z_n$ for n even and the sequences SSRSS will never occur in the $z_n$ for n odd. Thus any sequence of central blocks on which the $z_n$ eventually agree can extend only a bounded distance beyond the w[-n, n] blocks. If we omit this bounded region we will get the same limit, which is thus in the orbit closure of w. •
EXAMPLE 4.3 We find a finite orbit equivalence $f_0$ from a full shift $S_5$ to itself which is not an orbit equivalence, using methods similar to M. Boyle's [1] construction of orbit equivalences which are not conjugacies up to time reversals. This will be an orbit equivalence on the complement of a subshift M which is almost minimal; M has a maximum proper closed invariant subset $M_0$ which is fixed by the automorphism. We define M by recursively generating a set $B_n$ of marker blocks containing blocks generated at previous stages; M consists of the biinfinite sequences all of whose blocks are subblocks of members of $\bigcup_n B_n$. A class of mappings is defined in $S_5 - M$ by taking the largest marker block b around the 0-coordinate of a given sequence z and mapping z to a shift of itself depending on the location of the 0-coordinate within b. We do this so that maps are consistent on neighborhoods consisting of the next lower sized blocks, which will yield existence of a continuous extension of the map from $S_5 - M$ to M. All these maps are involutions. There will be a generator g of M such that some shifts of g are fixed but the map can be chosen in uncountably many ways on other shifts of g. At most countably many can lie in the original orbit, so that some of these are not orbit equivalences.
The action takes place on the symbols 1, 2, 3; the symbols 4, 5 are used to form markers. $B_n$ will denote the nth set of blocks, with $B_1 = \{1, 2, 3\}$. Let $N_k = |B_k|$. Inductively, we define $B_{n+1}$ to be the set of all blocks of the form $4^{3^n} b_1 b_2 \ldots b_{2N_n} 5^{3^n}$
where the blocks $b_i$ are in $B_n$ and each element of $B_n$ occurs exactly twice in the concatenation. The length and number of these blocks are given recursively by $L_{n+1} = 2N_nL_n + 2 \cdot 3^n$, $N_{n+1} = (2N_n)!/2^{N_n}$. It follows by induction that overlaps among these blocks can only arise from the occurrence of elements of $B_{n-1}$ as subblocks of the elements of $B_n$. M will be the subshift consisting of all points $x \in S_5$ such that every block occurring in x is a subblock of a block in some $B_n$. For a given k, any sufficiently long block of any $B_n$ that contains a 1, 2 or 3 will contain a block of $B_k$, and hence contains all blocks of $B_{k-1}$. Therefore any element of M having a 1, 2 or 3 has dense orbit in M. The maximal proper closed subshift of M is the orbit closure of the point ...555444.... This will be fixed under $f_0$.
Consider an element z of the 5-shift which is not in M. Then for some maximal k, z has a block b of $B_k$ including the zero coordinate, or else there is no block of any $B_k$ which occurs as a block in z containing the zero coordinate. If there is no such block or if the zero coordinate is 4 or 5 then set $f_0(z) = z$. If the zero coordinate is in such a block b and is 1, 2 or 3, then set $f_0(z) = \sigma^{j}(z)$ for a j which is specified below. This means that outside M, each element goes to a shift of itself, so orbits and their closures are fixed. Moreover the map is defined in terms of finite blocks around the zero coordinate, so it will be continuous in the complement of M, where these blocks must converge as we take limits. Our map will not, however, be shift-commuting.
We will require the continuity property that if the zero coordinate of z is in a block $b_{n-1}$ of $B_{n-1}$ contained in a block $b_n$ of $B_n$, then the shift defined for $b_n$ at each point will send the zero coordinate to one of the two copies of $b_{n-1}$ in $b_n$. The location of the zero coordinate within that block is chosen to be consistent with the shift chosen for $b_{n-1}$. That is, (C) in some $B_{n-1}$ block around the zero coordinate, the image will be the same as if we considered only the largest $B_{n-1}$ block around the zero coordinate in the domain.
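As a quick check on how fast the marker blocks of Example 4.3 grow, the recursion for $L_n$ and $N_n$ can be evaluated directly. The following is a minimal Python sketch; the function name `marker_block_stats` is ours, and it assumes the starting values $L_1 = 1$, $N_1 = 3$ implied by $B_1 = \{1, 2, 3\}$.

```python
from math import factorial


def marker_block_stats(stages):
    """Return [(n, L_n, N_n)] for n = 1..stages.

    Assumes L_1 = 1 and N_1 = 3 (B_1 = {1, 2, 3}) and the recursion
    L_{n+1} = 2 N_n L_n + 2 * 3^n,  N_{n+1} = (2 N_n)! / 2^{N_n}.
    """
    length, count = 1, 3
    stats = [(1, length, count)]
    for n in range(1, stages):
        # 2 N_n blocks of length L_n, framed by 3^n fours and 3^n fives.
        next_length = 2 * count * length + 2 * 3 ** n
        # Arrangements of 2 N_n slots in which every block of B_n occurs twice.
        next_count = factorial(2 * count) // 2 ** count
        length, count = next_length, next_count
        stats.append((n + 1, length, count))
    return stats


if __name__ == "__main__":
    for n, length, count in marker_block_stats(3):
        print(n, length, len(str(count)), "digits in N_n")
```

Under these assumptions the second stage has 90 blocks of length 12, and the third stage already has blocks of length 2178, so only the first few stages are practical to enumerate.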
The map $f_0$ will be chosen to be an involution; for each pair of equal $n-1$ blocks we have two choices, either to switch them or not. This is consistent with condition (C). Being a continuous involution, $f_0$ is a homeomorphism on $S_5 - M$. Consider a sequence $z_n$ in $S_5 - M$ converging to a point of M. We define the images of $z_n$ by looking at the blocks of $B_n$ around the zero coordinate, and mapping them as in the paragraph above. The image blocks are contained in one another as n increases, and give a total image sequence according to condition (C), which converges. The dependence of blocks of $B_n$ in the image only on blocks of $B_n$ in the domain means that such limits give a unique extension of $f_0$ to M, which will be continuous. Take any generator g in M and consider its shifts. If the zero coordinate is 1, 2 or 3, choosing to reverse or not to reverse pairs of blocks, one of which is the block containing the zero coordinate, gives an uncountable number of choices for $f_0(g)$; but shifts of g with a 4 or 5 at the zero coordinate are fixed and lie in the same orbit. Only countably many choices of $f_0$ can map g to its orbit and be orbit equivalences. But all finite orbits are contained in $S_5 - M$ or $M_0$ and are preserved.
5 ACTION OF AUTOMORPHISMS ON ORDERED COHOMOLOGY
Let $S_A$ be a mixing subshift of finite type and consider the product $S_n \times S_A$, where $S_n$ is the full n-shift. Any automorphism $\alpha$ of $S_n$ may be extended to $S_n \times S_A$ by letting it act trivially on the second component. Hence $\alpha$ induces an automorphism of the unital ordered cohomology of $S_n \times S_A$. The cohomology group may be regarded as a module over the integral group ring of $\mathrm{aut}(S_n)$, the automorphism group of $S_n$. We show in this section that for n > 3 the unital ordered cohomology of $S_n \times S_A$ together with this module structure determines the conjugacy class of $S_A$. We will make the assumption that n > 3 for the rest of this section, since it is already needed in the first lemma. Let f be an isomorphism from the unital ordered cohomology of $S_n \times S_A$ to that of $S_n \times S_B$, where the second factors are mixing SFT's, and assume that f commutes with the action of $\mathrm{aut}(S_n)$. By duality we get a corresponding isomorphism of ordered tracial homology, hence a bijection of orbits of each period. These induced maps will also be denoted f.
LEMMA 5.1 If f takes the orbit of $(a, x) \in S_n \times S_A$ to the orbit of $(b, y) \in S_n \times S_B$, then a and b lie in the same orbit of $S_n$.
Proof: By Theorem 1 of [2] or Theorem 4.12 of [8], for a full shift any permutation of a finite set of orbits that preserves period can be realized by an automorphism of the shift. The periods of a, b are divisors of the period p of (a, x). Let $\alpha$ be an automorphism of $S_n$ which permutes orbits of periods up to and including p in a way that fixes only the orbit of a. (Since n > 3 there are at least 3 orbits of every period.) Then the orbit of (a, x) is fixed by $\alpha$, so the orbit of (b, y) is also fixed by $\alpha$ since f commutes with the action of $\mathrm{aut}(S_n)$. Hence b must lie in the orbit of a. •
Now we examine the ways in which f can be refined from a map on periodic orbits to a map on periodic points. Let $a \in S_n$, $x \in S_A$ be periodic points of periods $p_1$, $p_2$ respectively and suppose (a, x) has period p. We will denote the shift actions on $S_n$, $S_A$ and $S_n \times S_A$ by $\sigma_1$, $\sigma_2$ and $\sigma$ respectively, and the orbit of any point z (under the appropriate shift) by O(z). By the preceding lemma, f must take the points of the set $O(a, x) = \{(a, x), (\sigma_1 a, \sigma_2 x), \ldots, (\sigma_1^{p-1} a, \sigma_2^{p-1} x)\}$ to the points of $O(a, y) = \{(a, y), (\sigma_1 a, \sigma_2 y), \ldots, (\sigma_1^{p-1} a, \sigma_2^{p-1} y)\}$ for some $y \in S_B$. We claim that y must also have period $p_2$. For under the action of $\sigma_1$ on $S_n \times S_A$, O(a, x) sweeps out all of $O(a) \times O(x)$, a set of cardinality $p_1 p_2$, and f must take this set to $O(a) \times O(y)$.
LEMMA 5.2 Given positive integers $p_1$, $p_2$ with $p_2$ dividing $p_1$, there is a unique way to define f on the set of points $(a, x) \in S_n \times S_A$ with a of period $p_1$ and x of period $p_2$ so that f agrees with the given map on homology, commutes with $\sigma$ and $\sigma_1$, and has the form $f(a, x) = (a, f_1(x))$.
Proof: For each such (a, x) we see from the above discussion that since $p = p_1$ there will be a unique y of the same period as x with (a, y) in the image of O(a, x) under f. We define f(a, x) to be this point (a, y). This choice clearly makes f $\sigma$-commuting on each orbit. We claim that y depends only on x and not on a. As a consequence, f must commute with the action of $\sigma_1$.
Suppose f(a, x) = (a, y) and f(b, x) = (b, z). We can choose $\alpha \in \mathrm{aut}(S_n)$ with $\alpha(a) = b$. Since f commutes with the action of $\alpha$ on orbits, (b, z) must be in the orbit of (b, y). But there can only be one element in this orbit with first coordinate b, so y = z. •
We denote this point mapping also by f, keeping in mind that it is only defined on orbits with $p_1 = p$. Since it commutes with $\sigma$ and $\sigma_1$, it also commutes with $\sigma_2$. We wish now to derive from f a bijection between clopen sets of $S_A$ and $S_B$. We will do this by fixing a clopen set $C_1$ in $S_n$ and obtaining a bijection of sets of the form $C_1 \times C_2$ in $S_n \times S_A$ and $S_n \times S_B$.
LEMMA 5.3 Let C be a clopen set of the form $C_1 \times C_2$. Let (a, x) be a periodic point of period p in $S_n \times S_A$. Then the evaluation of C on the orbit of $(\sigma_1^k a, x)$ is $\sum_{j=0}^{p-1} n_1(\sigma_1^{j+k} a)\, n_2(\sigma_2^{j} x)$, where $n_1$ and $n_2$ are the characteristic functions of $C_1$ and $C_2$ respectively.
Proof: We are counting the number of j such that $(\sigma_1^{j+k} a, \sigma_2^{j} x) \in C_1 \times C_2$. •
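The count in Lemma 5.3 is easy to carry out by hand for small periods. Below is a minimal Python sketch of it, under assumptions of ours: a periodic point is stored as one copy of its least repeating word, the shift moves coordinate i+1 to coordinate i, and membership in $C_1$ or $C_2$ is supplied as a predicate on the word and a shift amount. The names `symbol_at` and `evaluation` are hypothetical.

```python
from math import gcd


def symbol_at(word, i):
    """Coordinate i of the periodic point whose least repeating word is `word`."""
    return word[i % len(word)]


def evaluation(a, x, k, in_c1, in_c2):
    """Sum over j of n1(sigma_1^{j+k} a) * n2(sigma_2^j x), j = 0..p-1.

    in_c1(a, s) and in_c2(x, s) must return True when the point shifted
    s times lies in C_1 (resp. C_2).  p is the period of (a, x), the least
    common multiple of the two word lengths.
    """
    p = len(a) * len(x) // gcd(len(a), len(x))
    return sum(1 for j in range(p) if in_c1(a, j + k) and in_c2(x, j))


if __name__ == "__main__":
    a0 = (1, 0, 0, 0)   # the point (1000)^infinity, period 4
    x = (0, 1)          # a point of period 2 in the second factor
    in_c1 = lambda a, s: symbol_at(a, s) == 1   # C_1 = *1* at the zero coordinate
    in_c2 = lambda y, s: symbol_at(y, s) == 1   # illustrative choice of C_2
    print([evaluation(a0, x, k, in_c1, in_c2) for k in range(4)])
```

With $a_0 = (1000)^\infty$ and k = 0 only the term j = 0 survives, so the evaluation returns $n_2(x)$ itself, which is the fact used later to recover $n_2$ from the evaluations.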
Denote the operation $\sum_{j=0}^{p-1} f(\sigma_1^{j} a)\, g(\sigma_2^{j} x)$ as $f * g(a, x)$. For every Cartesian product $O(a) \times O(x)$ of orbits in the two subshifts, we have a p-tuple of evaluations $n_1 * n_2(\sigma_1^{k} a, x)$, $k = 0, \ldots, p - 1$. We will consider the particular case where $n_1$ is the characteristic function of $C_1$ = *1* and in particular evaluations whenever a is $a_0 = (100 \ldots 0)^\infty$, where in both cases the 1 is in the zero coordinate. In the next proof we will use the fact that these evaluations completely determine $n_2$, that is $n_2(x) = n_1 * n_2(a_0, x)$. We will work with Cartesian products of orbits in the two factors as our basic units, and often restrict to the case where the pair of periods $(p_1, p_2)$ satisfies $p_2 \mid p_1$.
LEMMA 5.4 A cocycle c which is representable as the characteristic function of a product of clopen sets $C_1 \times C_2$ has these properties: (1) $0 \le c \le 1$; (2) the evaluation of c on any orbit where the first factor does not contain the symbol 1 is zero; (3) in each pair of periods where $p_2 \mid p_1$, for some function f, the evaluation of c has the form $n_1 * f(\sigma_1^{k} a, x)$. Conversely any cocycle with these properties is representable by a product of clopen sets $C_1 \times C_2$, and the clopen set $C_2$ is unique. Moreover, the map $f : S_n \times S_A \to S_n \times S_B$ preserves these properties, and hence induces a map on clopen sets $C_2$ in the two subshifts $S_A$, $S_B$ which preserves evaluations on periodic points mapped according to Lemma 5.2.
Proof: Property (1) by Prop. 3.1 characterizes existence of some clopen set representing c. The clopen set $C_1$ = *1* does not contain $0^\infty$, which implies (2). Property (3) follows by Lemma 5.3 and the following discussion. Conversely suppose c has these properties. Represent it as the characteristic function of a clopen set C and break that down as a disjoint union of cylinder sets, which by their nature are Cartesian products over the two factors, $C = \bigcup_j C_{1j} \times C_{2j}$. We want to form a candidate for $C_2$, which will be the union of certain shifts of some sets $C_{2j}$. We will choose this $C_2$ in such a way that the evaluations involving $a_0$ are correct, when $p_2 \mid p_1$ and the period of $a_0$ is a length great enough to define the cylinder sets. By (1) it is impossible for any $C_{1j}$ to be a cylinder set of the form *00...0*. We discard all j for which the block defining $C_{1j}$ has at least two ones or a symbol greater than 1, since $a_0$ cannot lie in such sets. For the remaining sets, we shift the pair $C_{1j}$, $C_{2j}$ until this 1 entry is in the place where $a_0$ is 1; a shift on clopen sets does not affect its cohomology class. So if we take this union of shifted $C_{2j}$ as $C_2$, then $C_1 \times C_2$ has the correct evaluations on products of the form $O(a_0) \times O(x)$, that is, the same as c does. But now both it and c have the same evaluations on all orbits of this pair of periods, because both satisfy formula (3). Note that our construction of $C_2$ has not depended on the choice of periods. Uniqueness follows from the fact that we have determined what periodic points $C_2$ contains in some sequence of periods which tends to infinity. These periodic points are dense in $C_2$, so $C_2$ is determined as their closure. The fact that f preserves (1) is just because it is a cohomology isomorphism; it follows from Lemma 5.1 that it preserves (2). Property (3) depends only on the cohomology evaluations together with the Cartesian product structure of periodic points in a given pair of periods, and it is preserved by Lemma 5.2. •
Hence f induces a bijection of clopen sets in $S_A$ and $S_B$ as follows: given $C_2 \subset S_A$ we take the image under f of the cohomology class of the characteristic function of $C_1 \times C_2$, and map $C_2$ to the unique clopen set $C_2' \subset S_B$ such that $C_1 \times C_2'$ realizes this class.
THEOREM 5.5 The unital ordered cohomology of a product of a full n-shift, n > 3, and an irreducible subshift $S_A$ of finite type, as a module over the automorphism group of the full shift, determines the conjugacy class of $S_A$.
Proof: Let f be an isomorphism from the unital ordered cohomology of $S_n \times S_A$ to the unital ordered cohomology of $S_n \times S_B$ consistent with the action of the automorphisms of $S_n$. By Lemmas 5.4 and 5.3 this defines an isomorphism between the families of clopen sets which is consistent with the shift, with inclusion and evaluation on finite orbits. Then we can determine from this when each intersection of shifts of these clopen sets is nonempty. But the subshifts themselves are the inverse limits of the system of clopen sets, so that we have identified the subshifts topologically in a way consistent with the action of the shift. •
6 CONCLUSION
Corresponding to cohomology of subshifts of finite type are also a fundamental group and homology. The strongest duality properties of homology to cohomology are specific to finite type subshifts. The question of representing ordered sets of cocycles by ordered sets of clopen sets is significant for the question of whether isomorphism of ordered cohomology yields orbit closure equivalences. We can characterize representation of a single cocycle by a clopen set, and it may be that Proposition 3.4 can characterize inclusion of a pair of clopen sets.
The cochain group $C_0$, the abelian group of all functions $f : S_A \to \mathbb{Z}$, is acted on by all automorphisms of $S_A$, in particular the shift acting as an automorphism. This action of the shift makes $C_0$ into a $\mathbb{Z}$-module by $n * c_1 = \sigma^n c_1$, $c_1 \in C_0$. By the homological algebra of $\mathbb{Z}$-modules, its cohomology as a $\mathbb{Z}$-module is the same as the cohomology in the sense of this paper (derived from more topological ideas in terms of graphs). That is, it is the cokernel of the operator $\sigma^* - 1$. Let $C_p$ be the cochain group of $S_n \times S_A$. It is acted on not only by the shift of each factor but also by any subgroup $G_1$ of the group of automorphisms of $S_n$ (such as one-sided automorphisms). Then the cohomology in the sense of homological algebra $H^*(G_1, C_p)$ can be considered as a kind of higher-dimensional cohomology of the subshift. (This method of working with some cohomology theory on products, or smash products, is often used to study generalized cohomology theories in algebraic topology.) The positivity structure of the fundamental group and the clopen set semigroup also give conjugacy invariants of subshifts of finite type.
References
1. M. Boyle, Topological orbit equivalence and factor maps in symbolic dynamics, Ph.D. Thesis, University of Washington (1983).
2. M. Boyle, Nasu's simple automorphisms, in J. C. Alexander, ed., Dynamical Systems, Proceedings U. Maryland 1986-87, Springer Lecture Notes 1342, Springer, Berlin, 1988.
3. M. Boyle and D. Handelman, Orbit equivalence, flow equivalence, and ordered cohomology, Israel J. of Math. 95 (1996), 169-210.
4. M. Boyle and W. Krieger, Periodic points and automorphisms of the shift, Trans. A.M.S. 302 (1987), 125-149.
5. M. Boyle, D. Lind, and D. Rudolph, The automorphism group of a shift of finite type, Trans. A.M.S. 306 (1988), 71-114.
6. M. Hall, The Theory of Groups, Macmillan, New York, 1959.
7. D. Lind and B. Marcus, An Introduction to Symbolic Dynamics, Cambridge University Press, 1995.
8. M. Nasu, Topological conjugacy for sofic systems and extensions of automorphisms of finite subsystems of topological Markov shifts, in J. C. Alexander, ed., Dynamical Systems, Proceedings U. Maryland 1986-87, Springer Lecture Notes 1342, Springer, Berlin, 1988.
9. Y. T. Poon, A K-theoretic invariant for dynamical systems, Trans. A.M.S. 311 (1989), 515-533.
10. E. H. Spanier, Algebraic Topology, McGraw-Hill, New York, 1964.
11. R. F. Williams, Classification of subshifts of finite type, Annals of Math. 98 (1973), 120-153; erratum, Annals of Math. 99 (1974), 380-381.
SIMPLE MAXIMUM LIKELIHOOD METHODS FOR THE OPTICAL MAPPING PROBLEM
VLADO DANČÍK
Millennium Pharmaceuticals, Cambridge, Massachusetts, U.S.A.
E-mail: [email protected]
MICHAEL S. WATERMAN
Department of Mathematics, University of Southern California, Los Angeles, U.S.A.
E-mail: [email protected]
Recently a new method for obtaining restriction maps was developed by David Schwartz and colleagues. Using this method restriction maps are created from fluorescent images of individual molecules obtained using a microscope. For every individual observed molecule, image processing methods are used to generate a list of the approximate locations of the sites where the molecule is cut by the restriction enzyme. Our task is to find the location of all restriction sites given the observed cutting sites. This is also complicated by the fact that an orientation of the molecules can be unknown, i.e. for a cut-site x we do not know whether x or 1 - x corresponds to a restriction site in a unit length molecule. First we consider the case that the orientation of all molecules and the number c of restriction sites are known. We suppose that for each restriction site location $y_j$ the corresponding measured cut-sites follow the normal distribution with the density function $g(x; \theta_j, \sigma_j)$ for some
1 Introduction
There is a group of enzymes known as restriction endonucleases (or restriction enzymes) that are able to cleave (cut) DNA molecules. The restriction sites, the positions where a DNA molecule is cleaved, are usually specified by a short sequence of nucleotides. For given restriction enzymes, a DNA molecule exhibits a typical pattern of restriction sites called a restriction map. Restriction maps are frequently used in molecular biology, from genetic engineering to genome mapping. The standard way of constructing maps is by sizing the restriction fragments using gel electrophoresis. Optical mapping is a new single-molecule approach to constructing restriction maps developed by D. Schwartz at the W.M. Keck Laboratory for Biomolecular Imaging, Department of Chemistry, New York University [2,15,10,13]. It has already been used effectively in constructing restriction maps for medium-sized molecules and has a potential for highly effective automated creation of restriction maps of entire genomes. For example, recently the method produced the restriction map of the Plasmodium falciparum genome [8].
Here is an overview of the optical mapping approach. Fluorescently stained DNA molecules are elongated and attached to a surface so that biochemical activity is preserved. This can be achieved in a couple of ways; the most recent technique uses the fluid flows within drying droplets. The molecules are then exposed to a restriction enzyme and after digestion microscope images of cleaved molecules are taken. Restriction sites appear as gaps in the image of a molecule and fragment lengths can be computed based on fluorescent intensity of the fragments.
In the idealized experiment we would expect restriction maps of individual molecules to be almost identical; however, due to various experimental imprecision there are errors in the detection of restriction sites. False negative errors, when molecules are not cleaved at all restriction sites, are mostly due to the fact that restriction enzymes cannot cleave the DNA molecule at the places where the molecule is attached to the surface. Some false negative errors can be eliminated by increasing the number of scanned images.
It is more difficult to eliminate the false positive errors - when there is a cleavage detected not at the restriction site. It is suspected that false positive errors are mostly due to imperfection of machine vision, namely 1) misidentification of spurious data, 2) identification of multiple molecules as one, 3) identification of partial molecules as complete, 4) errors in the size estimation, 5) missing fragments. Random breakage of large DNA molecules is quite common. Given the restriction maps of the individual molecules, the major computational challenge is to derive consensus locations of restriction sites. Another issue involved with the current system is that it may not produce the exact orientation information on individual molecules, i.e. the real ordering of the sites may be the reverse of what we observe. The orientation problem can be relaxed by attaching a marker to one end of the DNA molecule. This can make it easier to find a multiple alignment of restriction maps, but it still
remains a challenging algorithmic and statistical problem. A model similar to ours has been presented in [2] and that implementation is used in Schwartz's laboratories. Our aim has been to explore certain simplifications of that model in hopes of having faster algorithms that remain reliable. For example, we only include false cuts but not "bad" molecules as does [1]. Also we employ certain heuristics. A different method combining discrete and continuous approaches can be found in [7]. Various discrete methods were also proposed [6,12].
2 Known Orientation
Let $\theta = \theta_1, \ldots, \theta_k$, $\theta_l \in (0,1)$, be the restriction sites of the unit length DNA molecule. We assume that the number k of restriction sites is known. We have images of M different copies of the DNA molecule; for the i-th copy of the molecule we have observed $m_i$ positions where the molecule is cleaved. We will call these positions cut sites and denote them $X_i = \{x_{i,1}, \ldots, x_{i,m_i}\}$, $x_{i,j} \in (0,1)$ for $i = 1, \ldots, M$ and $j = 1, \ldots, m_i$. With each $x_{i,j}$ we can associate an unobservable zero-one indicator variable $z_{i,j,l}$, where the value of $z_{i,j,l}$ is one or zero depending on whether cut site $x_{i,j}$ comes as an observation of the restriction site $\theta_l$ or does not. The knowledge of $z_{i,j,l}$ would allow us to estimate $\theta_l$ with
$$\hat\theta_l = \frac{\sum_{i=1}^{M} \sum_{j=1}^{m_i} z_{i,j,l}\, x_{i,j}}{\sum_{i=1}^{M} \sum_{j=1}^{m_i} z_{i,j,l}}. \qquad (1)$$
We will simplify our statistical model by considering each cut site to be an independent observation. This is true for cut sites from different molecules and we believe that dependences among cut sites within a molecule are weak enough to justify our simplification. More complex models have been studied [1] and it seems that our simplification does lead to comparable results. Let $X = x_1, \ldots, x_n = x_{1,1}, \ldots, x_{M,m_M}$ be the collection of all cut sites and let $z_{i,l}$ be the corresponding unobservable variables. Let $Y_l = \{x_i : z_{i,l} = 1\}$ be the collection of cut sites that arise from a restriction site $\theta_l$. We assume that each cut site from $Y_l$ is distributed according to a normal distribution with mean $\theta_l$ and some variance $\sigma_l^2$. Unfortunately we also observe "false cut sites" $Y_0 = \{x_i : z_{i,l} = 0, 1 \le l \le k\}$. We can extend the definition of $z_{i,l}$ to $l = 0$: we put $z_{i,0} = 1$ when $x_i \in Y_0$. We assume that cut sites from $Y_0$ are distributed according to a uniform distribution on the interval (0,1). Therefore cut sites from X are distributed according to a mixture of a uniform U(0,1) and k normal $N(\theta_1, \sigma_1^2), \ldots, N(\theta_k, \sigma_k^2)$ distributions. The probability density function for this mixture is
$$f(x; p_0, \ldots, p_k, \theta_1, \ldots, \theta_k, \sigma_1^2, \ldots, \sigma_k^2) = p_0 + \sum_{l=1}^{k} p_l\, g(x; \theta_l, \sigma_l^2),$$
where $g(x; \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\theta)^2}{2\sigma^2}\right)$ is the normal probability density function. We will make one more simplifying assumption: we will consider only the case when the mixing proportions of the normals are the same and the variances are the same too. So we have $p_1 = \cdots = p_k = (1 - p_0)/k$, $\sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2$, and, writing $p = p_0$, the probability density function simplifies to
$$f(x; p, \theta_1, \ldots, \theta_k, \sigma^2) = p + \frac{1-p}{k} \sum_{l=1}^{k} g(x; \theta_l, \sigma^2).$$
Given data $X = x_1, \ldots, x_n$, the best estimate of the positions of restriction sites is $\theta = \theta_1, \ldots, \theta_k$ that maximizes the likelihood function
$$L(X; p, \theta, \sigma^2) = \prod_{i=1}^{n} f(x_i; p, \theta, \sigma^2).$$
This is the same as maximizing the log-likelihood function
$$l(X; p, \theta, \sigma^2) = \log L(X; p, \theta, \sigma^2) = \sum_{i=1}^{n} \log f(x_i; p, \theta, \sigma^2). \qquad (2)$$
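The simplified mixture density and the log-likelihood (2) translate almost line for line into code. The following is a minimal Python sketch, not the authors' implementation; the function names `normal_pdf`, `mixture_pdf` and `log_likelihood` are ours.

```python
import math


def normal_pdf(x, theta, sigma2):
    """Normal density g(x; theta, sigma^2)."""
    return math.exp(-(x - theta) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def mixture_pdf(x, p, thetas, sigma2):
    """f(x; p, theta, sigma^2) = p + (1 - p)/k * sum_l g(x; theta_l, sigma^2)."""
    k = len(thetas)
    return p + (1.0 - p) / k * sum(normal_pdf(x, t, sigma2) for t in thetas)


def log_likelihood(xs, p, thetas, sigma2):
    """Log-likelihood (2): sum of log f over all cut sites."""
    return sum(math.log(mixture_pdf(x, p, thetas, sigma2)) for x in xs)


if __name__ == "__main__":
    cuts = [0.29, 0.31, 0.70, 0.72, 0.05]
    print(log_likelihood(cuts, p=0.1, thetas=[0.3, 0.7], sigma2=0.0004))
```

Setting p = 1 reduces the model to the uniform distribution on (0,1), for which the log-likelihood is zero; this is a convenient sanity check when comparing fits.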
We use the EM (expectation-maximization) algorithm to find the maximum likelihood estimate (MLE). The EM algorithm is an iterative algorithm: in each iteration we compute a new estimate of parameters based on the estimate of parameters from the previous iteration (the question of starting values will be discussed later). It can be shown that iterative estimates of parameters obtained by the EM algorithm converge to a (local) maximum of the likelihood function [4,11]. Every iteration of the EM algorithm consists of an E-step and an M-step. In the E-step we compute the estimate of unobservable data $z_{i,l}$ from the values of parameters $p, \theta_1, \ldots, \theta_k, \sigma^2$ using the following expressions.
$$\hat z_{i,0} = \frac{p}{f(x_i; p, \theta, \sigma^2)}, \qquad \hat z_{i,l} = \frac{1-p}{k} \cdot \frac{g(x_i; \theta_l, \sigma^2)}{f(x_i; p, \theta, \sigma^2)}.$$
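Note that while the indicator variables $z_{i,l}$ can have only zero-one values, the estimate $\hat z_{i,l}$ is the conditional probability that observation $x_i$ belongs to the l-th component and can have any value from [0,1]. In the M-step the new estimates of the parameters are computed from $\hat z_{i,l}$. The estimate for $\theta_l$ is similar to (1); we have
$$\hat\theta_l = \frac{\sum_{i=1}^{n} \hat z_{i,l}\, x_i}{\sum_{i=1}^{n} \hat z_{i,l}}.$$
The estimate for p is
$$\hat p = \frac{1}{n} \sum_{i=1}^{n} \hat z_{i,0}$$
and for $\sigma^2$ we have
$$\hat\sigma^2 = \frac{\sum_{l=1}^{k} \sum_{i=1}^{n} \hat z_{i,l}\, (x_i - \hat\theta_l)^2}{\sum_{l=1}^{k} \sum_{i=1}^{n} \hat z_{i,l}}.$$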
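One iteration of the E-step and M-step above can be sketched as follows. This is a minimal Python illustration under the paper's simplified model (common variance, equal mixing proportions for the normals); the function name `em_step` and the guard against an empty component are our assumptions, not part of the original method.

```python
import math


def normal_pdf(x, theta, sigma2):
    return math.exp(-(x - theta) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def em_step(xs, p, thetas, sigma2):
    """Return updated (p, thetas, sigma2) after one EM iteration."""
    k, n = len(thetas), len(xs)
    z0, z = [], []
    # E-step: conditional probabilities of the uniform and normal components.
    for x in xs:
        comps = [(1.0 - p) / k * normal_pdf(x, t, sigma2) for t in thetas]
        f = p + sum(comps)
        z0.append(p / f)
        z.append([c / f for c in comps])
    # M-step: weighted means, false-cut proportion, pooled variance.
    new_thetas = []
    for l in range(k):
        weight = sum(z[i][l] for i in range(n))
        if weight > 0.0:
            new_thetas.append(sum(z[i][l] * xs[i] for i in range(n)) / weight)
        else:
            new_thetas.append(thetas[l])  # assumption: keep the old value
    new_p = sum(z0) / n
    num = sum(z[i][l] * (xs[i] - new_thetas[l]) ** 2 for i in range(n) for l in range(k))
    den = sum(z[i][l] for i in range(n) for l in range(k))
    new_sigma2 = num / den if den > 0.0 else sigma2
    return new_p, new_thetas, new_sigma2


if __name__ == "__main__":
    cuts = [0.29, 0.30, 0.31, 0.69, 0.70, 0.71]
    params = (0.1, [0.25, 0.75], 0.01)
    for _ in range(20):
        params = em_step(cuts, *params)
    print(params)
```

In practice the step would be repeated until the parameters or the log-likelihood stop changing; a fixed iteration count is used here only to keep the sketch short.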
Equations for $\hat p$, $\hat\theta$, $\hat\sigma^2$ can be justified in the sense that if $p, \theta, \sigma^2$ are convergence points such that $\hat p, \hat\theta, \hat\sigma^2 = p, \theta, \sigma^2$, then $\partial l / \partial \xi = 0$ for $\xi = p, \theta_1, \ldots, \theta_k, \sigma^2$ and the convergence point is a local maximum (we can use an augmentation technique to avoid being stuck in a stationary point that is not a maximum).
3 Unknown Orientation
In the case of unknown orientation we can introduce a new set of unobserved (rather than unobservable) variables $f_i$, $i = 1, \ldots, M$. We set $f_i = 1$ when the orientation of $X_i$ corresponds to the orientation of Y and $f_i = -1$ when the orientation of $X_i^{-1} = \{1 - x_{i,m_i}, \ldots, 1 - x_{i,1}\}$ corresponds to the orientation of Y. There are $2^M$ possible choices for the orientation of molecules; however, we can incorporate orientation variables into the likelihood model in such a way that the orientation question for each molecule can be decided independently. Let $l(X_i^{f_i}; p, \theta, \sigma^2)$ be the contribution of molecule $X_i$ to the log-likelihood function,
$$l(X_i^{f_i}; p, \theta, \sigma^2) = \begin{cases} \sum_{j=1}^{m_i} \log f(x_{i,j}; p, \theta, \sigma^2) & \text{if } f_i = 1, \\ \sum_{j=1}^{m_i} \log f(1 - x_{i,j}; p, \theta, \sigma^2) & \text{if } f_i = -1. \end{cases}$$
For given $f = f_1, \ldots, f_M$ the log-likelihood function (2) has the form
$$l(X, f; p, \theta, \sigma^2) = \sum_{i=1}^{M} l(X_i^{f_i}; p, \theta, \sigma^2)$$
and the MLE in this setting is the set of parameters $p, \theta, \sigma^2$ that maximizes
$$l(X; p, \theta, \sigma^2) = \max\{l(X, f; p, \theta, \sigma^2) : f \in \{-1, 1\}^M\} = \sum_{i=1}^{M} \max\{l(X_i; p, \theta, \sigma^2),\, l(X_i^{-1}; p, \theta, \sigma^2)\}.$$
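Because the maximum over orientations splits into a sum of per-molecule maxima, each molecule can be oriented on its own once the parameters are fixed. The following minimal Python sketch shows this; the function names are ours, and resolving ties in favour of $f_i = 1$ is an assumption.

```python
import math


def mixture_pdf(x, p, thetas, sigma2):
    """Uniform plus equal-weight normal mixture density of Section 2."""
    k = len(thetas)
    normals = sum(
        math.exp(-(x - t) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)
        for t in thetas
    )
    return p + (1.0 - p) / k * normals


def molecule_loglik(cuts, p, thetas, sigma2):
    """Contribution of one molecule to the log-likelihood."""
    return sum(math.log(mixture_pdf(x, p, thetas, sigma2)) for x in cuts)


def choose_orientations(molecules, p, thetas, sigma2):
    """Return (orientations, total log-likelihood) for fixed parameters."""
    orientations, total = [], 0.0
    for cuts in molecules:
        forward = molecule_loglik(cuts, p, thetas, sigma2)
        flipped = molecule_loglik([1.0 - x for x in cuts], p, thetas, sigma2)
        # Assumption: ties are resolved in favour of f_i = 1.
        if forward >= flipped:
            orientations.append(1)
            total += forward
        else:
            orientations.append(-1)
            total += flipped
    return orientations, total


if __name__ == "__main__":
    molecules = [[0.21, 0.52], [0.48, 0.80], [0.19]]
    print(choose_orientations(molecules, p=0.1, thetas=[0.2, 0.5], sigma2=0.0004))
```

The cost is linear in the number of molecules, in contrast to enumerating all $2^M$ orientation vectors.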
We will extend the E-step of the EM algorithm to estimate the orientation of molecules. Given $p, \theta, \sigma^2$ we set $f_i = 1$ if $l(X_i; p, \theta, \sigma^2) > l(X_i^{-1}; p, \theta,$
Initial Values of Parameters and Independent Flipping
The major drawback of the EM algorithm is the dependence of the outcome on the initial values of parameters. This is a consequence of the multimodality of the likelihood function. We are searching for the global maximum, but the EM algorithm is only able to find local maxima. A straightforward but time consuming approach is described in [1]: generate many starting points and use the maximizing procedure on the most promising starting points. We describe a heuristic approach, which allows us to find an orientation of molecules without the knowledge of the estimates $p, \theta, \sigma^2$ and even without the knowledge of the number of restriction sites k. Our heuristic is based on the "voting (majority)" principle: the decision whether two molecules should have the same orientation or not is based on how these two molecules compare to all the remaining molecules. First, every two molecules $X_i$, $X_j$ are assigned an orientation score $os_{i,j}$ from the interval [0,1] expressing whether the molecules are likely to have the same orientation. Values of $os_{i,j}$ close to one mean that molecules $X_i$, $X_j$ are likely to have the same orientation while values close to zero indicate that $X_i$ and $X_j$ are likely to have opposite orientation. To get the orientation score we investigate how well $X_i$ aligns with $X_j$ and $X_j^{-1}$. The problem of aligning restriction maps is discussed in [5]. Here we use a very simple scoring scheme. For two aligned cut sites $x_i \in X_i$ and $x_j \in X_j$ the score is $1 - |x_i - x_j|/w$ (w is a fixed parameter; only cut sites within distance w are considered aligned). The score of the alignment is the sum of scores of aligned pairs. The alignment score $as(X_i, X_j)$ is the score of the highest scoring alignment. We define the orientation score by
$$os_{i,j} = os(X_i, X_j) = \frac{as(X_i, X_j)}{as(X_i, X_j) + as(X_i, X_j^{-1})},$$
where we set $os_{i,j} = 1/2$ when $as(X_i, X_j) + as(X_i, X_j^{-1}) = 0$. The orientation score $os_{i,j}$ can be seen as an estimate of $1(f_i = f_j)$ (for a Boolean expression $E$, the indicator value $1(E)$ is 1 when $E$ is true and 0 otherwise). Our aim is to find the orientation $f$ of the molecules such that $|1(f_i = f_j) - os_{i,j}|$ is minimal.^a Consider two molecules $X_i$ and $X_j$. The cut sites of $X_i$ can correspond to different restriction sites than the cut sites of $X_j$ do, making the score $os_{i,j}$ small even if $f_i = f_j$ (or making $os_{i,j}$ large when $f_i \ne f_j$). We can avoid this by looking at the two corresponding rows of orientation scores, $os(i) = (os_{i,l})$ and $os(j) = (os_{j,l})$, $1 \le l \le M$. If $X_i$ and $X_j$ have the same orientation, we should see agreement between the rows $os(i)$ and $os(j)$; conversely, if they have different orientations, we should see disagreement between the rows. In general, we consider two values from $(0,1)$ to agree when they are either both larger than 0.5 or both smaller than 0.5. The new estimate $\widehat{os}_{i,j}$ thus is

$$\widehat{os}_{i,j} = \frac{1}{M} \sum_{l=1}^{M} 1\big((os_{i,l} > 0.5 \wedge os_{j,l} > 0.5) \vee (os_{i,l} < 0.5 \wedge os_{j,l} < 0.5)\big).$$

We can iterate this process and get more and more accurate estimates. We continue until convergence to a zero-one matrix $F$ is achieved. A vector $f$ such that $F = \{1(f_i = f_j)\}$ specifies the orientation of the molecules.

^a This problem can be shown to be NP-hard by reduction from the EBFC problem [3, 12].

The outcome of this heuristic depends on the parameter $w$. When $w$ is very small, most orientation scores are 1/2 and the resulting matrix consists of all 1's. We will not consider such values of $w$. For some data sets, especially when the resulting restriction map is quite symmetric, the heuristic does not produce a 0-1 matrix; then there is more than one answer to the orientation problem. In such a case we can also try to find a $w$ that yields a 0-1 matrix. We have observed that the most appropriate value is the smallest $w$ that yields a 0-1 matrix.

Even if the orientation of the molecules is known (specified), the EM-algorithm remains sensitive to the initial values of the parameters, though to a much lesser extent. Again we can generate many sets of starting values and continue from the most promising points. However, having specified the orientation and knowing the number of restriction sites, we can use the following simple heuristic. We order the observed cut sites and divide them, according to their order, into $k$ bins, each containing $n/k$ cut sites. If we assume that the digestion rate is the same for every restriction site, we can assume that the cut sites in the $l$-th bin are mostly cut sites corresponding to the $l$-th restriction site. Therefore a good starting value for $\theta_l$ might be the average value of the cut sites in the $l$-th bin, and the initial value for $\sigma^2$ would be the average of the bin variances.

Two factors make the described starting-point heuristic imprecise: we do not have an appropriate starting value for $p$, and the false cut sites artificially increase the variance. Therefore we include one extra bin (say bin 0), in which we capture cut sites that appear to be false cuts. To identify the potential false cut sites we use the near neighbor technique [14].
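Returning to the orientation heuristic described earlier in this section, the following Python sketch puts its three ingredients together: the alignment score, the orientation score, and the iterated row-agreement vote. Several details are assumptions made for illustration only: molecules are lists of cut-site positions rescaled to $[0, 1]$ (so flipping maps $x$ to $1 - x$), an alignment is taken to be a one-to-one, order-preserving pairing found by dynamic programming, the iteration stops at a fixed point or after a hypothetical `max_iter` rounds, and all function names are invented here.

```python
from typing import List, Sequence


def flip(molecule: Sequence[float]) -> List[float]:
    """Reverse a molecule; assumes positions are rescaled to [0, 1]."""
    return sorted(1.0 - x for x in molecule)


def alignment_score(a: Sequence[float], b: Sequence[float], w: float) -> float:
    """Best total score over order-preserving pairings of cut sites closer than w."""
    a, b = sorted(a), sorted(b)
    # best[i][j]: best score using the first i sites of a and the first j sites of b
    best = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            score = max(best[i - 1][j], best[i][j - 1])
            gap = abs(a[i - 1] - b[j - 1])
            if gap < w:  # only sites within distance w may be aligned
                score = max(score, best[i - 1][j - 1] + 1.0 - gap / w)
            best[i][j] = score
    return best[-1][-1]


def orientation_scores(molecules, w: float) -> List[List[float]]:
    """os[i][j] = as(Xi, Xj) / (as(Xi, Xj) + as(Xi, Xj^-1)), or 1/2 if both are 0."""
    n = len(molecules)
    scores = [[0.5] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same = alignment_score(molecules[i], molecules[j], w)
            opposite = alignment_score(molecules[i], flip(molecules[j]), w)
            if same + opposite > 0:
                scores[i][j] = same / (same + opposite)
    return scores


def vote(scores: List[List[float]]) -> List[List[float]]:
    """One round of row agreement: fraction of columns where rows i and j agree."""
    m = len(scores)
    return [
        [
            sum(
                1
                for l in range(m)
                if (scores[i][l] > 0.5 and scores[j][l] > 0.5)
                or (scores[i][l] < 0.5 and scores[j][l] < 0.5)
            ) / m
            for j in range(m)
        ]
        for i in range(m)
    ]


def iterate_votes(scores, max_iter: int = 100):
    """Repeat the vote until the matrix stops changing (or max_iter is reached)."""
    for _ in range(max_iter):
        updated = vote(scores)
        if updated == scores:
            break
        scores = updated
    return scores


def orientations_from_matrix(matrix) -> List[int]:
    """Read orientations relative to molecule 0 from a converged 0-1 matrix."""
    return [1 if value > 0.5 else -1 for value in matrix[0]]
```

For example, `orientations_from_matrix(iterate_votes(orientation_scores(molecules, w)))` gives one orientation per molecule, relative to the first molecule, whenever the iteration ends in a 0-1 matrix.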
For each point $x_i$ we determine the distance $d_m(x_i)$ between $x_i$ and its $m$-th nearest neighbor, i.e. $d_m(x_i)$ is such that $|\{x \in X : |x_i - x| < d_m(x_i)\}| \le m$ and $|\{x \in X : |x_i - x| \le d_m(x_i)\}| > m$. Naturally, cut sites in densely populated areas have a small $m$-th nearest neighbor distance, and vice versa. Therefore we can expect cut sites with large $d_m(x_i)$ to be false cuts. Given a threshold $t$, we put all cut sites $x_i$ with $d_m(x_i) > t$ into bin 0, order the remaining cut sites and distribute them into bins $1, \dots, k$. The initial value for $p$ is then the size of bin 0 divided by $n$, the initial value for $\theta_l$ is the average value of the cut sites in the $l$-th bin, and the initial value for $\sigma^2$ is the average of the bin variances.
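The starting-value heuristic with the extra bin 0 can be sketched as follows. This is an illustrative Python version under stated assumptions: the function names are hypothetical, the kept cut sites are split into $k$ bins of nearly equal size, the variance of each bin is taken to be the population variance, and at least $k$ cut sites are assumed to survive the threshold so that no bin is empty.

```python
from statistics import mean, pvariance
from typing import List, Sequence, Tuple


def mth_nn_distance(x: float, sites: Sequence[float], m: int) -> float:
    """Distance from x to its m-th nearest neighbor among sites.

    sites is assumed to contain x itself, so index 0 of the sorted
    distances is 0 and index m is the m-th nearest neighbor (needs m < len(sites)).
    """
    distances = sorted(abs(x - y) for y in sites)
    return distances[m]


def initial_values(
    sites: Sequence[float], k: int, m: int, t: float
) -> Tuple[float, List[float], float]:
    """Return starting values (p, thetas, sigma2) for the EM iteration."""
    ordered = sorted(sites)
    n = len(ordered)
    # Bin 0: isolated cut sites, treated as likely false cuts.
    kept = [x for x in ordered if mth_nn_distance(x, ordered, m) <= t]
    bin0_size = n - len(kept)
    # Bins 1..k: the remaining ordered cut sites, split as evenly as possible.
    bins = [
        kept[len(kept) * l // k: len(kept) * (l + 1) // k] for l in range(k)
    ]
    p0 = bin0_size / n
    thetas = [mean(b) for b in bins]
    sigma2 = mean([pvariance(b) for b in bins])
    return p0, thetas, sigma2


if __name__ == "__main__":
    cut_sites = [0.19, 0.20, 0.21, 0.49, 0.50, 0.51, 0.85]
    print(initial_values(cut_sites, k=2, m=2, t=0.05))
```

In the toy data the isolated site at 0.85 lands in bin 0, and the two remaining clusters supply the starting values for the two restriction sites.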
Figure 1. λ DNA, Ava I Enzyme. (Panels: data, heuristics, MLE.)
5 Experimental Results
We have implemented the EM-algorithm for the maximum likelihood estimate and the heuristics for orientations and starting points. The algorithms are accessible through the University of Southern California Computational Biology server "http://www-hto.usc.edu/software/mrma". The performance of the algorithms is shown in Figs. 1-3. The data we used were provided by D. Schwartz, Laboratory for Biomolecular Imaging, Department of Chemistry, NYU. In each figure, the first column shows the data as we obtained them, the second column shows the outcome of the orientation heuristic, and the third column shows the outcome of the EM-algorithm. The vertical bars are the actual restriction sites obtained from the DNA sequence. The density function corresponding to the maximum likelihood estimate of the parameters is also shown.
6 Conclusion
We have described a simple maximum likelihood approach for solving the multiple restriction map alignment problem from optical mapping. There are several shortcomings to our model. As we already mentioned, we do not model "bad molecules", which arise, for example, from fragments missing from the end of a molecule. Our reasoning is that such a rescaled molecule would have approximately random cuts that would easily be assigned to the uniform component. Also, we do not model the cut/no-cut stochastic aspect of the cutting, nor do we make sure that the linear order of the cuts is preserved in the model. For this last reason it is likely that the method is not unbiased, but we believe the bias is negligible, and our results support that assertion. A major drawback of maximum likelihood methods is the dependence of the outcome on the starting point, caused by the multimodality of the likelihood surface. To overcome this obstacle we have designed heuristic algorithms to find plausible orientations of the molecules and to suggest appropriate initial values for the parameters. Unfortunately, these techniques cannot determine the number of restriction sites; more sophisticated approaches can be used for this problem [9].

Figure 2. λ DNA, EcoR I Enzyme. (Panels: data, heuristics, MLE.)
Figure 3. λ DNA, Sca I Enzyme. (Panels: data, heuristics, MLE.)
Acknowledgments
The authors wish to thank David Schwartz for the motivation and for providing us with data. We also thank Bud Mishra, Larry Goldstein, Sridhar Hannenhalli, George Komatsoulis and Jae K. Lee for helpful discussions. This work is supported by NIH grant GM36230.

References
1. T. Anantharaman, B. Mishra, D. Schwartz. Genomics via optical mapping II: Ordered restriction maps. Journal of Computational Biology, 4, 91-118, 1997.
2. W. Cai, H. Aburatani, D. Housman, Y. Wang, D. C. Schwartz. Ordered restriction endonuclease maps of yeast artificial chromosomes created by optical mapping on surfaces. Proc. Nat. Acad. Sci., 92, 5164-5168, 1995.
3. V. Dancik, S. Hannenhalli, S. Muthukrishnan. Hardness of flip-cut problems from optical mapping. Journal of Computational Biology, 4, 119-125, 1997.
4. B. S. Everitt and D. J. Hand. Finite Mixture Distributions. Chapman and Hall, London 1981.
5. X. Huang, M. S. Waterman. Dynamic programming algorithms for restriction map comparison. CABIOS, 8, 511-520, 1992.
6. R. M. Karp and R. Shamir. Algorithms for optical mapping. In Proceedings of the 2nd Annual International Conference on Computational Molecular Biology (RECOMB'98), 117-124, 1998.
7. R. Karp, I. Pe'er, and R. Shamir. Algorithms for optical mapping of DNA combining discrete and continuous methods. In Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB'99), 159-168, 1999.
8. Z. Lai, J. Jing, C. Aston, V. Clarke, J. Apodaca, E. Dimalanta, D. Carucci, M. Gardner, B. Mishra, T. Anantharaman, S. Paxia, S. Hoffman, J. Venter, E. Huff, D. Schwartz. A shotgun optical map of the entire Plasmodium falciparum genome. Nature Genetics, 23, 309-313, 1999.
9. J. K. Lee, V. Dancik, and M. S. Waterman. Estimation for restriction sites observed by optical mapping using Markov Chain Monte Carlo. Journal of Computational Biology, 5, 505-515, 1998.
10. X. Meng, K. Benson, K. Chada, E. Huff, D. C. Schwartz. Optical mapping of lambda bacteriophage clones using restriction endonucleases. Nature Genetics, 9, 432-438, April 1995.
11. G. J. McLachlan and T. Krishnan. The EM Algorithm and Extensions. John Wiley & Sons, New York 1997.
12. S. Muthukrishnan and L. Parida. On constructing physical maps by optical mapping: A simple, highly effective, combinatorial approach. Proc. of the First ACM Conference on Computational Molecular Biology (RECOMB), 209-219, Santa Fe, January 1997.
13. D. C. Schwartz, X. Li, L. I. Hernandez, S. P. Ramnarain, E. J. Huff, Y. K. Wang. Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science, 262, 110-114, 1993.
14. B. W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London 1986.
15. Y. K. Wang, E. J. Huff, D. C. Schwartz. Optical mapping of site-directed cleavages on single DNA molecules by the RecA-assisted restriction endonuclease technique. Proc. Nat. Acad. Sci., 92, 165-169, January 1995.