|
Cobig, Coproduct, and Comma (410 lines): msg#00015science.types
Date: Sun, 19 Mar 89 23:45:31 PST Cobig, Coproduct, and Comma Vaughan Pratt 3/19/89 It is eminently reasonable, whether you are a computer or a human, to schedule tasks on the basis of size. Be strict with small (= "cobig") jobs, that is, get them out of the way as soon as they come up. But don't get bogged down in big jobs, just nibble away at them nonstrictly starting with the bits you need soonest. To make this strategy effective you need some way to gauge size. Gavin Wraith's intriguing criterion is that given by big:cobig::product:coproduct. As evidence for this he exhibits recursive definitions of types such that the definitions based on product yield types having infinite values while those based on sum are those with finite values. Although strictness may be linked to whether the definitions are recursive, Wraith's idea would appear not to be. For define a stream of t's to be a function f:w->t (writing w for omega = {0,1,2,...}), and define a list of n t's as a function f:N(n)->t (N(n) = {m|m<n} = n). Then Stream(t) is a product, indeed a power, namely t^w, while List(t) is a sum, namely of t^n over all finite n (all n in w), t^* if you will, or 1/(1-t) (did you see "Rational Persuasion"?). In effect we unwound the recursion and found that the finitary product and sum operations encountered in the recursive definitions expand, after some reassociating, to infinitary product and sum respectively. So the cobig-coproduct connection survived the unwinding. So maybe there is indeed something to this idea. Looking around at a few more types quickly puts the kabosh on this. Consider the type length-17-list-of-t, t^17, a term in the sum t^* = 1 + t + t^2 + ... + t^17 + .... The values of t^17 are finite, but it is naturally defined as a product. So here we have a product type with finite values. Now let's consider the other direction: can sum types have infinite values? Sure. Let's define a brook to be a stream with a limit. This is of type t^{w+1}, where w+1 denotes the ordinal {0,1,2,...,w}. Define a river to be a stream with an optional limit. So river = stream + brook, a sum type whose values are all infinite. So I'm not sold on this cobig idea one little bit. Maybe it might somehow work better if you put the recursion back, but then I'd be inclined to judge the idea as lacking generality. But there's more than one way to skin a category. Let me imitate the vacuum-cleaner creature in Yellow Submarine and vacuum up not just cobig but coproduct, which I'm not sold on either. Were I of the rational persuasion I might argue that t^* is the "quotient" type 1/(1-t), or some such name. This has certain algebraic and arithmetic advantages, and does seem to get rid of the sum nicely, but I haven't seen enough of its categorial advantages yet to be cured of my irrationality. Instead I want to make the case for comma categories over coproducts as a more satisfactory way to organize your favorite morphism collection. I like comma categories because: 1. They exist. 2. They have nice morphisms of their own. 3. They act like homsets. 4. They need not be rational, context-free, or even r.e. In more detail: 1. Many interesting categories don't have coproducts, whereas the comma construction is always defined. Any time you have two functors F and G with a common codomain you have their associated comma category. 2. Coproducts in Set have no morphisms at all, being just sets. The coproduct C+D in Cat has morphisms, but only those of C and D separately. Comma categories however not only have morphisms but quite useful ones. 3. Comma categories can be usefully viewed as variable homsets, retaining the useful formal properties of a constant homset but not the useless ones (such as a homset being just a set). 4. Rationality or regularity is asking too much. Even r.e. is pretty restrictive. Why build computational limits into your structures when you don't have to? If you aren't familiar with comma categories then I recommend getting a sense of the concept from the following examples before spending too long poring over a formal definition. For now just visualize a comma category as a variable homset, in the sense that given functors F:A->C, G:B->C, the comma category F=>G is the homset F(a)=>G(b) (i.e. hom(F(a),G(b))) as a and b vary over A and B respectively. A comma category is like a vector field: just as the earth's magnetic field is a vector varying over space, so is a comma category a homset varying over the "plane" A x B. In our first few examples B will be a point so the variation will be just along the "line" A. Formally a comma category is most slickly described as a lax pullback. I've attempted an understandable account of this 2-category concept in an appendix below. I'd appreciate pointers to other accounts. The various notations for comma include (F,G) (Lawvere), (F downarrow G) (Mac Lane), and the slice category F/c (Freyd?) in the case when G is the constant c:1->C picking out object c of C. I apologize for perpetrating yet another notation, excuse deferred to an appendix. To express List(t), Stream(t) etc. as a comma category F=>G we need some F's and G's. I'll use the following functors. w:1->Set (picking out the set w = omega = {0,1,2,...}). t:1->Set (picking out the set t). N:w->Set (the Von Neumann inclusion N(n) = {m|m<n} = n), N full. (N full implies that w as a category has as morphisms all functions between its elements as sets. Thus 3=>4 (hom(3,4)) has the 64 = 4^3 functions from {0,1,2} to {0,1,2,3}.) Then we have Stream(t) = w=>t List(t) = N=>t Since w and t as functors are constants, w=>t is the constant (ordinary) homset consisting of all functions from the set w of natural numbers to the set t of possible list entries. N:w->Set on the other hand should be thought of as a set-valued variable ranging over the finite initial segments of w, making N=>t a variable homset. For any given n in w, N(n)=>t is the homset of all functions from N(n) to t, namely all t-lists of length n. Varying n over all natural numbers we get all finite lists of t's, the intended meaning for List(t). In this view of things, the main difference between Stream(t) and List(t) is that the domain of the former is the constant w while that of the latter is the variable N ranging over w. That is, streams have a constant domain, lists have a variable domain, depending on their length. (Indeed we may define len(l) = dom(l) for lists l.) So Wraith's product-sum distinction has simplified to the more elementary constant-variable distinction. As an ordinary homset, Stream(t) is a discrete category, one with no morphisms other than the identity at each stream. List(t) on the other hand has lots of nice morphisms between its lists. For example, with t={0,1}, there are two morphisms from the list 01 to 010, distinguished by which 0 in 010 the 0 in 01 is carried to: 0 1 0 1 | | \/ | | /\ 0 1 0 0 1 0 Exercises. (1) There is only one morphism from 010 to 01, but four morphisms from 010 to itself, two of which are automorphisms. (2) For lists x,y give a representation of the homset x=>y in List(t) as a sum indexed by elements of t. Stream(t) as a discrete category seems deprived: one can think of lots of morphisms that it should have. They can be put in by replacing w:1->Set by W:{w}->Set where {w} is the full subcategory of Set with single object w = {0,1,2,...}, with W the associated inclusion of {w} into Set, and taking Stream(t) to be W=>t instead of w=>t. Now what about Seq(t) in this scheme of things, sequences that might or might not go on for ever? Naively Seq(t) is the sum of List(t) and Stream(t). But this has all the aforementioned drawbacks of sum, including not getting enough morphisms. Instead we may define Seq(t) = N'=>t where w+1 = {0,1,2,...,w} and N':w+1->Set is the inclusion into Set that extends N:w->Set by taking N'(w) = {0,1,2,...} = w. (So W:{w}->Set is just the restriction of N' to the full subcategory {w} of w+1.) This has all morphisms between lists and streams, e.g. the prefix morphisms making a given list a prefix of a stream extending that list, as well as all the morphisms of W=>t from streams to streams. So far we've been holding t constant, varying only N and N'. What about varying t, say to give us the type List = (At)List(t) (A = for-all)? We can do this by replacing t:1->Set by Set:Set->Set (an abbreviation for the identity 1_{Set}:Set->Set). This yields the definitions Stream = w=>Set or W=>Set for more morphisms List = N=>Set Seq = N'=>Set A typical list of N=>Set is then a triple (n,f,t) where n is a natural number, t is a set, and f:N(n)->t determines the list. Thus every list comes equipped with an "alphabet" t and hence is a list of t's. However it is not the case that N=>Set is just the sum (in Cat) of the N=>t's for t ranging over all sets. Even for a fixed t we can now have more than just the identity function on t. And we can have morphisms between lists with different t's. In particular two lists are isomorphic just when they have the same multiset of elements. This last fact, and our earlier remark that 010 had two automorphisms in List({0,1}), suggest that we have the wrong morphisms. In particular if we expect our intuition about data types to be reflected in the category structure of that type then we seem to have defined Multiset(t) rather than List(t). Let us equip w and its members with their standard order and take the domains of N and N' to have for their respective morphisms just monotone maps, this being a necessary and sufficient condition for N and N' to factor through Pos via the forgetful functor V:Pos->Set. (Pos is the category of partially ordered sets and their monotone maps.) The resulting category structure is the one we more naturally associate with lists as opposed to multisets. Exercises (assuming monotone maps now). (3) In List(t) and Seq(t), all isomorphisms are identities. (4) Isomorphism in List and Seq however means the weaker notion of having the same pattern, e.g. 0012220 and 2201112 are isomorphic. Note that order remains important here; for lists of any given type t, List differs from List(t) mainly in that its isomorphisms don't take individual letters seriously. But now why are we limiting ourselves to just the domains w and w+1? Why not define the notion of list/stream/whatever for all of Pos? Our ultimate type of this general flavor, subsuming all the above, is V=>Set, for V the above-mentioned forgetful functor V:Pos->Set. V=>Set is the category of what I have been calling pomsets, a term that in the last couple of years seems to have displaced Grabowski's term "partial word" for this concept. That is, Pom = V=>Set A pomset amounts to a labeled partial order, namely a triple (p,h,t) where p is a poset, t is a set thought of as the alphabet of the pomset, and h:V(p)->t labels elements of p with letters of the alphabet. A morphism of pomsets carrying (p,h,t) into (p',h',t') consists of a monotone function f:p->p' and an "alphabet translation" g:t->t', with the evident square commuting. The identity morphisms are those with both f and g identities. I prefer pomsets to lists, streams, or seqs because the concept seems more fundamental to me. Unlike lists and streams it has no built-in rationality; in fact it makes no computational or constructive assumptions at all. Moreover the additional axioms needed to cut back to list or whatever tend to do more harm than good. For example Pom, unlike List or Seq, is small bicomplete (has all small limits and colimits). This makes Pom much more attractive than Seq as a basis for a programming language. By analogy with List vs. List(t) we may have Pom(t) = V=>t. Whereas every isomorphism in List(t) is an identity, an isomorphism in Pom(t) just expresses the idea that two labeled partial orders coincide in order and in labelling but not in underlying sets. I've been using "pomset" to refer to an isomorphism class in this sense, so strictly speaking pomsets are elements of a skeleton of Pom(t). However working with skeletons seems to be a bad idea in general, so even though I still regard pomsets as being defined only up to isomorphism (in Pom(t), not in Pom which allows isomorphisms with nonidentity alphabet translations), I take the category Pom(t) to be V=>t rather than a skeleton of it. Vaughan Pratt ========================================================== Appendix 1: Comma Categories as Lax Pullbacks A comma category is a lax pullback in CAT. In more detail: Let C^2 denote the functor category of arrows of C. Then the *comma category* F=>G determined by functors F:A->C, G:B->C is the subcategory D of A x C^2 x B whose objects (a,h,b) and morphisms (f,s,g) (morphisms of C^2 being commuting squares s) satisfy dom(h)=F(a), cod(h)=G(b) (1) dom(s)=F(f), cod(s)=G(g) (2) That is, (f,s,g) is h a F(a) ---> G(b) b | | | | f| F(f|) s G(g|) g| (3) | | | | v v h' v v a' F(a') ---> G(b') b' Such a subcategory of a product should be a pullback. We can make it a sort of pullback, namely the limit D of diagram (5) of II.6 of Mac Lane: Q D - - - - - - - - > B : . | : .H |G : . | | : --' 2 v P : C -----> C (4) : | cod : |dom v v A -------> C F Here the functors P,H,Q denote the restriction to D (= F=>G) of the respective projections of A x C^2 x B. Commutativity of this diagram expresses equations (1) and (2). Notice that the interesting "notch" in the lower right of the "pullback" depends only on C. There is a very nice 2-category packaging of this notch. Instead of requiring the diagram to commute "on the nose", that is, identity (equality) of the compositions along all paths from D to a given instance of C, we relax this to commutativity up to a natural transformation t to give the "lax pullback" Q D - - -> B : --, | P : t/| |G (5) v / v A -----> C F Here t:FP->GQ is just H:D->C^2 rotated 90 degrees but carrying the same data. That we can perform this rotation without loss of information follows from the familiar isomorphism 2 D 2xD Dx2 D 2 (C ) ~ C ~ C ~ (C ) (6) showing that the functors from D to C^2 are in correspondence with the natural transformations of functors from D to C. To see this correspondence at closer range, observe that H determines t via t(d) = H(d), while t determines H via H(d) = t(d), H(m:d->d') = the square t(d) FP(d) ---> GQ(d) | | FP(m|) GQ(m|) | | v t(d') v FP(d')---> GQ(d') Close inspection reveals that equations (1) and (2) have now become dom(t)=FP, cod(t)=GQ. Diagram (5) conveys these equations trivially and much more directly than diagram (4). Universality of this lax pullback is expressed as usual by its finality in a suitable category of such squares with the AFCGB part held constant, whose morphisms are cubes with only the two object faces lax, the other four faces commuting exactly, such that if the morphism from D'P't'Q' to DPtQ has J:D'->D as one edge then t' = tJ. ============================================= Appendix 2. Notation For Homsets and Comma Categories Lawvere introduced comma categories in his thesis, using the notation (F,G) to introduce and emphasize the variable-homset idea, without however calling them comma categories. Others picked up on the comma in the notation to suggest the term "comma category." For his book Mac Lane retained the name "comma category" but replaced the comma symbol by downarrow. (Downarrow categories?) In this message I've followed Lawvere's convention of using the same notation for homsets and comma categories. At the same time I've agreed with Mac Lane that the notation (a,b) is "already overworked," a complaint that applies equally to homsets and comma categories. C(a,b) is almost as bad, hom(a,b) is good (with hom_C(a,b) when necessary for disambiguation) but a bit klutzy. But the notation a=>b that I've been using here strikes me as very nice, capturing the idea of a homset as a set of arrows a->b yet looking fine either in ASCII or when typeset. There is precedent for reading a->b as hom(a,b), based on the usage f:a->b read as f:t for the case t the type a->b. If we're lucky however we may be able to tease apart the f:t usage into two cases. One is where -> is the internal hom, which I don't propose to interfere with. The other, external hom, takes f:a->b to be an easily typeset f synonym for a-->b, where the arrow denotes a morphism rather than a whole homset. Fortuitously this usage fits in with logic. In Boolean and Heyting algebra a->b is synonymous with b^a (the internal hom), while the external hom is written a <= b (less-or-equal) to distinguish it from internal. (The spurious relationship between <= and => is peculiar to ASCII and goes away when typeset; both however represent external hom.) Also, the Curry adjunction looks better (less permuting) as a => (b->c) ~ axb => c than as b a -> c ~ axb -> c. So to summarize, we have three places needing some sort of arrow: homsets, morphisms, and internal hom. We distinguish these by using f a=>b, a-->b, and a->b respectively, taking a <= b as a synonym for the first (when the category is a preorder), f:a->b as a synonym for the a second and b as a synonym for the third. Unless I've overlooked some point of view, this should all work out to everyone's satisfaction. |
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re. Gavin Wraith's message on Types & Strictness: 00015, Rishiyur S. Nikhil |
|---|---|
| Next by Date: | Re: Cobig, Coproduct, and Comma: 00015, John Gray |
| Previous by Thread: | Re. Gavin Wraith's message on Types & Strictnessi: 00015, Rishiyur S. Nikhil |
| Next by Thread: | Re: Cobig, Coproduct, and Comma: 00015, John Gray |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |