This book is about finite-alphabet stationary processes, which are important in physics, engineering, and data compression. The book is designed for use in graduate courses, seminars or self study for students or faculty with some background in measure theory and probability theory. The focus is on the combinatorial properties of typical finite sample paths drawn from a stationary, ergodic process. A primary goal, only partially realized, is to develop a theory based directly on sample path arguments, with minimal appeals to the probability formalism. A secondary goal is to give a careful presentation of the many models for stationary finite-alphabet processes that have been developed in probability theory, ergodic theory, and information theory.
The two basic tools for a sample path theory are a packing lemma, which shows how ``almost'' packings of integer intervals can be extracted from coverings by overlapping subintervals, and a counting lemma, which bounds the number of n-sequences that can be partitioned into long blocks subject to the condition that most of them are drawn from collections of known size. These two simple ideas, introduced by Ornstein and Weiss in 1980, immediately yield the two fundamental theorems of ergodic theory, namely, the ergodic theorem of Birkhoff and the entropy theorem of Shannon, McMillan, and Breiman. The packing and counting ideas yield more than these two classical results, however, for in combination with the ergodic and entropy theorems and further simple combinatorial ideas they provide powerful tools for the study of sample paths. Much of Chapter I and all of Chapter II are devoted to the development of these ideas.
The classical process models are based on independence ideas and include the i.i.d. processes, Markov chains, instantaneous functions of Markov chains, and renewal and regenerative processes. An important and simple class of such models is the class of concatenated-block processes, that is, the processes obtained by independently concatenating fixed-length blocks according to some block distribution and randomizing the start. Related models are obtained by block coding and randomizing the start, or by stationary coding, an extension of the instantaneous function concept which allows the function to depend on both past and future. All these models and more are introduced in the first two sections of Chapter I. Further models, including the weak Bernoulli processes and the important class of stationary codings of i.i.d. processes, are discussed in Chapter III and Chapter IV.
Of particular note in the discussion of process models is how ergodic theorists think of a stationary process, namely, as a measure-preserving transformation on a probability space, together with a partition of the space. This point of view, introduced in Section I.2, leads directly to Kakutani's simple geometric representation of a process in terms of a recurrent event, a representation that not only simplifies the discussion of stationary renewal and regenerative processes but generalizes these concepts to the case where times between recurrences are not assumed to be independent, but only stationary. A further generalization, given in Section I.10, leads to a powerful method for constructing examples known as cutting and stacking.
The book has four chapters. The first chapter, which is half the book, is devoted to the basic tools, including the Kolmogorov and ergodic theory models for a process, the ergodic theorem and its connection with empirical distributions, the entropy theorem and its interpretations, a method for converting block codes to stationary codes, the weak topology and the even more important dbar-metric topology, and the cutting and stacking method.
Properties related to entropy which hold for every ergodic process are discussed in Chapter II. These include entropy as the almost-sure bound on per-symbol compression, Ziv's proof of asymptotic optimality of the Lempel-Ziv algorithm via his interesting concept of individual sequence entropy, the relation between entropy and partitions of sample paths into fixed-length blocks, or partitions into distinct blocks, or partitions into repeated blocks, and the connection between entropy and recurrence times and entropy and the growth of prefix trees.
Properties related to entropy which hold only for restricted classes of processes are discussed in Chapter III, including rates of convergence for frequencies and entropy, the estimation of joint distributions in both the variational metric and the dbar-metric, a connection between entropy and dbar-neighborhoods, and a connection between entropy and waiting times.
Several characterizations of the class of stationary codings of i.i.d.\ processes are given in Chapter IV, including the almost block-independence, finitely determined, very weak Bernoulli, and blowing-up characterizations. Some of these date back to the original work of Ornstein and others on the isomorphism problem for Bernoulli shifts, although the almost block-independence and blowing-up ideas are more recent.
This book is an outgrowth of the lectures I gave each fall in Budapest from 1989 through 1994, both as special lectures and seminars at the Mathematics Institute of the Hungarian Academy of Sciences and as courses given in the Probability Department of Eotvos Lorand University. In addition, lectures on parts of the penultimate draft of the book were presented at the Technical University of Delft in the fall of 1995. The audiences included ergodic theorists, information theorists, and probabilists, as well as combinatorialists and people from engineering and other mathematics disciplines, ranging from undergraduate and graduate students through post-docs and junior faculty to senior professors and researchers.
Many standard topics from ergodic theory are omitted or given only cursory treatment, in part because the book is already too long and in part because they are not close to the central focus of this book. These topics include topological dynamics, smooth dynamics, random fields, K-processes, combinatorial number theory, general ergodic theorems, and continuous time and/or space theory. Likewise little or nothing is said about such standard information theory topics as rate distortion theory, divergence-rate theory, channel theory, redundancy, algebraic coding, and multi-user theory.
Some specific stylistic guidelines were followed in writing this book. Proofs are sketched first, then given in complete detail. With a few exceptions, the sections in Chapters II, III, and IV are approximately independent of each other, conditioned on the material in Chapter I. Theorems and lemmas are given names that include some information about content, for example, the entropy theorem rather than the Shannon-McMillan-Breiman theorem. Likewise, suggestive names are used for concepts, such as building blocks (a name proposed by Zoli Gyorfi) and column structures (as opposed to gadgets.) Also numbered displays are often (informally) given names similar to those used for LaTeX labels. Only those references that seem directly related to the topics discussed are included. Exercises that extend the ideas are given at the end of most sections; these range in difficulty from quite easy to quite hard.
I am indebted to many people for assistance with this project. Imre Csiszar and Katalin Marton not only attended most of my lectures but critically read parts of the manuscript at all stages of its development and discussed many aspects of the book with me. Much of the material in Chapter III as well as the blowing-up ideas in Chapter IV are the result of joint work with Kati. It was Imre's suggestion that led me to include the discussions of renewal and regenerative processes, and his criticisms led to many revisions of the cutting and stacking discussion. In addition I had numerous helpful conversations with Benjy Weiss, Don Ornstein, Aaron Wyner, and Jacob Ziv. Others who contributed ideas and/or read parts of the manuscript include Gabor Tusn ady, Bob Burton, Jacek Serafin, Dave Neuhoff, Gyorgy Michaletzky, Gusztav Morvai, and Nancy Morrison. Last, but far from least, I am much indebted to my two Toledo graduate students, Shaogang Xu and Xuehong Li, who learned ergodic theory by carefully reading almost all of the manuscript at each stage of its development, in the process discovering numerous errors and poor ways of saying things.
No project such as this can be free from errors and incompleteness. A list of errata as well as a forum for discussion will be available on the Internet at the following web address.
http://www.math.utoledo.edu/~pshields/ergodic.html
I am grateful to the Mathematics Institute of the Hungarian Academy for providing me with many years of space in a comfortable and stimulating environment as well as to the Institute and to the Probability Department of Eotvos Lorand University for the many lecture opportunities. My initial lectures in 1989 were supported by a Fulbright lectureship. Much of the work for this project was supported by NSF grants DMS-8742630 and DMS-9024240 and by a joint NSF-Hungarian Academy grant MTA-NSF project 37.
This book is dedicated to my son, Jeffrey.
BACK to Home Page.