Sheldon Ross - A first course in probability. Loads of good examples, though the text is a bit dry and plodding. His Introduction to Probability Models (now in its 9th edition) has some of the same pros and cons, and covers Markov chains, Poisson processes, and a lot more. The number of exercises is breathtaking.

David Williams - Weighing the odds. A recent text by an inspiring author, with an integrated treatment of probability and statistics. Fast and exciting.

Kai Lai Chung - Elementary probability theory. Slower exposition than the above. Good on counting.

William Feller - An introduction to probability theory and its applications. The classic book, in 2 volumes. Volume 1 covers only discrete distributions (including Markov chains); volume 2 covers general probability, with a good treatment of convergence in distribution. Generations of probabilists have been inspired by this book, and it still can’t be beat for its treatment of renewal processes, densities, and random walks. Deep exercises involving many interesting applications.

Jim Pitman - Probability. Covers the more elementary parts of the course. Lots of entertaining examples worked through in the text, and an encyclopedic array of exercises, with solutions. Terrific text. Particularly good on conditional expectations and applying the normal approximation. Also has a great section on the bivariate normal distribution and a very intuitive treatment of the Poisson process.

Henk Tijms - Understanding probability. Full of anecdotal details. The first half is designed to be read for fun and to motivate the mathematical details in the second half.

Grinstead and Snell - Introduction to Probability. Available in its entirety online. Long and thorough, but gentle.

P. Billingsley - Convergence of Probability Measures. Above the level of this course, but this is the text to go to for those who want to understand how convergence of probability distributions really works.

Kemeny and Snell - Finite Markov Chains. Another classic (1960); the only thing that’s outdated is the typeface. (Well, the applications are meager.)

James Norris - Markov Chains. The officially recommended text for the course. Not bad. The section on discrete Markov chains is not all that long, and perhaps less accessible than it could be.