## Information Theory, Inference and Learning AlgorithmsInformation theory and inference, often taught separately, are here united in one entertaining textbook. These topics lie at the heart of many exciting areas of contemporary science and engineering - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics, and cryptography. This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction. A toolbox of inference techniques, including message-passing algorithms, Monte Carlo methods, and variational approximations, are developed alongside applications of these tools to clustering, convolutional codes, independent component analysis, and neural networks. The final part of the book describes the state of the art in error-correcting codes, including low-density parity-check codes, turbo codes, and digital fountain codes -- the twenty-first century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, David MacKay's groundbreaking book is ideal for self-learning and for undergraduate or graduate courses. Interludes on crosswords, evolution, and sex provide entertainment along the way. In sum, this is a textbook on information, communication, and coding for a new generation of students, and an unparalleled entry point into these subjects for professionals in areas as diverse as computational biology, financial engineering, and machine learning. |

### What people are saying - Write a review

We haven't found any reviews in the usual places.

### Contents

Introduction to Information Theory | 3 |

Data Compression | 16 |

Probability Entropy and Inference | 22 |

ful theoretical ideas of Shannon but also practical solutions to communica | 34 |

More about Inference | 48 |

Data Compression | 65 |

The Source Coding Theorem | 67 |

4 The Source Coding Theorem | 81 |

Model Comparison and Occams Razor | 343 |

Stream Codes 26 I Exact Margmalization in Graphs | 346 |

Monte Carlo Methods | 357 |

Efficient Monte Carlo Methods | 387 |

Ising Models | 400 |

Exact Monte Carlo Sampling | 413 |

Variational Methods | 422 |

Independent Component Analysis and Latent Variable Mod | 440 |

Symbol Codes | 91 |

Stream Codes | 110 |

Codes for Integers | 132 |

NoisyChannel Coding | 137 |

Dependent Random Variables | 138 |

9 Communication over a Noisy Channel | 144 |

Communication over a Noisy Channel | 146 |

MOj The NoisyChannel Coding Theorem | 151 |

The NoisyChannel Coding Theorem | 162 |

ErrorCorrecting Codes and Real Channels | 177 |

MlErrorCorrecting Codes and Real Channels | 178 |

no The NoisyChannel Coding Theorem | 184 |

Further Topics in Information Theory | 191 |

Codes for Efficient Information Retrieval | 193 |

Binary Codes | 203 |

Binary Codes | 206 |

Binarv Codes | 208 |

TE | 216 |

Very Good Linear Codes Exist | 229 |

Further Exercises on Information Theory | 233 |

Message Passing | 241 |

MO Message Passing | 236 |

Communication over Constrained Noiseless Channels | 248 |

Crosswords and Codebreaking | 260 |

Why have Sex? Information Acquisition and Evolution | 269 |

Probabilities and Inference | 281 |

22 | 282 |

Clustering | 284 |

Exact Inference by Complete Enumeration | 293 |

2 Exact Inference by Complete Enumeration | 296 |

Maximum Likelihood and Clustering | 300 |

Useful Probability Distributions | 311 |

Exact Marginalization | 319 |

Exact Marginalization in Trellises | 324 |

Exact Marginalization in Graphs | 334 |

Laplaces Method | 340 |

elling | 443 |

Random Inference Topics | 445 |

Decision Theory | 451 |

Bayesian Inference and Sampling Theory | 457 |

V | 466 |

Neural networks | 467 |

Introduction to Neural Networks | 468 |

The Single Neuron as a Classifier | 471 |

Capacity of a Single Neuron | 483 |

Learning as Inference | 492 |

Hopfield Networks | 505 |

42 Hopfield Networks | 518 |

Boltzmami Machines | 522 |

43 Boltzmann Machine | 526 |

Supervised Learning in Multilayer Networks | 527 |

43 Bolt maim Machines | 534 |

Gaussian Processes | 535 |

Deconvolution | 549 |

Sparse Graph Codes | 555 |

LowDensity ParityCheck Codes | 557 |

I | 558 |

47 LowDensity ParityCheck Codes | 573 |

Convolutional Codes and Turbo Codes | 574 |

C | 575 |

48 Convolutional Codes and Turbo Codes | 581 |

Repeat Accumulate Codes | 582 |

49 Repeat Accumulate Codes | 583 |

50 Digital Fountain Codes | 588 |

Digital Fountain Codes | 589 |

Appendices | 597 |

A Notation | 598 |

B Some Physics | 601 |

Some Mathematics | 605 |

613 | |

620 | |

### Other editions - View all

### Common terms and phrases

approximation arithmetic coding assume average Bayesian binary symmetric channel blocklength capacity Chapter cluster codeword coding theorem compression compute corresponding data set decoding problem defined denote distance encoding ensemble entropy equal equation error probability estimate evaluate example expected length factor flipped Gaussian channel Gaussian distribution Gibbs sampling given graph Hamming code hash function Huffman code hypothesis inference integer Ising model iterations K-means algorithm linear code log2 low-density parity-check codes marginal maximal mean Monte Carlo methods mutual information neural network neuron node noise level noisy channel normalizing constant obtain outcome output parameters parity-check matrix possible posterior distribution posterior probability predictions prefix code prior probability density probability distribution probability of error random variable ratio sequence shown in figure shows simulation Solution to exercise source bits space standard deviation string symbol code theory trajectory transmitted trellis typical set uniquely decodeable variance vector weight zero