Some clarifications regarding Professor Ludlow’s claims that Hinton’s Boltzmann machines lack both originality and relevance to contemporary AI.
First, a brief review of the early development of neural networks following the work of Donald Hebb. Hebb translated William James's insights about the behavior of brain cells into the emerging neuroscience of the early 20th century, linking learning theory with modern neurophysiology. The initial research in artificial neural networks by Frank Rosenblatt, Marvin Minsky, John Hopfield, and others was all influenced by Hebb's work. Hopfield implemented a Hebbian network well before Geoffrey Hinton did. His true originality, however, lies in applying statistical mechanics to neural networks by treating them as a collection of interacting physical components: he drew an analogy between neurons and the magnetic spins of atoms, which allowed him to introduce an energy function whose minimization yields associative memory.
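For readers who want the formalism: in the textbook presentation (a sketch, not the notation of any particular paper), a Hopfield network of binary units $s_i \in \{-1,+1\}$ with symmetric weights $w_{ij}$ has the energy

```latex
E(\mathbf{s}) = -\frac{1}{2} \sum_{i \neq j} w_{ij}\, s_i s_j,
\qquad
w_{ij} = \frac{1}{N} \sum_{\mu=1}^{P} \xi^{\mu}_i \xi^{\mu}_j ,
```

where the $\xi^{\mu}$ are the $P$ stored patterns (the Hebbian prescription for the weights). Asynchronously updating each unit to $s_i \leftarrow \operatorname{sign}\big(\sum_j w_{ij} s_j\big)$ never increases $E$, so the state slides downhill into a stored minimum; that descent is the associative memory.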
While Hinton's Boltzmann machine is based directly on Hopfield's work, it contains three innovations that fundamentally change its character. The first is contrastive Hebbian learning, which Professor Ludlow describes in some detail in his note. This is significant because Hopfield networks do not learn in the standard machine-learning sense: their weights are not adjusted through a data-driven training process but are set by hand according to the mathematical formalism of the design. Contrastive Hebbian learning is driven by the difference between a positive and a negative phase; this delta supplies the gradient used to adjust the weights and minimize an energy function similar to the one defined for the Hopfield network.
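Written out (again in standard textbook notation, offered as a sketch), the two-phase rule is

```latex
\Delta w_{ij} = \eta \left( \langle s_i s_j \rangle^{+} - \langle s_i s_j \rangle^{-} \right),
```

where $\langle s_i s_j \rangle^{+}$ is the average co-activation of units $i$ and $j$ with the visible units clamped to the data (the positive phase) and $\langle s_i s_j \rangle^{-}$ is the same average with the network running freely (the negative phase). This difference is exactly the gradient of the log-likelihood of the data under the model's distribution, which is why the rule provably works, however slowly.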
The second innovation uses this energy function to define a probability distribution that captures the statistical relationships present in the training data. This provides functionality fundamentally distinct from the associative memory implemented by a Hopfield network. Given an arbitrary input pattern, a Hopfield network deterministically returns the most similar stored pattern; a Boltzmann machine stochastically generates a new pattern that reflects both the original input and the overall statistics of the training data.
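Concretely, the distribution is the Boltzmann distribution from statistical mechanics (hence the name):

```latex
p(\mathbf{s}) = \frac{e^{-E(\mathbf{s})/T}}{Z},
\qquad
Z = \sum_{\mathbf{s}'} e^{-E(\mathbf{s}')/T},
```

so low-energy configurations are high-probability ones. Sampling from this distribution, rather than descending deterministically to a fixed point, is what makes the machine generative.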
Finally, the Boltzmann machine divides its network into observable and hidden neurons. The observable neurons serve as the interface for the network: during training the observable neurons are set to the training data (known as “clamping”), and the output of a Boltzmann machine is the values of the observable neurons. Hidden neurons are allowed to vary freely during the training process; they are additional variables that allow the probability distribution to capture higher-order statistical patterns and structure.
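To make the three pieces concrete, here is a minimal Python sketch of a restricted Boltzmann machine, the bipartite special case that later made training practical, using one step of contrastive divergence as a cheap stand-in for the full two-phase rule above (contrastive divergence is a later approximation, not the original algorithm). The sizes, data, and learning rate are all illustrative, not drawn from Hinton's papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 6 visible units, 3 hidden units.
n_visible, n_hidden = 6, 3
W = rng.normal(0, 0.1, size=(n_visible, n_hidden))  # weights
b = np.zeros(n_visible)   # visible biases
c = np.zeros(n_hidden)    # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Stochastic binary units: fire with probability p.
    return (rng.random(p.shape) < p).astype(float)

# Toy training data: two binary patterns (hypothetical).
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

lr = 0.1
for epoch in range(1000):
    for v0 in data:
        # Positive phase: clamp the visible units to the data
        # and sample the hidden units.
        ph0 = sigmoid(v0 @ W + c)
        h0 = sample(ph0)

        # Negative phase (CD-1): a single step of Gibbs sampling
        # instead of running the chain to equilibrium.
        pv1 = sigmoid(h0 @ W.T + b)
        v1 = sample(pv1)
        ph1 = sigmoid(v1 @ W + c)

        # Contrastive update: data correlations minus model correlations.
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        b += lr * (v0 - v1)
        c += lr * (ph0 - ph1)

# Generation: start from noise and run a few Gibbs steps; the
# stochastic result reflects the statistics of the training data.
v = sample(np.full(n_visible, 0.5))
for _ in range(20):
    h = sample(sigmoid(v @ W + c))
    v = sample(sigmoid(h @ W.T + b))
print(v)
```

The clamped positive phase, the free-running negative phase, the stochastic units, and the hidden layer are all visible here in miniature.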
The originality of Hinton's work does not lie in any one of these ideas but in their combination: the Boltzmann machine was the first learnable, generative neural network model. Yes, contrastive Hebbian learning was grossly inefficient, but it was a provably viable mechanism and it showed the way. The following year David Rumelhart would work with Hinton and Ronald Williams to rediscover backpropagation, which quickly became the core learning mechanism for all of deep learning. Hidden variables and the related concept of latent space are fundamental to deep learning. Most of all, the Boltzmann machine is the forerunner of generative AI: image models like Midjourney and DALL-E, video models like Sora and Veo, and large language models like Gemini, Claude, and GPT.