Preface
The past two decades have witnessed unprecedented advances in the use of computerised devices to solve real-world problems of enormous practical importance. For example, small portable devices can recognise your voice commands and carry out simple instructions, e.g., turning on a radio station or navigating you to a distant location, as well as more sophisticated tasks such as translating an utterance from one language to another or recognising medical conditions that an expert might have difficulty detecting. Speech recognition and image recognition systems can now perform tasks that we would have thought unimaginable even just 20 years ago. The capabilities of such devices seem set to transform society in a revolutionary fashion akin to that of the industrial revolution of the 19th century.
Yet the incredible changes we have observed in these capabilities are not based on any recent major theoretical advances. Indeed, the theoretical foundations for these developments were laid decades ago by physicists, computer scientists and cognitive psychologists attempting to solve some of the basic problems of complexity, computation and human thinking and development. For example, the Nobel Prize in Physics in 2024 was awarded to Hopfield and Hinton for their foundational work on artificial neural networks carried out in the 1980s (Hopfield 1982; Ackley, Hinton, and Sejnowski 1985; Rumelhart, Hinton, and Williams 1986). And a groundbreaking series of articles published in a two-volume set (Rumelhart et al. 1986; McClelland et al. 1987) made the new advances accessible to a wide audience.
The main reasons for the sudden surge in the capabilities of these artificial devices are twofold: the emergence of very powerful computing devices and the availability of huge databases of digital information. The combination of these two developments enabled the earlier theoretical advances to realise their potential.
The goal of this very brief introduction is to elucidate the nature of these earlier theoretical advances and to show how they led to the extraordinary developments in science and technology that we are witnessing today.
For those who have some background in the R programming language, I also include snippets of code that demonstrate artificial neural networks at work. These are hidden in collapsible callout notes indicated by an icon.
These R snippets do not offer a comprehensive introduction to the R programming language. Readers are encouraged to consult one of the excellent introductions to R programming and its use in data science, statistical modelling and neural network modelling. Some examples include:
- Wickham, Grolemund, et al. (2017)
- Grolemund (2014)
- Ciaburro and Venkateswaran (2017)
The code used in this book was written with R version 4.5.0 (2025-04-11) (R Core Team 2025) and uses only elements taken from base R, except where exposition is facilitated by the use of third-party packages. Such third-party packages must be installed and loaded before the reader can run the corresponding code.
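To give a taste of the style of these snippets, here is a minimal sketch, written in base R, of the basic building block that later chapters elaborate: a single artificial neuron that takes a weighted sum of its inputs and passes it through a logistic (sigmoid) activation. The function names and example values here are my own illustrative choices, not taken from the book.

```r
# Logistic (sigmoid) activation: squashes any real number into (0, 1).
sigmoid <- function(x) 1 / (1 + exp(-x))

# A single artificial neuron: weighted sum of inputs plus a bias,
# passed through the activation function.
neuron <- function(inputs, weights, bias) {
  sigmoid(sum(inputs * weights) + bias)
}

# Example with two inputs and hand-picked weights:
# net input = 1 * 0.8 + 0 * (-0.4) - 0.2 = 0.6, so the
# output is sigmoid(0.6), roughly 0.65.
neuron(c(1, 0), weights = c(0.8, -0.4), bias = -0.2)
```

Everything here is base R, so the snippet runs without installing any packages.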
Ackley, David H., Geoffrey E. Hinton, and Terrence J. Sejnowski. 1985. "A Learning Algorithm for Boltzmann Machines." Cognitive Science 9 (1): 147–69.

Ciaburro, Giuseppe, and Balaji Venkateswaran. 2017. Neural Networks with R: Smart Models Using CNN, RNN, Deep Learning, and Artificial Intelligence Principles. Packt Publishing Ltd.

Collins, Allan M., and M. Ross Quillian. 1969. "Retrieval Time from Semantic Memory." Journal of Verbal Learning and Verbal Behavior 8 (2): 240–47.

Elman, J. L., E. A. Bates, M. H. Johnson, A. Karmiloff-Smith, D. Parisi, and K. Plunkett. 1996. Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.

Grolemund, Garrett. 2014. Hands-On Programming with R: Write Your Own Functions and Simulations. Sebastopol, CA: O'Reilly Media.

Hebb, Donald O. 1949. The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.

Hopfield, John J. 1982. "Neural Networks and Physical Systems with Emergent Collective Computational Abilities." Proceedings of the National Academy of Sciences 79 (8): 2554–58.

McClelland, James L., David E. Rumelhart, and the PDP Research Group. 1987. Parallel Distributed Processing, Volume 2: Explorations in the Microstructure of Cognition: Psychological and Biological Models. Cambridge, MA: MIT Press.

Minsky, Marvin, and Seymour A. Papert. 1969. Perceptrons. Cambridge, MA: MIT Press.

R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Rosenblatt, Frank. 1958. "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain." Psychological Review 65 (6): 386.

Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. 1986. "Learning Representations by Back-Propagating Errors." Nature 323 (6088): 533–36.

Rumelhart, David E., James L. McClelland, and the PDP Research Group. 1986. Parallel Distributed Processing, Volume 1: Explorations in the Microstructure of Cognition: Foundations. Cambridge, MA: MIT Press.

Treves, Alessandro, and Edmund T. Rolls. 1994. "Computational Analysis of the Role of the Hippocampus in Memory." Hippocampus 4 (3): 374–91.

Von der Malsburg, Chr. 1973. "Self-Organization of Orientation Sensitive Cells in the Striate Cortex." Kybernetik 14 (2): 85–100.

Waddington, Conrad Hall. 1957. The Strategy of the Genes: A Discussion of Some Aspects of Theoretical Biology. London: Allen & Unwin.

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science. Sebastopol, CA: O'Reilly Media.

Widrow, Bernard, and Marcian E. Hoff. 1988. "Adaptive Switching Circuits." In Neurocomputing: Foundations of Research, 123–34.