AI’s history is longer and more uneven than is often assumed.
The field traces its intellectual roots to Alan Turing’s 1950 paper “Computing Machinery and Intelligence”, which opens with the question “Can machines think?”. In 1951, researchers ran the first working AI programs on the Ferranti Mark 1 at the University of Manchester. In 1956, a team at Carnegie Mellon University in Pittsburgh produced the “Logic Theorist”, the first program built to perform automated reasoning.
Attention then shifted toward expert systems. Development began in 1964 and culminated in 1969 with DENDRAL, which helped organic chemists identify unknown organic compounds from mass-spectrometry data. Its performance depended on rules extracted from human laboratory experience. In the early 1970s, MYCIN, written in the LISP language, extended this approach to diagnosing bacterial infections and recommending treatment. It worked best in the hands of skilled doctors who could resolve uncertainty by ordering further tests.
These successes prompted widespread adoption of expert systems throughout the 1970s and early 1980s. Most followed a common architecture based on a knowledge base and an inference engine. By the mid-1980s, this approach had reached its limits. Capturing expert knowledge in rule form proved difficult, and systems became increasingly complex. The interaction of forward chaining and backward chaining made them powerful but fragile, contributing to the decline of the first expert system era.
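To make that architecture concrete, here is a minimal sketch of a forward-chaining inference engine in Python. The facts and rules are invented purely for illustration and are not drawn from any historical system.

```python
# Toy forward-chaining inference engine: a knowledge base of facts plus
# if-then rules, applied repeatedly until no new facts can be derived.
# All facts and rules here are invented for illustration.

facts = {"fever", "cough"}

# Each rule pairs a set of required facts with the fact it concludes.
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]

changed = True
while changed:                          # keep chaining until no rule fires
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)       # the rule fires; a new fact joins the knowledge base
            changed = True

print(facts)  # {'fever', 'cough', 'flu_suspected', 'recommend_rest'}
```

Backward chaining works the same rule base in reverse, starting from a goal and searching for facts that would justify it.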
Expert systems were ultimately seen as slow and expensive to develop. Even so, they remain valuable in settings where transparency and review are required, such as credit scoring and job application screening, and in situations where fast execution is essential, including self-driving vehicle systems.
In parallel, two other experimental approaches were pursued, often by researchers working outside engineering. In 1957, Frank Rosenblatt, a psychologist, introduced the “Perceptron”, an early attempt to build a system capable of learning by trial and error. This connectionist model was among the first artificial neural networks. It showed limited success in simple image-recognition tasks – the cat-versus-dog sorting problem is the standard illustration – but failed to secure continued funding. Another early initiative focused on machine translation, yet poor performance led to its termination in 1966.
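The Perceptron’s trial-and-error learning can be sketched in a few lines; the two-feature dataset below is invented for the example and has nothing to do with Rosenblatt’s original hardware.

```python
import numpy as np

# Toy perceptron: weights are nudged whenever a prediction is wrong,
# which is the "trial and error" learning described above.
# The four data points and their labels are invented for illustration.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])            # two linearly separable classes

w = np.zeros(2)                          # weights start at zero
b = 0.0                                  # so does the bias
for _ in range(10):                      # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:    # misclassified example
            w += yi * xi                     # nudge the boundary toward the correct side
            b += yi

print(w, b)                              # a separating line for the toy data
```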
These disappointments, combined with poorly structured data, hardware constraints, and limited processing power, led to a loss of confidence and funding. This period of decline, from 1974 to 1980, became known as the First AI Winter.
A cautious revival followed with explanation-based learning. This approach depended on a human expert who explained how a specific case could support a general rule. It enabled faster systems, especially when data was sparse or unreliable, but remained constrained by the quality of human input. Its strengths were focus, efficiency, and traceability, which made it suitable for use in legal reasoning systems.
In 1983, further progress was made with the development of an effective recurrent neural network. Repeated activation between connected units strengthened their mutual influence, allowing patterns of activity to stabilise in a manner comparable to learning.
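The network in question is not identified above, so purely as an illustration of activity stabilising through repeated activation, here is a minimal Hopfield-style sketch with an invented six-unit pattern.

```python
import numpy as np

# Hopfield-style recurrent network: one pattern is stored in the weights,
# and repeated activation pulls a corrupted input back to the stored pattern.
# The pattern and its noisy copy are invented for illustration.
pattern = np.array([1, -1, 1, -1, 1, -1])
W = np.outer(pattern, pattern).astype(float)   # Hebbian-style weights
np.fill_diagonal(W, 0)                         # no self-connections

state = np.array([1, 1, 1, -1, 1, -1])         # the stored pattern with one unit flipped
for _ in range(5):                             # repeated activation between the units
    state = np.where(W @ state >= 0, 1, -1)    # each unit follows its weighted inputs

print(np.array_equal(state, pattern))          # True: activity has stabilised on the pattern
```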
These newer learning approaches made the earlier rule-based systems look increasingly obsolete, and in 1987 the market for the specialised LISP machines on which many expert systems ran collapsed, marking the start of the “Second AI Winter” (1987 – 2000).
Development did, of course, continue during this so-called “winter”. Of note were Support Vector Machines, designed to improve data classification and regression analysis, of which the dog/cat sorting is a trivial example. In 1995 came “Random Forests” – ensembles of decision trees, each built from a random subset of the presented data. Each tree makes its own prediction, and the results are averaged, or a majority vote is taken on the classification, to give the final result.

Shortly afterwards, in 1997, the oddly named Long Short-Term Memory (LSTM) units were incorporated into recurrent neural networks, which, left unmodified, tend to lose information that might be helpful later in the learning process. An LSTM unit decides what to retain and what to forget, and can hold retained information over hundreds of time steps – far longer than the conventional “short-term memory” of a plain recurrent network. The practical effect was greater sensitivity to intermediate results judged likely to be useful later, at the expense of those judged expendable, and it showed up as improved speed and accuracy.
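As a rough sketch of the ensemble-and-vote idea behind Random Forests (the dataset is synthetic and the tree settings arbitrary):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Bootstrap several small decision trees on random subsets of the data,
# then take a majority vote - the core idea behind a random forest.
# The synthetic dataset and parameter choices are illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)            # invented two-class labels

trees = []
for _ in range(25):                                # 25 trees in the ensemble
    idx = rng.choice(len(X), size=len(X), replace=True)   # random subset of the data
    trees.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))

votes = np.stack([tree.predict(X) for tree in trees])    # one prediction per tree
majority = (votes.mean(axis=0) > 0.5).astype(int)        # majority vote on the class

print((majority == y).mean())                            # ensemble accuracy on the toy data
```

scikit-learn’s RandomForestClassifier packages the same idea, additionally randomising the features considered at each split.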
During the next 13 years, development concentrated on market-testing robotic products such as the Roomba, a domestic robot vacuum cleaner, and robotic grass cutters, while image processing was improved to sort and standardise machine-readable images from large databases and for use in self-driving vehicles.
2009 saw the first LSTM recurrent neural network combined with pattern-recognition software, which enabled cursive handwriting to be read, and Google began building an “autonomous car”. At the same time, the ImageNet visual dataset of 14 million hand-annotated images was produced by a team of 49,000 contributors from 167 countries, working from a base of 162 million candidate images!
The decade that followed brought fast-moving innovation. In 2013, Google significantly improved natural language processing, helping to establish the chatbot as a practical tool. Generative AI then expanded machine output beyond prediction to prompt-led content creation, producing fluent and often convincing text. Meanwhile, the now familiar AI-generated images from OpenAI’s DALL-E models became part of everyday digital culture.
In 2017, OpenAI released influential research on generative models. At the same time, Google DeepMind in London refined AlphaGo into AlphaGo Zero, which learned to play Go entirely through self-play. The same approach, generalised as AlphaZero, was applied to chess, with similarly strong results against both machines and human opponents.
In 2018, Google Duplex showed that an AI assistant could make telephone bookings in real-world settings. The following year saw the launch of OpenAI’s GPT-2, the first widely noticed model in the GPT series that would later culminate in GPT-4; these models were widely praised despite continued concerns about fabricated responses inherited from earlier versions.
ChatGPT was released in November 2022 and quickly became a public reference point for AI. Its ongoing hallucination problems sparked political debate. In parallel, a wave of legal challenges emerged against newer AI companies, often centred on copyright infringement and the unauthorised use of private or personal data in training sets.
Late in 2023, President Joe Biden issued an executive order defining eight goals for ethical AI development in the United States. These included protecting national interests, respecting copyright, safeguarding personal data, and ensuring AI systems were accurate and non-discriminatory.
This order was rescinded by Donald Trump on his first day in office, removing regulatory obligations for US AI firms. This is likely to deepen legal conflict between major companies while discouraging public challenge.
On 10 and 11 February 2025, France hosted the Artificial Intelligence Action Summit. Sixty-one countries signed a declaration supporting inclusive and sustainable AI. The UK and the US declined. Anglo-Saxon exceptionalism?