SYN@PTICS® Lab: twenty-fifth birthday of the website (Neural Networks / Cognitive Systems / Lifelong Deep Learning Systems "on Chip")


My research, software development, and teaching activities in the field of Neural Networks and Adaptive and Cognitive Systems began in 1984, but this year the Syn@ptics website, which describes part of the history of these activities, turns "only" 25 years old (it went online in 1993).

This article is not meant to be a self-commemoration, or even an advertisement, but simply a thank-you to all those who believed in my research work in the field of neural networks and, above all, in the applicability of neural chips, even in the years when Artificial Intelligence went through a long winter, relegated to the academic world and held in very low regard by industry. During that winter, many of us, although driven by a passion for research in Neural Networks, Fuzzy Logic and Soft Computing in general, also had to devote ourselves to projects and developments not particularly related to Artificial Intelligence.

In 1993 I developed "NeurFuzz" (distributed as freeware for DOS), an Error Back Propagation MLP Neural Network trainer and C source code generator. The program included: "elliptic" fuzzy interfaces based on Fuzzy Logic (Lotfi Zadeh, Bart Kosko), learning with Genetic Algorithms, learning with Simulated Annealing and "Thermoshock" (to overcome local minima), and multiple (up to 100) hidden layers (with very long training times on an Intel 386/486 when many hidden layers were used). With many layers, the limited RAM imposed storing the synapses in mass storage, with fast access through hashing algorithms.

Soon I realized that the Multilayer Perceptron with Error Back Propagation is a neural network suitable for being "programmed" through data, but not suitable for learning new data in real time. The need to cycle over all the previously learned data through training "epochs" is a far from negligible limitation, as the sketch below illustrates.
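The structural issue can be shown with a toy example in C (an illustrative sketch, not NeurFuzz code; the data, constants and function names are made up): even for a single neuron trained with a gradient/delta rule, learning one additional pattern without forgetting the previous ones means iterating again, for many epochs, over the whole data set.

```c
#include <stdio.h>

#define N_IN 2

typedef struct { double x[N_IN]; double target; } Pattern;

static double w[N_IN], bias;

/* Linear neuron output: weighted sum of the inputs plus bias. */
static double forward(const double *x)
{
    double s = bias;
    for (int i = 0; i < N_IN; ++i)
        s += w[i] * x[i];
    return s;
}

/* One training epoch = one pass over ALL patterns (delta rule). */
static void train_epoch(const Pattern *set, int n, double lr)
{
    for (int p = 0; p < n; ++p) {
        double err = set[p].target - forward(set[p].x);
        for (int i = 0; i < N_IN; ++i)
            w[i] += lr * err * set[p].x[i];
        bias += lr * err;
    }
}

int main(void)
{
    Pattern data[3] = { {{0, 0}, 0}, {{1, 0}, 1}, {{1, 1}, 0} };

    /* Initial training on the first two patterns only. */
    for (int e = 0; e < 1000; ++e)
        train_epoch(data, 2, 0.1);

    /* Learning data[2] later: training on it alone would corrupt the
     * weights learned for the other patterns, so the whole set must be
     * cycled again, epoch after epoch.                                 */
    for (int e = 0; e < 1000; ++e)
        train_epoch(data, 3, 0.1);

    printf("output for the new pattern: %.3f (target 0)\n",
           forward(data[2].x));
    return 0;
}
```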

My real interest was in neural networks able to learn new data without having to cycle over previously learned data. The problem of plasticity versus stability was addressed by Stephen Grossberg and Gail Carpenter with Adaptive Resonance Theory (ART). From this unsupervised learning model, supervised models were derived, among which I want to mention the simple and intuitive SFAM (Simplified Fuzzy ARTMAP).


Other neural network models based on prototype neurons, like the RBF (Radial Basis Function) network with the supervised RCE (Restricted Coulomb Energy) learning algorithm, were able to learn in an adaptive, non-competitive way by adding neural resources in response to "novelty detection" in the input data. The RCE learning algorithm was co-invented by the Nobel laureate Leon Cooper. At that time I realized that SFAM and RBF/RCE were classifiers suited to my main objective: building machines able to learn continuously while maintaining the stability of the acquired knowledge.

At the same time I learned that the RBF/RCE model had been implemented in some chips with on-chip learning capability. I therefore chose to work on these chips, because I thought they would be the future of Artificial Intelligence applications requiring continuous learning in embedded contexts. The learning algorithm executes in a few steps, and new data are learned without cycling over previously learned data, avoiding catastrophic forgetting; a minimal sketch of the mechanism follows.
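Here is an illustrative C sketch of the RBF/RCE idea (my own toy code, not the ZISC/NeuroMem firmware or API; dimensions, capacities and names are made up). Each prototype neuron holds a center, an influence radius and a category; a new labelled pattern is either already covered by a prototype of the right category, or it is committed as a new prototype, while wrongly firing prototypes simply shrink their radius. Everything happens in a single pass.

```c
#include <stdio.h>
#include <stdlib.h>

#define DIM       2      /* input vector size           */
#define MAX_PROTO 100    /* prototype (neuron) capacity */
#define R_MAX     64     /* initial influence radius    */

typedef struct { int center[DIM]; int radius; int category; } Proto;

static Proto net[MAX_PROTO];
static int   n_proto = 0;

/* L1 (Manhattan) distance between an input and a prototype center. */
static int l1_dist(const int *a, const int *b)
{
    int d = 0;
    for (int i = 0; i < DIM; ++i)
        d += abs(a[i] - b[i]);
    return d;
}

/* Recall: category of the closest firing prototype, -1 if none fires. */
static int classify(const int *x)
{
    int best_cat = -1, best_d = 1 << 30;
    for (int i = 0; i < n_proto; ++i) {
        int d = l1_dist(x, net[i].center);
        if (d <= net[i].radius && d < best_d) {
            best_d   = d;
            best_cat = net[i].category;
        }
    }
    return best_cat;
}

/* One-shot RCE learning of a single labelled pattern:
 * no epochs, no cycling over previously learned data. */
static void learn(const int *x, int category)
{
    int covered = 0;
    for (int i = 0; i < n_proto; ++i) {
        int d = l1_dist(x, net[i].center);
        if (d > net[i].radius) continue;          /* does not fire       */
        if (net[i].category == category)
            covered = 1;                          /* already recognised  */
        else
            net[i].radius = d > 0 ? d - 1 : 0;    /* shrink wrong proto  */
    }
    if (!covered && n_proto < MAX_PROTO) {        /* commit a new neuron */
        Proto *p = &net[n_proto++];
        for (int i = 0; i < DIM; ++i)
            p->center[i] = x[i];
        p->radius   = R_MAX;
        p->category = category;
    }
}

int main(void)
{
    int a[DIM] = { 10, 10 }, b[DIM] = { 50, 50 }, q[DIM] = { 12, 9 };
    learn(a, 1);                                  /* one step: class 1   */
    learn(b, 2);                                  /* one step: class 2   */
    printf("query -> class %d\n", classify(q));   /* expected: class 1   */
    return 0;
}
```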


Despite the difficulties of those years, I believed in these technologies, and in the 90s I presented projects describing SERVERS (what today we would call the CLOUD) based on neural chips, meeting the distrust of those who saw neural networks as confined to a few university laboratories. These projects proposed pattern recognition "on the cloud" and "context-based Internet search engines", based not on GPUs (Graphics Processing Units) but on neural chips such as the ZISC (Zero Instruction Set Computer), which implemented an RBF (Radial Basis Function) neural network with the RCE (Restricted Coulomb Energy) learning algorithm and had on-chip learning capabilities. Today this chip still exists, has 1000 neurons (instead of 36) and continues to be the only neural chip on the market with on-chip learning capabilities ... it has changed its name to NeuroMem®.


The early version of the RBF/RCE chip: the ZISC036 (Zero Instruction Set Computer). The name was derived from the evolution of CISC processors (Complex Instruction Set Computer) into RISC (Reduced Instruction Set Computer): ZISC was obviously the third generation, and the concept, even today, represents the future of "neural devices", since current GPUs (very useful for algorithmic research and cloud services) still process instruction flows on high-energy-consumption SIMD/MIMD architectures.

The new NeuroMem® chip by General Vision, implementing an RBF neural network with the on-chip RCE (Restricted Coulomb Energy) learning algorithm and 1000 prototype neurons.

In the 80s and 90s powerful GPUs (SIMD + MIMD) were not available, but there were accelerator boards (ISA/PCI) based on SIMD (Single Instruction Multiple Data) or MIMD (Multiple Instruction Multiple Data) processors with limited parallelism. ISA/PCI cards based on DSPs (Digital Signal Processors) were also widely used to accelerate software implementations of the MLP (Multilayer Perceptron), thanks to their ability to carry out many multiply-accumulate (MAC) operations, designed for the Fast Fourier Transform (FFT) but just as useful for multiplying input data by weights (synaptic values) and accumulating the result in the neuron.
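The primitive those DSP boards accelerated is simply the multiply-accumulate inside a neuron's weighted sum; here is a minimal, self-contained C illustration (my own example, not code from any of those boards):

```c
#include <stdio.h>

/* One MAC (multiply-accumulate) per synapse: the same primitive a DSP
 * uses for the FFT computes a neuron's weighted sum of its inputs.    */
static float neuron_sum(const float *x, const float *w, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc += x[i] * w[i];      /* multiply, then accumulate */
    return acc;
}

int main(void)
{
    float x[3] = { 0.5f, 1.0f, -0.2f };   /* input data       */
    float w[3] = { 0.8f, -0.3f, 0.6f };   /* synaptic weights */
    printf("weighted sum = %f\n", neuron_sum(x, w, 3));
    return 0;
}
```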


In the 90s there were several neuromorphic/accelerator chips and boards, which can be divided into the following classes:

ANALOG: 1) Intel ETANN, 2) Synaptics Silicon Retina, 3) Mesa Research Neuroclassifier, 4) Ricoh RN-200

HYBRID ANALOG-DIGITAL: 1) AT&T ANNA, 2) Bellcore CLNN-32

SLICE ARCHITECTURES: 1) Micro Devices MD-1220, 2) NeuraLogix NLX-420, 3) Philips Lneuro-1, 4) Philips Lneuro-2

SIMD: 1) Inova N64000, 2) Hecht-Nielsen HNC 100-NAP, 3) Hitachi WSI, 4) Neuricam Nc3003 Totem, 5) RC Module NM6403

SYSTOLIC ARRAY: 1) Siemens MA-16

RADIAL BASIS FUNCTION: 1) Nestor/Intel NI1000, 2) IBM ZISC36, 3) Silicon Recognition ZISC78

The only ones with built-in learning were: Bellcore CLNN-32 (Boltzmann), Ricoh RN-200 (Backprop), Hitachi WSI (2 versions: Hopfield/Backprop), Nestor/Intel NI1000 (RCE), IBM ZISC36 (RCE), Silicon Recognition ZISC78 (RCE).

I have always thought that a useful neural hardware device should have built-in learning capability. Otherwise, it is a device that can only be programmed from data "off-line" and cannot update itself and learn new data in the field.

I have believed in silicon implementations of neural networks such as the RBF (Radial Basis Function) with the RCE (Restricted Coulomb Energy) learning algorithm, able to learn online through an adaptive, non-competitive mechanism that guarantees both plasticity and stability. An RBF neural network with RCE can be considered a Lifelong Learning Neural Network, especially when implemented on a scalable architecture with on-chip learning and introspection of the prototype neurons. Being a classifier, the RBF/RCE network can be used as a "brick" for complex deep-learning projects.

Therefore, I also use this article to thank General Vision, which allowed me to work on adaptive machine-learning solutions with continuous learning, a scalable architecture and low energy consumption for embedded applications.

Lifelong Learning Machines (L2M) is the new DARPA program which, once again, has understood that current Machine Learning (ML) techniques are unable to adapt to new situations: they are, in fact, "programmed by data" solutions that cannot learn in the field. A special thanks to DARPA which, as always, stays a step ahead of technological fashions, trying to overcome limits that others often do not see. The speed with which DARPA does research, innovation and development depends a lot on the agility of its methodology, very result-oriented and little oriented towards documentation. The continuous turnover of Program Managers also guarantees innovation and new ideas. The negative side effect is that results already obtained are often forgotten, when they could instead be "bricks" or "starting points" for new and more ambitious programs (the RBF network with RCE learning implemented on silicon with on-chip learning capability is one example of a "brick" that could be used in the L2M program).


I would like to thank all those who have shown interest in the Probabilistic Adaptive Learning Mapper (PALM) and have asked interesting questions about statistical consistency, STM (Short Term Memory), LTM (Long Term Memory) and controlled forgetting through statistical synapses. For all those interested in the subject, I am working on an implementation of PALM on the NeuroMem® chip, whose multiple working contexts and introspection of the prototype neurons make the realization simple and without loss of performance. A scientific paper on the implementation will be published on ResearchGate. The improvements introduced by PALM compared to the basic RBF/RCE network are:

1) Continuous unsupervised learning (without feedback) that inherits from supervised learning

2) Continuous adaptivity

3) Statistical consistency

4) STM (Short Term Memory)

5) LTM (Long Term Memory)

6) Smart statistical management of forgetting

7) Smart statistical management of limited resources

8) Statistically controlled balance of plasticity versus stability


Although my interest has focused mainly on the RBF/RCE neural chips and their possible applications, years of research into efficient (one-step learning/execution) algorithms on Von Neumann machines led to the SHARP algorithm (Systolic Hebb Agnostic Resonance Perceptron). I thank all those on ResearchGate who downloaded and tested the source code I provided, verifying the possibility of mapping "If-Then" rules onto a neural network that could be implemented in pulsed hardware with delay-based synapses; a generic illustration of the idea follows. The readability of neural networks (which are typically black boxes) is also an important topic for DARPA, which in 2016 launched the XAI (Explainable Artificial Intelligence) program.
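As a generic illustration only (this is not the SHARP source code, and the rule and names are invented for the example), a rule such as "IF A AND B THEN C" can be mapped onto a single threshold neuron with unit weights and a threshold equal to the number of antecedents, so the neuron fires only when the whole premise is true:

```c
#include <stdio.h>

/* Threshold neuron encoding "IF A AND B THEN C":
 * each antecedent contributes weight 1, and the firing threshold
 * equals the number of antecedents.                               */
static int rule_and(const int *antecedents, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += antecedents[i];      /* weight 1 per antecedent */
    return sum >= n;                /* fires only if all true  */
}

int main(void)
{
    int premise[2] = { 1, 1 };                  /* A = true, B = true */
    printf("C = %d\n", rule_and(premise, 2));   /* prints C = 1       */
    return 0;
}
```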


To conclude, I want to thank the more than one hundred Italian students who, over the past twenty years, have chosen my open-access book on Neural Networks and Fuzzy Logic (in Italian) to prepare their theses.

One last consideration: I hope a new winter of artificial intelligence does not return! But for this to happen, new algorithms and new technologies are not enough. Even in the 80s there were important fields of applicability and sufficient technology for certain tasks. It is always the excess of expectations that creates inevitable negative feedback effects: a problem that seems to be ignored even by those who would have the "cultural authority" to express an opinion on the issue.

Luca Marchese

