Friday, January 12, 2007
Wednesday, January 10, 2007
Tuesday, January 9, 2007
Neural network
A neural network is a computing paradigm loosely modeled on cortical structures of the brain. It consists of interconnected processing elements, called neurons, that work together to produce an output. The output of a neural network relies on the cooperation of the individual neurons within the network. Information is often processed in parallel rather than in series (sequentially). Since a neural network relies on its member neurons collectively to perform its function, it can often still perform its overall function even if some of the neurons are not functioning. That is, such networks are robust to error or failure (i.e., fault tolerant).
The term neural network is also sometimes used to refer to a branch of computational science that uses neural networks as models to simulate or analyze complex phenomena, or that studies the principles of operation of neural networks analytically. It addresses problems similar to those of artificial intelligence (AI), except that AI uses traditional computational algorithms to solve problems whereas neural networks use 'networks of agents' (software or hardware entities linked together) as the computational architecture. Well-designed neural networks are trainable systems that can often "learn" to solve complex problems from a set of exemplars and generalize the "acquired knowledge" to solve unforeseen problems; that is, they are self-adaptive systems.
Traditionally, the term neural network referred to a network of biological neurons. In modern usage, it often refers to artificial neural networks, which are composed of artificial neurons. Thus the term 'neural network' has two distinct connotations:
Biological neural networks are made up of real biological neurons that are connected or functionally related in the peripheral nervous system or the central nervous system. In neuroscience, they are often identified in laboratory analysis as groups of neurons that perform a specific physiological function.
Artificial neural networks are made up of interconnected artificial neurons (usually simplified neurons) designed to model (or mimic) some properties of biological neural networks. They can be used to model the modes of operation of biological neural networks. By contrast, cognitive models are theoretical models that mimic cognitive brain functions without necessarily using neural networks, and artificial intelligence consists of well-crafted algorithms that solve specific problems requiring intelligence (such as chess playing or pattern recognition) without using a neural network as the computational architecture.
Please see the corresponding articles for details on artificial neural networks or biological neural networks. This article focuses on the relationship between the two concepts.
Characterization
In general, a biological neural network is composed of a group or groups of physically connected or functionally associated neurons. A single neuron can be connected to many other neurons, and the total number of neurons and connections in a network can be extremely large. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic microcircuits [Arbib, p.666] and other connections are possible. Apart from electrical signalling, there are other forms of signalling that arise from neurotransmitter diffusion and that affect electrical signalling. As such, neural networks are extremely complex. While a detailed description of neural systems currently seems unattainable, progress is being made towards a better understanding of basic mechanisms.
Artificial intelligence and cognitive modeling try to simulate some properties of neural networks. While similar in their techniques, the former has the aim of solving particular tasks, while the latter aims to build mathematical models of biological neural systems.
In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image analysis and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots. Most of the currently employed artificial neural networks for artificial intelligence are based on statistical estimation, optimisation and control theory.
The cognitive modelling field is concerned with the physical or mathematical modelling of the behaviour of neural systems, ranging from the individual neural level (e.g. modelling the spike response curves of neurons to a stimulus), through the neural cluster level (e.g. modelling the release and effects of dopamine in the basal ganglia), to the complete organism (e.g. behavioural modelling of the organism's response to stimuli).
The brain, neural networks and computers
While historically the brain has been viewed as a type of computer, and vice versa, this is true only in the loosest sense. Computers do not provide an accurate hardware model of the brain (even though it is possible to describe a logical process as a computer program, or to simulate a brain using a computer), as they do not possess the parallel processing architectures that have been described in the brain. Even in multiprocessor computers, functions are not nearly as distributed as in the brain.
Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and the brain's biological architecture is very much debated. To address this debate, Marr proposed various levels of analysis, which provide a plausible account of the role of neural networks in the understanding of human cognitive functioning.
What degree of complexity and which properties individual neural elements must have in order to reproduce something resembling animal intelligence is a subject of current research in theoretical neuroscience.
Historically, computers evolved from the von Neumann architecture, which is based on sequential processing and execution of explicit instructions. Neural networks, on the other hand, originated in efforts to model information processing in biological systems, which rely primarily on parallel processing and on implicit instructions based on recognition of patterns of 'sensory' input from external sources. In other words, rather than performing sequential processing and execution, neural networks are at heart complex statistical processors.
Neural networks and Artificial intelligence
An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just a neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
Background
An artificial neural network involves a network of simple processing elements (neurons) which can exhibit complex global behaviour, determined by the connections between the processing elements and element parameters.
In a neural network model, simple nodes (called variously "neurons", "neurodes", "PEs" ("processing elements") or "units") are connected together to form a network of nodes — hence the term "neural network". While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.
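As a rough illustration of nodes joined by weighted connections, the following minimal sketch passes a signal through a tiny network and then alters one connection strength. The logistic squashing function, the 2-2-1 layout, the function names and all numeric values are assumptions chosen for the example; they do not come from the text above.

```python
import math

def unit(inputs, weights, bias):
    """One node: weighted sum of its inputs, squashed by a logistic function (an assumed choice)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))

def propagate(inputs, layers):
    """Pass a signal through the network; each layer is a list of (weights, bias) nodes."""
    signal = list(inputs)
    for layer in layers:
        signal = [unit(signal, weights, bias) for (weights, bias) in layer]
    return signal

# Hypothetical 2-2-1 network; the numbers are arbitrary illustrations.
network = [
    [([1.0, -1.0], 0.0), ([-1.0, 1.0], 0.0)],  # hidden layer: two nodes
    [([2.0, 2.0], -1.0)],                      # output layer: one node
]

before = propagate([1.0, 0.0], network)[0]
network[1][0] = ([2.0, -2.0], -1.0)            # alter the strength of one connection
after = propagate([1.0, 0.0], network)[0]
```

Changing a single weight changes the signal that reaches the output, which is the handle that learning algorithms use; here the alteration is made by hand rather than by any training rule.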
In modern software implementations of artificial neural networks, the approach inspired by biology has more or less been abandoned in favour of a more practical approach based on statistics and signal processing. In some of these systems, neural networks or parts of neural networks (such as artificial neurons) are used as components in larger systems that combine both adaptive and non-adaptive elements.
Applications
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.
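To make "inferring a function from observations" concrete, here is a minimal sketch in which a single linear unit is fitted to a handful of observations by gradient descent on the mean squared error. The toy data (generated from y = 2x + 1), the learning rate and the iteration count are assumptions for illustration; a practical network would use many non-linear units.

```python
# Observations of an unknown relationship (secretly y = 2x + 1, an assumed toy target).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0   # parameters of a single linear unit: y_hat = w * x + b
rate = 0.05       # hypothetical learning rate

for _ in range(5000):
    # Gradient of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= rate * grad_w
    b -= rate * grad_b

prediction = w * 5.0 + b   # apply the inferred function to an input that was not observed
```

Nothing in the loop encodes the rule y = 2x + 1; the parameters are recovered from the observations alone, and the fitted unit can then be applied to unseen inputs.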
Real life applications
The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
Function approximation, or regression analysis, including time series prediction and modelling.
Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
Data processing, including filtering, clustering, blind signal separation and compression.
Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition and more), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualisation and e-mail spam filtering.
History of the neural network analogy
The concept of neural networks started in the late 1800s as an effort to describe how the human mind works. These ideas began to be applied to computational models with the Perceptron.
In the early 1950s, Friedrich Hayek was one of the first to posit the idea of spontaneous order in the brain arising out of decentralized networks of simple units (neurons). In the late 1940s, Donald Hebb made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning), now known as Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule, and it (and variants of it) was an early model for long-term potentiation.
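The text does not give a formula for Hebbian learning. A common textbook form, assumed here, strengthens each weight in proportion to the product of its input and the unit's output. The sketch below applies that update to one linear unit; the function name, learning rate and input pattern are hypothetical.

```python
def hebbian_step(weights, x, rate=0.1):
    """One Hebbian update: each weight grows in proportion to (its input * the unit's output)."""
    y = sum(w * xi for w, xi in zip(weights, x))   # output of a linear unit
    return [w + rate * xi * y for w, xi in zip(weights, x)]

weights = [0.1, 0.1, 0.1]
pattern = [1.0, 1.0, 0.0]   # inputs 0 and 1 are active together; input 2 is silent

for _ in range(5):
    weights = hebbian_step(weights, pattern)
```

No target signal appears anywhere in the update, which is why the rule counts as unsupervised. In this plain form the weights of co-active inputs grow without bound, which is one reason variants that normalise the weights are used in practice.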
The Perceptron is essentially a linear classifier, specified by parameters w and b and an output function f = w'x + b. Its parameters are adapted with an ad hoc rule similar to stochastic steepest gradient descent. Because the inner product is a linear operator in the input space, the Perceptron can only perfectly classify a set of data for which the different classes are linearly separable in the input space, and it often fails completely for non-separable data. While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.
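The separability limit can be seen in a minimal sketch of the classic perceptron update rule applied to f = w'x + b. The function names, the +1/-1 label encoding, the learning rate and the two toy data sets (logical AND and XOR) are assumptions for illustration.

```python
def train_perceptron(data, epochs=20, rate=1.0):
    """Perceptron rule: after each misclassified example, move w and b toward its target."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:                      # target is +1 or -1
            f = w[0] * x[0] + w[1] * x[1] + b       # f = w'x + b
            predicted = 1 if f > 0 else -1
            if predicted != target:
                w = [w[0] + rate * target * x[0], w[1] + rate * target * x[1]]
                b += rate * target
    return w, b

def accuracy(data, w, b):
    """Fraction of examples whose sign of f matches the target."""
    hits = sum(1 for x, t in data
               if (1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1) == t)
    return hits / len(data)

# AND is linearly separable; XOR is not.
AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

w_and, b_and = train_perceptron(AND)
w_xor, b_xor = train_perceptron(XOR)
```

On AND the rule settles on a separating line and classifies all four points correctly. On XOR no choice of w and b can do so, so the accuracy stays below 1.0 however long training runs.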
The Cognitron (1975) was an early multilayered neural network with a training algorithm. The structure of the network and the methods used to set the interconnection weights vary from one neural strategy to another, each with its advantages and disadvantages. Networks can propagate information in one direction only, or they can bounce it back and forth until self-activation at a node occurs and the network settles on a final state. Bi-directional flow of inputs between neurons/nodes was introduced with the Hopfield network (1982), and specialization of these node layers for specific purposes was introduced through the first hybrid network.
The parallel distributed processing of the mid-1980s became popular under the name connectionism.
The backpropagation network was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986. The original network used multiple layers of weighted-sum units of the type f = g(w'x + b), where g was a sigmoid function, such as the logistic function used in logistic regression. Training was done by a form of stochastic steepest gradient descent. Applying the chain rule of differentiation to derive the parameter updates results in an algorithm that seems to 'backpropagate errors', hence the name. However, it is essentially a form of gradient descent. Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to give the solution without a good starting point. In recent times, networks with the same architecture as the backpropagation network are referred to as Multi-Layer Perceptrons. This name does not impose any limitations on the type of algorithm used for learning.
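The chain-rule bookkeeping can be sketched for a small network of units f = g(w'x + b) with a logistic g. This is an illustrative reconstruction, not the 1986 implementation: the squared-error loss, the 2-3-1 layout, the dictionary representation, the seed, the learning rate and the XOR training set are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(net, x):
    """Hidden activations h and output y, each unit computing g(w'x + b)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(net["W1"], net["b1"])]
    y = sigmoid(sum(w * hi for w, hi in zip(net["W2"], h)) + net["b2"])
    return h, y

def loss(net, x, t):
    """Squared error on one example (an assumed choice of loss)."""
    _, y = forward(net, x)
    return 0.5 * (y - t) ** 2

def backprop(net, x, t):
    """Chain rule, applied from the output error back to the first-layer weights."""
    h, y = forward(net, x)
    delta_out = (y - t) * y * (1.0 - y)             # dE/d(output pre-activation)
    delta_hid = [delta_out * w * hi * (1.0 - hi)    # error passed back through W2
                 for w, hi in zip(net["W2"], h)]
    return {
        "W1": [[d * xi for xi in x] for d in delta_hid],
        "b1": delta_hid,
        "W2": [delta_out * hi for hi in h],
        "b2": delta_out,
    }

def sgd_step(net, x, t, rate):
    """One stochastic gradient descent update on a single example."""
    g = backprop(net, x, t)
    net["W1"] = [[w - rate * gw for w, gw in zip(ws, gws)]
                 for ws, gws in zip(net["W1"], g["W1"])]
    net["b1"] = [b - rate * gb for b, gb in zip(net["b1"], g["b1"])]
    net["W2"] = [w - rate * gw for w, gw in zip(net["W2"], g["W2"])]
    net["b2"] -= rate * g["b2"]

rng = random.Random(0)   # fixed seed, chosen arbitrarily
net = {
    "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)],
    "b1": [0.0, 0.0, 0.0],
    "W2": [rng.uniform(-1, 1) for _ in range(3)],
    "b2": 0.0,
}
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

for _ in range(5000):
    x, t = rng.choice(XOR)   # "stochastic": one randomly drawn example per step
    sgd_step(net, x, t, rate=0.5)
```

The backward pass reuses delta_out when computing delta_hid, which is the sense in which errors appear to be propagated backwards. Whether this run ends in a good solution depends on the random starting weights, in line with the caveat above about needing a good starting point.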
The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.
In more recent times, neuroscientists have successfully made some associations between reinforcement learning and the dopamine system of reward. However, the role of this and other neuromodulators is still under active investigation.