The author investigates the use of
evolutionary computation and artificial neural networks for pattern
recognition of microarray data. The combination of these methods holds
great promise for automated feature selection and data analysis.
| Oct
1, 2005 |
| By:
Gary
B. Fogel |
| Pharmaceutical
Discovery |
|

Gary B. Fogel
|
The ever-increasing application of microarrays for gene expression
analysis has led to similar expansion in the number of algorithms and
methods used for data interpretation. Microarray techniques make it
possible to monitor the expression of thousands of genes simultaneously
from a given cell type. Differential expression between these cell types
(e.g., benign and cancerous) can lead to the development of markers useful
for classification, where each experiment monitors thousands of potential
markers that differ over time. A central problem in the microarray
community is the reduction of these potential markers into reduced sets
that retain high predictive accuracy for cell-type classification.
Accurate diagnosis of cancer type may be essential because the treatments,
responses, and prognoses vary significantly depending on the diagnosis. In
addition, methods that work well for cell-type classification should also
work well to predict prognosis and diagnosis if given the appropriate
input data.
The problem of cancer classification has been divided into two related
but separate challenges: class prediction and class discovery (1). Class
prediction refers the assignment of samples to one of several
previously defined classes. Class discovery refers to defining a
previously unrecognized tumor subtype(s) in expression data. Both of these
tasks are challenging and require computational assistance. Class
prediction via cluster analysis is typically used to infer the function of
novel genes by grouping them with genes of well-known functionality in
gene expression profiling. Genes that show similar activity patterns (said
to be coexpressed) are often related functionally and are
controlled by the same mechanisms of regulation (said to be coregulated).
A major obstacle to the eventual utility of microarrays is the lack of
efficient methods for cataloging the data into coexpressed groups. This
article highlights recent advancements in the coupling of evolutionary
computation and neural networks for microarray class prediction and
discovery.
Neural Networks
Computational intelligence is commonly defined as the accumulation of
artificial neural networks, evolutionary computation, and fuzzy logic (2)
and their combination. Neural networks have been noted elsewhere in the
literature as particularly useful for microarray data clustering and
classification. For instance, Khan et al. (3) demonstrated the
robustness of this technique, providing evidence to suggest that neural
networks could accurately classify data on the expression profiles of 96
transcripts characterizing four distinct subtypes of small round blue cell
tumors. Misclassification of cell type was minimized to zero when a set of
96 highest-ranked genes was used. Of the 96 genes, 41 were not identified
previously as having any association with these 4 cancer types.
While it is clear that neural-network methods are well suited to
microarray analysis, their proper training and optimization is a
prerequisite for superior performance. A standard approach to neural
network training is the use of backpropagation to optimize the weight
assignments for a fixed neural network topology. This approach generally
forces the user to choose the appropriate number of features to use and a
fixed neural network topology. Backpropagation itself can also lead to
suboptimal weight assignment if there are many local optima in the search
space. Optimizing neural networks with stochastic optimization methods
such as evolutionary computation, however, can outperform these classic
methods by avoiding local optima and simultaneously identifying the most
appropriate features to use for prediction.
Evolutionary Computation
Natural evolution is a population-based optimization process. The three
main avenues of research in simulated evolution (evolutionary programming
(EP), evolution strategies (ES), and genetic algorithms (GA)), are broadly
similar in that each maintains a population of trial solutions, imposes
random changes on those solutions, and incorporates selection to determine
which solutions to maintain into future generations and which to remove
from the pool of trials (4).

Figure 1. Evolution of neural
networks for microarray analysis. Statistical feature assessment
helps reduce the feature space to the set of features most
relevant to the predictive outcome. Microarray data is then
divided into training, testing, and validation sets for the
purpose of evolving neural networks. Evolution of neural networks
is performed over the training examples, with assay on the testing
examples at repeated intervals during the evolution so as to
monitor possible overtraining. The best evolved neural network
from the iterated loop of selection and variation is used for
processing on validation set data as an example of a real-world
setting.
|
In the context of microarray analysis,
the approach generally proceeds as follows (Figure 1). Standard
statistical analysis methods are used to prune features from the
microarray data that have least correlation to the desired class
prediction. The typical user wishes to identify a model that has highest
predictive accuracy in class prediction but also uses a small subset of
the available features (gene expressions). A training set of examples is
used where the true cell type is known. Neural networks are then evolved
using this training data to minimize the mean squared error (MSE) relative
to known examples relating input patterns of gene expression to an output
decision of class prediction (or other decision such as patient diagnosis,
small molecule toxicology, etc.). A random choice of model parameters over
a feasible range is used to generate an initial population of neural
network solutions. Each of these solutions can have a different topology,
weight assignment for the connections within the neural network, features
used as input, etc (Figure 2). Each neural network is then scored with
respect to MSE (or other suitable scoring function). Neural networks of
sufficient worth in the population are retained as "parents" and
those of insufficient worth are culled from the population.
"Offspring" neural networks are then derived from the parent
neural networks using variation (mutation or recombination),
reconstituting the entire population of neural networks. All solutions in
this generation are scored once again with respect to the fitness function
and the process is iterated until either a sufficient number of
generations of evolution have been completed or until overtraining is
observed on a held-out set of testing examples. Similar approaches have
been applied to problems in breast cancer detection (both from histologic
and radiographic data), pharmaceutical design, structure prediction,
quantitative structure-activity analysis, gene recognition, and a variety
of industrial settings. Fogel and Corne (5) provide additional information
for the application of these algorithms to problems in bioinformatics and
biochemistry.
Evolved Neural Networks for Class
Prediction

Figure 2. Evolution of neural
networks can simultaneously adjust weights on the connections,
number and type of connections, number and type of nodes,
processing elements, etc. to produce a superior neural network
topology. Optimized feature sets can be generated with a reduction
of the genes used as input to the network.
|
Optimization of neural networks with
simulated evolution also allows simultaneous optimization of the neural
network topology. This significantly increases the versatility of the
approach, especially when the researcher would like to reduce the feature
space to only the most important features for accurate prediction. After
sufficient computation, the best-evolved neural network can be examined to
determine which features were used for classification.
Keedwell and Narayanan (6) used
evolutionary algorithms to identify gene expression combinations useful
for classification and evaluated their predictive accuracy through an
artificial neural network. Others have used a combination of evolutionary
algorithms and neural networks simultaneously to optimize the structure
and weights of neural networks for classification based on gene
expression. Ooi and Tan (7) developed a similar hybridized approach for
this problem, using an evolutionary algorithm to determine the optimal
number and set of gene expressions that maximized classification using a
maximum likelihood method. To develop such a classifier, they separated
the data into training and testing sets and used a discriminant function
(in this case maximum likelihood) to assign samples to the class with the
highest conditional probability. The evolutionary algorithm then was used
to search for subsets of genes that satisfied this discriminant function
in the training set. The worth of the overall approach was examined by
prediction on a test set corresponding to nine tumor types. The approach
generated higher classification accuracies than other published predictive
methods on the same multi-class test dataset.
Automated Class Discovery
Hierarchical clustering is the most
commonly used approach to cluster microarray data in the hopes of
identifying previously known or unknown classes of similar gene
expression. Hierarchical clustering utilizes a binary tree, with similar
expression patterns clustered in nested subsets. Several authors have
noted, however, that the method of hierarchical clustering can depend on
the order of the input data: some clusters appear to be the result of
local rather than global clustering decisions.
It is desirable to have an algorithm
that can determine the correct number of classes in the data on its own
without a priori knowledge. One way to do this is through a
self-organizing tree algorithm that can produce a binary tree output
independent on the data order. K-means clustering is a completely
unstructured approach that proceeds in an entirely local fashion and
produces an unorganized collection of clusters but may not be conducive to
easy user interpretation. Bayesian clustering is a highly structured
approach that is appropriate in where a strong prior distribution of the
data is available, but this is uncommon in microarray data, especially
where class discovery is concerned.
Evolutionary computation (without
neural networks) can be used to group similar gene expression patterns
together in an attempt to determine the correct number of classes in the
data. For this purpose, microarray data can be represented as a fully
connected weighted graph with a similarity measure. The vertices of the
graph correspond to the objects to be clustered (gene expression profiles)
and the edges of the graph correspond to similarity between these objects.
A suitable fitness function that maximizes inner-cluster similarity and
simultaneously maximizes intra-cluster difference can be used to evolve
and optimize the placement of incisional hyperplanes that separate the
objects in the weighted graph. The resulting output represents a best
clustering of the data. When tested relative to standard methods such as
self-organizing maps for this purpose, this approach is quite competitive.
Conclusion
As the number of microarray
experiments continues to increase drastically and as these techniques are
becoming more and more a part of personalized healthcare, computational
methods to support this expansion must also occur. Two central problems in
gene expression analysis are class prediction (where the number of classes
is already known a priori) and class discovery (where the number of
classes is not known a priori). Given that the researcher likely has
hundreds or thousands of genes that can be related with statistical
confidence in a linear fashion to class prediction, a central problem is
automatic reduction of that feature space while simultaneously maximizing
predictive accuracy. The resulting model may include both linear and
nonlinear interactions of gene expression with time for successful
prediction. Neural networks are favored in this regard because of their
ability to model both linear and nonlinear interactions and for their
proven ability for pattern recognition in a variety of contexts. Proper
optimization of the neural network can be a difficult task, however. And
in the case of microarray analysis, proper optimization includes not only
weight assignment, but also feature selection and reduction. Evolutionary
algorithms have recently been used to optimize neural networks where the
features are reduced simultaneously with weight assignment optimization on
training examples. The best-evolved neural networks can then be assayed
relative to held-out examples in testing or validation sets, and then, if
sufficient, put to market with kits associated with class prediction. In
addition, in cases where the precise number of classes may not be known,
evolutionary computation can be used to provide an approximation. The best
resulting clusters from this approach can be used as the output classes to
be predicted using a neural network approach.
It is clear that the healthcare
industry requires methods to rapidly transition microarray data into
practical use. A key roadblock remains the discovery of reduced feature
sets that still retain high accuracy in classification. Combinations of
computational intelligence approaches as indicated above hold promise for
rapid, automated, feature selection and pattern recognition for a wide
assortment of data. Combinations of evolutionary computation, neural
networks, and fuzzy logic will play an increasingly significant role in
the areas of gene expression analysis and gene network reconstruction in
the near future.
Gary B. Fogel is Vice
President of Natural Selection, Inc. and can be reached at 3333 N. Torrey
Pines Ct., Suite 200, La Jolla, CA 92037 USA, gfogel@natural-selection.com
References
1. T.R. Golub et al. Science 286
, 531-537 (1999).
2. See the IEEE Computational
Intelligence Society website at www.ieee.org
3. J. Khan et al. Nature Med. 7,
673-679 (2001).
4. D.B. Fogel (2000), Evolutionary
Computation: Toward a New Philosophy of Machine Intelligence, IEEE
Press, Piscataway, NJ.
5. G.B. Fogel and D.W. Corne (eds.),
(2003) Evolutionary Computation in Bioinformatics. Morgan Kauffman,
San Francisco.
6. E. Keedwell and A. Narayanan, Applications
of Evolutionary Computing: EvoWorkshops 2003, pp. 76-86 (2003).
7. C.H. Ooi and P. Tan, Bioinformatics
19 , 37-44 (2003).
|