Parallel Evolutionary Algorithms for Feature Selection in High Dimensional Datasets

Safa Adi #1, Mohammed Aldasht *2

# Computer and IT Department, Palestine Polytechnic University, Palestine
1 safa_adi@ppu.edu
* Computer Engineering Department, Palestine Polytechnic University, Palestine
2 mohammed@ppu.edu
Abstract—Feature selection in high-dimensional datasets is considered to be a complex and time-consuming problem. To enhance the accuracy of classification and reduce the execution time, Parallel Evolutionary Algorithms (PEAs) can be used. In this paper, we review the most recent works that address the use of PEAs for feature selection in large datasets. We classify the algorithms in these papers into four main classes: Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Scatter Search (SS), and Ant Colony Optimization (ACO). The accuracy is adopted as a measure to compare the efficiency of these PEAs. It is noticeable that the Parallel Genetic Algorithms (PGAs) are the most suitable algorithms for feature selection in large datasets, since they achieve the highest accuracy. On the other hand, we found that the parallel ACO is time-consuming and less accurate compared with the other PEAs.
Index Terms: Evolutionary algorithms, parallel computing, classification, feature selection, high dimensional dataset.
I. INTRODUCTION
Nowadays, many disciplines have to deal with high-dimensional datasets that involve a huge number of features, so data preprocessing methods and data reduction models are needed to simplify the input data.
There are two main types of data reduction models [1]. The first comprises instance selection and instance generation processes, which operate at the instance level (i.e., they select a representative portion of the data that can fulfill a data mining task as if the whole dataset were used) [14]. The second comprises feature selection and feature extraction models, which work at the level of characteristics. These models attempt to reduce a dataset by removing noisy, irrelevant, or redundant features. Feature selection is a necessary preprocessing step in analyzing big datasets. It often leads to smaller data that make classifier training better and faster [3].
Feature selection is challenging for big datasets. In order to make classification faster and more accurate, we need to select the subset of features that are most discriminative. Evolutionary algorithms such as genetic algorithms, swarm intelligence optimization, and ant colony optimization can be effective for this problem, but they require a huge amount of computation (long execution times) and considerable memory. Parallel computing can be used to overcome these weaknesses.
In this survey, we review a set of papers about parallel evolutionary algorithms used for feature selection in large datasets. Furthermore, we compare the performance of the different algorithms and environments.
The rest of the paper is organized as follows: Section II gives background about feature selection approaches and parallel architectures in general. Section III presents parallel evolutionary algorithms. Section IV discusses and reviews papers that tackle the feature selection problem using parallel computing. Section V contains the summary of the survey, and the last section presents the conclusion and future work.
II. BACKGROUND
In general, there are three classes of feature selection methods: filter-based, wrapper, and embedded. The filter approach analyzes the features statistically and ignores the classifier [18]. Most filter-based methods perform two operations, ranking and subset selection: in some cases these two operations are performed sequentially, first the ranking and then the selection; in other cases only the selection is carried out. These methods are effective in terms of execution time. However, filter methods sometimes select redundant variables, since they do not consider the relationships between variables; therefore, they are mainly used as a pre-processing step. In the wrapper model [15], the feature selection process depends on the performance of a specific classifier; its disadvantages are long execution times and overfitting. The last
method for feature selection is the embedded one, in which the feature selection process and the learning algorithm (tuning the parameters) are combined with each other [6, 15].
The selection of an optimal feature subset is an optimization problem that has proved to be NP-hard, complex, and time-consuming [13]. Two major approaches are traditionally used to tackle NP-hard problems, as seen in Figure 1: exact methods and metaheuristics. Exact methods allow the exact solution to be found, but this approach is impractical since it is extremely time-consuming for real-world problems. Metaheuristics, on the other hand, are used for solving complex real-world problems because they provide suboptimal (sometimes optimal) solutions in reasonable time [2, 11, 13].
As seen in Figure 1, metaheuristics are divided into two categories [13]:
• Trajectory-based (exploitation-oriented) methods: the well-known metaheuristic families based on the manipulation of a single solution. These include Simulated Annealing (SA), Tabu Search (TS), Iterated Local Search (ILS), Variable Neighborhood Search (VNS), and Greedy Randomized Adaptive Search Procedures (GRASP).
• Population-based (exploration-oriented) methods: the well-known metaheuristic families based on the manipulation of a population of solutions. These include PSO, ACO, SS, Evolutionary Algorithms (EAs), Differential Evolution (DE), Evolution Strategies (ES), and Estimation of Distribution Algorithms (EDA).
Fig. 1. Approaches for handling NP-hard problems
Metaheuristic algorithms have proved to be suitable tools for solving the feature selection problem accurately and efficiently for large dimensions in big datasets [2]. There are two main problems when dealing with big datasets. The first is execution time, because the complexity of the metaheuristic methods for feature selection is at least O(n² · D), where n is the number of instances and D is the number of features. The second is memory consumption, since most methods for feature selection need to store the whole dataset in memory. Therefore, researchers try to parallelize the sequential metaheuristics to improve their efficiency for feature selection on large datasets. There are many programming models and paradigms, such as MapReduce (Hadoop, Spark), MPI, OpenMP, and CUDA [1, 6, 13]. Parallel computing can be organized by process interaction (shared memory, message passing) or by problem decomposition (task or data parallelization) [6].
Parallel computing is a good solution for these problems, since many calculations are carried out simultaneously on tasks and/or data [6]. Population-based metaheuristics are naturally suited to parallelization, since most of their variation operators can easily be executed in parallel [2, 13].
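To make this point concrete, the following minimal sketch (our illustration, not taken from any of the surveyed papers) evaluates the fitness of a whole population concurrently with a Python process pool; the bit-vector encoding and the toy fitness function are assumptions made for the example.

from multiprocessing import Pool
import random

def fitness(mask):
    # Placeholder fitness: reward small feature subsets. A real wrapper
    # method would train and test a classifier on the selected features.
    return -sum(mask)

def evaluate_population(population, workers=4):
    # Each individual is scored independently, so the map is trivially parallel.
    with Pool(workers) as pool:
        return pool.map(fitness, population)

if __name__ == "__main__":
    population = [[random.randint(0, 1) for _ in range(20)] for _ in range(8)]
    print(evaluate_population(population))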
Parallel implementations of metaheuristics are an effective alternative for speeding up sequential metaheuristics, since they reduce the search time for solutions of optimization problems. Furthermore, they lead to more precise randomized algorithms and improve the quality of solutions [11]. As seen in Figure 2, the implementation of parallel metaheuristics is divided into two categories [13].
Fig. 2. Parallel implementation of metaheuristics
Parallel evolutionary algorithms are also used in many applications other than feature selection, such as inferring phylogenies and traffic prediction. In [9], Santander et al. used MPI/OpenMP with a hybrid multiobjective evolutionary algorithm (fast non-dominated sorting genetic algorithm and firefly algorithm) for phylogenetic reconstruction (inferring evolutionary trees). In [10], Jiri et al. used a parallel multiobjective GA with OpenMP in order to make traffic prediction more accurate. A master-slave scheme of the GA was implemented on a multi-core parallel architecture. They reduced the computational time, but the approach was successful only for short-term traffic prediction.
III. OVERVIEW OF PARALLEL EVOLUTIONARY
ALGORITHMS FOR FEATURE SELECTION
Feature selection algorithms are used to find an optimal subset of relevant features in the data. In this section we describe the parallel evolutionary algorithms that are used for the feature selection problem in large datasets, illustrating the steps of six algorithms (PGA, PCHC, PPSO, PGPSO, PSS, and PACO).
A. Parallel Genetic Algorithm (PGA)
In order to increase the efficiency and reduce the execution time of the genetic algorithm (GA), researchers use the parallel GA. Algorithm 1 presents the parallel GA methodology, with the master-slave model of the parallel GA.
Algorithm 1 Parallel genetic algorithm [10]
Create initial population
Evaluate initial population
Create slaves
while not done do
    Start slave
    Wait for slave to finish
    Run mutation operator
end while
for i = 1 to slave iterations do
    Select individuals
    Run crossover operator
    Evaluate offspring
    if solution found then
        set done = True
    end if
end for
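The listing below is a minimal Python sketch of this master-slave scheme: the master performs selection, crossover, and mutation, while a pool of slaves evaluates fitness in parallel. The bit-vector encoding and the toy fitness are our assumptions, not the setup of [10].

from multiprocessing import Pool
import random

N_FEATURES, POP_SIZE, GENERATIONS = 30, 20, 40

def fitness(mask):
    # Toy fitness: favor subsets of exactly 5 features.
    return -abs(sum(mask) - 5)

def crossover(a, b):
    cut = random.randrange(1, N_FEATURES)      # one-point crossover
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.02):
    return [1 - g if random.random() < rate else g for g in mask]

if __name__ == "__main__":
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(POP_SIZE)]
    with Pool(4) as slaves:
        for _ in range(GENERATIONS):
            scores = slaves.map(fitness, pop)  # master waits for the slaves
            ranked = [m for _, m in sorted(zip(scores, pop), reverse=True)]
            parents = ranked[:POP_SIZE // 2]   # selection on the master
            pop = [mutate(crossover(random.choice(parents),
                                    random.choice(parents)))
                   for _ in range(POP_SIZE)]
    print(max(pop, key=fitness))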
B. Parallel CHC algorithm (PCHC)
CHC is a non-traditional GA that combines a conservative selection strategy (which always preserves the best individuals found so far) with a recombination operator that produces offspring at the maximum Hamming distance from their parents. The main processes of the CHC algorithm are [1] (the first and third are sketched after this list):
• Half-Uniform Crossover (HUX): produces two offspring that are maximally different from their two parents.
• Elitist selection: keeps the best solutions in each generation.
• Incest prevention: prevents two individuals from mating if the similarity between them is greater than a threshold.
• The restarting process: if the population stagnates, generates a new population by choosing the best individuals.
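The sketch below illustrates the HUX operator and the incest-prevention test in Python (our reading of the description above; the threshold value and 16-bit strings are illustrative).

import random

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def hux(a, b):
    # Swap exactly half of the differing bit positions, so each offspring
    # is maximally different from its parents.
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    random.shuffle(diff)
    swap = set(diff[:len(diff) // 2])
    c1 = [b[i] if i in swap else a[i] for i in range(len(a))]
    c2 = [a[i] if i in swap else b[i] for i in range(len(a))]
    return c1, c2

def may_mate(a, b, threshold):
    # Incest prevention: individuals that are too similar do not mate.
    return hamming(a, b) / 2 > threshold

if __name__ == "__main__":
    p1 = [random.randint(0, 1) for _ in range(16)]
    p2 = [random.randint(0, 1) for _ in range(16)]
    if may_mate(p1, p2, threshold=2):
        print(hux(p1, p2))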
C. Particle Swarm Optimization (PSO)
This subsection covers geometric particle swarm optimization (GPSO) and presents the algorithms used to parallelize PSO and GPSO.
1) Geometric Particle Swarm Optimization (GPSO): GPSO is a recent version of PSO. The key idea in GPSO is the use of a multi-parental recombination of solutions (particles). In the first phase, a random initialization of particles is created. Then the algorithm evaluates these particles to update the historical and social positions. Finally, the three-parent mask-based crossover (3PMBCX) moves the particles, as shown in Algorithm 2:
Algorithm 2 GPSO algorithm [2]
S := SwarmInitialization()
while not stop condition do
    for each particle i of the swarm S do
        evaluate(solution(xi))
        update(velocity equation (hi))
        update(global best solution (gi))
    end for
    for each particle i of the swarm S do
        xi := 3PMBCX((xi, wa), (gi, wb), (hi, wc))
        mutate(xi)
    end for
end while
Output: best solution found
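As an illustration of the recombination step, the sketch below gives a minimal 3PMBCX for binary particles: each bit of the child is copied from the current position, the historical best, or the global best, with probabilities given by the three weights. The weight values are assumptions for the example, not the tuned settings of [2].

import random

def three_pmbcx(x, h, g, wa=0.34, wb=0.33, wc=0.33):
    # Three-parent mask-based crossover: every position is inherited from
    # one of the three parents, chosen with probability wa, wb, or wc.
    return [random.choices([xi, hi, gi], weights=[wa, wb, wc])[0]
            for xi, hi, gi in zip(x, h, g)]

if __name__ == "__main__":
    x = [0] * 10            # current position
    h = [1, 0] * 5          # historical (personal best) position
    g = [1] * 10            # global best position
    print(three_pmbcx(x, h, g))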
2) Parallel Multi-Swarm Optimization (PMSO): Parallel multi-swarm optimization is presented in [2]. It is defined, in analogy with the parallel GA, as a pair (S, M), where S is a collection of swarms and M is a migration policy. Algorithm 3 depicts the parallel PSO methodology; a sketch of one migration step follows the listing.
Algorithm 3 Multi-swarm optimization [2]
DO IN PARALLEL for each i in 1, ..., m
    initialize(Si)
    while not stop condition do
        iterate Si for n steps /* PSO evolution */
        for each Sj in neighbors(Si) do
            send particles in s(Si) to Sj
        end for
        for each Sj such that Si in neighbors(Sj) do
            receive particles from Sj
            replace particles in Si according to r
        end for
    end while
Output: best solution ever found in the multi-swarm
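The migration policy M can be sketched as follows (a minimal ring topology of our choosing; the number of migrants and the replacement rule r are illustrative assumptions):

import random

def migrate(swarms, fitness, k=2):
    # Each swarm sends copies of its k best particles to its ring neighbor,
    # which replaces its k worst particles with the immigrants.
    bests = [sorted(s, key=fitness, reverse=True)[:k] for s in swarms]
    for i, swarm in enumerate(swarms):
        incoming = bests[(i - 1) % len(swarms)]
        swarm.sort(key=fitness)              # worst particles first
        swarm[:k] = [p[:] for p in incoming]

if __name__ == "__main__":
    fit = lambda p: sum(p)
    swarms = [[[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
              for _ in range(3)]
    migrate(swarms, fit)
    print([max(map(fit, s)) for s in swarms])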
D. Parallel Scatter Search (PSS)
Scatter search is an evolutionary method that has been successfully applied to hard optimization problems. It uses strategies for search diversification and intensification that have proved effective in a variety of optimization problems; see Algorithm 4.
E. Parallel Ant Colony Optimization (PACO)
When dealing with a huge search space, parallel computing techniques are usually applied to improve efficiency. Parallel ACO algorithms can achieve high-quality solutions in reasonable execution times compared with sequential ACO [18]. Algorithm 5 presents the PACO methodology.
Algorithm 4 Parallel scatter search methodology [11]
Create Population (Pop, PopSize)
Generate ReferenceSet (RefSet, RefSetSize)
while Stopping Criterion 1 do
    while Stopping Criterion 2 do
        Select Subset (Subset, SubsetSize)
        for each processor r = 1 to n do in parallel
            Combine Solutions (SubSet, CurSol)
            Improve Solution (CurSol, ImpSol)
        end for
    end while
    Update ReferenceSet (RefSet)
end while
Algorithm 5 Parallel ant colony optimization methodology [18]
Generate ants
Initialize N processors
Multicast N and the task ids of all slaves to all slave processors
for each slave do
    Send a number between 0 and N that identifies the task inside the program
end for
while not all slaves have sent back a solution do
    Wait for a solution
    if a slave returns a solution that is better than any solution received then
        Multicast this solution to all slaves
    end if
end while
Return the best solution
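A minimal Python sketch of this master-slave exchange follows: the slaves construct candidate solutions independently, and the master keeps (and would re-broadcast) the best one received. The toy construction step stands in for real pheromone-guided ant rules and is our assumption.

from multiprocessing import Pool
import random

def build_solution(seed):
    # Stand-in for an ant's construction walk: build a random feature mask
    # and score it against a toy objective (subsets of size 4).
    rng = random.Random(seed)
    mask = [rng.randint(0, 1) for _ in range(12)]
    return -abs(sum(mask) - 4), mask

if __name__ == "__main__":
    best = (float("-inf"), None)
    with Pool(4) as slaves:
        for quality, mask in slaves.imap_unordered(build_solution, range(16)):
            if quality > best[0]:       # better than any solution received?
                best = (quality, mask)  # the master would multicast it here
    print(best)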
IV. PARALLEL EVOLUTIONARY ALGORITHMS FOR
FEATURE SELECTION
We reviewed a set of research papers that deal with the feature selection problem for high-dimensional datasets in a parallel environment using parallel evolutionary algorithms. We discuss these studies in the following subsections.
A. Parallel GA
Liu et al. [5] used a parallel GA with the wrapper approach for selecting informative genes (features) in tissue classification. The main purpose was to find a feature subset with fewer elements and higher accuracy. The parallelization of the GA was performed by dividing the population into sub-populations and then running the GA on each sub-population, so that the search for the optimal subset of genes can run on several CPUs/computers at the same time.
For evaluation, the Golub classifier was used. This classifier was introduced by the authors, and it depends on the sign of the result for classification: if the sign is positive, the sample x belongs to class 1; if it is negative, the sample x belongs to class 2. This classifier can be used only if the dataset has two classes. The accuracy of the classifier was tested using the LOOCV (leave-one-out cross-validation) method. The results showed that using the parallel GA increased the accuracy and reduced the number of genes used for classification.
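The following sketch shows the sign-based decision and the LOOCV loop described above; the scoring rule is a simplified stand-in (projection against the difference of class means), not the exact weighted-voting classifier of [5].

def classify(score):
    # Positive score -> class 1, negative or zero -> class 2.
    return 1 if score > 0 else 2

def mean_diff_score(train, x):
    # Toy scoring rule: project x against the difference of the class means.
    n1 = max(1, sum(l == 1 for _, l in train))
    n2 = max(1, sum(l == 2 for _, l in train))
    m1 = [sum(s[k] for s, l in train if l == 1) / n1 for k in range(len(x))]
    m2 = [sum(s[k] for s, l in train if l == 2) / n2 for k in range(len(x))]
    return sum((a - b) * (v - (a + b) / 2) for a, b, v in zip(m1, m2, x))

def loocv(samples, labels, score_fn):
    # Leave-one-out cross-validation: hold out each sample once.
    correct = 0
    for i in range(len(samples)):
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        correct += classify(score_fn(train, samples[i])) == labels[i]
    return correct / len(samples)

if __name__ == "__main__":
    X = [[2.0, 1.0], [1.8, 1.2], [0.2, 0.9], [0.1, 1.1]]
    y = [1, 1, 2, 2]
    print(loocv(X, y, mean_diff_score))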
In [8], Zheng et al. theoretically analyzed the execution speed and solution quality of several parallel GA schemes. Furthermore, they identified the best parallel GA scheme to use on a multi-core architecture. The paper considered the relationship between speed and parallel architecture along with solution quality. They analyzed the Master-Slave, Synchronous Island, Asynchronous Island, Cellular, and hybrid Master-Slave/Island schemes of parallel GA, implemented with the Pthread library on a multi-core parallel architecture.
To validate their theoretical analysis, experiments were performed. The hybrid scheme (Master-Slave and Asynchronous Island) had the best overall performance on the multi-core architecture. The Island scheme has the best execution time but the worst solution quality; to improve the solution quality when using the Island model, it is better to decrease the number of islands. The Asynchronous Island is faster than the Synchronous one. The Master-Slave scheme has the best solution quality and the worst execution time.
Soufan et al. [15] developed a web-based tool called DWFS, which is used for feature selection in different problems. This tool follows a hybrid wrapper/filter approach. First, the filter is used as preprocessing and selects the top-ranked features based on a tunable, predefined threshold. In the next step, a parallel GA based on the wrapper approach is applied to the selected features to search for feature subsets that increase the classifier accuracy. The parallel GA follows the Master-Slave scheme: the master node creates the initial population and performs the GA steps, while the slave (worker) nodes perform the fitness evaluation of each chromosome. This implementation ran on 64 cores.
For evaluation, they used three different classifiers (a Bayesian classifier, K-nearest neighbor, and a combination of them). The results of the experiments show that the DWFS tool provides many options to enhance feature selection in different biological and biomedical problems.
In [7], Pinho et al. presented a framework called ParJEColi (a Java-based library) for parallel evolutionary computation in bioinformatics applications. The aim of this platform is to make parallel environments (multi-core, cluster, and grid) easy and transparent to the user. The library adapts itself to the problem and the target parallel architecture. The user can easily configure the parallel model and the target architecture, since ParJEColi encapsulates the parallelization
concerns as features; the explicit steps are exposed through a simple GUI.
The experiments validating this framework were done on 2 biological datasets and several bioinformatics scenarios. The results indicate that the proposed framework improves the computational performance (decreases execution time) as well as the solution quality.
B. Parallel CHC
In [1], Peralta et al. presented a parallel evolutionary algorithm, the CHC algorithm, parallelized with the MapReduce paradigm (Hadoop implementation), for selecting features in high-dimensional datasets in order to improve classification. A cluster of 20 computing nodes was used, and each dataset was split into 512 map tasks. For evaluating their work, three classifiers were used: SVM (support vector machine), logistic regression, and a Bayesian classifier.
The results showed that the runtime for classification increased as the number of features decreased, except for the Bayesian classifier. They explained this result as follows: if the number of blocks is less than the number of computing machines, some machines remain idle; and if the number of blocks is greater than the number of computing machines, the blocks may not be distributed in an efficient way.
They compared the parallel CHC with the serial version and concluded that the accuracy of classification increased by using the parallel CHC. Furthermore, the parallel version of CHC reduced the runtime on high-dimensional datasets.
C. Parallel PSO
PSO is an efficient optimization technique that has been used to solve the feature selection problem in high-dimensional datasets. In [4], Chen et al. used a parallel PSO algorithm to solve two problems at the same time, by creating an objective function that takes three variables into account (the selected features, the number of support vectors, and the average accuracy of the SVM), in order to maximize the generalization capability of the SVM classifier.
The proposed method, called PTVPSO-SVM (parallel time-variant particle swarm optimization support vector machine), has two phases: 1) the parameter setting of the SVM and the feature selection are performed together; 2) the accuracy of the SVM is evaluated using the set of features and the optimal parameters from the first phase.
They used a parallel virtual machine (PVM) with 8 machines and 10-fold cross-validation. The results showed that they achieved the following aims: increasing the classification accuracy, reducing the execution time compared with sequential PSO, producing an appropriate parameter model, and selecting the most discriminative subset of features.
Feature selection can also be carried out based on rough set theory with a search algorithm, as in [3, 6]. In [6], Qian et al. proposed three parallel attribute reduction (feature selection) algorithms based on MapReduce on Hadoop. The first algorithm is built by constructing the proper (key, value) pairs from rough set theory and implementing the corresponding MapReduce functions (sketched below). The second realizes the parallel computation of equivalence classes and attribute significances. The last parallel algorithm is designed to acquire the core attributes and a reduct, with parallelism over both data and tasks.
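The first of these algorithms can be pictured with a toy in-memory map/reduce pair: the map phase emits each record's values on the candidate attribute set as a key, and the reduce phase counts the resulting equivalence classes. This is our simplified stand-in for the Hadoop jobs of [6], not their code.

from collections import defaultdict

def map_phase(records, attributes):
    # Emit (signature, 1) for every record; records that agree on all the
    # chosen attributes share a key, i.e., an equivalence class.
    for record in records:
        yield tuple(record[a] for a in attributes), 1

def reduce_phase(pairs):
    # Sum the counts per key to obtain the size of each equivalence class.
    classes = defaultdict(int)
    for key, count in pairs:
        classes[key] += count
    return dict(classes)

if __name__ == "__main__":
    data = [{"a": 1, "b": 0, "d": "yes"},
            {"a": 1, "b": 0, "d": "no"},
            {"a": 0, "b": 1, "d": "no"}]
    print(reduce_phase(map_phase(data, ["a", "b"])))
    # {(1, 0): 2, (0, 1): 1}: the first two records are indiscernible on {a, b}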
The experiments were performed on a cluster of computers (17 computing nodes). They considered the performance of the parallel algorithms but did not focus on the classification accuracy, since the sequential and parallel algorithms give the same results. The results showed that the proposed parallel attribute reduction algorithms can deal with high-dimensional datasets efficiently and perform better than the sequential algorithms.
In [3], Adamczyk used rough set theory for attribute reduction and, to increase efficiency, implemented a parallel asynchronous PSO for this problem. The parallelization was done by assigning the expensive function computations to slave cores, while the main core updates the particles and checks the convergence of the algorithm.
The experiments showed that the efficiency and speedup of the parallel PSO algorithm rose as the size of the dataset increased. The achieved accuracy was not astonishing, but it was better than that of the classical algorithms.
D. Parallel GPSO
In [2], Garcia-Nieto et al. parallelized a version of PSO called GPSO, which is suitable for the feature selection problem in high-dimensional datasets. The proposed method, called PMSO (parallel multi-swarm optimizer), runs a set of parallel sub-PSO algorithms that form an island model; a migration operation exchanges solutions between islands at a certain frequency. The aim of the fitness function is to increase the classification accuracy and reduce the number of selected genes (features).
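A fitness of this kind can be sketched as a weighted sum of the classifier accuracy and the fraction of discarded features; the weight alpha below is an illustrative assumption, not the exact objective of [2].

def fitness(accuracy, n_selected, n_total, alpha=0.9):
    # Reward high accuracy and small feature subsets.
    return alpha * accuracy + (1 - alpha) * (1 - n_selected / n_total)

if __name__ == "__main__":
    print(fitness(accuracy=0.95, n_selected=12, n_total=2000))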
They used the SVM classifier (support vector machine) to prove the accuracy of the selected subset of features. In their experiments, they used a cluster of computers as the
parallel architecture. They found that the 8-swarm PMSO was the best choice for parallelization. The results pointed out that this algorithm was better than the sequential version and other methods in terms of performance and accuracy, while it selected few genes for each subset.
E. Parallel SS
In [11], Lopez et al. presented a parallel SS metaheuristic for solving the feature selection problem in classification. They proposed two methods for combining solutions in SS.
The first method is called GC (greedy combination): in this strategy, the features common to the combined solutions are added first; then, at each iteration, one of the remaining features is added to each new solution (see the sketch after this paragraph).
The second strategy is called RGC (reduced greedy combination); it starts the same way as GC, but in the next step it considers only the features that appear in solutions of good quality. The parallelization of SS is then obtained by running these two methods (GC, RGC) at the same time on two processors, using different combination methods and parameter settings on each processor.
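A minimal sketch of the GC operator follows (our illustration of the description above; the evaluation function and target size are placeholder assumptions, not the setup of [11]):

def greedy_combine(sol_a, sol_b, evaluate, target_size):
    # Start from the features shared by both parent solutions, then
    # greedily add the remaining feature that most improves the score.
    combined = set(sol_a & sol_b)
    remaining = (sol_a | sol_b) - combined
    while remaining and len(combined) < target_size:
        best = max(remaining, key=lambda f: evaluate(combined | {f}))
        combined.add(best)
        remaining.remove(best)
    return combined

if __name__ == "__main__":
    score = lambda feats: -len(feats) + (10 if 3 in feats else 0)
    print(greedy_combine({1, 2, 3}, {2, 3, 4}, score, target_size=3))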
They compared the proposed parallel SS with the sequential SS and a GA. The results show that the solution quality of the parallel SS is better than that of the solutions obtained by the sequential SS and the GA. Also, the parallel SS uses a smaller set of features for classification. The runtime is the same for the parallel and sequential SS.
F. Parallel ACO
This subsection shows how parallel ACO is used to solve the feature selection problem for classification in high-dimensional datasets.
In [17], Meena et al. implemented a parallel ACO to solve the feature selection problem for long documents. The parallelization was done using the MapReduce programming model (Hadoop), which automatically parallelizes the code and data and runs them on a cluster of computing nodes. The wrapper approach, with a Bayesian classifier, is used as the evaluation criterion. Furthermore, the accuracy of the classifier was assessed using these metrics: precision, recall, accuracy, and F-measure.
The enhanced algorithm (parallel ACO) was compared with ACO, enhanced ACO, and two feature selection methods: CHI (a statistical technique) and IG (Information Gain). They used a Bayesian classifier in the evaluation process. The results showed that, for a given fixed quality of the solutions, the proposed algorithm could reduce the execution time, though solution quality itself was not considered. On the other hand, the accuracy of the classifier was increased using parallel
TABLE I
SUMMARY OF ALGORITHMS AND PROGRAMMING MODELS

Paper                   | Evolutionary algorithm | Parallel programming model
------------------------+------------------------+---------------------------
Peralta et al. [1]      | CHC (type of GA)       | MapReduce
Garcia-Nieto et al. [2] | GPSO                   | MALLBA
Adamczyk [3]            | PSO                    | Unknown
Chen et al. [4]         | PSO                    | PVM
Liu et al. [5]          | GA                     | Unknown
Lopez et al. [11]       | SS                     | Unknown
Soufan et al. [15]      | GA                     | MPI
Meena et al. [17]       | ACO                    | MapReduce
ACO compared with sequential ACO and the other feature selection methods.
In [12], Cano et al. parallelized an existing multi-objective ant programming model that is used as a classifier. The algorithm was used for rule mining in high-dimensional datasets. The parallelization was done over the data, with each ant encoding a rule: each processor performs the same task on a different subset of the data at the same time. The implementation used GPUs, which are multi-core, massively parallel processor architectures, following the CUDA model.
For evaluation they used these metrics: true positives, false positives, true negatives, false negatives, sensitivity, and specificity. The results indicate that the efficiency of this model increased as the size of the dataset increased.
V. SUMMARY AND DISCUSSION
The papers that implemented parallel EAs for solving the classification problem in high-dimensional datasets are summarized in Table I and Table II.
Many research papers [2, 3, 7, 8, 9, 10, 12] stated that the execution time can be reduced and acceptable speedups achieved when applying parallel evolutionary algorithms on multiple processors, and we noticed that they achieved a reasonable speedup in many cases.
When comparing the accuracy of the parallel EAs in Table II, it is important to notice how many classifiers were used to measure the accuracy. Furthermore, we should consider the metrics that were used to evaluate the classifier. For example, the parallel PSO and its variants have the highest accuracy, but they used only one metric, the success rate; this means that the parallel PSO is not necessarily the most accurate parallel EA based on Table II.
On the other hand, the parallel GA and its variants have the lowest accuracy, but they used from two to five metrics for evaluation purposes. Based on these metrics, we can say that
TABLE II
SUMMARY OF DATASETS, CLASSIFIERS, AND ACCURACY RESULTS

Paper                   | Dataset     | Classifier          | Classification metric | Accuracy
------------------------+-------------+---------------------+-----------------------+------------------
Peralta et al. [1]      | Epsilon     | Bayesian            | AUC = (TPR+TNR)/2     | 0.71
                        |             | SVM                 |                       | 0.68
                        |             | Logistic Regression |                       | 0.70
                        | ECBDL14-ROS | Bayesian            |                       | 0.67
                        |             | SVM                 |                       | 0.63
                        |             | Logistic Regression |                       | 0.63
Garcia-Nieto et al. [2] | Colon       | SVM                 | Success rate          | 0.85
                        | Lymp        |                     |                       | 0.97
                        | Leuk        |                     |                       | 0.98
                        | Lung        |                     |                       | 0.97
Adamczyk [3]            | 15 datasets | —                   | Success rate          | 0.70 (avg)
Chen et al. [4]         | 30 datasets | SVM                 | Success rate          | 0.87 (avg)
Liu et al. [5]          | Leukemia    | Golub               | Success rate          | 0.88
                        | Colon       |                     |                       | N/A
Lopez et al. [11]       | 12 datasets | Nearest Neighbor    | Success rate          | 0.86 (avg)
                        |             | Bayesian            |                       | 0.87 (avg)
                        |             | Decision Tree       |                       | 0.86 (avg)
Soufan et al. [15]      | 9 datasets  | K-Nearest Neighbor  | F1, PPV, GMean, ...   | 0.81 (avg GMean)
                        |             | Bayesian            |                       | 0.79 (avg GMean)
Meena et al. [17]       | 2 datasets  | Bayesian            | F-measure, recall,... | 0.64 (avg)
the parallel GA is the best parallel EA for feature selection in high-dimensional datasets.
VI. CONCLUSION
After reviewing the different parallel EAs that are used to solve the feature selection problem in high-dimensional datasets, we adopted the accuracy as a measure to compare the algorithms' performance. The following points summarize our conclusions about the performance of the reviewed algorithms for feature selection:
• GA and its variants: based on the papers we reviewed, the parallel GA has the highest accuracy.
• PSO and its variants: the parallel PSO has the same accuracy as the sequential PSO.
• SS: the parallel SS gives better results in terms of accuracy than the GA and the sequential SS.
• ACO: the parallel ACO has less accurate results than the other parallel EAs.
It is noticeable that PGAs are the most suitable algorithms for feature selection in large datasets, since they achieved the highest accuracy. On the other hand, the PACO is time-consuming and less accurate compared with the other parallel EAs.
References
[1] D. Peralta, S. del Río, S. Ramírez-Gallego, I. Triguero, J. Benitez, F. Herrera. "Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach". Mathematical Problems in Engineering, Hindawi Publishing Corporation, Article ID 246139, (2015).
[2] J. García-Nieto, E. Alba. "Parallel multi-swarm optimizer for gene selection in DNA microarrays". Applied Intelligence 37:255-266, (2012). DOI 10.1007/s10489-011-0325-9.
[3] M. Adamczyk. "Parallel Feature Selection Algorithm based on Rough Sets and Particle Swarm Optimization". ACSIS, Vol. 2, IEEE, (2014). DOI 10.15439/2014F389.
[4] H.-L. Chen, B. Yang, S. Wang, G. Wang, D. Liu, H.-Z. Li, W. Liu. "Towards an optimal support vector machine classifier using a parallel particle swarm optimization strategy". Applied Mathematics and Computation 239:180-197. Elsevier, (2014).
[5] J. Liu, H. Iba. "Selecting Informative Genes with Parallel Genetic Algorithms in Tissue Classification". Genome Informatics 12:14-23, (2001).
[6] J. Qian, D. Miao, Z. Zhang, X. Yue. "Parallel attribute reduction algorithms using MapReduce". Information Sciences. Elsevier, (2014).
[7] J. Pinho, L. Sobral, M. Rocha. "Parallel evolutionary computation in bioinformatics applications". Computer Methods and Programs in Biomedicine 110:183-191. Elsevier, (2013).
[8] L. Zheng, Y. Lu, M. Ding, Y. Shen, M. Guo, S. Guo. "Architecture-based performance evaluation of genetic algorithms on multi/many-core systems". The 14th IEEE International Conference on Computational Science and Engineering. IEEE, (2011). DOI 10.1109/CSE.2011.65.
[9] S. Santander-Jimenez, M. Vega-Rodríguez. "Parallel Multiobjective Metaheuristics for Inferring Phylogenies on Multicore Clusters". IEEE, (2014). DOI 10.1109/TPDS.2014.2325828.
[10] J. Petrlik, L. Sekanina. "Towards Robust and Accurate Traffic Prediction Using Parallel Multiobjective Genetic Algorithms and Support Vector Regression". IEEE 18th International Conference on Intelligent Transportation Systems, (2015).
[11] F. Lopez, M. Torres, B. Batista, J. Perez, J. Vega. "Solving feature selection problem by a parallel Scatter Search". European Journal of Operational Research 169:477-489. Elsevier, (2006).
[12] A. Cano, J. Olmo, S. Ventura. "Parallel Multi-Objective Ant Programming for Classification Using GPUs". Journal of Parallel and Distributed Computing, November (2012).
[13] E. Alba, G. Luque, S. Nesmachnow. "Parallel metaheuristics: recent advances and new trends". International Transactions in Operational Research 20:1-48, (2013). DOI 10.1111/j.1475-3995.2012.00862.x.
[14] H. Liu, H. Motoda (Eds.). "Instance Selection and Construction for Data Mining". Springer, (2001).
[15] O. Soufan, D. Kleftogiannis, P. Kalnis, V. Bajic. "DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm". PLOS ONE, February (2015). DOI 10.1371/journal.pone.0117988.
[16] A. Gottlieb, G. S. Almasi. "Highly Parallel Computing". Benjamin-Cummings Publishing Co., Inc., Redwood City, CA, USA, (1989).
[17] M. Meena, K. R. Chandran, A. Karthik, A. Samuel. "A parallel ACO algorithm to select terms to categorise longer documents". International Journal of Computational Science and Engineering, Vol. 6, No. 4, (2011).
[18] A. Sameh, A. Ayman, N. Hasan. "Parallel Ant Colony Optimization". International Journal of Research and Reviews in Computer Science (IJRRCS), Vol. 1, No. 2, June (2010).
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 16, No. 3, March 2018
188 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
Ad

Recommended

PDF
Particle Swarm Optimization based K-Prototype Clustering Algorithm
iosrjce
 
PDF
Extended pso algorithm for improvement problems k means clustering algorithm
IJMIT JOURNAL
 
DOC
Predictive job scheduling in a connection limited system using parallel genet...
Mumbai Academisc
 
PDF
USING ONTOLOGIES TO IMPROVE DOCUMENT CLASSIFICATION WITH TRANSDUCTIVE SUPPORT...
IJDKP
 
PDF
A Novel Approach for Clustering Big Data based on MapReduce
IJECEIAES
 
PDF
Using particle swarm optimization to solve test functions problems
riyaniaes
 
PDF
AUTOMATIC GENERATION AND OPTIMIZATION OF TEST DATA USING HARMONY SEARCH ALGOR...
csandit
 
PDF
IJCSI-2015-12-2-10138 (1) (2)
Dr Muhannad Al-Hasan
 
PDF
An enhanced pairwise search approach for generating
Alexander Decker
 
PDF
A Genetic Algorithm on Optimization Test Functions
IJMERJOURNAL
 
PDF
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
IJDKP
 
PDF
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithm
aciijournal
 
PDF
A h k clustering algorithm for high dimensional data using ensemble learning
ijitcs
 
PDF
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
acijjournal
 
PDF
GCUBE INDEXING
IJDKP
 
PDF
An Automatic Clustering Technique for Optimal Clusters
IJCSEA Journal
 
PDF
Biclustering using Parallel Fuzzy Approach for Analysis of Microarray Gene Ex...
CSCJournals
 
PDF
Applying genetic algorithms to information retrieval using vector space model
IJCSEA Journal
 
PDF
HYBRID GENETIC ALGORITHM FOR BI-CRITERIA MULTIPROCESSOR TASK SCHEDULING WITH ...
aciijournal
 
PDF
Feature selection using modified particle swarm optimisation for face recogni...
eSAT Journals
 
DOCX
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
IEEEFINALYEARSTUDENTPROJECTS
 
PDF
A046010107
IJERA Editor
 
PDF
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
PDF
The pertinent single-attribute-based classifier for small datasets classific...
IJECEIAES
 
PDF
EvoFeat: Genetic Programming-based Feature Engineering Approach to Tabular Da...
Hengzhe Zhang
 
PDF
IRJET- Survey of Feature Selection based on Ant Colony
IRJET Journal
 
PPSX
9th ITAB 2009 Parallel-MEGA
Christos Kannas
 
PDF
Parallel and distributed genetic algorithm with multiple objectives to impro...
khalil IBRAHIM
 
PDF
Enhancing feature selection with a novel hybrid approach incorporating geneti...
IJECEIAES
 
PDF
Optimal Feature Selection from VMware ESXi 5.1
ijccmsjournal
 

More Related Content

What's hot (16)

PDF
An enhanced pairwise search approach for generating
Alexander Decker
 
PDF
A Genetic Algorithm on Optimization Test Functions
IJMERJOURNAL
 
PDF
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
IJDKP
 
PDF
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithm
aciijournal
 
PDF
A h k clustering algorithm for high dimensional data using ensemble learning
ijitcs
 
PDF
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
acijjournal
 
PDF
GCUBE INDEXING
IJDKP
 
PDF
An Automatic Clustering Technique for Optimal Clusters
IJCSEA Journal
 
PDF
Biclustering using Parallel Fuzzy Approach for Analysis of Microarray Gene Ex...
CSCJournals
 
PDF
Applying genetic algorithms to information retrieval using vector space model
IJCSEA Journal
 
PDF
HYBRID GENETIC ALGORITHM FOR BI-CRITERIA MULTIPROCESSOR TASK SCHEDULING WITH ...
aciijournal
 
PDF
Feature selection using modified particle swarm optimisation for face recogni...
eSAT Journals
 
DOCX
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
IEEEFINALYEARSTUDENTPROJECTS
 
PDF
A046010107
IJERA Editor
 
PDF
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
PDF
The pertinent single-attribute-based classifier for small datasets classific...
IJECEIAES
 
An enhanced pairwise search approach for generating
Alexander Decker
 
A Genetic Algorithm on Optimization Test Functions
IJMERJOURNAL
 
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
IJDKP
 
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithm
aciijournal
 
A h k clustering algorithm for high dimensional data using ensemble learning
ijitcs
 
BINARY SINE COSINE ALGORITHMS FOR FEATURE SELECTION FROM MEDICAL DATA
acijjournal
 
GCUBE INDEXING
IJDKP
 
An Automatic Clustering Technique for Optimal Clusters
IJCSEA Journal
 
Biclustering using Parallel Fuzzy Approach for Analysis of Microarray Gene Ex...
CSCJournals
 
Applying genetic algorithms to information retrieval using vector space model
IJCSEA Journal
 
HYBRID GENETIC ALGORITHM FOR BI-CRITERIA MULTIPROCESSOR TASK SCHEDULING WITH ...
aciijournal
 
Feature selection using modified particle swarm optimisation for face recogni...
eSAT Journals
 
IEEE 2014 JAVA DATA MINING PROJECTS A fast clustering based feature subset se...
IEEEFINALYEARSTUDENTPROJECTS
 
A046010107
IJERA Editor
 
Applying Genetic Algorithms to Information Retrieval Using Vector Space Model
IJCSEA Journal
 
The pertinent single-attribute-based classifier for small datasets classific...
IJECEIAES
 

Similar to Parallel Evolutionary Algorithms for Feature Selection in High Dimensional Datasets (20)

PDF
EvoFeat: Genetic Programming-based Feature Engineering Approach to Tabular Da...
Hengzhe Zhang
 
PDF
IRJET- Survey of Feature Selection based on Ant Colony
IRJET Journal
 
PPSX
9th ITAB 2009 Parallel-MEGA
Christos Kannas
 
PDF
Parallel and distributed genetic algorithm with multiple objectives to impro...
khalil IBRAHIM
 
PDF
Enhancing feature selection with a novel hybrid approach incorporating geneti...
IJECEIAES
 
PDF
Optimal Feature Selection from VMware ESXi 5.1
ijccmsjournal
 
PDF
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
PDF
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
PDF
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
PDF
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
PDF
Integrated bio-search approaches with multi-objective algorithms for optimiza...
TELKOMNIKA JOURNAL
 
PDF
Study on Relavance Feature Selection Methods
IRJET Journal
 
PDF
777777777777777777777777777777777777777.pdf
AsimRaza417630
 
PDF
Computational Methods of Feature Selection 1st Edition Huan Liu (Editor)
lemanamiddag
 
PDF
Computational Methods of Feature Selection 1st Edition Huan Liu (Editor)
doweyhostel
 
PDF
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
ijcsa
 
PDF
Hakimi asiabar, m. 2009: multi-objective genetic local search algorithm using...
ArchiLab 7
 
PDF
Parallel Genetic Algorithms in the Cloud
Pasquale Salza
 
PDF
The International Journal of Engineering and Science (The IJES)
theijes
 
PDF
Computational Methods Of Feature Selection Huan Liu Hiroshi Motoda
rasabigley
 
EvoFeat: Genetic Programming-based Feature Engineering Approach to Tabular Da...
Hengzhe Zhang
 
IRJET- Survey of Feature Selection based on Ant Colony
IRJET Journal
 
9th ITAB 2009 Parallel-MEGA
Christos Kannas
 
Parallel and distributed genetic algorithm with multiple objectives to impro...
khalil IBRAHIM
 
Enhancing feature selection with a novel hybrid approach incorporating geneti...
IJECEIAES
 
Optimal Feature Selection from VMware ESXi 5.1
ijccmsjournal
 
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
Optimal Feature Selection from VMware ESXi 5.1 Feature Set
ijccmsjournal
 
Integrated bio-search approaches with multi-objective algorithms for optimiza...
TELKOMNIKA JOURNAL
 
Study on Relavance Feature Selection Methods
IRJET Journal
 
777777777777777777777777777777777777777.pdf
AsimRaza417630
 
Computational Methods of Feature Selection 1st Edition Huan Liu (Editor)
lemanamiddag
 
Computational Methods of Feature Selection 1st Edition Huan Liu (Editor)
doweyhostel
 
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA...
ijcsa
 
Hakimi asiabar, m. 2009: multi-objective genetic local search algorithm using...
ArchiLab 7
 
Parallel Genetic Algorithms in the Cloud
Pasquale Salza
 
The International Journal of Engineering and Science (The IJES)
theijes
 
Computational Methods Of Feature Selection Huan Liu Hiroshi Motoda
rasabigley
 
Ad

Recently uploaded (20)

DOCX
Daily Lesson Log MATATAG ICT TEchnology 8
LOIDAALMAZAN3
 
PPTX
UserCon Belgium: Honey, VMware increased my bill
stijn40
 
PDF
Optimizing the trajectory of a wheel loader working in short loading cycles
Reno Filla
 
PDF
How to Visualize the ​Spatio-Temporal Data Using CesiumJS​
SANGHEE SHIN
 
PDF
“MPU+: A Transformative Solution for Next-Gen AI at the Edge,” a Presentation...
Edge AI and Vision Alliance
 
PDF
LLM Search Readiness Audit - Dentsu x SEO Square - June 2025.pdf
Nick Samuel
 
PPTX
Simplifica la seguridad en la nube y la detección de amenazas con FortiCNAPP
Cristian Garcia G.
 
PDF
Java 25 and Beyond - A Roadmap of Innovations
Ana-Maria Mihalceanu
 
PPTX
reInforce 2025 Lightning Talk - Scott Francis.pptx
ScottFrancis51
 
PPTX
01_Approach Cyber- DORA Incident Management.pptx
FinTech Belgium
 
PDF
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
ScyllaDB
 
PDF
EIS-Webinar-Engineering-Retail-Infrastructure-06-16-2025.pdf
Earley Information Science
 
PDF
Why aren't you using FME Flow's CPU Time?
Safe Software
 
PDF
“Scaling i.MX Applications Processors’ Native Edge AI with Discrete AI Accele...
Edge AI and Vision Alliance
 
PDF
Redefining Work in the Age of AI - What to expect? How to prepare? Why it mat...
Malinda Kapuruge
 
PDF
The Future of Product Management in AI ERA.pdf
Alyona Owens
 
PPTX
MARTSIA: A Tool for Confidential Data Exchange via Public Blockchain - Poster...
Michele Kryston
 
PPTX
Practical Applications of AI in Local Government
OnBoard
 
PDF
My Journey from CAD to BIM: A True Underdog Story
Safe Software
 
PDF
Unlocking FME Flow’s Potential: Architecture Design for Modern Enterprises
Safe Software
 
Daily Lesson Log MATATAG ICT TEchnology 8
LOIDAALMAZAN3
 
UserCon Belgium: Honey, VMware increased my bill
stijn40
 
Optimizing the trajectory of a wheel loader working in short loading cycles
Reno Filla
 
How to Visualize the ​Spatio-Temporal Data Using CesiumJS​
SANGHEE SHIN
 
“MPU+: A Transformative Solution for Next-Gen AI at the Edge,” a Presentation...
Edge AI and Vision Alliance
 
LLM Search Readiness Audit - Dentsu x SEO Square - June 2025.pdf
Nick Samuel
 
Simplifica la seguridad en la nube y la detección de amenazas con FortiCNAPP
Cristian Garcia G.
 
Java 25 and Beyond - A Roadmap of Innovations
Ana-Maria Mihalceanu
 
reInforce 2025 Lightning Talk - Scott Francis.pptx
ScottFrancis51
 
01_Approach Cyber- DORA Incident Management.pptx
FinTech Belgium
 
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
ScyllaDB
 
EIS-Webinar-Engineering-Retail-Infrastructure-06-16-2025.pdf
Earley Information Science
 
Why aren't you using FME Flow's CPU Time?
Safe Software
 
“Scaling i.MX Applications Processors’ Native Edge AI with Discrete AI Accele...
Edge AI and Vision Alliance
 
Redefining Work in the Age of AI - What to expect? How to prepare? Why it mat...
Malinda Kapuruge
 
The Future of Product Management in AI ERA.pdf
Alyona Owens
 
MARTSIA: A Tool for Confidential Data Exchange via Public Blockchain - Poster...
Michele Kryston
 
Practical Applications of AI in Local Government
OnBoard
 
My Journey from CAD to BIM: A True Underdog Story
Safe Software
 
Unlocking FME Flow’s Potential: Architecture Design for Modern Enterprises
Safe Software
 
Ad

Parallel Evolutionary Algorithms for Feature Selection in High Dimensional Datasets

  • 1. Parallel Evolutionary Algorithms for Feature Selection in High Dimensional Datasets Safa Adi #1 , Mohammed Aldasht ∗2 # Computer and IT Department, Palestine Polytechnic University Palestine 1 [email protected] ∗ Computer Engineering Department, Palestine Polytechnic University Palestine 2 [email protected] Abstract—Feature selection in high-dimensional datasets is considered to be a complex and time-consuming problem. To enhance the accuracy of classification and reduce the execution time, Parallel Evolutionary Algorithms (PEAs) can be used. In this paper, we make a review for the most recent works which handle the use of PEAs for feature selection in large datasets. We have classified the algorithms in these papers into four main classes (Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Scattered Search (SS), and Ant Colony Optimization (ACO)). The accuracy is adopted as a measure to compare the efficiency of these PEAs. It is noticeable that the Parallel Genetic Algorithms (PGAs) are the most suitable algorithms for feature selection in large datasets; since they achieve the highest accuracy. On the other hand, we found that the Parallel ACO is time- consuming and less accurate comparing with other PEA. Index Terms: Evolutionary algorithms, parallel com- puting, classification, feature selection, high dimensional dataset. I. INTRODUCTION Nowadays many disciplines have to deal with high dimensional datasets which involve a huge number of features. So we need data preprocessing methods and data reduction models in order to simplify input data. There are two main types of data reduction models [1]. The first is: instance selection and instance generation processes are focused on the instance level. (i.e. select a representative portion of data that can fulfill a data mining task as if the whole data is used) [14]. The second is: feature selection and feature extraction models which work at the level of characteristics. These models attempt to reduce a dataset by removing noisy, irrelevant, or redundant features. Feature selection is a necessary preprocessing step in analyzing big datasets. It often leads to smaller data that will make the classifier training better and faster [3]. Feature selection is a problem with big datasets. In order to make classification faster and more accurate, we need to select the subset of features that are discriminative. Evolutionary algorithms like Genetic algorithms, Swarm intelligence optimization, Ant colony optimization, etc. These methods can be effective for this problem, but they require a huge amount of computation (long execution time), also memory consumption. In order to overcome these weaknesses, parallel computing can be used. In this survey, we will review a set of papers about parallel evolutionary algorithms that used for feature selection in large datasets. Furthermore, we will compare the performance of different algorithms and environment. The rest of the paper is organized as follow: Section2 is Background about feature selection approaches and parallel architecture in general. Section3 talk about parallel evolutionary algorithms. Section 4 will discuss and review many papers which talk about the feature selection problem by using parallel computing. Section5 contains the summary of the survey, the last section is the conclusion and future work. II. BACKGROUND In general, there are three classes of feature selection: filter-based, wrapper, and embedded. 
The filter approach analyzes the features statistically and ignores the classifier [18]. Most of filter-based methods perform two operations, ranking and subset selection. In some cases, these two operations are performed sequentially, first the ranking, then the selection, in other cases only the selection is carried out. These methods are effective in terms of execution time. However, filter methods sometimes select redundant variables; since they don’t consider the relationships between variables. Therefore, they are mainly used as a pre-processing method. In the wrapper model [15], the process of feature selection is depending on the performance of a specific classifier. But its disadvantages are time-consuming and over fitting. The last International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 3, March 2018 181 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 2. method for feature selection is the embedded. In this method, the feature selection process and the learning algorithm (tuning the parameters) are combined to each other[6, 15]. The selection of optimal feature subset is an optimization problem that proved to be NP-hard, complex, and time- consuming problem [13]. Two major approaches are traditionally used to tackle NP-hard problems, as seen in Figure1: exact methods and metaheuristics. Exact methods allow exact solution to be found, but this approach is impractical since it is extremely time consuming for real world problems. On the other hand, metaheuristics are used for solving complex and real world problems. Because metaheuristics provide suboptimal (sometimes optimal) solution in reasonable time [2, 11, 13]. As seen in Figure1, Metaheuristics are divided into two categories [13]: • Trajectory-based (exploitation-oriented methods): the well-known metaheuristics families based on the manip- ulation of a single solution. Include Simulated Annealing (SA), Tabu Search (TS), Iterated Local Search (ILS), Variable Local Search (VNS), and Greedy Randomized Adaptive Search Procedures (GRASP). • Population-based (exploration-oriented methods): the well-known metaheuristics families based on the manipulation of a population of solutions. Include PSO, ACO, SS, Evolutionary Algorithms (EAs), Differential Evolution (DE), Evolutionary Strategies (ES), and Estimation Distribution Algorithms (EDA). Fig. 1. Approaches for handling NP-hard problems Metaheuristics algorithms have proved to be suitable tools for solving the feature selection accurately and efficiently for large dimensions in big datasets [2]. The main problems when dealing with big datasets are: The first is execution time because the complexity of the metaheuristics methods for feature selection is at least O(n2 D), where n is the number of instances and D is the number of features. The second is memory consumption since most methods for feature selection need to store the whole dataset in memory. Therefore, the researchers try to parallelize the sequential metaheuristics to improve their efficiency for feature selection on large datasets. There are many programming models and paradigms, such as MapReduce (Hadoop, spark), MPI, OpenMP, CUDA [1, 6, 13]. Parallel computing can be process interaction (shared memory, message passing) or problem decomposition (task or data parallelization) [6]. Parallel computing is a good solution for these problems since many calculations are carried out simultaneously in the task and/or data [6]. Population-based metaheuristics are naturally prone to parallelize since most of their variation operators can be easily undertaken in parallel [2, 13]. Parallel implementations of metaheuristics are an effective alternative to speed up sequential metaheuristics; by reducing the search time for solutions of optimization problems. Furthermore, they lead to the more precise random algorithm and improve the quality of solutions [11]. As seen in Figure2, the implementation of parallel metaheuristics is divided into two categories [13]. Fig. 2. Parallel implementation of metaheuristics Parallel evolutionary algorithms are used in many works rather than feature selection, such as inferring phylogenies, traffic prediction. In [9] Santander et al., used MPI/OpenMP with a hybrid multiobjective evolutionary algorithm (fast non- dominated sorting genetic algorithms and firefly algorithm); for phylogenetic reconstruction (Inferring evolutionary trees). 
In [10] Jiri at al., used parallel multiobjective GA with OpenMP. In order to make traffic prediction more accurate. Master-Slave scheme of GA was implemented on multi-core parallel architecture. They reduced the computational time, but it was successful for short-term traffic prediction. III. OVERVIEW OF PARALLEL EVOLUTIONARY ALGORITHMS FOR FEATURE SELECTION Feature selection algorithms are used to find an optimal subset of relevant features in the data. In this section we will talk about parallel evolutionary algorithms that are used for feature selection problem in large datasets. We will illustrate the steps of six algorithms (PGA, PCHC, PPSO, PGPSO, PSS, and PACO). International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 3, March 2018 182 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 3. A. Parallel Genetic algorithm (PGA) In order to increase the efficiency and reduce the execution time of the genetic algorithm (GA); the researchers used par- allel GA. Algorithm 1 presents the parallel GA methodology, with the master-slave model of parallel GA. Algorithm 1 Parallel genetic algorithm [10] Create initial population Evaluate initial population Create slaves while not done do Start slave Wait for slave to finish Run mutation operator end while for i=1 to slave iterations do Select individuals Run crossover operator Evaluate offsprings if solution found then set done=True end if end for B. Parallel CHC algorithm (PCHC) A CHC is a non-traditional GA, which combines a con- servative selection strategy (that always preserves the best individuals found so far), that produces offsprings that are at the maximum hamming distance from their parent. The main processes of CHC algorithm are [1]: • Half-Uniform Crossover (HUX): This will produce two offsprings, which are maximally different from their two parents. • Elitist selection: this will keep the best solutions in each generation. • Incest prevention: this step prevents two individuals to mate if the similarity between them greater than a thresh- old. • The Restarting process: if the specified population stagnated, then this step generated a new population by choosing the best individuals. C. Particle Swarm Optimization (PSO) This subsection handles the geometric particle swarm optimization (GPSO) and shows the algorithm that used to parallelize PSO or GPSO. 1) Geometric Particle Swarm Optimization (GPSO): GPSO is a recent version of PSO. The key issue in GPSO is the using a multi-parental recombination of solutions (particles). In the first phase, a random initialization of particles created. Then the algorithm evaluates these particles to update the historical and social positions. Finally, the three parents (3PMBCX) move the particles, as shown in Algorithm 2: Algorithm 2 GPSO algorithm [2] S:SwarmInitialization() while not stop condition do do for each particle i of the swarm S do do evaluate(solution(xi)) update(velocity equation (hi)) update(global best solution (gi)) end for for each particle i of the swarm S do do xi:3PMBCX ((xi, wa), (gi, wb), (hi, wc)) mutate(xi) end for end while Output: best solution found 2) Parallel Multi Swarm Optimization (PMSO): Parallel multi swarm optimization presented in [2], it was defined in analogy with parallel GA as a pair of (S, M), where S is a collection swarm, and M is a migration policy. Algorithm 3 depicts the parallel PSO methodology. Algorithm 3 Multi swarm optimization [2] DO IN PARALLEL for each i 1,...,m initialize(Si) while not stop condition do do iterate Si for n steps /* PSO evolution */ for each Sj (Si) do do send particles in s(Si) to Sj end for for each Sj such that Si (Sj ) do do receive particles from Sj replace particles in Si according to r end for end while Output: best solution ever found in the multi-swarm D. Parallel Scatter Search (PSS) Scatter search is an evolutionary method that was success- fully applied to hard optimization problems. It uses strategies for search diversification and intensification that have proved effective in a variety of optimization problems, see Algorithm 4. E. Parallel Ant Colony Optimization (PACO) When dealing with huge search space, parallel computing techniques usually applied to improve the efficiency. 
C. Particle Swarm Optimization (PSO)

This subsection covers geometric particle swarm optimization (GPSO) and shows the algorithms used to parallelize PSO and GPSO.

1) Geometric Particle Swarm Optimization (GPSO): GPSO is a recent version of PSO. The key idea in GPSO is the use of a multi-parental recombination of solutions (particles). In the first phase, a random initialization of the particles is created. The algorithm then evaluates these particles to update the historical and social positions. Finally, a three-parent mask-based crossover (3PMBCX) moves the particles, as shown in Algorithm 2.

Algorithm 2 GPSO algorithm [2]
  S := SwarmInitialization()
  while not stop condition do
    for each particle i of the swarm S do
      evaluate(solution(xi))
      update(velocity equation (hi))
      update(global best solution (gi))
    end for
    for each particle i of the swarm S do
      xi := 3PMBCX((xi, wa), (gi, wb), (hi, wc))
      mutate(xi)
    end for
  end while
  Output: best solution found

2) Parallel Multi-Swarm Optimization (PMSO): Parallel multi-swarm optimization is presented in [2]. It is defined, in analogy with the parallel GA, as a pair (S, M), where S is a collection of swarms and M is a migration policy. Algorithm 3 depicts the multi-swarm methodology.

Algorithm 3 Multi-swarm optimization [2]
  DO IN PARALLEL for each i = 1,...,m: initialize(Si)
  while not stop condition do
    iterate Si for n steps /* PSO evolution */
    for each Sj in the neighborhood of Si do
      send selected particles of Si to Sj
    end for
    for each Sj such that Si is in the neighborhood of Sj do
      receive particles from Sj
      replace particles in Si according to the replacement policy r
    end for
  end while
  Output: best solution ever found in the multi-swarm
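The (S, M) pair can be illustrated with a short Python sketch (ours, and deliberately simplified: the swarms are iterated sequentially and the PSO step is a dummy perturbation, whereas a real PMSO would run each swarm on its own processor):

import random

def pso_steps(swarm, n):
    """Stand-in for n PSO iterations on one swarm: perturb positions
    and rescore with a toy objective (maximize -||x||^2)."""
    for _ in range(n):
        for p in swarm:
            p["pos"] = [x + random.uniform(-0.1, 0.1) for x in p["pos"]]
            p["fit"] = -sum(v * v for v in p["pos"])
    return swarm

def migrate_ring(swarms, k=1):
    """Migration policy M: each swarm sends its k best particles to the
    next swarm on a ring, replacing that swarm's k worst particles."""
    bests = [sorted(s, key=lambda p: p["fit"], reverse=True)[:k] for s in swarms]
    for i, s in enumerate(swarms):
        incoming = bests[(i - 1) % len(swarms)]
        s.sort(key=lambda p: p["fit"])       # worst particles first
        s[:k] = [dict(p) for p in incoming]  # replace the k worst
    return swarms

# A collection S of m = 4 swarms of 10 particles each.
swarms = [[{"pos": [random.uniform(-1, 1) for _ in range(3)], "fit": 0.0}
           for _ in range(10)] for _ in range(4)]
for epoch in range(5):
    swarms = [pso_steps(s, n=20) for s in swarms]
    swarms = migrate_ring(swarms, k=2)
best = max((p for s in swarms for p in s), key=lambda p: p["fit"])
print(best["fit"])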
D. Parallel Scatter Search (PSS)

Scatter search is an evolutionary method that has been successfully applied to hard optimization problems. It uses strategies for search diversification and intensification that have proved effective in a variety of optimization problems; see Algorithm 4.

Algorithm 4 Parallel scatter search methodology [11]
  Create Population (Pop, PopSize)
  Generate ReferenceSet (RefSet, RefSetSize)
  while Stopping Criterion 1 do
    while Stopping Criterion 2 do
      Select Subset (Subset, SubsetSize)
      for each processor r = 1 to n do in parallel
        Combine Solutions (SubSet, CurSol)
        Improve Solution (CurSol, ImpSol)
      end for
    end while
    Update ReferenceSet (RefSet)
  end while

E. Parallel Ant Colony Optimization (PACO)

When dealing with a huge search space, parallel computing techniques are usually applied to improve efficiency. Parallel ACO algorithms can achieve high-quality solutions in reasonable execution times compared with sequential ACO [18]. Algorithm 5 presents the PACO methodology.

Algorithm 5 Parallel ant colony optimization methodology [18]
  Generate ants
  Initialize N processors
  Multicast to all slave processors N and the task ids of all slaves
  for each slave do
    Send a number between 0 and N that identifies the task inside the program
  end for
  while not all slaves have sent back a solution do
    Wait for a solution
    if a slave returns a solution that is better than any solution received so far then
      Multicast this solution to all slaves
    end if
  end while
  Return the best solution
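For intuition about what each ant computes when ACO is applied to feature selection, the following Python sketch (ours; a generic sequential ACO step with a placeholder fitness, not the exact procedure of [18]) builds feature subsets from pheromone values and reinforces the pheromones of the best subset:

import random

def build_subset(pheromone, k):
    """One ant selects k features, sampling proportionally to pheromone."""
    chosen, candidates = [], list(range(len(pheromone)))
    for _ in range(k):
        weights = [pheromone[i] for i in candidates]
        pick = random.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen

def evaluate(subset):
    """Placeholder fitness: a real system would train a classifier on
    these features and return its accuracy."""
    return random.random()

n_features, n_ants, rho = 50, 8, 0.1            # rho = evaporation rate
pheromone = [1.0] * n_features
for iteration in range(20):
    # In PACO, each ant would run on its own processor.
    ants = [build_subset(pheromone, k=10) for _ in range(n_ants)]
    scores = [evaluate(s) for s in ants]
    best = ants[scores.index(max(scores))]
    pheromone = [t * (1 - rho) for t in pheromone]  # evaporation
    for i in best:                                  # reinforcement
        pheromone[i] += max(scores)
print(sorted(best))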
IV. PARALLEL EVOLUTIONARY ALGORITHMS FOR FEATURE SELECTION

We reviewed a set of research papers dealing with the feature selection problem for high dimensional datasets in a parallel environment using parallel evolutionary algorithms. We discuss these studies in the following subsections.

A. Parallel GA

Liu et al. [5] used a parallel GA with a wrapper approach for selecting informative genes (features) in tissue classification. The main purpose was to find the subset of features with fewer elements and higher accuracy. The parallelization of the GA was performed by dividing the population into sub-populations and then running the GA on each sub-population, so that the search for the optimal subset of genes can proceed on several CPUs/computers at the same time. For evaluation, the Golub classifier was used. This classifier was introduced by the authors, and it depends on the sign of the result for classification: if the sign is positive, the sample x belongs to class 1; if it is negative, the sample x belongs to class 2. This classifier can be used only if the dataset has two classes. The accuracy of the classifier was tested using the LOOCV (leave-one-out cross-validation) method. The results showed that using the parallel GA increased the accuracy and reduced the number of genes used for classification.

In [8], Zheng et al. theoretically analyzed the execution speed and solution quality of several parallel GA schemes, and pointed to the best scheme of parallel GA to use on a multi-core architecture. The paper considered the relationship between speed and parallel architecture along with solution quality. They analyzed the Master-Slave, Synchronous Island, Asynchronous Island, Cellular, and hybrid Master-Slave/Island schemes of parallel GA, implemented with the Pthread library on a multi-core architecture, and performed experiments to validate the theoretical analysis. The hybrid scheme (Master-Slave combined with Asynchronous Island) had the best overall performance on the multi-core architecture. The Island scheme had the best execution time but the worst solution quality; to improve the solution quality when using the Island model, it is better to decrease the number of islands. The Asynchronous Island scheme is faster than the Synchronous one. The Master-Slave scheme had the best solution quality and the worst execution time.
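A minimal Python sketch of the sub-population (island) parallelization described above, assuming a generic wrapper fitness; we use the multiprocessing module to evolve each sub-population in its own process, and the helper names and placeholder fitness are ours:

import random
from multiprocessing import Pool

def fitness(mask):
    """Placeholder: a real wrapper would return classifier accuracy
    (e.g., LOOCV) on the features where mask[i] == 1. This toy score
    simply prefers masks with about 10 selected genes."""
    return -abs(sum(mask) - 10)

def evolve(subpop, generations=200):
    """Run a plain steady-state GA on one sub-population."""
    for _ in range(generations):
        a, b = random.sample(subpop, 2)
        cut = random.randrange(1, len(a))
        child = a[:cut] + b[cut:]              # one-point crossover
        i = random.randrange(len(child))
        child[i] = 1 - child[i]                # bit-flip mutation
        worst = min(range(len(subpop)), key=lambda j: fitness(subpop[j]))
        subpop[worst] = child                  # replace the worst
    return max(subpop, key=fitness)

if __name__ == "__main__":
    n_genes, pop_size, n_islands = 100, 40, 4
    population = [[random.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    islands = [population[i::n_islands] for i in range(n_islands)]
    with Pool(n_islands) as pool:              # one process per island
        champions = pool.map(evolve, islands)
    print(max(champions, key=fitness))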
Soufan et al. [15] developed a web-based tool called DWFS, which is used for feature selection in different problems. The tool follows a hybrid wrapper/filter approach. First, a filter is used as a preprocessing step to select the top-ranked features based on a tunable, predefined threshold. In the next step, a wrapper-based parallel GA is applied to the selected features to search for the feature subsets that increase the classifier accuracy. The scheme of the parallel GA was Master-Slave: the master node created the initial population and ran the GA steps, while the slave (worker) nodes performed the fitness evaluation of each chromosome. This implementation ran on 64 cores. For evaluation, they used three different classifiers (a Bayesian classifier, K-nearest neighbors, and a combination of them). The results of the experiments show that the DWFS tool provides many options to enhance feature selection in different biological and biomedical problems.
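The filter-then-wrapper pipeline can be summarized in a few lines of Python. This is a hedged sketch of the general idea only: the scoring function, the random-restart stand-in for the GA, and all names are ours, not DWFS internals:

import random

def filter_score(X, y, j):
    """Univariate filter: absolute difference of the class means of
    feature j (a simple stand-in for the filter ranking)."""
    a = [row[j] for row, label in zip(X, y) if label == 0]
    b = [row[j] for row, label in zip(X, y) if label == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))

def wrapper_accuracy(X, y, subset):
    """Placeholder for the wrapper fitness: train and validate a
    classifier restricted to `subset` and return its accuracy."""
    return random.random()

def hybrid_select(X, y, top_k=20, iters=100):
    # Stage 1 (filter): keep only the top_k ranked features.
    ranked = sorted(range(len(X[0])),
                    key=lambda j: filter_score(X, y, j),
                    reverse=True)[:top_k]
    # Stage 2 (wrapper): random-restart search as a stand-in for the GA.
    best, best_acc = None, -1.0
    for _ in range(iters):
        cand = random.sample(ranked, k=random.randint(1, top_k))
        acc = wrapper_accuracy(X, y, cand)
        if acc > best_acc:
            best, best_acc = cand, acc
    return best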
In [7], Pinho et al. presented ParJEColi (a Java-based library), a framework for parallel evolutionary algorithms in bioinformatics applications. The aim of this platform is to make the parallel environment (multi-core, cluster, and grid) easy and transparent to the user. The library adapts itself to the problem and to the target parallel architecture, and the user can easily configure the parallel model and the target architecture, since ParJEColi encapsulates the parallelization concerns as features. The explicit steps are carried out through a simple GUI. The framework was validated with experiments on two biological datasets and several bioinformatics scenarios. The results indicate that the proposed framework improves the computational performance (decreases the execution time) as well as the solution quality.

B. Parallel CHC

In [1], Peralta et al. presented a parallel version of the CHC evolutionary algorithm, using the MapReduce paradigm, for selecting features in high dimensional datasets in order to improve classification. The parallelization of the CHC algorithm was done with the MapReduce procedure (Hadoop implementation), using a cluster of 20 computing nodes, and each dataset was split into 512 map tasks. For evaluation, three classifiers were used: SVM (support vector machine), logistic regression, and a Bayesian classifier. The results showed that the run time for classification increased as the number of features decreased, except for the Bayesian classifier. They explained this result as follows: if the number of blocks is less than the number of computing machines, some machines remain idle, and if the number of blocks is greater than the number of computing machines, the blocks may not be distributed in an efficient way. They compared parallel CHC with the serial version and concluded that the accuracy of classification increases with parallel CHC; furthermore, the parallel version of CHC reduces the run time when the dataset is high dimensional.

C. Parallel PSO

PSO is an efficient optimization technique that is used to solve the feature selection problem in high dimensional datasets. In [4], Chen et al. used a parallel PSO algorithm to solve two problems at the same time, by creating an objective function that takes three variables into account simultaneously (the selected features, the number of support vectors, and the average accuracy of the SVM), in order to maximize the generalization capability of the SVM classifier. The proposed method, called PTVPSO-SVM (parallel time-variant particle swarm optimization support vector machine), has two phases: 1) the parameter setting of the SVM and the feature selection are performed together; 2) the accuracy of the SVM is evaluated using the set of features and the optimal parameters from the first phase. They used a parallel virtual machine (PVM) with 8 machines and 10-fold cross-validation. The results showed that the method achieved the following aims: increased classification accuracy, reduced execution time compared with sequential PSO, an appropriate parameter model, and selection of the most discriminative subset of features.
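A sketch of this kind of combined objective, assuming scikit-learn is available; the particle encoding and weights are our guesses for illustration, and the exact PTVPSO-SVM objective (which also counts support vectors) is simplified here:

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def fitness(particle, X, y, w_acc=0.8, w_feat=0.2):
    """Decode one particle into (C, gamma, feature mask) and score it:
    reward cross-validated accuracy, penalize large feature subsets."""
    C, gamma = 10.0 ** particle[0], 10.0 ** particle[1]  # log-scaled params
    mask = particle[2:] > 0.5                            # continuous -> binary
    if not mask.any():
        return -1.0                                      # empty subset: invalid
    clf = SVC(C=C, gamma=gamma)
    acc = cross_val_score(clf, X[:, mask], y, cv=10).mean()
    return w_acc * acc - w_feat * mask.mean()

# Each evaluation is independent, so a PVM-style master can farm the
# swarm out to workers (e.g., with multiprocessing.Pool) and collect scores.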
Feature selection can also be carried out based on rough set theory combined with a search algorithm, as in [3, 6]. In [6], Qian et al. proposed three parallel attribute reduction (feature selection) algorithms based on MapReduce on Hadoop. The first algorithm was built by constructing the proper (key, value) pairs from rough set theory and implementing the corresponding MapReduce functions. The second realized the parallel computation of equivalence classes and attribute significances. The last was designed to acquire the core attributes and a reduct, with parallelism in both the data and the tasks. The experiments were performed on a cluster of 17 computing nodes. They considered the performance of the parallel algorithms but did not focus on classification accuracy, since the sequential and parallel algorithms give the same results. The results showed that the proposed parallel attribute reduction algorithms can deal with high dimensional datasets efficiently and better than the sequential algorithms.

In [3], Adamczyk used rough set theory for attribute reduction and, to increase efficiency, implemented a parallel asynchronous PSO for this problem. The parallelization was done by assigning the complex function computations to slave cores, while the main core updates the particles and checks the convergence of the algorithm. The experiments showed that the efficiency and speedup of the parallel PSO algorithm rise as the size of the dataset increases. The achieved accuracy was not astonishing, but it was better than that of the classical algorithms.
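Equivalence classes, the building block of the rough-set algorithms in [3, 6], are easy to state in code: records are grouped by their values on the candidate attributes. In the Python sketch below (ours; the MapReduce shuffle of [6] is simulated with a dictionary rather than Hadoop), the attribute-value tuple plays the role of the key and the row index the role of the value:

from collections import defaultdict

def equivalence_classes(rows, attributes):
    """Group row indices by their values on `attributes` (the map and
    shuffle steps of the MapReduce formulation)."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attributes)
        classes[key].append(i)
    return dict(classes)

# Toy decision table: the last column is the decision attribute.
data = [("sunny", "hot", "no"),
        ("sunny", "mild", "no"),
        ("rainy", "hot", "yes"),
        ("rainy", "mild", "yes")]
print(equivalence_classes(data, attributes=[0]))
# {('sunny',): [0, 1], ('rainy',): [2, 3]}: attribute 0 alone already
# separates the two decision classes, so it is a strong reduct candidate.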
D. Parallel GPSO

In [2], Garcia-Nieto et al. parallelized a version of PSO called GPSO, which is suitable for the feature selection problem in high dimensional datasets. The proposed method, called PMSO (parallel multi-swarm optimizer), runs a set of parallel sub-PSO algorithms that form an island model, in which a migration operation exchanges solutions between islands at a certain frequency. The aim of the fitness function is to increase the classification accuracy while reducing the number of selected genes (features). They used the SVM (Support Vector Machine) classifier to assess the accuracy of the selected subsets of features, and a cluster of computers as the parallel architecture. They found that the 8-swarm PMSO was the best choice for parallelization. The results pointed out that this algorithm outperformed the sequential version and other methods in terms of performance and accuracy, while selecting few genes for each subset.

E. Parallel SS

In [11], Lopez et al. presented a parallel SS metaheuristic for solving the feature selection problem in classification. They proposed two methods for combining solutions in SS. The first, called GC (greedy combination), first adds the features common to the combined solutions and then, at each iteration, adds one of the remaining features to the new solution. The second, called RGC (reduced greedy combination), starts like GC, but in the next step it considers only the features that appear in solutions of good quality. The parallelization of SS is obtained by running the two methods (GC, RGC) at the same time on two processors, using different combination methods and parameter settings on each processor. They compared the proposed parallel SS with sequential SS and with a GA. The results show that the quality of the solutions from the parallel SS is better than that of the solutions obtained from the sequential SS and the GA; the parallel SS also uses a smaller set of features for classification, while the run time is the same for the parallel and sequential SS.
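A hedged Python sketch of the greedy combination idea: start from the features shared by the parent solutions, then grow the subset greedily. The names, the toy scoring function, and the size cap are ours, not the exact operator of [11]:

def greedy_combine(sol_a, sol_b, score, target_size):
    """GC-style combination of two feature subsets (sets of indices):
    keep the common features, then greedily add remaining ones."""
    combined = set(sol_a) & set(sol_b)           # shared features first
    remaining = (set(sol_a) | set(sol_b)) - combined
    while remaining and len(combined) < target_size:
        # Add the remaining feature that improves the evaluation most.
        best = max(remaining, key=lambda f: score(combined | {f}))
        combined.add(best)
        remaining.remove(best)
    return combined

# Example with a toy score that prefers low feature indices.
score = lambda s: -sum(s)
print(greedy_combine({1, 2, 5}, {2, 3, 5, 8}, score, target_size=4))
# -> {1, 2, 3, 5}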
F. Parallel ACO

This subsection shows how parallel ACO is used to solve the feature selection problem for classification in high dimensional datasets. In [17], Meena et al. implemented a parallel ACO to solve the feature selection problem for long documents. The parallelization was done using the MapReduce programming model (Hadoop), which automatically parallelizes the code and data and runs them on a cluster of computing nodes. The wrapper approach, with a Bayesian classifier, was used as the evaluation criterion, and the accuracy of the classifier was measured with these metrics: precision, recall, accuracy, and F-measure. The enhanced algorithm (parallel ACO) was compared with ACO, enhanced ACO, and two feature selection methods: CHI (a statistical technique) and IG (Information Gain). The results showed that, for a given fixed quality of the solutions, the proposed algorithm reduced the execution time. Moreover, the accuracy of the classifier increased when using parallel ACO compared with sequential ACO and the feature selection methods. Table I summarizes the algorithms and programming models used in the reviewed papers.

TABLE I
SUMMARY OF ALGORITHMS AND PROGRAMMING MODELS

Paper                   | Evolutionary algorithm | Parallel programming model
Peralta et al. [1]      | CHC (a type of GA)     | MapReduce
Garcia-Nieto et al. [2] | GPSO                   | MALLBA
Adamczyk [3]            | PSO                    | Unknown
Chen et al. [4]         | PSO                    | PVM
Liu et al. [5]          | GA                     | Unknown
Lopez et al. [11]       | SS                     | Unknown
Soufan et al. [15]      | GA                     | MPI
Meena et al. [17]       | ACO                    | MapReduce

In [12], Cano et al. parallelized an existing multi-objective ant programming model that is used as a classifier. This algorithm was used for rule mining in high dimensional datasets. The parallelization was applied to the data, with each ant encoding a rule; each processor performs the same task on a different subset of the data at the same time. The implementation used GPUs, which are many-core, massively parallel processor architectures, following the CUDA model. For evaluation, they used these metrics: true positives, false positives, true negatives, false negatives, sensitivity, and specificity. The results indicate that the efficiency of this model increased as the size of the dataset increased.

V. SUMMARY AND DISCUSSION

The summary of the papers that implemented parallel EAs for solving the classification problem in high dimensional datasets is reported in Table I and Table II. Many research papers [2, 3, 7, 8, 9, 10, 12] stated that we can reduce the execution time and achieve acceptable speedups when applying parallel evolutionary algorithms on multiple processors, and we noticed that a reasonable speedup was achieved in many cases.

When comparing the accuracy of the parallel EAs in Table II, it is important to notice how many classifiers were used to measure the accuracy, and to consider the metrics that were used to evaluate the classifier. For example, the parallel PSO and its variants show the highest accuracy, but they used only one metric, the success rate; this means that the parallel PSO is not necessarily the most accurate parallel EA based on Table II. On the other hand, the parallel GA and its variants show the lowest accuracy, but they were evaluated with two to five metrics. Based on these metrics, we can say that the parallel GA is the best parallel EA for feature selection in high dimensional datasets.

TABLE II
SUMMARY OF DATASETS, CLASSIFIERS, AND ACCURACY RESULTS

Paper                   | Dataset     | Classifier          | Metrics                | Accuracy
Peralta et al. [1]      | Epsilon     | Bayesian            | AUC = (TPR+TNR)/2      | 0.71
                        |             | SVM                 |                        | 0.68
                        |             | Logistic Regression |                        | 0.70
                        | ECBDL14-ROS | Bayesian            |                        | 0.67
                        |             | SVM                 |                        | 0.63
                        |             | Logistic Regression |                        | 0.63
Garcia-Nieto et al. [2] | Colon       | SVM                 | Success rate           | 0.85
                        | Lymp        |                     |                        | 0.97
                        | Leuk        |                     |                        | 0.98
                        | Lung        |                     |                        | 0.97
Adamczyk [3]            | 15 datasets | -                   | Success rate           | 0.70 (avg)
Chen et al. [4]         | 30 datasets | SVM                 | Success rate           | 0.87 (avg)
Liu et al. [5]          | Leukemia    | Golub               | Success rate           | 0.88
                        | Colon       |                     |                        | N/A
Lopez et al. [11]       | 12 datasets | Nearest Neighbor    | Success rate           | 0.86 (avg)
                        |             | Bayesian            |                        | 0.87 (avg)
                        |             | Decision Tree       |                        | 0.86 (avg)
Soufan et al. [15]      | 9 datasets  | K-Nearest Neighbor  | F1, PPV, GMean, ...    | 0.81 (avg GMean)
                        |             | Bayesian            |                        | 0.79 (avg GMean)
Meena et al. [17]       | 2 datasets  | Bayesian            | F-measure, recall, ... | 0.64 (avg)

VI. CONCLUSION

We have reviewed different parallel EAs that are used to solve the feature selection problem in high dimensional datasets, adopting accuracy as the measure to compare the algorithms' performance. The following points summarize our conclusions about the performance of the reviewed algorithms for feature selection:
• GA and its variants: based on the papers we reviewed, the parallel GA has the highest accuracy.
• PSO and its variants: the parallel PSO has the same accuracy as the sequential PSO.
• SS: the parallel SS gives better results in terms of accuracy than the GA and the sequential SS.
• ACO: the parallel ACO has less accurate results than the other parallel EAs.
It is noticeable that PGAs are the most suitable algorithms for feature selection in large datasets, since they achieve the highest accuracy. On the other hand, PACO is time-consuming and less accurate compared with the other parallel EAs.

References
[1] D. Peralta, S. del Rio, S. Ramirez-Gallego, I. Triguero, J. Benitez, F. Herrera. "Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach". Mathematical Problems in Engineering, Hindawi Publishing Corporation, Article ID 246139, 2015.
[2] J. Garcia-Nieto, E. Alba. "Parallel multi-swarm optimizer for gene selection in DNA microarrays". Applied Intelligence 37:255-266, 2012. DOI 10.1007/s10489-011-0325-9.
[3] M. Adamczyk. "Parallel Feature Selection Algorithm based on Rough Sets and Particle Swarm Optimization". ACSIS, Vol. 2, IEEE, 2014. DOI 10.15439/2014F389.
[4] H. L. Chen, B. Yang, S. Wang, G. Wang, D. Liu, H. Z. Li, W. Liu. "Towards an optimal support vector machine classifier using a parallel particle swarm optimization strategy". Applied Mathematics and Computation 239:180-197, Elsevier, 2014.
[5] J. Liu, H. Iba. "Selecting Informative Genes with Parallel Genetic Algorithms in Tissue Classification". Genome Informatics 12:14-23, 2001.
[6] J. Qian, D. Miao, Z. Zhang, X. Yue. "Parallel attribute reduction algorithms using MapReduce". Information Sciences, Elsevier, 2014.
[7] J. Pinho, L. Sobral, M. Rocha. "Parallel evolutionary computation in bioinformatics applications". Computer Methods and Programs in Biomedicine 110:183-191, Elsevier, 2013.
[8] L. Zheng, Y. Lu, M. Ding, Y. Shen, M. Guo, S. Guo. "Architecture-based performance evaluation of genetic algorithms on multi/many-core systems". The 14th IEEE International Conference on Computational Science and Engineering, 2011. DOI 10.1109/CSE.2011.65.
[9] S. Santander-Jimenez, M. Vega-Rodriguez. "Parallel Multiobjective Metaheuristics for Inferring Phylogenies on Multicore Clusters". IEEE Transactions on Parallel and Distributed Systems, 2014. DOI 10.1109/TPDS.2014.2325828.
[10] J. Petrlik, L. Sekanina. "Towards Robust and Accurate Traffic Prediction Using Parallel Multiobjective Genetic Algorithms and Support Vector Regression". IEEE 18th International Conference on Intelligent Transportation Systems, 2015.
[11] F. Lopez, M. Torres, B. Batista, J. Perez, J. Vega. "Solving feature selection problem by a parallel Scatter Search". European Journal of Operational Research 169:477-489, Elsevier, 2006.
[12] A. Cano, J. Olmo, S. Ventura. "Parallel Multi-Objective Ant Programming for Classification Using GPUs". Journal of Parallel and Distributed Computing, November 2012.
[13] E. Alba, G. Luque, S. Nesmachnow. "Parallel metaheuristics: recent advances and new trends". International Transactions in Operational Research 20:1-48, 2013. DOI 10.1111/j.1475-3995.2012.00862.x.
[14] H. Liu, H. Motoda (Eds.). "Instance Selection and Construction for Data Mining". Springer, 2001.
[15] O. Soufan, D. Kleftogiannis, P. Kalnis, V. Bajic. "DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm". PLOS ONE, February 2015. DOI 10.1371/journal.pone.0117988.
[16] A. Gottlieb, G. S. Almasi. "Highly Parallel Computing". Benjamin-Cummings Publishing Co., Redwood City, CA, USA, 1989.
[17] M. Meena, K. R. Chandran, A. Karthik, A. Samuel. "A parallel ACO algorithm to select terms to categorise longer documents". International Journal of Computational Science and Engineering, Vol. 6, No. 4, 2011.
[18] A. Sameh, A. Ayman, N. Hasan. "Parallel Ant Colony Optimization". International Journal of Research and Reviews in Computer Science (IJRRCS), Vol. 1, No. 2, June 2010.