SlideShare a Scribd company logo
Dhinaharan Nagamalai et al. (Eds) : ACITY, WiMoN, CSIA, AIAA, DPPR, NECO, InWeS - 2014
pp. 301–306, 2014. © CS & IT-CSCP 2014 DOI : 10.5121/csit.2014.4530
OPTIMAL RULE SET GENERATION USING
PSO ALGORITHM
Shampa sengupta1
, Asit Kumar Das2
1
Department of Information Technology, MCKV Institute of Engineering,
Liluah,
Howrah – 711 204, West Bengal, India
shampa2512@yahoo.co.in
2
Department of Computer Science and Technology, Indian Institute of
Engineering, Science and Technology, Shibpur, Howrah – 711 103, West
Bengal, India
akdas@cs.becs.ac.in
ABSTRACT
Classification and Prediction is an important research area of data mining. Construction of
classifier model for any decision system is an important job for many data mining applications.
The objective of developing such a classifier is to classify unlabeled dataset into classes. Here
we have applied a discrete Particle Swarm Optimization (PSO) algorithm for selecting optimal
classification rule sets from huge number of rules possibly exist in a dataset. In the proposed
DPSO algorithm, decision matrix approach was used for generation of initial possible
classification rules from a dataset. Then the proposed algorithm discovers important or
significant rules from all possible classification rules without sacrificing predictive accuracy.
The proposed algorithm deals with discrete valued data, and its initial population of candidate
solutions contains particles of different sizes. The experiment has been done on the task of
optimal rule selection in the data sets collected from UCI repository. Experimental results show
that the proposed algorithm can automatically evolve on average the small number of
conditions per rule and a few rules per rule set, and achieved better classification performance
of predictive accuracy for few classes.
KEYWORDS
Particle swarm optimization, Data Mining, Classifiers.
1. INTRODUCTION
Many particle swarm algorithms have been developed that deals only with continuous variables
[1, 5, 10]. This is a significant limitation because many optimization problems are there which
featuring discrete variables in the problem domain. Typical problems are there in the space which
deals with the ordering, grouping or arranging of discrete variables such as scheduling or routing
problems. Therefore, the developing of particle swarm algorithms that deals with discrete
variables is important for such kind of problem. We propose a variant of Particle Swarm
Optimization (PSO) algorithm applied to dicerete data for rule selection in Data Mining. We will
refer to this algorithm as the discrete Particle Swarm Optimization (DPSO) algorithm. The DPSO
deals with discrete valued data, and its population of candidate solutions contains particles of
different sizes. Although the algorithm has been specifically designed for optimized rule selection
task, it is by no means limited to this kind of application. The DPSO algorithm may be applied to
302 Computer Science & Information Technology (CS & IT)
other optimization problems with little modification. Construction of classifier model for any
decision system is an important job for many data mining applications. The objective of
developing such a classifier is to classify unlabeled dataset (object belongs to test set) into
classes. Here we propose a discrete Particle Swarm Optimization (PSO) algorithm for selection of
optimal or near optimal classification rules from huge number of rules possibly exists in a dataset.
Primarily, all exhaustive rules are generated from the dataset using decision matrix approach [11,
12] which is based on rough set theory. Then the DPSO method identifies important or significant
rules from all possible classification rules without sacrificing predictive accuracy. For the larger
dataset, if all decision rules are considered for data analysis then time complexity will be very
high. For this reason a minimum set of rules are generated.
The paper is organized as follows. Section 2 briefly introduces PSO algorithm. Section 3
introduces the DPSO algorithm proposed in this paper for the task of optimal rule selection.
Section 4 and 5 reports experimental methodology, experiments and results respectively and
finally Section 6 presents conclusions of the work.
2. A BRIEF INTRODUCTION TO PARTICLE SWARM OPTIMIZATION
PSO [5, 6, 7] is an evolutionary optimization algorithm proposed by Kennedy and Eberhart in
1995. In PSO, a population, called a swarm, of candidate solutions are encoded as particles in the
search space. PSO starts with the random initialization of a population of particles. The whole
swarm move in the search space to search for the best solution by updating the position of each
particle based on the experience of its own and its neighbouring particles. In PSO a potential
solution to a problem is represented by a particle X(i) = (x(i,1), x(i,2), …,x(i,n)) in an n-dimensional
search space. The coordinates x(i,d) of these particles have a rate of change(velocity) v(i,d) , d = 1,
2, ..., n. Every particle keeps a record of the best position that it has ever visited. Such a record is
called the particle's previous best position and denoted by Bi. The global best position attained by
any particle so far is also recorded and stored in a particle denoted by G. Iteration comprises
evaluation of each particle with the adjustment of v (i, d) in the direction of particle X(i)'s previous
best position and the previous best position of any particle in the neighbourhood.
Generally speaking, the set of rules that govern PSO are: evaluate, compare and evolve. The
evaluation phase measures how well each particle (candidate solution) solves the problem. The
comparison phase identifies the best particles. The evolve phase produces new particles based on
some of the best particles previously found. These three phases are repeated until a given
stopping criterion is matched. The objective is to find the best particle which solves the target
problem. Important concepts in PSO are velocity and neighbourhood topology. Each particle,
X(i), is associated with a velocity vector. This velocity vector is updated at every generation. The
updated velocity vector is then used to generate a new particle X(i). The neighbourhood
topology defines how other particles in the swarm, such as B (i) and G, interact with X (i) to
modify its respective velocity vector and, consequently, its position as well.
3. THE PROPOSED DISCRETE PSO ALGORITHM
The algorithm presented here is based on DPSO algorithm [6]. The proposed algorithm deals
with generation of optimized classification rules from discrete valued dataset, which is typically
a rule mining problem. Primarily, all exhaustive rules are generated from the dataset using
decision matrix approach [11, 12] which is based on rough set theory. Every rule has two parts,
conditional part and decision part, conditional part comprises of some conditional attributes with
their values and decision part has decision attribute with the corresponding decision value or
class. In a rule, each conditional attribute with the corresponding value is termed as rule
component. So a rule is formed by some rule components. In PSO, population of candidate
solutions contains particles of different sizes. Here population of candidate solutions is generated
Computer Science & Information Technology (CS & IT) 303
from initial rule set. Each particle represents the antecedent part of a rule by considering the
conclusion part is fixed at the beginning of the execution and represents a target attribute value.
In this case to discover the optimal rule set, predicting different target/decision values, the
algorithm has to run several times, one for each decision value. There are N (No of initial rules)
particles in a swarm. The length of each particle may vary from 1 to n, where n is the number of
unique rule component present in the initial rule set, which is made by considering only the
antecedent part of each rule. Each particle Ri keeps a record of the best position it has ever
attained. This information is kept in a distinguished particle labeled as Bi. The swarm also keeps
the information of the global best position ever attained by any particle in the swarm. This
information is also kept in a separated particle labeled G. G is equal to the best Bi present in the
swarm.
3.1 ENCODING OF THE PARTICLES FOR THE PROPOSED DPSO ALGORITHM
Each rule is formed by some rule components and each rule component is identified by a unique
positive integer number or index. These indices, vary from 1 to n, where n is the number of
unique rule components present in the initial rule set. Each particle is subsets of indices without
repetition. For example, corresponding to the rules R1, R2 and R3 given below, the rule
components are (A=1), (A=2), (B=3), (B=4) and (C=5) which are indexed as 1, 2, 3, 4 and 5
respectively.
R1= {A=1, B=3, C=5}
R2= {A=2, B=4}
R3= {A=2, C=5}
Where, A, B, C are the conditional attributes, N (Number of initial rules) = 3. Here initial swarm
representing candidate solution could looks as follows:
R1= {1, 3, 5}
R2= {2, 4}
R3= {2, 5}.
3.2 THE INITIAL POPULATION FOR THE PROPOSED DPSO ALGORITHM
The initial population of particles is generated as follows.
Population of candidate solutions is generated from initial rule set. Each particle represents the
antecedent part of a rule by considering the conclusion part as fixed at the beginning of the
execution and represents a target attribute value. Rule encoding process is described in section
3.1. By considering the rule encoding process the particles are formed for each rule generated
using decision matrix based classification method.
3.3 VELOCITIES
The DPSO algorithm does not use a vector of velocities. It works with proportional likelihoods.
Basically, the idea of proportional likelihood used in the DPSO algorithm is almost similar with
the idea of velocity used in the standard PSO algorithm. Each particle is associated with a 2 × n
array of proportional likelihoods, where 2 and n represents number of rows and number of
columns respectively. In this standard proportional likelihood array, each element in the first row
of Vi represents the proportional likelihood based on which a rule component be selected. The
second row of Vi has the indices of the rule components which is associated with the respective
proportional likelihoods of the first row of the vector Vi. There is a one-to-one correspondence
between the columns of this array. At the beginning, all elements in the first row of Vi are set to
1, e.g., Vi = {{1, 1, 1, 1, 1 }, {1,2,3,4,5}}.
304 Computer Science & Information Technology (CS & IT)
After the initial population of particles is generated, this array is always updated before a new
configuration for the particle associated to it is made. The updating process is based on Ri, Bi
(particle's previous best position) and G (global best position) and works as follows.
In addition to Ri, Bi and G, three constant updating factors, namely, a, b and c are used to update
the proportional likelihoods v(i,d). These factors determine the strength of the contribution of Ri, Bi
and G to the adjustment of every coordinate v(i,d) € Vi. Parameter values of a, b and c is chosen
experimentally.
3.4 GENERATING NEW PARTICLES FOR THE PROPOSED DPSO ALGORITHM
The proportional likelihood array Vi is then used to sample a new configuration of particle Ri that
is, the particle associated to it. First, for a particle with Vi, all indices present in Ri have their
corresponding proportional likelihood increased by ‘a’. Similarly, all indices present in Bi and G
have their corresponding proportional likelihood increased by ‘b’ and ‘c’ respectively. Now each
element of the first row of the array Vi is then multiplied by a uniform random number between 0
and 1. A new random number is generated for every single multiplication performed. The new
particle is then defined by ranking the columns in Vi by the values in its first row. That is, the
elements in the first row of the array are ranked in a decreasing order of value and the indices of
the rule components (in the second row of Vi) follow their respective proportional likelihoods.
Thus for example, after all the steps if particle i has length 3, and the particle associated to the
array
V i =
0.74 0.57 0.50 0.420.20
5 4 3 1 2
Then first 3 indices from the second row of Vi would be selected to compose the new particle.
That is, Ri = {*, *, *} the indices (rule components) 5, 4 and 3 would be selected to compose the
new particle, i.e., Ri = {5, 4, 3}. Note that indices that have a higher proportional likelihood are,
on average, more likely to be selected. If indices are the rule components with same attribute are
selected, then higher confidence rule component is selected. In that case new particle size is also
changes in the generation. The updating of Ri, Bi and G is identical to what is described earlier. In
this way new particles are formed generation by generation.
4. EXPERIMENTAL METHODOLOGY
The purpose of this experiment was to evaluate the classification accuracy and comprehensibility
by the number of rules in the data set, and the average number of rule conditions per rule. The
fitness function f (Ri) of any particle i is computed as follows. Optimized rule selection process
can be performed by DPSO as a multi objective problem to maximize the confidence (association
rule mining concept) of a rule to achieve higher classification accuracy as well as minimizing the
length of a rule. The goal is to see whether DPSO can select optimized set of rules to achieve a
higher classification accuracy rate from initial set of rules.
In this regard following fitness function is considered.
Fi=α*rulei_confidence+ (1-α)*1/ (rulei_length)
rulei _confidence: A rule like, Ri → ( = a) ( = b) → (D =di) has the
condition part Ci where conditional attributes are associated with value, and the decision part D
has the decision values di. So here the rule maps Ci→ di.
Computer Science & Information Technology (CS & IT) 305
Then the confidence of the rule is conf (Ri) = number of rows in dataset that match Ci and have
class label di / number of rows in dataset that match only Ci.
Here relative importance of the rule confidence and the length of the rule are considered. Rule
confidence is set larger than length of a rule because the classification performance is assumed
more important than the number of conditional attributes present in a rule i.e. rule length. The
objective is to find the fittest rules with which it is possible to classify the data set as belonging
to one of the classes with an acceptable accuracy. Here α=0.8 is considered.
5. EXPERIMENTS AND RESULT
The computational experiments involved a 10-fold cross-validation method [13].To illustrates the
method, wine dataset from UCI repository [14] of Machine Learning databases [10] is considered.
First, the 178 records with 3 distinct classes (0, 1, 2) in the wine data set were divided into 10
almost equally sized folds. The folds were randomly generated but under the following
regulation. The proportion of different classes in every single fold was similar to the one found in
the original data set containing all the 178 records. Each of the 10 folds is used once as test set
and the remaining of the data set is used as training set. The purpose of the experiment was to
evaluate the generalization ability, measured as the classification accuracy on test set. Each
training set was used for generation of initial classification rules. For each initial rule set, DPSO
algorithm generates optimized set of classification rules according to their fitness which is used to
classify the test set (each 10 test folds) accordingly. DPSO selects only the best particle in each
run as the rule.
As the DPSO algorithm is stochastic algorithm, 20 independent runs for the algorithm were
performed for every single fold and for every decision classes. The average number of rules by
the rule selection algorithms has always been rounded to the nearest integer. The population size
(initial rule set) used for the algorithm is on average 300 and the search stops in one run after 100
iterations. Other choices of parameter values were a = 0.10, b = 0.12 and c = 0.14. The results of
experiment were as follows. For Wine on average 95 rules are selected as optimized rules from
300 rules. And using these 95 rules on test set 99.32% of average classification accuracy is
achieved.Using fewer rules, DPSO algorithm obtained on average, a better predictive accuracy
than the classification performed using all the initial classification rules for few classes. For the
initial rule set average accuracy values on test set for 3 classes (0, 1, 2)were 99.05, 99.48, 99.77,
while using optimized rule set corresponding accuracy values on test set were 98.38, 99.59, 100.
The results also indicate that not only the predictive accuracy is good, but also the number of rule
conditions or rule length is relatively short like 2- 4 rule components in a rule where the same was
3-5 in the initial rule set and there are a small number of rules in the rule set.
6. CONCLUSION
We proposed an efficient method for rule discovery using DPSO algorithm. The discovered rules
are with of high accuracy and comprehensibility. Using fewer rules, DPSO algorithm obtained on
average, a better predictive accuracy than the classification performed using all the initial
classification rules for few classes. The results on the Wine data set show that our approach has
good performance for rule discovery on discrete data. Few less coverage rules were also selected.
Rule Coverage is an important parameter for selection of good quality rule. In the future by
applying some rule pruning technique better quality rule set can be generated.
306 Computer Science & Information Technology (CS & IT)
REFERENCES
[1] T. Blackwell and J. Branke. Multi-swarm optimization in dynamic environments. In Lecture Notes in
Computer Science, volume 3005, pages 489{500. Springer-Verlag, 2004.
[2] E. S. Correa, M. T. Steiner, A. A. Freitas, and C. Carnieri. Using a genetic algorithm for solving a
capacity p-median problem. Numerical Algorithms, 35:373{388, 2004.
[3] D. Freedman, R. Pisani, and R. Purves. Statistics. W. Norton & Company, 3rd edition, September
1997.
[4] A. A. Freitas. Data Mining and Knowledge Discovery with Evolutionary Algorithms. Springer-
Verlag, October 2002.
[5] G. Kendall and Y. Su. A particle swarm optimisation approach in the construction of optimal risky
portfolios. In Proceedings of the 23rd IASTED International Multi-Conference on Applied
Informatics, pages 140{145, 2005. Artificial intelligence and applications
[6] J. Kennedy and R. C. Eberhart. A discrete binary version of the particle swarm algorithm. In
Proceedings of the 1997 Conference on Systems, Man, and Cybernetics, pages 4104{4109,
Piscataway, NJ, USA, 1997. IEEE.
[7] J. Kennedy and R. C. Eberhart. Swarm Intelligence. Morgan Kaufmann Publishers Inc., San
Francisco, CA, USA, 2001
[8] T. M. Mitchell. Machine Learning. McGraw-Hill, August 1997.
[9] R. Poli, C. D. Chio, and W. B. Langdon. Exploring extended particle swarms: a genetic programming
approach. In GECCO'05: Proceedings of the 2005 Conference on Genetic and Evolutionary
Computation, pages 169{176, New York, NY, USA, 2005. ACM Press.
[10] Y. Shi and R. C. Eberhart. Parameter selection in particle swarm optimization. In EP'98: Proceedings
of the 7th International Conference on Evolutionary Programming, pages 591{600, London, UK,
1998. Springer-Verlag
[11] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan
Kaufmann, 2nd edition, 2005
[12] N.Zhong and A. Skowron, “A Rough Set-Based Knowledge Discovery Process”, Int. Journal of
Applied Mathematics and Computer Science., 11(3), 603-619, 2001. BIME Journal, Volume (05),),
2005
[13] Ethem Alpaydin Introduction to Machine Learning.PHI, 2010.
[14] Murphy, P. and Aha, W.: UCI repository of machine learning databases (1996),
http://www.ics.uci.edu/mlearn/MLRepository.html
Ad

Recommended

A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
IJRES Journal
 
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
idescitation
 
Two-Stage Eagle Strategy with Differential Evolution
Two-Stage Eagle Strategy with Differential Evolution
Xin-She Yang
 
Applications and Analysis of Bio-Inspired Eagle Strategy for Engineering Opti...
Applications and Analysis of Bio-Inspired Eagle Strategy for Engineering Opti...
Xin-She Yang
 
An improved teaching learning
An improved teaching learning
csandit
 
Master's Thesis Presentation
Master's Thesis Presentation
●๋•máńíکhá Gőýálツ
 
An Algorithm of Policy Gradient Reinforcement Learning with a Fuzzy Controlle...
An Algorithm of Policy Gradient Reinforcement Learning with a Fuzzy Controlle...
Waqas Tariq
 
Particle Swarm Optimization based K-Prototype Clustering Algorithm
Particle Swarm Optimization based K-Prototype Clustering Algorithm
iosrjce
 
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
IJERA Editor
 
Medical diagnosis classification
Medical diagnosis classification
csandit
 
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Shubhashis Shil
 
BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...
IJAEMSJORNAL
 
Artificial Intelligence in Robot Path Planning
Artificial Intelligence in Robot Path Planning
iosrjce
 
Novel algorithms for detection of unknown chemical molecules with specific bi...
Novel algorithms for detection of unknown chemical molecules with specific bi...
Aboul Ella Hassanien
 
Anirban part1
Anirban part1
kamatchi priya
 
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
gerogepatton
 
Borgulya
Borgulya
Khadija Achkoun
 
A046010107
A046010107
IJERA Editor
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
ijsrd.com
 
Ica 2013021816274759
Ica 2013021816274759
Valentino Selayan
 
A Framework for Self-Tuning Optimization Algorithm
A Framework for Self-Tuning Optimization Algorithm
Xin-She Yang
 
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
ijsc
 
Cluster analysis
Cluster analysis
Jewel Refran
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
inventy
 
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
IJMER
 
Artificial bee colony with fcm for data clustering
Artificial bee colony with fcm for data clustering
Alie Banyuripan
 
Selecting the best stochastic systems for large scale engineering problems
Selecting the best stochastic systems for large scale engineering problems
IJECEIAES
 
Bh4301327332
Bh4301327332
IJERA Editor
 
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
ijsrd.com
 
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
IJERA Editor
 

More Related Content

What's hot (19)

A Combined Approach for Feature Subset Selection and Size Reduction for High ...
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
IJERA Editor
 
Medical diagnosis classification
Medical diagnosis classification
csandit
 
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Shubhashis Shil
 
BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...
IJAEMSJORNAL
 
Artificial Intelligence in Robot Path Planning
Artificial Intelligence in Robot Path Planning
iosrjce
 
Novel algorithms for detection of unknown chemical molecules with specific bi...
Novel algorithms for detection of unknown chemical molecules with specific bi...
Aboul Ella Hassanien
 
Anirban part1
Anirban part1
kamatchi priya
 
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
gerogepatton
 
Borgulya
Borgulya
Khadija Achkoun
 
A046010107
A046010107
IJERA Editor
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
ijsrd.com
 
Ica 2013021816274759
Ica 2013021816274759
Valentino Selayan
 
A Framework for Self-Tuning Optimization Algorithm
A Framework for Self-Tuning Optimization Algorithm
Xin-She Yang
 
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
ijsc
 
Cluster analysis
Cluster analysis
Jewel Refran
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
inventy
 
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
IJMER
 
Artificial bee colony with fcm for data clustering
Artificial bee colony with fcm for data clustering
Alie Banyuripan
 
Selecting the best stochastic systems for large scale engineering problems
Selecting the best stochastic systems for large scale engineering problems
IJECEIAES
 
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
IJERA Editor
 
Medical diagnosis classification
Medical diagnosis classification
csandit
 
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Solving Multidimensional Multiple Choice Knapsack Problem By Genetic Algorith...
Shubhashis Shil
 
BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...
IJAEMSJORNAL
 
Artificial Intelligence in Robot Path Planning
Artificial Intelligence in Robot Path Planning
iosrjce
 
Novel algorithms for detection of unknown chemical molecules with specific bi...
Novel algorithms for detection of unknown chemical molecules with specific bi...
Aboul Ella Hassanien
 
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
AUTOMATIC TRANSFER RATE ADJUSTMENT FOR TRANSFER REINFORCEMENT LEARNING
gerogepatton
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
ijsrd.com
 
A Framework for Self-Tuning Optimization Algorithm
A Framework for Self-Tuning Optimization Algorithm
Xin-She Yang
 
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
MARKOV CHAIN AND ADAPTIVE PARAMETER SELECTION ON PARTICLE SWARM OPTIMIZER
ijsc
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
inventy
 
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
A Threshold Fuzzy Entropy Based Feature Selection: Comparative Study
IJMER
 
Artificial bee colony with fcm for data clustering
Artificial bee colony with fcm for data clustering
Alie Banyuripan
 
Selecting the best stochastic systems for large scale engineering problems
Selecting the best stochastic systems for large scale engineering problems
IJECEIAES
 

Viewers also liked (17)

Bh4301327332
Bh4301327332
IJERA Editor
 
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
ijsrd.com
 
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
IJERA Editor
 
Renewable sources of energy.
Renewable sources of energy.
awsomejedi
 
Maximum power pont tracking
Maximum power pont tracking
Adítýã Pâñïgrähï
 
Concept of renewable energy-various forms of renewable energy
Concept of renewable energy-various forms of renewable energy
Shanid Pazhamkavil
 
simulation of maximum power point tracking for photovoltaic systems
simulation of maximum power point tracking for photovoltaic systems
ST. MARTIN'S ENGINEERING COLLEGE
 
Maximum power point tracking techniques a review
Maximum power point tracking techniques a review
graphic era university
 
Feature selection concepts and methods
Feature selection concepts and methods
Reza Ramezani
 
Buck boost converter
Buck boost converter
Sathiya kumar
 
Maximum Power Point Tracking of PV Arrays using Different Techniques
Maximum Power Point Tracking of PV Arrays using Different Techniques
IJERA Editor
 
Particle Swarm Optimization: The Algorithm and Its Applications
Particle Swarm Optimization: The Algorithm and Its Applications
adil raja
 
a project report on MPPT algorithm for PV panel
a project report on MPPT algorithm for PV panel
gauravchitransh
 
Maximum power point tracking.......saq
Maximum power point tracking.......saq
Saquib Maqsood
 
Renewable And Non Renewable Sources Of Energy
Renewable And Non Renewable Sources Of Energy
oneworld.abhiraj
 
Renewable non-renewable-energy-resources
Renewable non-renewable-energy-resources
Rachit Walia
 
100% Renewable Energy by 2050: Fact or Fantasy
100% Renewable Energy by 2050: Fact or Fantasy
John Farrell
 
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
A NOVEL APPROACH TO OBTAIN MAXIMUM POWER OUTPUT FROM SOLAR PANEL USING PSO
ijsrd.com
 
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
A Positive Buck Boost Converter with Mode Select Circuit and Feed Forward Tec...
IJERA Editor
 
Renewable sources of energy.
Renewable sources of energy.
awsomejedi
 
Concept of renewable energy-various forms of renewable energy
Concept of renewable energy-various forms of renewable energy
Shanid Pazhamkavil
 
simulation of maximum power point tracking for photovoltaic systems
simulation of maximum power point tracking for photovoltaic systems
ST. MARTIN'S ENGINEERING COLLEGE
 
Maximum power point tracking techniques a review
Maximum power point tracking techniques a review
graphic era university
 
Feature selection concepts and methods
Feature selection concepts and methods
Reza Ramezani
 
Buck boost converter
Buck boost converter
Sathiya kumar
 
Maximum Power Point Tracking of PV Arrays using Different Techniques
Maximum Power Point Tracking of PV Arrays using Different Techniques
IJERA Editor
 
Particle Swarm Optimization: The Algorithm and Its Applications
Particle Swarm Optimization: The Algorithm and Its Applications
adil raja
 
a project report on MPPT algorithm for PV panel
a project report on MPPT algorithm for PV panel
gauravchitransh
 
Maximum power point tracking.......saq
Maximum power point tracking.......saq
Saquib Maqsood
 
Renewable And Non Renewable Sources Of Energy
Renewable And Non Renewable Sources Of Energy
oneworld.abhiraj
 
Renewable non-renewable-energy-resources
Renewable non-renewable-energy-resources
Rachit Walia
 
100% Renewable Energy by 2050: Fact or Fantasy
100% Renewable Energy by 2050: Fact or Fantasy
John Farrell
 
Ad

Similar to Optimal rule set generation using pso algorithm (20)

Particle swarm optimization
Particle swarm optimization
Mahesh Tibrewal
 
Markov Chain and Adaptive Parameter Selection on Particle Swarm Optimizer
Markov Chain and Adaptive Parameter Selection on Particle Swarm Optimizer
ijsc
 
PSO
PSO
Pooya Sagharchiha
 
Pso notes
Pso notes
Darshan Sharma
 
A PARTICLE SWARM OPTIMIZATION ALGORITHM BASED ON UNIFORM DESIGN
A PARTICLE SWARM OPTIMIZATION ALGORITHM BASED ON UNIFORM DESIGN
IJDKP
 
An Improved Adaptive Multi-Objective Particle Swarm Optimization for Disassem...
An Improved Adaptive Multi-Objective Particle Swarm Optimization for Disassem...
IJRESJOURNAL
 
DriP PSO- A fast and inexpensive PSO for drifting problem spaces
DriP PSO- A fast and inexpensive PSO for drifting problem spaces
Zubin Bhuyan
 
A Fast and Inexpensive Particle Swarm Optimization for Drifting Problem-Spaces
A Fast and Inexpensive Particle Swarm Optimization for Drifting Problem-Spaces
Zubin Bhuyan
 
Particle Swarm Optimization
Particle Swarm Optimization
Stelios Petrakis
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problems
riyaniaes
 
Particle swarm optimization
Particle swarm optimization
anurag singh
 
Particle Swarm Optimization
Particle Swarm Optimization
QasimRehman
 
PSO.ppsx
PSO.ppsx
Arunkumar Tulasi
 
Particle Swarm Optimization Application In Power System
Particle Swarm Optimization Application In Power System
Ministry of New & Renewable Energy, Govt of India
 
A REVIEW OF PARTICLE SWARM OPTIMIZATION (PSO) ALGORITHM
A REVIEW OF PARTICLE SWARM OPTIMIZATION (PSO) ALGORITHM
IAEME Publication
 
A-Hybrid-Approach-Using-Particle-Swarm-Optimization-and-Simulated-Annealing-f...
A-Hybrid-Approach-Using-Particle-Swarm-Optimization-and-Simulated-Annealing-f...
Pourya Jafarzadeh
 
Comparison Between PSO and HPSO In Image Steganography
Comparison Between PSO and HPSO In Image Steganography
IJCSIS Research Publications
 
Particle swarm optimization (PSO) ppt presentation
Particle swarm optimization (PSO) ppt presentation
LatestShorts
 
PSO__AndryPinto_InesDomingues_LuisRocha_HugoAlves_SusanaCruz.pptx
PSO__AndryPinto_InesDomingues_LuisRocha_HugoAlves_SusanaCruz.pptx
SubhamGupta106798
 
A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...
Pim Piepers
 
Particle swarm optimization
Particle swarm optimization
Mahesh Tibrewal
 
Markov Chain and Adaptive Parameter Selection on Particle Swarm Optimizer
Markov Chain and Adaptive Parameter Selection on Particle Swarm Optimizer
ijsc
 
A PARTICLE SWARM OPTIMIZATION ALGORITHM BASED ON UNIFORM DESIGN
A PARTICLE SWARM OPTIMIZATION ALGORITHM BASED ON UNIFORM DESIGN
IJDKP
 
An Improved Adaptive Multi-Objective Particle Swarm Optimization for Disassem...
An Improved Adaptive Multi-Objective Particle Swarm Optimization for Disassem...
IJRESJOURNAL
 
DriP PSO- A fast and inexpensive PSO for drifting problem spaces
DriP PSO- A fast and inexpensive PSO for drifting problem spaces
Zubin Bhuyan
 
A Fast and Inexpensive Particle Swarm Optimization for Drifting Problem-Spaces
A Fast and Inexpensive Particle Swarm Optimization for Drifting Problem-Spaces
Zubin Bhuyan
 
Particle Swarm Optimization
Particle Swarm Optimization
Stelios Petrakis
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problems
riyaniaes
 
Particle swarm optimization
Particle swarm optimization
anurag singh
 
Particle Swarm Optimization
Particle Swarm Optimization
QasimRehman
 
A REVIEW OF PARTICLE SWARM OPTIMIZATION (PSO) ALGORITHM
A REVIEW OF PARTICLE SWARM OPTIMIZATION (PSO) ALGORITHM
IAEME Publication
 
A-Hybrid-Approach-Using-Particle-Swarm-Optimization-and-Simulated-Annealing-f...
A-Hybrid-Approach-Using-Particle-Swarm-Optimization-and-Simulated-Annealing-f...
Pourya Jafarzadeh
 
Comparison Between PSO and HPSO In Image Steganography
Comparison Between PSO and HPSO In Image Steganography
IJCSIS Research Publications
 
Particle swarm optimization (PSO) ppt presentation
Particle swarm optimization (PSO) ppt presentation
LatestShorts
 
PSO__AndryPinto_InesDomingues_LuisRocha_HugoAlves_SusanaCruz.pptx
PSO__AndryPinto_InesDomingues_LuisRocha_HugoAlves_SusanaCruz.pptx
SubhamGupta106798
 
A comparison of particle swarm optimization and the genetic algorithm by Rani...
A comparison of particle swarm optimization and the genetic algorithm by Rani...
Pim Piepers
 
Ad

Recently uploaded (20)

PyCon SG 25 - Firecracker Made Easy with Python.pdf
PyCon SG 25 - Firecracker Made Easy with Python.pdf
Muhammad Yuga Nugraha
 
"Scaling in space and time with Temporal", Andriy Lupa.pdf
"Scaling in space and time with Temporal", Andriy Lupa.pdf
Fwdays
 
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Priyanka Aash
 
Mastering AI Workflows with FME by Mark Döring
Mastering AI Workflows with FME by Mark Döring
Safe Software
 
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik
 
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
Fwdays
 
Daily Lesson Log MATATAG ICT TEchnology 8
Daily Lesson Log MATATAG ICT TEchnology 8
LOIDAALMAZAN3
 
Smarter Aviation Data Management: Lessons from Swedavia Airports and Sweco
Smarter Aviation Data Management: Lessons from Swedavia Airports and Sweco
Safe Software
 
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Saikat Basu
 
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
caoyixuan2019
 
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Josef Weingand
 
Salesforce Summer '25 Release Frenchgathering.pptx.pdf
Salesforce Summer '25 Release Frenchgathering.pptx.pdf
yosra Saidani
 
Hyderabad MuleSoft In-Person Meetup (June 21, 2025) Slides
Hyderabad MuleSoft In-Person Meetup (June 21, 2025) Slides
Ravi Tamada
 
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Safe Software
 
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
ScyllaDB
 
Cyber Defense Matrix Workshop - RSA Conference
Cyber Defense Matrix Workshop - RSA Conference
Priyanka Aash
 
Curietech AI in action - Accelerate MuleSoft development
Curietech AI in action - Accelerate MuleSoft development
shyamraj55
 
Oh, the Possibilities - Balancing Innovation and Risk with Generative AI.pdf
Oh, the Possibilities - Balancing Innovation and Risk with Generative AI.pdf
Priyanka Aash
 
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Nilesh Gule
 
" How to survive with 1 billion vectors and not sell a kidney: our low-cost c...
" How to survive with 1 billion vectors and not sell a kidney: our low-cost c...
Fwdays
 
PyCon SG 25 - Firecracker Made Easy with Python.pdf
PyCon SG 25 - Firecracker Made Easy with Python.pdf
Muhammad Yuga Nugraha
 
"Scaling in space and time with Temporal", Andriy Lupa.pdf
"Scaling in space and time with Temporal", Andriy Lupa.pdf
Fwdays
 
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Priyanka Aash
 
Mastering AI Workflows with FME by Mark Döring
Mastering AI Workflows with FME by Mark Döring
Safe Software
 
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik
 
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
Fwdays
 
Daily Lesson Log MATATAG ICT TEchnology 8
Daily Lesson Log MATATAG ICT TEchnology 8
LOIDAALMAZAN3
 
Smarter Aviation Data Management: Lessons from Swedavia Airports and Sweco
Smarter Aviation Data Management: Lessons from Swedavia Airports and Sweco
Safe Software
 
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Quantum AI Discoveries: Fractal Patterns Consciousness and Cyclical Universes
Saikat Basu
 
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
caoyixuan2019
 
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Josef Weingand
 
Salesforce Summer '25 Release Frenchgathering.pptx.pdf
Salesforce Summer '25 Release Frenchgathering.pptx.pdf
yosra Saidani
 
Hyderabad MuleSoft In-Person Meetup (June 21, 2025) Slides
Hyderabad MuleSoft In-Person Meetup (June 21, 2025) Slides
Ravi Tamada
 
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Safe Software
 
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
Database Benchmarking for Performance Masterclass: Session 2 - Data Modeling ...
ScyllaDB
 
Cyber Defense Matrix Workshop - RSA Conference
Cyber Defense Matrix Workshop - RSA Conference
Priyanka Aash
 
Curietech AI in action - Accelerate MuleSoft development
Curietech AI in action - Accelerate MuleSoft development
shyamraj55
 
Oh, the Possibilities - Balancing Innovation and Risk with Generative AI.pdf
Oh, the Possibilities - Balancing Innovation and Risk with Generative AI.pdf
Priyanka Aash
 
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Enhance GitHub Copilot using MCP - Enterprise version.pdf
Nilesh Gule
 
" How to survive with 1 billion vectors and not sell a kidney: our low-cost c...
" How to survive with 1 billion vectors and not sell a kidney: our low-cost c...
Fwdays
 

Optimal rule set generation using pso algorithm

  • 1. Dhinaharan Nagamalai et al. (Eds) : ACITY, WiMoN, CSIA, AIAA, DPPR, NECO, InWeS - 2014 pp. 301–306, 2014. © CS & IT-CSCP 2014 DOI : 10.5121/csit.2014.4530 OPTIMAL RULE SET GENERATION USING PSO ALGORITHM Shampa sengupta1 , Asit Kumar Das2 1 Department of Information Technology, MCKV Institute of Engineering, Liluah, Howrah – 711 204, West Bengal, India [email protected] 2 Department of Computer Science and Technology, Indian Institute of Engineering, Science and Technology, Shibpur, Howrah – 711 103, West Bengal, India [email protected] ABSTRACT Classification and Prediction is an important research area of data mining. Construction of classifier model for any decision system is an important job for many data mining applications. The objective of developing such a classifier is to classify unlabeled dataset into classes. Here we have applied a discrete Particle Swarm Optimization (PSO) algorithm for selecting optimal classification rule sets from huge number of rules possibly exist in a dataset. In the proposed DPSO algorithm, decision matrix approach was used for generation of initial possible classification rules from a dataset. Then the proposed algorithm discovers important or significant rules from all possible classification rules without sacrificing predictive accuracy. The proposed algorithm deals with discrete valued data, and its initial population of candidate solutions contains particles of different sizes. The experiment has been done on the task of optimal rule selection in the data sets collected from UCI repository. Experimental results show that the proposed algorithm can automatically evolve on average the small number of conditions per rule and a few rules per rule set, and achieved better classification performance of predictive accuracy for few classes. KEYWORDS Particle swarm optimization, Data Mining, Classifiers. 1. INTRODUCTION Many particle swarm algorithms have been developed that deals only with continuous variables [1, 5, 10]. This is a significant limitation because many optimization problems are there which featuring discrete variables in the problem domain. Typical problems are there in the space which deals with the ordering, grouping or arranging of discrete variables such as scheduling or routing problems. Therefore, the developing of particle swarm algorithms that deals with discrete variables is important for such kind of problem. We propose a variant of Particle Swarm Optimization (PSO) algorithm applied to dicerete data for rule selection in Data Mining. We will refer to this algorithm as the discrete Particle Swarm Optimization (DPSO) algorithm. The DPSO deals with discrete valued data, and its population of candidate solutions contains particles of different sizes. Although the algorithm has been specifically designed for optimized rule selection task, it is by no means limited to this kind of application. The DPSO algorithm may be applied to
  • 2. 302 Computer Science & Information Technology (CS & IT) other optimization problems with little modification. Construction of classifier model for any decision system is an important job for many data mining applications. The objective of developing such a classifier is to classify unlabeled dataset (object belongs to test set) into classes. Here we propose a discrete Particle Swarm Optimization (PSO) algorithm for selection of optimal or near optimal classification rules from huge number of rules possibly exists in a dataset. Primarily, all exhaustive rules are generated from the dataset using decision matrix approach [11, 12] which is based on rough set theory. Then the DPSO method identifies important or significant rules from all possible classification rules without sacrificing predictive accuracy. For the larger dataset, if all decision rules are considered for data analysis then time complexity will be very high. For this reason a minimum set of rules are generated. The paper is organized as follows. Section 2 briefly introduces PSO algorithm. Section 3 introduces the DPSO algorithm proposed in this paper for the task of optimal rule selection. Section 4 and 5 reports experimental methodology, experiments and results respectively and finally Section 6 presents conclusions of the work. 2. A BRIEF INTRODUCTION TO PARTICLE SWARM OPTIMIZATION PSO [5, 6, 7] is an evolutionary optimization algorithm proposed by Kennedy and Eberhart in 1995. In PSO, a population, called a swarm, of candidate solutions are encoded as particles in the search space. PSO starts with the random initialization of a population of particles. The whole swarm move in the search space to search for the best solution by updating the position of each particle based on the experience of its own and its neighbouring particles. In PSO a potential solution to a problem is represented by a particle X(i) = (x(i,1), x(i,2), …,x(i,n)) in an n-dimensional search space. The coordinates x(i,d) of these particles have a rate of change(velocity) v(i,d) , d = 1, 2, ..., n. Every particle keeps a record of the best position that it has ever visited. Such a record is called the particle's previous best position and denoted by Bi. The global best position attained by any particle so far is also recorded and stored in a particle denoted by G. Iteration comprises evaluation of each particle with the adjustment of v (i, d) in the direction of particle X(i)'s previous best position and the previous best position of any particle in the neighbourhood. Generally speaking, the set of rules that govern PSO are: evaluate, compare and evolve. The evaluation phase measures how well each particle (candidate solution) solves the problem. The comparison phase identifies the best particles. The evolve phase produces new particles based on some of the best particles previously found. These three phases are repeated until a given stopping criterion is matched. The objective is to find the best particle which solves the target problem. Important concepts in PSO are velocity and neighbourhood topology. Each particle, X(i), is associated with a velocity vector. This velocity vector is updated at every generation. The updated velocity vector is then used to generate a new particle X(i). The neighbourhood topology defines how other particles in the swarm, such as B (i) and G, interact with X (i) to modify its respective velocity vector and, consequently, its position as well. 3. THE PROPOSED DISCRETE PSO ALGORITHM The algorithm presented here is based on DPSO algorithm [6]. The proposed algorithm deals with generation of optimized classification rules from discrete valued dataset, which is typically a rule mining problem. Primarily, all exhaustive rules are generated from the dataset using decision matrix approach [11, 12] which is based on rough set theory. Every rule has two parts, conditional part and decision part, conditional part comprises of some conditional attributes with their values and decision part has decision attribute with the corresponding decision value or class. In a rule, each conditional attribute with the corresponding value is termed as rule component. So a rule is formed by some rule components. In PSO, population of candidate solutions contains particles of different sizes. Here population of candidate solutions is generated
  • 3. Computer Science & Information Technology (CS & IT) 303 from initial rule set. Each particle represents the antecedent part of a rule by considering the conclusion part is fixed at the beginning of the execution and represents a target attribute value. In this case to discover the optimal rule set, predicting different target/decision values, the algorithm has to run several times, one for each decision value. There are N (No of initial rules) particles in a swarm. The length of each particle may vary from 1 to n, where n is the number of unique rule component present in the initial rule set, which is made by considering only the antecedent part of each rule. Each particle Ri keeps a record of the best position it has ever attained. This information is kept in a distinguished particle labeled as Bi. The swarm also keeps the information of the global best position ever attained by any particle in the swarm. This information is also kept in a separated particle labeled G. G is equal to the best Bi present in the swarm. 3.1 ENCODING OF THE PARTICLES FOR THE PROPOSED DPSO ALGORITHM Each rule is formed by some rule components and each rule component is identified by a unique positive integer number or index. These indices, vary from 1 to n, where n is the number of unique rule components present in the initial rule set. Each particle is subsets of indices without repetition. For example, corresponding to the rules R1, R2 and R3 given below, the rule components are (A=1), (A=2), (B=3), (B=4) and (C=5) which are indexed as 1, 2, 3, 4 and 5 respectively. R1= {A=1, B=3, C=5} R2= {A=2, B=4} R3= {A=2, C=5} Where, A, B, C are the conditional attributes, N (Number of initial rules) = 3. Here initial swarm representing candidate solution could looks as follows: R1= {1, 3, 5} R2= {2, 4} R3= {2, 5}. 3.2 THE INITIAL POPULATION FOR THE PROPOSED DPSO ALGORITHM The initial population of particles is generated as follows. Population of candidate solutions is generated from initial rule set. Each particle represents the antecedent part of a rule by considering the conclusion part as fixed at the beginning of the execution and represents a target attribute value. Rule encoding process is described in section 3.1. By considering the rule encoding process the particles are formed for each rule generated using decision matrix based classification method. 3.3 VELOCITIES The DPSO algorithm does not use a vector of velocities. It works with proportional likelihoods. Basically, the idea of proportional likelihood used in the DPSO algorithm is almost similar with the idea of velocity used in the standard PSO algorithm. Each particle is associated with a 2 × n array of proportional likelihoods, where 2 and n represents number of rows and number of columns respectively. In this standard proportional likelihood array, each element in the first row of Vi represents the proportional likelihood based on which a rule component be selected. The second row of Vi has the indices of the rule components which is associated with the respective proportional likelihoods of the first row of the vector Vi. There is a one-to-one correspondence between the columns of this array. At the beginning, all elements in the first row of Vi are set to 1, e.g., Vi = {{1, 1, 1, 1, 1 }, {1,2,3,4,5}}.
  • 4. 304 Computer Science & Information Technology (CS & IT) After the initial population of particles is generated, this array is always updated before a new configuration for the particle associated to it is made. The updating process is based on Ri, Bi (particle's previous best position) and G (global best position) and works as follows. In addition to Ri, Bi and G, three constant updating factors, namely, a, b and c are used to update the proportional likelihoods v(i,d). These factors determine the strength of the contribution of Ri, Bi and G to the adjustment of every coordinate v(i,d) € Vi. Parameter values of a, b and c is chosen experimentally. 3.4 GENERATING NEW PARTICLES FOR THE PROPOSED DPSO ALGORITHM The proportional likelihood array Vi is then used to sample a new configuration of particle Ri that is, the particle associated to it. First, for a particle with Vi, all indices present in Ri have their corresponding proportional likelihood increased by ‘a’. Similarly, all indices present in Bi and G have their corresponding proportional likelihood increased by ‘b’ and ‘c’ respectively. Now each element of the first row of the array Vi is then multiplied by a uniform random number between 0 and 1. A new random number is generated for every single multiplication performed. The new particle is then defined by ranking the columns in Vi by the values in its first row. That is, the elements in the first row of the array are ranked in a decreasing order of value and the indices of the rule components (in the second row of Vi) follow their respective proportional likelihoods. Thus for example, after all the steps if particle i has length 3, and the particle associated to the array V i = 0.74 0.57 0.50 0.420.20 5 4 3 1 2 Then first 3 indices from the second row of Vi would be selected to compose the new particle. That is, Ri = {*, *, *} the indices (rule components) 5, 4 and 3 would be selected to compose the new particle, i.e., Ri = {5, 4, 3}. Note that indices that have a higher proportional likelihood are, on average, more likely to be selected. If indices are the rule components with same attribute are selected, then higher confidence rule component is selected. In that case new particle size is also changes in the generation. The updating of Ri, Bi and G is identical to what is described earlier. In this way new particles are formed generation by generation. 4. EXPERIMENTAL METHODOLOGY The purpose of this experiment was to evaluate the classification accuracy and comprehensibility by the number of rules in the data set, and the average number of rule conditions per rule. The fitness function f (Ri) of any particle i is computed as follows. Optimized rule selection process can be performed by DPSO as a multi objective problem to maximize the confidence (association rule mining concept) of a rule to achieve higher classification accuracy as well as minimizing the length of a rule. The goal is to see whether DPSO can select optimized set of rules to achieve a higher classification accuracy rate from initial set of rules. In this regard following fitness function is considered. Fi=α*rulei_confidence+ (1-α)*1/ (rulei_length) rulei _confidence: A rule like, Ri → ( = a) ( = b) → (D =di) has the condition part Ci where conditional attributes are associated with value, and the decision part D has the decision values di. So here the rule maps Ci→ di.
  • 5. Computer Science & Information Technology (CS & IT) 305 Then the confidence of the rule is conf (Ri) = number of rows in dataset that match Ci and have class label di / number of rows in dataset that match only Ci. Here relative importance of the rule confidence and the length of the rule are considered. Rule confidence is set larger than length of a rule because the classification performance is assumed more important than the number of conditional attributes present in a rule i.e. rule length. The objective is to find the fittest rules with which it is possible to classify the data set as belonging to one of the classes with an acceptable accuracy. Here α=0.8 is considered. 5. EXPERIMENTS AND RESULT The computational experiments involved a 10-fold cross-validation method [13].To illustrates the method, wine dataset from UCI repository [14] of Machine Learning databases [10] is considered. First, the 178 records with 3 distinct classes (0, 1, 2) in the wine data set were divided into 10 almost equally sized folds. The folds were randomly generated but under the following regulation. The proportion of different classes in every single fold was similar to the one found in the original data set containing all the 178 records. Each of the 10 folds is used once as test set and the remaining of the data set is used as training set. The purpose of the experiment was to evaluate the generalization ability, measured as the classification accuracy on test set. Each training set was used for generation of initial classification rules. For each initial rule set, DPSO algorithm generates optimized set of classification rules according to their fitness which is used to classify the test set (each 10 test folds) accordingly. DPSO selects only the best particle in each run as the rule. As the DPSO algorithm is stochastic algorithm, 20 independent runs for the algorithm were performed for every single fold and for every decision classes. The average number of rules by the rule selection algorithms has always been rounded to the nearest integer. The population size (initial rule set) used for the algorithm is on average 300 and the search stops in one run after 100 iterations. Other choices of parameter values were a = 0.10, b = 0.12 and c = 0.14. The results of experiment were as follows. For Wine on average 95 rules are selected as optimized rules from 300 rules. And using these 95 rules on test set 99.32% of average classification accuracy is achieved.Using fewer rules, DPSO algorithm obtained on average, a better predictive accuracy than the classification performed using all the initial classification rules for few classes. For the initial rule set average accuracy values on test set for 3 classes (0, 1, 2)were 99.05, 99.48, 99.77, while using optimized rule set corresponding accuracy values on test set were 98.38, 99.59, 100. The results also indicate that not only the predictive accuracy is good, but also the number of rule conditions or rule length is relatively short like 2- 4 rule components in a rule where the same was 3-5 in the initial rule set and there are a small number of rules in the rule set. 6. CONCLUSION We proposed an efficient method for rule discovery using DPSO algorithm. The discovered rules are with of high accuracy and comprehensibility. Using fewer rules, DPSO algorithm obtained on average, a better predictive accuracy than the classification performed using all the initial classification rules for few classes. The results on the Wine data set show that our approach has good performance for rule discovery on discrete data. Few less coverage rules were also selected. Rule Coverage is an important parameter for selection of good quality rule. In the future by applying some rule pruning technique better quality rule set can be generated.
  • 6. 306 Computer Science & Information Technology (CS & IT) REFERENCES [1] T. Blackwell and J. Branke. Multi-swarm optimization in dynamic environments. In Lecture Notes in Computer Science, volume 3005, pages 489{500. Springer-Verlag, 2004. [2] E. S. Correa, M. T. Steiner, A. A. Freitas, and C. Carnieri. Using a genetic algorithm for solving a capacity p-median problem. Numerical Algorithms, 35:373{388, 2004. [3] D. Freedman, R. Pisani, and R. Purves. Statistics. W. Norton & Company, 3rd edition, September 1997. [4] A. A. Freitas. Data Mining and Knowledge Discovery with Evolutionary Algorithms. Springer- Verlag, October 2002. [5] G. Kendall and Y. Su. A particle swarm optimisation approach in the construction of optimal risky portfolios. In Proceedings of the 23rd IASTED International Multi-Conference on Applied Informatics, pages 140{145, 2005. Artificial intelligence and applications [6] J. Kennedy and R. C. Eberhart. A discrete binary version of the particle swarm algorithm. In Proceedings of the 1997 Conference on Systems, Man, and Cybernetics, pages 4104{4109, Piscataway, NJ, USA, 1997. IEEE. [7] J. Kennedy and R. C. Eberhart. Swarm Intelligence. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2001 [8] T. M. Mitchell. Machine Learning. McGraw-Hill, August 1997. [9] R. Poli, C. D. Chio, and W. B. Langdon. Exploring extended particle swarms: a genetic programming approach. In GECCO'05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, pages 169{176, New York, NY, USA, 2005. ACM Press. [10] Y. Shi and R. C. Eberhart. Parameter selection in particle swarm optimization. In EP'98: Proceedings of the 7th International Conference on Evolutionary Programming, pages 591{600, London, UK, 1998. Springer-Verlag [11] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2nd edition, 2005 [12] N.Zhong and A. Skowron, “A Rough Set-Based Knowledge Discovery Process”, Int. Journal of Applied Mathematics and Computer Science., 11(3), 603-619, 2001. BIME Journal, Volume (05),), 2005 [13] Ethem Alpaydin Introduction to Machine Learning.PHI, 2010. [14] Murphy, P. and Aha, W.: UCI repository of machine learning databases (1996), http://www.ics.uci.edu/mlearn/MLRepository.html