International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012
DOI: 10.5121/ijcseit.2012.2102
FUZZY WEIGHTED ASSOCIATIVE CLASSIFIER: A PREDICTIVE TECHNIQUE FOR HEALTH CARE DATA MINING
Sunita Soni (1) and O.P. Vyas (2)
(1) Associate Professor, Bhilai Institute of Technology, Durg-491 001, Chhattisgarh, India
Sunitasoni74@gmail.com
(2) Professor, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
dropvyas@gmail.com
ABSTRACT
In this paper we extend the problem of classification using Fuzzy Association Rule Mining and propose the
concept of the Fuzzy Weighted Associative Classifier (FWAC). Classification based on association rules is
considered to be effective and advantageous in many cases. Associative classifiers are especially suited to
applications where the model may assist domain experts in their decisions. Weighted Associative
Classifiers, which take advantage of weighted Association Rule Mining, have already been proposed. However,
there is a so-called "sharp boundary" problem in association rule mining with quantitative attribute
domains. This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates
classification rules using a Fuzzy Weighted Support and Confidence framework. The naïve approach can be
used to generate strong rules instead of weak, irrelevant rules, with fuzzy logic used to partition
the domains. The problem of invalidation of the Downward Closure Property is solved, and the
Fuzzy Weighted Support and Fuzzy Weighted Confidence framework is generalized to Boolean and quantitative
items in a weighted setting. We propose a theoretical model for a new associative classifier
that takes advantage of Fuzzy Weighted Association Rule Mining.
Keywords
Associative Classifiers, Fuzzy Weighted Association Rule, FWAC, Fuzzy weighted support, Fuzzy weighted
Confidence.
1. INTRODUCTION
Associative Classification is an integrated framework of Association Rule Mining (ARM) and
Classification. A special subset of association rules whose right-hand-side is restricted to the
classification class attribute is used for classification.
Traditional ARM was designed on the assumption that all items have the same importance and that the
database records only their presence or absence. In several problem domains it does not
make sense to assign equal importance to all items, particularly in a predictive modeling system
where attributes have different prediction capability. For example, in the medical domain, when predicting
the probability of heart disease, the attribute prior-stroke has more impact than the attribute
BMI (Body Mass Index). The concept of weighted association rule mining deals with
the case where items are assigned a weight to reflect their importance. To deal with this situation
the authors have proposed a new Weighted Associative Classifier (WAC) that generates
classification rules using a weighted support and confidence framework [1].
Another problem in medical databases, as well as in databases from other applications, is that
most of the attributes are associated with quantitative domains such as BMI, Age and Blood-Pressure.
Such attributes are very common in real applications, and association rule mining usually
needs to partition their domains in order to apply an Apriori-type method. Thus, a discovered rule
X → Y reflects an association between interval values of data items. Examples of such rules are
{(Age, ">62"), (BMI, "45"), (Blood_pressure, "95-135")} → (Heart_Disease) and
(Income[20,000-30,000]) → (Age[20-30]). Apart from domain discretization techniques for association
rule mining, fuzzy logic is considered a suitable solution to the "sharp boundary"
problem. This gives rise to the notion of Fuzzy Association Rules (FAR). These rules are richer
and closer to natural language. For example, (old Age, high Obesity and high
Blood_Pressure) → (chance of Heart Disease) and (medium Income) → (young Age) are fuzzy
association rules, where the X's and Y's are fuzzy sets with linguistic terms (i.e., old, high,
medium, and young). Building an associative classifier on fuzzy association rules
provides two advantages: one is that it meets the need to mine large datasets with quantitative domains; the
other is that it generates classification rules with more general semantics and linguistic expressiveness
[3].
This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates
classification rules using a Fuzzy Weighted Support and Confidence framework. The naïve
approach can be used to generate strong rules instead of weak, irrelevant rules. We discuss the
importance of Fuzzy Weighted Association Rules in the classification problem.
Section 2 discusses the concepts of association rule mining and fuzzy weighted
associative classifiers. Section 3 introduces some new formulae and gives new definitions
for them. Section 4 discusses the downward closure property for the fuzzy weighted version of association
rule mining. Section 5 discusses some of the application areas that can
benefit from the proposed concept. Section 6 gives the conclusion and future work.
2. RELATED WORK
2.1. Association Rule Mining
Let I = {i1, i2, …, in} be a set of n distinct literals called items. D is a set of variable-length
transactions over I. Each transaction contains a set of items i1, i2, …, ik ∈ I and has an
associated unique identifier called TID. An association rule is an implication of the form X ⇒ Y
(also written X → Y), where X, Y ⊆ I and X ∩ Y = ∅. X is called the antecedent of the rule and Y
is called the consequent of the rule. The rule X ⇒ Y has support s in the transaction set D if s%
of the transactions in D contain X ∪ Y. In other words, the support of the rule is the probability that
X and Y hold together among all the presented cases. It is said that the rule X ⇒ Y holds
in the transaction set D with confidence c if c% of transactions in D that contain X also contain Y.
In other words, the confidence of the rule is the conditional probability that the consequent Y is
true under the condition of the antecedent X. The problem of discovering all association rules
from a set of transactions D consists of generating the rules that have a support and confidence
greater than given thresholds. These rules are called strong rules, and the framework is known as
the support confidence framework for association rule mining.
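The support and confidence definitions above can be illustrated with a minimal Python sketch. The toy transactions and the function names `support` and `confidence` are assumptions made for illustration only; they do not come from the paper. Support is returned as a fraction rather than a percentage.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Conditional probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy database of four transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread transactions
print(f"support={s:.2f} confidence={c:.2f}")
```

A rule would be reported as strong only if both values exceed the user-given thresholds.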
2.1.1. Weighted Association Rule Mining
A weighted association rule (WAR) is an implication X → Y where X and Y are sets of weighted
items. A pair (ij, wj) is called a weighted item, where ij ∈ I and wj ∈ W is the weight associated with
the item ij. A transaction is a set of weighted items where 0 < wj ≤ 1. Weight is used to show the
importance of the item. For example in the context of stock market prices, some stocks have
much higher value than others and might appreciate a lot more than other stocks in a comparable
time period. In the supermarket context, some items like jewellery, designer clothes, etc. are of
much higher value than trivia like bubblegum or candy. Rules involving jewellery may have less
support than those involving candy but are much more significant in terms of the revenue (and
consequently profit) earned by the store.
In weighted association rule mining problem each item is allowed to have a weight. The goal is to
steer the mining process to those significant relationships involving items with significant weights
rather than being flooded in the combinatorial explosion of insignificant relationships [8].
2.1.2. Fuzzy Association Rule Mining (FARM).
A data mining system for discovering fuzzy association rules is proposed in [12]. The authors
give a technique to find fuzzy association rules without using user-supplied support
values, which are often hard to determine. The other unique feature of the work is that the
conclusion of a fuzzy association rule can contain linguistic terms. The experimental results show
that the algorithm is capable of discovering both positive and negative fuzzy association rules
effectively from a real-life database.
In [5] the authors have proposed a model to find the fuzzy association rules in fuzzy transaction
database. The model is found to be useful technique to find the patterns in data in the presence of
imprecision, either because data are fuzzy in nature or because we must improve their semantics.
The authors have also discussed some of the applications of the scheme, paying special attention to
the discovery of fuzzy association rules in relational databases.
2.1.3. Fuzzy Weighted Association Rule Mining
Fuzzy Weighted Association Rule Mining with a Weighted Support and Confidence Framework is
proposed in [2]. The authors address the issue of invalidation of the downward closure
property (DCP) in weighted association rule mining, where each item is assigned a weight
according to its significance. Formulae for fuzzy weighted support and fuzzy weighted
confidence for Boolean and quantitative items with weighted settings are proposed. The
methodology follows an Apriori-like approach and, unlike most weighted ARM algorithms,
avoids pre- and post-processing, thus eliminating the extra steps during rule generation.
A new algorithm that is applicable to both the normalized and the unnormalized case is proposed in [11]. In
that paper the authors introduce the problem of mining weighted quantitative association
rules based on a fuzzy approach. Using the fuzzy set concept, the discovered rules are more
understandable to a human. Two different definitions of weighted support, with and without
normalization, are proposed.
2.1.4. Incorporating Weight in ARM
The concept of assigning weights to attributes has never been utilized in the medical domain. In
the supermarket context it has been used to assign more weight to the items that give
more profit per unit sale. As another example, in the context of stock market prices, some stocks have
much higher value than others and might appreciate a lot more than other stocks in a comparable
time period.
In the traditional association rule mining (ARM) model only an item's presence or absence in the
transaction is recorded, and its significance from a profit point of view is not considered. To deal
with the weighted setting, an algorithm called WARM (Weighted Association Rule
Mining) is proposed in [8], in which the item's weighted support is measured instead of
only its support. The weighted support is a measure of the significance of an item from a cost point of view.
The goal of using weighted support is to make use of the weight in the mining process and
prioritize the selection of target itemsets according to their significance in the dataset, rather than
their frequency alone. An itemset is called large if its support is above a predefined minimum
support threshold. In the WARM context, an itemset is said to be significant if its weighted
support is above a predefined minimum weighted support threshold. The threshold values
specified by the user reflect the significance of items from a cost point of view.
2.1.4. Utilizing Weight in Medical Domain
The motivation to utilize weights in the medical domain comes from the supermarket scenario, where a weight is
assigned to each item according to the profit it generates for the store, rather than simply
counting and calculating the percentage of transactions that contain the items. In the medical domain, too,
some symptoms have much more impact in predicting a particular disease. For example, in predicting
the probability of heart disease, the attribute prior-stroke has more impact than the attribute
BMI (Body Mass Index). The experience of an expert doctor can be utilized to assign weights to the
different symptoms in the medical domain. This is also a way to utilize the experience of a domain
expert in the prediction model. In the medical domain the WARM algorithm can be utilized to generate
weighted CARs. So in this paper we have used WARM to incorporate attribute importance
instead of considering the weight separately in the algorithm.
2.1.5. Fuzziness of Quantitative Attribute
In the ARM model, when the data are quantitative, such as income, age and price, which are very
common in many real applications, association rule mining needs to discretize the domain to
convert it into a nominal domain and then apply an Apriori-type method. Thus, an association
rule X → Y reflects an association between nominal values of data items. Examples of such
rules are "(Age, old), (BP, high) → (Heart_Disease, yes)", "(Income, Low) → (Age, medium)",
and so on. The mining results are affected by how the intervals are partitioned, particularly
for data values around interval boundaries. That is the so-called "sharp boundary" problem.
Subsequently, the result of associative classification may also be affected in terms of accuracy
and understandability.
In the medical domain a number of quantitative attributes suffer from the crisp boundary
problem. For example, if in a particular record the BMI (Body Mass Index)
is 41, then according to the discretization rule BMI[26-30] → Obesity = "mild",
BMI[31-40] → Obesity = "moderate", BMI[40-*] → Obesity = "severe",
the patient is considered to be severely obese. This may not give a correct result because of the sharp
boundary problem. Instead, by applying fuzzy logic, the patient partially belongs to each fuzzy
set. Hence the patient's membership values in the fuzzy sets should be (µ(Obesity, "mild") = 0.1, µ(Obesity,
"moderate") = 0.3, µ(Obesity, "severe") = 0.6). To deal with the crisp boundary problem of quantitative attributes
in the ARM model, [2] proposed the Fuzzy WARM (FWARM) algorithm and redefined the
weighted support and weighted confidence for the fuzzy environment. In the Fuzzy Weighted
Association Rule Mining (FWARM) model, the Fuzzy Weighted Support (FWS) and Fuzzy
Weighted Confidence framework is proposed to mine Fuzzy Weighted Association Rules. The
FWARM algorithm can be used to generate CARs in a fuzzy weighted environment.
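The sharp boundary effect described above can be made concrete with a small Python sketch. The function `crisp_obesity` is a hypothetical helper, and assigning a BMI of exactly 40 to "moderate" is an assumption, since the stated intervals overlap at 40. The fuzzy membership values are the ones quoted in the text for a BMI of 41.

```python
def crisp_obesity(bmi):
    """Crisp discretization: each BMI value falls into exactly one interval."""
    if 26 <= bmi <= 30:
        return "mild"
    if 31 <= bmi <= 40:   # assumption: 40 is treated as "moderate"
        return "moderate"
    if bmi > 40:
        return "severe"
    return None           # values outside the intervals given in the rule


# One BMI unit apart, yet the crisp labels differ completely.
label_40 = crisp_obesity(40)
label_41 = crisp_obesity(41)

# Fuzzy alternative for BMI = 41: partial membership in every set
# (values taken from the text, not computed from a membership function).
fuzzy_41 = {"mild": 0.1, "moderate": 0.3, "severe": 0.6}

print(label_40, label_41, fuzzy_41)
```

The crisp rule discards the information that a BMI of 41 is still close to the "moderate" range, which the fuzzy memberships retain.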
3. PROBLEM DEFINITION
In this paper we propose two important modifications to the Associative Classifier (attribute weights and
fuzzification of quantitative attributes) to improve prediction accuracy
in the medical domain. Hence the problem definition consists of the terms and basic concepts needed to
define Fuzzy Attribute Weight (FAW), Fuzzy Attribute Set Transaction Weight (FASTW), Fuzzy
Attribute Set Weight (FASW), Fuzzy Weighted Support (FWS) and Fuzzy Weighted Confidence
(FWC) for Fuzzy Weighted Associative Classifiers (FWAC). The technique for Fuzzy Weighted
Association Rule Mining is known as FWARM.
3.1 Associative Classifiers.
Given a set of cases with class labels as a training set, classification is to build a model (called
classifier) to predict future data objects for which the class label is unknown.
Associative Classification is an integrated framework of Association Rule Mining (ARM) and
Classification. A special subset of association rules whose right-hand-side is restricted to the
classification class attribute is used for classification. This subset of rules is referred to as the Class
Association Rules (CARs). Recent studies propose the extraction of a set of high-quality
association rules from the training data set that satisfy certain user-specified frequency and
confidence thresholds. Effective and efficient classifiers have been built by careful selection of
rules, e.g., CBA, CAEP and ADT. Such a method takes the most effective rule(s) from among all
the rules mined for classification. Since association rules explore highly confident associations
among multiple variables, they may overcome some constraints introduced by a decision-tree
induction method, which examines one variable at a time. Extensive performance studies [11]
show that association-based classification may have better accuracy in general [9].
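Discretization rule for BMI (used in the BMI example of the subsection "Fuzziness of Quantitative Attribute"):
BMI[26-30] → Obesity = "mild"
BMI[31-40] → Obesity = "moderate"
BMI[40-*] → Obesity = "severe"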
3.2 Fuzzy Weighted Associative Classifiers.
A fuzzy dataset consists of a fuzzy relational database D = {r1, r2, r3, …, ri, …, rn} with a set of
attributes I = {I1, I2, …, Im}. Each Ik can be associated with a set of linguistic labels L = {l1, l2,
…, lL}, for example L = {young, Middle, Old}. Let each Ik be associated with the fuzzy set Fk = {(Ik, l1),
(Ik, l2), (Ik, l3), …, (Ik, lL)}, so that a new fuzzy database D’’ is defined over {(I1, l1), …, (I1, lL), …, (Ik,
l1), …, (Ik, lL), …, (Im, l1), …, (Im, lL)}. Each attribute Ii in a given transaction tk is associated (to some
degree) with several fuzzy sets. The degree of association is given by a membership degree in
the range [0, 1]. tk[µ(Ii, lj)] denotes the degree of membership of fuzzy attribute Ii in fuzzy
set lj in transaction tk.
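Table-1 Database (D) with continuous domain

| R_ID | Age | Blood Pressure (BP) | BMI (Obesity) | Heart_Disease (H_D) |
|------|-----|---------------------|---------------|---------------------|
| 1    | 42  | 90-130              | 40            | Yes                 |
| 2    | 62  | 80-120              | 28            | No                  |
| 3    | 55  | 82-122              | 40            | Yes                 |
| 4    | 62  | 92-135              | 50            | Yes                 |
| 5    | 45  | 95-135              | 30            | No                  |

Table-2 Transformed Binary Database D’ from D

| R_ID | Age: young | Age: Middle | Age: old | BP: High | BP: Low | BP: Normal | BMI: Mild | BMI: Moderate | BMI: Severe | H_D |
|------|------------|-------------|----------|----------|---------|------------|-----------|---------------|-------------|-----|
| 1    | 0          | 1           | 0        | 1        | 0       | 0          | 0         | 1             | 0           | Y   |
| 2    | 0          | 0           | 1        | 0        | 0       | 1          | 0         | 1             | 0           | N   |
| 3    | 0          | 1           | 0        | 1        | 0       | 0          | 0         | 1             | 0           | Y   |
| 4    | 0          | 0           | 1        | 1        | 0       | 0          | 0         | 0             | 1           | Y   |
| 5    | 0          | 1           | 0        | 1        | 0       | 0          | 1         | 0             | 0           | N   |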
Table-3 Database D’’ with Fuzzy Items.

| R_ID | Age: young | Age: Middle | Age: old | BP: High | BP: Low | BP: Normal | BMI: Mild | BMI: Moderate | BMI: Severe | H_D |
|------|------------|-------------|----------|----------|---------|------------|-----------|---------------|-------------|-----|
| 1    | 0.2        | 0.7         | 0.1      | 0.4      | 0       | 0.6        | 0.6       | 0.3           | 0.1         | Y   |
| 2    | 0.0        | 0.3         | 0.7      | 0.1      | 0.1     | 0.8        | 0.8       | 0.1           | 0.1         | N   |
| 3    | 0.1        | 0.3         | 0.6      | 0.2      | 0.0     | 0.8        | 0.6       | 0.3           | 0.1         | Y   |
| 4    | 0.0        | 0.3         | 0.7      | 0.5      | 0.0     | 0.5        | 0.1       | 0.2           | 0.7         | Y   |
| 5    | 0.1        | 0.8         | 0.1      | 0.6      | 0.0     | 0.4        | 0.7       | 0.2           | 0.1         | N   |

Table 1 shows the example database D with continuous domains for the quantitative attributes.
Table 2 shows the transformed binary database (D’), in which the quantitative attributes have
been partitioned by converting them into categorical attributes. Consider the attribute Age in Table 1
again: three new attributes, (Age, young), (Age, middle) and (Age, old), may be used in place of Age
to constitute a new database (D’’) with partial belonging of the original attribute values to
each of the new attributes. Table 3 illustrates an example of the new database obtained from the
original database, given the fuzzy sets {Young, Middle, Old} characterized by the membership
functions shown in Figure 1 for the attribute Age. Similarly, the other quantitative attributes, i.e. Blood
Pressure and BMI (Obesity), are also partitioned, and membership values are assigned using the
corresponding membership functions.
Figure 1: Fuzzy Sets Y-Age, M-Age and O-Age.
Here fuzzy logic is incorporated to split the domain of a quantitative attribute into intervals, and to
define a set of meaningful linguistic labels, represented by fuzzy sets, that are used as a new
domain. In this case it is possible for one item to appear with different labels of the same attribute.
Hence itemsets need to be restricted to contain at most one item per attribute, because
otherwise rules of the form {(Age, Middle), (Age, old), … ⇒ class_label} would have no meaning.
The triangular and trapezoidal functions are the two important membership functions that can be used to find
the degree of association for the different attributes.
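As a rough illustration of the triangular and trapezoidal membership functions mentioned above, the Python sketch below fuzzifies the attribute Age. The breakpoints (30, 45, 60) and the helper `age_memberships` are assumptions chosen for illustration; they are not read from Figure 1 and will not reproduce the values in Table 3.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def age_memberships(age):
    """Hypothetical fuzzy partition of Age into young / middle / old."""
    return {
        "young": trapezoidal(age, -1, 0, 30, 45),    # full membership up to 30
        "middle": triangular(age, 30, 45, 60),       # peak at 45
        "old": trapezoidal(age, 45, 60, 120, 121),   # full membership from 60
    }


m = age_memberships(50)
print({label: round(mu, 3) for label, mu in m.items()})
```

With these assumed breakpoints, an age of 50 belongs partly to "middle" and partly to "old", instead of being forced into a single interval.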
Definition 1. Fuzzy Attribute Weight: We assign a weight W(Ii, lj) to each fuzzy item (Ii, lj),
where 1 ≤ i ≤ m, 1 ≤ j ≤ L and 0 ≤ W(Ii, lj) ≤ 1. Table 4 shows the weights assigned at random to the
different fuzzy attributes for heart disease.

Table-4 Weight of symptoms for heart disease (attribute weight).

| S. No. | Symptoms        | Weights |
|--------|-----------------|---------|
| 1      | (Age, young)    | 0.1     |
| 2      | (Age, middle)   | 0.2     |
| 3      | (Age, old)      | 0.6     |
| 4      | (BP, Normal)    | 0.3     |
| 5      | (BP, Low)       | 0.2     |
| 6      | (BP, High)      | 0.7     |
| 7      | (BMI, Mild)     | 0.3     |
| 8      | (BMI, Moderate) | 0.5     |
| 9      | (BMI, Severe)   | 0.7     |

Definition 2. Fuzzy Attribute Set Transaction Weight: The weight of an attribute set X in a particular
transaction tk is denoted by tk[FASTW(X)]. It is calculated as the product, over all fuzzy attributes
enclosed in the set, of the membership degree of the attribute in the given fuzzy set in transaction tk
and the weight of the fuzzy attribute. It is given by

$$ t_k[\mathrm{FASTW}(X)] = \prod_{i=1}^{|X|} \Big[ t_k[\mu(I_i, l_j)] \times W(I_i, l_j) \Big], \qquad \forall (I_i, l_j) \in X $$

Example 2: Consider the 2-attribute set {(Age, old), (BP, high)} in transaction 1.
FASTW((Age, old), (BP, high)) = (0.1×0.6)(0.4×0.7) = 0.06 × 0.28 = 0.0168
Definition 3. Fuzzy Attribute Set Weight: The fuzzy weight of an attribute set X is calculated as the
sum of FASTW over all transactions and is denoted by FASW(X). It is given by

$$ \mathrm{FASW}(X) = \sum_{k=1}^{|D''|} t_k[\mathrm{FASTW}(X)] $$
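Definitions 2 and 3 can be evaluated mechanically. The following Python sketch applies them literally (a product within each transaction, then a sum over transactions) to the attribute set {(Age, old), (BP, high)}, using the membership values of Table 3 and the weights of Table 4. The names `WEIGHT`, `D2`, `fastw` and `fasw` are illustrative assumptions, not notation from the paper.

```python
from math import prod

# Weights of the two fuzzy items, taken from Table 4.
WEIGHT = {("Age", "old"): 0.6, ("BP", "high"): 0.7}

# Membership degrees of the two fuzzy items in records 1..5, taken from Table 3.
D2 = [
    {("Age", "old"): 0.1, ("BP", "high"): 0.4},
    {("Age", "old"): 0.7, ("BP", "high"): 0.1},
    {("Age", "old"): 0.6, ("BP", "high"): 0.2},
    {("Age", "old"): 0.7, ("BP", "high"): 0.5},
    {("Age", "old"): 0.1, ("BP", "high"): 0.6},
]


def fastw(transaction, itemset):
    """Definition 2: product of (membership x weight) over the items of the set."""
    return prod(transaction[item] * WEIGHT[item] for item in itemset)


def fasw(database, itemset):
    """Definition 3: sum of FASTW over all transactions."""
    return sum(fastw(t, itemset) for t in database)


X = [("Age", "old"), ("BP", "high")]
print(round(fastw(D2[0], X), 4))  # transaction 1 only
print(round(fasw(D2, X), 4))      # all five transactions
```

Because every factor lies in [0, 1], FASTW can only shrink as more items are added to the set, which is what the weighted downward closure property relies on.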
Written out in terms of membership degrees and weights, the Fuzzy Attribute Set Weight is

$$ \mathrm{FASW}(X) = \sum_{k=1}^{|D''|} \prod_{i=1}^{|X|} \Big[ t_k[\mu(I_i, l_j)] \times W(I_i, l_j) \Big], \qquad \forall (I_i, l_j) \in X $$

Example 3: Consider the 2-attribute set {(Age, old), (BP, high)}.
FASW((Age, old), (BP, high)) = [ (0.1×0.6)(0.4×0.7) + (0.7×0.6)(0.1×0.7) + (0.6×0.6)(0.2×0.7)
+ (0.7×0.6)(0.5×0.7) + (0.1×0.6)(0.6×0.7) ]
= 0.0168 + 0.0294 + 0.0504 + 0.1470 + 0.0252 = 0.2688
Definition 4. Fuzzy Weighted Support: In associative classification rule mining, the association
rules are not of the general form X → Y; rather, they are the subset of these rules in which Y is the class label.
Consider a rule X → Class_label, where X is a non-empty set of fuzzy weighted attributes.
The Fuzzy Weighted Support of the rule, denoted by FWS(X → Class_label), is calculated as the
sum of the weights of all transactions in which the given class label is true, divided by the total number of
transactions. It is given by

$$ \mathrm{FWS}(X \rightarrow \mathrm{Class\_label}) = \frac{\sum_{t_k \,:\, \mathrm{class}(t_k) = \mathrm{Class\_label}} t_k[\mathrm{FASTW}(X)]}{\text{number of records in } D''} = \frac{\sum_{t_k \,:\, \mathrm{class}(t_k) = \mathrm{Class\_label}} \prod_{i=1}^{|X|} \Big[ t_k[\mu(I_i, l_j)] \times W(I_i, l_j) \Big]}{n} $$

where the sum runs over all transactions tk for which the given class_label is true and n is the number of records in D’’.
Example 4: Consider the attribute set X = [(Age, old), (BP, high)] and the rule r = [(Age, old), (BP,
high)] → (Heart_disease = "yes"). The Fuzzy Weighted Support of the rule is given by
FWS((Age, old), (BP, high) → (Heart_disease = "yes"))
= [ (0.1×0.6)(0.4×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) ] / 5
= (0.0168 + 0.0504 + 0.1470) / 5 = 0.2142 / 5
FWS(r) = 0.0428 (about 4.3%)
Definition 5. Fuzzy Weighted Confidence: The Fuzzy Weighted Confidence of a rule X → Y, where
Y represents the class label, is defined as the ratio of the Fuzzy Weighted Support of (X ∪ Y)
to the Fuzzy Weighted Support of (X), where the Fuzzy Weighted Support of (X) is taken over all
records, i.e. FASW(X)/n. It is given by

$$ \text{Fuzzy Weighted Confidence} = \frac{\text{Fuzzy Weighted Support}(X \cup Y)}{\text{Fuzzy Weighted Support}(X)} $$
Example 5: Consider the attribute set X = [(Age, old), (BP, high)] and the rule r = [(Age, old), (BP,
high)] → (Heart_disease = "yes"). The Fuzzy Weighted Confidence of the rule is given by
FWC[(Age, old), (BP, high) → (Heart_disease = "yes")] =
[ (0.1×0.6)(0.4×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) ]
/ [ (0.1×0.6)(0.4×0.7) + (0.7×0.6)(0.1×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) + (0.1×0.6)(0.6×0.7) ]
FWC(r) = 0.2142 / 0.2688
FWC(r) = 0.797 (about 79.7%)
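The support and confidence of a class association rule can be checked with a short Python sketch that applies Definitions 4 and 5 literally: a product of membership × weight within each record, a sum over the records of the given class, and division by the record count or by the sum over all records. Memberships and class labels come from Table 3 and weights from Table 4. The names `RECORDS`, `fastw`, `fws` and `fwc` are illustrative assumptions.

```python
from math import prod

# Weights from Table 4 for the two fuzzy items of the rule.
WEIGHT = {("Age", "old"): 0.6, ("BP", "high"): 0.7}

# (memberships, class label) for records 1..5, from Table 3.
RECORDS = [
    ({("Age", "old"): 0.1, ("BP", "high"): 0.4}, "Y"),
    ({("Age", "old"): 0.7, ("BP", "high"): 0.1}, "N"),
    ({("Age", "old"): 0.6, ("BP", "high"): 0.2}, "Y"),
    ({("Age", "old"): 0.7, ("BP", "high"): 0.5}, "Y"),
    ({("Age", "old"): 0.1, ("BP", "high"): 0.6}, "N"),
]


def fastw(mu, itemset):
    """Definition 2: product of (membership x weight) over the items of the set."""
    return prod(mu[item] * WEIGHT[item] for item in itemset)


def fws(itemset, label):
    """Definition 4: FASTW summed over records of the class, divided by n."""
    matching = sum(fastw(mu, itemset) for mu, cls in RECORDS if cls == label)
    return matching / len(RECORDS)


def fwc(itemset, label):
    """Definition 5: class-restricted sum divided by the sum over all records."""
    matching = sum(fastw(mu, itemset) for mu, cls in RECORDS if cls == label)
    total = sum(fastw(mu, itemset) for mu, _ in RECORDS)
    return matching / total


X = [("Age", "old"), ("BP", "high")]
print(round(fws(X, "Y"), 4), round(fwc(X, "Y"), 4))
```

In `fwc` the record count n cancels between numerator and denominator, so the confidence depends only on the two weighted sums.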
4. WEIGHTED DOWNWARD CLOSURE PROPERTY
In the classical Apriori algorithm, if an itemset is large then all its subsets are
also large; this is called the Downward Closure Property (DCP). It helps the algorithm generate
large itemsets of increasing size by adding items to itemsets that are already large. In the
weighted ARM case, where each item is assigned a weight, the DCP does not hold. To solve the
problem of invalidation of the DCP, a new framework, the "Fuzzy weighted support framework", is
designed in [2]. The authors have proved that under Fuzzy weighted support the "weighted
downward closure property" is retained: if an itemset {AC} is not
significant, then its superset, say {ACE}, cannot be significant, hence there is no need to calculate
its Fuzzy weighted support. To generate the frequent itemsets in the proposed method, the Apriori
algorithm is used, with the "Fuzzy weighted support framework" in place of the
"support – large" framework.
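The intuition behind the weighted downward closure property can be checked numerically. Under Definition 2 every factor (membership × weight) lies in [0, 1], so adding an item to an itemset can never increase its FASTW in any transaction, and therefore cannot increase its fuzzy weighted support. The Python sketch below is not the proof given in [2]; it is a brute-force check over Tables 3 and 4, with `fws_all` (support over all records, without a class restriction) and the other names being illustrative assumptions.

```python
from itertools import product
from math import prod

ITEMS = [("Age", "young"), ("Age", "middle"), ("Age", "old"),
         ("BP", "high"), ("BP", "low"), ("BP", "normal"),
         ("BMI", "mild"), ("BMI", "moderate"), ("BMI", "severe")]

# Weights from Table 4, listed in the column order of ITEMS.
WEIGHT = dict(zip(ITEMS, [0.1, 0.2, 0.6, 0.7, 0.2, 0.3, 0.3, 0.5, 0.7]))

# Membership degrees from Table 3, one row per record.
ROWS = [
    [0.2, 0.7, 0.1, 0.4, 0.0, 0.6, 0.6, 0.3, 0.1],
    [0.0, 0.3, 0.7, 0.1, 0.1, 0.8, 0.8, 0.1, 0.1],
    [0.1, 0.3, 0.6, 0.2, 0.0, 0.8, 0.6, 0.3, 0.1],
    [0.0, 0.3, 0.7, 0.5, 0.0, 0.5, 0.1, 0.2, 0.7],
    [0.1, 0.8, 0.1, 0.6, 0.0, 0.4, 0.7, 0.2, 0.1],
]
DB = [dict(zip(ITEMS, row)) for row in ROWS]


def fws_all(itemset):
    """Fuzzy weighted support of an itemset over all records."""
    return sum(prod(t[i] * WEIGHT[i] for i in itemset) for t in DB) / len(DB)


def valid_itemsets():
    """All non-empty itemsets with at most one label per attribute."""
    choices = [[None] + [it for it in ITEMS if it[0] == attr]
               for attr in ("Age", "BP", "BMI")]
    for combo in product(*choices):
        itemset = frozenset(i for i in combo if i is not None)
        if itemset:
            yield itemset


# A violation would be a superset whose support exceeds that of one of its subsets.
violations = [(X, i) for X in valid_itemsets() if len(X) > 1
              for i in X if fws_all(X) > fws_all(X - {i}) + 1e-12]
print(len(list(valid_itemsets())), "itemsets checked,", len(violations), "violations")
```

Since no superset exceeds the support of its subsets, an Apriori-style search can safely skip every extension of an itemset that already falls below the minimum fuzzy weighted support.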
5. APPLICATIONS
• Medical Application: In medical databases most of the attributes are quantitative in
nature. Discretization of these attributes suffers from the crisp boundary problem, hence a fuzzy
environment can be used. Assigning weights to the symptoms is intended to improve prediction
accuracy compared to traditional associative classifiers.
• Business Application: A fuzzy weighted environment is suitable for customer
classification based upon purchasing habits. In a fuzzy transactional database [2] the
weighted concept can be used to assign external utility to the items in the supermarket.
• Web Mining: In web mining, a visitor's page dwelling time can be used to assign weights.
• Classification Problem: The utility of Fuzzy Weighted Associative Classifiers is not
limited to health care; they can be applied in any domain to improve prediction
accuracy.

In expanded form, the Fuzzy Weighted Confidence of Definition 5 is

$$ \mathrm{FWC}(X \rightarrow \mathrm{Class\_label}) = \frac{\sum_{t_k \,:\, \mathrm{class}(t_k) = \mathrm{Class\_label}} \prod_{i=1}^{|X|} \Big[ \mu(I_i, l_j) \times W(I_i, l_j) \Big]}{\sum_{k=1}^{|D''|} \prod_{i=1}^{|X|} \Big[ \mu(I_i, l_j) \times W(I_i, l_j) \Big]}, \qquad \forall (I_i, l_j) \in X $$
6. CONCLUSION & FUTURE WORK
This work presents a new foundational approach to Fuzzy Weighted Associative Classifiers.
When quantitative attributes are discretized to obtain a transformed binary database,
each record fully belongs to only one set, and such a database suffers from the crisp boundary
problem. To deal with the crisp boundary problem of quantitative attributes in the ARM model, the Fuzzy
WARM (FWARM) algorithm has been proposed, and the weighted support and
weighted confidence have been redefined for the fuzzy environment.
Each fuzzy attribute is allowed to have a weight depending upon its importance in predicting the
class labels. A conceptual model has been presented that allows the future development of an efficient and
applicable algorithm that can capture real-life situations and can produce more accurate
classifiers, for example in medical data mining. It has already been proved that, by assigning weights
to fuzzy items and using FWARM, the selection of significant itemsets is steered to those
itemsets containing or having relationships to high-weight items. Hence using a Fuzzy Weighted
Association Rule as a classification rule is expected to improve the classification accuracy. In future work
the proposed concept needs to be implemented to find out how much accuracy is improved by
adopting the above concept. Either one of the existing associative classifiers is to be chosen, or a new
algorithm needs to be developed, that can be integrated with the Fuzzy Weighted Association Rule
miner.
REFERENCES
[1] S. Soni, O.P. Vyas, J. Pillai, Associative Classifier Using Weighted Association Rule, Symposium,
2009 World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), page(s): 1492-1496.
[2] M. Suleman Khan, Maybin Muyeba, M. Frans Coenen, Fuzzy Weighted Association Rule Mining
with Weighted Support and Confidence Framework, 2009.
[3] Zuoliang Chen, Guoqing Chen, Building an Associative Classifier Based on Fuzzy
Association Rule, International Journal of Computational Intelligence Systems, Vol. 1, No. 3
(August 2008), 262-273, 2008.
[4] Fadi Thabtah, A review of associative classification mining, The Knowledge Engineering Review,
Volume 22, Issue 1 (March 2007), Pages 37-65, 2007.
[5] Miguel Delgado, Nicolas Marin, Daniel Sanchez, and maria-Amparo Vila, Fuzzy Association Rules:
General Models and Applications, IEEE TRANSACTION ON FUZZY SYSTEM ,VOL 11, NO.2,
April 2003.
[6] Khan, M.S., Muyeba, M., Coenen, F., A Weighted Utility Framework for Mining Association Rules,
Second UKSIM European Symposium on Computer Modeling and Simulation (EMS '08), 2008,
page(s): 87-92.
[7] H. Ishibuchi and T. Yamamoto, “Rule weight specification in fuzzy rule-based classification
systems,” IEEE Trans. on Fuzzy Systems, vol. 13, no. 4, pp. 428-435, August 2005.
[8] Feng Tao, Fionn Murtagh and Mohsen Farid. Weighted Association Rule Mining using Weighted
Support and Significance Framework Proceedings of the ninth ACM SIGKDD international
conference on Knowledge discovery and data mining 2003, Pages:661-666 Year of Publication: 2003
[9] Lu, J-J.: Mining Boolean and General Fuzzy Weighted Association Rules in Databases, Systems
Engineering-Theory & Practice, 2, 28--32 (2002)
[10] W. Li, J. Han, and J. Pei. CMAR: Accurate and efficient classification based on multiple
class-association rules. In ICDM'01, pp. 369-376, San Jose, CA, Nov. 2001.
[11] Gyenesei, A.: Mining Weighted Association Rules for Fuzzy Quantitative Items, Proceedings of
PKDD Conference pp. 416--423 (2000).
[12] Wai-Ho Au Keith C.C. Chan , FARM: A Data Mining System for Discovering Fuzzy Association
Rules Proc. of the 8th IEEE Int’l Conf. on Fuzzy Systems, Seoul, Korea 1999.
[13] B. Liu, W. Hsu, and Y. Ma. Integrating Classification and Association Rule Mining. In KDD’98, New
York, NY, Aug.1998.
Authors
Mrs. Sunita Soni is a Sr. Associate Professor in the Department of Computer
Applications at Bhilai Institute of Technology, Durg (C.G.), India. She is a
post-graduate from Pt. Ravi Shankar Shukla University, India. She is a Life Fellow
member of the Indian Society for Technical Education. She has a total teaching experience
of 12 years and has a total of 16 research papers published in national/international
journals and conferences to her credit. Presently she is pursuing a PhD from Pt. Ravi
Shankar Shukla University, Raipur, under the guidance of Dr. O.P. Vyas, IIIT,
Allahabad.
Dr.O.P.Vyas is currently working as Professor and Incharge Officer (Doctoral
Research Section) in Indian Institute of Information Technology-Allahabad (Govt. of
India’s Center of Excellence in I.T.). Dr.Vyas has done M.Tech.(Computer Science)
from IIT Kharagpur and has done Ph.D. work in joint collaboration with Technical
University of Kaiserslautern (Germany) and I.I.T.Kharagpur. With more than 25 years
of academic experience Dr.Vyas has guided Four Scholars for the successful award of
Ph.D. degree and has more than 80 research publications with two books to his credit.
His current research interests are Linked Data Mining and Service Oriented
Architectures.

FUZZY WEIGHTED ASSOCIATIVE CLASSIFIER: A PREDICTIVE TECHNIQUE FOR HEALTH CARE DATA MINING

  • 1. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 DOI : 10.5121/ijcseit.2012.2102 11 FUZZY WEIGHTED ASSOCIATIVE CLASSIFIER: A PREDICTIVE TECHNIQUE FOR HEALTH CARE DATA MINING Sunita Soni 1 and O.P.Vyas 2 1 Associate Professor,Bhilai Institute of Technology, Durg-491 001, Chhattisgarh, India [email protected] 2 Professor, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India [email protected] ABSTRACT In this paper we extend the problem of classification using Fuzzy Association Rule Mining and propose the concept of Fuzzy Weighted Associative Classifier (FWAC). Classification based on Association rules is considered to be effective and advantageous in many cases. Associative classifiers are especially fit to applications where the model may assist the domain experts in their decisions. Weighted Associative Classifiers that takes advantage of weighted Association Rule Mining is already being proposed. However, there is a so-called "sharp boundary" problem in association rules mining with quantitative attribute domains. This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates classification rules using Fuzzy Weighted Support and Confidence framework. The naïve approach can be used to generating strong rules instead of weak irrelevant rules. where fuzzy logic is used in partitioning the domains. The problem of Invalidation of Downward Closure property is solved and the concept of Fuzzy Weighted Support and Fuzzy Weighted Confidence frame work for Boolean and quantitative item with weighted setting is generalized. We propose a theoretical model to introduce new associative classifier that takes advantage of Fuzzy Weighted Association rule mining. Keywords Associative Classifiers, Fuzzy Weighted Association Rule, FWAC, Fuzzy weighted support, Fuzzy weighted Confidence. 1. 
INTRODUCTION Associative Classification is an integrated framework of Association Rule Mining (ARM) and Classification. A special subset of association rules whose right-hand-side is restricted to the classification class attribute is used for classification. The traditional ARM was designed considering that items have same importance and in the database simply their presence or absence is mentioned. In several problem domains it does not make sense to assign equal importance to all the items particularly in predictive modeling system
  • 2. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 12 where attributes have different prediction capability. For example in medical domain predicting the probability of heart disease, the attribute prior-stroke is having more impact than the attribute BMI (Body Mass Index). The concept of weighted association rule mining is used to deal with the case where items are assigned a weight to reflect their importance. To deal with the situation the authors have proposed a new Weighted Associative Classifier (WAC) that generates classification rules using weighted support and Confidence framework [1]. Another problem in the medical database as well as databases from other applications is that most of the attributes are associated with quantitative domains such as BMI, Age, Blood- Pressure, etc., which are very common in many real applications, association rule mining usually needs to partition the domains in order to apply the Apriori-type method. Thus, a discovered rule X →Y reflects association between interval values of data items. Examples of such rules are {(Age,”>62”), (BMI,“45”), (Blood_pressure,“95-135”)} (Heart_Disease) and (Income[20,000- 30,000] Age[20-30]) and so on. Apart from domain discretization techniques for association rule mining, fuzzy logic is considered as suitable solution to deal with the “sharp boundary” problem. This gives rise to the notion of Fuzzy Association Rules (FAR). These rules are richer and of certain natural language nature. For example, (old Age, high Obesity and high Blood_Pressure) (chance of Heart Disease) and (medium, Income) (young Age) are fuzzy association rules, where X’s and Y’s are fuzzy sets with linguistic terms (i.e., old, high, hyper medium, and young). 
Building an associative classifier based upon fuzzy association rules provides two advantages: one is the need to mine large datasets with quantitative domains; the other is to generate classification rules with more general semantics and linguistic expressiveness [3]. This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates classification rules using Fuzzy Weighted Support and Confidence framework. The naïve approach can be used to generate strong rules instead of weak irrelevant rules. We discussed the importance of Fuzzy Weighted Association rule in classification problem. In section 2, we have discussed the concept of association rule mining and fuzzy weighted associative classifiers. In section 3 we described some new formulae and given new definitions for the same. In section 4, downward closure properties for Fuzzy weighted version of association rule mining is being discussed. Section 5 discuss some of the application area that can be benefited with the proposed concept. In section 6, conclusion and future work of this paper is given. 2. Related Work 2.1. Association Rule Mining Let I = {i1, i2… in} be a set of n distinct literals called items. D is a set of variable length transactions over I. Each transaction contains a set of items i1, i2… ik ∈ I. A transaction has an associated unique identifier called TID. An association rule is an implication of the form A⇒ B (or written as A → B), where A, B ⊆ I, and A ∩ B=∅. A is called the antecedent of the rule and B is called the consequent of the rule. The rule X ⇒ Y has a support s in the transaction set D if s% of the transactions in D contain X∪Y. In other words, the support of the rule is the probability that X and Y hold together among all the possible presented cases. It is said that the rule X⇒ Y holds
  • 3. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 13 in the transaction set D with confidence c if c% of transactions in D that contain X also contain Y. In other words, the confidence of the rule is the conditional probability that the consequent Y is true under the condition of the antecedent X. The problem of discovering all association rules from a set of transactions D consists of generating the rules that have a support and confidence greater than given thresholds. These rules are called strong rules, and the framework is known as the support confidence framework for association rule mining. 2.1.1. Weighted Association Rule Mining A weighted association rule (WAR) is an implication X→Y where X and Y are two weighted items. A pair (ij, wj) is called a weighted item where ij∈I and wj∈W is the weight associated with the item ij. A transaction is a set of weighted items where 0<wj<=1. Weight is used to show the importance of the item. For example in the context of stock market prices, some stocks have much higher value than others and might appreciate a lot more than other stocks in a comparable time period. In the supermarket context, some items like jewellery, designer clothes, etc. are of much higher value than trivia like bubblegum or candy. Rules involving jewellery may have less support than those involving candy but are much more significant in terms of the revenue (and consequently profit) earned by the store. In weighted association rule mining problem each item is allowed to have a weight. The goal is to steer the mining process to those significant relationships involving items with significant weights rather than being flooded in the combinatorial explosion of insignificant relationships [8]. 2.1.2. Fuzzy Association Rule Mining (FARM). A data mining for Discovering Fuzzy Association Rules is proposed in [11]. 
The authors have given the technique to find Fuzzy Association Rules without using the user supplied support values which are often hard to determine. The other unique feature of the work is that the conclusion of a fuzzy association rule can contain linguistic terms. The experimental result shows that the algorithm is capable of discovering both positive and negative fuzzy association rules in an effective manner from real life database. In [5] the authors have proposed a model to find the fuzzy association rules in fuzzy transaction database. The model is found to be useful technique to find the patterns in data in the presence of imprecision, either because data are fuzzy in nature or because we must improve their semantics. Authors have also discussed` some of the applications of the scheme, paying special attention to the discovery of fuzzy association rules in relational database. 2.1.3. Fuzzy Weighted Association Rule Mining Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework is proposed in [2]. The authors have addressed the issue of invalidation of downward closure property (DCP) in weighted association rule mining where each item is assigned a weight according to their significance. Formulae for fuzzy weighted support and fuzzy weighted confidence for Boolean and quantitative items with weighted settings is proposed. The methodology follows an Apriori like approach and avoids the pre and post processing as opposed to most weighted ARM algorithm, thus eliminating the extra steps during rules generation.
  • 4. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 14 A new algorithm which is applicable to Normalized and unnormalized case is proposed in [11] In this paper the authors have introduced the problem of mining Weighted Quantitative Association rules based on Fuzzy approach. Using the fuzzy set concept, the discovered rules are more understandable to a human. Two different definition of weighted Support with and without normalization is proposed. 2.1.4. Incorporating Weight in ARM The concepts of assigning weight to the attribute have never been utilized in medical domain. In Super marker context the concept of has been used to assigned more weight to the item that gives more profit per unit sale. Other example in the context of stock market prices, some stocks have much higher value than others and might appreciate a lot more than other stocks in a comparable time period. In traditional association rule mining (ARM) model only item’s presence or absence in the transaction is mentioned and its significance is not considered at profit point of view. To deal with the weighted setting environment, an algorithm called WARM (Weighted Association Rule Mining) is proposed [8] in which the item’s weighted support is measured instead of calculating only support. The weighted support is a measure of significance of an item at cost point of view. The goal of using weighted support is to make use of the weight in the mining process and prioritize the selection of target itemsets according to their significance in the dataset, rather than their frequency alone. An itemset is denoted large if its support is above a predefined minimum support threshold. In the WARM context, an itemset is said to be significant if its weighted support is above a pre-defined minimum weighted support threshold. The threshold values specified by the user are significance of item at cost point of view. 2.1.4. 
Utilizing Weight in Medical Domain Motivation to utilized weight in medical domain is from supermarket scenario where a weight is assigned to each of the items according to the profit it generates to the store, rather than simply counting and calculating the percentage of transactions that contain items. In medical domain also some of the symptoms have much impact to predict particular disease. For example in predicting the probability of heart disease, the attribute prior-stroke is having more impact than the attribute BMI (Body Mass Index).The experience of expert doctor can be utilized to assign weight to the different symptoms in medical domain. This is also the way to utilize the experience of Domain Expert in Prediction Model. In medical domain the algorithm WARM can be utilized to generate weighted CAR rule. So in this paper we have used the WARM to incorporate attribute importance instead of considering the weight separately in the algorithm. 2.1.5. Fuzziness of Quantitative Attribute In ARM model when the data are quantitative such as income, age, price, etc., which are very common in many real applications, association rule mining needs to descitization of domain to convert it in to nominal domain. And ultimately apply the Apriori-type method. Thus, association rule like X −−> Y reflects association between nominal values of data items. Examples of such rules are “(Age, old), (BP, high) (Heart_Disease, yes) , “(Income, Low)−−>(Age, medium)”, and so on. Such the mining results are affected by how the intervals are partitioned, particularly for data values around interval boundaries. That is the so-called “sharp boundary” problem.
  • 5. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 15 Subsequently, the result of associative classification may also be affected in terms of accuracy and understandability. In medical domain there are number of quantitative attributes suffers from crisp boundary problem. For example attributes for example if in a particular record the BMI (Body mass Index) is 41 them according to following discritization rule The patient is considered to be severely obese. This may not give a correct result because of sharp boundary problem. Instead by applying fuzzy logic the patient is partially belonging to each fuzzy set. Hence the patient membership value to the fuzzy set should be( µ(Obesity, ”mild”) = 0.1 , µ(Obesity, ”moderate”) = 0.3, µ(Obesity, ”sever”) = 0.6 ). To deal with crisp boundary problem of quantitative attribute in ARM model the [2] proposed the Fuzzy WARM (FWARM) Algorithm and redefine the weighted support and weighted confidence to adapt in Fuzzy environment. In Fuzzy Weighted Association Rule Mining (FWARM) model the Fuzzy Weighted Support (FWS) and Fuzzy Weighted Confidence framework is proposed to mine Fuzzy Weighted Association Rule. The algorithm FWARM can be used to generate the CAR rules in Fuzzy Weighted environment. 3. PROBLEM DEFINITION In this paper we have proposed two important modifications (weight of an attribute and fuzzyfication of quantitative attributes) in Associative Classifier to improve Prediction Accuracy in Medical Domain. Hence the problem definition consists of the terms and basic concepts to define Fuzzy Attribute Weight (FAW), Fuzzy Attribute set Transaction Weight (FASTW), Fuzzy Attribute Set Weight (FASW) Fuzzy Weighted Support (FWS) and Fuzzy Weighted Confidence (FWC) for Fuzzy Weighted Associative Classifiers (FWAC). Technique for Fuzzy Weighted Association Rule Mining is known as (FWARM). 3.1 Associative Classifiers. 
Given a set of cases with class labels as a training set, classification is to build a model (called classifier) to predict future data objects for which the class label is unknown. Associative Classification is an integrated framework of Association Rule Mining (ARM) and Classification. A special subset of association rules whose right-hand-side is restricted to the classification class attribute is used for classification. This subset of rules is referred as the Class Association Rules (CARs). Recent studies propose the extraction of a set of high quality association rules from the training data set, which satisfy certain user-specified frequency and confidence thresholds. Effective and efficient classifiers have been built by careful selection of rules, e.g., CBA, CAEP and ADT. Such a method takes the most effective rule(s) from among all the rules mined for classification. Since association rules explore highly confident associations among multiple variables, it may overcome some constraints introduced by a decision-tree induction method, which examines one variable at a time. Extensive performance studies [11] show that association based classification may have better accuracy in general [9]. BMI[26-30] Obesity=”mild” BMI[31-40] Obesity=”moderate” BMI[40-*] Obesity=”sever”
  • 6. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 16 3.2 Fuzzy Weighted Associative Classifiers. A fuzzy dataset consists of fuzzy relational database D={ r1, r2, r3…. ri…rn} with a set of attributes I=(I1, I2, ……Im}, each IK can be associated with a set of linguistic labels L={l1, l2, ……lL } for example L={young, Middle, Old}.Let each Ik is associated with fuzzy set Fk = {(Ik,l1), (Ik,l2), (Ik,l3), ……(Ik,lL)}. So that a new Fuzzy Database D’’ is defined as {(I1, l1).… (I1, lL) …. (Ik, l1),…(Ik, lL),…(Im, l1)…(Im , lL) } .Each attribute Ii in a given transaction tk is associated (to some degree) with Several fuzzy sets. The degree of association is given by a membership degree in the range [0..1]. tk[µ(Ii, lj)] will denote the degree of membership for Fuzzy Attribute Ii to fuzzy set lj in transaction tk. Table-1 Data Base with continuous domain Table-2 Transformed Binary Database D’ from D (D) R_ID Age Blood Pressure (BP) BMI (Obesity) Heart_Disease(H_D) 1 42 90-130 40 Yes 2 62 80-120 28 No 3 55 82-122 40 Yes 4 62 92-135 50 Yes 5 45 95-135 30 No (D’) R_ID Age Blood Pressure(BP) BMI(Obesity) Heart Disease ( H_D) young Middle old High Low Normal Mild Moderate Severe 1 0 1 0 1 0 0 0 1 0 Y 2 0 0 1 0 0 1 0 1 0 N 3 0 1 0 1 0 0 0 1 0 Y 4 0 0 1 1 0 0 0 0 1 Y 5 0 1 0 1 0 0 1 0 0 N
  • 7. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 17 Table-3 Database D’’ with Fuzzy Items. Table 1 shows the Example database D with continuous Domain of quantitative attribute. In Table 2 the transformed binary database (D’) is shown in which the quantitative attributes have been partitioned by converting it into categorical attribute. Consider attribute Age in Table 1 again, three new attributes (e.g. ((Age, young), (Age, middle) and (Age, old) in place of Age may be used to constitute a new database (D′′) with partial belongings of original attribute values to each of the new attributes. Table 3 illustrates an example of the new database obtained from the original database, given fuzzy sets {Young, Middle, Old} as characterized by membership functions shown in Figure 1 for attribute age. Similarly the other quantitative attribute ie Blood pressure and BMI (Obesity) are also partitioned and membership values are assigned by using corresponding membership function. Figure 1: Fuzzy Sets Y-Age, M-Age and O-Age. Here Fuzzy logic is incorporated to split the domain of quantitative attribute into intervals, and to define a set of meaningful linguistic labels represented by fuzzy sets and use them as a new domain. In this case it is possible that one item may appear with different label of same attribute. Hence the item sets are needs to be restricted to contain at most one item set per attribute because D’’ Age BP BMI H_D young Middle old High Low Normal Mild Moderate Severe 1 0.2 0.7 0.1 0.4 0 0.6 0.6 0.3 0.1 Y 2 0.0 0.3 0.7 0.1 0.1 0.8 0.8 0.1 0.1 N 3 0.1 0.3 0.6 0.2 0.0 0.8 0.6 0.3 0.1 Y 4 0.0 0.3 0.7 0.5 0.0 0.5 0.1 0.2 0.7 Y 5 0.1 0.8 0.1 0.6 0.0 0.4 0.7 0.2 0.1 N
  • 8. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 18 otherwise the rules of the form {(Age, Middle),(Age, old)….. ⇒ class_label } have no meaning. The triangle and trapezoidal are the two important membership functions that can be used to find the degree of association for the different attribute. Definition 1. Fuzzy Attribute Weight: We assign a weight W(Ii, lj) to each fuzzy Item I(Ii, lj) where( 1≤ i ≤ n), (1≤ j ≤ L) and (0≤w≤1). Table 4 shows the random weight assigns to different fuzzy attribute for heart disease Table-4 Weight of symptoms for heart disease (attribute weight). Definition 2. Fuzzy Attribute set Transaction Weight: Weight of attribute set X a particular transaction tk is denoted by tk[FATW(X)] and is calculated as the product of membership degree of attribute in given fuzzy set in the transaction tk and weight of fuzzy attribute; of all enclosing Fuzzy attribute in the set. And is given by Example 2: Consider the 2 attribute set (Age , old), (BP, high ) in transaction1 FASTW ((Age, old), (BP, high)) = (0.1×0.6)(0.4×0.7) = 0.34 Definition 3. Fuzzy Attribute Set Weight: Fuzzy Weight of attribute set X is calculated as sum of FASTW all transaction and is denoted by FASW(X). And is given by S. No. Symptoms Weights 1 2 3 4 5 6 7 8 9 (Age, young) (Age, middle) (Age, old) (BP, Norma)l (BP, Low) (Bp, High) (BMI, Mild) (BMI, Moderate) (BMI, severe) 0.1 0.2 0.6 0.3 0.2 0.7 0.3 0.5 0.7 FASW(X) = |D’’| ∑ tk [FASTW(X)] k=1 |X| ∏ (∀ (Ii, lj)ϵ X) [ tk[µ (Ii ,l j)] × W(Ii ,l j)] i=1 tk[FASTW(X) ]=
  • 9. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012 19 Example 3: Consider the 2 attribute set (Age , old), (BP, high ) . FASW ((Age, old), (BP, high)) = [ (0.1×0.6)(0.4×0.7) + (0.7×0.6)(0.1×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) + (0.1×0.6)(0.6×0.7) ] = 2.34 Definition 4. Fuzzy Weighted Support: In associative classification rule mining, the association rules are not of the form X →Y rather they are subset of these rules where Y is the class label. Fuzzy Weighted support FWS of rule X→Class_label, where X is set of non empty subsets of fuzzy weighted attribute. Fuzzy Weighted Support FWS of a rule X→Class_label is calculated as sum of weight of all transaction in which the given class label is true, divided by total number of transaction, denoted by FWS(X→Class_label ). And is given by where tk is all transaction for which the given class_label is true Example 4: Consider the attribute set X= [(Age, old), (BP, high)] and a rule r =[(Age, old), (BP, high) (Heart_disease= ”yes”) the Fuzzy Weighted Support of a rule is given by FWS((Age, old),(BP, high) (Heart_disease=”yes”)) [(0.1×0.6)(0.4×0.7)+(0.6×0.6)(0.2×0.7)+(0.7×0.6)(0.5×0.7) ] 5 FWS( r ) = 0.27 (27%) Definition 5. Fuzzy Weighted Confidence: Fuzzy Weighted Confidence of a rule X→Y where Y represents the Class label can be defined as the ratio of Fuzzy Weighted Support of (X∪Y) and Fuzzy Weighted Support of (X). And is given by |D’’| |X| ∑ ∏ (∀ (Ii, lj)ϵX) [ tk[µ (Ii ,l j)] × W(Ii ,l j)] k=1 i=1 FSAW(X) = Fuzzy Weighted Confidence = Fuzzy Weighted Support (X∪Y) Fuzzy Weighted Support (X) ∑ ∀ tk having tk[FASTW(X)] Given class _label Number of records in D’’ FWS(X→Class_label) = |X| ∑ ∀ tk having ∏∀ (Ii, lj)ϵ X [ µ (Ii ,l j) × W(Ii ,l j)] Given i=1 class _label n FWS(X→Class_label) =
  • 10. Example 5: Consider the attribute set X = [(Age, old), (BP, high)] and the rule r = [(Age, old), (BP, high)] → (Heart_disease = "yes"). Writing out the ratio of Definition 5 in terms of memberships and weights gives

$$\mathrm{FWC}(X \rightarrow \mathit{Class\_label}) = \frac{\sum_{t_k \text{ having the given class\_label}} \prod_{i=1}^{|X|} \left[ \mu(I_i, l_j) \times W(I_i, l_j) \right]}{\sum_{k=1}^{|D''|} \prod_{i=1}^{|X|} \left[ \mu(I_i, l_j) \times W(I_i, l_j) \right]}, \quad \forall (I_i, l_j) \in X$$

The Fuzzy Weighted Confidence of the rule is therefore

FWC[(Age, old), (BP, high) → (Heart_disease = "yes")] = [ (0.1×0.6)(0.4×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) ] / [ (0.1×0.6)(0.4×0.7) + (0.7×0.6)(0.1×0.7) + (0.6×0.6)(0.2×0.7) + (0.7×0.6)(0.5×0.7) + (0.1×0.6)(0.6×0.7) ]

FWC(r) = 1.37 / 2.34

FWC(r) = 0.585 (58%) (values as reported; the terms as printed give 0.2142 / 0.2688 ≈ 0.797).

4. WEIGHTED DOWNWARD CLOSURE PROPERTY
The classical Apriori algorithm assumes that if an itemset is large, then all of its subsets are also large; this is called the Downward Closure Property (DCP). It lets the algorithm generate large itemsets of increasing size by adding items to itemsets that are already large. In weighted ARM, where each item is assigned a weight, the DCP does not hold. To solve this invalidation of the DCP, a new "Fuzzy weighted support framework" is designed in [2]. The authors of [2] prove that the "weighted downward closure property" is retained under Fuzzy Weighted Support: if an itemset {AC} is not significant, then its superset, say {ACE}, cannot be significant, so there is no need to calculate its Fuzzy Weighted Support. To generate the frequent itemsets in the proposed method, the Apriori algorithm is used, with the "Fuzzy weighted support framework" in place of the "support – large" framework.

5. APPLICATIONS
• Medical Application: In medical databases most attributes are quantitative in nature. Discretization of these attributes suffers from the crisp boundary problem, hence a fuzzy environment can be used. Assigning weights to the symptoms can improve prediction accuracy compared to traditional associative classifiers.
• Business Application: A fuzzy weighted environment is suitable for classifying customers by their purchasing habits. In a fuzzy transactional database [2], the weighting concept can be used to assign external utility to the items in a supermarket.
• Web Mining: In web mining, a visitor's page dwelling time can be used to assign weights.
  • 11. • Classification Problem: The utility of Fuzzy Weighted Associative Classifiers is not limited to health care; they can be applied in any domain to improve prediction accuracy.

6. CONCLUSION & FUTURE WORK
This work presents a new foundational approach to Fuzzy Weighted Associative Classifiers. When quantitative attributes are discretized to obtain a transformed binary database, each record belongs fully to only one set, and such a database suffers from the crisp boundary problem. To deal with the crisp boundary problem of quantitative attributes in the ARM model, the Fuzzy WARM (FWARM) algorithm has been proposed, and weighted support and weighted confidence are redefined here to suit the fuzzy environment. Each fuzzy attribute is allowed a weight that depends on its importance in predicting the class labels. A conceptual model has been presented that allows the future development of an efficient and applicable algorithm that can capture real-life situations and produce more accurate classifiers, for example in medical data mining. It has already been shown that, by assigning weights to fuzzy items and using FWARM, the selection of significant itemsets is steered towards itemsets that contain, or are related to, high-weight items. Hence, using fuzzy weighted association rules as classification rules is expected to improve classification accuracy. In future work the proposed concept needs to be implemented to find out how much accuracy improves. Either one of the existing associative classifiers is to be chosen, or a new algorithm needs to be developed, that can be integrated with a fuzzy weighted association rule miner.

REFERENCES
[1] S. Soni, O.P. Vyas, J. Pillai, Associative Classifier Using Weighted Association Rule, Symposium 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC 2009), pp. 1492-1496.
[2] M. Sulaiman Khan, Maybin Muyeba, Frans Coenen, Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework, 2009.
[3] Zuoliang Chen, Guoqing Chen, Building an Associative Classifier Based on Fuzzy Association Rule, International Journal of Computational Intelligence Systems, Vol. 1, No. 3 (August 2008), pp. 262-273, 2008.
[4] Fadi Thabtah, A review of associative classification mining, The Knowledge Engineering Review, Volume 22, Issue 1 (March 2007), pp. 37-65, 2007.
[5] Miguel Delgado, Nicolas Marin, Daniel Sanchez, and Maria-Amparo Vila, Fuzzy Association Rules: General Models and Applications, IEEE Transactions on Fuzzy Systems, Vol. 11, No. 2, April 2003.
[6] M.S. Khan, M. Muyeba, F. Coenen, A Weighted Utility Framework for Mining Association Rules, Second UKSIM European Symposium on Computer Modeling and Simulation (EMS '08), 2008, pp. 87-92.
  • 12. [7] H. Ishibuchi and T. Yamamoto, "Rule weight specification in fuzzy rule-based classification systems," IEEE Trans. on Fuzzy Systems, vol. 13, no. 4, pp. 428-435, August 2005.
[8] Feng Tao, Fionn Murtagh and Mohsen Farid, Weighted Association Rule Mining using Weighted Support and Significance Framework, Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 661-666, 2003.
[9] Lu, J-J., Mining Boolean and General Fuzzy Weighted Association Rules in Databases, Systems Engineering-Theory & Practice, 2, pp. 28-32 (2002).
[10] W. Li, J. Han, and J. Pei, CMAR: Accurate and efficient classification based on multiple class-association rules. In ICDM'01, pp. 369-376, San Jose, CA, Nov. 2001.
[11] Gyenesei, A., Mining Weighted Association Rules for Fuzzy Quantitative Items, Proceedings of PKDD Conference, pp. 416-423 (2000).
[12] Wai-Ho Au, Keith C.C. Chan, FARM: A Data Mining System for Discovering Fuzzy Association Rules, Proc. of the 8th IEEE Int'l Conf. on Fuzzy Systems, Seoul, Korea, 1999.
[13] B. Liu, W. Hsu, and Y. Ma, Integrating Classification and Association Rule Mining. In KDD'98, New York, NY, Aug. 1998.

Authors
Mrs. Sunita Soni is a Sr. Associate Professor in the Department of Computer Applications at Bhilai Institute of Technology, Durg (C.G.), India. She is a post-graduate from Pt. Ravi Shankar Shukla University, India. She is a Life Fellow member of the Indian Society for Technical Education. She has 12 years of teaching experience and a total of 16 research papers published in national and international journals and conferences to her credit. She is presently pursuing a PhD at Pt. Ravi Shankar Shukla University, Raipur, under the guidance of Dr. O.P. Vyas, IIIT, Allahabad.
Dr. O.P. Vyas is currently working as Professor and In-charge Officer (Doctoral Research Section) at the Indian Institute of Information Technology, Allahabad (Govt. of India's Center of Excellence in I.T.). Dr. Vyas completed his M.Tech. (Computer Science) at IIT Kharagpur and carried out his Ph.D. work in joint collaboration with the Technical University of Kaiserslautern (Germany) and IIT Kharagpur. With more than 25 years of academic experience, Dr. Vyas has guided four scholars to the successful award of the Ph.D. degree and has more than 80 research publications and two books to his credit. His current research interests are Linked Data Mining and Service Oriented Architectures.