A FAST CLUSTERING-BASED FEATURE SUBSET SELECTION ALGORITHM FOR
HIGH-DIMENSIONAL DATA
ABSTRACT:
Feature selection involves identifying a subset of the most useful features that produces
results comparable to those of the original, entire set of features. A feature selection algorithm may be
evaluated from both the efficiency and effectiveness points of view. While the efficiency
concerns the time required to find a subset of features, the effectiveness is related to the quality
of the subset of features. Based on these criteria, a fast clustering-based feature selection
algorithm (FAST) is proposed and experimentally evaluated in this paper.
The FAST algorithm works in two steps.
In the first step, features are divided into clusters by using graph-theoretic clustering methods.
In the second step, the most representative feature that is strongly related to target classes is
selected from each cluster to form a subset of features.
Because features in different clusters are relatively independent, the clustering-based strategy of FAST
has a high probability of producing a subset of useful and independent features. To ensure the
efficiency of FAST, we adopt the efficient minimum-spanning tree (MST) clustering method.
The efficiency and effectiveness of the FAST algorithm are evaluated through an empirical
study. Extensive experiments compare FAST with several representative feature
selection algorithms. The results, on 35 publicly available real-world high-dimensional image,
microarray, and text data sets, demonstrate that FAST not only produces smaller subsets of
features but also improves the performance of four types of classifiers.
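
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the relatedness measure or the rule for splitting the tree, so the sketch assumes absolute Pearson correlation as a stand-in measure and a hypothetical `cut` threshold for removing long MST edges. The function names (`abs_corr`, `minimum_spanning_tree`, `select_features`) are likewise illustrative.

```python
import math


def abs_corr(x, y):
    """Absolute Pearson correlation (assumed stand-in for the relatedness measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return abs(sxy) / math.sqrt(sxx * syy)


def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense symmetric matrix; returns (u, v, weight) edges."""
    n = len(weights)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, weights[u][parent[u]]))
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges


def select_features(columns, target, cut=0.5):
    """Return the index of one representative feature per cluster.

    columns: list of feature columns (each a list of numbers).
    cut: hypothetical distance threshold; MST edges longer than this are removed.
    """
    n = len(columns)
    # Step 1: complete graph over features, distance = 1 - |correlation|.
    dist = [[0.0 if i == j else 1.0 - abs_corr(columns[i], columns[j])
             for j in range(n)] for i in range(n)]
    mst = minimum_spanning_tree(dist)

    # Dropping long edges turns the tree into a forest; each tree is a cluster.
    root = list(range(n))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for u, v, w in mst:
        if w <= cut:
            root[find(u)] = find(v)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # Step 2: from each cluster keep the feature most related to the target.
    return sorted(max(members, key=lambda i: abs_corr(columns[i], target))
                  for members in clusters.values())
```

For example, given two nearly identical features and a third unrelated one, `select_features` would be expected to keep one of the near-duplicates and the unrelated feature. Building the complete feature graph is quadratic in the number of features, so this sketch shows the structure of the method rather than its efficiency.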
ECWAY TECHNOLOGIES
IEEE PROJECTS & SOFTWARE DEVELOPMENTS
OUR OFFICES @ CHENNAI / TRICHY / KARUR / ERODE / MADURAI / SALEM / COIMBATORE
CELL: +91 98949 17187, +91 875487 2111 / 3111 / 4111 / 5111 / 6111
VISIT: www.ecwayprojects.com MAIL TO: ecwaytechnologies@gmail.com