Outlier Detection Method on UCI Repository Dataset by Entropy Based Rough K-means

  • Ashok P. Research scholar, Bharathiar University, Coimbatore
  • G.M Kadhar Nawaz Research scholar, Bharathiar University, Coimbatore
Keywords: Clustering process, entropy, rough set, outlier detection, validity index, data mining

Abstract

Rough set theory handles uncertainty and incomplete information by using two sets, the lower and upper approximations. In this paper, the clustering process is improved by adapting the preliminary centroid selection method of the rough K-means (RKM) algorithm. The entropy based rough K-means (ERKM) method is developed by applying entropy based preliminary centroid selection to RKM, and it is executed and validated using cluster validity indexes. An example shows that ERKM performs effectively when the preliminary centroids are selected by entropy. In addition, outlier detection is an important task in data mining; an outlier is an object that is very different from the rest of the objects in its cluster. The entropy based rough outlier factor (EROF) method is used to detect outliers effectively in the yeast dataset. An example shows that EROF detects outliers effectively on protein localisation sites and that the ERKM clustering algorithm performs effectively. Further, experimental results show that the ERKM and EROF methods outperform the other methods.
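To make the idea of lower and upper approximations concrete, the following is a minimal sketch of the assignment step of a generic rough K-means, not the ERKM method of this paper. The function name `rough_assign`, the ratio-based `threshold` parameter, and its default value are assumptions made for illustration only; the entropy based centroid selection and the EROF outlier factor are not shown because the abstract does not specify them.

```python
import math

def rough_assign(points, centroids, threshold=1.3):
    """Assign points to lower/upper approximations of each cluster.

    Hypothetical helper: `threshold` is an assumed ratio. A point is
    treated as ambiguous when another centroid lies within
    threshold * (distance to its nearest centroid).
    """
    k = len(centroids)
    lower = [[] for _ in range(k)]
    upper = [[] for _ in range(k)]
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=dists.__getitem__)
        d_min = dists[nearest]
        # Other clusters that are almost as close as the nearest one.
        close = [j for j in range(k)
                 if j != nearest and dists[j] <= threshold * d_min]
        # Every point belongs to the upper approximation of its nearest cluster.
        upper[nearest].append(p)
        if close:
            # Ambiguous point: upper approximation of all close clusters only.
            for j in close:
                upper[j].append(p)
        else:
            # Unambiguous point: also a certain (lower) member.
            lower[nearest].append(p)
    return lower, upper

points = [(0.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
centroids = [(0.0, 0.0), (10.0, 0.0)]
lower, upper = rough_assign(points, centroids)
```

In this toy input the midpoint `(5.0, 0.0)` is equidistant from both centroids, so it lands in both upper approximations and in neither lower approximation, which is how a rough clustering represents an uncertain object.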

 

Author Biographies

Ashok P., Research scholar, Bharathiar University, Coimbatore
Mr P. Ashok received his MSc (Computer Science) and MPhil (Computer Science) from Periyar University, Salem, India in 2008 and 2009, respectively. He is currently pursuing his PhD (Computer Science) at Bharathiar University. His research interests include data mining, rough sets, fuzzy logic, and bioinformatics.
G.M Kadhar Nawaz, Research scholar, Bharathiar University, Coimbatore
Dr G.M. Kadhar Nawaz received his PhD (Computer Science) from Periyar University, Salem and his MCA from Madras University, Chennai. He is currently working as Director, Department of Computer Applications, Bharathiar University. He has presented and published over 30 papers in national and international conferences and journals. His areas of interest are network security, steganography, and cryptography. His research interests include data mining, networking, fuzzy logic, and image processing.
Published
2016-03-23
How to Cite
P., A., & Nawaz, G. (2016). Outlier Detection Method on UCI Repository Dataset by Entropy Based Rough K-means. Defence Science Journal, 66(2), 113-121. https://doi.org/10.14429/dsj.66.9463
Section
Computers & Systems Studies