
Research Seminar - September 07, 2003

Clustering in Very Large Data Sets
with an Application to Satellite Image Processing

Professor Jim Bezdek
Department of Computer Science
University of West Florida
11am Tuesday 7th October, 2003
Computer Science & Software Engineering
Seminar Room 1.24

Abstract:

Faster data acquisition, higher resolutions and greater storage capacity have produced many very large data sets. One of the most computationally intensive operations performed on such data is cluster analysis. This talk is about research that facilitates clustering in very large data sets (those requiring more than one terabyte of storage). We discuss a general method for handling very large data sets that cannot be loaded into a single memory. The method selects subsets of the data set by progressive sampling. Sampling terminates when the selected sample passes a statistical goodness-of-fit test, and that sample is then clustered. We then introduce the idea of efficiently extensible image processing algorithms. Using this idea, we show that for some, but not all, clustering algorithms, clusters found in the sample can be extended non-iteratively to the rest of the image. We illustrate our method with very large satellite image data. Our computational examples suggest that, when segmenting images with fuzzy c-means clustering algorithms, using roughly 30 percent of the image achieves about the same accuracy as using all of the data.
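
The pipeline outlined in the abstract (progressive sampling, a goodness-of-fit stopping rule, clustering the sample, then a non-iterative extension to every pixel) can be illustrated with a small sketch. This is not the speaker's implementation, and the abstract does not say which test or sampling schedule is used. The sketch assumes one-dimensional pixel intensities in [0, 256), a chi-square comparison of the sample histogram against the full-data histogram as the stopping test, and a fixed 5 percent growth step. All function names and parameters are hypothetical.

```python
import random

# Chi-square critical value for 15 degrees of freedom at alpha = 0.05
# (16 histogram bins minus 1). The choice of test is an assumption.
CHI2_CRIT_DF15 = 24.996


def histogram(values, bins=16, lo=0.0, hi=256.0):
    """Count values into equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def progressive_sample(data, start_frac=0.05, step_frac=0.05, seed=0):
    """Grow a random sample until its histogram fits the full-data histogram."""
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)
    full = histogram(data)  # one streaming pass over the data
    size = max(1, int(start_frac * len(data)))
    while True:
        sample = [data[i] for i in order[:size]]
        chi2 = 0.0
        for obs, f in zip(histogram(sample), full):
            if f > 0:
                expected = size * f / len(data)
                chi2 += (obs - expected) ** 2 / expected
        if chi2 <= CHI2_CRIT_DF15 or size >= len(data):
            return sample
        size = min(len(data), size + max(1, int(step_frac * len(data))))


def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of one value x in each cluster."""
    d = [abs(x - c) for c in centers]
    hits = d.count(0.0)
    if hits:  # x coincides with one or more centers
        return [1.0 / hits if di == 0.0 else 0.0 for di in d]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(sample, c=2, m=2.0, iters=100, tol=1e-4):
    """Iterative fuzzy c-means on the sample only; returns cluster centers."""
    lo, hi = min(sample), max(sample)
    centers = [lo + (k + 0.5) * (hi - lo) / c for k in range(c)]
    for _ in range(iters):
        num = [0.0] * c
        den = [0.0] * c
        for x in sample:
            for k, u in enumerate(memberships(x, centers, m)):
                w = u ** m
                num[k] += w * x
                den[k] += w
        new = [n / d if d > 0 else old
               for n, d, old in zip(num, den, centers)]
        shift = max(abs(a - b) for a, b in zip(new, centers))
        centers = new
        if shift < tol:
            break
    return centers


def extend(data, centers, m=2.0):
    """Label every value in one pass using the sample's centers."""
    labels = []
    for x in data:
        u = memberships(x, centers, m)
        labels.append(u.index(max(u)))
    return labels
```

A typical call sequence would be `sample = progressive_sample(pixels)`, then `centers = fuzzy_c_means(sample)`, then `labels = extend(pixels, centers)`. Only the middle step iterates, and it touches only the sample. The final step is a single pass over the full image, which is the sense in which the extension is non-iterative in this sketch.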
