Research Seminar - September 07, 2003
Clustering in Very Large Data Sets with an Application to
Satellite Image Processing
Professor Jim Bezdek
Department of Computer Science
University of West Florida
11am Tuesday 7th October, 2003
Computer Science & Software Engineering
Seminar Room 1.24
Abstract:
Recent increases in data acquisition speeds,
higher resolutions and greater storage capabilities have created many
very large data sets. Cluster analysis is one of the most computationally
intensive operations performed on such data. This talk is about research
that facilitates clustering in very large data sets (more than one
terabyte in storage). We discuss a general method for handling very large
data sets that are too large to load into a single computer's memory.
This method selects subsets of the very large data set using
progressive sampling. Sampling terminates and the sample is clustered
when the selected sample passes a statistical goodness of fit test. We
then introduce the idea of efficiently extensible image processing
algorithms. We show that clusters in the sample can be extended to the
rest of the image non-iteratively using this idea for some, but not
all, clustering algorithms. We illustrate our method with very large
satellite image data. Our computational examples suggest that, when
segmenting images with the fuzzy c-means clustering algorithm, using
roughly 30 percent of the image yields about the same accuracy as
using all of the data.
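
The three steps described in the abstract (progressive sampling with a goodness of fit test, clustering the accepted sample, and extending the result to the rest of the image in one pass) can be illustrated with a minimal sketch. It is not the speaker's implementation. The function names, the 10-bin chi-square test at the 5 percent level, the quantile-based initialisation, and the use of scalar pixel intensities in [0, 1] are all assumptions made for illustration.

```python
import random

# 95th percentile of the chi-square distribution with 9 degrees of freedom
# (10 bins - 1). The choice of 10 bins and a 5% level is an assumption.
CHI2_CRIT_DF9 = 16.919

def histogram(values, bins=10, lo=0.0, hi=1.0):
    """Fraction of values in each of `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [cnt / len(values) for cnt in counts]

def chi_square_stat(sample, full_fracs):
    """Chi-square statistic of the sample histogram against the full one."""
    n = len(sample)
    stat = 0.0
    for obs_frac, exp_frac in zip(histogram(sample), full_fracs):
        if exp_frac > 0:
            stat += (obs_frac * n - exp_frac * n) ** 2 / (exp_frac * n)
    return stat

def progressive_sample(data, start=0.05, step=0.05, seed=0):
    """Grow a random sample until it passes the goodness of fit test.

    Assumes the full histogram can be obtained in one streaming pass.
    """
    full_fracs = histogram(data)
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)
    frac = start
    while True:
        n = max(1, int(frac * len(data)))
        sample = [data[i] for i in order[:n]]
        if n >= len(data) or chi_square_stat(sample, full_fracs) < CHI2_CRIT_DF9:
            return sample
        frac += step

def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of one value x in each cluster."""
    dists = [abs(x - v) for v in centers]
    for i, d in enumerate(dists):
        if d == 0.0:  # x coincides with a center: full membership there
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [w / total for w in inv]

def fcm(data, c, m=2.0, iters=50):
    """Plain fuzzy c-means on scalar data; returns the sorted cluster centers."""
    ordered = sorted(data)
    n = len(ordered)
    # Initialise centers at evenly spaced quantiles (an illustrative choice).
    centers = [ordered[(2 * i + 1) * n // (2 * c)] for i in range(c)]
    for _ in range(iters):
        num = [0.0] * c
        den = [0.0] * c
        for x in data:
            for i, u in enumerate(memberships(x, centers, m)):
                num[i] += (u ** m) * x
                den[i] += u ** m
        centers = [num[i] / den[i] if den[i] > 0 else centers[i] for i in range(c)]
    return sorted(centers)

def extend(data, centers, m=2.0):
    """Label every value in one non-iterative pass using the sample's centers."""
    labels = []
    for x in data:
        u = memberships(x, centers, m)
        labels.append(u.index(max(u)))
    return labels
```

A typical call sequence would be `sample = progressive_sample(pixels)`, then `centers = fcm(sample, c)`, then `labels = extend(pixels, centers)`. The iterative work touches only the sample; the full data set is read once for the histogram and once for the extension. This single-pass extension works here because fuzzy c-means memberships depend only on the distances to the cluster centers, which is one plausible reading of why the abstract says the idea applies to some clustering algorithms but not all.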