Data Mining

Data mining is the process of analyzing large collections of data and searching for underlying patterns that may not be immediately noticeable. Most often the collection of data is in the form of a relational database and the pattern is a correlation between unique elements of the database.


N-Body Cluster Formation Simulations

N-body simulations are used to study the interaction between the elements of a star cluster over time. These simulations map the interaction between every element of the field and every other element at each time step. Such simulations are extremely CPU intensive. The simulation of a model with 100,000 elements would require roughly 10 billion billion (10^19) simple arithmetic calculations. At a sustained teraflop (10^12 operations per second) this calculation would take over 115 days of processing time; even on the world's fastest computer at a petaflop (10^15 operations per second) it would take close to three hours.
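The arithmetic behind estimates like this can be checked with a few lines of Python. This is a minimal sketch: the total of 10^19 operations is the figure quoted above, while the function names and the two machine speeds are illustrative assumptions rather than measurements of any real system.

```python
SECONDS_PER_DAY = 86_400


def pair_count(n: int) -> int:
    """Number of distinct element pairs evaluated in one time step: n(n-1)/2."""
    return n * (n - 1) // 2


def processing_days(total_ops: float, ops_per_second: float) -> float:
    """Days needed to execute total_ops at a sustained rate of ops_per_second."""
    return total_ops / ops_per_second / SECONDS_PER_DAY


N = 100_000        # elements in the model, as in the text
TOTAL_OPS = 1e19   # "10 billion billion" operations, as in the text

print(f"pairs per time step: {pair_count(N):,}")
# Assumed sustained speeds: one teraflop and one petaflop.
for label, rate in (("teraflop", 1e12), ("petaflop", 1e15)):
    days = processing_days(TOTAL_OPS, rate)
    print(f"{label}: {days:.2f} days ({days * 24:.1f} hours)")
```

With 100,000 elements there are just under 5 billion pairs to evaluate at every time step, which is why the total grows so quickly. The same 10^19 operations take about 115.7 days at a sustained teraflop and about 2.8 hours at a sustained petaflop.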


Proposal

Data mining on relational databases is a field of growing importance and scope; however, much of the field still centers on highly mathematical and analytical techniques. My proposal is to use star cluster formation modeling techniques to create visual data mining tools.

Similarity between element identifiers could be modeled as relative vector forces between the elements. As can be seen in the image above, the vector forces from all the other elements on a single element could be summed to a single force vector, which would determine the movement of that element. A uniform distribution of elements would transform over time into sets of clusters. These clusters would hint at underlying connections between the elements of the data set.
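The idea of summing per-element forces can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a finished tool: the `similarity` function (1.0 for records that share a label, otherwise 0.0), the softening constant, the step size, and the random 2D starting layout are all hypothetical choices made for the example.

```python
import math
import random


def similarity(a, b):
    """Hypothetical similarity measure: 1.0 if two records share a label, else 0.0."""
    return 1.0 if a == b else 0.0


def net_force(i, positions, labels, softening=0.1):
    """Sum the attraction of every other element on element i into one vector."""
    fx = fy = 0.0
    xi, yi = positions[i]
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        # Softening keeps the force finite when two elements coincide.
        dist = math.hypot(dx, dy) + softening
        w = similarity(labels[i], labels[j])
        fx += w * dx / dist
        fy += w * dy / dist
    return fx, fy


def step(positions, labels, rate=0.002):
    """Move every element a small distance along its summed force vector."""
    forces = [net_force(i, positions, labels) for i in range(len(positions))]
    return [(x + rate * fx, y + rate * fy)
            for (x, y), (fx, fy) in zip(positions, forces)]


def mean_same_label_distance(positions, labels):
    """Average distance between elements that share a label."""
    total, count = 0.0, 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if labels[i] == labels[j]:
                total += math.dist(positions[i], positions[j])
                count += 1
    return total / count


random.seed(0)
labels = [random.choice("ABC") for _ in range(60)]
positions = [(random.random(), random.random()) for _ in range(60)]  # uniform start

before = mean_same_label_distance(positions, labels)
for _ in range(100):
    positions = step(positions, labels)
after = mean_same_label_distance(positions, labels)
print(f"mean same-label distance: {before:.3f} -> {after:.3f}")
```

In this sketch dissimilar elements exert no force on each other, so each label simply contracts toward its own center. A graded similarity measure, or a repulsive term between dissimilar elements, would be natural variations to try when plotting the positions.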


Below is a video of a model of a partial cluster evolving from an initial spherical configuration.



Data Mining - Wikipedia

P. Berkhin, Survey of Clustering Data Mining Techniques, Accrue Software, 2002.

D. C. Heggie, M. Giersz, R. Spurzem, K. Takahashi, Dynamical Simulations: Methods and Comparisons, in Johannes Andersen (ed.), Highlights of Astronomy Vol. 11A, as presented at the Joint Discussion 14 of the XXIIIrd General Assembly of the IAU, 1997, 591+, Kluwer Academic Publishers, 1998.