What on my minds: Data Mining

Monday, April 20, 2009

Data Mining

K-means clustering (a popular clustering method)
Each cluster gets a center; we choose the number of clusters, K, in advance.
Find the clusters that minimize the sum of squared distances from each point to its cluster's center.
Repeat until the centers settle down.
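
To make the "sum of squared distances" concrete, here is a small sketch in R. The helper name within_ss is made up for these notes, and the data is the toy vector from ex. 54; it only handles 1-D data.

```r
# Within-cluster sum of squared distances for 1-D data.
# x: numeric vector; cluster: vector of cluster ids, same length as x.
# (within_ss is a hypothetical helper name, not a built-in function.)
within_ss <- function(x, cluster) {
  total <- 0
  for (k in unique(cluster)) {
    pts <- x[cluster == k]
    total <- total + sum((pts - mean(pts))^2)  # distances to this cluster's center
  }
  total
}

x <- c(1, 2, 3, 5, 6, 7, 8)
good <- within_ss(x, c(1, 1, 1, 2, 2, 2, 2))  # {1,2,3} vs {5,6,7,8}
bad  <- within_ss(x, c(1, 2, 2, 2, 2, 2, 2))  # {1} vs everything else
good  # 7
bad   # about 26.83
```

The first split has the smaller total, which is why K-means prefers it.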

1. Select K points as the initial centroids.
2. Repeat:
3. Form K clusters by assigning each point to its closest centroid.
4. Recompute the centroid of each cluster.
5. Until the centroids don't change.

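ex. 49
fit <- kmeans(x, 2)  # take the data and put it into 2 clusters
fit
The "Cluster means" in the output are the 2 center points.
Next, calculate the distance from each point to the centers:

library(class)  # knn() comes from the class package
knnfit <- knn(fit$centers, x, as.factor(c(-1, 1)))  # label each point by its nearest center, basically to get 2 different colors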
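
Here is a self-contained sketch of the same steps, in case the class data is not at hand. The data below is made up (two loose groups of random points), and the colors and plotting symbols are just my choices.

```r
library(class)  # provides knn()

set.seed(1)  # kmeans() starts from random centers, so fix the seed
# Made-up 2-D data: two loose groups of 50 points each
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

fit <- kmeans(x, 2)  # split the rows of x into 2 clusters
fit$centers          # one row per cluster center

# Label each point -1 or 1 according to its nearest center (knn uses k = 1 by default)
knnfit <- knn(fit$centers, x, as.factor(c(-1, 1)))

# Color the points by label and mark the two centers with crosses
plot(x, col = ifelse(knnfit == "1", "red", "blue"), pch = 19)
points(fit$centers, pch = 4, cex = 2, lwd = 2)
```

Which group ends up labeled -1 and which 1 depends on the random start, so the colors can swap between runs.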

ex. 50
Just like the last example, but using the sonar data.

ex. 51
Asks about 2 different things, metal and wood objects: do the clusters we drew correspond to the objects? Now color the points by what they actually are, using the labels in the 61st column.
Judging by the clustering, the clusters have little to do with the type of object.
We need to pick better attributes.

ex. 52
Use the clusters to classify the data. How accurate would that be?
1 - sum(knnfit == y) / length(y)  # knnfit == y compares with the actual labels; 1 minus the fraction matched = misclassification error
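
A tiny worked example of that error formula. The labels below are invented just to show the arithmetic; they are not from the sonar data.

```r
# Hypothetical labels, invented for illustration only
y      <- c(-1, -1, -1, 1, 1, 1, 1, -1)              # actual classes
knnfit <- as.factor(c(-1, -1, 1, 1, 1, -1, 1, -1))   # labels from the clusters

# 6 of the 8 labels match, so the error is 1 - 6/8
err <- 1 - sum(knnfit == y) / length(y)
err  # 0.25

# Cluster labels are arbitrary, so it may be fair to also check the flipped labeling
min(err, 1 - err)
```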

ex. 53
Repeat with all 60 columns.
You can still draw clusters, but they contribute nothing to classifying what type of material it is.

ex. 54
x <- c(1, 2, 3, 5, 6, 7, 8)

Assign the centers, then assign each point to its closest center, starting with
centers 1 & 2.

Iteration   Centers
0           1 & 2
1           1 & 5 1/6
2           2 & 6.5
3           same as iteration 2, so stop. Repeat once more to make sure the centers stay the same.
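
The hand calculation can be checked with a short loop. This is only a sketch for this 1-D, 2-center case; the cap of 10 iterations and the tie rule (ties go to the first center) are my own assumptions.

```r
x <- c(1, 2, 3, 5, 6, 7, 8)
centers <- c(1, 2)  # starting centers from the exercise

for (iter in 1:10) {  # 10 is an arbitrary safety cap
  # nearest center (1 or 2) for every point; ties go to center 1
  cluster <- ifelse(abs(x - centers[1]) <= abs(x - centers[2]), 1, 2)
  new_centers <- c(mean(x[cluster == 1]), mean(x[cluster == 2]))
  cat("iteration", iter, "centers:", new_centers, "\n")
  if (all(new_centers == centers)) break  # nothing moved, so stop
  centers <- new_centers
}
```

It goes through 1 & 5 1/6, then 2 & 6.5, and stops on the third pass when the centers repeat.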

ex. 55
Write the K-means algorithm in R.
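
One possible attempt, following the five steps listed earlier. This is a minimal sketch, not the class solution: my_kmeans and max_iter are names I made up, the starting centroids are K randomly chosen rows, and it does not handle a cluster that ends up empty.

```r
# Minimal K-means sketch. x: numeric matrix (rows = points); K: number of clusters.
# my_kmeans is a hypothetical name; empty clusters are NOT handled.
my_kmeans <- function(x, K, max_iter = 100) {
  x <- as.matrix(x)
  centers <- x[sample(nrow(x), K), , drop = FALSE]  # step 1: K random points
  for (iter in seq_len(max_iter)) {                 # step 2: repeat
    # squared distance from every point to every center (n x K matrix)
    d <- sapply(seq_len(K), function(k) rowSums(sweep(x, 2, centers[k, ])^2))
    cluster <- max.col(-d, ties.method = "first")   # step 3: closest centroid
    # step 4: recompute each centroid as the mean of its points
    new_centers <- do.call(rbind, lapply(seq_len(K), function(k)
      colMeans(x[cluster == k, , drop = FALSE])))
    if (all(new_centers == centers)) break          # step 5: centroids unchanged
    centers <- new_centers
  }
  list(centers = centers, cluster = cluster)
}

# Try it on two well-separated made-up groups
set.seed(1)
toy <- matrix(c(0, 0.1, 0.2, 10, 10.1, 10.2), ncol = 1)
fit <- my_kmeans(toy, 2)
fit$centers
fit$cluster
```

The built-in kmeans() is the better choice for real work; this is only to see the steps spelled out.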
