R Code for DDclust and DDclass:
Algorithms described in
"Clustering and Classification based on the L1 data depth"
General information. I have posted this code; feel free to use it. However, please do not redistribute the code without referring to this page and the source paper.
About the code:
- R code - not optimized for speed
- Required libraries: MASS, class, cluster
- Helper functions
- Source the functions in the files below: Basicfcns.q, Basicfcns2.q
- Main programs
- The main programs are DDclass.q for classification and DDclust.q for clustering.
- Validation and Visualization
- ReDplot.q, DDplot.q
- Input for DDclass
- Required: (nc, data.resptr, data.mattr, data.matte)
- nc - the number of classes
- data.resptr - the labels of the training set (label range 0 to C-1, where C is the number of classes)
- data.mattr - the training data matrix (rows=samples, columns=genes)
- data.matte - the test data matrix
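A minimal sketch of preparing inputs in the layout described above, assuming the class labels start out as a factor and that a fixed row split is acceptable. The object names (`expr`, `labs`, `train.idx`) and the synthetic data are hypothetical and not part of the posted code:

```r
# Hypothetical data: 12 samples (rows) by 5 genes (columns), 3 classes.
set.seed(1)
expr <- matrix(rnorm(12 * 5), nrow = 12, ncol = 5)
labs <- factor(rep(c("ALL-B", "ALL-T", "AML"), each = 4))

# Recode the labels to integers 0, ..., C-1, the range DDclass expects.
nc <- nlevels(labs)
resp <- as.integer(labs) - 1L

# Split rows into training and test sets (3 samples per class for training).
train.idx <- c(1:3, 5:7, 9:11)
data.resptr <- resp[train.idx]
data.mattr  <- expr[train.idx, , drop = FALSE]
data.matte  <- expr[-train.idx, , drop = FALSE]

# With DDclass.q sourced, the call would then be:
# Dout <- DDclass(nc, data.resptr, data.mattr, data.matte)
```

The `drop = FALSE` arguments keep both subsets as matrices even if a split leaves a single row.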
- Output from DDclass
- ReDtrain, ReDtest - the relative data depths
- Ntest, NtestCV - the predictions
- Ntestm, Ntestc (median and centroid prototype), Ntestk (kNN), Ntests (SILclass), NtestsCV (SILclassCV)
- IVT - training observations removed by DDclass CV. IV - removed by SILclass CV.
- Input for DDclust
- Required: X, K, lambda, Th (full call: DDclust(X, K, lambda, Th, A=20, T0=0, alpha=.9, lplot=0))
- K - the number of clusters
- X - the data matrix (rows are clustered)
- Th - threshold to identify objects that can be relocated, usually set to 0
- Optional: A, T0, alpha, lplot
- A - number of iterations, default 20
- T0 (1/beta) - default 0
- alpha - decay rate, default .9
- lplot - tracking convergence, default 0
- Output from DDclust
- NN - cluster assignment, NN[1,] is the final allocation
- Y - the multivariate median cluster representatives
- Cost - final value of partition
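A small sketch of reading the clustering output, using a made-up stand-in for the list returned by DDclust. Only the fact that `NN[1,]` holds the final allocation comes from the description above; the number of rows in `NN`, the cluster label coding, and all values are assumptions for illustration:

```r
# Stand-in for the list returned by DDclust (all values are made up).
Dout <- list(
  NN   = rbind(c(1, 1, 2, 3, 3, 2),   # row 1: final allocation
               c(1, 2, 2, 3, 1, 2)),  # hypothetical earlier allocation
  Cost = 4.2
)

# Final cluster label for each clustered row of X.
final <- Dout$NN[1, ]

# Cluster sizes and the row indices belonging to each cluster.
sizes   <- table(final)
members <- split(seq_along(final), final)
```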
- Basicfcns.q - Some basic functions, cross validation, data depths, etc.
- Basicfcns2.q - Some basic functions for DDclust
- DDplot.q - Data depth plot
- ReDplot.q - Relative data depth plot
- DDclass.q - Classification algorithm
- DDclust.q - Clustering algorithm
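A minimal loading sketch, assuming the six `.q` files listed above have been saved in the current working directory. The `code.dir` variable is hypothetical; files that are not found are skipped with a message rather than stopping the script:

```r
# The three required packages ship with standard R distributions.
library(MASS)
library(class)
library(cluster)

code.dir <- "."  # assumption: folder holding the downloaded .q files
code.files <- c("Basicfcns.q", "Basicfcns2.q", "DDplot.q",
                "ReDplot.q", "DDclass.q", "DDclust.q")

# Source the helper files first, then the plotting and main programs.
loaded <- vapply(code.files, function(f) {
  path <- file.path(code.dir, f)
  if (!file.exists(path)) {
    message("not found, skipping: ", path)
    return(FALSE)
  }
  source(path)  # definitions go into the global environment
  TRUE
}, logical(1))
```

Afterwards `loaded` is a named logical vector showing which files were actually sourced.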
- Calling the functions:
- example:
- Dout<-DDclass(3,lander.train,lander.traindat,lander.testdat)
- testerror: length(Dout$Ntest[Dout$Ntest!=lander.test])
- Dout<-DDclust(lander.dat,3,.5,0)
- ReD<-ReDplot(t(Dout$NN),lander.dat,3)
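The test-error line above counts the misclassified test samples. A small sketch of the same count, plus an error rate and a confusion table, using made-up stand-in vectors (`pred` in place of `Dout$Ntest` and `truth` in place of `lander.test`; neither comes from the real data):

```r
# Stand-ins for Dout$Ntest and lander.test, with labels coded 0, ..., C-1.
pred  <- c(0, 0, 1, 2, 2, 1, 0, 2)
truth <- c(0, 0, 1, 2, 1, 1, 0, 0)

# Same count as length(pred[pred != truth]).
n.err <- sum(pred != truth)

# Proportion of test samples that were misclassified.
err.rate <- mean(pred != truth)

# Cross-tabulate predicted labels against true labels.
conf.tab <- table(predicted = pred, true = truth)
```

Off-diagonal cells of `conf.tab` show which classes are confused with each other, which the single error count does not reveal.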
Back to : Rebecka
05/03