Principal component analysis (PCA) is a linear transformation of the data such that the first component points in the direction of greatest variation among samples. Projecting onto the first two components lets us plot the samples in two-dimensional space. We also treated the PCA loadings on each gene as expression data and ran pathway analysis with the PGSEA package. For each pathway (gene set) in the biological process branch of Gene Ontology (GO), this runs the PAGE algorithm, which performs a one-sample t-test on the loadings of the genes in that set. The adjusted P-values are used to rank the pathways for each of the first 5 principal components. Each pathway is labeled with its FDR first, followed by the principal component (PC1, PC2, and so on). Only 5 pathways are shown for each principal component, and duplicated pathways are skipped.
Multidimensional scaling (MDS) is a non-linear method that projects data points from a high-dimensional space onto a two-dimensional plane while preserving the pairwise distances between points as much as possible.
The following R code performs PCA, PGSEA, and MDS analyses:
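pca.object <- prcomp(t(x))  # x: genes in rows, samples in columns
pca <- pca.object$rotation[, 1:5]  # assumed: gene loadings on the first 5 PCs, used as "expression" data
# GeneSets() and dist2() are defined elsewhere; they are assumed to return the GO gene sets and a distance object
pg <- PGSEA(pca, cl = GeneSets(), range = c(15, 2000), p.value = TRUE, weighted = FALSE)
fit <- cmdscale(dist2(t(x)), eig = TRUE, k = 2)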
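The same ideas can be tried on toy data using only base R. The block below is a minimal sketch under stated assumptions, not the PGSEA implementation: the random matrix, the gene set gene.set, and the choice of 1 minus Pearson correlation as the sample distance are all hypothetical stand-ins, since the source does not define x, GeneSets(), or dist2().

```r
# Toy data: 200 genes x 6 samples (hypothetical stand-in for the matrix x).
set.seed(1)
x <- matrix(rnorm(200 * 6), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))

# PCA with samples as rows; $rotation holds the per-gene loadings.
pca.object <- prcomp(t(x))
loadings <- pca.object$rotation[, 1:5]

# Hypothetical gene set; a real analysis would use GO biological process sets.
gene.set <- paste0("gene", 1:20)

# One-sample t-test (mean loading vs. 0) for this set on each of the first 5 PCs,
# followed by FDR adjustment of the resulting P-values.
p.values <- apply(loadings[gene.set, ], 2, function(v) t.test(v)$p.value)
fdr <- p.adjust(p.values, method = "fdr")

# Classical MDS on an assumed sample distance of 1 - Pearson correlation.
sample.dist <- as.dist(1 - cor(x))
fit <- cmdscale(sample.dist, eig = TRUE, k = 2)

# fit$points holds the 2D coordinates of the samples.
plot(fit$points, xlab = "Dimension 1", ylab = "Dimension 2")
```

With a real expression matrix, the t-test would be repeated over every gene set and the pathways ranked by adjusted P-value within each principal component, as described above.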