Biclustering

For large gene expression datasets, biclustering detects groups of genes that are correlated across only a subset of samples. This differs from traditional clustering methods, where correlation is defined across all samples.
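The difference can be illustrated with a minimal sketch in plain Python. The expression values and the sample subset below are hypothetical, chosen only to show two genes that are perfectly correlated on three samples yet not correlated across all six. The `pearson` helper is written here for illustration and is not part of any biclustering package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values for two genes across six samples.
gene_a = [1.0, 2.0, 3.0, 5.0, 1.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 1.0, 5.0, 2.0]

# Assumed subset of samples in which the two genes move together.
subset = [0, 1, 2]

# Correlation over all samples (what traditional clustering would use).
r_all = pearson(gene_a, gene_b)

# Correlation restricted to the subset (what a bicluster can capture).
r_subset = pearson([gene_a[i] for i in subset], [gene_b[i] for i in subset])

print(f"all samples: {r_all:.2f}, subset only: {r_subset:.2f}")
```

A method that scores gene pairs over all samples would not group these two genes, while a biclustering method can, because it is free to choose the sample subset along with the gene set.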

The biclust R package (Kaiser & Leisch 2008) provides a common interface to several biclustering algorithms. We used the default settings of the biclust package. The following descriptions are compiled from the biclust help files:

  • BCCC: Performs CC Biclustering based on the framework by Cheng and Church (2000).
  • BCXmotifs: Performs XMotifs Biclustering based on the framework by Murali and Kasif (2003). Searches for a submatrix where each row has a similar motif across all columns. The algorithm requires a discrete matrix as input.
  • BCPlaid: Performs Plaid Model Biclustering as described in Turner et al. (2003). This is an improvement on the original ‘Plaid Models for Gene Expression Data’ (Lazzeroni and Owen, 2002). The algorithm models the data matrix as a sum of layers; the model is fitted to the data by minimizing the error.
  • BCSpectral: Performs Spectral Biclustering as described in Kluger et al. (2003). Spectral biclustering assumes that normalized microarray data matrices have a checkerboard structure that can be discovered using singular value decomposition (SVD) into eigenvectors, applied to genes (rows) and conditions (columns).
  • BCBimax: Performs Bimax Biclustering based on the framework by Prelic et al. (2006). It searches for submatrices of ones in a logical matrix. Uses the authors' original C code.
  • BCQuest: Performs Questmotif Biclustering, a biclustering algorithm for questionnaires based on the framework by Murali and Kasif (2003). Searches for subgroups of questionnaires with the same or similar answers to some questions.
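As an illustration of how one of these methods judges a candidate bicluster, the sketch below computes the mean squared residue score introduced by Cheng and Church (2000), which their approach tries to keep below a threshold. It is a plain-Python illustration of the score only, not the biclust implementation; the function name `mean_squared_residue` and the small expression matrix are assumptions made for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue H(I, J) of the submatrix given by rows and cols.

    Each residue is a_ij - rowmean_i - colmean_j + overall_mean, so a
    submatrix whose rows differ only by an additive shift scores zero.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [
        sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)
    ]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical matrix: 4 genes (rows) by 4 samples (columns).
# Genes 0-2 follow the same additive pattern on samples 0-2.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 1.0, 5.0, 2.0],
]

# Score of the coherent submatrix versus the full matrix.
h_bicluster = mean_squared_residue(expr, rows=[0, 1, 2], cols=[0, 1, 2])
h_full = mean_squared_residue(expr, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])

print(f"bicluster: {h_bicluster:.4f}, full matrix: {h_full:.4f}")
```

The coherent submatrix scores essentially zero while the full matrix scores well above it, which is the kind of contrast a residue-based search exploits when deciding which rows and columns to keep.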

In addition, iDEP includes the QUBIC package by Zhang et al. (2017) and the runibic package by Orzechowski et al. (2017).

References:

  1. Cheng, Y. & Church, G.M. “Biclustering of Expression Data”, Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology, 2000, 1, 93-103.
  2. Kluger et al. “Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions”, Genome Research, 2003, vol. 13, pages 703-716.
  3. Kaiser, S. & Leisch, F. “A Toolbox for Bicluster Analysis in R”, Department of Statistics: Technical Reports, No. 28, 16 April 2008. https://epub.ub.uni-muenchen.de/3293/
  4. Lazzeroni and Owen. “Plaid Models for Gene Expression Data”, Stanford University, 2002.
  5. Murali, T. & Kasif, S. “Extracting Conserved Gene Expression Motifs from Gene Expression Data”, Pacific Symposium on Biocomputing, sullivan.bu.edu, 2003, 8, 77-88.
  6. Prelic, A.; Bleuler, S.; Zimmermann, P.; Wille, A.; Buhlmann, P.; Gruissem, W.; Hennig, L.; Thiele, L. & Zitzler, E. “A Systematic Comparison and Evaluation of Biclustering Methods for Gene Expression Data”, Bioinformatics, Oxford Univ Press, 2006, 22, 1122-1129.
  7. Turner et al. “Improved biclustering of microarray data demonstrated through systematic performance tests”, Computational Statistics and Data Analysis, 2003, vol. 48, pages 235-254.
  8. Zhang, Y. et al. “QUBIC: a bioconductor package for qualitative biclustering analysis of gene co-expression data”, Bioinformatics, 2017, 33, 450-452.
  9. Orzechowski, P. et al. “runibic: a Bioconductor package for parallel row-based biclustering of gene expression data”, BioRxiv, 2017.

 
