ScerTF Data Sources

We collected results from eleven computational and experimental studies that report the binding specificities of transcription factors in Saccharomyces cerevisiae. These studies infer DNA binding specificities by different methods, including phylogenetic footprinting (1,2), molecular modeling (3), gene expression analysis (4), in vitro binding assays (5), chromatin immunoprecipitation (ChIP) (6), DNA immunoprecipitation with microarray detection (DIP-ChIP) (7), and protein binding microarrays (PBM) (7-9). In addition, we incorporated the SCPD database (10) into our own database to evaluate the performance of its matrices and to assimilate them into our alignment strategy. Matrices from the commercially available TRANSFAC database were evaluated using the same metrics but are not made freely available in this database. In every case, however, the TRANSFAC position weight matrices (PWMs) were outperformed by matrices from at least one of the other datasets.

To evaluate the performance of each matrix, we obtained ChIP-chip binding data from the compendium of experiments performed by Harbison et al. (6). For each ChIP experiment, the matrices annotated to that transcription factor (TF) in the compiled database were used to predict which probes should be bound in the ChIP-chip data. These predictions were compared with the observed occupancy in the ChIP experiment, and each matrix was scored using Fisher’s exact test (11). When ChIP data were unavailable for a particular TF, matrices were evaluated using data from an analysis of TF gene deletion mutants (12). In this dataset, genes are annotated as significantly up- or down-regulated in response to deletion of a particular TF. Each matrix annotated to a TF was used to predict which genes should be up- or down-regulated in the deletion mutant strain for that TF, and the predictions were again compared with the observed data using Fisher’s exact test. A few TFs lacked both immunoprecipitation data and gene deletion data; in these cases we used matrix information content to help decide among candidate matrices.
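
As a rough illustration of the two scoring ideas described above, the following Python sketch compares a set of predicted targets with a set of observed targets using a one-sided Fisher's exact test, and computes the information content of a matrix. It is not the ScerTF pipeline: the function names, the toy probe identifiers, the choice of a one-sided (enrichment) test, and the uniform background of 0.25 per base are all assumptions made for this example.

```python
from math import comb, log2


def fisher_enrichment_p(predicted, observed, universe):
    """One-sided Fisher's exact test for overlap enrichment.

    Returns P(overlap >= actual overlap) under the hypergeometric null,
    where `predicted` are probes/genes a matrix predicts as targets and
    `observed` are probes/genes bound (or differentially expressed) in
    the experiment. The one-sided tail is an assumption of this sketch.
    """
    universe = set(universe)
    predicted = set(predicted) & universe
    observed = set(observed) & universe

    n_total = len(universe)
    n_pred = len(predicted)
    n_obs = len(observed)
    overlap = len(predicted & observed)

    # Number of ways to draw n_pred items from the universe.
    denom = comb(n_total, n_pred)
    # Sum hypergeometric terms for overlaps at least as large as seen.
    tail = sum(
        comb(n_obs, k) * comb(n_total - n_obs, n_pred - k)
        for k in range(overlap, min(n_pred, n_obs) + 1)
    )
    return tail / denom


def information_content(pwm, background=0.25):
    """Total information content of a matrix, in bits.

    `pwm` is a list of columns, each a dict mapping a base to its
    probability. A uniform background is assumed (hypothetical choice).
    """
    total = 0.0
    for column in pwm:
        for p in column.values():
            if p > 0:  # treat 0 * log(0) as 0
                total += p * log2(p / background)
    return total


# Toy data (hypothetical identifiers, not taken from any real dataset).
probes = [f"p{i}" for i in range(1, 11)]
predicted_bound = {"p1", "p2", "p3"}
observed_bound = {"p1", "p2", "p3", "p4"}
p_value = fisher_enrichment_p(predicted_bound, observed_bound, probes)

toy_pwm = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},      # fully specific position
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uninformative position
]
ic_bits = information_content(toy_pwm)
```

Under these assumptions, a smaller p-value indicates that a matrix's predictions agree better with the experiment. When no experimental data exist for a TF, a matrix with higher information content is one with a more specific motif, which is one way such a score could help choose among candidates.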

Main Experimental Data Sources
Reference              Data Source
Harbison et al.        Harbison et al. Supplemental Data
Reimand et al.         Reimand et al. Supplemental Data

For each transcription factor, we also provide a brief annotation of the structure and function of the protein. These descriptions were obtained from the Saccharomyces Genome Database (13).

Individual datasets assimilated in this collection are available for download from their respective sources. Links to the literature and the datasets are provided below:

Reference              Data Source
Badis et al.           Hughes Lab Data
Foat et al.            Transfactome
Harbison et al.        Harbison et al. Supplemental Data
MacIsaac et al.        MacIsaac et al. Supplemental Data
Morozov et al.         Author Website
Pachkov et al.         SwissRegulon
TRANSFAC               http://www.gene-regulation.com
Zhao Y and GD Stormo   BEEML Homepage
Zhu J and MQ Zhang     SCPD
Zhu et al.             UniProbe


1. MacIsaac, K.D., Wang, T., Gordon, D.B., Gifford, D.K., Stormo, G.D. and Fraenkel, E. (2006) An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics, 7, 113.
2. Pachkov, M., Erb, I., Molina, N. and van Nimwegen, E. (2007) SwissRegulon: a database of genome-wide annotations of regulatory sites. Nucleic Acids Res, 35, D127-131.
3. Morozov, A.V. and Siggia, E.D. (2007) Connecting protein structure with predictions of regulatory sites. Proc Natl Acad Sci U S A, 104, 7068-7073.
4. Foat, B.C., Tepper, R.G. and Bussemaker, H.J. (2008) TransfactomeDB: a resource for exploring the nucleotide sequence specificity and condition-specific regulatory activity of trans-acting factors. Nucleic Acids Res, 36, D125-131.
5. Fordyce, P.M., Gerber, D., Tran, D., Zheng, J., Li, H., DeRisi, J.L. and Quake, S.R. (2010) De novo identification and biophysical characterization of transcription-factor binding sites with microfluidic affinity analysis. Nat Biotechnol, 28, 970-975.
6. Harbison, C.T., Gordon, D.B., Lee, T.I., Rinaldi, N.J., MacIsaac, K.D., Danford, T.W., Hannett, N.M., Tagne, J.B., Reynolds, D.B., Yoo, J. et al. (2004) Transcriptional regulatory code of a eukaryotic genome. Nature, 431, 99-104.
7. Badis, G., Chan, E.T., van Bakel, H., Pena-Castillo, L., Tillo, D., Tsui, K., Carlson, C.D., Gossett, A.J., Hasinoff, M.J., Warren, C.L. et al. (2008) A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters. Mol Cell, 32, 878-887.
8. Zhu, C., Byers, K.J., McCord, R.P., Shi, Z., Berger, M.F., Newburger, D.E., Saulrieta, K., Smith, Z., Shah, M.V., Radhakrishnan, M. et al. (2009) High-resolution DNA-binding specificity analysis of yeast transcription factors. Genome Res, 19, 556-566.
9. Zhao, Y. and Stormo, G.D. (2011) Quantitative analysis demonstrates most transcription factors require only simple models of specificity. Nat Biotechnol, 29, 480-483.
10. Zhu, J. and Zhang, M.Q. (1999) SCPD: a promoter database of the yeast Saccharomyces cerevisiae. Bioinformatics, 15, 607-611.
11. Marstrand, T.T., Frellsen, J., Moltke, I., Thiim, M., Valen, E., Retelska, D. and Krogh, A. (2008) Asap: a framework for over-representation statistics for transcription factor binding sites. PLoS One, 3, e1623.
12. Reimand, J., Vaquerizas, J.M., Todd, A.E., Vilo, J. and Luscombe, N.M. (2010) Comprehensive reanalysis of transcription factor knockout expression data in Saccharomyces cerevisiae reveals many new targets. Nucleic Acids Res, 38, 4768-4777.
13. Cherry, J.M., Adler, C., Ball, C., Chervitz, S.A., Dwight, S.S., Hester, E.T., Jia, Y., Juvik, G., Roe, T., Schroeder, M. et al. (1998) SGD: Saccharomyces Genome Database. Nucleic Acids Res, 26, 73-79.