Modeling Dependencies in Protein-DNA Binding Sites:
Analysis of Yeast Clusters
Another rich collection of gene clusters, based on functional annotations, was collected by Hughes et al., 2000. In analyzing this data, we considered only the experiments in which at least 50 target genes had a significant p-value (0.01). For each of those 43 clusters, we performed a 5-fold cross-validation test, learning a model on 80% of the sites and predicting whether binding occurs in the remaining 20%. We repeated this procedure for each of our models (PSSM, Tree, Mixture of PSSMs and Mixture of Trees), and calculated the sensitivity and specificity measures, as well as the hypergeometric p-value of this partition. The following table presents these results, along with additional raw information.

| # | NAME | Regulated genes | PSSM (sens, spec, pval) | Tree (sens, spec, pval) | Mixture of PSSMs (sens, spec, pval; Clust) | Mixture of Trees (sens, spec, pval; Clust) |
|---|------|-----------------|-------------------------|-------------------------|--------------------------------------------|--------------------------------------------|
| 1 | aminoacid_biosynthesis | 114 | 16%, 31%, 1.3e-16 | 16%, 30%, 1.9e-16 | 17%, 35%, 1.3e-18; Clust=(0.29 0.71) | 5%, 10%, 3.6e-03; Clust=(0.34 0.66) |
| 2 | aminoacid_metabolism | 173 | 13%, 34%, 6.0e-17 | 13%, 34%, 6.0e-17 | 10%, 30%, 3.4e-12; Clust=(0.59 0.41) | 4%, 12%, 4.3e-03; Clust=(0.54 0.46) |
| 3 | assembly_of_protein_complexes | 85 | 3%, 5%, 8.5e-02 | 3%, 5%, 9.3e-02 | 2%, 2%, 4.5e-01; Clust=(0.34 0.66) | 1%, 1%, 6.5e-01; Clust=(0.54 0.46) |
| 4 | biogenesis_of_cell_wall | 94 | 3%, 5%, 9.9e-02 | 3%, 5%, 9.9e-02 | 2%, 3%, 3.8e-01; Clust=(0.49 0.51) | 0%, 0%, 1.0e+00; Clust=(0.38 0.62) |
| 5 | budding_cell_polarity_and_filament_formation | 159 | 0%, 1%, 9.5e-01 | 0%, 1%, 9.5e-01 | 3%, 3%, 6.3e-01; Clust=(0.37 0.63) | 1%, 2%, 8.6e-01; Clust=(0.66 0.34) |
| 6 | carbohydrate_utilization | 249 | 4%, 13%, 7.2e-03 | 4%, 13%, 6.5e-03 | 3%, 10%, 4.8e-02; Clust=(0.40 0.60) | 2%, 7%, 2.1e-01; Clust=(0.50 0.50) |
| 7 | cell_cycle_control_and_mitosis | 289 | 2%, 11%, 1.0e-01 | 2%, 10%, 1.7e-01 | 4%, 8%, 1.5e-01; Clust=(0.63 0.37) | 3%, 8%, 2.5e-01; Clust=(0.76 0.24) |
| 8 | cell_growth | 68 | 1%, 2%, 4.7e-01 | 1%, 2%, 4.5e-01 | 1%, 1%, 6.3e-01; Clust=(0.55 0.45) | 0%, 0%, 1.0e+00; Clust=(0.75 0.25) |
| 9 | cellular_import | 95 | 3%, 5%, 1.3e-01 | 3%, 5%, 9.7e-02 | 4%, 4%, 1.0e-01; Clust=(0.34 0.66) | 7%, 10%, 3.9e-04; Clust=(0.43 0.57) |
| 10 | cytoplasmic_degradation | 93 | 33%, 28%, 1.0e-28 | 31%, 28%, 1.5e-26 | 20%, 20%, 1.6e-14; Clust=(0.60 0.40) | 13%, 21%, 1.6e-10; Clust=(0.68 0.32) |
| 11 | detoxificaton | 85 | 5%, 7%, 7.4e-03 | 7%, 9%, 1.4e-03 | 0%, 0%, 1.0e+00; Clust=(0.48 0.52) | 1%, 1%, 6.4e-01; Clust=(0.36 0.64) |

| 12 | dna_synthesis_and_replication | 80 | 12%, 17%, 4.0e-08 | 12%, 17%, 3.4e-08 | 10%, 11%, 3.1e-05; Clust=(0.61 0.39) | 5%, 6%, 2.9e-02; Clust=(0.50 0.50) |
| 13 | glucose_metabolism | 191 | 3%, 11%, 1.3e-02 | 3%, 11%, 1.7e-02 | 4%, 9%, 2.1e-02; Clust=(0.37 0.63) | 1%, 3%, 7.1e-01; Clust=(0.50 0.50) |
| 14 | homeostasis_of_other_ions | 58 | 3%, 3%, 1.6e-01 | 3%, 3%, 1.4e-01 | 0%, 0%, 1.0e+00; Clust=(0.55 0.45) | 3%, 3%, 1.4e-01; Clust=(0.48 0.52) |
| 15 | lipid_fattyacid_and_sterol_biosynthesis | 91 | 5%, 7%, 1.0e-02 | 4%, 6%, 4.0e-02 | 2%, 3%, 3.4e-01; Clust=(0.37 0.63) | 1%, 2%, 5.9e-01; Clust=(0.46 0.54) |
| 16 | meiosis | 91 | 18%, 13%, 1.9e-10 | 18%, 14%, 9.9e-11 | 10%, 9%, 3.0e-05; Clust=(0.40 0.60) | 3%, 3%, 2.0e-01; Clust=(0.56 0.44) |
| 17 | metabolism_of_vitamins_cofactors_and_prosthetic_groups | 59 | 3%, 3%, 1.8e-01 | 3%, 3%, 1.8e-01 | 3%, 2%, 2.4e-01; Clust=(0.38 0.62) | 0%, 0%, 1.0e+00; Clust=(0.51 0.49) |
| 18 | mitochondrial_organization | 320 | 6%, 36%, 8.9e-11 | 6%, 35%, 1.9e-10 | 5%, 27%, 2.0e-06; Clust=(0.45 0.55) | 1%, 8%, 3.7e-01; Clust=(0.35 0.65) |
| 19 | mitochondrial_transport | 72 | 4%, 6%, 4.1e-02 | 5%, 8%, 6.5e-03 | 4%, 4%, 9.8e-02; Clust=(0.40 0.60) | 1%, 1%, 6.1e-01; Clust=(0.56 0.44) |
| 20 | nuclear_organization | 670 | 3%, 38%, 5.0e-06 | 3%, 39%, 2.6e-06 | 3%, 18%, 1.5e-01; Clust=(0.75 0.25) | 3%, 15%, 4.6e-01; Clust=(0.17 0.83) |
| 21 | organization_of_cytoplasm | 531 | 11%, 51%, 1.3e-26 | 11%, 52%, 2.8e-27 | 17%, 28%, 4.8e-18; Clust=(0.37 0.63) | 15%, 21%, 5.3e-09; Clust=(0.62 0.38) |
| 22 | organization_of_cytoskeleton | 95 | 4%, 9%, 1.3e-02 | 3%, 5%, 9.3e-02 | 0%, 0%, 1.0e+00; Clust=(0.41 0.59) | 2%, 2%, 4.4e-01; Clust=(0.38 0.62) |

| 23 | organization_of_endoplasmatic_reticulum | 108 | 1%, 4%, 3.0e-01 | 1%, 4%, 2.7e-01 | 2%, 4%, 2.3e-01; Clust=(0.39 0.61) | 0%, 1%, 7.7e-01; Clust=(0.32 0.68) |
| 24 | organization_of_golgi | 56 | 3%, 4%, 1.2e-01 | 3%, 4%, 1.2e-01 | 10%, 7%, 3.5e-04; Clust=(0.49 0.51) | 3%, 3%, 1.9e-01; Clust=(0.31 0.69) |
| 25 | organization_of_plasma_membrane | 128 | 0%, 2%, 7.7e-01 | 0%, 2%, 7.6e-01 | 1%, 1%, 9.6e-01; Clust=(0.79 0.21) | 3%, 4%, 2.6e-01; Clust=(0.60 0.40) |
| 26 | other_transcription_activities | 57 | 7%, 4%, 2.0e-02 | 5%, 3%, 7.6e-02 | 7%, 6%, 8.0e-03; Clust=(0.50 0.50) | 3%, 2%, 2.2e-01; Clust=(0.49 0.51) |
| 27 | other_transport_facilitators | 54 | 11%, 4%, 4.2e-03 | 9%, 4%, 1.5e-02 | 3%, 2%, 2.1e-01; Clust=(0.32 0.68) | 3%, 3%, 1.5e-01; Clust=(0.58 0.42) |
| 28 | pheromone_response_matingtype_determination_sexspecific_proteins | 150 | 2%, 9%, 5.4e-02 | 2%, 9%, 5.4e-02 | 4%, 5%, 1.3e-01; Clust=(0.25 0.75) | 1%, 3%, 6.0e-01; Clust=(0.36 0.64) |
| 29 | proteases | 67 | 2%, 4%, 1.5e-01 | 2%, 4%, 1.4e-01 | 1%, 1%, 7.1e-01; Clust=(0.25 0.75) | 1%, 1%, 6.7e-01; Clust=(0.69 0.31) |
| 30 | protein_kinase | 128 | 1%, 2%, 5.8e-01 | 1%, 3%, 5.7e-01 | 7%, 3%, 3.8e-01; Clust=(0.52 0.48) | 0%, 1%, 8.9e-01; Clust=(0.72 0.28) |
| 31 | protein_targeting_sorting_and_translocation | 110 | 2%, 7%, 8.3e-02 | 2%, 6%, 9.2e-02 | 1%, 3%, 4.6e-01; Clust=(0.40 0.60) | 1%, 3%, 3.7e-01; Clust=(0.32 0.68) |
| 32 | recombination_and_dna_repair | 83 | 3%, 5%, 9.9e-02 | 3%, 4%, 1.0e-01 | 1%, 1%, 7.1e-01; Clust=(0.69 0.31) | 0%, 0%, 1.0e+00; Clust=(0.44 0.56) |
| 33 | regulation_of_carbohydrate_utilization | 116 | 3%, 6%, 6.5e-02 | 3%, 6%, 7.5e-02 | 4%, 4%, 1.7e-01; Clust=(0.65 0.35) | 1%, 3%, 4.2e-01; Clust=(0.47 0.53) |

| 34 | respiration | 70 | 15%, 18%, 1.1e-09 | 15%, 18%, 7.6e-10 | 8%, 9%, 3.6e-04; Clust=(0.44 0.56) | 2%, 4%, 1.7e-01; Clust=(0.69 0.31) |
| 35 | ribosomal_proteins | 198 | 39%, 49%, 4.5e-69 | 39%, 49%, 4.5e-69 | 43%, 48%, 2.2e-75; Clust=(0.45 0.55) | 34%, 44%, 3.9e-56; Clust=(0.64 0.36) |
| 36 | rpl | 76 | 67%, 28%, 8.6e-56 | 68%, 29%, 1.2e-57 | 67%, 29%, 1.6e-56; Clust=(0.22 0.78) | 55%, 28%, 2.4e-44; Clust=(0.82 0.18) |
| 37 | rps | 56 | 57%, 20%, 1.8e-33 | 57%, 20%, 1.4e-33 | 50%, 17%, 1.1e-26; Clust=(0.34 0.66) | 41%, 18%, 1.8e-22; Clust=(0.79 0.21) |
| 38 | sporulation_and_germination | 92 | 0%, 0%, 1.0e+00 | 0%, 0%, 1.0e+00 | 2%, 2%, 6.1e-01; Clust=(0.28 0.72) | 1%, 1%, 8.0e-01; Clust=(0.38 0.62) |
| 39 | stress_response | 138 | 4%, 10%, 7.0e-03 | 4%, 11%, 5.9e-03 | 2%, 5%, 2.6e-01; Clust=(0.39 0.61) | 2%, 4%, 2.9e-01; Clust=(0.51 0.49) |
| 40 | transcriptional_control | 295 | 3%, 18%, 4.7e-03 | 3%, 17%, 5.3e-03 | 3%, 9%, 1.8e-01; Clust=(0.21 0.79) | 1%, 8%, 3.9e-01; Clust=(0.48 0.52) |
| 41 | transcription_factors | 226 | 3%, 12%, 2.1e-02 | 3%, 12%, 2.3e-02 | 3%, 7%, 1.5e-01; Clust=(0.22 0.78) | 3%, 14%, 3.1e-03; Clust=(0.57 0.43) |
| 42 | translation_initiation_elongation_and_termination | 62 | 4%, 3%, 8.7e-02 | 4%, 4%, 7.4e-02 | 3%, 1%, 4.4e-01; Clust=(0.36 0.64) | 8%, 6%, 4.7e-03; Clust=(0.63 0.37) |
| 43 | vesicular_transport_golgi_network_etc | 93 | 2%, 4%, 2.3e-01 | 1%, 2%, 6.0e-01 | 1%, 1%, 7.9e-01; Clust=(0.64 0.36) | 2%, 3%, 3.2e-01; Clust=(0.31 0.69) |
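
The three measures reported per model (sensitivity, specificity, hypergeometric p-value) and the 5-fold split can be illustrated with a minimal Python sketch. The page does not define its measures, so the definitions below are assumptions: sensitivity is taken as TP/(TP+FN), "specificity" as TP/(TP+FP) (a convention often seen in motif-finding work; the textbook TN/(TN+FP) would be a one-line change), and the p-value as the upper tail of a hypergeometric distribution over the overlap between predicted and regulated genes. All function and parameter names are hypothetical and are not taken from the authors' code.

```python
from math import comb


def hypergeom_tail(total, targets, predicted, hits):
    """P(X >= hits) when `predicted` genes are drawn without replacement
    from `total` genes, of which `targets` are regulated (assumed model)."""
    upper = min(targets, predicted)
    numerator = sum(
        comb(targets, i) * comb(total - targets, predicted - i)
        for i in range(hits, upper + 1)
    )
    return numerator / comb(total, predicted)


def partition_scores(total, targets, predicted, hits):
    """Return (sensitivity, specificity, p-value) under the assumed definitions."""
    sensitivity = hits / targets if targets else 0.0      # TP / (TP + FN)
    specificity = hits / predicted if predicted else 0.0  # TP / (TP + FP), assumed
    return sensitivity, specificity, hypergeom_tail(total, targets, predicted, hits)


def five_fold_splits(sites, k=5):
    """Yield (train, test) pairs: each fold is held out once (20% when k=5)."""
    folds = [sites[i::k] for i in range(k)]
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]
```

For instance, `partition_scores(6000, 114, 60, 18)` scores a prediction of 60 genes that recovers 18 of 114 regulated genes in a genome of 6000; these counts are invented for illustration and are not taken from the table.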