Modeling Dependencies in Protein-DNA Binding Sites:
Analysis of Yeast Clusters
Another rich collection of gene clusters, based on gene expression assays, was collected by Tavazoie et al., 2000. In analyzing this data, we've considered only the experiments where at least 50 target genes were significant at a p-value of 0.01. For each of those 43 clusters, we've performed a 5-fold cross-validation test, learning a model on 80% of the sites and predicting binding in the remaining 20%. We've repeated this procedure for all our models (PSSM, Tree, Mixture of PSSMs and Mixture of Trees), and calculated the sensitivity and specificity measures, as well as the hypergeometric p-value of the resulting partition. The following table presents these results, along with additional raw information.

| # | Name | Regulated | PSSM (sens, spec, pval) | Tree (sens, spec, pval) | Mixture of PSSMs (sens, spec, pval; Clust) | Mixture of Trees (sens, spec, pval; Clust) |
|---|------|-----------|-------------------------|-------------------------|--------------------------------------------|--------------------------------------------|
| 1 | cl1 | 136 | 30%, 38%, 5.6e-28 | 30%, 37%, 1.3e-27 | 32%, 33%, 2.0e-26; Clust=(0.53 0.47) | 19%, 30%, 7.7e-15; Clust=(0.59 0.41) |
| 2 | cl12 | 70 | 2%, 5%, 3.0e-01 | 2%, 4%, 3.3e-01 | 1%, 1%, 7.7e-01; Clust=(0.34 0.66) | 1%, 3%, 5.5e-01; Clust=(0.60 0.40) |
| 3 | cl13 | 89 | 12%, 31%, 9.3e-09 | 12%, 28%, 2.5e-08 | 6%, 15%, 2.1e-03; Clust=(0.49 0.51) | 6%, 13%, 3.9e-03; Clust=(0.65 0.35) |
| 4 | cl14 | 71 | 25%, 12%, 1.7e-08 | 25%, 12%, 1.9e-08 | 19%, 10%, 1.4e-05; Clust=(0.17 0.83) | 15%, 10%, 1.3e-04; Clust=(0.46 0.54) |
| 5 | cl15 | 109 | 4%, 16%, 7.5e-03 | 4%, 17%, 6.5e-03 | 4%, 8%, 1.1e-01; Clust=(0.36 0.64) | 1%, 3%, 7.0e-01; Clust=(0.43 0.57) |
| 6 | cl16 | 90 | 22%, 13%, 4.8e-08 | 22%, 13%, 5.4e-08 | 18%, 12%, 3.6e-06; Clust=(0.47 0.53) | 11%, 9%, 3.0e-03; Clust=(0.35 0.65) |
| 7 | cl17 | 79 | 3%, 15%, 1.9e-02 | 3%, 12%, 3.9e-02 | 2%, 4%, 3.9e-01; Clust=(0.34 0.66) | 5%, 7%, 7.7e-02; Clust=(0.42 0.58) |
| 8 | cl18 | 80 | 2%, 6%, 2.7e-01 | 3%, 8%, 9.9e-02 | 5%, 7%, 9.8e-02; Clust=(0.35 0.65) | 8%, 14%, 5.6e-04; Clust=(0.35 0.65) |
| 9 | cl19 | 64 | 7%, 12%, 2.4e-03 | 9%, 15%, 2.7e-04 | 4%, 6%, 1.0e-01; Clust=(0.53 0.47) | 9%, 12%, 8.9e-04; Clust=(0.49 0.51) |
| 10 | cl2 | 171 | 19%, 60%, 2.3e-26 | 19%, 60%, 2.3e-26 | 27%, 50%, 4.7e-33; Clust=(0.49 0.51) | 27%, 44%, 5.6e-30; Clust=(0.48 0.52) |
| 11 | cl20 | 71 | 8%, 16%, 3.6e-04 | 8%, 18%, 2.2e-04 | 5%, 7%, 5.3e-02; Clust=(0.50 0.50) | 5%, 8%, 4.1e-02; Clust=(0.46 0.54) |
| 12 | cl21 | 56 | 3%, 5%, 2.1e-01 | 3%, 5%, 2.1e-01 | 5%, 5%, 1.4e-01; Clust=(0.39 0.61) | 5%, 4%, 1.5e-01; Clust=(0.42 0.58) |
| 13 | cl22 | 79 | 3%, 9%, 6.7e-02 | 3%, 9%, 6.7e-02 | 7%, 15%, 1.1e-03; Clust=(0.47 0.53) | 5%, 10%, 3.2e-02; Clust=(0.27 0.73) |
| 14 | cl23 | 55 | 3%, 4%, 2.6e-01 | 3%, 4%, 2.4e-01 | 1%, 2%, 5.8e-01; Clust=(0.55 0.45) | 1%, 1%, 6.9e-01; Clust=(0.57 0.43) |
| 15 | cl24 | 82 | 6%, 13%, 5.0e-03 | 6%, 13%, 5.6e-03 | 3%, 6%, 2.1e-01; Clust=(0.29 0.71) | 3%, 5%, 2.4e-01; Clust=(0.43 0.57) |
| 16 | cl25 | 64 | 3%, 3%, 3.7e-01 | 3%, 3%, 3.7e-01 | 0%, 0%, 1.0e+00; Clust=(0.53 0.47) | 0%, 0%, 1.0e+00; Clust=(0.60 0.40) |
| 17 | cl27 | 56 | 17%, 7%, 5.9e-04 | 17%, 7%, 5.6e-04 | 5%, 2%, 4.5e-01; Clust=(0.51 0.49) | 7%, 4%, 1.4e-01; Clust=(0.31 0.69) |
| 18 | cl28 | 52 | 7%, 16%, 1.2e-03 | 5%, 13%, 9.1e-03 | 3%, 5%, 1.8e-01; Clust=(0.41 0.59) | 9%, 12%, 1.2e-03; Clust=(0.57 0.43) |
| 19 | cl3 | 100 | 6%, 17%, 1.9e-03 | 5%, 14%, 1.0e-02 | 6%, 11%, 1.5e-02; Clust=(0.46 0.54) | 6%, 9%, 2.9e-02; Clust=(0.33 0.67) |
| 20 | cl30 | 57 | 14%, 23%, 4.0e-07 | 14%, 24%, 3.1e-07 | 5%, 8%, 3.8e-02; Clust=(0.44 0.56) | 7%, 11%, 7.5e-03; Clust=(0.41 0.59) |
| 21 | cl4 | 150 | 4%, 15%, 2.3e-02 | 4%, 15%, 2.1e-02 | 6%, 16%, 2.3e-03; Clust=(0.54 0.46) | 4%, 12%, 4.8e-02; Clust=(0.50 0.50) |
| 22 | cl5 | 141 | 1%, 3%, 7.8e-01 | 1%, 3%, 7.9e-01 | 2%, 7%, 4.1e-01; Clust=(0.30 0.70) | 4%, 10%, 1.1e-01; Clust=(0.42 0.58) |
| 23 | cl6 | 98 | 4%, 10%, 5.4e-02 | 4%, 9%, 7.8e-02 | 1%, 2%, 7.3e-01; Clust=(0.55 0.45) | 4%, 8%, 1.1e-01; Clust=(0.59 0.41) |
| 24 | cl7 | 95 | 17%, 25%, 9.7e-11 | 18%, 25%, 1.5e-11 | 30%, 35%, 5.5e-23; Clust=(0.21 0.79) | 11%, 18%, 7.3e-06; Clust=(0.46 0.54) |
| 25 | cl8 | 138 | 10%, 24%, 1.1e-06 | 9%, 22%, 8.1e-06 | 6%, 20%, 3.3e-04; Clust=(0.58 0.42) | 3%, 11%, 8.3e-02; Clust=(0.55 0.45) |
| 26 | cl9 | 131 | 5%, 13%, 1.3e-02 | 4%, 11%, 4.6e-02 | 3%, 10%, 1.1e-01; Clust=(0.38 0.62) | 8%, 17%, 2.1e-04; Clust=(0.38 0.62) |
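
The evaluation described at the top of this page (5-fold cross-validation, sensitivity and specificity, and a hypergeometric p-value) can be sketched roughly as follows. This is a minimal illustration and not the code used to produce the table: the function names are hypothetical, the p-value is assumed to be the upper tail of the hypergeometric distribution, and "specificity" is assumed to mean the fraction of predicted genes that are truly in the cluster, TP/(TP+FP), as is common in motif-finding work. The page itself does not define either measure.

```python
import random
from math import comb


def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train, test) index lists so that each item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def sens_spec(tp, fp, fn):
    """Return (sensitivity, specificity).

    Sensitivity is TP / (TP + FN).
    ASSUMPTION: specificity is taken as TP / (TP + FP); the source
    does not state which definition it uses.
    """
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tp / (tp + fp) if (tp + fp) else 0.0
    return sens, spec


def hypergeom_pvalue(population, positives, predicted, overlap):
    """Upper-tail probability P(X >= overlap) for a hypergeometric X.

    population: total number of genes considered
    positives:  genes in the cluster (truly regulated)
    predicted:  genes the model predicts as bound
    overlap:    predicted genes that are also in the cluster
    """
    upper = min(positives, predicted)
    tail = sum(
        comb(positives, i) * comb(population - positives, predicted - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, predicted)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the table.
    for train, test in k_fold_splits(10, k=5):
        print(len(train), len(test))
    print(sens_spec(tp=30, fp=70, fn=70))
    print(hypergeom_pvalue(10, 4, 3, 2))
```

As a small hand check of the tail formula: with 10 genes, 4 of them regulated, 3 predicted and 2 of those correct, the probability is (C(4,2)·C(6,1) + C(4,3)·C(6,0)) / C(10,3) = (36 + 4) / 120 = 1/3. Exact integer binomials are used so that very small p-values are not lost to floating-point rounding in the intermediate sums.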