Modeling Dependencies in Protein-DNA Binding Sites:
Analysis of Yeast Clusters
Another rich collection of gene datasets was assembled by Hughes et al. (2000). These gene clusters are based on functional annotations and were originally analyzed using AlignACE. That analysis comprised multiple runs of AlignACE, followed by filtering based on the quality of the motifs found. The best PSSMs were reported for each cluster.

To gauge the quality of our baseline method, we compared the PSSMs learned by our procedures to those learned and reported by Hughes et al. For this task we used the whole training data (as done by AlignACE) and examined the two learned motifs for each group by comparing their sensitivity, specificity, and hypergeometric p-value. In addition, we report results from runs of AlignACE with its default parameters. All results, including raw information, are shown in the following table.

| # | NAME | PSSM (sens, spec, pval) | Hughes et al. (sens, spec, pval) | Default AlignACE (sens, spec, pval) |
|---|------|-------------------------|----------------------------------|-------------------------------------|
| 1 | aminoacid_biosynthesis | 24%, 38%, 2.1e-27 | 8%, 4%, 7.6e-02 | 7%, 3%, 2.9e-01 |
| 2 | aminoacid_metabolism | 14%, 39%, 8.9e-20 | 9%, 9%, 8.1e-04 | 20%, 5%, 1.5e-02 |
| 3 | assembly_of_protein_complexes | 10%, 13%, 3.6e-06 | 21%, 8%, 2.5e-08 | 2%, 2%, 5.6e-01 |
| 4 | biogenesis_of_cell_wall | 10%, 14%, 1.4e-06 | 8%, 3%, 1.5e-01 | 4%, 3%, 1.7e-01 |
| 5 | budding_cell_polarity_and_filament_formation | 4%, 7%, 5.1e-02 | 8%, 8%, 4.3e-03 | 6%, 5%, 7.3e-02 |
| 6 | carbohydrate_utilization | 6%, 19%, 3.9e-06 | 9%, 8%, 4.3e-02 | 11%, 5%, 4.3e-01 |
| 7 | cell_cycle_control_and_mitosis | 4%, 20%, 9.6e-05 | 6%, 9%, 5.3e-02 | 9%, 9%, 2.4e-02 |
| 8 | cell_growth | 13%, 20%, 1.2e-08 | 11%, 4%, 8.7e-03 | 8%, 3%, 3.6e-02 |
| 9 | cellular_import | 9%, 14%, 5.4e-06 | 9%, 6%, 1.6e-03 | 5%, 5%, 3.5e-02 |
| 10 | cytoplasmic_degradation | 38%, 30%, 1.5e-34 | 35%, 32%, 8.7e-33 | 37%, 31%, 2.1e-34 |
| 11 | detoxificaton | 10%, 12%, 5.9e-06 | 10%, 5%, 4.2e-03 | 12%, 2%, 2.5e-01 |
| 12 | dna_synthesis_and_replication | 17%, 23%, 7.2e-13 | 21%, 8%, 3.8e-08 | 11%, 5%, 2.4e-03 |
| 13 | glucose_metabolism | 6%, 16%, 5.4e-05 | 9%, 9%, 6.8e-04 | 6%, 6%, 1.1e-01 |
| 14 | homeostasis_of_other_ions | 13%, 12%, 1.1e-06 | 20%, 5%, 9.3e-06 | 22%, 2%, 6.8e-03 |
| 15 | lipid_fattyacid_and_sterol_biosynthesis | 13%, 16%, 2.2e-08 | 14%, 6%, 2.0e-04 | 14%, 5%, 2.0e-03 |
| 16 | meiosis | 20%, 16%, 7.2e-13 | 12%, 4%, 2.2e-02 | 5%, 4%, 9.6e-02 |
| 17 | metabolism_of_vitamins_cofactors_and_prosthetic_groups | 16%, 18%, 1.3e-09 | 10%, 3%, 1.7e-02 | 15%, 3%, 5.4e-03 |
| 18 | mitochondrial_organization | 8%, 41%, 1.2e-14 | 9%, 9%, 1.0e-01 | 5%, 6%, 6.7e-01 |
| 19 | mitochondrial_transport | 18%, 18%, 2.9e-11 | 11%, 4%, 1.4e-02 | 2%, 1%, 5.3e-01 |
| 20 | nuclear_organization | 5%, 46%, 9.0e-11 | 5%, 13%, 7.9e-01 | 16%, 18%, 8.7e-03 |
| 21 | organization_of_cytoplasm | 13%, 51%, 4.6e-31 | 10%, 15%, 2.7e-02 | 6%, 13%, 2.5e-01 |
| 22 | organization_of_cytoskeleton | 12%, 25%, 1.2e-10 | 6%, 3%, 2.3e-01 | 5%, 5%, 4.6e-02 |
| 23 | organization_of_endoplasmatic_reticulum | 4%, 9%, 7.2e-03 | 12%, 5%, 2.0e-03 | 8%, 3%, 9.7e-02 |
| 24 | organization_of_golgi | 17%, 17%, 9.2e-10 | 28%, 4%, 2.8e-06 | 7%, 4%, 1.8e-02 |
| 25 | organization_of_plasma_membrane | 2%, 5%, 2.1e-01 | 13%, 8%, 3.3e-05 | 3%, 3%, 4.4e-01 |
| 26 | other_transcription_activities | 14%, 10%, 3.6e-06 | 22%, 6%, 1.3e-06 | 10%, 4%, 7.4e-03 |
| 27 | other_transport_facilitators | 22%, 8%, 1.2e-07 | 24%, 5%, 1.2e-06 | 1%, 1%, 6.4e-01 |
| 28 | pheromone_response_matingtype_determination_sexspecific_proteins | 8%, 20%, 1.4e-07 | 6%, 4%, 2.6e-01 | 4%, 4%, 2.9e-01 |
| 29 | proteases | 13%, 15%, 1.1e-07 | 14%, 3%, 1.6e-02 | 2%, 1%, 7.2e-01 |
| 30 | protein_kinase | 9%, 17%, 3.9e-07 | 10%, 6%, 5.1e-03 | 7%, 4%, 8.4e-02 |
| 31 | protein_targeting_sorting_and_translocation | 14%, 25%, 8.0e-13 | 10%, 3%, 1.1e-01 | 5%, 3%, 3.1e-01 |
| 32 | recombination_and_dna_repair | 7%, 9%, 9.8e-04 | 9%, 4%, 2.3e-02 | 7%, 4%, 4.0e-02 |
| 33 | regulation_of_carbohydrate_utilization | 12%, 20%, 1.0e-09 | 11%, 5%, 1.2e-02 | 2%, 3%, 4.4e-01 |
| 34 | respiration | 22%, 26%, 1.8e-16 | 5%, 2%, 2.2e-01 | 7%, 5%, 1.7e-02 |
| 35 | ribosomal_proteins | 41%, 48%, 3.8e-72 | 42%, 46%, 2.9e-72 | 40%, 45%, 7.5e-67 |
| 36 | sporulation_and_germination | 10%, 18%, 9.2e-08 | 17%, 5%, 6.2e-04 | 7%, 3%, 8.4e-02 |
| 37 | stress_response | 5%, 12%, 1.6e-03 | 10%, 8%, 4.2e-04 | 6%, 4%, 2.3e-01 |
| 38 | transcriptional_control | 7%, 35%, 5.8e-11 | 5%, 6%, 5.2e-01 | 5%, 8%, 2.1e-01 |
| 39 | transcription_factors | 7%, 25%, 1.7e-08 | 13%, 8%, 1.5e-03 | 11%, 6%, 1.6e-01 |
| 40 | translation_initiation_elongation_and_termination | 8%, 6%, 4.9e-03 | 14%, 5%, 4.1e-04 | 9%, 2%, 9.6e-02 |
| 41 | vesicular_transport_golgi_network_etc | 4%, 8%, 1.6e-02 | 26%, 9%, 8.5e-11 | 5%, 3%, 1.5e-01 |
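
The three quantities reported for each motif (sensitivity, specificity, and hypergeometric p-value) can all be derived from four gene counts. The sketch below is illustrative only and is not the code used to produce the table. It assumes that sensitivity is the fraction of cluster genes containing a motif hit, that "specificity" is the fraction of genes with a hit that belong to the cluster, and that the p-value is the upper tail of the hypergeometric distribution. The function name `motif_enrichment` and its parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def motif_enrichment(n_genome, n_cluster, n_hits, n_cluster_hits):
    """Return (sensitivity, specificity, p_value) for one motif on one cluster.

    n_genome       -- total number of genes scanned
    n_cluster      -- number of genes in the functional cluster
    n_hits         -- number of genes (genome-wide) containing the motif
    n_cluster_hits -- number of cluster genes containing the motif

    The definitions of sensitivity and specificity are assumptions
    made for this sketch; the source does not spell them out.
    """
    # Assumed definition: fraction of the cluster recovered by the motif.
    sensitivity = n_cluster_hits / n_cluster
    # Assumed definition: fraction of motif hits that fall inside the cluster.
    specificity = n_cluster_hits / n_hits if n_hits else 0.0

    # Hypergeometric upper tail: probability of drawing at least
    # n_cluster_hits cluster genes when n_hits genes are drawn at random
    # (without replacement) from the genome.
    total = comb(n_genome, n_hits)
    upper = min(n_cluster, n_hits)
    tail = sum(
        comb(n_cluster, k) * comb(n_genome - n_cluster, n_hits - k)
        for k in range(n_cluster_hits, upper + 1)
    )
    # Exact rational arithmetic avoids overflow before converting to float.
    p_value = float(Fraction(tail, total))
    return sensitivity, specificity, p_value


if __name__ == "__main__":
    # Toy numbers chosen for illustration; they are not taken from the table.
    sens, spec, pval = motif_enrichment(
        n_genome=10, n_cluster=4, n_hits=3, n_cluster_hits=2
    )
    print(f"sens={sens:.0%}, spec={spec:.0%}, pval={pval:.1e}")
```

Exact integer binomials keep very small tails, such as the p-values near 1e-72 in the table, from underflowing during summation. A library routine such as SciPy's hypergeometric survival function would be a reasonable substitute where SciPy is available.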