Panel 2


Characterization of the GGCCCATTA Motif

We next examined the positional distribution of the GGCCCATTA motif upstream of the gene start. The figure in the lower left of this panel shows the distribution of GGCCCATTA and of TAATGGGCC (its reverse complement), each allowing up to one mismatch. As the graph shows, the sequence is found preferentially about 100 to 200 bases upstream of the gene start.
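The positional scan described above can be sketched as follows. This is a minimal illustration, not the code used for the poster: the function names, the strand labels, and the convention that the upstream sequence is written 5' to 3' with its last base directly adjacent to the gene start are all assumptions.

```python
MOTIF = "GGCCCATTA"

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def motif_positions(upstream, motif=MOTIF, max_mismatch=1):
    """Return (distance, strand) for every window within max_mismatch of the motif.

    Assumption: `upstream` is the sequence immediately 5' of the gene start,
    so its last base sits at distance 1. `distance` is measured from the
    gene start to the 5'-most base of the matching window.
    """
    upstream = upstream.upper()
    rc = reverse_complement(motif)
    k = len(motif)
    hits = []
    for i in range(len(upstream) - k + 1):
        window = upstream[i:i + k]
        distance = len(upstream) - i
        if hamming(window, motif) <= max_mismatch:
            hits.append((distance, "+"))   # motif as written
        if hamming(window, rc) <= max_mismatch:
            hits.append((distance, "-"))   # reverse-complement orientation
    return hits
```

Collecting the distances over all loci and binning them would give a histogram of the kind shown in the figure; setting `max_mismatch=0` restricts the scan to exact matches.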

The orientation of the element did not appear to affect its distribution. We then examined the set of genes in which the motif is found, considering only exact matches; the results are shown in the table below.

Number Category
331 total number of loci with GGCCCATTA motif
178 number of genes with some identified function
31 translation and protein delivery
18 transcription factors and other DNA binding proteins
9 RNA splicing related
9 DNA replication related
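The exact-match selection of loci behind this table can be sketched as below. This is an illustration under stated assumptions: the input dictionary, the locus names in the usage note, and the `both_strands` switch are hypothetical, and whether the original search counted both orientations is not stated, so it is left as an option.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def loci_with_exact_motif(upstream_by_locus, motif="GGCCCATTA", both_strands=True):
    """Return the sorted locus names whose upstream sequence contains the motif exactly.

    Assumption: `upstream_by_locus` maps a locus name to its upstream sequence.
    With both_strands=True the reverse complement also counts as a hit.
    """
    targets = {motif}
    if both_strands:
        targets.add(reverse_complement(motif))
    return sorted(
        locus
        for locus, seq in upstream_by_locus.items()
        if any(target in seq.upper() for target in targets)
    )
```

For example, with the toy input `{"locus_1": "ttttGGCCCATTAtttt", "locus_2": "ACGTACGTACGT"}` only `locus_1` is selected. The length of the returned list corresponds to the "total number of loci" row of the table.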

Of the 331 loci carrying the motif, only 178 had an identified function. Many of these encode products involved in the flow of genetic information: the four categories in the table above (translation and protein delivery, DNA binding, RNA splicing, and DNA replication) account for 67 of the 178. The list below gives the genes in each of these categories.

Identifiable Genes With GGCCCATTA Motif
Translation and Protein Delivery Related (31)
At2g19750 40S ribosomal protein S30
AT3g49910 60S RIBOSOMAL PROTEIN - like 60S RIBOSOMAL PROTEIN L26
At2g18020 60S ribosomal protein L2
AT5g27770 60S ribosomal protein L22 - like ribosomal protein L22 (cytosolic
At2g39460 60S ribosomal protein L23A identical to GB:AF034694
AT5g56710 60S ribosomal protein L31
AT3g28500 acidic ribosomal protein P2b (rpp2b
At1g14980 chaperonin CPN10 identical to SP:P34893 from [Arabidopsis thaliana
AT5g02490 hsc70.1 - like dnaK-type molecular chaperone hsc70.1
AT5g44320 eukaryotic translation initiation factor 3 subunit 7
AT5g06000 eukaryotic translation initiation factor 3 subunit-like protein
AT4g10320 isoleucine-tRNA ligase - like protein isoleucine--tRNA ligase
AT4g39460 mitochondrial carrier - like protein AgPET8
AT5g53180 polypyrimidine tract-binding RNA transport protein-like
At2g18110 putative elongation factor beta-1 /51228.m00082#T27K22.2
AT3g09500 putative 60S ribosomal protein L35 similar to 60S ribosomal protein L35 GB:AAC27830
At2g40660 putative methionyl-tRNA synthetase
At2g18710 putative preprotein translocase SECY protein Identical to GB:U37247; targeted to the thylakoid membrane; the protein has a chloroplast targeting signal
AT3g63190 putative protein chloroplast ribosome recycling factor protein - Spinacia oleracea
AT4g26310 putative protein elongation factor P (efp) RP238
At2g17980 putative SEC1 family transport protein similar to SLY1 proteins and vesicle transport proteins
AT3g06540 Rab escort protein
At1g07070 ribosomal protein
AT3g25520 ribosomal protein
At1g32990 ribosomal protein L11
AT3g54210 ribosomal protein L17 -like protein ribosomal protein L17
AT5g02610 ribosomal protein L35 - like ribosomal protein L35- cytosolic
AT5g20160 ribosomal protein L7Ae-like NHP2/RS6 FAMILY PROTEIN YEL026W HOMOLOG - Homo sapiens
AT4g29390 RIBOSOMAL PROTEIN S30 homolog
At1g48900 signal recognition particle 54 kDa protein 2 (SRP54
AT5g49500 SRP54 (signal recognition particle 54 KDa) protein
  
Transcription Factors and other DNA binding proteins (18)
At1g72050 C2H2-type zinc finger protein
AT3g53650 histone H2B - like protein histone H2B-2
AT3g46030 histone H2B -like protein histone H2B1
AT5g59970 histone H4 - like protein histone H4
At1g54230 hypothetical protein contains similarity to histone H1 GI:11558847 from [Triticum aestivum
At1g20280 hypothetical protein contains similarity to homeobox-leucine zipper proteins
At2g36740 hypothetical protein predicted by genscan; similar to DNA-binding protein YL-1
At1g78760 hypothetical protein similar to heat shock transcription factor like protein GI:7268102 from [Arabidopsis thaliana
AT3g09480 putative histone H2B similar to histone H2B-3 GB:CAA12231 from [Lycopersicon esculentum
At1g10230 putative kinetochore protein similar to kinetochore (SKP1p)-like protein (gi|3548811); similar to EST gb|T42880
AT5g51910 putative protein contains similarity to DNA binding protein PCF1
AT5g09450 putative protein DNA-binding protein - Triticum aestivum
AT5g02570 putative protein histone H2B-2
AT3g05060 putative SAR DNA-binding protein-1 similar to GB:AAC16330 from [Pisum sativum
AT3g06010 putative transcriptional regulator similar to homeotic gene regulator (brahma protein); contains Pfam profile PF00176 SNF2 and others N-terminal domain
AT5g23090 TATA-binding protein-associated phosphoprotein Dr1 protein homolog (sp|P49592
At1g75510 transcription initiation factor
At1g61570 unknown protein similar to small zinc finger-like protein GI:5107149 from [Oryza sativa
  
Splicing Related (9)
AT3g61860 ARGININE/SERINE-RICH SPLICING FACTOR RSP31
AT3g11964 hypothetical protein similar to putative pre-rRNA processing protein GB:AAF23213
At1g09140 putative SF2/ASF splicing modulator
At2g18510 putative spliceosome associated protein
At2g14550 putative spliceosome associated protein
AT4g30220 snRNP Sm protein F - like Sm protein F
At1g23860 splicing factor
At1g14650 splicing factor
At1g60170 unknown protein contains similarity to splicing factor required for vegetative and meiotic growth GI:2959374 from [Schizosaccharomyces pombe]
  
DNA Replication Related (9)
AT5g11200 DEAD BOX RNA helicase RH15
AT5g11170 DEAD BOX RNA helicase RH15 - like protein DEAD BOX RNA helicase RH15
AT5g67100 DNA polymerase alpha 1
AT5g41880 DNA polymerase alpha subunit IV (primase)-like protein
At1g34380 DNA polymerase type I
At1g03750 hypothetical protein similar to DNA Helicases and DNA repair proteins
At2g02090 putative helicase
At1g79890 putative helicase similar to helicase GB:AAB06962 [Homo sapiens
At1g33680 single-strand nucleic acid-binding protein
