Term Project

You have been given an unknown genomic sequence to analyze during this term. Your job is to tell me everything you can about this piece of DNA: What genes does it contain? What is the intron/exon organization? Can you say anything about a promoter? If you have more than one gene, are they related? Start by trying the various "gene finding" programs (see the links on our home page); a good first step is to run GeneMachine and examine the results in Sequin. Try some of the analyses in Gene Inspector, and explore the web for any other kind of analysis you might find useful. During the term you should stop in regularly to talk with me about your progress and to discuss what to do next. This should be a collaborative effort between you and me! Since these sequences have never been characterized before, they might contain anywhere from 1 to 10 genes; we will not know until you have done the analyses. You might discover something entirely new that could make an interesting publication.
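As a quick sanity check alongside the gene-finding programs, a six-frame open reading frame (ORF) scan can show where long uninterrupted coding stretches lie. The following is a minimal Python sketch and is not part of the course software; the function names `reverse_complement` and `find_orfs` and the `min_codons` threshold are assumptions made for illustration. Because it ignores introns, it cannot replace the gene finders; at best it will highlight long exons.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int, int]]:
    """Scan all six reading frames for ATG...stop ORFs.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned, so "-" strand
    hits are positions on the reverse-complemented sequence.
    min_codons (hypothetical threshold) counts the stop codon as well.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None
    return orfs
```

Any long ORF reported here is only a candidate; compare it against the exon predictions from the gene-finding programs before drawing conclusions.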

At the end of the term, you will hand in a report detailing what you have learned about your sequence. Keep notes as you do the research; it is very easy to forget what you did when you look back several weeks later. The report should include specifics on the methods/algorithms you used, figures and tables detailing the results of each analysis, and the conclusions you were able to draw about your sequence from each analysis. Since we are dealing with uncharacterized sequences, you may not be able to draw any conclusions about function, but that is how research works. It's OK not to be able to define a function for your DNA/gene(s) as long as you document the analyses you have done and explain why you conclude that it is not possible to define a function at this time.

What to Hand In

You are to hand in a report that describes your unknown DNA and how you went about the analysis. As you work with your sequence you will essentially be solving a mystery. Of course, each of you will be solving a different mystery. The questions below are meant as a guide to how you should be thinking. There will be many different twists and turns in each project, so please keep in touch with me as you uncover new information about your DNA. The report should show (graphically) the organization of the genes in your DNA and should number them. Use this numbering scheme when describing the products of those genes.

The report should include a description of the steps you take along the way and the reasoning behind those steps. Use a Microsoft Word document called UnknownXX_termProject to contain the bulk of your analysis results, both the text and the graphics. You might start by describing the "gene finding" that you did. If you had conflicts in predictions from various programs, what did you do to resolve those conflicts? State the number of genes you identified and then go on to analyze each of the genes you found.

For each gene, show the intron-exon structure (using a graphic generated in Gene Inspector, Gene Construction Kit, GenScan or any other source, and then pasted into the Word document) and the protein sequence that is encoded. Hand in the protein sequences as a single Gene Inspector file named termProject proteins. Then, for each gene, describe the analyses you did on the protein and the DNA. What were the results of database searches and the various protein analyses you did? Can you predict a likely 3-D structure for the proteins or segments of the proteins? Can you state what the gene (or gene product) does? If not, does it resemble any other known genes?

Also look beyond the individual genes. Are adjacent genes in any way similar? Is there any kind of correlation between the repeat structures or base distributions and your gene structure? Look at the regions of your DNA between genes: is there any sequence that is recognizable by database searches? Can you identify a cluster of genes similar to yours in another organism? If so, are there any conserved DNA sequences that are non-coding? Could they be control regions? Antisense RNAs? tRNAs?
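One way to look at base distribution along your sequence is a sliding-window GC content profile, which you can then compare against your predicted exon coordinates. The following is a minimal Python sketch under stated assumptions: the function name `gc_content_windows` and the default `window` and `step` sizes are illustrative choices, not values prescribed for this project.

```python
from typing import List, Tuple


def gc_content_windows(seq: str, window: int = 1000, step: int = 500) -> List[Tuple[int, float]]:
    """Return (window_start, gc_fraction) pairs along a DNA sequence.

    window and step are hypothetical defaults; choose values that suit
    the scale of the features you are looking for. Ambiguous bases such
    as N are left out of the denominator.
    """
    seq = seq.upper()
    results = []
    # The last window starts at len(seq) - window; a sequence shorter
    # than one window is reported as a single window starting at 0.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        counted = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / counted if counted else float("nan")))
    return results
```

Plotting these values against position, with your numbered genes marked on the same axis, makes it easier to judge whether base composition tracks gene structure.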

The term project is due as defined on the syllabus page. Place all the documents you generate into a single folder called lastname_firstname.tp, compress (stuff/zip) that folder, and hand it in using BLACKBOARD as for all previous homework. These 50KB sequences were obtained from the Honey Bee Genome Project. Student names are as they were received from the Registrar.
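If you prefer to script the packaging step, the folder can be zipped with Python's standard library. This is only a sketch: the function name `package_project` is an assumption, and you should substitute your own lastname_firstname.tp folder for the example path.

```python
import shutil
from pathlib import Path


def package_project(folder: str) -> Path:
    """Zip a project folder and return the path of the archive.

    The archive is written next to the folder (for example
    doe_jane.tp -> doe_jane.tp.zip) and keeps the folder name as the
    top-level entry inside the zip.
    """
    folder_path = Path(folder).resolve()
    if not folder_path.is_dir():
        raise NotADirectoryError(f"{folder_path} is not a folder")
    archive = shutil.make_archive(
        base_name=str(folder_path),        # ".zip" is appended automatically
        format="zip",
        root_dir=str(folder_path.parent),  # archive paths are relative to this
        base_dir=folder_path.name,         # include the folder itself
    )
    return Path(archive)
```

For example, `package_project("doe_jane.tp")` (a hypothetical student folder) would produce `doe_jane.tp.zip` in the same directory.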

 

Unknowns for Use as Term Project Sequences
Student                Upstream Sequence   Unknown Sequence   Downstream Sequence
Ahmad N. Abou Tayoun   upstream01          unknown01          downstream01
Drupad Sil             upstream02          unknown02          downstream02
Daniel J. Goduti       upstream03          unknown03          downstream03
Casey S. Greene        upstream04          unknown04          downstream04
Xiao Hu                upstream05          unknown05          downstream05
Chuan Liang            upstream06          unknown06          downstream06
Huan Liu               upstream07          unknown07          downstream07
Courtney A. Lyman      upstream08          unknown08          downstream08
Adel A. Malek          upstream09          unknown09          downstream09
Viktor Martyanov       upstream10          unknown10          downstream10
Radhika Mathur         upstream11          unknown11          downstream11
Sean C. Murray         upstream12          unknown12          downstream12
Parul R. Sharma        upstream13          unknown13          downstream13
Sonia V. Simmons       upstream14          unknown14          downstream14
Sara E. Thiebaud       upstream15          unknown15          downstream15
Michael Chen           upstream16          unknown16          downstream16
Emma Lubin             upstream17          unknown17          downstream17
Justin Crocker         upstream18          unknown18          downstream18
Braden Lang            upstream19          unknown19          downstream19
??                     upstream20          unknown20          downstream20

Bio 68 Home Page

This page was last modified on Tue, Jan 10, 2006, 1:12:30 PM