Homework #1 (80 points)

Assigned: 1/15/08 ---> Due: 1/22/08

There are three parts to this homework, and you will generate one document for each part. The main objective of these tasks is to make sure that you know how to use the tools that will come in handy for your term project and for other analyses you may do in this course or outside of it. Tutorials are available online for Gene Inspector and Gene Construction Kit, and extensive help is available at the NCBI web site for the other analysis tools.

For some of the results you will need to capture part of the screen and put it into a Word document. On the Mac, use command-shift-4 to select an area of the screen to be saved as a file; you can then "place" the image file into your Word document. On Windows, use Alt-PrintScreen to copy the active window to the clipboard, then paste it into your Word document.

Once you have generated the three documents, you will need to create a folder containing those three documents and hand that folder in electronically (Instructions).

1. Gene Inspector 1.6 - All of the analyses in this section should be handed in as a single GI notebook - name the notebook GI_HW1.nbk. (40 points)

  1. Determine the dinucleotide base composition for your unknown sequence and display the results as a graph in a Gene Inspector notebook. If you notice anything interesting in the results, add some comments to the notebook describing what you observe and perhaps even some thoughts on what it means. This is best done by creating a text box using the "T" tool in the notebook.
  2. Find all repeated sequences in the DNA that are 15 nucleotides long and have no more than one mismatch. Show the results graphically in the same notebook. If you observe something interesting in the initial analysis, try to follow up on it. Try longer or shorter repeats with more or fewer mismatches allowed to fine-tune your initial observations. View the results as a table as well as a graphic (double-click on the analysis object in the notebook and then look at the choices in the Object menu that appears). Do you see anything interesting? Comment on your observations in the notebook, either in a separate text box or as part of the background notebook text.
  3. Plot the base distribution of G+C content of your unknown sequence using the default parameters for the analysis. Stretch the plot to be the width of the notebook page to better see the results.
  4. Show at least one additional base distribution analysis of your unknown DNA that produces results you find interesting. Try using large windows of 50-500 and look at purine, pyrimidine, or single nucleotide distributions (not just the default G+C content). Describe why the results you show are interesting (what have you observed?). Feel free to speculate on the meaning of your observations.
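
The analyses in this section can also be sketched in a few lines of Python, which may help you interpret what the Gene Inspector plots show. This is only an illustration under stated assumptions, not Gene Inspector's actual algorithm: the function names, the default window of 50, and the brute-force repeat search are all hypothetical choices made for this sketch.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides (AA, AC, ..., TT) in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def windowed_fraction(seq, bases="GC", window=50):
    """Fraction of each sliding window made up of any base in `bases`.

    The default window of 50 is an assumption for this sketch,
    not Gene Inspector's default.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    hits = [1 if base in bases else 0 for base in seq]
    count = sum(hits[:window])
    fractions = [count / window]
    # Slide one position at a time: add the entering base, drop the leaving one.
    for i in range(window, len(seq)):
        count += hits[i] - hits[i - window]
        fractions.append(count / window)
    return fractions


def find_repeats(seq, length=15, max_mismatches=1):
    """Return (start1, start2, mismatches) for every pair of windows of the
    given length that differ at no more than `max_mismatches` positions.

    Brute force, so it is slow on long sequences; overlapping windows are
    allowed, so runs of a single base will produce many hits.
    """
    seq = seq.upper()
    n = len(seq) - length + 1
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            mismatches = sum(
                1 for a, b in zip(seq[i:i + length], seq[j:j + length]) if a != b
            )
            if mismatches <= max_mismatches:
                pairs.append((i, j, mismatches))
    return pairs
```

For example, `windowed_fraction(seq, bases="AG", window=200)` would give a purine distribution over 200-base windows, comparable in spirit to the alternative base distributions asked for in item 4. Positions are 0-based.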

2. Gene Construction Kit 2.5 (20 points)

  1. Using your unknown sequence, create a sequence display with the following restriction enzyme sites marked: EcoRI, HindIII, HinfI, and BamHI. View this as the actual DNA sequence with restriction sites annotated above the sequence. Capture the screen showing part of this result (see top of this page). Place this screen capture into a Word document; name it GCK_HW1.doc.
  2. Use your web browser to find the entry Y10129 in the CoreNucleotide database at NCBI. In the browser, show this item as TEXT (click on the "Send to" popup menu and select "text") and then save the file to your hard disk (File > Save As in your web browser). Now import the sequence file you just saved into GCK (Deluxe Import > Open sequences file) using the default import conversions. Capture the screen and paste it into the GCK_HW1.doc document.
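
To sanity-check the restriction sites that GCK marks in item 1, the same search can be sketched in Python. This is a minimal illustration, not how GCK works internally: the function name `find_sites` and the `SITES` table are hypothetical, and the code reports where each recognition sequence starts (0-based), not where the enzyme cuts.

```python
import re

# Recognition sequences for the four enzymes in item 1.
# N in the HinfI site stands for any base.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "HinfI": "GANTC",
    "BamHI": "GGATCC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site.

    All four sites are palindromic, so scanning one strand is enough.
    A lookahead is used so that overlapping sites are not missed.
    """
    seq = seq.upper()
    result = {}
    for name, pattern in sites.items():
        regex = re.compile("(?=" + pattern.replace("N", "[ACGT]") + ")")
        result[name] = [match.start() for match in regex.finditer(seq)]
    return result
```

Calling `find_sites` on your unknown sequence should give the same number of sites per enzyme as the GCK display; if the counts differ, check whether the sequence was imported completely.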

3. NCBI (20 points)

  1. On the NCBI web site, look for the nucleotide sequence AL050348. Generate a graphic report without the sequence display (click on the "Reports" link next to the sequence name). Do a screen capture and paste the image of the graphic display into a new Word document called NCBI_HW1.doc.
  2. Find the "Map Viewer" page for this sequence (click on the "Links" link at the right end of the line containing the "Reports" link), capture a screen image into the NCBI_HW1.doc document.
  3. Look at the "GEO Profiles" page (Search Geo Profiles) for this gene. This section shows the level of expression of this gene under different conditions. To do effective searches it is best to display ALL the GEO profiles on a single page.
    (i) Based on the data presented on this page, how is the level of expression in skeletal muscle affected by age in human males? You will have to combine more than one search criterion to do this efficiently.
    (ii) Is this same trend observed in human females?
    (iii) Looking at the GDS287 + AL050348 record, choose to examine "Profile Neighbors" using the link at the right end of the line. Does this influence the conclusions you drew in part (i)? Explain.

What to hand in:

Instructions for submitting homework.

Bio 39/139 Home Page

This page was last modified on Wed, Jan 30, 2008, 9:59:21 PM