Students will explore the history of a specific food crop by looking at how features appear in the evolutionary tree of a group of species, at least one of which is a food crop. This activity serves as practice so that students can repeat the same steps for their independent short-term research question.
Estimated Time of Activity
Students will be able to do the following as a result of the activity:
- Interpret phylogenetic trees
- Use Wikipedia wisely to investigate plants
- Download GenBank sequences in the correct format, create an alignment, and create a phylogenetic tree
- Show students a demo of past student research on melons. The notes in the PowerPoint include instructions.
- Discuss how many times texture, color, and taste evolved in melons. Go back to Wikipedia and look at where each sweet melon was domesticated. Is there a pattern, or are the locations scattered? How many genera include a sweet melon?
- Let's find out how the student made the tree. Go to GenBank and type in Citrullus lanatus rbcL. rbcL is a gene in the chloroplast. When we compare the same gene across different species, the differences in its sequence reflect the evolutionary relationships among those species. You don't have to use this gene; you can use other DNA regions too, such as trnL, atpB, matK, or ITS. Once you pick a region, stick with it for every species and do not mix in sequences from other regions.
- Read the names carefully. Click on the sequence that comes up, then click on the small link called "fasta", or click the "fasta" button directly under the sequence name. You should get a page showing the sequence in FASTA format. Copy that sequence, including the line that begins with ">", and paste it into a text or Word file.
- Do this for around 10 species of melons and squashes so you have a good-sized group to compare.
- Paste the list of sequences into this website (make sure to click DNA instead of protein!): www.ebi.ac.uk/Tools/msa/clustalw2/. Look through the results to find the alignment and the guide tree, which shows the genetic relationships based on your gene sequences (Java must be installed on the computers to see the tree). The guide tree is a simplified version of an evolutionary tree. There are various more complex ways of analyzing this nucleotide data; we are just skimming the surface.
- Go back to Wikipedia or Google Images and compare the tree results with images of the fruits of each species. What can you say about the flavors, colors, and textures and how they vary across the evolutionary tree? If a group of closely related species share the same trait, it was most likely a trait of their common ancestor. However, if species that are not closely related share a trait, then your hypothesis can be that the trait evolved multiple times. Take some time to analyze your data and then discuss your results with the class.
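For instructors who want to make the "how many times did it evolve?" question concrete, the following is a minimal Python sketch that counts the smallest number of trait changes a tree requires. It uses Fitch parsimony, a standard method that is not part of the original activity. The tree shape, the species names (species_A through species_E), and the sweet/bitter values are made-up placeholders, not real melon data.

```python
# Minimal sketch: count the fewest trait changes needed on a tree (Fitch parsimony).
# A tree is either a species name (str) or a pair (left_subtree, right_subtree).
# All names and trait values below are hypothetical placeholders.

def fitch(tree, traits):
    """Return (possible ancestral states, minimum number of changes) for a tree."""
    if isinstance(tree, str):
        # A tip: its state is simply the observed trait.
        return {traits[tree]}, 0
    left_states, left_changes = fitch(tree[0], traits)
    right_states, right_changes = fitch(tree[1], traits)
    changes = left_changes + right_changes
    shared = left_states & right_states
    if shared:
        # Both branches can agree on a state: no extra change needed here.
        return shared, changes
    # The branches disagree: at least one change happened along them.
    return left_states | right_states, changes + 1


# Hypothetical tree: ((A, B), C) is one group and (D, E) is another.
tree = ((("species_A", "species_B"), "species_C"), ("species_D", "species_E"))

# Case 1: the sweet species are each other's closest relatives.
clustered = {"species_A": "sweet", "species_B": "sweet", "species_C": "bitter",
             "species_D": "bitter", "species_E": "bitter"}

# Case 2: the sweet species are scattered across the tree.
scattered = {"species_A": "sweet", "species_B": "bitter", "species_C": "bitter",
             "species_D": "sweet", "species_E": "bitter"}

print(fitch(tree, clustered)[1])  # 1 change: consistent with a single origin
print(fitch(tree, scattered)[1])  # 2 changes: the trait changed more than once
```

A count of 1 matches the "shared by a common ancestor" reading, while a higher count supports the hypothesis that the trait arose or was lost more than once. Students would replace the placeholder tree and traits with their own guide tree and observations.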
Complete the activity and submit the tree with the characters of texture, color, and taste mapped onto it.
- Draw a copy of the resulting guide tree for your melon sequences. Complete mapping characters onto the evolutionary tree for melon.
- Go through all the steps of downloading the GenBank sequences, running them through ClustalW, and looking at the distance matrix (cluster plot) it generates.
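To help students see what a distance matrix contains, here is a minimal Python sketch that computes the proportion of differing sites between each pair of aligned sequences. This simple measure is an assumption chosen for illustration and is not necessarily the exact calculation ClustalW reports. The labels and short sequences are made-up toy data, not real rbcL sequences.

```python
# Minimal sketch: pairwise proportion of differing sites in an alignment.
# The labels and sequences below are toy placeholders, not real GenBank data.
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of compared positions that differ, skipping gap ('-') columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # a gap in either sequence: skip this column
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


# Toy alignment (hypothetical): all rows have the same length.
alignment = {
    "species_A": "ATGCATGCAT",
    "species_B": "ATGCATGCAA",
    "species_C": "ATGGATCCAA",
}

for name_1, name_2 in combinations(alignment, 2):
    distance = p_distance(alignment[name_1], alignment[name_2])
    print(f"{name_1} vs {name_2}: {distance:.2f}")
```

Smaller values mean the two sequences are more similar, which is why those species end up closer together on the guide tree.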
Laptops with internet access and Java installed; printouts of the last slide of the PowerPoint (the melon evolutionary tree)
The FASTA page on GenBank provides a numeric name for the sequence in addition to the species and other important information. However, students can simplify this information so that the tree gets labels they understand. Here are some tips for showing students an easy way to handle the FASTA data:
Programs read FASTA data by looking for a greater-than sign, ">", followed by a word that is unique to that sequence. Anything after the first space on that line is not retained. The sequence itself starts after a hard return (pressing Enter or Return). The next sequence is recognized by the next ">". Students can change the FASTA label for Citrullus lanatus to read C_lanatus instead of the original label, which would have started with numbers. That way, the students can easily read the unique identifier for each sequence.
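If students find the relabeling tedious, the renaming can be sketched in a few lines of Python. This is a minimal sketch that assumes each header has the form ">ACCESSION Genus species description"; real GenBank headers may be laid out differently, so check a few by hand. The accession numbers (XX000001.1, XX000002.1) and the short sequences are made-up placeholders.

```python
# Minimal sketch: shorten FASTA header lines to labels like ">C_lanatus".
# Assumption: each header looks like ">ACCESSION Genus species description...".
# The accession numbers and sequences below are hypothetical placeholders.


def simplify_header(header):
    """Turn '>ACCESSION Genus species ...' into '>G_species'."""
    words = header.lstrip(">").split()
    if len(words) < 3:
        return header  # not the expected layout: leave the line unchanged
    genus, species = words[1], words[2]
    return f">{genus[0]}_{species}"


def simplify_fasta(text):
    """Rewrite every header line; sequence lines are left untouched."""
    lines = []
    for line in text.splitlines():
        if line.startswith(">"):
            lines.append(simplify_header(line))
        else:
            lines.append(line)
    return "\n".join(lines)


example = (
    ">XX000001.1 Citrullus lanatus rbcL gene, partial sequence\n"
    "ATGTCACCACAAACAGAGACT\n"
    ">XX000002.1 Cucumis melo rbcL gene, partial sequence\n"
    "ATGTCACCACAAACAGAAACT\n"
)

print(simplify_fasta(example))
```

Note that two species sharing a genus initial and a species name would end up with the same label, so students should still check that every label is unique.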
The most common errors students make are forgetting to copy the ">" and sequence identifier, or using duplicate names, so that the ClustalW program can't read their data.
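A quick check for these two mistakes can be sketched in Python before students paste their data into ClustalW. This is a minimal sketch, not a full FASTA validator: it only catches sequence data that appears before the first ">" line, ">" lines with no label, and repeated labels. The labels and sequences in the example are made-up placeholders.

```python
# Minimal sketch: flag the most common FASTA mistakes before running ClustalW.
# It cannot detect a header that is missing in the middle of the file, because
# that sequence simply looks like a continuation of the previous one.


def check_fasta(text):
    """Return a list of human-readable problems found in FASTA text."""
    problems = []
    seen_labels = set()
    have_header = False
    reported_missing = False
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            have_header = True
            words = line[1:].split()
            if not words:
                problems.append(f"line {number}: '>' with no label")
                continue
            label = words[0]  # only the first word is kept as the name
            if label in seen_labels:
                problems.append(f"line {number}: duplicate label '{label}'")
            seen_labels.add(label)
        elif not have_header and not reported_missing:
            problems.append(f"line {number}: sequence data before any '>' line")
            reported_missing = True
    return problems


# Hypothetical student input: first header missing, second label repeated.
student_data = "ATGTCACC\n>C_melo\nATGTCACC\n>C_melo\nATGTCACT\n"

for problem in check_fasta(student_data):
    print(problem)
```

An empty list means none of these particular mistakes were found; it does not guarantee that the sequences themselves are correct.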
For example data you can use, here is an alignment done correctly but with unmodified names, and here is a correct one with modified names. Both will work in ClustalW, but the tree branch names will be different.