G. Recap: Workflows Make Things Easier
...continued from: Compare Samples To Each Other
Set up and execute an efficient QIIME workflow, from the start
Okay, now that you've seen all the steps in a 16S rRNA gene amplicon pipeline, let's put it all together into a workflow. Specifically, we're going to use the pick_otus_through_otu_table.py workflow to do most of those tutorial steps in one go.
Here is a PDF flowchart of a typical analysis pipeline for 454 pyrosequencing of 16S rRNA genes. Depending on your questions, though, there are plenty of alternative routes and additional analyses you could run!
Let's start again, this time using workflow scripts to help things go faster. We still have to split libraries and denoise -- those steps are not part of the workflow script. So, remember -- you've already done these three initial steps:
# Split libraries: http://qiime.org/scripts/split_libraries.html
split_libraries.py -m Fasting_Map.txt -f Fasting_Example.fna -q Fasting_Example.qual -o split_library_output/
# Denoise: http://qiime.org/scripts/denoise_wrapper.html
denoise_wrapper.py -i Fasting_Example.sff.txt -f split_library_output/seqs.fna -m Fasting_Map.txt -o denoiser/
# Inflate denoiser output: http://qiime.org/scripts/inflate_denoiser_output.html
inflate_denoiser_output.py -c denoiser/centroids.fasta -s denoiser/singletons.fasta -f split_library_output/seqs.fna -d denoiser/denoiser_mapping.txt -o inflated_denoised_seqs.fna
Because you've already run those three commands, you have the result files, and there's no need to run them again for this data set. I'm also assuming you still have that qiime_parameters.txt file with you in the ~/qiime_tutorial/ directory. Now, let's run the OTU table workflow. It doesn't need much as input: just the inflated_denoised_seqs.fna file that you created in the inflate_denoiser_output.py step, plus the parameters file.
# OTU Table WORKFLOW: http://qiime.org/scripts/pick_otus_through_otu_table.html
pick_otus_through_otu_table.py -i inflated_denoised_seqs.fna -p qiime_parameters.txt -o PickOTUsWorkflow/
I've set the output directory to be called PickOTUsWorkflow/. Look in that directory and you'll find a whole bunch of files that should look very familiar. Alignments, a tree, an OTU table, taxonomy information, the whole nine yards. Awesome!
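If you'd like a quick inventory of what the workflow wrote, something like the sketch below may help. The function name list_workflow_outputs is my own invention, not a QIIME script. It makes no assumptions about file names inside the output directory: it just lists every file it finds and flags any that are empty, since an empty file is often the first hint that a step failed.

```shell
# Hypothetical helper (not part of QIIME): list every file under a workflow
# output directory and mark the ones that are zero bytes long.
list_workflow_outputs() {
    outdir="${1:-PickOTUsWorkflow}"
    if [ ! -d "$outdir" ]; then
        echo "No such directory: $outdir" >&2
        return 1
    fi
    # Every regular file, sorted so related outputs sit together.
    find "$outdir" -type f | sort
    # Zero-length files, prefixed so they stand out from the listing above.
    find "$outdir" -type f -size 0 | sort | sed 's/^/EMPTY: /'
}
```

After running the workflow, call it as: list_workflow_outputs PickOTUsWorkflow/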
Before you start analyzing your own sequences, look closely at all those variables in the qiime_parameters.txt file. You may want to tweak them! For example, if you want to set up your own taxonomy training set, perhaps based on the Greengenes OTUs, there are a couple of variables that will need to be set for that. Or, if you want to use a different alignment method, tree building method, etc., that is all controlled by the values you give to the variables in that parameters file. Good luck!
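To review the settings one script at a time, a small helper along these lines might be handy. It assumes the parameters file uses the usual QIIME layout, where each setting sits on its own line as script_name:option followed by whitespace and a value. The name show_params is hypothetical; it is just a thin wrapper around grep.

```shell
# Hypothetical helper (not part of QIIME): print the lines of a parameters
# file that belong to one script, assuming "script_name:option value" lines.
show_params() {
    params_file="$1"
    script_name="$2"
    # Anchoring at the start of the line skips comment lines (which begin
    # with #) and settings that belong to other scripts.
    grep "^${script_name}:" "$params_file"
}
```

For example, show_params qiime_parameters.txt align_seqs should print only the alignment settings. If nothing is printed, the file has no lines for that script.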
Thanks for putting up with this verbose tutorial, and please let me know if you have any comments. If you'd like to try out some of the QIIME scripts on a larger data set, please feel free to try out this open-ended example project.