## Overall ReadMe for the repo

### This ReadMe gives an overview of the repo for the phylotranscriptomic analysis associated with Wheeler and Walker et al. 2022 "Transcription factors evolve faster than their structural gene targets in the flavonoid pigment pathway".

### The sub-directories of the repo directory (stored here in downloadable zipped format) contain all the analyses included in the manuscript. Each sub-directory has its own ReadMe.txt file with more specific information.

##### **assembled_transcriptomes.zip** contains the raw Trinity assemblies and the predicted CDS from chimera-filtered, Corset-clustered assemblies

##### **genes_of_interest_BLAST_raw.zip** contains the query sequences and the scripts to extract BLAST hit seqs from transcriptome assemblies

##### **Phylogeny_reconstruction.zip** contains the materials needed to reconstruct the species tree from transcriptomic CDS

##### **Sample_collection_information.zip** contains information about the collection of plant samples, such as coordinates, descriptions, and notes

##### **selected_transcripts_raw.zip** contains the full set of molecular evolution analyses, using the filtered BLAST hits, within its sub-directories

##### **SequencingDataInformation.zip** contains information about the raw sequencing data

##### **SRA_information.zip** contains information used to create/upload SRA BioSamples for the dataset

##### **transcriptome_assembly.zip** contains the pipeline tools for automating assembly of the transcriptomes

### Notes

##### To rerun the full transcriptome assembly pipeline, you will need to acquire the raw gzipped reads from the SRA BioProject PRJNA746328 (https://www.ncbi.nlm.nih.gov/sra/PRJNA746328)

#######################################################################################

### Overview of workflow

##### To get to the location where all of the molecular evolution analyses are performed, go to:
**./selected_transcripts_raw/concatenated_fastas/transdecoder_orfs_cds_seqs/filter_by_ref_similarity**

#### Diagram of species-tree estimation pipeline

*run transcriptome assembly pipeline → run tree estimation pipeline with predicted CDS → run TreePL analysis to get time-calibrated tree*

#### Diagram of molecular evolution analyses workflow

*run BLAST pipeline to identify hits that match genes of interest → extract BLAST hits from the raw transcriptomes → run TransDecoder on extracted BLAST hits to get predicted CDS and peptide sequences → filter the predicted CDS by similarity to query reference sequences to get best matches → generate codon alignments for the best-matched sequences (requires acquiring matched peptide sequences and resolving the relationships between PHZ-like sequences: AN2, DPL, PHZ) → run all subsequent molecular evolution analyses using the best-matched CDS codon alignments for each gene*

#### *** Important note ***

All of the immediate input files for the molecular evolution analyses, as well as the final outputs of those analyses, are present in this repo. However, if you want to rerun any analysis starting from the most upstream input files (rather than examine the outputs or rerun with the immediate input files), you can do so by following the instructions in the ReadMe.txt files.
For example, the immediate inputs and the final results of the HyPhy single-omega model fit for all of the genes-of-interest can be found here:

**Petunieae_phylotranscriptomics_repo/selected_transcripts_raw/concatenated_fastas/transdecoder_orfs_cds_seqs/filter_by_ref_similarity/codon_alignments_analyses/hyphy_analyses/single_rate_model**

It is not necessary to rerun all the upstream steps that lead to fitting this model, but (for the sake of reproducibility) it is possible to do so by following the series of instructions in the ReadMe.txt files descending through the sub-directories of the repo that lead from *running BLAST of reference queries against the Trinity transcriptomes → extracting BLAST hits into organized fasta files → predicting CDS and pep sequences from BLAST hits → filtering sequences by reference similarity → generating codon alignments → fitting the single-omega model.*
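
As a convenience, here is a minimal Python sketch of one way to unpack the downloaded archives and locate the molecular evolution analyses directory. It is not part of the repo's own tooling. The archive names and the relative path are taken from this ReadMe; the function names `extract_archives` and `analyses_dir` are hypothetical. The sketch also assumes that each zip file unpacks to a top-level directory matching its name, which has not been verified here.

```python
from pathlib import Path
import zipfile

# Archive names as listed in this ReadMe.
ARCHIVES = [
    "assembled_transcriptomes.zip",
    "genes_of_interest_BLAST_raw.zip",
    "Phylogeny_reconstruction.zip",
    "Sample_collection_information.zip",
    "selected_transcripts_raw.zip",
    "SequencingDataInformation.zip",
    "SRA_information.zip",
    "transcriptome_assembly.zip",
]

# Relative location of the molecular evolution analyses, as given in this ReadMe.
ANALYSES_SUBPATH = Path(
    "selected_transcripts_raw/concatenated_fastas/"
    "transdecoder_orfs_cds_seqs/filter_by_ref_similarity"
)


def extract_archives(download_dir, repo_root):
    """Extract every listed archive found in download_dir into repo_root.

    Archives that are not present are skipped. Returns the names of the
    archives that were extracted, in the order listed above.
    """
    download_dir, repo_root = Path(download_dir), Path(repo_root)
    repo_root.mkdir(parents=True, exist_ok=True)
    extracted = []
    for name in ARCHIVES:
        archive = download_dir / name
        if archive.is_file():
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(repo_root)
            extracted.append(name)
    return extracted


def analyses_dir(repo_root):
    """Return the expected analyses directory under repo_root.

    Assumption: selected_transcripts_raw.zip unpacks to a directory named
    selected_transcripts_raw. The returned path may not exist otherwise.
    """
    return Path(repo_root) / ANALYSES_SUBPATH
```

For instance, calling `extract_archives("downloads", "Petunieae_phylotranscriptomics_repo")` and then `analyses_dir("Petunieae_phylotranscriptomics_repo")` would give the directory to inspect, where `downloads` is a placeholder for wherever the zip files were saved.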