| Stage | Script or executable | Output files or directories |
| --- | --- | --- |
| Initialization and configuration file verification and loading | ShapeMapper.py, parseConfigFile.py, conf.py | `RUN/log.txt`: file that logs pipeline stage execution and error messages. `RUN/temp`: folder that stores subprocess standard output and standard error streams during execution (can be deleted after the run completes). `RUN/output`: folder that stores the bulk of the pipeline output. `RUN/output/*`: subfolders that store the output from each pipeline stage. |
| Quality trimming | trimPhred | `RUN/output/trimmed_reads/*.fastq`: sequencing reads trimmed left-to-right at the site of the first average phred score below conf.minPhred over a window of length conf.windowSize, with resulting read lengths greater than or equal to conf.minLength. |
| Sequence alignment preparation | bowtie2-build (third party) [38] | `RUN/output/bowtie_index/*`: Bowtie2 reference sequence indices. |
| Sequence alignment | bowtie2 (third party) [38] | `RUN/output/aligned_reads/*.sam`: aligned sequence files, one file for each line in the configuration file section '[alignments]'. |
| Alignment parsing and ambiguously aligned deletion identification | parseAlignment | `RUN/output/mutation_strings/*.txt`: parsed and simplified alignments. |
| Mutation counting | countMutations, pivotCSV.py | `RUN/output/counted_mutations/*.csv`: mutation counts and read depths written to comma-separated files, one file for each line in the configuration file section '[alignments]'. `RUN/output/counted_mutations_columns/*.csv`: the same files arranged in column format. These files also contain the total mismatch count, total deletion count and total unambiguously aligned deletion count. |
| Reactivity profile creation and standard error calculation | generateReactivityProfiles.py (uses matplotlib, third party) | `RUN/output/reactivity_profiles/*.tab`: the most detailed output, containing mutation rates, depths, reactivities and standard errors in tab-delimited columns. `RUN/output/reactivity_profiles/*.shape`: simple SHAPE reactivity file, tab-delimited, with nucleotide numbers in the first column and reactivities in the second; no-data positions are indicated by -999. `RUN/output/reactivity_profiles/*.map`: SHAPE reactivity file including standard errors and nucleotide sequence. `RUN/output/reactivity_profiles/*_histograms.pdf`: histograms of mutation rates, read depths and reactivities, useful for troubleshooting. `RUN/output/reactivity_profiles/*_depth_and_reactivity.pdf`: images of the read depth profile, the mutation rate above background profile and the reactivity profile. |
| Structure modeling | Fold (part of RNAstructure, third party) [33] | `RUN/output/folds/*.seq`: reference sequence files in the format required by RNAstructure. `RUN/output/folds/*.ct`: structure models, one file for each line in the configuration file section '[folds]'. |
| Structure drawing | pvclient.py (custom client for the Pseudoviewer web service, third party) [59] | `RUN/output/folds/*.eps`: PostScript image files of the lowest predicted free energy structure colored by SHAPE reactivity, one for each RNA specified in the configuration file section '[folds]'. `RUN/output/folds/*.xrna`: XRNA files for each lowest predicted free energy structure, which can be manually edited if desired. |
The initialization stage is directly executed by the user; all subsequent stages are launched automatically from the ShapeMapper.py script. 'RUN' indicates the path to the folder from which ShapeMapper was executed, which should contain FASTA reference sequences, raw sequencing reads and a configuration file. 'conf' indicates configuration file parameters. '*' is a wild-card character indicating multiple names.
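The quality trimming rule in the table can be illustrated with a short Python sketch. This is not the trimPhred source code; it is a minimal reading of the description, under several assumptions: quality strings use the Phred+33 encoding, the read is cut at the start of the first window whose average falls below the threshold, reads shorter than one window are left untrimmed, and reads that end up shorter than the minimum length are discarded. The function names and the default values are hypothetical; in the pipeline the actual values come from conf.minPhred, conf.windowSize and conf.minLength.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trimmed_length(quals, min_phred=20, window_size=5, min_length=25):
    """Return how many leading bases to keep, or 0 if the read should be dropped.

    Scans windows left-to-right and cuts at the start of the first window
    whose average phred score is below min_phred (assumed cut position).
    The default parameter values are placeholders, not ShapeMapper defaults.
    """
    n = len(quals)
    keep = n  # keep the whole read unless a low-quality window is found
    if n >= window_size:
        window_sum = sum(quals[:window_size])
        for start in range(n - window_size + 1):
            if start > 0:
                # Slide the window one base to the right.
                window_sum += quals[start + window_size - 1] - quals[start - 1]
            if window_sum / window_size < min_phred:
                keep = start
                break
    # Enforce the minimum read length after trimming.
    return keep if keep >= min_length else 0


if __name__ == "__main__":
    sequence = "ACGTACGTACGTACGTACGT"
    quality = "I" * 10 + "#" * 10  # ten high-quality bases, then ten poor ones
    n_keep = trimmed_length(phred_scores(quality), min_length=5)
    print(sequence[:n_keep], quality[:n_keep])
```

A running window sum keeps the scan linear in read length, which matters little for a single read but adds up over millions of reads.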
peter