Overview
RNAchrom is a Nextflow pipeline for processing RNA-DNA interactome sequencing data. It handles data from experiments such as GRID-seq, RADICL-seq, and iMARGI, and provides a single workflow for analyzing RNA-DNA interactions.
A typical RNAchrom run includes stages such as read trimming, alignment, and annotation. RNA-DNA interactions are extracted and filtered, then visualized or analyzed further.
Key Features
Input Management: Automatically validates input files and stages them for further processing.
Modular Design: Includes flexible modules for deduplication, trimming, alignment, and more.
Comprehensive Analysis: Handles both one-to-all (e.g., RAP) and all-to-all (e.g., GRID-seq) experimental designs.
Quality Control: Integrates with FastQC and MultiQC for quality assessment.
Flexible Customization: Users can skip stages or customize configurations specific to their experimental setup.
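As an illustration of the kind of check the Input Management feature describes, here is a minimal Python sketch of samplesheet validation. It is not RNAchrom's own code: the column names `sample`, `rna`, and `dna`, the accepted FASTQ suffixes, and the function `validate_samplesheet` are all assumptions made for this example, and the real input format may differ.

```python
import csv
import io

# Hypothetical column names; the real RNAchrom samplesheet layout may differ.
REQUIRED_COLUMNS = ("sample", "rna", "dna")
# Assumed set of accepted FASTQ file suffixes.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def validate_samplesheet(text):
    """Return a list of human-readable problems found in a CSV samplesheet."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        # Without the required columns, row-level checks make no sense.
        return ["missing column(s): " + ", ".join(missing)]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample = (row["sample"] or "").strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample name")
        elif sample in seen:
            problems.append(f"line {line_no}: duplicate sample '{sample}'")
        seen.add(sample)
        for column in ("rna", "dna"):
            path = (row[column] or "").strip()
            if not path.endswith(FASTQ_SUFFIXES):
                problems.append(
                    f"line {line_no}: {column} file '{path}' is not FASTQ"
                )
    return problems
```

An empty list means the sheet passed these checks. A real validator would also confirm that the listed files exist, which this sketch leaves out so that it runs without any files on disk.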
Pipeline Modules
RNAchrom includes a variety of modules that can be utilized as needed in the workflow:
| Module | Description |
|---|---|
| INPUT_CHECK | Validates and stages input samples. |
| DEDUP | Deduplicates sequencing reads. |
| TRIM | Trims sequencing reads to remove adaptors and low-quality bases. |
| ALIGN | Aligns reads to the reference genome. |
| ANNOTATE | Conducts annotation voting for the DNA segments. |
| MACS2 | Identifies significant peaks in RNA-DNA interaction data. |
| MULTIQC | Aggregates results across multiple samples for comparative analysis. |
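To make the "use as needed" idea concrete, the following Python sketch selects which of the modules listed above would run for a given experimental design. It is a sketch under stated assumptions, not the pipeline's actual scheduling logic: the run order simply follows the table, the function `plan_modules` and its `skip` argument are hypothetical, and restricting MACS2 to one-to-all designs is inferred from the peak-calling entry in the contents list.

```python
# Module names come from the module table; the order is an assumption.
MODULES = ["INPUT_CHECK", "DEDUP", "TRIM", "ALIGN", "ANNOTATE", "MACS2", "MULTIQC"]
# Assumption: peak calling only applies to one-to-all experiments.
ONE_TO_ALL_ONLY = {"MACS2"}
DESIGNS = ("one-to-all", "all-to-all")


def plan_modules(design, skip=()):
    """Return the module names that would run for a design, in order."""
    if design not in DESIGNS:
        raise ValueError(f"unknown design: {design!r}")
    skip = set(skip)
    unknown = skip - set(MODULES)
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise ValueError(f"unknown module(s): {sorted(unknown)}")
    return [
        name
        for name in MODULES
        if name not in skip
        and not (name in ONE_TO_ALL_ONLY and design != "one-to-all")
    ]
```

For example, `plan_modules("all-to-all", skip={"DEDUP"})` returns the list without `DEDUP` and without `MACS2`, while `plan_modules("one-to-all")` keeps all seven modules.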
Contents:
- Setup and Installation
- Input file preparation
- RNAchrom Configuration
- Stages of Data Analysis
- 1. Input and Preprocessing
- 2. Deduplication (optional)
- 3. Trimming
- 4. Bridge Processing (for specific experiment types)
- 5. Restriction sites filtering
- 6. Alignment
- 6. Post-alignment Processing
- 7. Contact Generation
- 8. CIGAR Filtering (optional)
- 9. Merging Replicates
- 10. Chromosome Splitting (optional)
- 11. Annotation and Voting
- 12. Background Model Generation
- 13. Normalization
- 14. Peak Calling (for One-to-All experiments)
- 15. Statistics and Visualization
- 16. MultiQC Report
- Main Results
- Results
- 1. Input and Preprocessing
- 2. Deduplication (optional)
- 3. Trimming
- 4. Bridge Processing (for specific experiment types)
- 5. Alignment
- 6. Post-alignment Processing
- 7. Contact Generation
- 8. CIGAR Filtering (optional)
- 9. Merging Replicates
- 10. Chromosome Splitting (optional)
- 11. Annotation and Voting
- 12. Background Model Generation
- 13. Normalization
- 14. Peak Calling (for One-to-All experiments)
- 15. Statistics and Visualization
- 16. MultiQC Report
- Main Results
Optional Stages
Setup and Configuration
The pipeline can be configured using custom parameter settings to fit the needs of different experimental designs. Key configuration settings include:
Genome and Annotation: Provides support for multiple reference genomes and annotation files.
Toolchain Configuration: Lets you choose the tools used for alignment (e.g., HISAT2, STAR) and trimming (e.g., Trimmomatic, fastp).
Output Management: Options for generating summarized reports and logs for comprehensive analysis.
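The settings above can be pictured as a small, validated configuration object. The Python sketch below is illustrative only: the field names (`genome_fasta`, `annotation_gtf`, `aligner`, `trimmer`, `outdir`), their default values, and the class `PipelineConfig` are hypothetical and do not correspond to documented RNAchrom parameters. Only the tool names are taken from the list above.

```python
from dataclasses import dataclass

# Tool choices named in the configuration list; lower-cased for comparison.
ALIGNERS = {"hisat2", "star"}
TRIMMERS = {"trimmomatic", "fastp"}


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical settings bundle; field names and defaults are assumptions."""

    genome_fasta: str
    annotation_gtf: str
    aligner: str = "hisat2"
    trimmer: str = "fastp"
    outdir: str = "results"

    def __post_init__(self):
        # Fail fast on unsupported tool names rather than midway through a run.
        if self.aligner.lower() not in ALIGNERS:
            raise ValueError(f"unsupported aligner: {self.aligner!r}")
        if self.trimmer.lower() not in TRIMMERS:
            raise ValueError(f"unsupported trimmer: {self.trimmer!r}")
```

Constructing `PipelineConfig("genome.fa", "genes.gtf", aligner="STAR")` succeeds, whereas an aligner outside the listed set raises `ValueError` before any processing would start.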
User Support and Community
Documentation: Detailed installation and execution instructions are available.
Community Support: Engage with the community through forums and GitHub issues.
Contributions: Open to contributions from the research community to enhance features.
For additional help and support, please check our community forums and our GitHub repository.