bam2pbam
There are two versions of the bam2pbam protocol: (1) for bam files aligned to a reference genome, and (2) for bam files aligned to a reference transcriptome.
Bam Files Aligned to Reference Genome
The main bash script for converting bam files to pbam is bam2pbam.sh.
This code requires:
(1) python3 (required by the compression algorithm)
(2) samtools
(3) createDiff.py
(4) compress.py
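The requirements above can be checked before a run. The following is a minimal sketch, not part of the bam2pbam distribution: the function name check_deps is hypothetical, and it assumes that createDiff.py and compress.py sit in the directory passed as its first argument (for example, the directory that contains bam2pbam.sh).

```shell
#!/bin/sh
# Hypothetical pre-flight check; not shipped with bam2pbam.
# Usage: check_deps <directory expected to hold the helper scripts>
check_deps() {
    script_dir=$1
    missing=0

    # Executables that must be reachable through PATH.
    for tool in python3 samtools; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing executable: $tool" >&2
            missing=1
        fi
    done

    # Helper scripts; their location is an assumption of this sketch.
    for helper in createDiff.py compress.py; do
        if [ ! -f "$script_dir/$helper" ]; then
            echo "missing helper script: $script_dir/$helper" >&2
            missing=1
        fi
    done

    # 0 if everything was found, 1 otherwise.
    return "$missing"
}
```

It could be called as check_deps . && sh bam2pbam.sh 100 PE file.bam tmp all, so that the conversion only starts when nothing is missing.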
README
User arguments
(1) read length
(2) paired-end or single-end sequencing (PE, SE)
(3) bam file
(4) name of the temporary file (will be deleted at the end)
(5) type of cleaning, i.e. which kind of variants to mask. Options: all, mismatch, indel, split
example usage: sh bam2pbam.sh 100 PE file.bam tmp all
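Because the five arguments are positional, a swapped or mistyped value is easy to miss. The sketch below shows one way to validate them before calling bam2pbam.sh. The function validate_args is hypothetical and only encodes the constraints listed above (digits for the read length, PE or SE, an existing bam file, and one of the four cleaning options); bam2pbam.sh itself may check more or less than this.

```shell
#!/bin/sh
# Hypothetical argument check; not part of bam2pbam.sh.
# Usage: validate_args <read_length> <PE|SE> <bam> <tmp_name> <cleaning_type>
validate_args() {
    if [ "$#" -ne 5 ]; then
        echo "usage: sh bam2pbam.sh <read_length> <PE|SE> <bam> <tmp> <all|mismatch|indel|split>" >&2
        return 2
    fi
    read_len=$1; layout=$2; bam=$3; clean=$5

    # Read length must consist of digits only.
    case $read_len in
        ''|*[!0-9]*) echo "read length is not a number: $read_len" >&2; return 1 ;;
    esac

    # Sequencing layout: paired end or single end.
    case $layout in
        PE|SE) ;;
        *) echo "layout must be PE or SE: $layout" >&2; return 1 ;;
    esac

    # Cleaning type: which kind of variants to mask.
    case $clean in
        all|mismatch|indel|split) ;;
        *) echo "unknown cleaning type: $clean" >&2; return 1 ;;
    esac

    # The input bam file must exist.
    if [ ! -f "$bam" ]; then
        echo "bam file not found: $bam" >&2
        return 1
    fi
    return 0
}
```

A wrapper could then run validate_args "$@" && sh bam2pbam.sh "$@".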
Bam Files Aligned to Reference Transcriptome
The bash script that converts bam files created by mapping RNA-Seq data to a transcriptome into pbam is createptransbam.sh.
This code requires:
(1) samtools
(2) ptransbam.sh
README
For now, the necessary file locations are hardcoded in createptransbam.sh and ptransbam.sh.
Before running, please make sure to change the bam and rsemfa variables in both createptransbam.sh and ptransbam.sh:
ref = < reference transcriptome >
bam = < name of the bam file >
We used the RSEM-generated reference transcriptome from the ENCODE data portal.
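To find the hardcoded locations that need editing, the assignments can be listed with grep. This is only a sketch: the function list_hardcoded_vars is hypothetical, and it assumes the scripts set the paths with plain shell assignments at the start of a line. Since this README mentions both ref and rsemfa, the pattern looks for bam, ref and rsemfa.

```shell
#!/bin/sh
# Hypothetical helper; prints file:line:assignment for each hardcoded path.
# Usage: list_hardcoded_vars createptransbam.sh ptransbam.sh
list_hardcoded_vars() {
    for script in "$@"; do
        # /dev/null as a second file makes grep print the file name.
        # "|| true" keeps a script without matches from being an error.
        grep -nE '^[[:space:]]*(bam|ref|rsemfa)=' "$script" /dev/null || true
    done
}
```

Each reported line can then be edited by hand so that it points to your own bam file and reference transcriptome.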