Generating fastq files with bcl2fastq – Support Suite - Parse Biosciences

The following article contains instructions for running bcl2fastq, as well as example sample sheets for dual index and single index libraries.

Installing bcl2fastq on Ubuntu 20.04 LTS

Install alien so that the RPM installer can be run on Ubuntu:

sudo apt-get install alien

Download the Linux RPM for bcl2fastq (v2.20) from Illumina:

https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html

Unzip the downloaded archive to extract the RPM installer:

unzip bcl2fastq2-v2-20-0-linux-x86-64.zip

Install bcl2fastq using alien:

sudo alien -i bcl2fastq2-v2.20.0.422-Linux-x86_64.rpm
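
After installation, it can help to confirm that the binary is on your PATH before moving on. The following is a minimal sketch of such a check. The function name check_bcl2fastq is made up for this example, and the sketch assumes that your bcl2fastq build accepts the --version flag, as v2.20 does.

```shell
# Hypothetical helper: confirm that bcl2fastq is installed and on the PATH.
check_bcl2fastq() {
    if ! command -v bcl2fastq >/dev/null 2>&1; then
        echo "bcl2fastq not found on PATH" >&2
        return 1
    fi
    # Print the installed version (this article uses v2.20).
    bcl2fastq --version 2>&1
}
```

Run check_bcl2fastq in the same shell. A non-zero exit status means that the installation step needs to be revisited.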

Running bcl2fastq to get fastq files

First, download the entire run directory from your Illumina sequencer or from BaseSpace (in this example, the directory is called 210331_NB923494_0012_ADK4NFCDA1):

> ls 210331_NB923494_0012_ADK4NFCDA1
Alignment_1                     InterOp                RTARead3Complete.txt
CompletedJobInfo.xml            Logs                   Recipe
Config                          QueuedForAnalysis.txt  RunCompletionStatus.xml
CopyComplete.txt                RTAComplete.txt        RunInfo.xml
Data                            RTAConfiguration.xml   RunParameters.xml
GenerateFASTQRunStatistics.xml  RTALogs                SoftwareVersionsFile.csv
Images                          RTARead1Complete.txt   
InstrumentAnalyticsLogs         RTARead2Complete.txt

Example sample sheet for unique dual indexed (UDI) libraries

Make a SampleSheet.csv file by filling out the template below. A CSV version of the example is attached at the bottom of this article. You will need to specify the number of cycles for Read 1 and Read 2. To detect all barcodes, Read 2 needs a minimum number of cycles that depends on the kit used to prepare the libraries: for example, 86 cycles for Evercode WT v2 or 58 cycles for Evercode WT v3. This example includes sublibrary index IDs 1-8 from the UDI plate-WT that were used for WT library preparation. Please refer to the user manual for a list of which index sequences to use for demultiplexing.

Important: Please do not include an adapter sequence under [Settings]. An adapter sequence causes bcl2fastq to trim reads unnecessarily, which can remove barcode sequences from Read 2.

Note: For the i5 index, some sequencing instruments require the reverse complement of the sequence in the sample sheet (as shown in the example below) instead of the forward sequence. Please enter the sequence in the orientation required by the sequencing instrument you are using.
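
If you need to convert an i5 sequence between orientations, a small shell helper can do it. The following is a sketch only. The function name revcomp is hypothetical, and it assumes the input contains only the bases A, C, G, and T. Because the reverse complement is its own inverse, the same helper converts in either direction.

```shell
# Hypothetical helper: print the reverse complement of a DNA index sequence.
revcomp() {
    printf '%s\n' "$1" |
        tr 'ACGTacgt' 'TGCAtgca' |
        awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }'
}

# Example with the first index2 (i5) sequence from the template in this article.
revcomp ATGTGAAG   # prints CTTCACAT
```

Which orientation your instrument expects is not something this helper can decide; please confirm it with your sequencing provider.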

[Header]
Local Run Manager Analysis Id,<fill>
Experiment Name,<fill>
Date,<fill>
Module,GenerateFASTQ - 2.0.1
Workflow,GenerateFASTQ
Assay,Nextera
Description,<fill>
Chemistry,Default

[Reads]
<read1 length>
<read2 length>

[Settings]

[Data]
Sample_ID,Sample_Name,Description,I7_Index_ID,index,I5_Index_ID,index2,Sample_Project
<fill>,<fill>,,,CAGATCAC,,ATGTGAAG,
<fill>,<fill>,,,ACTGATAG,,GTCCAACC,
<fill>,<fill>,,,GATCAGTC,,AGAGTCAA,
<fill>,<fill>,,,CTTGTAAT,,AGTTGGCT,
<fill>,<fill>,,,AGTCAAGA,,ATAAGGCG,
<fill>,<fill>,,,CCGTCCTA,,CCGTACAG,
<fill>,<fill>,,,GTAGAGTA,,CATTCATG,
<fill>,<fill>,,,GTCCGCCT,,AGATACGG,

Note: Some sample sheets may have different terminology to specify the index reads. Typically, “Read 1 Index” refers to the i7 index and “Read 2 Index” refers to the i5 index. If you have questions on where to input read length or index sequences, please consult with your sequencing provider.
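
Once the [Reads] section is filled in, the Read 2 length can be checked against the minimum for your kit. The following sketch assumes the layout of the template above, where the first number under [Reads] is Read 1 and the second is Read 2. The function names read2_cycles and check_read2 are made up for this example.

```shell
# Hypothetical helper: print the Read 2 cycle count from the [Reads] section.
read2_cycles() {
    awk -F, '
        /^\[/ { in_reads = ($1 ~ /^\[Reads\]/); n = 0; next }
        in_reads && $1 ~ /^[0-9]+/ { n++; if (n == 2) { print $1 + 0; exit } }
    ' "$1"
}

# Hypothetical helper: compare Read 2 with a minimum cycle count,
# e.g. 86 for Evercode WT v2 or 58 for Evercode WT v3.
check_read2() {
    sheet=$1
    min_cycles=$2
    cycles=$(read2_cycles "$sheet")
    if [ -z "$cycles" ]; then
        echo "No Read 2 length found in [Reads]" >&2
        return 2
    fi
    if [ "$cycles" -lt "$min_cycles" ]; then
        echo "Read 2 is $cycles cycles; at least $min_cycles are required" >&2
        return 1
    fi
    echo "Read 2 is $cycles cycles (minimum $min_cycles): OK"
}
```

For an Evercode WT v2 run, this could be called as: check_read2 SampleSheet.csv 86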

Then add your SampleSheet.csv to the top level of the sequencing directory.

> ls 210331_NB923494_0012_ADK4NFCDA1
Alignment_1                     InterOp                RTARead3Complete.txt
CompletedJobInfo.xml            Logs                   Recipe
Config                          QueuedForAnalysis.txt  RunCompletionStatus.xml
CopyComplete.txt                RTAComplete.txt        RunInfo.xml
Data                            RTAConfiguration.xml   RunParameters.xml
GenerateFASTQRunStatistics.xml  RTALogs                SampleSheet.csv
Images                          RTARead1Complete.txt   SoftwareVersionsFile.csv
InstrumentAnalyticsLogs         RTARead2Complete.txt
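
Before running bcl2fastq, it may be worth confirming that the sample sheet does not define an adapter sequence, since that would trigger the unwanted trimming described above. The following is a sketch under stated assumptions: the function name has_adapter_setting is hypothetical, and it treats any key under [Settings] that contains the word "adapter" (for example Adapter or AdapterRead2) and has a non-empty value as an adapter setting.

```shell
# Hypothetical helper: exit 0 if [Settings] defines an adapter sequence, 1 otherwise.
has_adapter_setting() {
    awk -F, '
        { sub(/\r$/, "") }   # tolerate Windows line endings
        /^\[/ { in_settings = ($1 ~ /^\[Settings\]/); next }
        in_settings && tolower($1) ~ /adapter/ && $2 != "" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Usage (path taken from the example run directory in this article):
#   if has_adapter_setting 210331_NB923494_0012_ADK4NFCDA1/SampleSheet.csv; then
#       echo "Remove the adapter lines under [Settings] before running bcl2fastq" >&2
#   fi
```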

Then run bcl2fastq. The following example uses 32 processing threads (-p 32) and writes the fastq files to a folder called "fastq_files" inside the run directory. The --no-lane-splitting parameter can be convenient, since it ensures that all reads with a given index are demultiplexed into the same fastq files regardless of lane.

bcl2fastq -i 210331_NB923494_0012_ADK4NFCDA1/Data/Intensities/BaseCalls/ -p 32 --output-dir 210331_NB923494_0012_ADK4NFCDA1/fastq_files --no-lane-splitting

Here are the resulting fastq files after bcl2fastq completes:

> ls 210331_NB923494_0012_ADK4NFCDA1/fastq_files
Reports                          s3_S3_R2_001.fastq.gz  s8_S8_R1_001.fastq.gz
Stats                            s4_S4_R1_001.fastq.gz  s8_S8_R2_001.fastq.gz
Undetermined_S0_R1_001.fastq.gz  s4_S4_R2_001.fastq.gz
Undetermined_S0_R2_001.fastq.gz  s5_S5_R1_001.fastq.gz
s1_S1_R1_001.fastq.gz            s5_S5_R2_001.fastq.gz
s1_S1_R2_001.fastq.gz            s6_S6_R1_001.fastq.gz
s2_S2_R1_001.fastq.gz            s6_S6_R2_001.fastq.gz
s2_S2_R2_001.fastq.gz            s7_S7_R1_001.fastq.gz
s3_S3_R1_001.fastq.gz            s7_S7_R2_001.fastq.gz
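
As a quick sanity check on the output, you can count the reads in each file and confirm that every R1 file has a matching R2 file with the same number of reads. The following is a minimal sketch, not part of bcl2fastq. The function names count_reads and check_pairs are made up for this example, and the sketch assumes the file naming shown in the listing above.

```shell
# Hypothetical helper: count the reads in a gzipped fastq file (4 lines per read).
count_reads() {
    echo $(( $(gzip -dc "$1" | wc -l) / 4 ))
}

# Hypothetical helper: report read counts for every R1/R2 pair in a directory.
# Returns non-zero if an R2 file is missing or the counts differ.
check_pairs() {
    for r1 in "$1"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue
        r2=${r1%_R1_001.fastq.gz}_R2_001.fastq.gz
        if [ ! -e "$r2" ]; then
            echo "Missing R2 file for $r1" >&2
            return 1
        fi
        n1=$(count_reads "$r1")
        n2=$(count_reads "$r2")
        echo "$(basename "$r1"): $n1 reads, $(basename "$r2"): $n2 reads"
        [ "$n1" -eq "$n2" ] || return 1
    done
}
```

For the example run, this could be called as: check_pairs 210331_NB923494_0012_ADK4NFCDA1/fastq_files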

The demultiplexed fastq files are now ready to be processed. See Running the Pipeline (Current Version) for more information.

Appendix: Example sample sheet for single indexed libraries

Make a SampleSheet.csv file by filling out the template below. A CSV version of the example is attached at the bottom of this article. You will need to specify the number of cycles for Read 1 and Read 2 (e.g., for Evercode WT v2, Read 2 must be a minimum of 86 cycles to detect all barcodes). This example includes the sequences for eight single indices from an Evercode WT kit. Please refer to the user manual for a list of which index sequences to use for demultiplexing.

Important: Please do not include an adapter sequence under [Settings]. An adapter sequence causes bcl2fastq to trim reads unnecessarily, which can remove barcode sequences from Read 2.

[Header]
Local Run Manager Analysis Id,<fill>
Experiment Name,<fill>
Date,<fill>
Module,GenerateFASTQ - 2.0.1
Workflow,GenerateFASTQ
Assay,Nextera
Description,<fill>
Chemistry,Default

[Reads]
<read1 length>
<read2 length>

[Settings]

[Data]
Sample_ID,Sample_Name,Description,index,I7_Index_ID,Sample_Project
<fill>,<fill>,,CAGATC,CAGATC,
<fill>,<fill>,,ACTTGA,ACTTGA,
<fill>,<fill>,,GATCAG,GATCAG,
<fill>,<fill>,,TAGCTT,TAGCTT,
<fill>,<fill>,,ATGTCA,ATGTCA,
<fill>,<fill>,,CTTGTA,CTTGTA,
<fill>,<fill>,,AGTCAA,AGTCAA,
<fill>,<fill>,,AGTTCC,AGTTCC,
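
Each row of the [Data] section needs a distinct index (or index and index2 pair) so that reads can be assigned to a single sample. The following sketch lists any repeated entries and works for both the single indexed and the UDI templates in this article. The function name duplicate_indices is hypothetical, and the sketch assumes the index columns are named index and index2, as in the templates.

```shell
# Hypothetical helper: print any index (or index+index2 pair) that is used
# by more than one row of the [Data] section.
duplicate_indices() {
    awk -F, '
        { sub(/\r$/, "") }   # tolerate Windows line endings
        /^\[/ { in_data = ($1 ~ /^\[Data\]/); header_seen = 0; next }
        !in_data || NF == 0 { next }
        !header_seen {
            # Locate the index columns by name in the [Data] header row.
            for (i = 1; i <= NF; i++) {
                if ($i == "index") c1 = i
                if ($i == "index2") c2 = i
            }
            header_seen = 1
            next
        }
        c1 {
            key = $c1
            if (c2) key = key "+" $c2
            if (seen[key]++) print key
        }
    ' "$1"
}
```

Running duplicate_indices SampleSheet.csv prints nothing when all rows are distinct.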

Attachments:
Example sample sheet for UDI libraries.csv (713 bytes)
Example sample sheet for single indexed libraries.csv (612 bytes)
