Running hadge on multiple samples

The pipeline can run on multiple samples. In this case, the input data parameters for each sample are read from a sample sheet via params.multi_sample, which is set to null by default.

Sample sheet

The sample sheet must contain a column called sampleId that holds a unique ID for each sample.
The other columns it must contain depend on the mode and methods you want to run (for an example file, see the Resources section below):
- hashing mode:

  sampleId	rna_matrix_raw	rna_matrix_filtered	hto_matrix_raw	hto_matrix_filtered
  sample1
  sample2
- genetic mode: If an input such as vcf_donor is not available, set its value to "None", as in single-sample mode. If params.generate_anndata or params.generate_mudata is enabled, also include the columns for the HTO and RNA count matrices.

  sampleId	bam	bam_index	barcodes	nsample	celldata	vcf_mixed	vcf_donor
  sample1
  sample2
- rescue mode:

  sampleId	rna_matrix_raw	rna_matrix_filtered	hto_matrix_raw	hto_matrix_filtered	bam	bam_index	barcodes	nsample	celldata	vcf_mixed	vcf_donor
  sample1
  sample2

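The column requirements listed above can be checked before launching a run. The following is a minimal sketch, not part of hadge itself: the function name check_sample_sheet and the REQUIRED_COLUMNS mapping are hypothetical, and the comma delimiter is an assumption (pass delimiter="\t" for a tab-separated sheet). It only covers the columns listed per mode and does not account for the extra HTO and RNA columns needed in genetic mode when params.generate_anndata or params.generate_mudata is enabled.

```python
import csv
import io

# Column names are taken from the lists above; grouping them in a dict is illustrative.
HASHING_COLUMNS = ["sampleId", "rna_matrix_raw", "rna_matrix_filtered",
                   "hto_matrix_raw", "hto_matrix_filtered"]
GENETIC_COLUMNS = ["sampleId", "bam", "bam_index", "barcodes",
                   "nsample", "celldata", "vcf_mixed", "vcf_donor"]
REQUIRED_COLUMNS = {
    "hashing": HASHING_COLUMNS,
    "genetic": GENETIC_COLUMNS,
    # Rescue mode lists the hashing columns followed by the genetic ones.
    "rescue": HASHING_COLUMNS + GENETIC_COLUMNS[1:],
}


def check_sample_sheet(handle, mode, delimiter=","):
    """Return a list of problems found in the sample sheet (empty if none)."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    problems = [f"missing column: {col}"
                for col in REQUIRED_COLUMNS[mode] if col not in header]
    if "sampleId" in header:
        seen = set()
        for row in reader:
            sample_id = row["sampleId"]
            if sample_id in seen:
                problems.append(f"duplicate sampleId: {sample_id}")
            seen.add(sample_id)
    return problems


# Usage with an in-memory sheet; the file paths are hypothetical placeholders.
sheet = io.StringIO(
    "sampleId,rna_matrix_raw,rna_matrix_filtered,hto_matrix_raw,hto_matrix_filtered\n"
    "sample1,s1/rna_raw,s1/rna_filtered,s1/hto_raw,s1/hto_filtered\n"
    "sample2,s2/rna_raw,s2/rna_filtered,s2/hto_raw,s2/hto_filtered\n"
)
print(check_sample_sheet(sheet, "hashing"))  # prints []
```
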
The remaining parameters for each process are specified in the nextflow.config file, just like when demultiplexing a single sample.
Single-sample and multi-sample runs differ in one respect: when processing multiple samples, the pipeline accepts only a single value for each process parameter, whereas for a single sample, multiple comma-separated values are allowed.

Output

When running on multiple samples, the pipeline writes the output for each sample to the folder $projectDir/$params.outdir/$sampleId/$params.mode.
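To locate results programmatically, the folder layout described above can be reproduced as follows. This is a small sketch under stated assumptions: the function name sample_output_dir and all example values are hypothetical, and outdir is assumed to be a relative path below the project directory.

```python
from pathlib import PurePosixPath


def sample_output_dir(project_dir, outdir, sample_id, mode):
    """Join the four path components in the order given above."""
    return PurePosixPath(project_dir) / outdir / sample_id / mode


# Hypothetical values, for illustration only.
for sample_id in ["sample1", "sample2"]:
    print(sample_output_dir("/path/to/hadge", "result", sample_id, "rescue"))
```
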

Resources

Example sample sheets for multi_sample mode are shown below, in the order hashing, genetic, and rescue mode.

sampleId	rna_matrix_raw	rna_matrix_filtered	hto_matrix_raw	hto_matrix_filtered
sample1
sample2

sampleId	bam	bam_index	barcodes	nsample	celldata	vcf_mixed	vcf_donor
sample1
sample2

sampleId	rna_matrix_raw	rna_matrix_filtered	hto_matrix_raw	hto_matrix_filtered	bam	bam_index	barcodes	nsample	celldata	vcf_mixed	vcf_donor
sample1
sample2
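A genetic-mode sheet like the one above could also be generated from a script. The sketch below is one possible approach rather than a hadge utility: write_genetic_sheet is a hypothetical helper, the file paths are placeholders, and the comma delimiter is an assumption. Inputs that are not provided are written as "None", following the rule for genetic mode.

```python
import csv
import io

# Genetic-mode columns as listed in the sample sheet section.
GENETIC_COLUMNS = ["sampleId", "bam", "bam_index", "barcodes",
                   "nsample", "celldata", "vcf_mixed", "vcf_donor"]


def write_genetic_sheet(handle, samples, delimiter=","):
    """Write a header and one row per sample, using "None" for missing inputs."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(GENETIC_COLUMNS)
    for sample in samples:
        row = []
        for column in GENETIC_COLUMNS:
            value = sample.get(column)
            row.append("None" if value is None else value)
        writer.writerow(row)


# Hypothetical inputs; celldata and vcf_donor are not available for this sample.
samples = [
    {"sampleId": "sample1", "bam": "s1/pooled.bam", "bam_index": "s1/pooled.bam.bai",
     "barcodes": "s1/barcodes.tsv", "nsample": 2, "vcf_mixed": "s1/mixed.vcf"},
]
buffer = io.StringIO()
write_genetic_sheet(buffer, samples)
print(buffer.getvalue())
```
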