FAQ - gws_ubiome - Community

Q: Is the Constellab pipeline adapted to ligation sequencing?

The Constellab pipeline is not currently optimized for ligation-based sequencing workflows.

It targets the V3–V4 regions and is adapted for reads generated by NovaSeq, iSeq, or MiSeq sequencers.

Q: What should I do if the number of non-chimeric reads is low after the denoising step?

If you notice that the number of non-chimeric reads is low after the denoising step, you may need to adjust the --p-min-fold-parent-over-abundance parameter in the Q2FeatureInferencePE and Q2FeatureInferenceSE tasks.

This parameter specifies the minimum abundance required for potential parent sequences of a sequence being tested as chimeric. It is expressed as a fold-change relative to the abundance of the sequence under test. Values should be greater than or equal to 1 (i.e., potential parent sequences should be at least as abundant as the sequence being evaluated).

By default, the parameter is set to 1. It is recommended not to exceed a value of 16.
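
To make the fold-change rule above concrete, here is a minimal Python sketch of the eligibility test it describes. It is not the code used by the Q2FeatureInferencePE or Q2FeatureInferenceSE tasks; the function name, the sequence labels, and the abundances are hypothetical and only illustrate how a higher value leaves fewer sequences eligible as potential parents.

```python
def eligible_parents(candidate_abundance, other_abundances, min_fold=1.0):
    """Return the labels of sequences abundant enough to be potential parents.

    candidate_abundance: abundance of the sequence being tested as chimeric.
    other_abundances: dict mapping a sequence label to its abundance.
    min_fold: illustrative stand-in for --p-min-fold-parent-over-abundance.
    """
    if min_fold < 1:
        raise ValueError("min_fold should be greater than or equal to 1")
    # A potential parent must be at least min_fold times as abundant
    # as the sequence under test.
    threshold = min_fold * candidate_abundance
    return [label for label, abundance in other_abundances.items()
            if abundance >= threshold]


# Hypothetical abundances, for illustration only.
others = {"seq_a": 10, "seq_b": 50, "seq_c": 200}
print(eligible_parents(10, others, min_fold=1))
print(eligible_parents(10, others, min_fold=8))
```

With fewer eligible parents, fewer sequences can be flagged as chimeric, which is why raising the value tends to retain more reads.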

Q: What is the consequence of setting the --p-min-fold-parent-over-abundance parameter to a higher value?

Increasing this parameter generally leads to more non-chimeric reads being retained. As a result, the Shannon index tends to increase, because retaining more reads mechanically increases the observed diversity.

Q: What artifacts could influence the Shannon Index?

Several factors can influence the Shannon index. This index can be calculated using different logarithmic bases, and each tool often uses a specific base by default. This variation explains why discrepancies may arise when comparing results across different tools or with published literature.
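
The effect of the logarithmic base can be checked with a short Python sketch. The function below is a plain implementation of the Shannon formula, H = -sum(p_i * log(p_i)), written for illustration; it is not the code of any specific tool, and the counts are hypothetical.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon index of a list of feature counts, using the given log base."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for count in counts:
        if count > 0:  # features with zero counts contribute nothing
            p = count / total
            h -= p * math.log(p, base)
    return h


# Hypothetical sample: four equally abundant features.
counts = [25, 25, 25, 25]
h_natural = shannon_index(counts)          # natural logarithm
h_base2 = shannon_index(counts, base=2)    # base-2 logarithm
print(h_natural, h_base2)
```

The same sample gives two different values. A value computed with the natural logarithm can be converted to base 2 by dividing it by ln(2), so check which base each tool or publication uses before comparing numbers.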

Other factors may also contribute to these variations, such as the sequencing technologies, the selected molecular targets, the animal species studied, the geographical location, as well as the transition from OTUs to ASVs.

For reference, ASVs are now preferred in most cases due to their higher resolution and reproducibility.

It is important to remember that the Shannon index is a relative measure. Its primary value lies in the differences observed between sample groups. This index is particularly sensitive to experiment-specific parameters.

Q: How can I minimize batch effects and ensure meaningful comparisons across studies after changing the --p-min-fold-parent-over-abundance parameter?

To minimize batch effects and allow meaningful comparisons between studies, we recommend reprocessing all relevant datasets with the same parameter value. This value should be as parsimonious as possible: the lowest value from your benchmark that still retains a sufficient number of non-chimeric reads.
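
The selection rule can be sketched in a few lines of Python. This assumes you have already run a benchmark and recorded the number of non-chimeric reads for each tested value; the function name, the read counts, and the threshold below are made up for illustration and are not real results. What counts as "sufficient" is your own decision.

```python
def most_parsimonious_value(benchmark, min_reads):
    """Return the lowest tested value that retains at least min_reads reads.

    benchmark: dict mapping a tested parameter value to the number of
               non-chimeric reads retained with that value.
    min_reads: the read count you consider sufficient (a user choice).
    Returns None if no tested value reaches min_reads.
    """
    passing = [value for value, reads in benchmark.items() if reads >= min_reads]
    return min(passing) if passing else None


# Made-up benchmark, for illustration only.
example_benchmark = {1: 1200, 2: 4800, 4: 9500, 8: 9900, 16: 10050}
chosen = most_parsimonious_value(example_benchmark, min_reads=9000)
print(chosen)
```

Once a value is chosen, apply it to every dataset you intend to compare.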

Q: Why do I see weird quality check plots, and how should I assess read quality properly?

If your data comes from a NovaSeq or iSeq platform, it is normal to observe unusual quality check plots when using the Constellab pipeline.

To better assess the quality of your reads, we recommend using the FastQC and MultiQC tasks available in the gws_omix brick.

However, in all cases, you must also run the Q2QualityCheck task.