🔍 Introduction
1) What this calculator does
- DNA / RNA
  - Accepts a nucleotide sequence (FASTA or plain text).
  - Lets you specify strand (single-stranded, ss, or double-stranded, ds), topology (linear or circular), and 5′ end chemistry (hydroxyl, phosphate, triphosphate).
  - Computes molecular weight using base-specific masses plus small end corrections so results agree with common vendor calculators.
  - For circular nucleic acids, we model ligation (closing the strand) by removing water (condensation) at the junction.
- Protein
  - Reads a protein sequence and uses Biopython ProtParam to compute average molecular weight (with terminal H₂O), plus composition, % per residue, aromaticity, instability index, pI, secondary structure fractions, and extinction coefficients.
🧰 Prerequisites
- Access to Constellab and a valid Digital Lab environment
- Installed bricks: gws_omix ≥ xxxx
- Input file: FASTA or plain-text sequence (DNA, RNA, or protein)
🧪 Workflow: Step by Step
- Add the task: DNA/RNA/Protein Molecular Weight Calculator in Constellab.
- Provide input: Add your FASTA sequence file.
- Configure:
  - prefix (optional): base name of the output report (default: bioseq_mw).
- Run the task.
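To make the "FASTA or plain text" input concrete, here is a minimal sketch of how such input could be reduced to a single sequence string. The function name `read_sequence` and the choice to keep only the first FASTA record are assumptions made for illustration; the task's actual parsing may differ.

```python
def read_sequence(text: str) -> str:
    """Return the first sequence in FASTA or plain-text input, uppercased.

    Hypothetical helper: header lines (starting with '>') are skipped, and
    reading stops when a second FASTA record begins.
    """
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if parts:
                break  # a second record starts: keep only the first one
            continue  # header of the first record
        parts.append(line)
    # Remove any internal whitespace and normalise case.
    return "".join("".join(parts).split()).upper()


print(read_sequence(">seq1 example\nATGC\nggta\n"))  # FASTA input
print(read_sequence("atg c\n"))                      # plain-text input
```

Both calls return an uppercase string with no whitespace, which is the form assumed by the calculations later in this page.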


2) Nucleic acids: core ideas and constants
Base (nucleotide) masses used (approx. average masses, Da)
DNA (deoxy):
- dA = 313.209
- dT = 304.197
- dC = 289.184
- dG = 329.205
RNA (ribose):
- A = 329.21
- U = 306.17
- C = 305.18
- G = 345.21
Double-stranded average per base pair (Da per bp)
- dsDNA bp average ≈ 618.004812
- dsRNA bp average ≈ 644.574228
These bp-averages let us compute ds MW quickly and closely match vendor tools.
End-group / 5′ chemistry corrections (Da)
Because “a polymer is not just the sum of its monomers,” the chemical groups at each end slightly shift the total mass. We apply small, tuned deltas that reproduce the behavior of popular calculators.
- Single-stranded DNA (per strand):
  - 5′ hydroxyl: −67.497
  - 5′ phosphate: hydroxyl + 79.98
  - 5′ triphosphate: hydroxyl + 239.94
- Double-stranded DNA (total for the whole molecule):
  - 5′ hydroxyl: −123.92
  - 5′ phosphate: +36.04
  - 5′ triphosphate: +355.96
- Single-stranded RNA (per strand):
  - 5′ hydroxyl: −72.10
  - 5′ phosphate: hydroxyl + 79.98
  - 5′ triphosphate: hydroxyl + 159.96
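The single-stranded entries are written relative to the hydroxyl value ("hydroxyl + 79.98"). The sketch below resolves them into absolute deltas so they can be compared directly with the double-stranded totals. The dictionary and function names are illustrative and not taken from the tool's source.

```python
# Values quoted in the list above, in Da. All names here are illustrative.
SS_HYDROXYL = {"DNA": -67.497, "RNA": -72.10}
SS_OFFSET_FROM_HYDROXYL = {
    "DNA": {"hydroxyl": 0.0, "phosphate": 79.98, "triphosphate": 239.94},
    "RNA": {"hydroxyl": 0.0, "phosphate": 79.98, "triphosphate": 159.96},
}


def ss_end_delta(kind: str, five_prime: str) -> float:
    """Absolute per-strand end correction for single-stranded DNA or RNA."""
    return SS_HYDROXYL[kind] + SS_OFFSET_FROM_HYDROXYL[kind][five_prime]


for kind in ("DNA", "RNA"):
    for chem in ("hydroxyl", "phosphate", "triphosphate"):
        print(f"ss{kind} 5'-{chem}: {ss_end_delta(kind, chem):+.3f} Da")
```

Resolved this way, ssDNA gets −67.497, +12.483, and +172.443 Da, and ssRNA gets −72.100, +7.880, and +87.860 Da for hydroxyl, phosphate, and triphosphate respectively.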
Why end corrections?
When you build a strand, phosphodiester bonds eliminate atoms (e.g., water) compared to a simple sum of free nucleotides. Ends carry specific groups (5′-OH, 5′-P, 5′-PPP) that shift the net mass slightly. These tuned deltas capture that effect.
Circular ligation (closing a circle) — why water is removed
When a linear polynucleotide becomes circular, the 5′ phosphate reacts with the 3′ hydroxyl to form a final phosphodiester bond. Forming that bond eliminates one water molecule (H₂O = 18.01528 Da) per closed strand:
- Circular ss: subtract 18.01528 Da
- Circular ds: subtract 2 × 18.01528 = 36.03056 Da
Why force 5′ phosphate for circular molecules?
In practice, the chemical step that closes the backbone is a phosphodiester formation; conceptually, the 5′ end behaves as phosphate. Most calculators therefore treat circular DNA/RNA as “5′ phosphate” and apply the H₂O loss at ligation.
3) How we calculate DNA/RNA MW
Let N be the sequence length (bases for ss, base pairs for ds), and let #X denote the number of occurrences of base X in the sequence.
A) Single-stranded DNA (ssDNA)
- Sum base masses:
S = (#A×313.209 + #T×304.197 + #C×289.184 + #G×329.205)
- Add end correction (per strand):
MW = S + Δ_end(5′ chemistry)
- If circular:
MW_circular = MW − 18.01528
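The three ssDNA steps can be sketched in a few lines of Python. This is an illustration of the formula as written, not the tool's source code: the function name `ssdna_mw` is hypothetical, and rejecting characters other than A, T, C, and G is an assumption, since the text does not say how ambiguous bases are handled.

```python
from collections import Counter

# Constants quoted in this document (Da).
DNA_BASE_MASS = {"A": 313.209, "T": 304.197, "C": 289.184, "G": 329.205}
SS_DNA_END_DELTA = {
    "hydroxyl": -67.497,
    "phosphate": -67.497 + 79.98,
    "triphosphate": -67.497 + 239.94,
}
WATER = 18.01528


def ssdna_mw(seq: str, five_prime: str = "hydroxyl", circular: bool = False) -> float:
    """MW = S + end delta, minus one water if the strand is circular."""
    counts = Counter(seq.upper())
    unknown = set(counts) - set(DNA_BASE_MASS)
    if unknown:
        # Assumption: only unambiguous bases are supported in this sketch.
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    s = sum(DNA_BASE_MASS[base] * n for base, n in counts.items())
    mw = s + SS_DNA_END_DELTA[five_prime]
    if circular:
        mw -= WATER
    return mw


print(round(ssdna_mw("ATGC"), 3))                              # 1168.298
print(round(ssdna_mw("ATGC", "phosphate", circular=True), 3))  # 1230.263
```

For ATGC, S = 1235.795 Da. The linear 5′-hydroxyl strand is then 1235.795 − 67.497 = 1168.298 Da, and the circular case, called here with 5′ phosphate as the circular convention described earlier suggests, is 1235.795 + 12.483 − 18.01528 ≈ 1230.263 Da.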
B) Double-stranded DNA (dsDNA)
Using the bp-average model for speed and reproducibility:
- Base-pair core:
core = N_bp × 618.004812
- Add ds end correction (total):
MW = core + Δ_end_ds(5′ chemistry)
- If circular:
MW_circular = MW − 36.03056
This matches vendor behavior closely and is much simpler than summing both strands explicitly for large sequences.
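A minimal sketch of the bp-average model follows, using the dsDNA constants listed earlier. The function name `dsdna_mw` is hypothetical, and only the length in base pairs is needed because composition does not enter this model.

```python
# Constants quoted in this document (Da).
DS_DNA_BP_AVG = 618.004812
DS_DNA_END_DELTA = {"hydroxyl": -123.92, "phosphate": 36.04, "triphosphate": 355.96}
WATER = 18.01528


def dsdna_mw(n_bp: int, five_prime: str = "hydroxyl", circular: bool = False) -> float:
    """MW = N_bp x bp average + total end delta, minus two waters if circular."""
    if n_bp <= 0:
        raise ValueError("n_bp must be positive")
    mw = n_bp * DS_DNA_BP_AVG + DS_DNA_END_DELTA[five_prime]
    if circular:
        mw -= 2 * WATER  # one water per closed strand
    return mw


print(round(dsdna_mw(1000), 3))                              # 617880.892
print(round(dsdna_mw(1000, "phosphate", circular=True), 3))  # 618004.821
```

With 5′ phosphate, the +36.04 Da end delta and the 36.03056 Da lost on closing both strands nearly cancel, so a circular dsDNA comes out within about 0.01 Da of N_bp × 618.004812.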
C) Single-stranded RNA (ssRNA)
- Sum base masses:
S = (#A×329.21 + #U×306.17 + #C×305.18 + #G×345.21)
- Add end correction (per strand):
MW = S + Δ_end(5′ chemistry)
- If circular:
MW_circular = MW − 18.01528
D) Double-stranded RNA (dsRNA)
- Base-pair core:
core = N_bp × 644.574228
- End correction: none is applied in this model (the bp average is taken to already capture the bonding pattern for dsRNA).
- If circular:
MW_circular = core − 36.03056
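The two RNA cases differ mainly in whether an end correction is applied, which the sketch below puts side by side. Function names are hypothetical, and rejecting characters other than A, U, C, and G is an assumption of this sketch rather than documented behavior.

```python
from collections import Counter

# Constants quoted in this document (Da).
RNA_BASE_MASS = {"A": 329.21, "U": 306.17, "C": 305.18, "G": 345.21}
SS_RNA_END_DELTA = {
    "hydroxyl": -72.10,
    "phosphate": -72.10 + 79.98,
    "triphosphate": -72.10 + 159.96,
}
DS_RNA_BP_AVG = 644.574228
WATER = 18.01528


def ssrna_mw(seq: str, five_prime: str = "hydroxyl", circular: bool = False) -> float:
    """Composition-based sum plus a per-strand end delta."""
    counts = Counter(seq.upper())
    unknown = set(counts) - set(RNA_BASE_MASS)
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    mw = sum(RNA_BASE_MASS[b] * n for b, n in counts.items())
    mw += SS_RNA_END_DELTA[five_prime]
    return mw - WATER if circular else mw


def dsrna_mw(n_bp: int, circular: bool = False) -> float:
    """bp-average core with no end delta; two waters removed if circular."""
    core = n_bp * DS_RNA_BP_AVG
    return core - 2 * WATER if circular else core


print(round(ssrna_mw("AUGC"), 3))               # 1213.67
print(round(dsrna_mw(100, circular=True), 3))   # 64421.392
```

Note that `dsrna_mw` takes no 5′ chemistry argument at all, mirroring the statement that the dsRNA model applies no end correction.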
4) Proteins: how we calculate MW and extra properties
We use Biopython’s ProteinAnalysis (ProtParam):
- Molecular weight:
ProteinAnalysis(seq).molecular_weight() returns the average mass of the intact chain — including terminal H₂O (i.e., peptide with N-terminus H and C-terminus OH).
- Composition: counts and percent per residue (we output % = fraction×100, rounded to 2 decimals).
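The composition output can be illustrated without Biopython. The sketch below is a standard-library stand-in that reproduces the "% = fraction×100, rounded to 2 decimals" rule; it is not Biopython's implementation, and the function name `residue_composition` is hypothetical.

```python
from collections import Counter


def residue_composition(seq: str) -> dict:
    """Map each residue to (count, percent), percent rounded to 2 decimals."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    total = len(seq)
    return {
        aa: (n, round(100 * n / total, 2))
        for aa, n in sorted(counts.items())
    }


for aa, (n, pct) in residue_composition("MKKLA").items():
    print(aa, n, pct)
```

For the five-residue example MKKLA, lysine appears twice and so accounts for 40.0 % of the sequence, while each of the other three residues accounts for 20.0 %.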
5) Why results differ slightly between tools
- Mass model: “average” vs “monoisotopic” vs per-residue kDa tables.
- End conventions: whether you account for terminal H₂O (proteins) or 5′ chemistry (nucleic acids).
- Circularization: whether you subtract H₂O per closed strand.
- Vendor tuning: many web calculators apply small constants (like the ones we use) to align with their wet-lab assumptions.
Our implementation picks constants and end deltas that match typical vendor calculators within a few Daltons across common cases.