Model Specification¶
Specify a physical model¶
NUPACK 4 analysis and design jobs are run based on a physical model created using the Model
class:
from nupack import Model

model1 = Model(material='rna', ensemble='stacking', celsius=37,
               sodium=1.0, magnesium=0.0)
Any unspecified properties take on their default values (which happen to be the ones specified for model1
above).
Model options¶
The valid options for each property are described below.
Material¶
NUPACK 4 algorithms use the following temperature-dependent RNA and DNA free energy parameter sets, specified by the keyword material (default: material='rna'):
- rna06 (shorthand: rna): Based on [Mathews99] and [Lu06] with additional parameters [Xia98,Zuker03] including coaxial stacking [Mathews99,Turner10] and dangle stacking [Serra95,Zuker03,Turner10] in 1 M Na^+.
- rna95: Based on [Serra95] with additional parameters [Zuker03] including coaxial stacking [Mathews99,Turner10] and dangle stacking [Serra95,Zuker03,Turner10] in 1 M Na^+.
- dna04 (shorthand: dna): Based on [Santalucia98] and [Santalucia04] with additional parameters [Zuker03] including coaxial stacking [Peyret00] and dangle stacking [Bommarito00,Zuker03] in user-specified concentrations of Na^+ and Mg^{++} [Santalucia98,Peyret00,Santalucia04].
- custom-parameters: Custom parameters provided in a JSON file (e.g., custom-parameters.json) using the same format as the provided parameter files. Provide \Delta G_{37}(\mathrm{loop}) and \Delta H(\mathrm{loop}) values to allow calculations at different temperatures, or only \Delta G(\mathrm{loop}) values to allow calculations at one temperature. Place the JSON file in the same directory as the default parameter files (specify material='custom-parameters') or specify the full path to the file (material='path/to/my/custom-parameters.json').
Free energies are expressed in kcal/mol. Base pairs are either Watson-Crick pairs (G\cdot C and A\cdot U for RNA; G\cdot C and A\cdot T for DNA) or wobble pairs (G\cdot U for RNA). Note that for DNA, G and T form a mismatch and not a wobble pair [Santalucia04].
DNA/RNA hybrids are not allowed.
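The reason a custom parameter file needs both \Delta G_{37}(\mathrm{loop}) and \Delta H(\mathrm{loop}) to support other temperatures can be illustrated with a minimal sketch. It assumes the common relation \Delta G(T) = \Delta H - T \Delta S with temperature-independent \Delta H and \Delta S; this page does not state the formula NUPACK uses, and the function name and numerical values below are hypothetical.

```python
T37_KELVIN = 310.15  # 37 degrees Celsius expressed in kelvin


def loop_free_energy_at(celsius, dG37, dH):
    """Extrapolate a loop free energy (kcal/mol) from 37 C to another temperature.

    Assumes dH and dS are independent of temperature (an assumption of this
    sketch, not a statement about the NUPACK implementation).
    """
    kelvin = celsius + 273.15
    dS = (dH - dG37) / T37_KELVIN  # entropy change in kcal/(mol K)
    return dH - kelvin * dS


# Hypothetical values for a single loop, for illustration only
dG37, dH = -2.0, -10.0
print(loop_free_energy_at(37, dG37, dH))  # recovers dG37 up to rounding
print(loop_free_energy_at(50, dG37, dH))  # less favorable at higher temperature
```

With only a single \Delta G(\mathrm{loop}) value there is no way to recover \Delta S, which is why such a file supports calculations at one temperature only.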
Stacking¶
NUPACK 4 algorithms perform calculations on the following complex ensembles, specified by the keyword ensemble (default: ensemble='stacking'):
- stacking: Complex ensemble with coaxial and dangle stacking (ensemble \overline\Gamma^\shortparallel(\phi)).
- dangle-stacking: Complex ensemble with dangle stacking.
- coaxial-stacking: Complex ensemble with coaxial stacking.
- nostacking: Complex ensemble without coaxial and dangle stacking (ensemble \overline\Gamma(\phi)).
Temperature¶
- celsius: Temperature is specified in ^\circ C using the keyword celsius (default: celsius=37).
- kelvin: Alternatively, the temperature can be specified in K using the keyword kelvin.
Salt¶
The default salt conditions for RNA and DNA parameter sets are [\mathrm{Na}^+] = 1 {\rm M}; these are the only salt conditions for RNA. Salt corrections are available for DNA parameters to permit calculations in user-specified sodium, potassium, ammonium, and magnesium ion concentrations.
- sodium: Based on [Santalucia98,SantaLucia04], the sum of the concentrations of (monovalent) sodium, potassium, and ammonium ions, [{\rm Na}^+] + [\mathrm{K}^+] + [\mathrm{NH}_4^+], is specified in units of molar (default: 1.0, range: [0.05,1.1]) using the keyword sodium.
- magnesium: Based on [Peyret00,Koehler05], the concentration of (divalent) magnesium ions, [{\rm Mg}^{++}], is specified in units of molar (default: 0.0, range: [0.0,0.2]) using the keyword magnesium.
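The ranges above can be checked before a model is built. The following is a minimal sketch, not part of NUPACK: the helper name salt_keywords and its arguments are hypothetical, and it only encodes the two rules stated above (monovalent ions are summed into sodium, and both values must fall within the listed ranges).

```python
SODIUM_RANGE = (0.05, 1.1)    # molar, range listed for the sodium keyword
MAGNESIUM_RANGE = (0.0, 0.2)  # molar, range listed for the magnesium keyword


def salt_keywords(na=0.0, k=0.0, nh4=0.0, mg=0.0):
    """Return values for the sodium and magnesium keywords (hypothetical helper).

    All concentrations are in molar. Monovalent ions are summed into `sodium`.
    Raises ValueError if a value falls outside the documented range.
    """
    sodium = na + k + nh4
    if not SODIUM_RANGE[0] <= sodium <= SODIUM_RANGE[1]:
        raise ValueError(f'sodium={sodium} M is outside {SODIUM_RANGE}')
    if not MAGNESIUM_RANGE[0] <= mg <= MAGNESIUM_RANGE[1]:
        raise ValueError(f'magnesium={mg} M is outside {MAGNESIUM_RANGE}')
    return {'sodium': sodium, 'magnesium': mg}


# 0.1 M Na+ plus 0.05 M K+ with 0.01 M Mg++
print(salt_keywords(na=0.1, k=0.05, mg=0.01))
```

The returned dictionary could then be unpacked into the constructor, for example Model(material='dna', **salt_keywords(na=0.1, k=0.05, mg=0.01)).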
Examples
- Define a model for DNA calculations at 23 ^\circ C in [{\rm Na}^{+}] = 0.5 M and [{\rm Mg}^{++}] = 0.01 M:
model2 = Model(material='dna', celsius=23, sodium=0.5, magnesium=0.01)
ensemble is unspecified, so it defaults to ensemble='stacking'.
- Define a model using custom parameters at 45 ^\circ C without coaxial and dangle stacking:
model3 = Model(material='path/to/my/custom-parameters.json',
               ensemble='nostacking', celsius=45)
Historical options¶
For backwards compatibility with NUPACK 3, the following historical complex ensembles without coaxial stacking and with approximate dangle stacking are supported:
- none-nupack3: No dangle stacking and no coaxial stacking (dangles none option for NUPACK 3).
- some-nupack3: Some dangle stacking and no coaxial stacking (dangles some option for NUPACK 3). A dangle energy is incorporated for each unpaired base flanking a duplex (a base flanking two duplexes contributes only the minimum of the two possible dangle energies).
- all-nupack3: All dangle stacking and no coaxial stacking (dangles all option for NUPACK 3). A dangle energy is incorporated for each unpaired base flanking a duplex (a base flanking two duplexes contributes both possible dangle energies).
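The difference between the three historical ensembles for a single unpaired base flanked by two duplexes can be sketched as follows. This is an illustration of the rules stated above only; the function name is hypothetical and the dangle energies are made-up inputs, not values from any parameter set.

```python
def shared_dangle_energy(dangle_a, dangle_b, ensemble):
    """Dangle contribution (kcal/mol) of one unpaired base flanked by two duplexes.

    dangle_a and dangle_b are the two possible dangle energies for that base
    (hypothetical inputs for this sketch).
    """
    if ensemble == 'none-nupack3':
        return 0.0                        # no dangle stacking at all
    if ensemble == 'some-nupack3':
        return min(dangle_a, dangle_b)    # only the minimum of the two
    if ensemble == 'all-nupack3':
        return dangle_a + dangle_b        # both energies contribute
    raise ValueError(f'unknown historical ensemble: {ensemble!r}')


for name in ('none-nupack3', 'some-nupack3', 'all-nupack3'):
    print(name, shared_dangle_energy(-0.3, -0.8, name))
```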
For these historical ensembles, base pairs are either Watson-Crick pairs (G\cdot C and A\cdot U for RNA; G\cdot C and A\cdot T for DNA) or wobble pairs (G\cdot U for RNA; G\cdot T for DNA). Note that for the historical ensembles, G\cdot T is classified as a DNA wobble pair and not as a mismatch. The historical ensembles prohibit a wobble pair (G\cdot U or G\cdot T) as a terminal base pair in an exterior loop or a multiloop. As a result, an attempt to evaluate a free energy for a sequence \phi and secondary structure s that place a wobble pair as a terminal base pair in an exterior loop or multiloop will return \overline{\Delta G}(\phi,s) = \Delta G(\phi,s) = \infty. These historical ensembles can be used for calculations in combination with the following historical DNA and RNA parameter sets:
- rna95-nupack3: Same as rna95 except that terminal mismatch free energies in exterior loops and multiloops are replaced by two dangle stacking free energies.
- dna04-nupack3: Same as dna04 except that G\cdot T is treated as a wobble pair (analogous to a G\cdot U RNA wobble pair) instead of classifying G and T as a mismatch. Note that while terminal mismatch free energies in exterior loops and multiloops are replaced by two dangle stacking free energies, this is the same treatment as in dna04, as terminal mismatch parameters are not public for DNA [Santalucia04].
- rna99-nupack3: Parameters from [Mathews99] with terminal mismatch free energies in exterior loops and multiloops replaced by two dangle stacking free energies. Parameters are provided only for 37 ^\circ C.
Compute loop free energy¶
The loop_energy method operates on a Model object to calculate the loop free energy in kcal/mol. The loop sequence is specified with keyword loop and the loop structure is specified with keyword structure. For example:
my_model = Model(material='rna', ensemble='stacking')

# Calculate the free energy of an unstructured strand
dGloop2 = my_model.loop_energy(loop='AAUU', structure='....')
print(dGloop2)
# --> 0.0

# Calculate the free energy of a hairpin loop
dGloop3 = my_model.loop_energy(loop='AACCCUU', structure='(.....)')
print(dGloop3)
# --> 5.15

# Calculate the free energy of an exterior loop
dGloop4 = my_model.loop_energy(loop='AA+UU', structure='((+))')
print(dGloop4)
# --> -0.9

# Calculate the free energy of a multiloop
dGloop5 = my_model.loop_energy(loop='AAU+ACU+AGU', structure='(.(+).(+).)')
print(dGloop5)
# --> 9.355
Compute stacking state free energies¶
The stack_energies method operates on a Model object to calculate the stacking state free energies for the subensemble of stacking states in a single loop. The loop sequence is specified with keyword loop and the loop structure is specified with keyword structure. The algorithm returns a list of stacking states and the free energy for each in kcal/mol.
For a loop defined as a list of N snippets, a stacking state is specified as a string composed of one letter per snippet. For each snippet, the returned letter is:
- 's' if the snippet contains only 2 nucleotides, each base-paired to a nucleotide in the adjacent snippet, with the two base pairs coaxially stacked on each other
- 'b' if both the 5' and 3' unpaired nucleotides are dangle stacking on adjacent base pairs
- '5' if only the 5'-most unpaired base is dangle stacking on its adjacent base pair
- '3' if only the 3'-most unpaired base is dangle stacking on its adjacent base pair
- 'n' if none of the above apply (i.e., the snippet does not have a dangle at either the 5' or 3' end nor does it contain only 2 adjacent nucleotides participating in a coaxial stack)
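The letter code above can be turned into a readable annotation with a small lookup table. This is a minimal sketch: the names STACKING_LETTERS and describe_stacking_state are hypothetical, and the short descriptions paraphrase the definitions in the list above.

```python
# One entry per letter defined above; descriptions are abbreviated paraphrases
STACKING_LETTERS = {
    's': 'coaxial stack',
    'b': "5' and 3' dangles",
    '5': "5' dangle only",
    '3': "3' dangle only",
    'n': 'no stacking',
}


def describe_stacking_state(state):
    """Return (snippet index, description) pairs for a stacking state string.

    A letter outside the documented set raises KeyError.
    """
    return [(index, STACKING_LETTERS[letter])
            for index, letter in enumerate(state)]


for index, description in describe_stacking_state('snn'):
    print(f'snippet {index}: {description}')
```

Because a state string has one letter per snippet, the length of the returned list equals the number of snippets N in the loop.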
For example, the following figures illustrate snippet annotations for coaxial and dangle stacking states in representative multiloops and exterior loops:
For a specified multiloop or exterior loop sequence and structure, the stack_energies method returns a set of stacking state strings, each with a corresponding stacking state free energy (kcal/mol):
# Calculate the dangle stacking state free energies for an exterior loop
my_model.stack_energies(loop='CA+UC', structure='.(+).')
# --> {'35': -0.15, '3n': 0.15, 'n5': 0.35, 'nn': 0.45}
# Calculate the coaxial stacking state free energies for an exterior loop
my_model.stack_energies(loop='AA+U+U', structure='((+)+)')
# --> {'nnn': 0.9, 'snn': 0.0}
# Calculate the coaxial stacking state free energies for a multiloop
my_model.stack_energies(loop='AU+AU+AU', structure='((+)(+))')
# --> {'nnn': 11.9725, 'nns': 10.8725, 'nsn': 10.8725, 'snn': 10.8725}
For loops that are not multiloops or exterior loops, the loop free energy is returned with a string indicating that there is no stacking state. For example, for a hairpin loop:
my_model.stack_energies(loop='AAAAU', structure='(...)')
# --> {'n': 5.85}
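A dictionary of stacking state free energies can be combined into equilibrium probabilities and a single subensemble free energy using Boltzmann weights. The sketch below is standard statistical mechanics applied to the returned dictionary. Whether the result coincides with the value reported by loop_energy is an assumption of this sketch and is not stated on this page. The function name, the gas constant value, and the default temperature are choices made for this illustration.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol K), value assumed for this sketch


def stacking_state_summary(energies, celsius=37.0):
    """Boltzmann-weight a dict of stacking state free energies (kcal/mol).

    Returns (subensemble free energy, {state: equilibrium probability}).
    """
    kT = GAS_CONSTANT * (celsius + 273.15)
    reference = min(energies.values())  # shift energies for numerical stability
    weights = {state: math.exp(-(dG - reference) / kT)
               for state, dG in energies.items()}
    partition = sum(weights.values())
    probabilities = {state: w / partition for state, w in weights.items()}
    free_energy = reference - kT * math.log(partition)
    return free_energy, probabilities


# Stacking state energies from the exterior loop example above
energies = {'nnn': 0.9, 'snn': 0.0}
free_energy, probabilities = stacking_state_summary(energies)
print(free_energy)
print(probabilities)
```

The subensemble free energy is never higher than the lowest individual stacking state free energy, and for a loop with a single state (such as the hairpin above) it equals that state's free energy.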
Mathews99 | Mathews D.H., Sabina J., Zuker M., Turner D.H.: Expanded Sequence Dependence of Thermodynamic Parameters Improves Prediction of RNA Secondary Structure. J. Mol. Biol. 288 (1999) |
Lu06 | Lu Z.J., Turner D.H., Mathews D.H.: A Set of Nearest Neighbor Parameters for Predicting the Enthalpy Change of RNA Secondary Structure Formation. Nucleic Acids Res. 34 (2006) |
Xia98 | Xia T., SantaLucia J., Burkard M., Kierzek R., Schroeder S., Jiao X., Cox C., Turner D.: Thermodynamic Parameters for an Expanded Nearest-Neighbor Model for Formation of RNA Duplexes with Watson-Crick Base Pairs. Biochemistry 37 (1998) |
Zuker03 | Zuker M.: Mfold Web Server for Nucleic Acid Folding and Hybridization Prediction. Nucleic Acids Res. 31 (2003) |
Turner10 | Turner D.H., Mathews D.H.: NNDB: The Nearest Neighbor Parameter Database for Predicting Stability of Nucleic Acid Secondary Structure. Nucleic Acids Res. 38 (2010) |
Serra95 | Serra M.J., Turner D.H.: Predicting Thermodynamic Properties of RNA. Methods Enzymol. 259 (1995) |
Santalucia98 | SantaLucia J.: A Unified View of Polymer, Dumbbell, and Oligonucleotide DNA Nearest-Neighbor Thermodynamics. Proc. Natl. Acad. Sci. USA 95 (1998) |
Santalucia04 | SantaLucia J., Hicks D.: The Thermodynamics of DNA Structural Motifs. Annu. Rev. Biophys. Biomol. Struct. 33 (2004) |
Peyret00 | Peyret N.: Prediction of Nucleic Acid Hybridization: Parameters and Algorithms. (2000) |
Bommarito00 | Bommarito S., Peyret N., SantaLucia J.: Thermodynamic Parameters for DNA Sequences with Dangling Ends. Nucleic Acids Res. 28 (2000) |
SantaLucia04 | SantaLucia J., Hicks D.: The Thermodynamics of DNA Structural Motifs. Annu. Rev. Biophys. Biomol. Struct. 33 (2004) |
Koehler05 | Koehler R.T., Peyret N.: Thermodynamic Properties of DNA Sequences: Characteristic Values for the Human Genome. Bioinformatics 21 (2005) |