Stability Checker
Heuristic stability checker for molecules in retrosynthetic pathways.
When an LLM proposes reactant molecules, some of them may be chemically unstable — strained small rings, anti-aromatic systems, reactive intermediates like carbocations or carbenes, etc. This module catches those problems by inspecting the molecular graph with RDKit descriptors and SMARTS pattern matching. No trained model is required.
from deepretro.algorithms import check_molecule_stability
result = check_molecule_stability("c1ccccc1")
print(result["assessment"]) # "Likely stable"
print(result["stability_score"]) # 100
Checks performed
Strained small rings: 3- or 4-membered rings that contain a heteroatom (N, O, S …) are flagged. Aziridines and azetidines are considerably more prone to ring opening than plain cyclopropane / cyclobutane, although their ring-strain energies are similar.
Anti-aromatic motifs: rings whose π-electron count is a multiple of 4 (Hückel 4n rule) are thermodynamically destabilised. Known patterns (cyclobutadiene, cyclooctatetraene, pentalene) are matched via SMARTS, and π electrons are also counted directly for rings of size 4, 8, 12, or 16.
Fused small rings: two rings of ≤ 4 atoms sharing atoms create extreme angle strain (e.g. bicyclo[1.1.0]butane). These systems can be explosive.
Large heterocycles: rings with ≥ 7 atoms containing heteroatoms tend to be conformationally floppy and often unstable. Very large heterocycles (> 10 atoms, ≥ 3 heteroatoms) get an extra penalty.
Carbocations: positively charged carbon centres are reactive intermediates, not isolable species. The checker distinguishes:
    sp2 ([C+;X3]): penalised unless stabilised by an adjacent aromatic ring, allylic double bond, or benzylic position.
    sp ([C+;X2]): always penalised heavily.
    Primary vs secondary: primary is worse because fewer alkyl groups donate electron density.
    Adjacent to EWG: a carbocation next to F, Cl, Br, I or charged N/S/O is the worst case.
Carbenes: a neutral carbon with two bonds and no hydrogens ([C;X2;H0;+0]) is extremely reactive. Extra penalties if the carbene sits inside a 3- or 4-membered ring or is adjacent to an electron-withdrawing group.
Fused cyclopentane + small hetero ring: a 5-membered all-carbon ring sharing atoms with a 3- or 4-membered hetero ring creates significant ring strain.
Physicochemical outliers: extreme logP values with abs(logP) > 10 or too many rotatable bonds (> 15) each incur a small penalty.
Aromatic bonus: aromatic rings stabilise a molecule, so each one adds a small bonus (capped at +15 total).
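The direct π-electron count in the anti-aromatic check comes down to simple arithmetic. The following is a minimal sketch, assuming the π-electron count of the ring has already been obtained elsewhere; the helper name `looks_antiaromatic` and its signature are hypothetical and not part of the module.

```python
# Hypothetical helper, not part of deepretro: illustrates the 4n rule only.
CHECKED_RING_SIZES = {4, 8, 12, 16}  # ring sizes named in the check description


def looks_antiaromatic(ring_size: int, pi_electrons: int) -> bool:
    """Return True when a ring of a checked size holds 4n pi electrons (n >= 1)."""
    if ring_size not in CHECKED_RING_SIZES:
        return False
    return pi_electrons > 0 and pi_electrons % 4 == 0


print(looks_antiaromatic(4, 4))  # cyclobutadiene-like count: 4 pi electrons
print(looks_antiaromatic(6, 6))  # benzene-like count: ring size is not checked
print(looks_antiaromatic(8, 6))  # 8-membered ring, but 6 is not a multiple of 4
```

How the real module counts π electrons per ring is not shown in this documentation, so that step is deliberately left out of the sketch.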
Scoring
After all checks, the score is clamped to 0–100:
≥ 80 → "Likely stable"
50–79 → "Moderately stable"
< 50 → "Potentially unstable"
Entry points
check_molecule_stability — analyse a single SMILES and return a 0–100 stability score with an issue list.
is_valid_smiles — quick check that a SMILES string parses.
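The clamping and label bands from the Scoring section can be expressed in a few lines. This is a sketch of the documented thresholds only; the function name `assess` is hypothetical, and the module's actual implementation may be organised differently.

```python
from __future__ import annotations


def assess(raw_score: int) -> tuple[int, str]:
    """Clamp a raw score to 0-100 and map it to the documented label bands."""
    score = max(0, min(100, raw_score))
    if score >= 80:
        label = "Likely stable"
    elif score >= 50:
        label = "Moderately stable"
    else:
        label = "Potentially unstable"
    return score, label


print(assess(115))  # a score pushed above 100 by bonuses is clamped
print(assess(79))
print(assess(-20))  # stacked penalties cannot push the score below 0
```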
- deepretro.algorithms.stability_checker.is_valid_smiles(smiles)[source]
Check whether a SMILES string can be parsed by RDKit.
- Parameters:
smiles (str) – SMILES string to validate.
- Returns:
True if RDKit can build a molecule from smiles.
- Return type:
bool
Examples
>>> is_valid_smiles("CCO")
True
>>> is_valid_smiles("not_a_molecule")
False
- deepretro.algorithms.stability_checker.check_molecule_stability(smiles)[source]
Assess the stability of a molecule from its SMILES string.
Parses the molecule with RDKit and runs nine heuristic checks (see the module docstring for a full description of each one): strained small rings, anti-aromatic motifs, fused small rings, large heterocycles, carbocations, carbenes, fused cyclopentane systems, physicochemical outliers, and an aromatic-ring bonus. Each problem subtracts from a base score of 100; the final score is clamped to 0–100.
- Parameters:
smiles (str) – SMILES string of the molecule to assess.
- Returns:
Keys returned:
valid_structure (bool): whether RDKit could parse the SMILES at all.
stability_score (int, 0–100): overall stability rating.
issues (list[str]): plain-English descriptions of every problem found (e.g. "Three-membered heterocycle (potentially unstable)").
metrics (dict): molecular weight, logP, H-bond donors / acceptors, rotatable bonds.
ring_data (dict): ring counts broken down by type (aliphatic / aromatic, carbocycle / heterocycle, bridgehead atoms, etc.).
atom_data (dict): total atoms, bonds, heavy atoms, aromatic vs aliphatic counts.
assessment (str): one of "Likely stable", "Moderately stable", or "Potentially unstable".
- Return type:
dict[str, Any]
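One way the returned dictionary might be consumed is to filter a list of proposed reactants by score. The sketch below assumes only the `valid_structure` and `stability_score` keys described above. `filter_stable`, the `min_score` threshold and `fake_checker` are hypothetical names introduced for illustration, and the scores in `fake_checker` are made up so that the example runs without deepretro or RDKit installed.

```python
from typing import Any, Callable, Dict, List


def filter_stable(
    smiles_list: List[str],
    checker: Callable[[str], Dict[str, Any]],
    min_score: int = 50,
) -> List[str]:
    """Keep SMILES that parse and score at least `min_score`."""
    kept = []
    for smi in smiles_list:
        result = checker(smi)
        if result["valid_structure"] and result["stability_score"] >= min_score:
            kept.append(smi)
    return kept


# Stand-in for check_molecule_stability; the scores are invented, not module output.
def fake_checker(smiles: str) -> Dict[str, Any]:
    made_up_scores = {"c1ccccc1": 100, "[CH2+]C": 20}
    if smiles not in made_up_scores:
        return {"valid_structure": False, "stability_score": 0, "issues": []}
    return {
        "valid_structure": True,
        "stability_score": made_up_scores[smiles],
        "issues": [],
    }


print(filter_stable(["c1ccccc1", "[CH2+]C", "not_a_molecule"], fake_checker))
```

In real use, `check_molecule_stability` would presumably be passed as `checker` in place of the stand-in.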
Examples
>>> res = check_molecule_stability("c1ccccc1")  # benzene
>>> res["assessment"]
'Likely stable'
>>> res = check_molecule_stability("[CH2+]C")  # ethyl cation
>>> res["assessment"]
'Potentially unstable'
- deepretro.algorithms.stability_checker.check_carbocations(mol, results, score)[source]
Detect carbocation intermediates and apply score penalties.
Carbocations are positively charged carbon centres, i.e. reactive intermediates that cannot be isolated. The function uses SMARTS pattern matching to find them and then checks whether the charge is stabilised by resonance (allylic or benzylic position) or by neighbouring aromatic atoms. Unstabilised and primary carbocations get the heaviest penalties; stabilised ones get a small bonus.
- Parameters:
mol (Chem.Mol) – RDKit molecule object already parsed from SMILES.
results (dict[str, Any]) – Accumulator; detected issues are appended to results["issues"].
score (int) – Running stability score to adjust.
- Returns:
Updated stability score after carbocation penalties / bonuses.
- Return type:
int
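The `(mol, results, score)` signature follows an accumulator convention: each check appends to `results["issues"]` and returns the adjusted score, so checks can be chained. Here is a minimal sketch of that pattern, with two toy checks that operate on a plain string instead of an RDKit molecule. The check names, the rules and the penalty values are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Check = Callable[[Any, Dict[str, Any], int], int]


def toy_charge_check(mol: Any, results: Dict[str, Any], score: int) -> int:
    # Toy rule on a SMILES string; the real checks use RDKit SMARTS matching.
    if "+" in mol:
        results["issues"].append("Contains a positive charge (toy rule)")
        score -= 30  # made-up penalty
    return score


def toy_length_check(mol: Any, results: Dict[str, Any], score: int) -> int:
    if len(mol) > 20:
        results["issues"].append("Long SMILES (toy rule)")
        score -= 5  # made-up penalty
    return score


def run_checks(
    mol: Any, checks: List[Check], base_score: int = 100
) -> Tuple[int, Dict[str, Any]]:
    """Thread one running score and one results dict through every check."""
    results: Dict[str, Any] = {"issues": []}
    score = base_score
    for check in checks:
        score = check(mol, results, score)
    return max(0, min(100, score)), results


score, results = run_checks("[CH2+]C", [toy_charge_check, toy_length_check])
print(score, results["issues"])
```

Passing the score in and out explicitly keeps each check free of shared state, which is why each one can be exercised on its own.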
Examples
>>> from rdkit import Chem
>>> from deepretro.algorithms import check_carbocations
>>> mol = Chem.MolFromSmiles("[CH2+]C")  # ethyl cation
>>> results = {"issues": []}
>>> new_score = check_carbocations(mol, results, 100)
>>> "Contains primary carbocation (highly unstable)" in results["issues"]
True
>>> new_score < 100
True
- deepretro.algorithms.stability_checker.check_carbenes(mol, results, score)[source]
Detect carbene intermediates and apply score penalties.
The checker treats a neutral carbon with only two bonds and no hydrogens as a carbene, i.e. an extremely reactive species that usually exists only as a fleeting intermediate. Additional penalties stack if the carbene is inside a strained 3- or 4-membered ring, or sits next to an electron-withdrawing group (halogens, charged heteroatoms).
- Parameters:
mol (Chem.Mol) – RDKit molecule object already parsed from SMILES.
results (dict[str, Any]) – Accumulator; detected issues are appended to results["issues"].
score (int) – Running stability score to adjust.
- Returns:
Updated stability score after carbene penalties.
- Return type:
int
Examples
>>> from rdkit import Chem
>>> from deepretro.algorithms import check_carbenes
>>> mol = Chem.MolFromSmiles("[C]1CC1")  # carbene in 3-membered ring
>>> results = {"issues": []}
>>> new_score = check_carbenes(mol, results, 100)
>>> any("carbene" in i for i in results["issues"])
True
>>> new_score < 100
True
- deepretro.algorithms.stability_checker.check_fused_cyclopentane(mol, atom_rings, results, score)[source]
Detect 5-membered carbon rings fused with small hetero rings.
A cyclopentane ring sharing atoms with a 3- or 4-membered ring that contains a heteroatom (N, O, S …) creates significant angle strain, for example 1,2-epoxycyclopentane. Each such fusion incurs a heavy penalty (-40).
- Parameters:
mol (Chem.Mol) – RDKit molecule object already parsed from SMILES.
atom_rings (tuple) – Ring atom-index tuples from mol.GetRingInfo().AtomRings().
results (dict[str, Any]) – Accumulator; detected issues are appended to results["issues"].
score (int) – Running stability score to adjust.
- Returns:
Updated stability score after fused-ring penalties.
- Return type:
int
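The fusion test itself can be phrased as set intersection over the ring atom-index tuples. The sketch below runs without RDKit and assumes the caller supplies the indices of all non-carbon atoms. `find_fused_cyclopentanes` is a hypothetical helper, and the ring tuples are written by hand for 1,2-epoxycyclopentane (`C1CC2OC2C1`, oxygen at index 3), not taken from RDKit output.

```python
from typing import List, Sequence, Set, Tuple

Ring = Tuple[int, ...]


def find_fused_cyclopentanes(
    atom_rings: Sequence[Ring], hetero_atoms: Set[int]
) -> List[Tuple[Ring, Ring]]:
    """Pair each all-carbon 5-ring with every 3- or 4-membered hetero ring sharing atoms with it."""
    five_rings = [
        r for r in atom_rings if len(r) == 5 and not (hetero_atoms & set(r))
    ]
    small_hetero = [
        r for r in atom_rings if len(r) in (3, 4) and (hetero_atoms & set(r))
    ]
    return [
        (big, small)
        for big in five_rings
        for small in small_hetero
        if set(big) & set(small)  # at least one shared atom
    ]


# Hand-written rings: carbons 0, 1, 2, 4, 5 form the 5-ring; 2, 3, 4 form the epoxide.
rings = [(0, 1, 2, 4, 5), (2, 3, 4)]
print(find_fused_cyclopentanes(rings, hetero_atoms={3}))
```

Each returned pair would correspond to one fusion and, per the description above, one -40 penalty.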
Examples
>>> from rdkit import Chem
>>> from deepretro.algorithms import check_fused_cyclopentane
>>> mol = Chem.MolFromSmiles("C1CC2OC2C1")  # 1,2-epoxycyclopentane
>>> rings = mol.GetRingInfo().AtomRings()
>>> results = {"issues": []}
>>> new_score = check_fused_cyclopentane(mol, rings, results, 100)
>>> any("strained system" in i for i in results["issues"])
True
>>> new_score < 100
True