deepretro.utils
Utility layer for domain features and template-based retrosynthesis integration.
This module group supports three major workflows:
Feature engineering utilities for ML-ready reaction-step vectors.
AiZynthFinder orchestration helpers for template route generation.
Molecule utilities for SMILES validation, substructure matching, and pathway filtering (deepretro.utils.utils_molecule).
Utility Overview
| Utility | Purpose |
|---|---|
| | Execute AiZynthFinder and return solved flag + route dictionaries. |
| | Same as |
| | Shortcut heuristic for trivial molecules to bypass heavy search. |
AiZynthFinder Integration Notes
run_az and run_az_with_img use environment-configured model paths:
AZ_MODELS_PATH (preferred model-variant path)
AZ_MODEL_CONFIG_PATH (fallback config path)
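The preferred-then-fallback lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's actual resolution logic: the helper name resolve_az_path is hypothetical, and only the two environment variable names come from the text.

```python
import os
from typing import Mapping, Optional


def resolve_az_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the preferred path if set, else the fallback (hypothetical helper)."""
    env = os.environ if env is None else env
    # AZ_MODELS_PATH is described as the preferred model-variant path.
    preferred = env.get("AZ_MODELS_PATH")
    if preferred:
        return preferred
    # AZ_MODEL_CONFIG_PATH is described as the fallback config path.
    fallback = env.get("AZ_MODEL_CONFIG_PATH")
    if fallback:
        return fallback
    # How the real functions react to a missing configuration is not documented
    # here; raising an error is an assumption of this sketch.
    raise RuntimeError("Set AZ_MODELS_PATH or AZ_MODEL_CONFIG_PATH.")
```

Passing a plain dictionary instead of os.environ makes the precedence easy to check without touching the real environment.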
Behavior highlights:
Auto-bypass for trivial/basic molecules via BASIC_MOLECULES and is_basic_molecule.
Caching via the src.cache.cache_results decorator.
Returns route dictionaries with metadata and scores from AiZynthFinder.
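To make the bypass and caching behavior concrete, here is a rough sketch of how the two could combine. Every name in it (EXAMPLE_BASIC_MOLECULES, is_basic_sketch, cache_sketch, run_search_sketch) is a hypothetical stand-in; the real BASIC_MOLECULES contents, the cache_results implementation, and the value returned for a bypassed molecule are not shown in this page and are assumed here.

```python
import functools
from typing import List, Tuple

# Hypothetical stand-in for BASIC_MOLECULES; the real contents are not shown.
EXAMPLE_BASIC_MOLECULES = {"O", "C", "CO", "CCO"}

Result = Tuple[bool, List[dict]]


def is_basic_sketch(smiles: str) -> bool:
    """Toy membership check standing in for is_basic_molecule."""
    return smiles in EXAMPLE_BASIC_MOLECULES


def cache_sketch(func):
    """In-memory memoization keyed on positional arguments (assumed behavior)."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]

    return wrapper


search_calls: List[str] = []  # records which molecules reached the heavy search


@cache_sketch
def run_search_sketch(smiles: str) -> Result:
    if is_basic_sketch(smiles):
        # Trivial molecule: skip the search. Returning (True, []) is an assumption.
        return True, []
    search_calls.append(smiles)
    # Placeholder for the actual AiZynthFinder call.
    return False, []
```

In this sketch a basic molecule never reaches the search, and a repeated query for the same SMILES is served from the cache rather than searched again.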
Example: run template search
from deepretro.utils.az import run_az
solved, routes = run_az("C1CCCCC1", az_model="USPTO")
print(solved, len(routes))
Submodules
API Reference
deepretro.utils.domain_features
Domain feature extraction utilities for reaction-step featurization.
- deepretro.utils.domain_features.extract_domain_features_single(product_smiles, reactants_smiles)
Extract hand-crafted domain features for one product-reactant pair.
Computes atom-count deltas (C, N, O, Cl, Br), bond/ring/aromaticity deltas, molecular-weight deltas, and absolute counts.
- Parameters:
product_smiles (str) – SMILES of the target product.
reactants_smiles (str) – SMILES of the proposed reactants (dot-separated when multiple).
- Returns:
features – 1-D feature vector. Returns a NaN vector on any parsing failure, so invalid rows are distinguishable from real data downstream.
- Return type:
np.ndarray, shape (NUM_DOMAIN_FEATURES,)
Examples
>>> from deepretro.utils import extract_domain_features_single
>>> feats = extract_domain_features_single("CCO", "CC.O")
>>> feats.shape
(15,)
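Because a parsing failure yields a NaN vector, downstream code can separate failed rows from real data before training. The sketch below shows one way to do that with the standard library only; the helper names has_nan and drop_failed_rows are hypothetical and not part of deepretro.

```python
import math
from typing import Iterable, List, Sequence


def has_nan(vector: Iterable[float]) -> bool:
    """True if any entry of the feature vector is NaN."""
    return any(math.isnan(float(x)) for x in vector)


def drop_failed_rows(rows: Sequence[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only feature vectors that contain no NaN entries."""
    return [row for row in rows if not has_nan(row)]
```

The same check also works on NumPy arrays, since each element converts to a Python float. For a 2-D feature matrix, np.isnan(features).any(axis=1) gives an equivalent row mask.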