DeepRetro
deepretro is a chemistry ML utility package for retrosynthesis workflows.
It focuses on robust reaction-step featurization and practical integrations
that can be dropped into DeepChem training pipelines or custom research code.
Overview
The package currently provides:
Reaction-step vectorization using product/reactant fingerprints plus handcrafted chemistry descriptors.
Domain-feature extraction helpers for product/reactant SMILES pairs.
AiZynthFinder wrappers for template-based route search.
Heuristic hallucination detection and scoring for retrosynthetic steps.
ML-based hallucination classification (XGBoost via DeepChem GBDTModel).
Dataset loading with DeepChem DiskDataset sharding and stratified splitting.
Input and Output Conventions
Reaction steps are represented as:
(product_smiles, reactants_smiles)
where reactants_smiles may contain multiple molecules separated by "." (for example, "CC.O").
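The convention above can be illustrated with a minimal sketch in plain Python. The helper names split_reactants and unpack_step are hypothetical and are not part of the deepretro API; they only show how a (product_smiles, reactants_smiles) tuple breaks down into individual molecules.

```python
from typing import List, Tuple

# A reaction step as described above: (product_smiles, reactants_smiles).
ReactionStep = Tuple[str, str]


def split_reactants(reactants_smiles: str) -> List[str]:
    """Split a dot-separated SMILES string into one string per fragment.

    Hypothetical helper for illustration; not part of deepretro.
    """
    # "." is the SMILES disconnection symbol; drop empty pieces that
    # would result from a leading, trailing, or doubled dot.
    return [part for part in reactants_smiles.split(".") if part]


def unpack_step(step: ReactionStep) -> Tuple[str, List[str]]:
    """Return the product SMILES and the list of reactant SMILES."""
    product_smiles, reactants_smiles = step
    return product_smiles, split_reactants(reactants_smiles)


if __name__ == "__main__":
    steps = [("CCO", "CC.O"), ("c1ccccc1", "c1ccccc1.Cl")]
    for step in steps:
        product, reactants = unpack_step(step)
        print(product, reactants)
```

Note that a plain split on "." also separates the ions of a salt such as "[Na+].[Cl-]" into two fragments, so this sketch treats every dot-separated fragment as its own molecule.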
Quickstart
from deepretro import ReactionStepFeaturizer

featurizer = ReactionStepFeaturizer(radius=2, size=2048, use_domain_features=True)
X = featurizer.featurize([
    ("CCO", "CC.O"),
    ("c1ccccc1", "c1ccccc1.Cl"),
])
print(X.shape)  # (2, 4111)
Top-Level API
deepretro — retrosynthesis ML utilities.
Provides DeepChem-compatible featurizers, dataset loaders, algorithms, and model wrappers for reaction-step data.