deepretro.featurizers
Feature engineering helpers for reaction-step ML workflows.
Overview
The deepretro.featurizers package currently exposes a single
DeepChem-compatible featurizer:
ReactionStepFeaturizer for product/reactant reaction-step pairs.
API
Featurizers for reaction-step data.
- class deepretro.featurizers.ReactionStepFeaturizer(*args, **kwargs)
Featurize a reaction step (product + reactants) into a flat numeric vector.
Concatenates three parts:
CircularFingerprint (Morgan/ECFP) for the product: size bits
CircularFingerprint (Morgan/ECFP) for the reactants: size bits
15 hand-crafted domain features (optional)
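The layout described above can be sketched in plain Python. This is a minimal illustration rather than the library's code: the helper name `feature_slices` is hypothetical, and it assumes the three parts are concatenated in the order listed (product fingerprint, reactant fingerprint, domain features).

```python
# Hypothetical helper, not part of deepretro: maps each block of the
# feature vector to an index range, assuming the blocks are concatenated
# in the order listed in the class description.
N_DOMAIN_FEATURES = 15  # count stated in the class description


def feature_slices(size=2048, use_domain_features=True):
    """Return a dict of slices into one feature vector."""
    slices = {
        "product_fp": slice(0, size),
        "reactant_fp": slice(size, 2 * size),
    }
    if use_domain_features:
        slices["domain"] = slice(2 * size, 2 * size + N_DOMAIN_FEATURES)
    return slices


layout = feature_slices(size=2048)
row = list(range(2 * 2048 + N_DOMAIN_FEATURES))  # stand-in for one row of X
domain_block = row[layout["domain"]]
print(len(domain_block))  # 15
```

Slices like these can be handy for inspecting or ablating one block of the vector, provided the assumed ordering is checked against the actual implementation first.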
- Parameters:
radius (int, optional (default 2)) – Morgan fingerprint radius. radius=2 corresponds to ECFP4.
size (int, optional (default 2048)) – Fingerprint bit length for each molecule.
use_domain_features (bool, optional (default True)) – If True, appends 15 domain features (atom/bond/ring/MW deltas).
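The radius-to-ECFP naming follows the usual convention that the number in an ECFP name is the fingerprint diameter, i.e. twice the Morgan radius. A small sketch of that mapping (the function name `ecfp_label` is made up for illustration and is not part of deepretro):

```python
def ecfp_label(radius):
    """Return the conventional ECFP name for a Morgan radius."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    # ECFP names use the diameter, which is twice the radius.
    return f"ECFP{2 * radius}"


print(ecfp_label(2))  # ECFP4
print(ecfp_label(3))  # ECFP6
```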
Notes
This class requires RDKit to be installed.
Examples
>>> from deepretro.featurizers.reactionstep import ReactionStepFeaturizer
>>> featurizer = ReactionStepFeaturizer(radius=2, size=2048)
>>> reactions = [("CCO", "CC.O"), ("c1ccccc1", "c1ccccc1.Cl")]
>>> X = featurizer.featurize(reactions)
>>> X.shape
(2, 4111)
- __init__(radius=2, size=2048, use_domain_features=True)
- Parameters:
radius (int)
size (int)
use_domain_features (bool)
- Return type:
None
- property feature_dim: int
Total length of one feature vector.
- Returns:
dim – 2 * size + 15 when use_domain_features=True, 2 * size otherwise.
- Return type:
int
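The dimension formula can be mirrored in a few lines of plain Python, for example to size a downstream model's input layer before any featurizer is constructed. The function `expected_feature_dim` is a hypothetical stand-in written from the documented formula, not the property's actual implementation.

```python
def expected_feature_dim(size=2048, use_domain_features=True):
    """Compute 2 * size + 15 (or 2 * size without domain features)."""
    n_domain = 15 if use_domain_features else 0
    return 2 * size + n_domain


print(expected_feature_dim())  # 4111
print(expected_feature_dim(size=1024, use_domain_features=False))  # 2048
```

With the defaults this gives 4111, which matches the second dimension of `X.shape` in the class example.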