deepretro.models.hallucination\_classifier
==========================================

XGBoost-based binary classifier that detects hallucinated retrosynthesis
reactions.  It is built on DeepChem's :class:`~deepchem.models.GBDTModel`,
which wraps an ``XGBClassifier`` and adds automatic early stopping using an
internal 80/20 train/validation split.

Training a new model
--------------------

Prepare a CSV with columns ``product``, ``reactants``, and ``label``
(1 = hallucinated, 0 = valid).  Then:

.. code-block:: python

   from deepretro.data import ReactionDataLoader, stratified_split
   from deepretro.models import HallucinationClassifier

   # Load and featurize the CSV, then split into train/valid/test sets
   loader = ReactionDataLoader()
   dataset = loader.create_dataset("data/hallucination_dataset.csv")
   train, valid, test = stratified_split(dataset)

   # Train
   clf = HallucinationClassifier(model_dir="my_models/")
   clf.fit(train)

   # Evaluate (also sets the optimal probability threshold)
   scores = clf.evaluate(test)
   print(scores)
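
As a minimal sketch of the CSV layout described above, the following writes
a two-row file using only the standard library.  The SMILES strings, the
``.``-separated reactant format, and the helper names
``write_reaction_csv`` / ``read_reaction_csv`` are illustrative assumptions,
not part of deepretro; check what ``ReactionDataLoader`` expects before
relying on them.

.. code-block:: python

   import csv
   import tempfile
   from pathlib import Path

   FIELDNAMES = ["product", "reactants", "label"]

   # Illustrative rows only: the SMILES and the "."-separated reactants
   # are assumptions, not taken from the deepretro documentation.
   EXAMPLE_ROWS = [
       {"product": "CC(=O)OCC", "reactants": "CC(=O)O.CCO", "label": 0},
       {"product": "CC(=O)OCC", "reactants": "c1ccccc1.CCO", "label": 1},
   ]

   def write_reaction_csv(path, rows):
       """Write rows to ``path`` with a product/reactants/label header."""
       with open(path, "w", newline="") as handle:
           writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
           writer.writeheader()
           writer.writerows(rows)

   def read_reaction_csv(path):
       """Read the CSV back as a list of dicts (all values are strings)."""
       with open(path, newline="") as handle:
           return list(csv.DictReader(handle))

   if __name__ == "__main__":
       with tempfile.TemporaryDirectory() as tmp:
           csv_path = Path(tmp) / "hallucination_dataset.csv"
           write_reaction_csv(csv_path, EXAMPLE_ROWS)
           for row in read_reaction_csv(csv_path):
               print(row["product"], row["reactants"], row["label"])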

Saving and loading
------------------

The model is saved automatically to ``model_dir`` after training.  To reload it:

.. code-block:: python

   clf = HallucinationClassifier(model_dir="my_models/")
   clf.load("my_models/")

The saved artifacts include the XGBoost model weights and the optimal
classification threshold.
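
This page does not say how the optimal threshold is chosen.  As a rough
sketch of one common approach, the example below scans candidate thresholds
and keeps the one with the highest F1 score on labelled data.  The function
names and the F1 criterion are assumptions made for illustration; they are
not necessarily what ``HallucinationClassifier.evaluate`` does.

.. code-block:: python

   def f1_at_threshold(y_true, y_prob, threshold):
       """F1 score when probabilities >= threshold are predicted as 1."""
       tp = fp = fn = 0
       for label, prob in zip(y_true, y_prob):
           predicted = 1 if prob >= threshold else 0
           if predicted == 1 and label == 1:
               tp += 1
           elif predicted == 1 and label == 0:
               fp += 1
           elif predicted == 0 and label == 1:
               fn += 1
       denom = 2 * tp + fp + fn
       return 2 * tp / denom if denom else 0.0

   def best_f1_threshold(y_true, y_prob):
       """Scan thresholds 0.01..0.99 and return (threshold, f1).

       Ties are broken in favour of the lowest threshold.
       """
       best_threshold, best_f1 = 0.5, -1.0
       for step in range(1, 100):
           threshold = step / 100
           score = f1_at_threshold(y_true, y_prob, threshold)
           if score > best_f1:
               best_threshold, best_f1 = threshold, score
       return best_threshold, best_f1

   if __name__ == "__main__":
       # Toy labels (1 = hallucinated) and predicted probabilities.
       labels = [0, 0, 1, 1]
       probs = [0.10, 0.40, 0.35, 0.80]
       print(best_f1_threshold(labels, probs))

Selecting the threshold on held-out data rather than on the training set
keeps the chosen value from being tuned to examples the model has already
seen.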

Configuration
-------------

No environment variables are required.  All paths are passed as
arguments to the constructor and ``load()`` / ``save()`` methods.

API
---

.. automodule:: deepretro.models.hallucination_classifier
   :members:
   :undoc-members: