deepretro.utils.llm
===================
LiteLLM-backed single-step retrosynthesis utilities.
The LLM code is split into three layers (a short import sketch follows this list):
- ``deepretro.utils.llm`` is the public workflow facade. It exposes prompt
selection, model calls, response parsing, JSON validation, pathway filtering,
and the ``llm_pipeline()`` orchestration function.
- ``deepretro.utils.llm_interface`` owns provider-specific behavior. The
``LLMInterface`` base class defines prompt construction, completion parameter
construction, API calling, and response parsing. Concrete implementations
handle Claude-style, OpenAI-style, DeepSeek-style, and generic responses.
- ``deepretro.utils.llm_helpers`` contains model normalization and small parsing
helpers shared by the interface layer.
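As a quick orientation, the names used throughout this page map onto those layers as follows (the grouping comments are descriptive only):
.. code-block:: python
# Workflow facade: orchestration, parsing, and validation entry points.
from deepretro.utils.llm import call_LLM, llm_pipeline, parse_response, validate_split_json
# Provider layer: request container plus the interface factory.
from deepretro.utils.llm_interface import LLMRequest, create_llm_interface
# Shared helpers: model normalization and small parsing utilities.
from deepretro.utils import llm_helpers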
Provider Interfaces
-------------------
Use ``create_llm_interface()`` when code needs direct access to the
provider-specific interface:
.. code-block:: python
from deepretro.utils.llm_interface import LLMRequest, create_llm_interface
interface = create_llm_interface("openai/gpt-4o-mini")
request = LLMRequest(
molecule="CCO",
model="openai/gpt-4o-mini",
max_output_tokens=2048,
enable_thinking=False,
)
messages = interface.build_messages(request)
params = interface.build_completion_params(request, messages)
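The resulting ``params`` are what the interface layer hands to LiteLLM. A minimal sketch, assuming the dictionary maps directly onto ``litellm.completion()`` keyword arguments:
.. code-block:: python
import litellm
# Hypothetical direct call; call_LLM() and llm_pipeline() normally do this for you.
response = litellm.completion(**params)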
The factory returns one of these implementations (a dispatch sketch follows the table):
.. list-table::
:widths: 25 25 50
:header-rows: 1
* - Interface
- Provider family
- Parser behavior
* - ``AnthropicLLM``
- Anthropic / Claude
- Requires a thinking block with at least one step entry and a
JSON payload. JSON is accepted in explicit JSON tags or as fenced/raw
JSON after the thinking block because Claude can return either shape.
* - ``OpenAILLM``
- OpenAI
- Extracts tagged, fenced, or raw JSON and does not return thinking steps.
* - ``DeepSeekLLM``
- DeepSeek
- Extracts optional thinking content plus tagged, fenced, or raw JSON.
* - ``GenericLLM``
- Fallback
- Uses the Claude-style parser for compatible providers.
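The dispatch can be observed directly; a hedged sketch, assuming the returned class follows the provider prefix of the model string (the model identifiers below are illustrative):
.. code-block:: python
from deepretro.utils.llm_interface import create_llm_interface
# Expected classes per the table above: OpenAILLM, DeepSeekLLM, GenericLLM, ...
for model in ("openai/gpt-4o-mini", "fireworks/deepseek-v3p2", "some-unknown/model"):
    interface = create_llm_interface(model)
    print(model, "->", type(interface).__name__)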
Public Workflow
---------------
Most callers should use the facade functions in ``deepretro.utils.llm``:
.. code-block:: python
from deepretro.utils.llm import call_LLM, parse_response, validate_split_json
status, response_text = call_LLM(
molecule="CCO",
model="openai/gpt-4o-mini",
max_output_tokens=2048,
enable_thinking=False,
)
if status == 200:
parse_status, thinking_steps, json_content = parse_response(
response_text,
"openai/gpt-4o-mini",
)
if parse_status == 200:
validate_split_json(json_content)
``llm_pipeline()`` combines the same steps into the end-to-end flow:
.. code-block:: python
from deepretro.utils.llm import llm_pipeline
pathways, explanations, confidence = llm_pipeline(
molecule="CCO",
model="openai/gpt-4o-mini",
stability_check=False,
hallucination_check=False,
max_output_tokens=2048,
enable_thinking=False,
)
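The three return values are aligned, so results can be walked in parallel. A small usage sketch, assuming each value is a sequence of equal length:
.. code-block:: python
for pathway, explanation, score in zip(pathways, explanations, confidence):
    # Each pathway is reported with its explanation and confidence score.
    print(score, pathway)
    print("   ", explanation)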
Model Selection
---------------
Model identifiers are normalized before calling LiteLLM (see the parameter sketch after this list):
- OpenAI models can be passed with a LiteLLM prefix, for example
``openai/gpt-4o-mini``, or as names recognized by
``deepretro.utils.variables.OPENAI_MODELS``.
- OpenAI chat models use ``max_completion_tokens`` and receive a deterministic
``seed``.
- OpenAI reasoning models, such as ``gpt-5`` and ``o``-series models, use
``reasoning_effort`` when reasoning controls are enabled. Their output-token
budget is raised to a provider-safe minimum.
- Anthropic Claude 4 Opus/Sonnet models also receive ``reasoning_effort`` when
reasoning controls are enabled. Their output-token budget is raised to the
same provider-safe minimum, and their temperature is set to ``1`` for
reasoning calls.
- DeepSeek aliases such as ``fireworks/deepseek-v3p2`` are normalized to the
preferred Fireworks-hosted DeepSeek R1 model.
- A ``:adv`` suffix, such as ``openai/gpt-4o-mini:adv``, selects the advanced
prompt mode unless an explicit ``prompt_mode`` argument is provided.
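A hedged sketch of how these rules surface in the completion parameters; the key names checked below (``max_completion_tokens``, ``seed``) come from the bullets above, and treating ``params`` as a plain dictionary is an assumption:
.. code-block:: python
from deepretro.utils.llm_interface import LLMRequest, create_llm_interface
request = LLMRequest(
    molecule="CCO",
    model="openai/gpt-4o-mini",
    max_output_tokens=2048,
    enable_thinking=False,
)
interface = create_llm_interface(request.model)
params = interface.build_completion_params(request, interface.build_messages(request))
# For an OpenAI chat model the bullets above imply a token cap and a deterministic seed.
print(params.get("max_completion_tokens"), params.get("seed"))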
Testing
-------
Most tests in this module do not require live LLM credentials. They exercise
provider selection, completion-parameter construction, parser behavior, JSON
validation, and pipeline orchestration using local test doubles.
The focused test file also includes two slow Anthropic integration tests. When
``ANTHROPIC_API_KEY`` is configured, they call the live server with the real
retrosynthesis prompt for aspirin in both standard and advanced prompt modes.
Those tests assert that the response contains the expected thinking and JSON tags,
that JSON can be extracted from either tagged or fenced output, and that
``validate_split_json()`` can convert the payload into aligned pathways,
explanations, and confidence scores.
Run the focused test file with:
.. code-block:: bash
uv run --project deepretro pytest deepretro/tests/test_llm.py -q
Run only the live Anthropic prompt checks with:
.. code-block:: bash
ANTHROPIC_API_KEY=... uv run --project deepretro pytest \
deepretro/tests/test_llm.py::test_live_anthropic_retrosynthesis_prompt_returns_parseable_tagged_json -q
API
---
.. automodule:: deepretro.utils.llm
:members:
.. automodule:: deepretro.utils.llm_interface
:members:
:no-index:
.. automodule:: deepretro.utils.llm_helpers
:members:
:no-index: