deepretro.utils.llm
===================

LiteLLM-backed single-step retrosynthesis utilities.

The LLM code is split into three layers:

- ``deepretro.utils.llm`` is the public workflow facade. It exposes prompt
  selection, model calls, response parsing, JSON validation, pathway
  filtering, and the ``llm_pipeline()`` orchestration function.
- ``deepretro.utils.llm_interface`` owns provider-specific behavior. The
  ``LLMInterface`` base class defines prompt construction, completion-parameter
  construction, API calling, and response parsing. Concrete implementations
  handle Claude-style, OpenAI-style, DeepSeek-style, and generic responses.
- ``deepretro.utils.llm_helpers`` contains model normalization and small
  parsing helpers shared by the interface layer.

Provider Interfaces
-------------------

Use ``create_llm_interface()`` when code needs direct access to the
provider-specific interface:

.. code-block:: python

   from deepretro.utils.llm_interface import LLMRequest, create_llm_interface

   interface = create_llm_interface("openai/gpt-4o-mini")
   request = LLMRequest(
       molecule="CCO",
       model="openai/gpt-4o-mini",
       max_output_tokens=2048,
       enable_thinking=False,
   )
   messages = interface.build_messages(request)
   params = interface.build_completion_params(request, messages)

The factory returns one of these implementations:

.. list-table::
   :widths: 25 25 50
   :header-rows: 1

   * - Interface
     - Provider family
     - Parser behavior
   * - ``AnthropicLLM``
     - Anthropic / Claude
     - Requires thinking content with at least one step entry and a JSON
       payload. The JSON is accepted in a tagged block or as fenced/raw JSON
       after the thinking block, because Claude can return either shape.
   * - ``OpenAILLM``
     - OpenAI
     - Extracts tagged, fenced, or raw JSON and does not return thinking
       steps.
   * - ``DeepSeekLLM``
     - DeepSeek
     - Extracts optional thinking content plus tagged, fenced, or raw JSON.
   * - ``GenericLLM``
     - Fallback
     - Uses the Claude-style parser for compatible providers.

Public Workflow
---------------

Most callers should use the facade functions in ``deepretro.utils.llm``:

.. code-block:: python

   from deepretro.utils.llm import call_LLM, parse_response, validate_split_json

   status, response_text = call_LLM(
       molecule="CCO",
       model="openai/gpt-4o-mini",
       max_output_tokens=2048,
       enable_thinking=False,
   )
   if status == 200:
       parse_status, thinking_steps, json_content = parse_response(
           response_text,
           "openai/gpt-4o-mini",
       )
       if parse_status == 200:
           validate_split_json(json_content)

``llm_pipeline()`` combines the same steps into a single end-to-end flow:

.. code-block:: python

   from deepretro.utils.llm import llm_pipeline

   pathways, explanations, confidence = llm_pipeline(
       molecule="CCO",
       model="openai/gpt-4o-mini",
       stability_check=False,
       hallucination_check=False,
       max_output_tokens=2048,
       enable_thinking=False,
   )

Model Selection
---------------

Model identifiers are normalized before calling LiteLLM:

- OpenAI models can be passed with a LiteLLM prefix, for example
  ``openai/gpt-4o-mini``, or as names recognized by
  ``deepretro.utils.variables.OPENAI_MODELS``.
- OpenAI chat models use ``max_completion_tokens`` and receive a deterministic
  ``seed``.
- OpenAI reasoning models, such as ``gpt-5`` and ``o``-series models, use
  ``reasoning_effort`` when reasoning controls are enabled. Their output-token
  budget is raised to a provider-safe minimum.
- Anthropic Claude 4 Opus/Sonnet models also receive ``reasoning_effort`` when
  reasoning controls are enabled. Their output-token budget is raised to the
  same provider-safe minimum, and their temperature is set to ``1`` for
  reasoning calls.
- DeepSeek aliases such as ``fireworks/deepseek-v3p2`` are normalized to the
  preferred Fireworks-hosted DeepSeek R1 model.
- A ``:adv`` suffix, such as ``openai/gpt-4o-mini:adv``, selects the advanced
  prompt mode unless an explicit ``prompt_mode`` argument is provided; a short
  sketch follows this list.
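
As an illustration only, the sketch below combines the normalization rules
above with the ``llm_pipeline()`` call from the Public Workflow section. The
``:adv`` suffix and the keyword arguments come from this page; the aspirin
SMILES string and the closing ``zip()`` loop are assumptions made for the
example, relying on the three return values being aligned per pathway.

.. code-block:: python

   from deepretro.utils.llm import llm_pipeline

   # ":adv" is stripped during model normalization and switches the call to
   # the advanced prompt mode, as described in the list above.
   pathways, explanations, confidence = llm_pipeline(
       molecule="CC(=O)Oc1ccccc1C(=O)O",  # aspirin, an illustrative input
       model="openai/gpt-4o-mini:adv",
       stability_check=False,
       hallucination_check=False,
       max_output_tokens=2048,
       enable_thinking=False,
   )

   # Assumes the three sequences are aligned per pathway and can therefore
   # be iterated together.
   for pathway, explanation, score in zip(pathways, explanations, confidence):
       print(score, explanation, pathway)
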

Testing
-------

Most tests in this module do not require live LLM credentials. They exercise
provider selection, completion-parameter construction, parser behavior, JSON
validation, and pipeline orchestration using local test doubles.

The focused test file also includes two slow Anthropic integration tests. When
``ANTHROPIC_API_KEY`` is configured, they call the live server with the real
retrosynthesis prompt for aspirin in both standard and advanced prompt modes.
Those tests assert that the response contains the expected thinking and JSON
tags, that JSON can be extracted from either tagged or fenced output, and that
``validate_split_json()`` can convert the payload into aligned pathways,
explanations, and confidence scores.

Run the focused test file with:

.. code-block:: bash

   uv run --project deepretro pytest deepretro/tests/test_llm.py -q

Run only the live Anthropic prompt checks with:

.. code-block:: bash

   ANTHROPIC_API_KEY=... uv run --project deepretro pytest \
       deepretro/tests/test_llm.py::test_live_anthropic_retrosynthesis_prompt_returns_parseable_tagged_json -q

API
---

.. automodule:: deepretro.utils.llm
   :members:

.. automodule:: deepretro.utils.llm_interface
   :members:
   :no-index:

.. automodule:: deepretro.utils.llm_helpers
   :members:
   :no-index: