deepretro.utils.cache
Explicit in-memory caching utilities for expensive library operations.
Overview
The cache module provides:
CacheEntry: public record describing a cached value, its expiry deadline, and optional eviction tag
CacheManager: process-local cache with tag support, TTL, and statistics
make_args_hash(): deterministic argument hashing used by cache keys
make_cache_key(): deterministic cache keys for explicit cache lookups
The cache only lives for the current Python process. If the process exits,
the cached values are discarded. CacheManager uses plain Python data
structures without locking, so one instance should not be shared across threads
unless the caller adds synchronization.
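If one instance does need to be shared across threads, one option is to put it behind a lock. The following is a minimal sketch under stated assumptions, not part of deepretro: LockedCache and _StandInCache are hypothetical names, and the stand-in only mimics the get(key, default=...) and set(key, value, expire=..., tag=...) call shapes documented on this page so the example runs without the library.

```python
import threading
from typing import Any, Optional


class _StandInCache:
    """Hypothetical stand-in that mimics the documented get/set call shapes."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def set(self, key: str, value: Any, *, expire: Optional[float] = None,
            tag: Optional[str] = None) -> None:
        self._data[key] = value  # TTL and tags are ignored in this stand-in


class LockedCache:
    """Hypothetical wrapper: serialize every call with one re-entrant lock."""

    def __init__(self, cache: Any) -> None:
        self._cache = cache
        self._lock = threading.RLock()

    def get(self, key: str, default: Any = None) -> Any:
        with self._lock:
            return self._cache.get(key, default=default)

    def set(self, key: str, value: Any, **options: Any) -> None:
        with self._lock:
            self._cache.set(key, value, **options)


shared = LockedCache(_StandInCache())


def worker(n: int) -> None:
    shared.set(f"key-{n}", n, expire=60, tag="demo")


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Locking each call separately protects the cache's internal dictionaries, but a get followed by a set is still two separate steps. Two threads may both miss and both compute the same value unless the caller holds a lock around the whole sequence.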
Key Format
make_cache_key() returns keys in the form
v<version>:<namespace>:<64-char-sha256>. For example:
from deepretro.utils.cache import make_args_hash, make_cache_key
print(make_args_hash("CCO", az_model="USPTO"))
# 6ad01e27a3a319962ad084787e060ab0fa0e661cc7d3e018e96747b06f7bacf7
print(make_cache_key("run_az", "CCO", az_model="USPTO", version=1))
# v1:run_az:6ad01e27a3a319962ad084787e060ab0fa0e661cc7d3e018e96747b06f7bacf7
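Because the key has a fixed three-part layout, it can be taken apart again for logging or diagnostics. The helper below is a hypothetical sketch, not a deepretro function. It assumes the digest is 64 lowercase hex characters, as in the example above, and it splits from both ends so that a namespace containing a colon would still parse.

```python
_HEX_DIGITS = set("0123456789abcdef")


def split_cache_key(key: str) -> tuple[int, str, str]:
    """Split 'v<version>:<namespace>:<digest>' into (version, namespace, digest)."""
    version_part, _, rest = key.partition(":")    # text before the first colon
    namespace, _, digest = rest.rpartition(":")   # text after the last colon
    if not (version_part.startswith("v") and version_part[1:].isdigit()):
        raise ValueError(f"missing v<version> prefix: {key!r}")
    if not namespace:
        raise ValueError(f"missing namespace: {key!r}")
    if len(digest) != 64 or not set(digest) <= _HEX_DIGITS:
        raise ValueError(f"digest is not 64 lowercase hex characters: {key!r}")
    return int(version_part[1:]), namespace, digest


example_key = "v1:run_az:6ad01e27a3a319962ad084787e060ab0fa0e661cc7d3e018e96747b06f7bacf7"
version, namespace, digest = split_cache_key(example_key)
print(version, namespace, len(digest))
```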
Algorithm Notes
make_args_hash() serializes args and kwargs into a stable payload,
tries JSON first for common serializable values, and falls back to pickle for
complex Python objects. CacheManager stores each key in _entries and
maintains a secondary tag -> set[key] mapping so one tag can evict multiple
keys with evict_tag().
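The JSON-first, pickle-fallback idea can be sketched as follows. This is an illustration under assumptions rather than the library's implementation: sketch_args_hash is a hypothetical name, the payload layout is invented for the example, and the digests it produces will not match those returned by make_args_hash().

```python
import hashlib
import json
import pickle
from typing import Any


def sketch_args_hash(*args: Any, **kwargs: Any) -> str:
    """Hash arguments with SHA-256, trying JSON first and pickle second."""
    payload = {"args": list(args), "kwargs": kwargs}
    try:
        # sort_keys makes the encoding independent of keyword order.
        data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    except (TypeError, ValueError):
        # Values JSON cannot encode (bytes, custom objects, ...) fall back to
        # pickle. Keyword items are sorted by name to keep the order stable.
        data = pickle.dumps((args, sorted(kwargs.items())), protocol=4)
    return hashlib.sha256(data).hexdigest()


print(len(sketch_args_hash("CCO", az_model="USPTO")))
print(sketch_args_hash("CCO", a=1, b=2) == sketch_args_hash("CCO", b=2, a=1))
```

Pickle output is not guaranteed to be identical across Python versions or for every object type, so keys that take the fallback path are best treated as stable only within one process. That matches the process-local lifetime of the cache.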
Usage
Create and pass cache objects explicitly:
from deepretro.utils.cache import CacheManager, make_cache_key
cache = CacheManager()
key = make_cache_key("call_llm", "CCO", model="claude-opus-4-6", version=1)
cache_miss = object()
result = cache.get(key, default=cache_miss)
if result is cache_miss:
    result = {"molecule": "CCO", "model": "claude-opus-4-6"}
    cache.set(key, result, expire=3600, tag="CCO")
# Evict all entries for a molecule from this cache instance
cache.evict_tag("CCO")
# Inspect helper methods directly during tests or diagnostics
cache.purge_expired_entries()
print(cache.estimate_size_bytes())
# Clear this cache instance
cache.clear()
# Inspect statistics
stats = cache.stats()
print(stats.hits, stats.misses, stats.num_entries)
Tag Semantics
A tag is an arbitrary group label attached to one or more cache keys when
calling cache.set(..., tag="..."). Tags are useful when many cached values
should be invalidated together, such as all results derived from one molecule or
all outputs from one model configuration.
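The grouping behavior can be pictured with a small stand-alone model of a tag index. TagIndexSketch is a hypothetical class written for this page. It assumes a key carries at most one tag and that overwriting a key replaces its previous tag. The second assumption is not confirmed by this documentation.

```python
class TagIndexSketch:
    """Toy model: entries by key, plus a tag -> set-of-keys index."""

    def __init__(self):
        self.entries = {}  # key -> (value, tag)
        self.tags = {}     # tag -> set of keys carrying that tag

    def set(self, key, value, tag=None):
        self.delete(key)  # drop any stale tag membership first
        self.entries[key] = (value, tag)
        if tag is not None:
            self.tags.setdefault(tag, set()).add(key)

    def delete(self, key):
        entry = self.entries.pop(key, None)
        if entry is None:
            return False
        _, tag = entry
        keys = self.tags.get(tag)
        if keys is not None:
            keys.discard(key)
            if not keys:
                del self.tags[tag]  # remove empty tag sets
        return True

    def evict_tag(self, tag):
        keys = self.tags.pop(tag, set())
        for key in keys:
            self.entries.pop(key, None)
        return len(keys)


index = TagIndexSketch()
index.set("a", 1, tag="batch:1")
index.set("b", 2, tag="batch:1")
index.set("c", 3, tag="batch:2")
evicted = index.evict_tag("batch:1")
print(evicted, sorted(index.entries))
```

Keeping the reverse index means eviction touches only the keys in the group instead of scanning every entry.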
API
- deepretro.utils.cache.make_args_hash(*args, **kwargs)
Generate a deterministic hash of arguments for cache keying.
Tries JSON first for common types; falls back to pickle for complex objects.
Examples
>>> make_args_hash("CCO", az_model="USPTO")
'6ad01e27a3a319962ad084787e060ab0fa0e661cc7d3e018e96747b06f7bacf7'
- Parameters:
args (Any)
kwargs (Any)
- Return type:
str
- deepretro.utils.cache.make_cache_key(namespace, *args, version=1, **kwargs)
Build a deterministic cache key for a namespaced operation.
- Parameters:
namespace (str) – Stable operation name, such as "run_az".
*args (Any) – Positional arguments that affect the cached result.
version (int, optional) – Cache version. Bump when behavior changes and old entries should be invalidated, by default 1.
**kwargs (Any) – Keyword arguments that affect the cached result.
- Returns:
A deterministic key suitable for CacheManager.get and set.
- Return type:
str
Examples
>>> make_cache_key("run_az", "CCO", az_model="USPTO", version=1)
'v1:run_az:6ad01e27a3a319962ad084787e060ab0fa0e661cc7d3e018e96747b06f7bacf7'
- class deepretro.utils.cache.CacheEntry(value, expires_at, tag)
Single in-memory cache entry.
Each entry stores a cached payload, an optional expiry deadline measured with time.monotonic(), and an optional tag used for group invalidation. Tags let callers associate multiple cache keys with the same logical input such as one molecule, model configuration, or request family.
- Parameters:
value (Any)
expires_at (float | None)
tag (str | None)
- value
Cached payload returned by CacheManager.get.
- Type:
Any
- expires_at
time.monotonic() deadline when the key becomes stale. None means the entry does not expire automatically.
- Type:
float | None
- tag
Optional group label attached when calling cache.set(..., tag=...). All keys written with the same tag can be removed together with CacheManager.evict_tag, which is useful when multiple cached values should be invalidated as one group.
- Type:
str | None
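The expiry deadline can be illustrated with a small stand-alone model. The names EntrySketch, make_entry, and is_expired are hypothetical. The sketch assumes the deadline is computed as the current time.monotonic() reading plus the TTL, and that an entry counts as expired once the clock reaches the deadline. The exact boundary used by the library is not stated here.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class EntrySketch:
    """Hypothetical mirror of the documented fields: value, expires_at, tag."""
    value: Any
    expires_at: Optional[float]
    tag: Optional[str]


def make_entry(value: Any, expire: Optional[float] = None,
               tag: Optional[str] = None, now: Optional[float] = None) -> EntrySketch:
    """Turn a TTL in seconds into a monotonic-clock deadline."""
    now = time.monotonic() if now is None else now
    expires_at = None if expire is None else now + expire
    return EntrySketch(value, expires_at, tag)


def is_expired(entry: EntrySketch, now: Optional[float] = None) -> bool:
    """Entries without a deadline never expire."""
    if entry.expires_at is None:
        return False
    now = time.monotonic() if now is None else now
    return now >= entry.expires_at


# Passing `now` explicitly keeps the example deterministic.
entry = make_entry({"molecule": "CCO"}, expire=10, tag="molecule:CCO", now=100.0)
print(entry.expires_at, is_expired(entry, now=105.0), is_expired(entry, now=110.0))
```

A monotonic clock never goes backwards, so deadlines are unaffected by wall-clock adjustments. The trade-off is that its readings are only meaningful within one process, which fits a process-local cache.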
- class deepretro.utils.cache.CacheStats(hits, misses, size_bytes, num_entries)
Snapshot of live cache statistics returned by CacheManager.stats().
The reported values describe the cache after expired entries have been purged. They are intended for diagnostics and monitoring rather than exact process-memory accounting.
- Parameters:
hits (int)
misses (int)
size_bytes (int)
num_entries (int)
- hits
Number of successful CacheManager.get lookups.
- Type:
int
- misses
Number of failed CacheManager.get lookups, including expired keys.
- Type:
int
- size_bytes
Shallow approximation of the live cache footprint in bytes. The estimate includes the top-level entry and tag dictionaries, their keys, and the immediate cached values, but does not traverse referenced objects recursively.
- Type:
int
- num_entries
Number of live entries remaining after expired values are purged. This reflects the keys that still participate in lookups and tag eviction.
- Type:
int
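How these counters relate to expiry can be shown with a toy model in which a lookup that finds an expired key removes it and records a miss. CountingCacheSketch is a hypothetical class, not the library's code, and the injectable clock is an assumption added so the example is deterministic.

```python
import time


class CountingCacheSketch:
    """Toy cache that counts hits and misses and purges expired keys on lookup."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at or None)
        self.hits = 0
        self.misses = 0

    def set(self, key, value, expire=None):
        expires_at = None if expire is None else self._clock() + expire
        self._entries[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is not None and entry[1] is not None and self._clock() >= entry[1]:
            del self._entries[key]  # lazy purge of the expired key
            entry = None
        if entry is None:
            self.misses += 1        # absent and expired keys both count as misses
            return default
        self.hits += 1
        return entry[0]

    @property
    def num_entries(self):
        return len(self._entries)


fake_now = [0.0]
cache = CountingCacheSketch(clock=lambda: fake_now[0])
cache.set("a", 1, expire=5)
first = cache.get("a")              # hit while the entry is still live
fake_now[0] = 10.0                  # advance the fake clock past the deadline
second = cache.get("a", default="expired")
third = cache.get("never-set", default="absent")
print(cache.hits, cache.misses, cache.num_entries)
```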
- class deepretro.utils.cache.CacheManager
Process-local in-memory cache manager with tag support and TTL.
Each instance owns two in-memory indexes: _entries maps cache keys to CacheEntry objects, and _tags maps each tag to the set of keys currently carrying that tag. get removes expired keys lazily, evict_tag removes every key associated with a tag, and stats first purges expired values so the reported counts reflect live entries only.
The cache is process-local and not thread-safe. Reuse the same CacheManager instance only when callers intentionally want to share state.
Examples
>>> cache = CacheManager()
>>> key = make_cache_key("call_llm", "CCO", model="gpt-5.4", version=1)
>>> miss = object()
>>> cache.get(key, default=miss) is miss
True
>>> cache.set(key, {"molecule": "CCO"}, expire=300, tag="molecule:CCO")
>>> cache.get(key)
{'molecule': 'CCO'}
>>> cache.evict_tag("molecule:CCO")
1
- purge_if_expired(key)
Remove a key if its expiry deadline has passed.
- Parameters:
key (str) – Cache key to inspect.
- Returns:
True when the key existed and was removed because it had expired, otherwise False.
- Return type:
bool
- delete_key(key)
Remove a key from the cache and tag index if present.
- Parameters:
key (str) – Cache key to remove.
- Returns:
True when an entry was removed, otherwise False.
- Return type:
bool
- purge_expired_entries()
Remove every expired entry currently stored in the cache.
This is useful before inspecting cache size or exporting diagnostics.
- Return type:
None
- estimate_size_bytes()
Return a shallow approximation of the current in-memory cache size.
The estimate includes the top-level dictionaries, keys, tag sets, and the immediate cached values. Referenced objects are not traversed recursively.
- Return type:
int
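What "shallow" means in practice can be seen with sys.getsizeof, which reports the size of an object itself but not of the objects it refers to. The function below is a hypothetical sketch of such an estimate over two plain dictionaries. It is not the library's implementation, and the exact byte counts depend on the interpreter, so none are shown.

```python
import sys


def shallow_size(entries: dict, tags: dict) -> int:
    """Sum the sizes of the containers, keys, tag sets, and immediate values."""
    total = sys.getsizeof(entries) + sys.getsizeof(tags)
    for key, value in entries.items():
        total += sys.getsizeof(key) + sys.getsizeof(value)  # value is not traversed
    for tag, keys in tags.items():
        total += sys.getsizeof(tag) + sys.getsizeof(keys)
    return total


# Two one-element lists have the same shallow size on CPython,
# no matter how large the string inside each list is.
big = shallow_size({"k": ["x" * 10_000]}, {})
small = shallow_size({"k": ["y"]}, {})
print(big == small)
```

A cache holding large nested results can therefore occupy far more memory than the estimate suggests, which is why the figure is best read as a relative indicator.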
- get(key, default=<object object>)
Retrieve a value by key.
- Parameters:
key (str) – Cache key.
default (Any, optional) – Value returned when the key is not cached, by default an internal sentinel object.
- Returns:
Cached value, or default if not found.
- Return type:
Any
Examples
>>> cache = CacheManager()
>>> miss = object()
>>> cache.get("missing", default=miss) is miss
True
- set(key, value, *, expire=None, tag=None)
Store a value with optional TTL and tag.
- Parameters:
key (str) – Cache key.
value (Any) – Value to store.
expire (float | None, optional) – Time-to-live in seconds. None means no expiry.
tag (str | None, optional) – Optional group label for later eviction via evict_tag. Multiple keys may share the same tag.
- Return type:
None
Examples
>>> cache = CacheManager()
>>> cache.set("demo", {"smiles": "CCO"}, expire=60, tag="molecule:CCO")
- evict_tag(tag)
Remove all live entries with the given tag.
- Parameters:
tag (str) – Group label identifying entries to remove. A single tag may be attached to multiple cache keys.
- Returns:
Number of entries evicted.
- Return type:
int
Examples
>>> cache = CacheManager()
>>> cache.set("a", 1, tag="batch:1")
>>> cache.set("b", 2, tag="batch:1")
>>> cache.evict_tag("batch:1")
2
- stats()
Return cache statistics.
- Returns:
A snapshot containing hit count, miss count, shallow byte estimate, and live entry count.
- Return type:
CacheStats
Examples
>>> cache = CacheManager()
>>> cache.set("demo", 1)
>>> stats = cache.stats()
>>> (stats.hits, stats.misses, stats.num_entries)
(0, 0, 1)