-
Notifications
You must be signed in to change notification settings - Fork 14
PSv2: Use connection pooling and retries for NATS #1130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
carlosgjs
wants to merge
21
commits into
RolnickLab:main
Choose a base branch
from
uw-ssec:carlosg/natsconn
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from 8 commits
Commits
Show all changes
21 commits
Select commit
Hold shift + click to select a range
b60eab0
merge
carlos-irreverentlabs 644927f
Merge remote-tracking branch 'upstream/main'
carlosgjs 218f7aa
Merge remote-tracking branch 'upstream/main'
carlosgjs 90da389
Merge remote-tracking branch 'upstream/main'
carlosgjs 842e9b3
PSv2: Use connection pooling and retries for NATS
carlosgjs 227a8db
Refactor and fix nats tests
carlosgjs 2acf620
Tighten formatting
carlosgjs 0632ce0
format
carlosgjs c5f8106
CR feedback
carlosgjs 8805dbe
Apply suggestions from code review
carlosgjs 8618d3c
Merge remote-tracking branch 'upstream/main'
carlosgjs c384199
refactor: simplify NATS connection handling — keep retry decorator, d…
mihow 98a17f1
Merge branch 'main' into carlosg/natsconn
carlosgjs cf42506
revert: restore NATS connection pool — avoid per-operation connection…
mihow dc798ea
refactor: add switchable NATS connection strategies
mihow 4d66c07
refactor: simplify NATS connection module — pool-only, archive original
mihow 41bbeb3
docs: clarify where connection pool provides reuse vs. single-use
mihow ead53d1
fix: use `from None` to suppress noisy exception chain in _get_pool
mihow 9737301
docs: update AGENTS.md test commands to use docker-compose.ci.yml
mihow fa0f84b
fix: correct mock setup in NATS task tests to match plain instantiation
mihow c7b2014
fix: address PR review feedback for NATS connection module
mihow File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,112 @@ | ||
| """ | ||
| NATS connection pool for both Celery workers and Django processes. | ||
|
|
||
| Maintains a persistent NATS connection per process to avoid | ||
| the overhead of creating/closing connections for every operation. | ||
|
|
||
| The connection pool is lazily initialized on first use and shared | ||
| across all operations in the same process. | ||
| """ | ||
|
|
||
| import asyncio | ||
| import logging | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| import nats | ||
| from django.conf import settings | ||
| from nats.js import JetStreamContext | ||
|
|
||
| if TYPE_CHECKING: | ||
| from nats.aio.client import Client as NATSClient | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class ConnectionPool: | ||
| """ | ||
| Manages a single NATS connection per process (Celery worker or Django web worker). | ||
|
|
||
| This is safe because: | ||
| - Each process gets its own isolated connection | ||
| - NATS connections are async-safe (can be used by multiple coroutines) | ||
| - Works in both Celery prefork and Django WSGI/ASGI contexts | ||
| """ | ||
|
|
||
| def __init__(self): | ||
| self._nc: "NATSClient | None" = None | ||
| self._js: JetStreamContext | None = None | ||
| self._nats_url: str | None = None | ||
| self._lock = asyncio.Lock() | ||
|
|
||
| async def get_connection(self) -> tuple["NATSClient", JetStreamContext]: | ||
| """ | ||
| Get or create the worker's NATS connection. Checks connection health and recreates if stale. | ||
|
|
||
| Returns: | ||
| Tuple of (NATS connection, JetStream context) | ||
| Raises: | ||
| RuntimeError: If connection cannot be established | ||
| """ | ||
| # Fast path: connection exists, is open, and is connected | ||
| if self._nc is not None and not self._nc.is_closed and self._nc.is_connected: | ||
| return self._nc, self._js # type: ignore | ||
|
|
||
| # Connection is stale or doesn't exist | ||
| if self._nc is not None: | ||
| logger.warning("NATS connection is closed or disconnected, will reconnect") | ||
| self._nc = None | ||
| self._js = None | ||
|
|
||
| # Slow path: need to create/recreate connection | ||
| async with self._lock: | ||
| # Double-check after acquiring lock | ||
| if self._nc is not None and not self._nc.is_closed and self._nc.is_connected: | ||
| return self._nc, self._js # type: ignore | ||
|
|
||
| # Get NATS URL from settings | ||
| if self._nats_url is None: | ||
| self._nats_url = getattr(settings, "NATS_URL", "nats://nats:4222") | ||
|
|
||
| try: | ||
| logger.info(f"Creating NATS connection to {self._nats_url}") | ||
| self._nc = await nats.connect(self._nats_url) | ||
| self._js = self._nc.jetstream() | ||
| logger.info(f"Successfully connected to NATS at {self._nats_url}") | ||
| return self._nc, self._js | ||
| except Exception as e: | ||
| logger.error(f"Failed to connect to NATS: {e}") | ||
| raise RuntimeError(f"Could not establish NATS connection: {e}") from e | ||
|
|
||
| async def close(self): | ||
| """Close the NATS connection if it exists.""" | ||
| if self._nc is not None and not self._nc.is_closed: | ||
| logger.info("Closing NATS connection") | ||
| await self._nc.close() | ||
| self._nc = None | ||
| self._js = None | ||
|
|
||
| def reset(self): | ||
| """ | ||
| Reset the connection pool (mark connection as stale). | ||
|
|
||
| This should be called when a connection error is detected. | ||
| The next call to get_connection() will create a fresh connection. | ||
| """ | ||
| logger.warning("Resetting NATS connection pool due to connection error") | ||
| self._nc = None | ||
| self._js = None | ||
|
carlosgjs marked this conversation as resolved.
Outdated
coderabbitai[bot] marked this conversation as resolved.
Outdated
|
||
|
|
||
|
|
||
| # Global pool instance - one per process (Celery worker or Django process) | ||
| _connection_pool: ConnectionPool | None = None | ||
|
|
||
|
|
||
| def get_pool() -> ConnectionPool: | ||
| """ | ||
| Get the process-local connection pool. | ||
| """ | ||
| global _connection_pool | ||
| if _connection_pool is None: | ||
| _connection_pool = ConnectionPool() | ||
| logger.debug("Lazily initialized NATS connection pool") | ||
| return _connection_pool | ||
|
coderabbitai[bot] marked this conversation as resolved.
Outdated
|
||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.