🚀 keeping everything running
Sr. Machine Learning Engineer @ Red Hat | Inference Engineering | Building llm-d: distributed inference for LLMs on Kubernetes
- San Francisco
- in/gregpereira1
Pinned
- llm-d/llm-d: Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
- kubernetes-sigs/gateway-api-inference-extension: Gateway API Inference Extension
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- deepseek-ai/DeepEP: An efficient expert-parallel communication library
- llm-d/llm-d-latency-predictor: Latency prediction service for ML-model-based scoring with llm-d-inference-scheduler
- llm-d-inference-scheduler (Go, forked from llm-d/llm-d-inference-scheduler): Inference scheduler for llm-d