@drxim

AI / LLM ops lead · UTC+3

Founder of XIMTRX. 12 years in SRE and DevOps. I lead the team, keep a hand in the on-call rotation, and own architecture for client deploys. On-call for GPU inference and LLM serving.

vLLM · GPU ops · Triton · CUDA · Ray · Kubernetes


what I do at XIMTRX

GPU inference fleets and LLM serving: on-call for vLLM and Triton, plus autoscaling and cost review in prod. I hold the UTC+3 on-call shift (00:00 to 08:00 UTC). On discovery calls I talk directly with CTOs and founders, with no sales layer in between.

What I've worked on most over the last few years: vLLM and Triton inference on A100/H100, multi-GPU autoscaling, KV-cache and batching tuning, GPU cost optimization.

background

12 years in infra. Started as a classic sysadmin, moved into SRE on high-traffic backend services, then into Web3 and AI infrastructure. Since 2024 I've been building XIMTRX as a managed team on contract.

I'll expand this profile later with publications, talks, and public configs. For now, the fastest way to talk tech is GitHub or Telegram via the contacts page.