Lead Machine Learning Engineer — Pixis
Bengaluru, India
Nov 2022 – Present
Own the AI Python backend across two production GenAI platforms serving enterprise advertisers: Adroom (creative generation) and Competitor Insights (competitive intelligence with an AI-agent surface).
Adroom Creative Playground — multi-modal GenAI platform
- Built Adroom, an AI-powered creative automation platform, from inception to $2M+ ARR; led end-to-end product development (roadmap, design, and sprints) and a 4-person team through v1 launch.
- Architected and operate a 46-route creative-generation platform across 5 independently deployed services, orchestrating 25+ Baseten-hosted models (SDXL, ControlNet, custom Flux fine-tunes, LaMa inpainting, harmonization, upscaling) plus FAL, Replicate, OpenAI, Anthropic, and Google Gemini APIs behind a unified provider abstraction with webhook-driven async fan-out.
- Designed the unified_generation orchestrator and its four execution variants — unified (single brief), controlled (knob-per-stage power mode), template-based (brand-sealed templates), and bulk_execution_controller (1,000+ variants fanned out per call with parent-child task aggregation, per-task retry, and NSFW gating).
- Built the Ad Copy Generation engine (llm_advertisements): brand-voice-grounded copy across FB / IG / TikTok / Pinterest / LinkedIn shapes with per-provider routing (GPT-4o for CTAs, Claude for narrative, Gemini for batch) and draft/polish cost-tiering, cutting cost by 60%+ versus an all-flagship setup with no quality regression.
- Shipped the Prompt Intelligence layer (prompt_enhancer + prompt_variation): parses a 5-word user brief into a 5-field ad concept, fans out 12 variants with embedding-based semantic diversity scoring, and compiles per-stage prompts for SDXL / Flux / DALL-E / video pipelines; outputs are scored by the Creative QA MCP to close the quality loop.
- Owned the AI Image Editor suite — 7 operations (image_editor, lama_cleaner, generative_fill, enhance_by_reference, harmonizer, segment_anything, blend) with SAM-driven auto-masking so users no longer hand-paint masks; harmonizer matches color temperature and light direction across a campaign series for visual cohesion; mask → fill → harmonize → upscale chainable under a single async task.
- Built the Product-to-Video studio — 4-stage pipeline (image_to_video → product_to_video → camera_movement → extend_video) with vision-LLM continuity checks between stages and automatic Baseten → FAL → Replicate provider failover on cold starts.
- Built the UGC video ad pipeline — talking-human avatars + ElevenLabs voice cloning muxed via ffmpeg with lip-sync correction; Celery + RabbitMQ queue scaled to thousands of videos per execution batch with per-task retry.
- Drove production LLM cost & latency optimization through context management, prompt caching, per-tenant token budgets, and New Relic-backed observability across the multi-provider orchestration layer.
Stack: FastAPI, ARQ, Redis, RabbitMQ, PostgreSQL (asyncpg + SQLAlchemy 2), AWS EKS + S3 + CloudFront, New Relic APM.
Competitor Insights — multi-tenant LLM analytics service with MCP exposure
- Designed and shipped a workspace-scoped microservice that ingests Meta Ad Library snapshots and produces 20+ LLM-driven strategy reports per snapshot — using GPT-4o-mini for text reports, GPT-4o vision for opening-frame and creative analysis, OpenAI o-series reasoning models for executive summaries and key takeaways, and Whisper-1 for video-ad audio transcription.
- Built an MCP server exposing 11 read-only tools over Streamable HTTP, integrated with Claude Desktop and Cursor — extends the same data backend to AI copilots without duplicating business logic, with API-key auth, context resolution, and SQL-injection guards. Shipped a second Creative QA MCP (8 tools: brand-compliance, heatmap analysis, cross-creative pattern detection) with multi-modal routing across Claude Vision, GPT-4o, and Gemini.
- Implemented credit-aware execution (per-report pricing with media-type weighting), recurring sync and report schedules (cron-driven, every 6h), self-healing for stuck jobs with Slack alerting, 9 zero-downtime SQL migrations (report dedup, schedule uniqueness, job-run enums, full table restructure), and Gamma.app-powered PDF / PPTX export for client deliverables.
Stack: Python, FastAPI, FastMCP 3.1, ARQ, PostgreSQL, Redis, RabbitMQ (Kombu fanout), S3 + CloudFront, ffmpeg.
Cross-platform engineering leadership
- Designed and deployed a RAG-based LLM agent for pandas data operations — owned dataset design, training, evaluation, and serving — automating analytical workflows for non-technical users.
- Define the ML technical roadmap and lead cross-team design reviews of models, datasets, and evaluation patterns; drive MLOps discipline across both products: versioning, drift monitoring, GPU autoscaling, release gates, and rollback strategy on production inference endpoints.