Vishnu Anilkumar

Lead Machine Learning Engineer · Production GenAI · LLM Agents & MCP · MLOps

Summary

Lead ML Engineer (8 years) owning the AI Python backend for two production GenAI platforms — a 46-route creative-generation platform (SDXL, ControlNet, custom Flux fine-tunes, 7-tool image editor, product-to-video studio, UGC video with talking avatars) and a multi-tenant competitor-intelligence service that ships 20+ LLM-driven analysis reports per snapshot and exposes 11 tools over an MCP server to AI agents. Built a unified provider abstraction routing across OpenAI, Anthropic, Gemini, FAL, Replicate, and Baseten with webhook-driven async fan-out. Combine deep diffusion + LLM/agent expertise with hard MLOps discipline (Kubeflow, GPU autoscaling on EKS, drift monitoring, release gates) and cross-team technical leadership.

Core Expertise

Generative AI & Diffusion

Stable Diffusion XL, ControlNet (7-weight: pose / canny / HED / normal / depth / MLSD / lineart), custom Flux fine-tunes, multi-stage diffusion pipelines (LaMa inpainting, harmonization, upscaling), SAM-driven auto-masking, prompt-intelligence layer with structure inference + semantic-diversity variant fan-out, AI image editor suite (inpaint, outpaint, object removal, reference-guided enhancement, color/light harmonization, blend), 4-stage product-to-video pipeline with cross-stage continuity checks, UGC video generation with avatars and voice cloning (ElevenLabs), Hugging Face Diffusers and Transformers.

LLMs, RAG & Agents

Multi-provider orchestration (OpenAI GPT-4o / GPT-4o-mini / o3, Anthropic Claude, Google Gemini, Whisper-1), RAG with Chroma, LangChain, MCP (Model Context Protocol) servers — built and shipped two in production (FastMCP, 11 + 8 tools, integrated with Claude Desktop and Cursor), agentic workflows, prompt engineering, evaluation framework design, context management, prompt caching, and token optimization strategies.

ML Architecture & System Design

Async microservices (FastAPI + asyncpg + SQLAlchemy 2), distributed task orchestration (ARQ, Redis, RabbitMQ), webhook-driven model serving (Baseten, FAL, Replicate) with per-provider failover (Baseten → FAL → Replicate), cost-tier routing (draft on mini / polish on flagship, 40–60% cost reduction), parent-child bulk-task aggregation for 1,000+ variant generation calls, multi-tenant data isolation, credit-aware execution, self-healing job pipelines, dual real-time + batch systems.

MLOps & Production Lifecycle

Kubeflow on GCP AI Platform, GPU autoscaling on AWS EKS, model / dataset / code versioning, custom drift monitoring, release gates and rollback strategy, New Relic APM, structured logging with task-scoped contextvars.

Cloud & Data Infrastructure

AWS (EKS, EC2, S3, CloudFront), GCP (AI Platform, BigQuery, Kubeflow, Vertex AI), Docker, Kubernetes, PostgreSQL, MongoDB, Redis, Chroma vector DB, PySpark.

ML / DL Stack

Python, PyTorch, TensorFlow / Keras, Scikit-learn, OpenCV, XGBoost, BERT, YOLO, CNN architectures, Pandas, NumPy.

Professional Experience

Lead Machine Learning Engineer — Pixis

Bengaluru, India

Nov 2022 – Present

Own the AI Python backend across two production GenAI platforms serving enterprise advertisers — Adroom (creative generation) and Competitor Insights (competitive intelligence with an AI-agent surface).

Adroom Creative Playground — multi-modal GenAI platform

Built Adroom, an AI-powered creative automation platform, from inception to $2M+ ARR; led end-to-end product development (roadmap, design, sprints) and a 4-person team through v1 launch.
Architected and operate a 46-route creative-generation platform across 5 independently deployed services, orchestrating 25+ Baseten-hosted models (SDXL, ControlNet, custom Flux fine-tunes, LaMa inpainting, harmonization, upscaling) plus FAL, Replicate, OpenAI, Anthropic, and Google Gemini APIs behind a unified provider abstraction with webhook-driven async fan-out.
Designed the unified_generation orchestrator and its four execution variants — unified (single brief), controlled (knob-per-stage power mode), template-based (brand-sealed templates), and bulk_execution_controller (1,000+ variants fanned out per call with parent-child task aggregation, per-task retry, and NSFW gating).
Built the Ad Copy Generation engine (llm_advertisements) — brand-voice-grounded copy across FB / IG / TikTok / Pinterest / LinkedIn shapes with per-provider routing (GPT-4o for CTAs, Claude for narrative, Gemini for batch) and draft/polish cost-tiering — 60%+ cost reduction vs all-flagship with no quality regression.
Shipped the Prompt Intelligence layer (prompt_enhancer + prompt_variation) — parses a 5-word user brief into a 5-field ad concept, fans out 12 variants with embedding-based semantic diversity scoring, and compiles per-stage prompts for SDXL / Flux / DALL-E / video pipelines; outputs are scored by the Creative QA MCP to close the quality loop.
Owned the AI Image Editor suite — 7 operations (image_editor, lama_cleaner, generative_fill, enhance_by_reference, harmonizer, segment_anything, blend) with SAM-driven auto-masking so users no longer hand-paint masks; harmonizer matches color temperature and light direction across a campaign series for visual cohesion; mask → fill → harmonize → upscale chainable under a single async task.
Built the Product-to-Video studio — 4-stage pipeline (image_to_video → product_to_video → camera_movement → extend_video) with vision-LLM continuity checks between stages and automatic Baseten → FAL → Replicate provider failover on cold starts.
Built the UGC video ad pipeline — talking-human avatars + ElevenLabs voice cloning muxed via ffmpeg with lip-sync correction; Celery + RabbitMQ queue scaled to thousands of videos per execution batch with per-task retry.
Drove production LLM cost & latency optimization through context management, prompt caching, per-tenant token budgets, and New Relic-backed observability across the multi-provider orchestration layer.
Stack: FastAPI, ARQ, Redis, RabbitMQ, PostgreSQL (asyncpg + SQLAlchemy 2), AWS EKS + S3 + CloudFront, New Relic APM.

Competitor Insights — multi-tenant LLM analytics service with MCP exposure

Designed and shipped a workspace-scoped microservice that ingests Meta Ad Library snapshots and produces 20+ LLM-driven strategy reports per snapshot — using GPT-4o-mini for text reports, GPT-4o vision for opening-frame and creative analysis, OpenAI o-series reasoning models for executive summaries and key takeaways, and Whisper-1 for video-ad audio transcription.
Built an MCP server exposing 11 read-only tools over Streamable HTTP, integrated with Claude Desktop and Cursor — extends the same data backend to AI copilots without duplicating business logic, with API-key auth, context resolution, and SQL-injection guards. Shipped a second Creative QA MCP (8 tools: brand-compliance, heatmap analysis, cross-creative pattern detection) with multi-modal routing across Claude Vision, GPT-4o, and Gemini.
Implemented credit-aware execution (per-report pricing with media-type weighting), recurring sync and report schedules (cron-driven, every 6h), self-healing for stuck jobs with Slack alerting, 9 zero-downtime SQL migrations (report dedup, schedule uniqueness, job-run enums, full table restructure), and Gamma.app-powered PDF / PPTX export for client deliverables.
Stack: Python, FastAPI, FastMCP 3.1, ARQ, PostgreSQL, Redis, RabbitMQ (Kombu fanout), S3 + CloudFront, ffmpeg.

Cross-platform engineering leadership

Designed and deployed a RAG-based LLM agent for pandas data operations — owned dataset design, training, evaluation, and serving — automating analytical workflows for non-technical users.
Define the ML technical roadmap and lead cross-team design reviews of models, datasets, and evaluation patterns; drive MLOps discipline across both products — versioning, drift monitoring, GPU autoscaling, release gates, and rollback strategy on production inference endpoints.

Senior Machine Learning Engineer — Quantiphi Inc.

Bengaluru, India

May 2021 – Nov 2022

Dawn Foods (Customer Lifetime Value): Led the engagement end-to-end as Senior MLE; delivered a production CLTV model with monthly revenue as the North Star metric — covered data onboarding, modeling, and stakeholder sign-off.
Dyson UK (Production ML pipeline): Designed and deployed a Kubeflow pipeline on GCP AI Platform with custom model monitoring, delivering an ensemble-based CLTV model into production.
Mentored a team of eight MLEs across multiple engagements; advised R&D and product teams in marketing analytics. Recognised as Employee of the Year for delivery quality and team leadership.

Machine Learning Engineer — Reflections Info Systems Pvt Ltd

India

Apr 2020 – May 2021

Skill-extraction CNN: Developed and deployed an entity-recognition model for an enterprise hiring platform — reached 82% accuracy and materially improved skill detection and candidate matching.
Legal-clause semantic search: Built a BERT + Elasticsearch search engine enabling faster clause comparison and retrieval across large legal corpora.
Signature and stamp-seal detection (YOLOv3): Computer-vision model automating proofreading of legal documents.

Associate Data Scientist — Techvantage Systems Pvt Ltd

India

Jun 2018 – Mar 2020

Resume intelligence platform (client-facing): Candidate scoring, section identification, content-quality evaluation, and alternative-industry / role prediction using TensorFlow and Random Forest; fine-tuned BERT for role prediction from experience profiles.
Customer-churn prediction: Achieved 82% accuracy across 22,000 customer records for a financial-services client using XGBoost; supported targeted retention campaigns.
Cattle face-identification: Computer-vision model automating cattle insurance claim verification for India's largest livestock insurer.

Intern Data Scientist — Techvantage Systems Pvt Ltd

India

Dec 2017 – Jun 2018

Credit-risk analytics: Probability-of-default modelling from KYC data using Regression, SVM, Random Forest, and XGBoost with SMOTE, anomaly detection, and feature-selection pipelines.

Education

M.Sc. Computer Science (Machine Intelligence)

Indian Institute of Information Technology and Management — Kerala (CUSAT), India · 2016–2018

Awards & Recognition

Employee of the Year — Quantiphi Inc.

Recognised for flawless delivery and leading the ML team across multiple Fortune-500 engagements.

Languages

English (fluent) · Malayalam (native)

Got a hard problem? I’d love to hear it.

Reach me at vishnuanilkumar.engineer@gmail.com or read the cover letter.
