pytest-tenantguard
BUILDING
A pytest plugin and AST linter that catches cross-tenant data leaks in multi-tenant SaaS codebases before they ship. MIT-licensed core, with a paid compliance dashboard for teams that need audit trails.
Five years building backend systems, from kernel-level networking at VMware to AI-agentic infrastructure at a fast-moving startup. Currently relocating from Bengaluru to Barcelona for senior and staff backend roles.
I'm a backend engineer who likes problems where the constraints are real: limited memory, strict latency budgets, traffic that won't wait for you to scale up. Day to day, that means owning technical decisions, defining API contracts, and mentoring the engineers around me, as much as it means writing code.
Most recently, I was a founding engineer at Knowl, where I built the Python and FastAPI backbone for an AI-native product handling over a million requests a day, including an autonomous agentic system that improved language detection accuracy by 45%.
Before that, I spent four years at VMware on the VeloCloud SD-WAN platform, writing production C and C++ for systems running on 200,000+ edge devices: threat detection, IPv6 prefix delegation, and a testing framework that took a legacy C codebase from 5% to 90% coverage.
I studied Computer Science at IIT Guwahati, and still keep my competitive programming chops sharp. Outside of work, I run local LLMs on a Mac Mini and write about what I find.
A CLI tool that catches backward-incompatible database migrations before they break a zero-downtime deploy, by tracing migration diffs back to the application code that actually depends on them.
A local RAG application that stores and queries interview experiences with semantic search and metadata filtering, running entirely on local infrastructure end to end: ingestion, indexing, retrieval, and LLM-powered answers.
A multi-book conversational AI that ingests PDFs, builds a searchable knowledge base, and answers questions with citations, using hybrid retrieval across vector search, TF-IDF, and cosine similarity.
More on GitHub.
Running larger coding models on a 16GB Mac Mini, I noticed generation slowing down and the model losing track of earlier context as a project grew. The cause wasn't the model; it was the KV cache eating into available memory as the context filled up. A smaller model with more headroom for context can outperform a bigger one on real projects, and I think the next real unlock for local AI is better memory architecture, not bigger weights.
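To make the memory pressure concrete, here is a rough back-of-the-envelope sketch of how a KV cache grows with context length. The layer count, KV-head count, and head dimension below are hypothetical values picked for illustration, not the configuration of any specific model, and the formula ignores runtime overhead and cache quantization.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_value=2):
    """Approximate KV cache size in bytes for a decoder-only transformer.

    Each token stores one key and one value vector per layer per KV head,
    hence the leading factor of 2. bytes_per_value=2 assumes 16-bit values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value


# Hypothetical model shape (an assumption for illustration only).
SHAPE = dict(n_layers=36, n_kv_heads=8, head_dim=128)

for n_tokens in (4_096, 32_768, 131_072):
    gib = kv_cache_bytes(n_tokens=n_tokens, **SHAPE) / 2**30
    print(f"{n_tokens:>7} tokens -> {gib:.2f} GiB of KV cache")
```

The cache grows linearly with the number of tokens in context, so on a machine with a fixed memory budget every gigabyte spent on weights is a gigabyte that long contexts cannot use.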
Read the full post on LinkedIn
Same coding prompt, run across 4B and 9B Qwen3.5 models with and without Apple's MLX backend. Runtime optimization turned out to matter almost as much as model size.
~2.1x faster generation on the 4B model, with total inference time dropping from 4m39s to 1m43s. 4B models now feel genuinely usable for real engineering work.
Read the full post on LinkedIn
I'm relocating from Bengaluru and ready to start the visa sponsorship process. If you're hiring backend or AI-systems engineers, I'd love to talk.