Nitish Kumar Pandey
Data & AI Engineer with more than 3.5 years building production data pipelines and LLM systems. I specialize in RAG, multi-agent workflows, and semantic retrieval at a Berlin procurement technology company, where reliability matters as much as the model.
Data
- Python, SQL
- PostgreSQL
- Milvus
Pipelines
- Azure Data Factory
- Databricks, dbt
- Airflow
AI Systems
- LLMs, RAG
- LangGraph
- FastAPI
Data-driven engineering, end to end.
Data & AI Engineer with more than 3.5 years building production data pipelines and LLM systems. At Accenture I delivered ETL and ELT pipelines processing more than 1 million records per run and optimized over 50 SQL procedures for a global life sciences client. Now at Mercanis in Berlin, I build RAG pipelines, multi-agent workflows, and semantic retrieval over more than 40,000 enterprise entities, where reliability matters as much as the model.
Production Data Engineering
More than 3.5 years building robust ETL and ELT pipelines on Python, SQL, Azure, and Databricks, handling more than 1 million records per run.
AI and Agentic Systems
Designing LangGraph multi-agent workflows, RAG pipelines, and semantic retrieval with Milvus for reliable, production-grade AI.
MSc in Data Science and AI
Completing a Master's at GISMA University of Applied Sciences in Berlin, with a thesis on multi-agent supplier discovery.
Key strengths and technologies
The tools I reach for to build AI-powered, data-intensive systems.
RAG & Semantic Retrieval
Embedding-based retrieval with Milvus, hybrid search, and context-aware re-ranking.
LangGraph Multi-Agent
Orchestrating multi-step, multi-agent workflows for autonomous task execution.
FastAPI
High-performance APIs exposing AI capabilities as production-ready services.
Python & SQL
pandas, NumPy, scikit-learn, and PyTorch with advanced SQL on PostgreSQL.
Azure & Databricks
Cloud-native data platforms with Azure Data Factory, dbt, and Airflow orchestration.
Docker & CI/CD
Containerized deployments with Docker, Git, and GitHub Actions.
Where the pipeline runs
Working Student AI Engineer
- Designed and deployed agentic AI workflows using LangGraph and LangChain, enabling multi-step reasoning and autonomous task execution in production.
- Built and maintained LLM-powered data pipelines with structured output validation, ensuring reliability and consistency at scale.
- Developed backend services using FastAPI to expose AI capabilities as modular, production-ready APIs.
- Integrated vector databases and RAG architectures to ground responses in domain-specific knowledge, reducing hallucination rates.
Data & AI Engineer
- Architected and maintained scalable ETL and ELT pipelines ingesting over 1 million records from REST APIs, flat files, and relational databases, ensuring 99.9% data accuracy in production.
- Optimized over 50 SQL stored procedures, cutting query execution time by 15% and improving overall pipeline performance by 20%.
- Orchestrated large-scale workflows on Azure Data Factory and Kubernetes, enabling resilient, cloud-native processing.
- Built Power BI dashboards for product lifecycle, architecture, and operational KPIs, adopted across cross-functional teams.
Data Engineer Associate
- Developed and maintained ETL workflows and backend data systems supporting more than three enterprise client applications.
- Led production release planning and deployment configurations for zero-defect go-lives.
- Resolved defects and bottlenecks within SLA targets, maintaining over 99% uptime across critical pipelines.
Machine Learning Intern
- Completed a hands-on machine learning internship, building foundational models and data workflows.
Featured work
Production-grade, agentic, and full-stack AI, from data ingestion to auditable, queryable insight.
SupplierMind
An AI-assisted supplier discovery system for procurement teams that need auditable results under hard constraints such as certifications, capacity, lead time, and geography. It runs a five-agent LangGraph pipeline (parser, discovery, compliance, ranking, and evaluator) with ReAct tool use: it searches approved suppliers first, discovers new web suppliers on request, and holds them for human approval with a full audit log. Benchmarked against single-prompt and RAG baselines, backed by more than 173 tests.
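The approved-first flow described above can be illustrated with a minimal sketch. This is not the SupplierMind code: it uses plain Python rather than LangGraph, covers only two of the listed constraints, and every name in it (`Supplier`, `DiscoveryResult`, `discover`, `allow_web`) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Supplier:  # hypothetical record, not the real schema
    name: str
    certifications: set[str]
    lead_time_days: int


@dataclass
class DiscoveryResult:
    matches: list[Supplier] = field(default_factory=list)           # approved and compliant
    pending_approval: list[Supplier] = field(default_factory=list)  # held for a human
    audit_log: list[dict] = field(default_factory=list)             # one entry per decision


def meets_constraints(s: Supplier, required_certs: set[str], max_lead_time_days: int) -> bool:
    # Hard constraints: every required certification is present and lead time fits.
    return required_certs <= s.certifications and s.lead_time_days <= max_lead_time_days


def discover(approved_pool, web_pool, required_certs, max_lead_time_days, allow_web=False):
    result = DiscoveryResult()

    def log(step: str, supplier: str, decision: str) -> None:
        result.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "step": step,
            "supplier": supplier,
            "decision": decision,
        })

    # Step 1: always search the approved suppliers first.
    for s in approved_pool:
        ok = meets_constraints(s, required_certs, max_lead_time_days)
        log("approved_search", s.name, "match" if ok else "rejected")
        if ok:
            result.matches.append(s)

    # Step 2: only on request, look at newly discovered web suppliers.
    # They never enter `matches` directly; they wait for human approval.
    if allow_web:
        for s in web_pool:
            ok = meets_constraints(s, required_certs, max_lead_time_days)
            log("web_discovery", s.name, "held_for_approval" if ok else "rejected")
            if ok:
                result.pending_approval.append(s)

    return result
```

Calling `discover(approved, web, {"ISO9001"}, 30)` leaves `pending_approval` empty, while passing `allow_web=True` queues compliant web suppliers for review and records each decision in `audit_log`.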
MSc Data Science, AI and Digital Business
GISMA University of Applied Sciences
BTech Computer Science
Rajiv Gandhi Proudyogiki Vishwavidyalaya
- LangGraph
- LLMs
- RAG
- NLP
- Agentic Pipelines
- FastAPI
- Python
- SQL
- ETL and ELT
- Azure and Databricks
- Airflow and dbt
- PostgreSQL
- Docker and Kubernetes
Best Athlete, Senior
Recognized for athletic excellence.
BMW Berlin Marathon 2025
Participant in the 5K Generali Run.
Delivery Foundation Academy
L1 Assessment for CL12 and CL13 at Accenture.
Architecting with Google Compute Engine (Specialization)
Cloud Security Fundamentals, Cloud Application Security
Python Programming, Beginner to Advanced
Let's build something scalable.
Open to full-time Data & AI Engineer roles in Berlin, Germany, and across the EU. The fastest way to reach me is by email or LinkedIn.