AI Engineering Lead
Vichara
Posted: December 16, 2025
Quick Summary
We are seeking an experienced AI Engineering Lead to design and lead multi-agent LLM systems using LangGraph, LangChain, and Promptfoo for prompt lifecycle management and benchmarking.
Job Description
Vichara is a products and services firm focused on financial services. Headquartered in New York, it builds systems for some of the largest investment banks and hedge funds in the world.
Key Responsibilities
🔹 Architecture & System Design
• Architect, design, and lead multi-agent LLM systems using LangGraph, LangChain, and Promptfoo for prompt lifecycle management and benchmarking.
• Build Retrieval-Augmented Generation (RAG) pipelines leveraging hybrid vector search (dense + keyword) using LanceDB, Pinecone, or Elasticsearch.
• Define system workflows for summarization, query routing, retrieval, and response generation, ensuring minimal latency and high precision.
• Develop RAG evaluation frameworks combining retrieval precision/recall, hallucination detection, and latency metrics — aligned with analyst and business use cases.
🔹 AI Model Integration & Fine-Tuning
• Integrate GPT-4o, PaLM 2, and open-weight models (LLaMA, Mistral) for task-specific contextual Q&A.
• Fine-tune transformer models (BERT, SentenceTransformers) for document classification, summarization, and sentiment analysis.
• Manage prompt routing and variant testing using Promptfoo or equivalent tools.
🔹 Agentic AI & Orchestration
• Implement multi-agent architectures with modular flows — enabling task-specific agents for summarization, retrieval, classification, and reasoning.
• Design fallback and recovery behaviors to ensure robustness in production.
• Employ LangGraph for parallel and stateful agent orchestration, error recovery, and deterministic flow control.
🔹 Data Engineering & RAG Infrastructure
• Architect ingestion pipelines for structured and unstructured data — including financial statements, filings, and PDF documents.
• Leverage MongoDB for metadata storage and Redis Streams for async task execution and caching.
• Implement vector-based search and retrieval layers for high-throughput and low-latency AI systems.
🔹 Observability & Production Deployment
• Deploy end-to-end AI systems on AWS EKS / Azure Kubernetes Service, integrated with CI/CD pipelines (Azure DevOps).
• Build comprehensive monitoring dashboards using OpenTelemetry and SigNoz, tracking latency, retrieval precision, and application health.
• Enforce testing and regression validation using golden datasets and structured assertion checks for all LLM responses.
🔹 Cross-functional Collaboration
• Collaborate with DevOps, MLOps, and application development teams to integrate AI APIs with React front ends and FastAPI-based services.
• Work with business analysts to translate credit, compliance, and customer-support requirements into actionable AI agent workflows.
• Mentor a small team of GenAI developers and data engineers in RAG, embeddings, and orchestration techniques.
Required Skills & Experience
• Experience: 5+ years as an AI or ML Engineer
• LLMs & GenAI: GPT-4o, PaLM 2, LangGraph, LangChain, Promptfoo, SentenceTransformers
• RAG Frameworks: LanceDB, Pinecone, Elasticsearch, FAISS, MongoDB
• Agentic AI: LangGraph multi-agent orchestration, routing logic, task decomposition
• Fine-Tuning: BERT / domain-specific transformer tuning, evaluation framework design
• Infra & MLOps: FastAPI, Docker, Kubernetes (EKS/AKS), Redis Streams, Azure DevOps CI/CD
• Monitoring: OpenTelemetry, SigNoz, Prometheus
• Languages & Tools: Python, SQL, REST APIs, Git, Pandas, NumPy
🧠 Nice-to-Have Skills
• Knowledge of reranker-based retrieval (MiniLM / CrossEncoder)
• Familiarity with prompt evaluation and scoring (BLEU, ROUGE, faithfulness)
• Domain exposure to Credit Risk, Banking, and Investment Analytics
• Experience with RAG benchmark automation and model evaluation dashboards