Role Overview
We are seeking a skilled Generative AI Engineer to design, develop, and deploy AI solutions using Large Language Models (LLMs), multimodal models, vector databases, and AI agents. The ideal candidate has hands-on experience building AI-powered applications, optimizing model performance, and integrating AI systems into production environments, and stays current with the rapidly evolving GenAI landscape.
Key Responsibilities
- Design, develop, and deploy GenAI-based solutions using LLMs (e.g., GPT, Llama, Claude, Mistral).
- Build and fine-tune foundation models for application-specific tasks such as:
  - Text generation and classification
  - Autonomous AI agents
  - Retrieval-Augmented Generation (RAG)
  - Knowledge Augmented Generation (KAG)
  - Chatbots and conversational systems
- Integrate AI models with applications using frameworks such as LangChain and LangGraph.
- Develop scalable inference architectures using GPU/CPU compute systems, vector databases, and orchestration frameworks.
- Implement RAG/KAG pipelines using vector stores like Pinecone or Weaviate.
- Optimize model performance, latency, and token cost.
- Apply prompt engineering, and define evaluation strategies and quality measurement frameworks for model outputs.
- Deploy AI workloads to cloud platforms (AWS, Azure, GCP) using services such as:
  - Bedrock, SageMaker, Vertex AI, Azure OpenAI
  - Kubernetes, Docker
- Implement data pipelines and manage training datasets for continual improvement.
- Ensure compliance with responsible AI practices, including security, privacy, and hallucination reduction.
- Collaborate with product, UX, and engineering teams to deliver customer-focused AI solutions.
Required Skills & Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, AI/ML, Engineering, or equivalent.
- Strong programming skills in Python.
- Hands-on experience with:
  - LLMs and NLP systems
  - KAG and RAG implementations
  - Embeddings, vector search, and knowledge indexing
- Experience with AI orchestration tools:
  - LangChain, LangGraph, or similar
- Familiarity with GPU compute optimization.
- Strong understanding of REST APIs, microservices, and modern backend architecture.
- Experience with cloud platforms such as AWS, GCP, or Azure.
Job Type: Full Time
Job Location: Remote