Provides deep technical expertise in cloud infrastructure design and API development for business environments.
Bridges the gap between data scientists and software engineers, enabling the efficient and reliable delivery of ML-powered solutions.
Ensures solutions are well designed for maintainability, ease of integration, and testing across multiple platforms.
Possesses strong proficiency in development and testing practices common to the industry.
Summary of Principal Job Responsibility & Specific Job Duties and Responsibilities:
Working closely with data scientists, ML engineers, and other stakeholders to deploy ML models
Setting up and maintaining cloud and edge infrastructure for ML model deployment
Designing, implementing, and maintaining scalable infrastructure for ML workloads
Good verbal and written communication skills
Collaborative and team-oriented
Academic Qualification(s):
Bachelor's degree in Computer Science, Engineering, or a related subject, and/or equivalent formal training or work experience
Work Experience / Skills Requirement(s):
1. Cloud Infrastructure & Kubernetes
Minimum 2 years of hands-on experience managing cloud infrastructure (e.g., AWS, GCP, Azure) in a production environment.
Hands-on experience with Kubernetes for container orchestration, scaling, and deployment of ML services.
Familiar with Helm charts, ConfigMaps, Secrets, and autoscaling strategies.
2. API Development & Messaging Integration
Proficient in building and maintaining RESTful or gRPC APIs for ML inference and data services.
Experience integrating message queues such as RabbitMQ or ZeroMQ for asynchronous communication, job queuing, or real-time model inference pipelines.
3. System Design, Database & Software Architecture
Proven experience working with relational databases (RDBMS) such as Microsoft SQL Server and PostgreSQL.
Proficient in schema design, writing complex queries, stored procedures, indexing strategies, and query optimization.
Hands-on experience with vector search and embedding-based retrieval systems.
Practical knowledge using FAISS, LanceDB, or Qdrant for building similarity search or semantic search pipelines.
Understanding of vector indexing strategies (e.g., HNSW, IVF), embedding dimensionality management, and integration with model inference pipelines.
4. Programming Languages
Demonstrated expertise in building scalable and maintainable API services using Python frameworks such as Flask, FastAPI, or Litestar.
Fluent in HTML, CSS, and JavaScript for building simple web-based dashboards and monitoring interfaces.
Experience with Go, C++, or Rust is a strong plus, especially for performance-critical or low-latency inference applications.
5. Edge AI Deployment
Experience in integrating models using NCNN, MNN, or ONNX Runtime Mobile on mobile and edge devices.
Familiarity with quantization, model optimization, and mobile inference profiling tools.
6. MLOps & Tooling
Experience with Docker/Podman, CI/CD pipelines, Git, and ML lifecycle tools such as MLflow, Airflow, or Kubeflow.
Exposure to model versioning, A/B testing, and automated re-training workflows.
7. Monitoring & Logging
Ability to set up monitoring (e.g., Prometheus, Grafana) and logging (e.g., ELK stack, Loki) to track model performance and system health.
8. Soft Skills & Collaboration
Strong analytical and troubleshooting skills.
Able to work closely with data scientists, backend engineers, and DevOps to deploy and maintain reliable ML systems.