Engineer, Principal/Manager - Machine Learning, AI
Qualcomm India Private Limited
- Bengaluru, Bangalore Urban, Karnataka
- Salary: Not Disclosed
- Full-time
Job Summary
General Summary
Qualcomm is seeking an experienced and visionary Principal AI/ML Engineer to lead research, development, and optimization of AI inference systems. This role involves developing high-performance AI models, optimizing deployments across various hardware platforms, and contributing to research in model compression, quantization, and hardware-aware optimization.
Education & Experience
- PhD with 6+ years, Master's with 7+ years, or Bachelor's with 8+ years in Engineering, CS, or related field.
- 20+ years of experience in AI/ML development; 5+ years in inference optimization and debugging.
Key Responsibilities
Model Optimization & Quantization
- Optimize models using quantization (INT8, INT4, mixed precision), pruning, and knowledge distillation.
- Implement PTQ and QAT techniques for deployment.
- Experience with TensorRT, ONNX Runtime, OpenVINO, TVM.
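To illustrate the post-training quantization (PTQ) work described above, here is a minimal NumPy sketch of symmetric per-tensor INT8 weight quantization. The helper names are hypothetical; a real deployment would use a toolkit such as TensorRT, ONNX Runtime, or the Qualcomm AI Stack rather than hand-rolled code.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor PTQ: map the float range [-max|w|, max|w|] onto [-127, 127].
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate float tensor from INT8 values and the stored scale.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.01], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

Per-channel scales, INT4 packing, and quantization-aware training (QAT) refine this basic recipe to recover accuracy lost at low bit widths.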
AI Hardware Acceleration & Deployment
- Target platforms: Hexagon DSP, CUDA GPUs, TPUs, NPUs, FPGAs, Habana Gaudi, Apple Neural Engine.
- Work with hardware-acceleration libraries and compiler stacks such as cuDNN, XLA, and MLIR via their Python interfaces.
- Benchmark and debug performance across platforms.
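Cross-platform benchmarking of the kind listed above usually starts with a simple latency harness. The sketch below is a generic timing helper (the `benchmark` name and parameters are illustrative); in practice each backend's callable would wrap a TensorRT engine, an ONNX Runtime session, or similar.

```python
import time

def benchmark(fn, warmup=3, iters=20):
    # Warm-up runs absorb one-time costs (JIT compilation, cache population).
    for _ in range(warmup):
        fn()
    # Average steady-state latency over several iterations, reported in ms.
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

# Stand-in workload; replace with a real inference call per target platform.
latency_ms = benchmark(lambda: [i * i for i in range(1000)])
```

Comparing such numbers across Hexagon DSP, GPU, and NPU targets highlights where operator placement or quantization choices need debugging.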
AI Research & Innovation
- Research on efficient AI inference: model compression, low-bit precision, sparse computing.
- Explore architectures like Sparse Transformers, Mixture of Experts, Flash Attention.
- Publish in ML conferences: NeurIPS, ICML, CVPR; contribute to open-source projects.
Technical Expertise
- Optimization of large language models (LLMs), large multimodal models (LMMs), and large vision models (LVMs) for inference.
- Deep Learning frameworks: TensorFlow, PyTorch, JAX, ONNX.
- Expert in CUDA, CuPy, Numba, TensorRT, ONNX Runtime, OpenVINO.
- Skilled in Python for scalable AI development.
- Experience with ML runtime delegates: TFLite, ONNX, Qualcomm AI Stack.
- Debugging: Netron, TensorBoard, PyTorch Profiler, Nsight, perf, Py-Spy.
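Among the profiling tools listed, Python's built-in cProfile gives a quick first look at where inference time goes before reaching for Nsight or Py-Spy. A minimal sketch, where `hot_loop` is a stand-in for a model's forward pass:

```python
import cProfile
import io
import pstats

def hot_loop():
    # Placeholder compute kernel standing in for an inference call.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
hot_loop()
profiler.disable()

# Render the top entries sorted by cumulative time into a string report.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
```

The report names the functions dominating cumulative time, which directs deeper per-kernel profiling with the framework-specific tools above.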
- Cloud inference: AWS Inferentia, Azure ML, GCP AI Platform, Habana Gaudi.
- Hardware-aware optimization: oneDNN, ROCm, MLIR, SparseML.
- Contributions to open-source and research publications are a strong plus.
Leadership & Collaboration
- Lead a team of engineers in Python-based AI inference and optimization.
- Collaborate with researchers, software engineers, DevOps, and hardware vendors.
- Define debugging, deployment, and performance tuning best practices.

