Engineer, Principal/Manager - Machine Learning, AI

Location: Bangalore, Karnataka, India
Company: Qualcomm India Private Limited

General Summary

Qualcomm is seeking an experienced and visionary Principal AI/ML Engineer to lead research, development, and optimization of AI inference systems. This role involves developing high-performance AI models, optimizing deployments across various hardware platforms, and contributing to research in model compression, quantization, and hardware-aware optimization.

Education & Experience

  • PhD with 6+ years, Master's with 7+ years, or Bachelor's with 8+ years of experience in Engineering, Computer Science, or a related field.
  • 20+ years of experience in AI/ML development; 5+ years in inference optimization and debugging.

Key Responsibilities

Model Optimization & Quantization

  • Optimize models using quantization (INT8, INT4, mixed precision), pruning, and knowledge distillation.
  • Implement post-training quantization (PTQ) and quantization-aware training (QAT) techniques for deployment (see the sketch after this list).
  • Experience with TensorRT, ONNX Runtime, OpenVINO, TVM.
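
A minimal sketch of the kind of PTQ flow described above, assuming PyTorch's eager-mode quantization API with a placeholder model, random calibration data, and the x86 "fbgemm" backend:

    import torch
    import torch.nn as nn

    class SmallNet(nn.Module):
        """Placeholder float model with quant/dequant boundaries marked."""
        def __init__(self):
            super().__init__()
            self.quant = torch.ao.quantization.QuantStub()      # float -> int8 boundary
            self.fc1 = nn.Linear(128, 64)
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(64, 10)
            self.dequant = torch.ao.quantization.DeQuantStub()  # int8 -> float boundary

        def forward(self, x):
            x = self.quant(x)
            x = self.relu(self.fc1(x))
            x = self.fc2(x)
            return self.dequant(x)

    model = SmallNet().eval()
    # Pick a backend-appropriate qconfig ("fbgemm" on x86, "qnnpack" on Arm).
    model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
    prepared = torch.ao.quantization.prepare(model)

    # Calibration: run representative inputs so observers record activation ranges.
    with torch.no_grad():
        for _ in range(32):
            prepared(torch.randn(8, 128))

    quantized = torch.ao.quantization.convert(prepared)  # INT8 weights and activations
    print(quantized)

QAT follows the same prepare/convert pattern, but inserts fake-quantization modules during training rather than calibrating a frozen model.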

AI Hardware Acceleration & Deployment

  • Target platforms: Hexagon DSP, CUDA GPUs, TPUs, NPUs, FPGAs, Habana Gaudi, Apple Neural Engine.
  • Leverage hardware acceleration stacks such as cuDNN, XLA, and MLIR from Python-based workflows.
  • Benchmark and debug inference performance across platforms (a benchmarking sketch follows this list).
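
A minimal cross-platform latency benchmark, assuming ONNX Runtime's Python API; "model.onnx", the input shape, and the iteration counts are placeholders:

    import time
    import numpy as np
    import onnxruntime as ort

    # Prefer an accelerator provider when available, otherwise fall back to CPU.
    preferred = ("CUDAExecutionProvider", "CPUExecutionProvider")
    providers = [p for p in preferred if p in ort.get_available_providers()]
    session = ort.InferenceSession("model.onnx", providers=providers)

    input_name = session.get_inputs()[0].name
    dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

    # Warm-up runs so one-time initialization does not skew the measurement.
    for _ in range(10):
        session.run(None, {input_name: dummy})

    iterations = 100
    start = time.perf_counter()
    for _ in range(iterations):
        session.run(None, {input_name: dummy})
    elapsed = time.perf_counter() - start

    print(f"avg latency: {1000 * elapsed / iterations:.2f} ms on {session.get_providers()[0]}")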

AI Research & Innovation

  • Conduct research on efficient AI inference: model compression, low-bit precision, and sparse computing.
  • Explore architectures and techniques such as Sparse Transformers, Mixture of Experts, and FlashAttention.
  • Publish at ML conferences such as NeurIPS, ICML, and CVPR; contribute to open-source projects.

Technical Expertise

  • Optimization of large language models (LLMs), large multimodal models (LMMs), and large vision models (LVMs) for inference.
  • Deep Learning frameworks: TensorFlow, PyTorch, JAX, ONNX.
  • Expert in CUDA, CuPy, Numba, TensorRT, ONNX Runtime, OpenVINO.
  • Skilled in Python for scalable AI development.
  • Experience with ML runtime delegates: TFLite, ONNX, Qualcomm AI Stack.
  • Debugging and profiling tools: Netron, TensorBoard, PyTorch Profiler, Nsight, perf, py-spy (a profiling sketch follows this list).
  • Cloud inference: AWS Inferentia, Azure ML, GCP AI Platform, Habana Gaudi.
  • Hardware-aware optimization: oneDNN, ROCm, MLIR, SparseML.
  • Contributions to open-source and research publications are a strong plus.
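
A minimal operator-level profiling sketch using the PyTorch Profiler, one of the tools listed above; the model and input are placeholders:

    import torch
    from torch.profiler import ProfilerActivity, profile, record_function

    model = torch.nn.Sequential(
        torch.nn.Linear(512, 512),
        torch.nn.ReLU(),
        torch.nn.Linear(512, 10),
    ).eval()
    x = torch.randn(32, 512)

    activities = [ProfilerActivity.CPU]
    if torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)  # also capture GPU kernels if present

    with profile(activities=activities, record_shapes=True) as prof:
        with record_function("inference"), torch.no_grad():
            model(x)

    # Rank operators by self CPU time to surface optimization targets.
    print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))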

Leadership & Collaboration

  • Lead a team of engineers in Python-based AI inference and optimization.
  • Collaborate with researchers, software engineers, DevOps, and hardware vendors.
  • Define debugging, deployment, and performance tuning best practices.
Experience Required: Minimum 8 Years

Vacancy: 2 - 4 Hires
