Observability Jobs in Pune

12 Jobs Found

Senior Performance Engineer

Aera Technology

6-10 Years | Not Disclosed | Pune, Maharashtra, India | Full-time

About Aera Technology
Aera Technology is at the forefront of Decision Intelligence, enabling enterprises to optimize operations through AI-driven automation. Our Aera Decision Cloud integrates seamlessly with enterprise systems to enhance decision-making in real time.

Performance Test Engineer
We are seeking a Performance Test Engineer to join our growing Performance Testing team in Pune. If you are passionate about identifying bottlenecks, optimizing performance, and driving efficiency in a large-scale cloud-based platform, this role is for you.

Key Responsibilities
Design and implement performance testing frameworks following best practices.
Develop, execute, and report on performance tests for web-based enterprise applications.
Create and analyze workloads to benchmark and optimize cloud applications.
Identify and resolve critical performance issues in test environments.
Utilize Java/Python and industry-standard tools to simulate real-world workloads.
Conduct scalability, stress, and load testing to identify hot spots and bottlenecks.
Diagnose performance issues and optimize API/UI performance.
Develop custom tools for automated test execution and result analysis.
Ensure performance testing aligns with CI/CD pipelines using Jenkins.
Maintain Grafana dashboards and InfluxDB for performance test monitoring.

About You
6-10 years of experience in Performance Engineering/Testing for web applications.
Expertise in load, soak, scalability, and stress testing methodologies.
Hands-on experience in API and UI performance testing.
Strong knowledge of JMeter, Java Mission Control, JVisualVM, JFR, AppDynamics, Dynatrace, New Relic, Splunk.
Experience in microservices/API performance testing on cloud-based platforms (AWS or on-premise).
Familiarity with Grafana and InfluxDB for performance monitoring.
Ability to collaborate in cross-functional, agile, and DevOps environments.
Strong problem-solving skills, with a proactive and self-motivated mindset.

Competitive salary and stock options
Comprehensive benefits
Flexible work environment
Professional development opportunities

Equal Opportunity Employer
Aera Technology is an equal opportunity employer. All qualified applicants will receive consideration without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, marital status, veteran status, or disability status. Join us and contribute to building the future of Decision Intelligence!

Senior Staff Software Engineer (Foundation)

Druva

5+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Job Title: Senior Staff Software Engineer (Foundation)
Company: Druva
Location: Pune, India

About Druva:
Druva empowers organizations with cyber, data, and operational resilience through the Data Resiliency Cloud, the industry's first and only SaaS solution at scale. Our platform simplifies data protection, streamlines governance, and provides unparalleled data visibility and insights, enabling customers to accelerate cloud adoption. Trusted by thousands of enterprises, including 60 of the Fortune 500, Druva eliminates complex infrastructure and management costs by delivering data resilience through a single platform spanning multiple geographies and cloud environments.

About the Role:
The Foundation team at Druva designs and develops a highly scalable, petabyte-scale distributed cloud file system built on AWS. This service-oriented system handles critical features such as file system metadata management, versioning, and eventual consistency using AWS services like S3, DynamoDB, and Kinesis. Beyond the core file storage engine, which supports backup storage for all Druva products, the team builds and maintains allied components like indexing engines, key-value stores, and big data pipelines to enable scalable search, analytics, and compliance capabilities.
As a Senior Staff Software Engineer, you will collaborate closely with cross-functional teams, architects, and DevOps to define high-level and low-level designs (HLD/LLD) for advanced data security and management services. You will stay abreast of emerging trends in data security, platforms, technologies, and APIs, applying these insights to enhance existing features and develop new capabilities. You will also mentor and guide junior engineers, fostering a culture of high-quality, high-velocity software development.

Key Skills & Experience:
5 to 7 years of experience building enterprise-grade software products.
Expertise in designing and implementing SaaS solutions at scale.
Proficiency in Python or Golang for software development.
Deep knowledge and hands-on experience with cloud storage and data management systems.
Proven experience building storage systems focused on securing and protecting data at scale, including managing data and metadata.
Prior experience working on data protection products is highly desirable.
Familiarity with cloud platforms such as AWS and Azure is a plus.
Strong hands-on developer with excellent communication and collaboration skills.
Ability to influence architecture, design, and implementation for timely delivery of business outcomes.

Desirable:
Strong written and verbal communication skills.
Experience with Agile methodologies (Scrum).
In-depth knowledge of AWS cloud technologies.

Responsibilities:
Collaborate with architects, developers, and DevOps to define HLD and LLD for data security and management services and features.
Monitor and evaluate emerging trends in data security, management platforms, technologies, and APIs to refine and develop features.
Lead the integration of data management tools and applications to enhance product quality and user experience.
Promote best practices and principles of high-quality SaaS software development.
Mentor and train junior engineers on data management, security principles, and agile development methodologies.

Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field (Advanced degree preferred).

Senior Site Reliability Engineer

NVIDIA

7+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

NVIDIA's Infrastructure, Planning and Processes (IPP) organization is seeking a hard-working and experienced Site Reliability/DevOps Engineer, with a strong background in Infrastructure Management, Monitoring, Automation, & System Administration, to join our Sanity Operations Team in Pune. The IPP Org provides Infrastructure, Products & Services for multiple software teams including GPU, Mobile, and Automotive divisions working on NVIDIA's extraordinary products & services. The team is responsible for hosting, enabling & running the large-scale private cloud systems & services for our in-house Testing CI framework. The cloud hosts a heterogeneous mix of machines and devices with various operating systems (Windows/Linux/Android, etc.), running with NVIDIA GPUs and Tegra Processors.

What you'll be doing:
Create resilient, scalable, and efficient test and deployment pipelines.
Design and implement complex automation platforms to identify & resolve operational inefficiencies.
Triage software, hardware, and infrastructure issues and maintain high availability for our infrastructure & services.
Deploy & monitor critical high-performance, large-scale services running on geo-distributed systems.
Continuously strive for efficient utilization & management of the infrastructure.
Automate processes for enabling developers to adopt self-service practices, while ensuring compliance with security standards.
Work with architects and engineers across the teams to review the designs & solutions during development and deployment phases.
Collaborate with our other engineering teams to deliver reliable, robust, and high-performance capability of the underlying infra.
Mine & analyze data from multiple sources for identifying scaling & optimization opportunities.

What we need to see:
Bachelor's or Master's degree in Computer Science, Software Engineering, or equivalent experience with 7+ years of experience in a DevOps environment.
Strong hands-on experience in configuring, maintaining, and building upon deployments of industry-standard tools (e.g., Kubernetes, Jenkins, Docker, CMake, GitLab, Jira, etc.).
Working experience in monitoring & maintaining large-scale infrastructure applications running in a microservice-based architecture.
Proficient with virtualization architecture, with strong experience in Kubernetes, VMs, and Docker.
Experience with continuous integration and continuous delivery systems such as GitLab, GitOps, Jenkins, Packer, and Terraform.
Strong Python scripting skills, with a proven background of using/writing JSON/REST APIs.
Fluency in querying MySQL or equivalent NoSQL databases.
Solid understanding of configuration management tools like Chef, Puppet, Ansible, etc.
Working experience with Perforce, Git, or any other version control system is necessary.
Experience with telemetry and alerting systems such as Kibana, Elasticsearch, Grafana, and Prometheus to create rich visualizations of system health over time.
Ability to self-manage, show leadership, mentor others, and communicate well.

Ways to stand out from the crowd:
Understanding of networking concepts like TCP/IP and firewall management.
Exposure to web apps/dashboards on frameworks like Django, AngularJS, VueJS, etc.
High-level understanding of Build and Test systems.
Experience in building regression detection systems by analyzing real-time production data, emphasizing important metrics.
Innovating with industry-standard tools and collaborating with the open source community.

AI/ML Engineer

Roxiler Systems

2-3 Years | Not Disclosed | Pune, Maharashtra, India | Full-time

AI/ML Engineer
Experience: 2-3 Years
Education: BBA, BSc, BTech/BE, MSc, MTech, MCA
Location: Pune

About the Role
We are seeking a passionate AI/ML Engineer to design, build, and scale production-grade AI systems that power our platform. You will work on real-world AI problems such as document intelligence, real-time voice AI, and intelligent job-to-candidate matching systems. This role requires strong hands-on experience in LLM engineering, backend development, and AI system productionization.

Required Skills
Python (FastAPI)
LangChain, LlamaIndex
OpenAI API, Anthropic

Good to Have
Postgres (Pgvector), Supabase
Redis
Docker
LangSmith
GitHub Actions

Roles & Responsibilities
Build Agentic Workflows: Design and deploy AI agents capable of reasoning and decision-making using LangChain or LlamaIndex.
Develop Voice AI Systems: Build low-latency, real-time conversational voice bots for candidate outreach using WebSockets, STT, and TTS, ensuring effective context retention and state management.
Engineer Data Pipelines: Create robust parsing systems that enforce strict JSON outputs from LLMs for resume data extraction, along with data cleaning pipelines for unstructured formats (PDF/DOCX).
Implement Advanced RAG Systems: Develop retrieval pipelines using Pgvector or Pinecone with hybrid search (semantic + keyword) for accurate job-to-candidate matching.
Productionize & Monitor AI Systems: Set up tracing and observability using LangSmith to debug chains, monitor token usage, and optimize performance and cost.
Backend Integration: Package AI logic into scalable, asynchronous microservices using FastAPI and Docker.

Technical Requirements
1. Generative AI & LLM Engineering
Experience enforcing structured JSON outputs using function calling.
Strong prompt engineering skills (Chain-of-Thought, Few-Shot).
Hands-on experience orchestrating complex AI pipelines using LangChain or LlamaIndex.
2. Voice AI & Real-Time Processing
Experience with STT/TTS APIs (e.g., Whisper, Deepgram).
Strong understanding of WebSockets and async programming for real-time audio streaming.
Ability to design conversation managers with memory and context handling.
3. Search & Data (RAG)
Experience with vector databases such as Pgvector, Pinecone, or Qdrant.
Expertise in chunking, cleaning, and ingesting unstructured data.
4. MLOps & Production Engineering
Observability and tracing using LangSmith.
Writing automated AI evaluation scripts to test prompts before deployment.
Experience optimizing cost by balancing model performance and token usage.
5. Core Backend Engineering
Strong Python skills with FastAPI.
Proficiency in async/await patterns for concurrent processing.
Experience with Docker and relational databases (SQL).
6. Machine Learning & Algorithms
Understanding of recommendation systems (Collaborative Filtering, Two-Tower models).
Experience with ranking and re-ranking techniques (LTR, Cross-Encoders).
Familiarity with traditional ML models using Scikit-learn or XGBoost.

The Applied Mindset We Value
Model Strategy: Ability to choose the right model based on cost, latency, and intelligence requirements.
Security First: Awareness of prompt injection and jailbreaking risks and strategies to mitigate them.
Hallucination Control: Strong grounding techniques to ensure factual and reliable AI outputs.

Tech Stack Overview
Languages: Python (FastAPI)
AI & Orchestration: LangChain, LlamaIndex, OpenAI API, Anthropic
Voice AI: Deepgram, Twilio
Databases: Postgres (Pgvector), Supabase, Redis
DevOps & MLOps: Docker, LangSmith, GitHub Actions

Senior Mobile App Developer

Anchanto

4+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Job Title: Senior Mobile App Developer
Location: Pune

We are looking for an experienced and highly skilled Senior React Native Mobile App Developer to lead the development of a cutting-edge mobile application. This role offers the opportunity to build the app from the ground up, with a focus on defining architecture, setting coding standards, and ensuring optimal performance and reliability. You will work closely with cross-functional teams, including product, backend, and design, to deliver a seamless, high-quality mobile experience across both iOS and Android platforms.

Key Responsibilities:
Lead the end-to-end development of a new mobile application using React Native for both iOS and Android.
Define and implement the app architecture, coding standards, folder structure, dependency management, and reusable components.
Collaborate with product managers and designers to translate business requirements into intuitive, responsive, and high-performing user interfaces.
Securely integrate with backend services (REST APIs / GraphQL) and handle complex authentication flows.
Ensure high app performance, responsiveness, offline capabilities, and compatibility across a variety of devices.
Manage private and enterprise distribution processes: iOS Enterprise provisioning and distribution (Ad Hoc, MDM-based) and internal Android enterprise distribution (APK/AAB deployment). Experience with AppCenter, Firebase App Distribution, Intune, or AirWatch is a plus.
Work with native code when necessary (Swift/Objective-C for iOS, Java/Kotlin for Android).
Implement and maintain observability, crash analytics, and logging solutions (e.g., Crashlytics, Sentry).
Perform code reviews, mentor junior developers, and enforce engineering best practices.
Participate in release planning, versioning, and the management of continuous integration (CI) pipelines.

What You Bring:
4+ years of mobile development experience, with at least 2 years of hands-on experience in React Native, building production-grade applications.
Strong proficiency in JavaScript and TypeScript, with a deep understanding of React concepts.
In-depth knowledge of mobile app lifecycle management, including navigation, animations, gestures, UI rendering, and state management.
Extensive experience with state management libraries (Redux, Recoil, Zustand, MobX, etc.).
Solid understanding of both iOS and Android platform fundamentals, such as permissions and security management, push notification integration, and local storage and offline sync solutions (e.g., AsyncStorage, SQLite, MMKV).
Familiarity with private enterprise app distribution workflows (outside of App Store/Play Store).
Experience with CI/CD pipelines for mobile applications (e.g., Fastlane, AppCenter, Bitrise, GitHub Actions).
Ability to work independently with a product-ownership mindset and strong collaboration skills.
Excellent communication skills and the ability to collaborate effectively with cross-functional teams.

Nice to Have:
Experience with Expo (managed or bare workflow).
Experience developing or integrating native modules.
Familiarity with deep linking, dynamic links, and integrating analytics SDKs.
Knowledge of OTA (Over-the-Air) updates (e.g., CodePush, EAS, AppCenter).

If you're passionate about building scalable, high-performance mobile applications and are ready to take ownership of a key product, we'd love to have you on our team!

Engineering Manager

Anchanto

12+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Job Title: Engineering Manager, Order Management System (OMS)
Location: Pune

Role Overview:
As the Engineering Manager, Order Management System (OMS), you will be responsible for leading the design, development, and continuous evolution of a large-scale, distributed eCommerce platform. This platform processes high transaction volumes and integrates with complex third-party systems. You will manage a full-stack engineering team, ensuring system scalability, performance, and resilience while fostering a culture of ownership, technical excellence, and collaboration.

Key Responsibilities:
Own the full product lifecycle: Lead the conceptualization, architecture, design, implementation, deployment, and maintenance of the OMS and its integrations.
Lead and mentor a team of 10+ engineers, guiding them through technical challenges and driving both backend and frontend development efforts to successful delivery.
Architect scalable, distributed systems that handle high volumes of orders, inventory updates, and third-party data exchanges across the platform.
Drive eCommerce integration strategy, collaborating with various systems including marketplaces, ERPs, WMS, payment gateways, and 3PLs to ensure robust data synchronization.
Take technical ownership of both backend and frontend components, from database schema and API design to UI architecture and performance optimization.
Establish and enforce engineering best practices, including coding standards, CI/CD workflows, observability, and security compliance, to ensure consistency and quality across the team.
Be hands-on when necessary, actively contributing to code, reviewing critical modules, and troubleshooting complex production issues.
Ensure high availability, scalability, and data integrity in every design decision, embedding performance and security into the development lifecycle.
Collaborate cross-functionally with Product, QA, DevOps, and Customer Success teams to ensure alignment between technical delivery and business priorities.
Recruit and develop talent within the team, conducting technical interviews and nurturing a strong engineering culture.

What You'll Bring:
12+ years of software engineering experience, with at least 4-5 years of hands-on experience in Ruby on Rails (RoR) backend development.
Proven success in building and scaling distributed, event-driven systems that can handle high transaction volumes and complex integrations.
Strong Angular expertise: Experience leading teams to deliver rich, responsive web applications.
Deep knowledge of eCommerce and OMS domain concepts, including order lifecycle, inventory management, shipments, returns, and third-party partner integrations.
Expertise in PostgreSQL/MySQL: Proficiency in schema design, query optimization, and performance tuning.
Familiarity with RESTful APIs, webhooks, and common integration patterns for external systems.
Experience working with cloud platforms (preferably AWS) and managing CI/CD pipelines for continuous deployment and delivery.
Proven experience as an Engineering Manager or Technical Lead, with a track record of mentoring engineers and managing delivery across multiple engineering modules.
A passion for building reliable, secure, and performant systems that deliver measurable business impact and enhance the customer experience.
Excellent communication, organizational, and problem-solving skills, with the ability to effectively manage complex technical challenges.
A strong sense of ownership, self-motivation, and a growth-oriented mindset, always striving to improve processes and systems.

Nice to Have:
Experience with microservices, asynchronous job processing, or message queues (e.g., Sidekiq, Resque, RabbitMQ).
Exposure to SaaS or multi-tenant architectures.
Familiarity with containerization (e.g., Docker) and monitoring tools (e.g., Grafana, ELK, Prometheus).
Understanding of API versioning, rate limiting, and data consistency patterns in large-scale distributed systems.

Innovative Environment: Work on a high-impact eCommerce platform that powers complex integrations and supports millions of transactions globally.
Leadership Opportunity: Lead and mentor a talented team of engineers while driving technical strategy and best practices.
Career Growth: Be part of a rapidly growing company with opportunities to develop both technically and professionally in a collaborative, dynamic environment.
Impactful Work: Your work will directly impact the success of a highly scalable, high-performance platform that serves leading global businesses.

If you are an experienced engineering leader with a passion for building scalable and resilient systems in the eCommerce domain, we would love to hear from you!

AI Platform Engineer II

Rapid7

2-3 Years | Not Disclosed | Pune, Maharashtra, India | Full-time

AI Platform Engineer II
Location: Pune

Overview
At Rapid7, we are looking for an AI Platform Engineer to help us build production-ready, agent-based AI experiences on Google Cloud (Vertex AI) and OpenAI platforms. You will be directly responsible for developing and shipping features such as AI agents, tools, and lightweight UIs that plug into our internal systems and data, driving productivity improvements across the company. As part of the AI Platform team, you will be involved in designing, implementing, and refining AI models and tools that integrate seamlessly with enterprise systems, while ensuring reliability, security, and scalability.

Key Responsibilities
Agentic Development with Vertex AI: Design and implement multi-step agents using Vertex AI (Agent Builder / Gemini), incorporating agent behaviors such as tool use, memory, and evaluation loops to create intelligent, autonomous workflows.
MCP & Tool Integrations: Develop Model Context Protocol (MCP) connectors and API integrations that enable agents to interact with enterprise systems while maintaining least-privilege access and security standards.
RAG & Data Grounding: Implement retrieval-augmented generation (RAG), utilizing vector stores, embeddings, and schema-aware prompts, while ensuring agents produce accurate and auditable outputs through robust guardrails.
Vertex-based UI for Agents: Prototype intuitive chat interfaces and task UIs, including features like session history, tool invocation panels, and feedback widgets. Embed AI agents into existing internal applications to streamline workflows and improve user experience.
Integration & Data Access: Connect AI agents to internal APIs and enterprise data sources (e.g., BigQuery, REST APIs, webhooks) using Google Cloud tools, ensuring smooth data access and system integration.
Quality & Telemetry: Define key performance indicators (KPIs) such as quality, latency, and cost-efficiency for AI-powered features. Implement logging, monitoring, and observability to track performance and run structured experiments to improve agent behavior and outcomes.

Requirements
Experience
2-3 years of hands-on experience in software or AI application engineering, with a focus on creating and deploying custom AI agents, particularly using Gemini / Vertex AI Agent Builder.
Technical Skills
Proficiency with Google Cloud (Vertex AI, BigQuery, Cloud Run/Functions) and at least one large language model (LLM) API (e.g., OpenAI or Gemini).
Experience building small end-to-end features, including backend APIs, agent logic, and basic UI for demos and production environments.
Familiarity with agent frameworks such as LangGraph or LangChain, and common agent patterns like tool use, planning, and memory management.
Knowledge of retrieval-augmented generation (RAG), including chunking, embeddings, and retrieval, along with prompt testing and evaluation.
Collaboration Skills
Ability to work cross-functionally with security, data, and application teams to ensure safe and efficient deployment of AI-driven solutions.

Nice to Have
Experience with GCP IAM, secrets management, and enterprise compliance considerations.
Familiarity with Vertex AI Search, vector databases, or event/webhook architectures.
Interest in emerging AI standards, MCP, responsible AI practices, and optimization for cost/performance.

About Rapid7
At Rapid7, we are on a mission to create a secure digital world for our customers, industry, and communities. We empower our employees to challenge the status quo and drive transformative change. With over 11,000 customers relying on us for cybersecurity protection, we continue to innovate and solve some of the most challenging problems in the industry.

Security and Compliance
Rapid7 is committed to safeguarding our customers' security. All employees uphold the highest standards of security and privacy, ensuring sensitive information is protected and compliance with relevant regulations is maintained.

Principal AI Engineer - MLOps

Rapid7

15+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Principal AI Engineer - MLOps
Location: Pune

Overview
Rapid7 is seeking a Principal AI Engineer - MLOps to drive the development and deployment of scalable, production-grade AI/ML systems in the field of cybersecurity. This role is within the AI Center of Excellence team and involves managing the end-to-end design of ML production systems from scoping and data requirements to continuous deployment and monitoring. You will play a pivotal role in mentoring junior engineers and shaping the MLOps culture.

Key Technologies Used
Cloud/Infrastructure: AWS (research environments, data hosting), EKS (Kubernetes), Terraform (Infrastructure as Code).
Modeling/Data: Python (numpy, pandas, scikit-learn), Jupyter notebooks, Anomaly detection methods.

Key Responsibilities
End-to-End ML Production System Design: Architect and manage the design and deployment of ML systems from inception to production. Define project scopes, data requirements, modeling strategies, and deployment pipelines for AI/ML systems. Ensure seamless integration of models and AI components into the production environment, focusing on scalability, reliability, and security.
Data Pipeline Management: Develop and maintain data pipelines, ensuring data consistency, integrity, and accessibility throughout the ML lifecycle. Oversee the data lifecycle, including data preprocessing, feature engineering, and storage.
ML Guardrails & Monitoring: Implement robust ML guardrails to ensure model performance, fairness, and compliance. Manage the monitoring and observability of deployed models, ensuring timely and accurate tracking of model performance.
Deployment & Service Monitoring: Develop and deploy accessible endpoints, including web applications and REST APIs, adhering to security best practices. Continuously monitor models and data to detect issues such as drift, anomalies, or performance degradation.
Collaboration & Mentorship: Collaborate closely with engineering, data science, and product teams to ensure alignment on goals. Mentor and guide junior engineers, share knowledge, and promote best practices in MLOps.
Agile Development & Iteration: Embrace agile practices, focusing on continuous iteration and solving complex challenges.

Skills & Experience
Core Technical Expertise
Extensive Software Engineering Experience: 15+ years as a software engineer, with at least 3-5 years of expertise in MLOps in AWS environments.
MLOps & DevOps: Proven experience deploying scalable AI/ML systems, managing CI/CD pipelines, and utilizing cloud AI resources. Expertise with Docker, Kubernetes, and cloud-based AI infrastructure management.
Programming: Strong proficiency in Python (Flask or FastAPI) and experience building APIs.
Data & Modeling Skills
Experience in designing data pipelines and performing feature engineering.
Familiarity with model risk management strategies, including concept drift monitoring and hyperparameter tuning.
Leadership & Communication
Demonstrated ability to collaborate across engineering, data science, and product teams.
Proven track record of mentoring and guiding junior engineers.
Strong communication skills for conveying complex technical concepts and creating detailed system architecture documentation.

Nice to Have
Experience deploying resources that enable data scientists to fine-tune and experiment with LLMs (Large Language Models).
Understanding of AI/ML operational frameworks and associated challenges.

Be a part of cutting-edge AI/ML efforts to strengthen cybersecurity.
Mentor and grow the next generation of engineers in an inclusive, fast-paced environment.
Tackle some of the most challenging security problems with a passionate and creative team.

Staff Data / Machine Learning Engineer

Druva

7+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Job Title: Staff Data / Machine Learning Engineer Company: Druva Location: Pune, Maharashtra, India About Druva: Druva, the autonomous data security company, delivers data protection on autopilot through a 100% SaaS, fully managed platform that secures and recovers data from all threats. The Druva Data Security Cloud guarantees data availability, confidentiality, and integrity, providing customers with autonomous protection, rapid incident response, and guaranteed data recovery. Trusted by over 6,000 customers worldwide, including 65 of the Fortune 500, Druva safeguards business-critical data in today's connected world. Backed by over $350M in venture capital, Druva protects more than 200 PB of data globally and offers a $10 million Data Resiliency Guarantee against cyber threats. The Role & Team: Join Druva's Business Intelligence team, where data drives key insights and fuels operational and strategic decisions. As a Staff Data/Analytics/ML Engineer, you will lead the development of data-to-insights recommendation engines and drive collaboration across teams to deliver impactful, data-driven solutions. This is a high-impact role working alongside expert engineers and product leaders, leveraging industry-leading data platforms and tools. What You'll Do: Bridge the gap between data engineering, analytics, and machine learning to unlock actionable business insights. Collaborate cross-functionally with Product, Engineering, GTM, and Customer Success teams to build data-driven solutions that enhance decision-making and business outcomes. Design, build, and maintain scalable data pipelines and infrastructure supporting analytics, machine learning, and operational workflows. Develop and optimize data models for efficient analytics and reporting across large datasets. Lead feature engineering, model training, and deployment pipelines for real-time and batch ML/AI applications. Drive architectural improvements in data governance, observability, and platform scalability. 
Evaluate and implement cutting-edge data tools and platforms including Reverse ETL, MLOps, and DataOps frameworks. Mentor and guide data engineers, analysts, and ML practitioners, fostering technical excellence and collaboration. Translate complex business requirements into technical deliverables, prioritizing initiatives that maximize impact. What We Are Looking For: Bachelor's or Master's degree in Computer Science, Data Science, Engineering, Statistics, or a related field. 7+ years of experience spanning data engineering, analytics engineering, and machine learning with a strong technical foundation in all three areas. Expertise in modern data stacks: Snowflake, dbt, Airflow, Spark, and cloud platforms (AWS, GCP, Azure). Proficiency with BI and analytics tools such as Looker, Sigma, Tableau, Power BI, or equivalents. Strong skills in SQL, Python, and distributed data processing frameworks (Spark, Dask, etc.). Experience designing and maintaining data pipelines, ETL/ELT workflows, and data transformations for analytics and ML use cases. Familiarity with ML frameworks like TensorFlow, PyTorch, Scikit-learn, and MLOps lifecycle management. Deep knowledge of data modeling, architecture, and governance best practices. Proven problem-solver who thrives in ambiguity and designs robust data solutions. Experience working in Agile environments with strong prioritization and collaboration skills. Excellent communication skills, capable of simplifying complex technical topics for non-technical stakeholders. Bonus Points For: Experience with Generative AI, LLMs, and integrating AI into business workflows. Knowledge of streaming architectures (Kafka, Kinesis, Pub/Sub) and real-time analytics. Familiarity with Data Contracts, Data Quality frameworks, or Data Mesh architectures. Experience with Reverse ETL tools such as Salesforce, Census, Hightouch, or Segment for operational analytics. 
If you're passionate about driving data innovation at scale and want to work in a dynamic environment protecting the world's critical data, Druva invites you to join our team! Qualification: Bachelor's or Master's degree in Computer Science, Data Science, Engineering, Statistics, or a related field.

Data Engineer Staff Engineer Data Engineer Staff data engineer
TS

DevOps SRE Manager

Talentica Software (i) Pvt. Ltd.

8-12 Years | Not Disclosed | Pune, Maharashtra, India | Full-time

About Talentica Software: Talentica Software is a boutique software development company founded by industry veterans and alumni from IIT Bombay. We specialize in helping startups build innovative products by leveraging the latest tools and technologies to solve real-world challenges. With over 21 years of experience, we've partnered with 180+ startups, primarily in the US, and contributed to numerous successful exits. In 2022, Talentica Software was recognized by Great Place to Work as one of India's Great Mid-Size Workplaces. What We're Looking For: We are seeking a DevOps SRE Manager to lead our cloud operations, with a primary focus on Google Cloud Platform (GCP) and secondary support for AWS. In this role, you will manage two critical teams: one DevOps team responsible for GCP infrastructure, and a CloudOps/SRE team ensuring 24/7 uptime for our mission-critical services. This position requires a blend of technical expertise, leadership skills, and customer relationship management. You'll be responsible for ensuring the reliability, scalability, and security of our infrastructure while overseeing smooth cloud operations. What You'll Be Doing: As a DevOps SRE Manager, your responsibilities will include: Managing GCP Operations: Oversee DevOps operations within Google Cloud Platform using tools like Terraform, Kubernetes (GKE), Prometheus, and Grafana. Infrastructure Automation: Ensure timely execution of tasks and optimize infrastructure automation to improve operational efficiency. CI/CD Enhancement: Drive improvements to CI/CD pipelines, enforce cloud security best practices, and enhance software delivery processes. System Reliability: Improve system reliability through advanced monitoring, logging, and alerting solutions. Cloud Optimization: Optimize cloud infrastructure for cost-effectiveness, scalability, and security, ensuring long-term operational efficiency. 
Leading CloudOps/SRE Teams: Manage a 24x7 CloudOps/SRE team focused on maintaining service uptime and providing prompt incident response. Incident Management: Lead incident management processes, including conducting Root Cause Analysis (RCA) and ensuring adherence to SLAs. Implement Observability Best Practices: Utilize Grafana, Prometheus, and Opsgenie to implement observability best practices. Promote Automation: Foster self-healing, automated infrastructure to reduce manual interventions and improve operational efficiency. Customer Relationship Management: Build and maintain strong customer relationships through transparent and clear communication. Mentorship and Leadership: Lead and mentor cross-functional teams of DevOps and CloudOps/SRE engineers, ensuring high productivity, continuous professional growth, and performance reviews. AWS Support: Provide basic-to-intermediate support for AWS services (IAM, EC2, S3, Lambda, CloudFormation) and assist in hybrid cloud integration when required. To Be Successful in This Role, You Should Have: Qualifications: BE/BTech from a reputable engineering institute. Experience: 8-12 years of experience in DevOps, CloudOps, or SRE roles. Technical Expertise: Primary Cloud Platform: Expertise in Google Cloud Platform (GCP). Secondary Cloud Platform: Experience with AWS. Infrastructure as Code (IaC): Strong experience with Terraform. Containerization & Orchestration: Hands-on experience with Kubernetes (GKE). CI/CD & Automation: Expertise in tools such as Jenkins, GitOps, and Ansible. Monitoring & Observability: Proficient in Prometheus, Grafana. Incident & Alerting: Familiarity with Opsgenie. Big Data & Streaming: Experience with Kafka, Airflow, Druid. AWS Services: Experience with IAM, EC2, S3, Lambda, CloudFormation, and CloudWatch. Additional Skills: Proven experience managing 24x7 operations and multi-cloud environments. Hands-on expertise with GCP infrastructure, Terraform, Kubernetes, and CI/CD pipelines. 
Experience with incident management, RCA, monitoring, and alerting. Strong understanding of reliability engineering, automation, and cloud security best practices. Bonus Points If You Have: Experience working with Kafka, Airflow, and Druid in large-scale environments. Certifications such as GCP Professional DevOps Engineer, AWS Solutions Architect, or Kubernetes. Working knowledge of AWS cloud services, especially in hybrid-cloud environments. What You'll Find Here: A Culture of Innovation: We focus exclusively on cutting-edge development. Our clients seek our expertise for innovative solutions, not maintenance work. Endless Learning Opportunities: Constantly expand your skills and stay on top of the latest trends and advancements in cloud technologies. Talented Peers: Work alongside top-tier engineers from India's best institutes (IITs, NITs, and others), fostering a collaborative and growth-oriented environment. Work-Life Balance: We value your well-being and offer flexible schedules and remote work options to help you maintain a healthy work-life balance. A Great Culture: 82% of our employees recommend Talentica to their peers (according to Glassdoor), which speaks to the positive work environment we've built. Recognition & Rewards: We celebrate success and ensure that your contributions are recognized and appreciated. At Talentica, we invite you to take ownership of large-scale, impactful projects and work with cutting-edge technologies. If you're ready to make a real difference in shaping the future of our industry, we'd love to have you join us. Qualification: BE/BTech from a reputable engineering institute.

DevOps SRE Manager Devops manager Full-Time
AT

Specialist - Automation

Allianz Technology

7+ Years | Not Disclosed | Pune, Maharashtra, India | Full-time

Qualifications: 7+ years of experience working in the AI & Automation field. Proven experience in designing and implementing AIOps solutions in large-scale environments; strong expertise as an Automation Engineer with a focus on AIOps, Generative AI, and Conversational AI. Hands-on experience with Amelia AIOps software and integrations is a must. Strong knowledge of AI/ML techniques applied to IT operations. Proficiency with automation tools (e.g., Ansible, Puppet, Terraform, Chef). Expertise in cloud platforms (AWS, Azure, GCP), with hands-on experience in automation and orchestration. Solid understanding of APIs, web services, and integration technologies (e.g., REST, GraphQL, Kafka). Proficiency in scripting/programming languages (Python, Java, Bash, etc.). Familiarity with observability tools (e.g., Splunk, Dynatrace, New Relic) and ITSM tools (e.g., ServiceNow). Strong background in machine learning and deep learning algorithms. Proficiency in Python, TensorFlow, PyTorch, and Hugging Face for developing AI models. Generative AI frameworks: LangChain, LlamaIndex. Agentic frameworks: AutoGen, Semantic Kernel, crewAI, promptflow, Langflow, LangGraph. Deep understanding of transformer architectures and diffusion models. Experience in generative AI techniques such as GANs and VAEs. Ability to design and implement scalable and efficient AI systems. Experience working with DevOps, including but not limited to container technologies like Docker and Kubernetes, as well as the Cloud Native technology stack such as Argo, Helm, etcd, and Envoy. Strong communication, problem-solving, and leadership skills, with the ability to work collaboratively with diverse teams. Certifications in AWS, Azure, Generative AI, or relevant AI technologies are a plus.
Your benefits: We offer a hybrid work model which recognizes the value of striking a balance between in-person collaboration and remote working, including up to 25 days per year working from abroad. We believe in rewarding performance, and our compensation and benefits package includes a company bonus scheme, pension, employee shares program, and multiple employee discounts (details vary by location). From career development and digital learning programs to international career mobility, we offer lifelong learning for our employees worldwide and an environment where innovation, delivery, and empowerment are fostered. Flexible working, health, and wellbeing offers (including healthcare and parental leave benefits) support to balance family and career and help our people return from career breaks with experience that nothing else can teach. About Allianz Technology: Allianz Technology is the global IT service provider for Allianz and delivers IT solutions that drive the digitalization of the Group. With more than 13,000 employees located in 22 countries around the globe, Allianz Technology works together with other Allianz entities in pioneering the digitalization of the financial services industry. We oversee the full digitalization spectrum, from one of the industry's largest IT infrastructure projects that includes data centers, networking, and security, to application platforms that span from workplace services to digital interaction. In short, we deliver full-scale, end-to-end IT solutions for Allianz in the digital age. D&I statement: Allianz Technology is proud to be an equal opportunity employer encouraging diversity in the working environment. We are interested in your strengths and experience. We welcome all applications from all people regardless of gender identity and/or expression, sexual orientation, race or ethnicity, age, nationality, religion, disability, or philosophy of life. Join us. Let's care for tomorrow.

Specialist Automation Automation specialist Full-Time Process Automation
IB

DevOps Software Engineer

IBM

1-8 Years | Not Disclosed | Pune, Maharashtra, India | Full-time

IBM Systems helps IT leaders think differently about their infrastructure. IBM servers and storage are no longer inanimate - they can understand, reason, and learn so our clients can innovate while avoiding IT issues. Our systems power the world's most important industries and our clients are the architects of the future. Join us to help build our leading-edge technology portfolio designed for cognitive business and optimized for cloud computing. Your role and responsibilities: As a DevOps Software Engineer, you will be responsible for developing and enabling automation for some of the following key functions: CI/CD, observability, dashboards, alerting, and deployments. Required education: Bachelor's Degree. Preferred education: Bachelor's Degree. Required technical and professional expertise: 1-8 years of relevant industry experience. DevOps experience in managing and deploying cloud-based production applications. Engineers are responsible for ensuring that the underlying infrastructure is running smoothly and that systems and tools are working as expected. Skills: Scripting and programming language skills (Python, Shell, Go, Java, object-oriented), automation, GitHub, public cloud platform skills, Ansible, Terraform. Continuous Integration and Delivery (CI/CD) across SDLC phases (Kubernetes, OpenShift, Tekton, Terraform). Secure development practices. Knowledge of networking fundamentals/protocols, OpenShift container platform, Linux, Kubernetes. Experience with microservices architecture. Cloud provider experience (IBM, Azure, AWS, etc.), knowledge of VPC infrastructure component provisioning. Understanding of the storage domain. Preferred technical and professional experience: Experience working on public cloud such as IBM Cloud, AWS. Scripting and automation skills with Python, Shell programming. Continuous Integration and Delivery (CI/CD) across SDLC phases (Kubernetes, OpenShift, Tekton, Terraform).

DevOps Software Engineer Devops engineer Software Engineer
