Monitoring AND Alerting Jobs in Bengaluru

446 Jobs Found


Gen AI Support Engineer-2

Exotel

4-7 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Gen AI Support Engineer-2 Location: Bengaluru Experience: 4-7+ years Employment Type: Full-time About Us Exotel is the leading full-stack customer engagement platform and virtual telecom operator for emerging markets. Since its inception in 2011, Exotel has been powering 50 million daily engagements across voice, video, and messaging channels. We provide our unified customer engagement solutions to over 6000 companies globally, including industry leaders like Ola, Swiggy, Flipkart, GoJek, Byjus, Urban Company, HDFC Bank, Zomato, and Oyo. With $100 million in Series D funding and an ARR of $60 million, Exotel is a growth-stage company poised for massive impact. Overview We're seeking a Gen AI Support Engineer-2 to join our team. As an L2 Support Engineer, you will be the highest level of technical escalation within the support organization. Your role will encompass system reliability, platform integrity, troubleshooting mission-critical production issues, and collaborating with engineering teams for architecture feedback. Additionally, you'll help mentor junior engineers and improve operational processes and tools for large-scale environments. If you're passionate about writing clean code with Python and Django and want to contribute to a fast-paced, mission-driven company, this role is for you! Responsibilities Mission-Critical Issue Resolution: Own the resolution of high-priority, time-sensitive production issues. Root Cause Analysis (RCA): Lead RCA reviews and push for systemic improvements in system architecture and processes. Performance Optimization: Identify bottlenecks and propose architectural changes to improve system performance and scalability. Patch Management: Assist in configuring, deploying, and testing patches, releases, and application updates to production environments. SME for Production Systems: Serve as the Subject Matter Expert (SME) for Exotel's production systems and integrations.
Cross-Team Collaboration: Work with Delivery, Product, and Engineering teams to influence system design, rollout strategies, and improvement plans. Mentorship: Lead and mentor L1/L2 engineers on troubleshooting best practices and continuous learning. Code Writing & Automation: Write clean, maintainable code for internal tools, scripts, and automation using Python and Django. Support Tooling: Automate recovery workflows and design support tools for proactive monitoring. Operational Excellence: Establish and improve SLAs, monitoring dashboards, alerting systems, and operational runbooks to ensure system reliability. Must Have Skills Backend Development Support: 3+ years of experience in backend development support, production support, or DevOps/SRE roles. Core Technologies: Proficiency in Python, Django, SQL, and troubleshooting in Linux. Web Technologies: Strong understanding of HTML, CSS, JavaScript, and other web technologies. Distributed Systems & Cloud: Experience working with distributed systems, cloud architecture (AWS), Docker, and Kubernetes. Automation: Strong scripting skills with Bash/Python for automation and operational support. CI/CD & Observability: Good understanding of CI/CD, observability tools, and release management workflows. Communication Skills: Excellent communication, leadership, and incident command skills for managing production issues and cross-functional collaboration. Nice to Have Experience with AI-powered systems and machine learning technologies. Familiarity with monitoring systems like Prometheus, Grafana, or Elasticsearch. Knowledge of microservices architectures and scaling distributed systems. Innovative Work: Be at the forefront of cloud-based communications technology and AI-driven customer engagement platforms. Impact: Play a key role in maintaining and optimizing systems that power millions of customer interactions daily. 
Growth Opportunities: Be part of a fast-growing company with ample learning opportunities and career development. Collaborative Environment: Work in a supportive, inclusive environment where your input and ideas matter. Competitive Benefits: Comprehensive benefits package including health insurance, mental wellness support, and more.


ML Ops Engineer

Mpokket Financial Services Private Limited

3-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: ML Ops Engineer Location: Bangalore Department: Data Science Employee Type: Full-time Experience Required: 3-5 years Position Overview We are seeking an experienced and motivated ML Ops Engineer to join our Data Science team. In this role, you will be responsible for deploying, monitoring, and maintaining machine learning models in production environments. You will work closely with data scientists, engineers, and product teams to ensure models are scalable, reliable, and aligned with business objectives. This role is ideal for professionals who are passionate about building robust ML pipelines and bringing machine learning solutions into real-world applications at scale. Key Responsibilities Deploy and manage machine learning models in production environments, ensuring scalability, reliability, and performance. Build and maintain MLOps pipelines using platforms like Databricks and MLflow. Monitor model performance, accuracy, and health; implement alerting and diagnostics as needed. Develop and maintain RESTful APIs using Python frameworks such as Flask or Django to serve ML models. Optimize data workflows and collaborate with engineering teams to improve model integration and performance. Design strategies for automated model retraining, deployment, and version control. Write clean, maintainable, and efficient code using Python, adhering to OOP principles and best practices. Write complex queries using SQL and work with NoSQL databases to support data pipelines and feature stores. Leverage Python libraries such as PySpark, Pandas, scikit-learn, SQLAlchemy, and Requests. Minimum Qualifications Bachelor's or Master's degree in Computer Science, Statistics, Econometrics, Operations Research, or a related technical field. 3-5 years of experience in building, deploying, and monitoring machine learning solutions in production. Must-Have Skills Experience with Databricks and MLflow for model training and deployment.
Proven expertise in machine learning model deployment and monitoring in live environments. Strong programming skills in Python, with solid understanding of data structures, algorithms, and OOP concepts. Experience developing RESTful APIs using Flask or Django. Proficient in SQL and NoSQL database operations. Hands-on knowledge of libraries such as Pandas, PySpark, scikit-learn, SQLAlchemy, and Requests. Strong analytical, problem-solving, and debugging skills. Good-to-Have Skills Experience with Kafka streaming and batch processing. Familiarity with CI/CD pipelines and version control systems like Git. Understanding of Python multiprocessing, worker/queue systems, and asynchronous/event-driven programming. This is a unique opportunity to work at the intersection of machine learning and DevOps. You'll play a critical role in operationalizing AI models and making them a core part of our product offerings. If you enjoy building scalable systems and solving real-world ML engineering challenges, we'd love to meet you.


Site Reliability Engineer

Groww

4-6 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Site Reliability Engineer Location: Bengaluru About Groww At Groww, we're on a mission to make financial services simple, accessible, and transparent for every Indian. As one of India's fastest-growing financial platforms, we help millions take control of their financial future through a wide range of products. We're a team driven by ownership, radical customer-centricity, and a deep passion for challenging the status quo. From intuitive design to robust engineering, everything we build is grounded in what our customers need. If you're excited about building systems that power the future of finance in India, we'd love to hear from you. Our Vision To empower every Indian with the knowledge, tools, and confidence to make sound financial decisions. Our goal is to be the most trusted financial partner for millions across the country. Our Core Values Customer Obsession: We put our users first, always. Extreme Ownership: We own everything we do, end-to-end. Simplicity: We keep things simple, effective, and intuitive. Long-term Thinking: We focus on sustainable, impactful decisions. Transparency: We believe in open communication and collaboration. Role Overview: As a Site Reliability Engineer (SRE) at Groww, you will be responsible for ensuring our systems are highly available, performant, and secure. You will work closely with engineering and infrastructure teams to improve reliability, automate deployments, and manage mission-critical services that power our platform. Key Responsibilities: Monitor and troubleshoot issues related to system performance, availability, and security. Define and maintain SLIs, SLOs, and Error Budgets to improve system reliability. Use tools like Grafana to analyze and report on metrics and trace data. Participate in the on-call rotation for 24/7 support of production systems. Collaborate with developers to ensure scalability and reliability are built into new services. Roll out security and infrastructure features proactively.
Manage automated deployments, version control, and release rollouts. Perform Root Cause Analysis (RCA) for incidents and implement long-term fixes. Optimize system performance, conduct capacity planning, and create recovery strategies. Identify and automate repetitive tasks to reduce toil. Leverage CI/CD tools such as Git, Jira, Jenkins to streamline development workflows. Requirements: 4-6 years of relevant experience in SRE, DevOps, or infrastructure engineering. Bachelor's or Master's degree in Computer Science or a related field. Strong background in Linux/Unix system administration and networking. Hands-on experience with cloud platforms like GCP or AWS. Proficiency in programming languages such as Python, Java, or Go. Experience with monitoring and alerting tools: Grafana, Prometheus, New Relic, etc. Familiarity with configuration management tools. Experience with Kubernetes, Docker, and container orchestration tools is a strong plus. Excellent problem-solving, communication, and team collaboration skills. Be a part of one of India's fastest-growing fintech startups. Build and scale systems that impact millions of users daily. Work with passionate, driven teammates who are redefining financial services. A culture that encourages continuous learning, ownership, and transparency. If you're ready to help shape the future of fintech infrastructure in India, Groww is the place for you. Let's build something extraordinary together.


Platform Engineer

Colortokens

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Platform Engineer Location: Bengaluru, Karnataka, India (full-time, partially remote) About ColorTokens At ColorTokens, we empower businesses to stay operational and resilient in an increasingly complex cybersecurity landscape. Breaches happen, but with our cutting-edge ColorTokens Xshield platform, companies can minimize the impact of breaches by preventing the lateral spread of ransomware and advanced malware. We enable organizations to continue operating while breaches are contained, ensuring critical assets remain protected. Our innovative platform provides unparalleled visibility into traffic patterns between workloads, OT/IoT/IoMT devices, and users, allowing businesses to enforce granular micro-perimeters, swiftly isolate key assets, and respond to breaches with agility. Recognized as a Leader in the Forrester Wave: Microsegmentation Solutions (Q3 2024), ColorTokens safeguards global enterprises and delivers significant savings by preventing costly disruptions. Our culture We foster an environment that values customer focus, innovation, collaboration, mutual respect, and informed decision-making. We believe in alignment and empowerment so you can own and drive initiatives autonomously. Self-starters and highly motivated individuals will enjoy the rewarding experience of solving complex challenges that protect some of the world's impactful organizations, be it a children's hospital, a city, or the defense department of an entire country. Position Overview: ColorTokens is looking for a Junior Platform Administrator to assist in managing, maintaining, and optimizing our NextGen Security Information and Event Management (SIEM) platform. The ideal candidate will support the day-to-day operations, help onboard customer log sources, troubleshoot integration issues, and provide technical assistance to the security operations team. This role is ideal for a motivated professional with 3+ years of experience in SIEM administration, security operations, or log management.
Key Responsibilities: SIEM Platform Administration Assist in deploying, configuring, and maintaining the NextGen SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, QRadar, Chronicle, Exabeam). Perform basic updates and patches to ensure platform security and functionality. Monitor SIEM health, performance, and uptime under the guidance of senior administrators. Log Source Management Onboard new log sources and validate data ingestion. Help troubleshoot log ingestion, parsing, and formatting issues. Maintain log retention policies for compliance. Rule and Use Case Management Support the development and deployment of detection rules, correlation use cases, and alerts. Tune existing use cases to minimize false positives. Work closely with security analysts to refine alerting strategies. Integration and Automation Assist in integrating SIEM with other security tools (e.g., EDR, microsegmentation, vulnerability scanners). Work on basic automation tasks using scripting (Python, PowerShell) to enhance SIEM efficiency. Platform Security and Compliance Support role-based access control (RBAC) and platform security policies. Help ensure SIEM adheres to compliance standards like SOC2, ISO 27001. Participate in periodic security audits. Network Debugging & Troubleshooting Have a basic understanding of TCP/IP, networking concepts, and protocols. Assist in debugging network connectivity issues related to SIEM log ingestion. Use basic network troubleshooting tools. Collaboration and Support Work alongside SOC analysts, threat hunters, and security engineers. Provide basic technical support for SIEM users. Assist in training and documentation for security teams. Performance Monitoring and Optimization Monitor storage and indexing performance to ensure optimal operations. Report any performance issues to senior administrators. Contribute to platform health reports and alerting metrics. 
Incident Support Assist SOC teams in log analysis, incident response, and forensic investigations. Ensure log data is readily available for security incidents. Education and Certifications: Bachelor's degree in Computer Science, Information Security, or a related field. Certifications (Preferred but not mandatory): Splunk Certified User/Admin; Microsoft Certified: Security Operations Analyst Associate; QRadar Certification; any SIEM-related certification. Experience: 3+ years of experience in SIEM administration, security operations, or log management. Hands-on experience with at least one SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, Chronicle, Exabeam). Basic knowledge of log ingestion, rule creation, and data parsing. Exposure to scripting (Python, PowerShell) for automation. Basic understanding of TCP/IP networking concepts and network debugging. Technical Skills: Understanding of log formats, Syslog, JSON, XML, and data pipelines. Basic knowledge of querying languages (KQL, SPL, AQL). Familiarity with SIEM integration with security tools like EDR, SOAR, NDR. Awareness of MITRE ATT&CK, NIST, or CIS security frameworks. Basic experience with network troubleshooting tools (ping, traceroute, netcat (nc)). Soft Skills: Strong problem-solving and troubleshooting abilities. Good verbal and written communication skills. Ability to work collaboratively in a security operations environment. Preferred Skills: Basic understanding of cloud-based security solutions (AWS, Azure, Google Cloud). Exposure to SOAR tools (e.g., Cortex XSOAR, Splunk Phantom). Interest in machine learning-based anomaly detection for SIEM. Key Metrics for Success: Successful onboarding of log sources. Improvement in log ingestion and parsing accuracy. Contribution to fine-tuning detection rules. Timely resolution of SIEM-related support requests. Ability to identify and troubleshoot basic network connectivity issues.


Business Technology Data Engineer

Samsara Inc

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Business Technology Data Engineer Location: Bengaluru, India (Hybrid 3 days onsite) Company: Samsara Technologies India Pvt. Ltd. About Samsara Samsara (NYSE: IOT) is a leader in the Connected Operations Cloud, enabling businesses across industries like transportation, logistics, manufacturing, and field services to harness IoT data for safety, efficiency, and sustainability improvements. Samsara helps organizations digitize physical operations at scale, improving outcomes that impact global infrastructure. Role Overview Samsara is seeking a Business Technology Data Engineer to join its Data & Analytics team within the Business Technology division. In this role, you will design, build, and optimize end-to-end data pipelines and infrastructure for various business-critical systems across CRM, marketing, support, and product platforms. You'll collaborate with teams across the company to build reliable and scalable data solutions that power reporting, automation, and analytics. This hybrid role requires working 3 days per week from the Bengaluru office and 2 days remotely, with working hours aligned to India Standard Time (IST). Key Responsibilities Data Engineering & Platform Development Design and maintain ETL/ELT pipelines that integrate and transform data across business systems. Build scalable data infrastructure to support advanced analytics and real-time reporting needs. Write Python and SQL scripts for data ingestion, transformation, and validation. Data Integration & Enablement Work with diverse data sources: CRM, product telemetry, marketing automation, support ticketing, and order flow systems. Develop and support data lake and data warehouse solutions using Snowflake, Redshift, Databricks, or BigQuery. Ensure interoperability between applications and data layers. Performance & Quality Monitor and optimize pipeline performance, implement observability and alerting. Improve data quality, lineage, and governance across systems. 
Partner with internal stakeholders (e.g., Sales Ops, Marketing Ops, Analytics) to deliver reliable data products. Minimum Qualifications Bachelor's degree in Computer Science, Data Engineering, or related field. 5+ years of professional experience in data engineering. 3+ years of experience building and maintaining end-to-end pipelines in a modern data stack. Strong in SQL and Python. Hands-on experience with: ETL tools: Fivetran, dbt; Cloud: AWS (preferred), GCP, or Azure; Databases: MySQL, PostgreSQL, Oracle, or similar; Data Warehouses: Snowflake, Redshift, BigQuery, Databricks. Preferred Qualifications Familiarity with API-based ingestion, serverless architecture (Lambda, API Gateway, SQS, etc.). Experience with monitoring tools (DataDog, CloudWatch, Splunk). Comfortable engaging stakeholders to translate business needs into data solutions. Proficiency in Docker, Kubernetes, or AWS Fargate is a plus.


IT Automation Engineer

Samsara Inc

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: IT Automation Engineer Location: Bengaluru, India (Hybrid 3 days onsite) Company: Samsara Technologies India Pvt. Ltd. About Samsara Samsara (NYSE: IOT) is a global leader in the Connected Operations Cloud, empowering organizations in physical operations such as transportation, logistics, construction, and manufacturing to unlock actionable insights from IoT data. With products that improve safety, efficiency, and sustainability, Samsara is at the forefront of digital transformation for industries that power the world. Role Overview As an IT Automation Engineer within Samsara's Business Technology Core IT team, you'll play a key role in streamlining internal IT systems and processes through automation, infrastructure-as-code, and modern DevOps practices. This position emphasizes cloud infrastructure, scripting, CI/CD, and SaaS system integration to support high-growth scalability and efficiency across Samsara's enterprise environment. This hybrid role requires 3 days per week in the Bengaluru office and 2 days remote, operating in India Standard Time (IST). Key Responsibilities Automation & Development Design and build automation scripts and services using Python, Bash, or JavaScript (Node.js). Automate repetitive IT operations across internal platforms, SaaS tools, and cloud infrastructure. Develop and deploy Infrastructure-as-Code (IaC) using Terraform or CloudFormation for AWS environments. Cloud & DevOps Engineering Manage and provision AWS services such as Lambda, EC2, S3, RDS, ECS, API Gateway, etc. Build and maintain CI/CD pipelines and implement containerized solutions using Docker. Implement observability and monitoring solutions using tools like CloudWatch and Splunk. Collaboration & Strategy Partner cross-functionally with IT, security, and business systems teams. Lead strategic automation initiatives to improve IT efficiency at scale. Write and maintain clear documentation for automated workflows and tooling.
Minimum Qualifications Bachelor's degree in Computer Science, IT, or a related field. 5+ years in IT automation, DevOps, or software development roles. Strong scripting skills in Python, JavaScript (Node.js), or Go. Hands-on experience with AWS services and IaC tools (Terraform preferred). Experience with SaaS ecosystems like Google Workspace, Okta, Slack, Zoom, GitHub, Zendesk. Proficient in version control using Git/GitHub and building CI/CD pipelines. Strong communication and cross-functional collaboration skills. Preferred Qualifications Familiarity with Atlassian tools (Jira, Confluence), OpsGenie, StatusPage. Experience with Splunk and monitoring large-scale cloud systems. Exposure to Google Cloud Platform (GCP). Experience leading end-to-end internal application development projects.


Senior Full Stack Engineer

Commure

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior Full Stack Engineer Location: Bengaluru, India Employment Type: Full-time Department: Engineering About Commure At Commure, we empower healthcare providers by reducing administrative burdens and enabling more time for patient care. Our suite of software and hardware solutions, including AI-powered assistants, RTLS, and workflow automation, is used by over 250,000 clinicians across hundreds of care sites. From clinical documentation and staff safety to patient engagement and remote monitoring, we're transforming healthcare through technology. With the industry entering a pivotal phase of AI-driven transformation, Commure is leading the charge. About the Role As a Senior Full Stack Engineer on our Patient Experience Platform team, you'll design and build intuitive, secure, and scalable web applications that enhance patient engagement and streamline healthcare workflows. This is a high-impact role contributing to mission-critical projects with real-world outcomes. Key Responsibilities Design and develop full-stack applications that connect patients and healthcare providers. Lead architectural decisions to scale and evolve the platform. Work closely with product, design, QA, and DevOps teams to gather requirements, define solutions, and deliver features. Optimize system performance, reliability, and observability using logging, monitoring, and tracing tools. Maintain cloud infrastructure using Infrastructure-as-Code (IaC) for reproducibility and reliability. Enhance alerting systems to reduce noise and improve incident response. Develop secure authentication and authorization systems that comply with industry standards. Build and maintain CI/CD pipelines, supporting a robust and compliant deployment process. Participate in on-call rotations and production support. Document processes, configurations, and troubleshooting steps for internal knowledge sharing. Promote a culture of engineering excellence through code reviews, best practices, and mentorship.
Qualifications Required: Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 3+ years of experience in full-stack software development. Proficiency in: Front-end: TypeScript, React, Next.js; Back-end: Python and Node.js; Cloud Platforms: AWS, GCP, or Azure; CI/CD: GitHub Actions, Google Cloud Build; Version Control: Git; Containerization: Docker and Kubernetes; Monitoring/Logging: Cloud-native tools and observability practices. Experience with production incident support and on-call rotations. Strong communication, collaboration, and leadership skills. Preferred: Familiarity with serverless architectures and microservices. Knowledge of healthcare data standards like HL7, FHIR, and HIPAA compliance. Experience optimizing performance for large-scale distributed systems. Why Join Commure + Athelas Mission-Driven Impact: Transforming healthcare, the largest sector in the country. Top-Tier Investors: Backed by General Catalyst, Sequoia, Y Combinator, Lux, and more. Exceptional Growth: Combined organizations growing 500% YoY, with Series D funding and strong runway. Comprehensive Benefits: Competitive compensation, flexible PTO, medical/dental/vision insurance, parental leave (location-dependent). Join us and help power the future of patient care.


Senior Software Engineer, Customer Solutions

Commure

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior Software Engineer, Customer Solutions Location: Bengaluru, India Employment Type: Full-time Department: Engineering About Commure Commure is revolutionizing healthcare with AI-powered technologies designed to eliminate administrative overhead and give clinicians more time with patients. Our platform combines advanced LLM AI, RTLS, and workflow automation to streamline clinical operations, improve patient engagement, and enhance care delivery. We support 250,000+ clinicians across hundreds of care sites nationwide, and we're just getting started. If you're passionate about building life-changing solutions in one of the world's most vital industries, now is the time to join. About the Role As a Senior Software Engineer on the Customer Solutions team, you'll be instrumental in building and customizing applications on top of our Patient Experience Platform to address client-specific needs. Your work will directly impact how healthcare providers interact with our technology and serve patients better. Key Responsibilities Translate business and client requirements into scalable, maintainable technical solutions. Design, develop, and integrate customized applications and services using our core platform. Collaborate with internal teams and customers to prioritize features and maintain a customer-focused development backlog. Build long-term client relationships through technical leadership and delivery excellence. Implement and maintain observability through logging, monitoring, and alerting systems. Apply SRE and DevOps practices to improve stability and incident response. Coordinate testing and quality assurance activities in collaboration with QA teams. Stay informed on healthcare tech trends and integrate innovations into the platform. Participate in client-facing meetings to advise on feasibility, risks, and technical trade-offs. Mentor junior engineers and contribute to a strong engineering culture.
Required Qualifications Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 3+ years of professional software development experience. Frontend: React, Next.js, TypeScript. Backend: Python, Node.js. Cloud: Proficiency in AWS, Azure, or GCP with experience in cloud-native architectures. CI/CD: Familiarity with tools like GitHub Actions, Google Cloud Build, etc. Infrastructure: Experience with Docker, Kubernetes, and IaC principles. Monitoring & Observability: Implemented logging, tracing, and alerting systems. Production Support: Experience with on-call rotations and incident response. Strong communication and collaboration skills with cross-functional teams. Experience working directly with clients to deliver technical solutions. Understanding of APIs, webhooks, and third-party system integrations in healthcare. Preferred Skills Familiarity with HIPAA, FHIR, HL7, and other healthcare standards. Understanding of data privacy, compliance, and security best practices. Strong problem-solving abilities and adaptability in dynamic environments. Experience in client support, customization, or professional services engineering is a plus. Why You'll Love Working at Commure + Athelas Mission-Driven Work: Help transform healthcare through meaningful technology. Elite Backing: Backed by General Catalyst, Sequoia, Y Combinator, and more. Explosive Growth: 500%+ YoY growth pre-merger and Series D funded. Competitive Benefits: Flexible PTO, health insurance, parental leave, and more (location-specific). Be part of the future of healthcare. Join Commure and help build intelligent, scalable systems that truly matter.


Senior Analyst

Latentview Analytics

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Role: Senior Analyst, Machine Learning Performance & Testing Location: Bengaluru, Karnataka, India Experience: 3-5 Years Employment Type: Permanent, Full-Time About the Role We are seeking a skilled and detail-oriented Senior Analyst with strong experience in ML model performance testing, load testing, and end-to-end (E2E) automation. This role is focused on ensuring scalable, low-latency deployment of production-grade machine learning models. The ideal candidate will be proficient in evaluating model performance under varied workloads, building robust test frameworks, and enhancing system monitoring. Key Responsibilities Conduct load testing and performance benchmarking for machine learning models under varying requests per second (RPS) scenarios. Develop and automate end-to-end test cases to validate model readiness and support smooth rollouts. Monitor and improve model scalability, response time, and error rates across production environments. Collaborate with ML engineers, backend developers, and QA test teams to ensure seamless integration and testing workflows. Identify and address bottlenecks in model inference, helping improve performance for high-volume, low-latency applications. Set up alerting and observability pipelines for model health using industry-standard tools.
Required Skills & Tools Performance Testing & Monitoring: ML Load Testing, Job Monitoring, Model Scalability Evaluation Platforms & Tools: Databricks, MLflow, Seldon, Kubeflow, Tecton, Jenkins Cloud Services: Experience with AWS and deploying/testing models in cloud environments Programming Languages: Proficiency in at least one of the following Python, Java, Scala Experience: Working with production-level ML models, especially involving high data volumes and real-time inference Strong communication skills and ability to work in cross-functional teams Preferred Qualifications Hands-on experience with CI/CD pipelines for ML systems Knowledge of A/B testing and canary deployments for ML models Experience building testing frameworks for ML infrastructure at scale Understanding of monitoring and alerting best practices in production ML systems Be at the forefront of ML operations and model performance optimization Collaborate with industry-leading engineers and contribute to cutting-edge AI deployments Gain deep exposure to real-time data systems, cloud platforms, and enterprise-scale ML testing Competitive compensation and an innovative, fast-paced work environment

Senior Analyst Senior analyst Full-Time Data Analysis

DevOps Engineer

Sarvam

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

DevOps Engineer Location: Bengaluru, Karnataka, India (On-Site) Department: Engineering Employment Type: Full-Time About Sarvam.ai Sarvam.ai is a cutting-edge generative AI startup headquartered in Bengaluru, India, with a mission to make generative AI accessible and impactful for Bharat. Founded by AI experts, we are dedicated to developing high-performance, cost-effective AI agents tailored for the Indian market. We enable enterprises to tap into new opportunities, build deeper customer connections, and reshape the future of AI for India and beyond. Role Overview We are looking for a DevOps Engineer to join our team and help build and manage scalable, secure, and high-performance infrastructure. In this role, you will be a key contributor to automating deployments, managing cloud infrastructure, optimizing CI/CD workflows, and ensuring system reliability. You will work with cutting-edge technologies, including cloud platforms, containerization, and infrastructure as code (IaC), to deliver impactful solutions for AI-driven products. Key Responsibilities CI/CD Pipelines: Design, implement, and manage CI/CD pipelines for seamless software deployment and integration. Cloud Infrastructure: Deploy and manage cloud infrastructure using Terraform, Kubernetes, and Docker for scalability and high performance. Automation & Scaling: Automate infrastructure provisioning, scaling, and security compliance to support high-availability environments. Monitoring & Optimization: Implement logging, monitoring, and alerting solutions using tools like Prometheus, Grafana, ELK Stack, or CloudWatch to monitor system performance and optimize resource utilization. Security & Compliance: Enhance security and compliance by managing IAM policies, encryption, and vulnerability scanning. Troubleshooting & Root Cause Analysis: Troubleshoot system failures, perform root cause analysis, and implement improvements to ensure reliability and uptime. 
Collaboration: Work closely with development teams to ensure smooth deployment and operation of AI models and applications. Must-Have Skills & Qualifications Educational Background: Bachelor's degree in Computer Science, Engineering, or related field (2024/2025 graduates). Cloud Expertise: Strong experience with AWS, Azure, or GCP for deploying and managing cloud-based applications. Containerization: Proficiency in Docker and Kubernetes for building and managing containerized applications. Infrastructure as Code (IaC): Experience with Terraform, Ansible, or CloudFormation to automate infrastructure management. CI/CD Pipelines: Experience in setting up automated workflows using tools like GitHub Actions, Jenkins, or GitLab CI/CD for smooth deployments. Monitoring & Logging: Experience with Prometheus, Grafana, ELK, or similar tools to implement effective monitoring and logging solutions. Networking & Security: Strong understanding of firewalls, VPNs, SSL, and cloud security best practices for secure infrastructure. Version Control: Proficiency with Git for managing code repositories and version control workflows. Problem Solving: Strong debugging, troubleshooting, and analytical skills to resolve complex system issues. Good to Have (Preferred Experience) Serverless Computing: Exposure to serverless computing models such as AWS Lambda or Azure Functions. Message Queues: Experience with message queues like Kafka, RabbitMQ, or SQS. Site Reliability Engineering (SRE): Familiarity with SRE practices to ensure the reliability and availability of large-scale systems. Open Source Contributions: Contributions to open-source projects or a strong GitHub portfolio showcasing DevOps expertise and best practices. Impactful Work: Work on AI-driven products that are reshaping the future of technology in India. Innovative Team: Collaborate with a team of AI experts and engineers pushing the boundaries of technology.
Career Growth: Opportunity to grow in a fast-growing startup at the forefront of the generative AI revolution. Cutting-edge Technologies: Work with cloud technologies, automation, and AI infrastructure to create high-impact products. Qualification: Bachelor's degree in Computer Science, Engineering, or related field

DevOps Engineer Devops engineer Full-Time Continuous integration

Site Reliability Developer 2/3

Oracle

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Description: Site Reliability Engineer - OCI Cloud Engineering Team Role: Site Reliability Engineer (SRE) Team: OCI OLTP (Online Transaction Processing) Location: Kiev Career Level: IC2 Experience: 5+ years Overview: Oracle Cloud Infrastructure's (OCI) OLTP organization is seeking a Site Reliability Engineer (SRE) to join our dynamic and fast-paced Cloud engineering team. The team is responsible for mission-critical distributed systems and cloud services, and we are looking for an engineer who is deeply interested in databases, distributed systems, and cloud services. If you thrive in an environment where innovation, problem-solving, and operational excellence intersect, this is an exciting opportunity for you! As a member of the SRE services, you will focus on Cloud Services, building deployments, operations, security vulnerability mitigation, and automation. You will be instrumental in fostering a culture of Site Reliability Engineering (SRE) within the team, and your work will directly contribute to ensuring the stability, performance, and reliability of Oracle's global cloud service infrastructure. This role requires someone who is adaptable, highly motivated, and capable of managing large-scale cloud environments with a focus on continuous improvement. Key Responsibilities: Cloud Service Operations & Reliability: Deploy, operate, and maintain large-scale cloud service products in a highly available, fault-tolerant, and scalable environment. Collaborate with internal teams to identify and mitigate cross-team issues that pose operational risks to cloud services. Focus on systems reliability and ensure the continuous availability of cloud services by automating tasks and eliminating manual interventions. Automation & Improvements: Automate operational tasks and improve service deployments, focusing on scaling, performance, and uptime. Contribute to CI/CD systems, ensuring seamless integration and continuous delivery for cloud-based services.
Leverage automation tools such as Terraform, Grafana, and Bitbucket to streamline operations. Security & Incident Response: Mitigate security vulnerabilities within cloud services and ensure compliance with Oracle's security standards. Participate in on-call rotations to provide immediate troubleshooting support and ensure rapid issue resolution. Perform deep analysis of service performance and collaborate with team members to diagnose and resolve issues that affect service availability or performance. Collaborative Problem-Solving: Work closely with cross-functional teams, including development, database, networking, and storage experts, to ensure the reliability and performance of services. Identify systemic issues and potential risks, develop solutions, and ensure proper documentation and communication with stakeholders. Documentation & Knowledge Sharing: Contribute to documentation such as runbooks, operational guides, and troubleshooting manuals. Mentor junior engineers and share knowledge on best practices for site reliability engineering and cloud service operations. Continuous Learning: Stay up to date with new cloud technologies, trends, and best practices, and actively implement them in your day-to-day work. Technical and Professional Requirements: Cloud Services & Infrastructure: 5+ years of experience in SRE, DevOps, or Automation roles with a focus on large-scale infrastructure and cloud services. Hands-on experience with cloud platforms (e.g., OCI, AWS, Azure) and expertise in compute, database, networking, and storage services within cloud environments. Automation & Tooling: Proficiency with automation tools such as Terraform, Grafana, LumberJack, and Shepherd. Solid experience in using CI/CD tools and processes for cloud service deployments and operations. Scripting & Systems: Strong knowledge of scripting languages, particularly Python and Java. 
Familiarity with Linux systems, Docker containers, virtualized infrastructure, and orchestration (e.g., Kubernetes). Performance & Troubleshooting: Excellent troubleshooting skills with a focus on performance, availability, reliability, and scalability of distributed systems. Experience in operating fault-tolerant, highly available, high-throughput distributed systems. Security & Incident Management: Familiarity with security practices and mitigating security vulnerabilities in cloud services. Proven ability to handle incident response and provide efficient troubleshooting during on-call rotations. Collaboration & Communication: Strong verbal and written communication skills, capable of working effectively with diverse teams across multiple geographies. Ability to work in a highly collaborative environment, driving operational excellence and customer satisfaction. Preferred Qualifications: Experience in operating and maintaining multi-tenant, cloud-based infrastructure with a focus on scalability and high availability. Familiarity with tools and platforms like Grafana, Prometheus, and other observability and monitoring tools. Experience in networking and storage technologies in a cloud environment. Joining OCI's OLTP team as an SRE gives you the opportunity to work with cutting-edge technologies and contribute to the operational excellence of Oracle's global cloud infrastructure. This is a chance to grow your skills in a highly dynamic environment and to solve complex problems that directly impact mission-critical cloud services. With a focus on automation, scalability, and high performance, you will be an essential part of a team that powers Oracle's leading cloud services. If you are an experienced engineer passionate about cloud technologies, automation, and ensuring the reliability of large-scale systems, we encourage you to apply and join us in this exciting journey!

Site Reliability Site reliability Developer Site developer

Technology Support II

J.P. Morgan

2+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Technology Support II Location: Bengaluru, India Department: Corporate Technology Team, JPMorgan Chase Job Description As a Technology Support II within the Corporate Technology team at JPMorgan Chase, you will leverage best practices in software engineering to solve complex business problems and drive excellence in technology solutions. You will be responsible for working on small to medium projects independently and collaborating with cross-functional teams to enhance your understanding of business needs and relevant technologies. This role involves championing site reliability practices, applying your experience in Agile SDLC, and proficiency with development toolsets. You will have a solid understanding of application, data, and infrastructure architecture, and effectively use ETL software such as Ab Initio. Staying updated on industry trends, leveraging your knowledge of financial instruments, and fostering an innovative culture will be key to your success. You will apply your software skills in business analysis, development, maintenance, and improvement, all while collaborating within large teams to achieve the organization's goals. Key Responsibilities Site Reliability: Champion site reliability culture, providing technical influence across the team. Agile Practices: Apply your experience with Agile SDLC and proficiency with development tools. Application & Infrastructure Architecture: Demonstrate solid knowledge in application, data, and infrastructure architecture disciplines. ETL Software Usage: Utilize Ab Initio ETL software effectively to process and integrate data. Industry Awareness: Stay informed about technology trends and best practices across the industry. Financial Instrument Knowledge: Leverage knowledge of various financial instruments in your work. Innovation Culture: Foster an innovative culture, bringing passion and creativity to problem-solving.
Software Skills Application: Apply your software skills in business analysis, development, maintenance, and improvement. Collaboration: Collaborate effectively in large teams to meet organizational goals. Independent Work: Work independently and take the initiative on tasks and projects. Required Qualifications, Capabilities, and Skills Training & Certification: Formal training or certification in application support concepts, with 2+ years of applied experience. Programming & Scripting: Experience in Python or similar programming languages. Automation Tools: Experience with automation tools such as Ansible, Autosys, or Control-M. Site Reliability Knowledge: Emerging knowledge of reliability, scalability, performance, security, and site reliability best practices. Monitoring & Alerting: Familiar with service level objective alerting and monitoring tools (e.g., Splunk, Datadog, Dynatrace). CI/CD Tools: Familiar with continuous integration and delivery tools such as Jenkins, GitLab, or Terraform. Automation with Terraform & Python: Emerging knowledge of Terraform and automation in Python. Containers & Orchestration: Emerging knowledge of containers and container orchestration tools (e.g., ECS, Kubernetes, Docker). Collaboration Skills: Strong communication and collaboration skills, with the ability to thrive in a fast-paced, dynamic environment. Preferred Qualifications Cloud Experience: Experience with cloud platforms (preferably AWS) and setting up infrastructure using Terraform. Platform Experience: Advantageous to have experience supporting applications on platforms such as Databricks, Snowflake, or AWS EMR. Virtualization & Cloud Architecture: Knowledge of virtualization, cloud architecture, and services for automated deployments. About JPMorgan Chase JPMorgan Chase is one of the oldest and most prominent financial institutions in the world. 
With over 200 years of history, we provide innovative financial solutions to millions of consumers, small businesses, and some of the world's largest corporate, institutional, and government clients. Our services span across investment banking, consumer banking, small business banking, commercial banking, financial transaction processing, and asset management. Join us and be part of a global leader in the financial services industry.

Technology Support Technology support Ii Full-Time

Principal SDET

Locus

5-8 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Principal SDET Location: Bangalore (On-site; full-time) About Locus: At Locus, we are redefining logistics decision-making with deep-tech solutions that drive efficiency, consistency, and transparency across industries like retail and FMCG/CPG. Founded in 2015 by Nishith Rastogi and Geet Garg, Locus has evolved from a women's safety geo-tracking app into a globally recognized logistics optimization platform. Our technology has empowered enterprises such as Unilever and Nestlé to execute over a billion deliveries across 30+ countries. Guided by our commitment to innovation and sustainable growth, we transform complex supply chains into strategic growth enablers. Join us at Locus and be part of a team shaping the future of global logistics. Job Overview: About the Role As a Principal SDET at Locus, you will play a critical role in driving the quality and reliability of our platform. This role goes beyond traditional testing; you will design, develop, and enhance automated test frameworks, ensure seamless integration of quality engineering practices, and mentor team members to establish a quality-first culture. Key Responsibilities: Automation Framework Design and Development: Architect, develop, and maintain robust test automation frameworks for backend, APIs, and frontend components. Ensure the frameworks are scalable, reusable, and aligned with the latest industry standards. Test Strategy and Planning: Collaborate with product managers, developers, and DevOps to define comprehensive test strategies for new features and system enhancements. Own the end-to-end testing lifecycle, from requirement analysis to test case creation, execution, and reporting. Drive better QA practices in areas such as defect creation, capturing feature scope, sign-offs, and functional and automation coverage matrices. Quality Advocacy and Best Practices: Drive the adoption of best practices in testing, coding standards, and CI/CD processes across teams.
Act as a champion of quality by fostering a quality-first mindset and instilling a culture of rigorous testing. Test Execution and Debugging: Conduct functional, performance, and security testing, ensuring the product meets the highest quality standards. Debug complex issues and work closely with developers to identify and resolve root causes. Continuous Improvement: Analyze test results and metrics to identify areas for improvement in testing processes and product quality. Contribute to the development and enhancement of monitoring and alerting systems to proactively address production issues. Mentorship and Collaboration: Mentor and guide junior SDETs and quality engineers, sharing knowledge and expertise to elevate the team's capabilities. Collaborate effectively with cross-functional teams to ensure quality is integrated into every stage of the development process. Develop a good understanding of velocity in teams and across the org, and work towards removing roadblocks to improve release velocity. Qualifications: 5-8 years of experience in software testing, with at least 3 years focused on test automation. Proficiency in programming languages such as Java, Python, or JavaScript. Hands-on experience with test automation tools and frameworks for Web and API automation like Selenium, Appium, TestNG, JUnit, or similar. Experience with AI-enabled testing tools or frameworks is a plus. Expertise in API testing and automation using tools like Postman, RestAssured, or equivalent. Familiarity with performance testing tools such as JMeter or Gatling. DevOps and CI/CD: Experience with CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or GitHub Actions. Knowledge of Docker, Kubernetes, and cloud platforms (AWS, GCP, or Azure) is a plus. Strong debugging skills and the ability to identify root causes of issues quickly. Excellent communication, collaboration, and leadership skills. Experience in testing large-scale, distributed systems.
Knowledge of security testing and tools like OWASP ZAP or Burp Suite. Exposure to machine learning models and their testing challenges. Join Locus and become part of a visionary team that is redefining logistics through innovation and smart distribution. We provide competitive compensation, comprehensive benefits, and a collaborative environment where your expertise will drive both your growth and that of the organization. Locus is an equal opportunity employer dedicated to creating a diverse and inclusive workplace.

Principal Sdet Full-Time Principal SDET Software development engineer in test

Application Developer: Cloud Fullstack

International Business Machines Corporation

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Application Developer - IBM Consulting Introduction: As an Application Developer in one of IBM's Consulting Client Innovation Centers (Delivery Centers), you'll be at the forefront of delivering deep technical expertise to both public and private sector clients worldwide. Our delivery centers offer locally based skills that drive innovation and facilitate the adoption of new technologies. At IBM, your role will involve transforming business requirements into code, contributing to the development of customized systems in an agile environment. By leveraging the latest tools, technologies, and education, you will help accelerate IBM's and its clients' digital transformations globally. Your work will integrate seamlessly with enterprise systems, creating solutions that drive innovation and business success. This is an exciting opportunity to make a global impact while advancing your career in one of the world's leading technology companies. Your Role and Responsibilities: Solution Design & Development: Address functional needs by designing and developing solutions using multiple technologies, converting high-level designs into functional and technical specifications. Application Support & Performance: Provide functional support services to ensure applications meet customer performance, availability, service level agreements (SLAs), and satisfaction targets. Project Management & Governance: Ensure adherence to project management practices and processes such as software application development, testing, service management, change management, and root cause analysis (RCA). Project Execution: Plan and manage medium to large-scale, complex, integrated application or platform projects, ensuring they are executed effectively within scope, timeline, budget, and quality parameters. Best Practices & Design Reviews: Define and promote coding best practices within your team. Perform design reviews to ensure the robustness and quality of the developed solutions. 
Required Technical and Professional Expertise: Languages & Frameworks: Proficiency in Java 8 and above, Spring Boot, and REST API Design & Development. Database & ORM: Experience with Spring Data/JPA/Hibernate and working knowledge of databases such as SQL Server/DB2. Containerization & Orchestration: Expertise in Docker, container orchestration platforms like OpenShift or Kubernetes. Messaging Systems: Familiarity with messaging platforms such as RabbitMQ and Kafka. CI/CD: Experience with continuous integration and continuous deployment tools such as Azure DevOps and Drone.io. Monitoring & Alerting: Knowledge of monitoring and alerting systems like AppDynamics and Prometheus. Preferred Technical and Professional Expertise: Log Management & Visualization: Experience with the ELK Stack (Elasticsearch, Logstash, Kibana) for log management and analysis. Data Visualization & Monitoring: Familiarity with Grafana for data visualization. Cloud Computing: Experience with AWS and cloud-based application deployments. Innovative Culture: At IBM, you'll work with cutting-edge technologies that are transforming industries globally. Global Impact: Your work will directly contribute to the success of clients worldwide by delivering impactful solutions. Career Growth: Gain access to professional development programs, training, and mentorship that will accelerate your career. Dynamic Work Environment: Join a diverse, collaborative team where creativity and new ideas are highly encouraged. If you're passionate about developing innovative solutions, transforming business needs into code, and working with the latest technologies, apply today to join IBM Consulting and make a significant global impact.

Application Developer Application Developer Cloud Cloud application

Staff Automation Engineer

Arm Limited

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Staff Automation Engineer Company Arm Job Overview As a Staff Automation Engineer, you will play a pivotal role in designing and implementing cutting-edge IT automation solutions that enhance Arm's infrastructure and development processes. You will leverage your expertise in automation technologies to streamline provisioning, configuration management, image management, secret management, and CI/CD pipelines. This is a critical role to ensure our systems remain efficient, scalable, and secure. Responsibilities Develop and implement automated solutions for infrastructure provisioning and configuration management using tools like Terraform, CloudFormation, Ansible, Puppet, and Chef. Design, configure, and maintain scalable CI/CD pipelines using platforms such as GitLab CI, GitHub Actions, Cloudbees/Jenkins, AWS CodePipeline, or Azure DevOps. Collaborate with DevOps and Engineering teams to improve and optimize engineering workflows, enhancing both efficiency and time-to-market. Standardize and automate configuration processes to ensure consistency and compliance across all environments. Troubleshoot and resolve configuration-related issues, ensuring systems are always up-to-date, secure, and cost-efficient. Required Skills and Experience Proven success in automation engineering or a similar role. Solid understanding of DevOps practices and extensive experience in infrastructure-as-code (IaC). High proficiency with automation tools and technologies, including Terraform, Ansible, Docker, Kubernetes, Jenkins, and Vault. Strong understanding of both on-premise and cloud platforms (AWS, Azure, GCP), including cloud-native architectures. Excellent communication, collaboration, and problem-solving skills with the ability to thrive in a team-oriented environment. Nice-To-Have Skills and Experience Relevant certifications, such as AWS Certified DevOps Engineer or Certified Kubernetes Administrator, are a plus.
What Arm Offers We offer exciting, meaningful work within a diverse team. Arm's continued growth ensures ample career progression opportunities, allowing you to make a real impact on our global success.

Automation Engineer Staff Engineer Automation engineer Full-Time

Senior Software Engineer Python

Blue Yonder

9-13 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior Software Engineer Python Location: Pune, India Company: Blue Yonder Experience: 9 to 13 years Education: Bachelor's Degree in Computer Science, Information Technology, or related fields About Blue Yonder Blue Yonder is a leading AI-driven global supply chain solutions provider, consistently recognized among Glassdoor's Best Places to Work. Our innovative products drive digital transformation for businesses worldwide across manufacturing, retail, and supply chain sectors. Role Overview We are seeking a Senior Software Engineer Python to join our growing Industry Solutions team, which is building a ground-up SaaS product focused on solving complex challenges in Manufacturing and Retail Industries. The team currently consists of 60+ global associates across the US and India and is set to expand rapidly. This role requires strong technical expertise combined with leadership qualities to mentor junior and mid-level engineers. Key Responsibilities Architect, design, and develop scalable, event-driven multi-tenant microservices using Python. Lead technical design discussions and mentor junior engineers on best practices, coding standards, and software quality. Drive code reviews, contribute to cleaner code, and advocate for effective testing strategies. Collaborate with cross-functional teams to deliver high-quality features aligned with product roadmaps. Troubleshoot complex issues by identifying root causes and implementing long-term fixes. Continuously contribute to improving engineering processes, code quality, and team productivity. Promote knowledge-sharing through training sessions, internal talks, and contributions to open-source projects. Ensure secure architecture by implementing strong identity management and secure configurations. Champion DevOps practices, Infrastructure as Code, and automation-first approaches for deployments. Focus on designing self-healing services with robust monitoring and alerting.
Collaborate with stakeholders to optimize cloud infrastructure and cost efficiency. Foster a blameless culture during incident reviews and encourage continuous learning. Technical Skills & Experience Required 9 to 13 years of experience in Enterprise Python Development. Strong expertise in Python design patterns and libraries like NumPy. Deep understanding of cloud-native architectures and Azure services (e.g., AKS, Event Hub, Azure AD, HDInsight, ARM templates, etc.). Strong hands-on experience in REST API development, OAuth, and multi-tenant microservices. Experience with data platforms such as Snowflake and MS SQL. Solid understanding of Git, Gradle, Jenkins, and CI/CD pipelines. Background in infrastructure automation and DevOps practices. Experience designing and building secure systems with identity management and data protection. Knowledge of web services, IDOCs, and interface handling is a plus. Preferred Qualifications Exposure to manufacturing or retail domain. Familiarity with solver algorithms and optimization solutions. Experience working in agile teams with strong emphasis on collaboration and iterative delivery. Strong problem-solving mindset with a passion for mentoring and knowledge sharing. Work on cutting-edge AI-driven supply chain solutions with a global impact. Collaborate with a diverse and talented global team. Professional growth through mentorship, learning programs, and leadership opportunities. Experience a flexible and inclusive work environment that values creativity, innovation, and personal well-being. Diversity, Inclusion, Value & Equity (DIVE) At Blue Yonder, Diversity, Inclusion, Value & Equity are at the heart of our culture. We value diverse perspectives and foster an environment where everyone feels empowered to bring their authentic selves to work. All qualified applicants will receive equal consideration regardless of race, color, religion, gender identity, sexual orientation, disability, or veteran status.
Qualification: Bachelor's Degree in Computer Science, Information Technology, or related fields

Software Engineer Staff Engineer Software Engineer Staff software engineer

Senior DevOps / Site Reliability Engineer

Blue Yonder

10-13 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior DevOps / Site Reliability Engineer Location: Pune, India Company: Blue Yonder Experience: 10 to 13 years Education: Bachelor s Degree in Computer Science, Engineering, or related STEM fields Company Overview Blue Yonder is a leading AI-driven Global Supply Chain Solutions provider and consistently recognized as one of Glassdoor s Best Places to Work. We are driving the next wave of digital transformation in manufacturing and retail, delivering innovative SaaS solutions that power intelligent supply chains across the globe. We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to lead the design, development, deployment, and operational management of our Azure SaaS solution. This role requires strong DevOps, cloud delivery, and infrastructure automation expertise, along with leadership capabilities to guide a growing global team. Role Overview In this role, you will be responsible for architecting, planning, and executing end-to-end delivery pipelines, supporting both product development and operational stability. Working closely with platform, product, and architecture teams, you will implement best-in-class DevOps and SRE practices, ensuring scalability, resilience, and cost optimization. Key Responsibilities Architect, design, and manage CI/CD pipelines and infrastructure for a cloud-native, multi-tenant SaaS solution on Azure. Lead sprint planning, backlog grooming, and architecture discussions. Develop quality automation scripts and tools to reduce manual efforts and enable self-healing, self-service capabilities. Identify and resolve operational bottlenecks and proactively improve observability (monitoring, alerting, logging). Participate in code reviews, ensure secure and scalable designs, and mentor junior and mid-level engineers. Collaborate with stakeholders to understand business and technical requirements and translate them into actionable user stories. Implement and enforce cloud cost optimization strategies. 
Conduct post-incident reviews with a blameless culture to identify root causes and drive continuous improvements. Automate service requests and standard operational procedures. Drive improvements to the team's continuous integration pipeline, ensuring rapid and reliable deployments. Stay updated with the latest DevOps, SRE, and cloud technologies and bring innovative ideas to the table. Participate in team hiring and actively contribute to onboarding new team members. Technical Environment Languages: Java, Python, PowerShell, Shell Scripting DevOps Tools: Azure DevOps, GitHub Actions, Jenkins Cloud: Microsoft Azure (ARM Templates, AKS, Event Hub, HDInsight, Azure AD, Application Gateway, Virtual Networks) Architecture: Microservices, Kubernetes, Docker, Event-driven architecture Frameworks: Spring Boot, Hibernate Monitoring & Logging: Elasticsearch, Spark, Kafka Databases: RDBMS, NoSQL Version Control: Git Required Skills & Experience Bachelor's Degree (STEM preferred) with 10 to 13 years of experience in DevOps, Cloud Delivery, or Site Reliability Engineering. Proven hands-on experience with Azure Cloud Services. Expertise in setting up and optimizing CI/CD pipelines. Strong scripting experience: Shell and PowerShell are mandatory; Python is a plus. Strong understanding of container technologies (Docker, Kubernetes) and microservices architecture. Experience integrating and managing third-party monitoring and logging tools. Strong problem-solving skills and ability to work with global, cross-functional teams. Excellent communication and stakeholder management skills. Nice to Have Development experience in Java or Python. Experience working in agile teams with a product-centric mindset. Experience working in manufacturing or retail domains. Exposure to AI/ML-driven monitoring and observability tools. Work with cutting-edge technologies on globally impactful solutions. Collaborate with diverse and talented teams across the US, India, and the UK. 
Foster your career growth through mentorship, continuous learning, and leadership opportunities. Experience an inclusive, flexible work culture where innovation and creativity thrive. Diversity, Inclusion, Value & Equality (DIVE) At Blue Yonder, we are committed to building an inclusive environment where everyone feels empowered to be themselves. All qualified applicants will receive consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. Qualification: Bachelor's Degree in Computer Science, Engineering, or related STEM fields

Software Engineer Staff Engineer Software Engineer Staff software engineer

Manager - Cloud Engineering

Couchbase

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Software Engineering Manager, Couchbase Capella Location: Bangalore, India (Office-based role, 3 days a week in the office) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of the increasing demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in today's AI-driven world. By seamlessly uniting transactional, analytical, mobile, and AI workloads into a fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is driving innovation, accelerating AI transformation, and redefining customer experiences. Join us on our mission! Job Overview: We are seeking an experienced Engineering Manager to lead a team of software developers building the next generation of cutting-edge Database as a Service (DBaaS) on the public cloud. In this role, you will be responsible for leading the team to deliver high-quality projects on time while ensuring that the operations of our service in production meet targeted Service Level Objectives (SLOs). The ideal candidate will possess a strong technical background, experience managing engineering teams, and excellent leadership and communication skills. Responsibilities: Lead and manage a team of software developers, providing guidance, mentorship, and performance feedback. Ensure the timely delivery of projects that meet business objectives, quality standards, and supportability goals. Oversee production systems to ensure they meet availability, performance targets, and Service Level Agreements (SLAs). Work cross-functionally with product management, QA, and other teams to develop staffing plans, release roadmaps, and ensure successful project delivery. 
Foster a collaborative and high-performance team culture that emphasizes continuous improvement and innovation. Contribute to technical design, architecture, and code reviews while maintaining a hands-on approach as needed. Requirements: Education: Bachelor's or Master's degree in Computer Science or a related field, or equivalent practical experience. Experience: At least 3 years of experience as an Engineering Manager and 8+ years of experience as a Software Engineer. Proven track record of successfully managing software development projects on public cloud platforms. Strong experience with at least one of the major cloud platforms (AWS, GCP, or Azure). Experience with Golang is highly preferred, particularly in managing large projects written in Golang. Strong communication and leadership skills, with the ability to collaborate effectively with cross-functional teams. Couchbase is at the forefront of revolutionizing modern application development with Capella, our fast, flexible, and affordable cloud database platform. By offering best-in-class price performance, we enable organizations to build and scale applications that deliver exceptional customer experiences from cloud to edge. Over 30% of the Fortune 100 companies trust Couchbase to power their modern applications, and we are honored to be named among the Best Places to Work in the Bay Area and the UK. Some of the benefits of working at Couchbase include: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: World-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP program, retirement programs, and business travel insurance. Career Growth: Be valued, create value; we support your career development. Fun Perks: Ergonomic office setup, food and snacks for in-office employees, and much more. 
Qualification: Bachelor's or Master's degree in Computer Science or a related field, or equivalent practical experience.

Manager Cloud Cloud manager Engineering Manager engineering

Senior Site Reliability Engineer

Couchbase

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Site Reliability Engineer (SRE), Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime. 
Cloud Optimization: Stay up-to-date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts. 
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!

Senior Site Reliability Site reliability Engineer

Linux Automation Engineer

Capgemini Invent

4-8 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Linux Automation Engineer Experience: 4-8 Years Location: Bangalore Role Overview: As a Linux Automation Engineer, you will be responsible for implementing and automating infrastructure solutions that support IBM Cloud products and infrastructure. You will play a key role in building automation frameworks, managing CI/CD pipelines, ensuring security compliance, and providing technical escalation support. This role requires expertise in Linux, automation tools, cloud infrastructure, and DevOps best practices. Key Responsibilities: Infrastructure Automation & Implementation: Design, implement, and automate Linux-based infrastructure solutions for IBM Cloud. Develop automation scripts and tools to enhance system efficiency and reliability. Automate system provisioning, configuration management, and deployments. CI/CD & Test Automation: Build and manage test automation frameworks and CI/CD pipelines. Maintain and administer automated systems and DevOps tools for development and test teams. Security & Compliance: Ensure the security integrity and compliance of the infrastructure environment. Implement best practices for access control, monitoring, and auditing. Monitoring & Alerting: Develop and integrate alerting and monitoring solutions for mission-critical services. Partner with cross-functional teams to enhance system observability and reliability. Technical Support & Troubleshooting: Provide technical escalation support for Infrastructure Operations teams. Troubleshoot and resolve complex Linux, automation, and cloud infrastructure issues. Required Skills & Competencies: Strong expertise in Linux system administration and automation. Experience with infrastructure automation tools such as Ansible, Terraform, or Puppet. Hands-on experience with CI/CD tools like Jenkins, GitLab CI, or Azure DevOps. Proficiency in scripting languages (Bash, Python, or Shell scripting). Familiarity with cloud platforms (IBM Cloud, AWS, Azure, or GCP). 
Strong understanding of networking, security, and compliance best practices. Experience in monitoring and alerting tools like Prometheus, Grafana, or ELK Stack. Ability to work in a fast-paced, collaborative DevOps environment. Capgemini Engineering is a world leader in engineering services, empowering the most innovative companies globally. We offer: A dynamic and diverse work environment with cutting-edge technologies. Opportunities to work on industry-leading projects in cloud, automation, and DevOps. A culture of innovation, collaboration, and professional growth. Join us and be a part of a team that is shaping the future of digital transformation!

linux Automation Linux automation Full-Time Shell Scripting
