Prometheus Jobs in Bengaluru

57 Jobs Found

Site Reliability Engineer

Groww

4-6 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Site Reliability Engineer Location: Bengaluru About Groww At Groww, we're on a mission to make financial services simple, accessible, and transparent for every Indian. As one of India's fastest-growing financial platforms, we help millions take control of their financial future through a wide range of products. We're a team driven by ownership, radical customer-centricity, and a deep passion for challenging the status quo. From intuitive design to robust engineering, everything we build is grounded in what our customers need. If you're excited about building systems that power the future of finance in India, we'd love to hear from you. Our Vision To empower every Indian with the knowledge, tools, and confidence to make sound financial decisions. Our goal is to be the most trusted financial partner for millions across the country. Our Core Values Customer Obsession: We put our users first, always. Extreme Ownership: We own everything we do, end-to-end. Simplicity: We keep things simple, effective, and intuitive. Long-term Thinking: We focus on sustainable, impactful decisions. Transparency: We believe in open communication and collaboration. Role Overview: As a Site Reliability Engineer (SRE) at Groww, you will be responsible for ensuring our systems are highly available, performant, and secure. You will work closely with engineering and infrastructure teams to improve reliability, automate deployments, and manage mission-critical services that power our platform. Key Responsibilities: Monitor and troubleshoot issues related to system performance, availability, and security. Define and maintain SLIs, SLOs, and Error Budgets to improve system reliability. Use tools like Grafana to analyze and report on metrics and trace data. Participate in the on-call rotation for 24/7 support of production systems. Collaborate with developers to ensure scalability and reliability are built into new services. Roll out security and infrastructure features proactively.
Manage automated deployments, version control, and release rollouts. Perform Root Cause Analysis (RCA) for incidents and implement long-term fixes. Optimize system performance, conduct capacity planning, and create recovery strategies. Identify and automate repetitive tasks to reduce toil. Leverage CI/CD tools such as Git, Jira, Jenkins to streamline development workflows. Requirements: 4-6 years of relevant experience in SRE, DevOps, or infrastructure engineering. Bachelor's or Master's degree in Computer Science or a related field. Strong background in Linux/Unix system administration and networking. Hands-on experience with cloud platforms like GCP or AWS. Proficiency in programming languages such as Python, Java, or Go. Experience with monitoring and alerting tools: Grafana, Prometheus, New Relic, etc. Familiarity with configuration management tools. Experience with Kubernetes, Docker, and container orchestration tools is a strong plus. Excellent problem-solving, communication, and team collaboration skills. Be a part of one of India's fastest-growing fintech startups. Build and scale systems that impact millions of users daily. Work with passionate, driven teammates who are redefining financial services. A culture that encourages continuous learning, ownership, and transparency. If you're ready to help shape the future of fintech infrastructure in India, Groww is the place for you. Let's build something extraordinary together. Qualification: Bachelor's or Master's degree in Computer Science or a related field

Site Reliability Site reliability Engineer Site engineer

Technical Lead DevOps

Subex Limited

3-6 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Technical Lead - DevOps Location: Bangalore Rural, Karnataka, India Department: Data Platform and DevOps Employment Type: Subexian Experience Required: 3 to 6 years Job Overview: We are seeking an experienced Kubernetes Administrator with a strong background in managing containerized environments. The ideal candidate will have 4+ years of hands-on experience in deploying, configuring, and optimizing Kubernetes clusters to drive scalability, reliability, and performance. This is an excellent opportunity to leverage your expertise in Kubernetes orchestration while contributing to the overall success of our platform. Key Responsibilities: Cluster Management: Deploy, configure, and manage Kubernetes clusters both on-premises and across cloud platforms such as AWS, Azure, and GCP. Security & Compliance: Implement best practices for cluster security, including role-based access control (RBAC), network policies, and data encryption at rest and in transit. Automation: Automate cluster provisioning and ongoing management using tools like Terraform, Ansible, or Helm charts, streamlining operations and reducing manual tasks by 40%. Monitoring & Performance: Continuously monitor cluster health and performance metrics using tools like Prometheus, Grafana, ensuring high availability and optimal performance. CI/CD Pipelines: Design and implement CI/CD pipelines for containerized applications using tools such as Jenkins, GitLab CI/CD, and CircleCI to enable smooth continuous delivery. Collaboration: Work closely with development teams to troubleshoot issues, optimize application performance, and ensure compatibility with Kubernetes environments. Security Audits: Conduct regular security audits to identify vulnerabilities and ensure compliance with industry standards. Documentation: Maintain clear and comprehensive documentation for deployment procedures, configuration settings, and troubleshooting guides to enhance knowledge sharing within the team. 
Infrastructure Management: Administer and maintain Linux/Unix servers and virtualization platforms such as VMware or KVM, ensuring seamless operations across the infrastructure. Backup & Recovery: Implement and manage robust backup and disaster recovery solutions to ensure data integrity and minimize system downtime. Technical Support: Provide expert-level technical support for server and network infrastructure-related issues. Required Skills & Qualifications: Proven experience in Kubernetes deployment, configuration, and administration. Strong command of containerization technologies, including Docker and containerd. Hands-on experience with cloud platforms such as AWS, Azure, and GCP. Proficiency in Infrastructure as Code (IaC) tools like Terraform and Ansible. Familiarity with CI/CD pipelines and automation tools like Jenkins and GitLab CI/CD. Excellent troubleshooting and problem-solving skills. Strong communication and collaboration abilities, with the capability to work effectively across cross-functional teams. If you're passionate about DevOps, Kubernetes, and driving the success of containerized environments, we'd love to hear from you!

Technical Lead Technical lead DevOps Lead devops

Platform Engineer

ColorTokens

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Platform Engineer Location: Bengaluru, Karnataka, India Full-time, partially remote About ColorTokens At ColorTokens, we empower businesses to stay operational and resilient in an increasingly complex cybersecurity landscape. Breaches happen, but with our cutting-edge ColorTokens Xshield platform, companies can minimize the impact of breaches by preventing the lateral spread of ransomware and advanced malware. We enable organizations to continue operating while breaches are contained, ensuring critical assets remain protected. Our innovative platform provides unparalleled visibility into traffic patterns between workloads, OT/IoT/IoMT devices, and users, allowing businesses to enforce granular micro-perimeters, swiftly isolate key assets, and respond to breaches with agility. Recognized as a Leader in the Forrester Wave: Microsegmentation Solutions (Q3 2024), ColorTokens safeguards global enterprises and delivers significant savings by preventing costly disruptions. Our culture We foster an environment that values customer focus, innovation, collaboration, mutual respect, and informed decision-making. We believe in alignment and empowerment so you can own and drive initiatives autonomously. Self-starters and highly motivated individuals will enjoy the rewarding experience of solving complex challenges that protect some of the world's impactful organizations, be it a children's hospital, a city, or the defense department of an entire country. Position Overview: ColorTokens is looking for a Junior Platform Administrator to assist in managing, maintaining, and optimizing our NextGen Security Information and Event Management (SIEM) platform. The ideal candidate will support the day-to-day operations, help onboard customer log sources, troubleshoot integration issues, and provide technical assistance to the security operations team. This role is ideal for a motivated professional with 3+ years of experience in SIEM administration, security operations, or log management.
Key Responsibilities: SIEM Platform Administration: Assist in deploying, configuring, and maintaining the NextGen SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, QRadar, Chronicle, Exabeam). Perform basic updates and patches to ensure platform security and functionality. Monitor SIEM health, performance, and uptime under the guidance of senior administrators. Log Source Management: Onboard new log sources and validate data ingestion. Help troubleshoot log ingestion, parsing, and formatting issues. Maintain log retention policies for compliance. Rule and Use Case Management: Support the development and deployment of detection rules, correlation use cases, and alerts. Tune existing use cases to minimize false positives. Work closely with security analysts to refine alerting strategies. Integration and Automation: Assist in integrating SIEM with other security tools (e.g., EDR, microsegmentation, vulnerability scanners). Work on basic automation tasks using scripting (Python, PowerShell) to enhance SIEM efficiency. Platform Security and Compliance: Support role-based access control (RBAC) and platform security policies. Help ensure SIEM adheres to compliance standards like SOC2, ISO 27001. Participate in periodic security audits. Network Debugging & Troubleshooting: Have a basic understanding of TCP/IP, networking concepts, and protocols. Assist in debugging network connectivity issues related to SIEM log ingestion. Use basic network troubleshooting tools. Collaboration and Support: Work alongside SOC analysts, threat hunters, and security engineers. Provide basic technical support for SIEM users. Assist in training and documentation for security teams. Performance Monitoring and Optimization: Monitor storage and indexing performance to ensure optimal operations. Report any performance issues to senior administrators. Contribute to platform health reports and alerting metrics.
Incident Support: Assist SOC teams in log analysis, incident response, and forensic investigations. Ensure log data is readily available for security incidents. Education and Certifications: Bachelor's degree in Computer Science, Information Security, or a related field. Certifications (Preferred but not mandatory): Splunk Certified User/Admin; Microsoft Certified: Security Operations Analyst Associate; QRadar Certification; any SIEM-related certification. Experience: 3+ years of experience in SIEM administration, security operations, or log management. Hands-on experience with at least one SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, Chronicle, Exabeam). Basic knowledge of log ingestion, rule creation, and data parsing. Exposure to scripting (Python, PowerShell) for automation. Basic understanding of TCP/IP networking concepts and network debugging. Technical Skills: Understanding of log formats, Syslog, JSON, XML, and data pipelines. Basic knowledge of querying languages (KQL, SPL, AQL). Familiarity with SIEM integration with security tools like EDR, SOAR, NDR. Awareness of MITRE ATT&CK, NIST, or CIS security frameworks. Basic experience with network troubleshooting tools (ping, traceroute, netcat (nc)). Soft Skills: Strong problem-solving and troubleshooting abilities. Good verbal and written communication skills. Ability to work collaboratively in a security operations environment. Preferred Skills: Basic understanding of cloud-based security solutions (AWS, Azure, Google Cloud). Exposure to SOAR tools (e.g., Cortex XSOAR, Splunk Phantom). Interest in machine learning-based anomaly detection for SIEM. Key Metrics for Success: Successful onboarding of log sources. Improvement in log ingestion and parsing accuracy. Contribution to fine-tuning detection rules. Timely resolution of SIEM-related support requests. Ability to identify and troubleshoot basic network connectivity issues.

Platform Engineer Platform engineer Full-Time Platform engineering

Senior Associate Infrastructure L1 (AWS)

Publicis Sapient

4+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Senior Associate Infrastructure L1 (AWS) Location: Bengaluru, India Department: Infrastructure & Cloud Engineering Employment Type: Full-Time About the Role As a Senior Associate Infrastructure L1 (AWS), you will design, implement, and manage secure, scalable, and highly available cloud infrastructure for enterprise digital transformation initiatives. You'll collaborate with cross-functional teams to automate deployments, enable DevOps best practices, and ensure robust observability across systems. Your goal is to reduce time-to-market and optimize performance, cost, and compliance. Key Responsibilities Architect and build immutable infrastructure on AWS and/or other cloud platforms. Implement and maintain infrastructure as code using Terraform, CloudFormation, or similar. Manage containerized environments using Kubernetes (EKS/GKE), ECS, Docker, and Helm. Implement service mesh (e.g., Istio) for advanced traffic management, monitoring, and security. Develop and manage CI/CD pipelines using Jenkins, GitLab, CircleCI, or similar. Automate build/deployment processes using Groovy, Go, Python, Shell, or PowerShell. Integrate DevSecOps and security scanning into the software delivery lifecycle. Configure and maintain monitoring, logging, and observability using: Monitoring: Prometheus, Grafana, Datadog, New Relic; Logging: ELK Stack, Fluentd, Splunk; Observability: OpenTelemetry, Jaeger, Kiali, CloudTrail, Dynatrace. Troubleshoot infrastructure, performance, and deployment issues. Collaborate with application teams and stakeholders to ensure high performance and availability of deployed services. Required Skills & Qualifications 4 to 12 years of experience in Cloud Infrastructure & DevOps roles. Bachelor's or Master's degree in Engineering, Computer Science, or related field. Hands-on experience with AWS (EC2, VPC, IAM, Lambda, RDS, CloudWatch, etc.). Solid experience in container orchestration using Kubernetes (EKS/GKE) and infrastructure management.
Expert in IaC tools like Terraform (preferred), ARM templates, Pulumi, etc. Proficiency in CI/CD pipeline automation and scripting. Familiarity with cloud-native security practices and vulnerability scanning tools. Experience with DNS, Load Balancers, and high-volume application infrastructure setup. Hands-on experience with artifact repositories like Nexus or Artifactory. Preferred Certifications (Nice to Have): Associate-level certifications in AWS, Azure, or GCP; HashiCorp Certified Terraform Associate. Benefits: Gender-neutral workplace policies; 18 paid holidays per year; generous parental leave and new parent transition support; flexible work arrangements; comprehensive Employee Assistance Program (mental & physical wellness). About Publicis Sapient Publicis Sapient is a global digital transformation partner helping established organizations evolve into their future state through technology, data, consulting, and customer-first experiences. With over 20,000 employees across 53 offices, we combine deep domain knowledge with a start-up mindset and agile methods to solve complex business challenges.

Senior Associate Senior associate Infrastructure AWS

IT Automation Engineer

Samsara Inc

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: IT Automation Engineer Location: Bengaluru, India (Hybrid, 3 days onsite) Company: Samsara Technologies India Pvt. Ltd. About Samsara Samsara (NYSE: IOT) is a global leader in the Connected Operations Cloud, empowering organizations in physical operations such as transportation, logistics, construction, and manufacturing to unlock actionable insights from IoT data. With products that improve safety, efficiency, and sustainability, Samsara is at the forefront of digital transformation for industries that power the world. Role Overview As an IT Automation Engineer within Samsara's Business Technology Core IT team, you'll play a key role in streamlining internal IT systems and processes through automation, infrastructure-as-code, and modern DevOps practices. This position emphasizes cloud infrastructure, scripting, CI/CD, and SaaS system integration to support high-growth scalability and efficiency across Samsara's enterprise environment. This hybrid role requires 3 days per week in the Bengaluru office and 2 days remote, operating in India Standard Time (IST). Key Responsibilities Automation & Development: Design and build automation scripts and services using Python, Bash, or JavaScript (Node.js). Automate repetitive IT operations across internal platforms, SaaS tools, and cloud infrastructure. Develop and deploy Infrastructure-as-Code (IaC) using Terraform or CloudFormation for AWS environments. Cloud & DevOps Engineering: Manage and provision AWS services such as Lambda, EC2, S3, RDS, ECS, API Gateway, etc. Build and maintain CI/CD pipelines and implement containerized solutions using Docker. Implement observability and monitoring solutions using tools like CloudWatch and Splunk. Collaboration & Strategy: Partner cross-functionally with IT, security, and business systems teams. Lead strategic automation initiatives to improve IT efficiency at scale. Write and maintain clear documentation for automated workflows and tooling.
Minimum Qualifications: Bachelor's degree in Computer Science, IT, or a related field. 5+ years in IT automation, DevOps, or software development roles. Strong scripting skills in Python, JavaScript (Node.js), or Go. Hands-on experience with AWS services and IaC tools (Terraform preferred). Experience with SaaS ecosystems like Google Workspace, Okta, Slack, Zoom, GitHub, Zendesk. Proficient in version control using Git/GitHub and building CI/CD pipelines. Strong communication and cross-functional collaboration skills. Preferred Qualifications: Familiarity with Atlassian tools (Jira, Confluence), OpsGenie, StatusPage. Experience with Splunk and monitoring large-scale cloud systems. Exposure to Google Cloud Platform (GCP). Experience leading end-to-end internal application development projects. Qualification: Bachelor's degree in Computer Science, IT, or a related field

IT Automation It automation Engineer It engineer

Backend Engineer

Cognite

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Backend Engineer Location: Bengaluru (Hoodi, Rathi Legacy, Rohan Tech Park) Team: Product Engineering Employment: Full-Time | Hybrid About Cognite Cognite is a global SaaS leader leveraging AI and data to solve complex industrial challenges in Oil & Gas, Chemicals, Pharma, Manufacturing, and Energy. Our flagship products include Cognite Atlas AI and Cognite Data Fusion (CDF). We have been recognized as a 2022 Technology Innovation Leader and 2024 Microsoft Energy & Resources Partner of the Year, driving the future of industrial digital transformation. Our Values Impact: Deliver meaningful, measurable results. Ownership: Take responsibility beyond your comfort zone and foster inclusivity. Relentless: Innovate with determination and resilience. Role & Responsibilities Design and develop scalable, high-performance backend services and APIs using Java, Kotlin, or Python for Cognite Data Fusion. Work with advanced database technologies like PostgreSQL and Elasticsearch to enhance our industrial knowledge graph. Collaborate with application teams to create user-centric solutions addressing complex industrial problems. Build resilient, scalable infrastructure using modern open-source tools and Cognite's data platform. Influence critical product and technical decisions by partnering closely with stakeholders and domain experts. 5+ years of backend engineering experience, primarily using Java, Kotlin, or Python in SaaS/product companies at scale. Flexibility with tech stacks: Java/Kotlin experience is a plus for Python developers and vice versa. Experience with Spark is highly valued. Strong background in modern databases (PostgreSQL, Elasticsearch), graph processing, distributed systems, and performance tuning. Commitment to clean, maintainable code and best practices through continuous code review and improvement. DevOps experience: CI/CD, Infrastructure as Code, Kubernetes multi-cloud deployments (AWS, GCP, Azure).
Skilled in observability and diagnostics with tools like Prometheus, Grafana, and expertise in troubleshooting complex system issues. Comfortable contributing to and learning from the open-source community. Excellent communication skills to collaborate effectively across diverse, global teams. Be part of a diverse global team representing 70+ nationalities, committed to DEI. Work in a modern, vibrant office environment at Rathi Legacy, Hoodi, Bengaluru with hybrid flexibility. Flat hierarchy with direct access to leadership and minimal bureaucracy. Collaborate with world-class talent on ambitious, high-impact projects across multiple industries. Engage in Cognite's HUB community for direct interaction with colleagues and partners. Make Your Impact Join Cognite and help revolutionize industrial digital transformation with strong DataOps, enabling better decisions and sustainability for global clients. We encourage applications from all backgrounds and identities. If you're passionate about shaping the future of industrial SaaS, apply today!

Backend Engineer Backend Engineer Full-Time Node.js

Performance Engineer

Cognite

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Performance Engineer Location: Bengaluru (Whitefield) Team: Product Engineering Employment: Full-Time | Hybrid About Cognite Cognite is a global SaaS leader driving industrial digital transformation through AI and data. Our flagship products include Cognite Atlas AI and Cognite Data Fusion (CDF), empowering industries such as Oil & Gas, Chemicals, Pharma, and Manufacturing to harness data at scale. Recognized with multiple industry awards, including 2022 Technology Innovation Leader and 2024 Microsoft Energy & Resources Partner of the Year, we lead the way in innovative industrial solutions. Our Values Impact: Deliver meaningful outcomes with focus and purpose. Ownership: Take initiative, embrace responsibility, and collaborate inclusively. Relentless: Innovate persistently, learn from challenges, and improve continuously. Role & Responsibilities Design, develop, and execute performance and load tests to ensure system scalability, stability, and reliability of Cognite SaaS products. Identify performance bottlenecks and provide actionable insights for improvement. Build and maintain testing frameworks, scripts, and tools to support performance testing initiatives. Collaborate closely with engineering teams to align testing strategies with system architecture. Monitor production system performance and assist in root cause analysis of performance issues. Share performance optimization best practices via documentation, training, and team discussions. Qualifications Bachelor's or Master's degree in Computer Science, IT, or related fields. 3-5 years of experience in performance testing and engineering, preferably in SaaS environments. Proficiency with performance testing tools such as JMeter, Gatling, LoadRunner, BlazeMeter, or equivalents. Strong understanding of CI/CD pipelines and container technologies like Kubernetes and Docker. Solid programming skills in Java, Python, or similar languages. Experience with databases like PostgreSQL.
Familiarity with performance monitoring and analysis tools such as Grafana and Prometheus. Preferred Skills Agile methodology experience and working in globally distributed teams. Expertise in testing large-scale systems and handling high-volume data loads. Knowledge of React and JSON for test data creation and API performance testing. Diverse global community with 70+ nationalities and strong DEI focus. Modern, vibrant office in Whitefield, Bengaluru with hybrid work culture. Flat organizational structure with direct access to leadership and minimal bureaucracy. Collaborate with world-class talent on ambitious and impactful industrial tech projects. Engage with the wider Cognite community through HUB conversations and partnerships. Make an Impact Join Cognite to help build scalable, high-performing SaaS solutions that empower industrial enterprises globally. We welcome candidates from all backgrounds to apply. Qualification: Bachelor's or Master's degree in Computer Science, IT, or related fields.

Performance Engineer Performance engineer Full-Time Performance testing

DevOps Engineer

Sarvam

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

DevOps Engineer Location: Bengaluru, Karnataka, India (On-Site) Department: Engineering Employment Type: Full-Time About Sarvam.ai Sarvam.ai is a cutting-edge generative AI startup headquartered in Bengaluru, India, with a mission to make generative AI accessible and impactful for Bharat. Founded by AI experts, we are dedicated to developing high-performance, cost-effective AI agents tailored for the Indian market. We enable enterprises to tap into new opportunities, build deeper customer connections, and reshape the future of AI for India and beyond. Role Overview We are looking for a DevOps Engineer to join our team and help build and manage scalable, secure, and high-performance infrastructure. In this role, you will be a key contributor to automating deployments, managing cloud infrastructure, optimizing CI/CD workflows, and ensuring system reliability. You will work with cutting-edge technologies, including cloud platforms, containerization, and infrastructure as code (IaC), to deliver impactful solutions for AI-driven products. Key Responsibilities CI/CD Pipelines: Design, implement, and manage CI/CD pipelines for seamless software deployment and integration. Cloud Infrastructure: Deploy and manage cloud infrastructure using Terraform, Kubernetes, and Docker for scalability and high performance. Automation & Scaling: Automate infrastructure provisioning, scaling, and security compliance to support high-availability environments. Monitoring & Optimization: Implement logging, monitoring, and alerting solutions using tools like Prometheus, Grafana, ELK Stack, or CloudWatch to monitor system performance and optimize resource utilization. Security & Compliance: Enhance security and compliance by managing IAM policies, encryption, and vulnerability scanning. Troubleshooting & Root Cause Analysis: Troubleshoot system failures, perform root cause analysis, and implement improvements to ensure reliability and uptime. 
Collaboration: Work closely with development teams to ensure smooth deployment and operation of AI models and applications. Must-Have Skills & Qualifications Educational Background: Bachelor's degree in Computer Science, Engineering, or related field (2024/2025 graduates). Cloud Expertise: Strong experience with AWS, Azure, or GCP for deploying and managing cloud-based applications. Containerization: Proficiency in Docker and Kubernetes for building and managing containerized applications. Infrastructure as Code (IaC): Experience with Terraform, Ansible, or CloudFormation to automate infrastructure management. CI/CD Pipelines: Experience in setting up automated workflows using tools like GitHub Actions, Jenkins, or GitLab CI/CD for smooth deployments. Monitoring & Logging: Experience with Prometheus, Grafana, ELK, or similar tools to implement effective monitoring and logging solutions. Networking & Security: Strong understanding of firewalls, VPNs, SSL, and cloud security best practices for secure infrastructure. Version Control: Proficiency with Git for managing code repositories and version control workflows. Problem Solving: Strong debugging, troubleshooting, and analytical skills to resolve complex system issues. Good to Have (Preferred Experience) Serverless Computing: Exposure to serverless computing models such as AWS Lambda or Azure Functions. Message Queues: Experience with message queues like Kafka, RabbitMQ, or SQS. Site Reliability Engineering (SRE): Familiarity with SRE practices to ensure the reliability and availability of large-scale systems. Open Source Contributions: Contributions to open-source projects or a strong GitHub portfolio showcasing DevOps expertise and best practices. Impactful Work: Work on AI-driven products that are reshaping the future of technology in India. Innovative Team: Collaborate with a team of AI experts and engineers pushing the boundaries of technology.
Career Growth: Opportunity to grow in a fast-growing startup at the forefront of the generative AI revolution. Cutting-edge Technologies: Work with cloud technologies, automation, and AI infrastructure to create high-impact products. Qualification: Bachelor's degree in Computer Science, Engineering, or related field

DevOps Engineer Devops engineer Full-Time Continuous integration

Site Reliability Developer 2/3

Oracle

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Description: Site Reliability Engineer - OCI Cloud Engineering Team Role: Site Reliability Engineer (SRE) Team: OCI OLTP (Online Transaction Processing) Location: Kiev Career Level: IC2 Experience: 5+ years Overview: Oracle Cloud Infrastructure's (OCI) OLTP organization is seeking a Site Reliability Engineer (SRE) to join our dynamic and fast-paced Cloud engineering team. The team is responsible for mission-critical distributed systems and cloud services, and we are looking for an engineer who is deeply interested in databases, distributed systems, and cloud services. If you thrive in an environment where innovation, problem-solving, and operational excellence intersect, this is an exciting opportunity for you! As a member of the SRE services, you will focus on Cloud Services, building deployments, operations, security vulnerability mitigation, and automation. You will be instrumental in fostering a culture of Site Reliability Engineering (SRE) within the team, and your work will directly contribute to ensuring the stability, performance, and reliability of Oracle's global cloud service infrastructure. This role requires someone who is adaptable, highly motivated, and capable of managing large-scale cloud environments with a focus on continuous improvement. Key Responsibilities: Cloud Service Operations & Reliability: Deploy, operate, and maintain large-scale cloud service products in a highly available, fault-tolerant, and scalable environment. Collaborate with internal teams to identify and mitigate cross-team issues that pose operational risks to cloud services. Focus on systems reliability and ensure the continuous availability of cloud services by automating tasks and eliminating manual interventions. Automation & Improvements: Automate operational tasks and improve service deployments, focusing on scaling, performance, and uptime. Contribute to CI/CD systems, ensuring seamless integration and continuous delivery for cloud-based services.
Leverage automation tools such as Terraform, Grafana, and Bitbucket to streamline operations. Security & Incident Response: Mitigate security vulnerabilities within cloud services and ensure compliance with Oracle's security standards. Participate in on-call rotations to provide immediate troubleshooting support and ensure rapid issue resolution. Perform deep analysis of service performance and collaborate with team members to diagnose and resolve issues that affect service availability or performance. Collaborative Problem-Solving: Work closely with cross-functional teams, including development, database, networking, and storage experts, to ensure the reliability and performance of services. Identify systemic issues and potential risks, develop solutions, and ensure proper documentation and communication with stakeholders. Documentation & Knowledge Sharing: Contribute to documentation such as runbooks, operational guides, and troubleshooting manuals. Mentor junior engineers and share knowledge on best practices for site reliability engineering and cloud service operations. Continuous Learning: Stay up to date with new cloud technologies, trends, and best practices, and actively implement them in your day-to-day work. Technical and Professional Requirements: Cloud Services & Infrastructure: 5+ years of experience in SRE, DevOps, or Automation roles with a focus on large-scale infrastructure and cloud services. Hands-on experience with cloud platforms (e.g., OCI, AWS, Azure) and expertise in compute, database, networking, and storage services within cloud environments. Automation & Tooling: Proficiency with automation tools such as Terraform, Grafana, LumberJack, and Shepherd. Solid experience in using CI/CD tools and processes for cloud service deployments and operations. Scripting & Systems: Strong knowledge of scripting languages, particularly Python and Java. 
Familiarity with Linux systems, Docker containers, virtualized infrastructure, and orchestration (e.g., Kubernetes). Performance & Troubleshooting: Excellent troubleshooting skills with a focus on performance, availability, reliability, and scalability of distributed systems. Experience in operating fault-tolerant, highly available, high-throughput distributed systems. Security & Incident Management: Familiarity with security practices and mitigating security vulnerabilities in cloud services. Proven ability to handle incident response and provide efficient troubleshooting during on-call rotations. Collaboration & Communication: Strong verbal and written communication skills, capable of working effectively with diverse teams across multiple geographies. Ability to work in a highly collaborative environment, driving operational excellence and customer satisfaction. Preferred Qualifications: Experience in operating and maintaining multi-tenant, cloud-based infrastructure with a focus on scalability and high availability. Familiarity with tools and platforms like Grafana, Prometheus, and other observability and monitoring tools. Experience in networking and storage technologies in a cloud environment. Joining OCI's OLTP team as an SRE gives you the opportunity to work with cutting-edge technologies and contribute to the operational excellence of Oracle's global cloud infrastructure. This is a chance to grow your skills in a highly dynamic environment and to solve complex problems that directly impact mission-critical cloud services. With a focus on automation, scalability, and high performance, you will be an essential part of a team that powers Oracle's leading cloud services. If you are an experienced engineer passionate about cloud technologies, automation, and ensuring the reliability of large-scale systems, we encourage you to apply and join us in this exciting journey!

Site Reliability Site reliability Developer Site developer

Senior Site Reliability Engineer

Couchbase

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Site Reliability Engineer (SRE) - Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime. 
Cloud Optimization: Stay up to date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts. 
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!

Senior Site Reliability Site reliability Engineer

Devops + Tester

Sourcefuse

4-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: DevOps + Tester Location: Bangalore, India Experience: 4-5 years Industry: IT Services Job Type: Full-time Role Overview This hybrid DevOps + QA role focuses on: Ensuring mobile application performance and reliability. Driving automation, CI/CD, and continuous improvement. Designing and executing automated test scripts and performing integration, regression, and performance testing. Supporting innovation and scalable software deployment in alignment with Rakuten's standards. You'll collaborate closely with development, QA, and operations teams, while improving infrastructure and testing frameworks. Key Skills & Tools CI/CD Tools: Jenkins, Bamboo, Docker Testing: Automation, Integration, Regression, Performance Testing Cloud Platforms: AWS, Azure, GCP Salesforce Ecosystem: 1-2 years hands-on experience preferred API Integration: Including legacy systems Test Scripting Tools: Open-source or commercial frameworks Solid grasp of software architecture, high availability, and transaction-intensive systems Responsibilities Monitor and optimize app performance Develop and maintain automated test scripts Execute integration and regression testing Conduct performance tests during pipeline integration Collaborate across DevOps, development, and QA teams Maintain detailed test documentation Conduct unit tests, code reviews, and QA validations Ensure service quality and customer satisfaction Education & Qualifications Bachelor's degree in CS, IT, Engineering, or related field (required) MBA or advanced degree (preferred) Salesforce Admin or PD certification (preferred) Ideal Candidate Traits Strong DevOps + Testing blend with cloud experience Effective communication with technical and non-technical teams Strategic thinker with planning skills Thrives in fast-paced environments, managing multiple priorities Interview Process 2 Technical Rounds Qualification: Bachelor's degree in CS, IT, Engineering, or related field (required)

DevOps Full-Time Test automation Continuous integration Continuous deployment

Enterprise Infra Automation Architect

Infosys

16+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Enterprise Infrastructure Automation Architect Location: Bengaluru, India Experience: 16-20 Years Service Line: Cloud & Infrastructure Services Educational Qualifications: B.E., B.Tech, M.Tech, BCA, MCA, MBA Role Overview: We are looking for a seasoned Enterprise Infrastructure Automation Architect to lead the design and implementation of automation strategies across our global IT infrastructure. This role is pivotal in driving enterprise-wide automation initiatives, streamlining operations, and enabling digital transformation through scalable and secure infrastructure automation solutions. Key Responsibilities: Infrastructure Automation Strategy & Roadmap Define and maintain the enterprise automation strategy aligned with organizational goals and IT objectives. Identify automation opportunities across compute, storage, network, virtualization, cloud, and data center domains. Establish automation goals, KPIs, and success metrics for continuous improvement. Evaluate and recommend emerging automation technologies and frameworks. Solution Design & Architecture Design scalable, secure, and maintainable automation architectures for enterprise infrastructure. Define enterprise-wide automation standards and best practices (e.g., IaC, scripting, orchestration). Select and standardize tools such as Ansible, Terraform, Python, PowerShell, and cloud-native automation services. Build reusable automation frameworks, templates, and modules to ensure consistency. Implementation & Governance Provide architectural oversight and support during implementation and deployment phases. Ensure compliance with automation standards and governance throughout the lifecycle. Participate in project reviews to ensure strategic alignment with enterprise automation goals. Establish governance processes for managing scripts, workflows, and infrastructure-as-code artifacts. Additional Responsibilities: Proven experience in IT Service Management and remote delivery automation environments. 
Ability to articulate the business value and operational impact of automation initiatives. Self-motivated, creative thinker with excellent problem-solving abilities. Excellent communication skills, both verbal and written. Technical & Professional Requirements: In-depth knowledge of enterprise architecture frameworks (e.g., TOGAF, Zachman). Expertise in infrastructure domains including compute, storage, middleware, backup (on-prem & cloud). Experience with public cloud platforms (AWS, Azure, GCP) and hybrid cloud architectures. Proficiency in security standards and best practices across global IT environments. Hands-on experience with monitoring and orchestration tools. Skilled in creating architecture diagrams and workflow visualizations using tools like MS Visio, Lucidchart, etc.

Enterprise Infra Automation Enterprise Automation Architect

Devops Engineer

Camsdata Technologies India Pvt. Ltd.

2+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

DevOps Engineer Bangalore, India Location: Bangalore (Bengaluru) Experience: 2 to 8 Years Industry: IT Software / Cloud & DevOps Job Summary: We are seeking an experienced DevOps Engineer to design, implement, and manage CI/CD pipelines on AWS and support application deployments. The ideal candidate will have hands-on expertise with AWS services, automation tools, and security integration within DevOps workflows. Key Responsibilities: Design, configure, and maintain CI/CD pipelines using AWS native tools or traditional platforms such as Jenkins, GitHub Actions, etc. Deploy applications on AWS using services like AWS Fargate, EBS, S3, CodePipeline, CodeBuild, and others Onboard applications onto AWS DevOps platform following the required CI/CD workflow Collaborate with application and operations teams to provide remediation and support for CI/CD pipeline onboarding Integrate various test automation frameworks and tools into CI/CD pipelines for continuous testing Implement security scanning and frameworks within pipelines, including SAST, DAST, IAST, and RASP Monitor the DevOps platform, applications, and infrastructure; respond proactively to incidents and events Automate operational tasks using Ansible or scripting languages (e.g., Python, Bash) Develop reusable automation assets and scripts to streamline DevOps processes Required Skills: Proven experience setting up and managing CI/CD pipelines on AWS and other platforms Strong knowledge of AWS services relevant to DevOps: Fargate, EBS, S3, CodePipeline, CodeBuild Familiarity with automation tools like Ansible, scripting languages, and infrastructure-as-code Experience integrating security tools and frameworks within DevOps pipelines Good troubleshooting and monitoring skills with cloud-native tools and third-party platforms Excellent collaboration skills for working across development and operations teams Preferred Qualifications: Bachelor's degree in Computer Science, Engineering, or related field Certifications 
in AWS DevOps (AWS Certified DevOps Engineer) or similar credentials Experience with container orchestration (e.g., Kubernetes) and Docker Knowledge of Agile and DevSecOps methodologies Work on cutting-edge cloud-native DevOps solutions Collaborate with a dynamic team focused on automation and security Opportunity for professional growth and certification support Qualification: Bachelor's degree in Computer Science, Engineering, or related field.

DevOps Engineer Devops engineer Full-Time Continuous Integration (CI)

Devops Engineer

Team Vunet Systems

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

DevOps Engineer Location: Bengaluru, India Experience: 3 - 5 Years Job Type: Full-time About VuNet VuNet is a deep-tech leader in Business Journey Observability, leveraging Big Data and Machine Learning to deliver end-to-end digital experience monitoring for major financial institutions. The platform monitors over 28 billion transactions monthly, powering top banks and enterprises in India and MEA. Work on cutting-edge observability technology Join a Series B funded, award-winning startup recognized by Gartner, Forbes, and NASSCOM Collaborate in a fast-paced, innovative environment focused on learning and growth Access to mental wellness support, health insurance (covering family), and career development programs Role Overview: DevOps Engineer Design, develop, and maintain VuSmartMaps deployments across on-premises, cloud, and hybrid environments Automate deployments using Infrastructure-as-Code (IaC) and CI/CD pipelines Manage cybersecurity assessments and remediations for deployments Collaborate with development teams to improve deployment processes and infrastructure support Publish VuSmartMaps in cloud marketplaces (AWS, Azure, GCP) Stay current on DevOps, CI/CD, infrastructure orchestration, cybersecurity, AI workflows, and big data technologies Key Responsibilities Develop and maintain IaC frameworks enabling flexible VuSmartMaps deployment Build and manage CI/CD pipelines using GitHub Actions, Jenkins Monitor infrastructure, conduct cybersecurity testing, and manage patching Improve deployment efficiency and customer experience Collaborate cross-functionally for seamless integration and rollout Must-Have Skills 3+ years building/managing CI/CD pipelines (GitHub Actions, Jenkins) Certified/experienced in Kubernetes, Docker, Terraform, Helm, YAML Hands-on experience with GitOps workflows Knowledge of web servers (Nginx, Django), identity providers (Active Directory, LDAP), load balancers (Traefik) Experience with databases (PostgreSQL, Elasticsearch, Hadoop 
stack) and secrets management (Key Vault) Familiarity with cloud services (AWS, Azure, GCP) across IaaS, PaaS, SaaS layers Strong Linux and scripting skills (Bash, Python) Excellent communication skills for cross-team collaboration Good-to-Have Skills Exposure to Red Hat OpenShift, VMware, Ansible, Chef, Puppet Familiarity with container orchestration tools (Podman, Docker Swarm, Nomad) Experience optimizing dockerized microservices and container images Benefits Comprehensive health insurance covering you and your family Mental health and 1:1 counseling support Learning culture focused on innovation and career growth Inclusive, transparent workplace culture Access to new Gen AI tools and integrated tech workspace Career development and skill enhancement programs

DevOps Engineer Devops engineer Full-Time Continuous Integration (CI)

Senior It Operations Engineer

Cognite

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Senior IT Operations Engineer Location: Bengaluru (Rathi Legacy, Rohan Tech Park, Hoodi) Team: Staff Finance Employment: Full-Time | Hybrid About Cognite Cognite is a global SaaS leader advancing industrial digital transformation through AI and data with flagship products like Cognite Atlas AI and Cognite Data Fusion. We are recognized as innovators and partners of choice in sectors including Oil & Gas, Chemicals, Pharma, and Manufacturing. Role Overview Join Cognite's Global IT Operations team in Bengaluru, where you will design, deploy, and maintain cloud infrastructure and IT systems critical to business operations. Your role will encompass managing Azure and SaaS platforms, automating processes, troubleshooting complex issues, and collaborating with teams to enable seamless IT services. Key Responsibilities Architect and manage cloud infrastructure solutions across Microsoft Azure, Google Workspace, and Atlassian tools. Handle IAM, user provisioning, and SaaS management with in-depth expertise in Microsoft Intune, Jamf Pro, and security tools. Develop automation using Terraform, PowerShell, Python, and manage CI/CD pipelines with GitHub. Provide hands-on support including access management, technical troubleshooting, and resolving issues across diverse SaaS platforms (Azure, GWS, Atlassian, Slack). Document IT processes and collaborate with internal teams to enhance IT operations. Your Profile 5-8 years of experience in IT Cloud Infrastructure and IT Operations. Strong expertise in Microsoft EntraID, Intune, Azure, Google Workspace, and Atlassian products (Jira, Confluence). Proficient in scripting and automation with PowerShell, Python, Terraform, and GitHub version control. Experience with ITSM tools, preferably Jira Service Management. Bachelor's or Master's degree in IT or related field. Collaborative, service-minded, and patient with problem-solving and troubleshooting. 
Comfortable working with Windows and macOS environments, and security tools like Microsoft Defender and Jamf Protect. Diverse global team with 70+ nationalities and strong DEI focus. Modern, vibrant office environment in Bengaluru with hybrid work flexibility. Flat organization providing direct access to decision-makers. Work on innovative projects impacting major industries worldwide. Engage in an active community and partner ecosystem. Qualification: Bachelor's or Master's degree in IT or related field.

Senior IT Operations Senior operations IT operations

Senior Backend Engineer - Cognite Innovation Team

Cognite

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Senior Backend Engineer - Innovation Team Location: Bengaluru (Rathi Legacy, Rohan Tech Park, Hoodi) Team: Global Strategic Services Industry Innovation and Solutions Employment: Full-Time | Hybrid About Cognite Cognite is a global SaaS leader using AI and data to transform industries such as Oil & Gas, Chemicals, Pharma, and Manufacturing. Our award-winning platforms include Cognite Data Fusion and Cognite Atlas AI, driving industrial digital transformation worldwide. The Team & Role You'll join Cognite's Innovation Team working on batch processing applications for a key American emulsions industry client. This role focuses on backend development and data engineering to deliver scalable, robust solutions handling complex data workflows with multidisciplinary teams and global stakeholders. Key Responsibilities Architect and develop backend systems for complex batch data processing workflows with a focus on reliability amid inconsistent/incomplete data. Build resilient workflows handling multi-source data and minimizing system impact from component failures. Develop and optimize data models for domain-specific industrial problems. Create data transformation functions to extract actionable insights from raw data. Manage scalable distributed systems and ensure operational reliability. Collaborate cross-functionally to refine requirements and deliver solutions. Work with large datasets using SQL and NoSQL technologies. Troubleshoot full application lifecycle issues, from dev to production. Stay current on tech trends to innovate team practices. Strong backend development experience, distributed systems, and complex domain modeling. Proficiency in Python, SQL, and NoSQL databases. Experience with cloud-based architectures, automated testing, version control, CI/CD pipelines. Problem-solving skills, especially when data or context is incomplete. Self-driven, taking ownership from concept to deployment. 
Collaborative and excellent communicator, comfortable engaging global stakeholders. Bonus: knowledge of Data Science and Machine Learning. Diverse global team of 70+ nationalities with strong DEI focus. Modern Bengaluru office with hybrid work flexibility. Flat structure with direct leader access, minimal bureaucracy. Work on ambitious, impactful projects with leading tech experts. Active community engagement with partners and customers. Excited to build next-gen industrial data solutions with Cognite? Apply today and make an impact in a fast-growing, innovative SaaS company!

Senior Backend Engineer Senior engineer Backend Engineer

Oracle Cloud Operation Engineer

Oracle

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Description: SaaS Cloud Ops Specialist We are looking for SaaS Cloud Ops specialists involved in managing and supporting cloud-based applications, databases, and services. These roles can include: Designing, planning, implementing, onboarding, configuring, and managing cloud environments and applications; troubleshooting and resolving cloud services issues; maintaining, monitoring, planning, and documenting; and infrastructure-level automation experience. Career Level - IC3 Responsibilities As part of the Oracle Finance GIU - Banking-Application Management Support team, SaaSOps will be taking complete responsibility for supporting & maintaining OCI cloud-based applications, environments, and databases on OCI (Oracle Cloud). The new hire is expected to support 24x7 Production Operations for SaaS customers, associated banking cloud services, and products. Candidate should have expertise in the below (at least 3-4 from below): Kubernetes administration (Mandatory) Oracle Database administrator (Mandatory) OCI administration / or any other cloud administration (Mandatory) Linux (Mandatory) Excellent Communication Skills (Mandatory) 24*7 Production Operations (Mandatory) Expertise in Autonomous Database Automation experience CI/CD Pipelines Knowledge in GIT Repository Disaster Recovery (DR) SaaSOps is expected to possess strong troubleshooting skills and will need to work on a ticketing-based system to resolve issues and monitor various aspects of the cloud services as part of the day-to-day job. Also, he/she will work on critical and non-critical issues from the queues, escalation channels, and other modes of assignments. The candidate would be expected to update Service Requests with technical and non-technical solutions, meet SLA requirements, and interact with other functional teams, customers, customer management teams, and Product engineering teams as and when required.

Oracle Cloud Oracle Cloud Operation Cloud operation

Senior DevOps / Site Reliability Engineer

Blue Yonder

10-13 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior DevOps / Site Reliability Engineer Location: Pune, India Company: Blue Yonder Experience: 10 to 13 years Education: Bachelor's Degree in Computer Science, Engineering, or related STEM fields Company Overview Blue Yonder is a leading AI-driven Global Supply Chain Solutions provider and consistently recognized as one of Glassdoor's Best Places to Work. We are driving the next wave of digital transformation in manufacturing and retail, delivering innovative SaaS solutions that power intelligent supply chains across the globe. We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to lead the design, development, deployment, and operational management of our Azure SaaS solution. This role requires strong DevOps, cloud delivery, and infrastructure automation expertise, along with leadership capabilities to guide a growing global team. Role Overview In this role, you will be responsible for architecting, planning, and executing end-to-end delivery pipelines, supporting both product development and operational stability. Working closely with platform, product, and architecture teams, you will implement best-in-class DevOps and SRE practices, ensuring scalability, resilience, and cost optimization. Key Responsibilities Architect, design, and manage CI/CD pipelines and infrastructure for a cloud-native, multi-tenant SaaS solution on Azure. Lead sprint planning, backlog grooming, and architecture discussions. Develop quality automation scripts and tools to reduce manual efforts and enable self-healing, self-service capabilities. Identify and resolve operational bottlenecks and proactively improve observability (monitoring, alerting, logging). Participate in code reviews, ensure secure and scalable designs, and mentor junior and mid-level engineers. Collaborate with stakeholders to understand business and technical requirements and translate them into actionable user stories. Implement and enforce cloud cost optimization strategies. 
Conduct post-incident reviews with a blameless culture to identify root causes and drive continuous improvements. Automate service requests and standard operational procedures. Drive improvements to the team's continuous integration pipeline, ensuring rapid and reliable deployments. Stay updated with the latest DevOps, SRE, and cloud technologies and bring innovative ideas to the table. Participate in team hiring and actively contribute to onboarding new team members. Technical Environment Languages: Java, Python, PowerShell, Shell Scripting DevOps Tools: Azure DevOps, GitHub Actions, Jenkins Cloud: Microsoft Azure (ARM Templates, AKS, Event Hub, HDInsight, Azure AD, Application Gateway, Virtual Networks) Architecture: Microservices, Kubernetes, Docker, Event-driven architecture Frameworks: Spring Boot, Hibernate Monitoring & Logging: Elasticsearch, Spark, Kafka Databases: RDBMS, NoSQL Version Control: Git Required Skills & Experience Bachelor's Degree (STEM preferred) with 10 to 13 years of experience in DevOps, Cloud Delivery, or Site Reliability Engineering. Proven hands-on experience with Azure Cloud Services. Expertise in setting up and optimizing CI/CD pipelines. Strong scripting experience: Shell and PowerShell are mandatory; Python is a plus. Strong understanding of container technologies (Docker, Kubernetes) and microservices architecture. Experience integrating and managing third-party monitoring and logging tools. Strong problem-solving skills and ability to work with global, cross-functional teams. Excellent communication and stakeholder management skills. Nice to Have Development experience in Java or Python. Experience working in agile teams with a product-centric mindset. Experience working in manufacturing or retail domains. Exposure to AI/ML-driven monitoring and observability tools. Work with cutting-edge technologies on globally impactful solutions. Collaborate with diverse and talented teams across the US, India, and the UK. 
Foster your career growth through mentorship, continuous learning, and leadership opportunities. Experience an inclusive, flexible work culture where innovation and creativity thrive. Diversity, Inclusion, Value & Equality (DIVE) At Blue Yonder, we are committed to building an inclusive environment where everyone feels empowered to be themselves. All qualified applicants will receive consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. Qualification: Bachelor's Degree in Computer Science, Engineering, or related STEM fields

Software Engineer Staff Engineer Software Engineer Staff software engineer

Devops

Mirafra Technologies

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

DevOps Engineer Location: Bangalore Experience: 5+ Years Education Qualification: B.E. in Computer Science / Electronics About Mirafra Founded in 2004, Mirafra is a fast-growing global product engineering services company specializing in Semiconductor Design, Embedded Systems, Digital Solutions, and Application Software. With more than 1,500 professionals worldwide, we provide cutting-edge solutions to Fortune 500 clients across industries such as Semiconductor, Internet, Aerospace, Networking, Telecom, Medical Devices, and Consumer Electronics. Recognitions: Best Company to Work For, SiliconIndia (2016); Most Promising Design Services Provider, SiliconIndia (2018); Top 10 Admired Companies for Software Services, DigiTech Insight (2022). Key Responsibilities: DevOps & Automation: Develop automated CI/CD pipelines and manage build & deployment processes. Implement infrastructure automation using scripting (Shell, Batch, Python). Manage configuration, integration, and deployment using DevOps tools. Version Control & Build Management: Work with Git, GitLab, and Bitbucket for version control. Maintain build systems like Make and CMake, and manage dependencies using Pip, Conda, Poetry, and Maven. Handle binary management tools like Artifactory and Nexus. Code Quality & Security: Utilize Static Code Analysis tools (SonarQube, Pylint, Coverity) for code quality enforcement. Monitor and ensure security compliance in the DevOps lifecycle. Cloud & Containerization: Manage cloud-based deployments and monitoring using ELK, Docker, Kubernetes. Implement scalable and resilient infrastructure solutions. Agile & Collaboration: Work in an Agile/Scrum environment, collaborating with cross-functional teams. Utilize UML modeling and software development best practices.
Skills & Qualifications Education: B.E. in Computer Science / Electronics Technical Expertise: Scripting & Automation: Shell, Batch, Python; CI/CD & Build Tools: Jenkins, GitLab, Make, CMake; Version Control: Git, Bitbucket, GitLab SCM; Static Code Analysis: SonarQube, Pylint, Coverity; Package Management: Pip, Conda, Poetry, Maven; Binary Management: Artifactory, Nexus; Cloud & Containerization: Docker, Kubernetes, ELK Stack; Programming Languages: Python, C, C++; Operating Systems: Linux, Unix, Windows. Soft Skills: Strong problem-solving and analytical skills. Excellent communication and team collaboration. Ability to work in fast-paced Agile environments. Cutting-edge projects in Semiconductor, Aerospace, Networking, and IoT. Global work environment with top-tier clients. Career growth opportunities and exposure to the latest technologies. Award-winning workplace culture and industry recognition. Excited to take on a challenging DevOps role? Apply now!

DevOps Full-Time CI/CD (Continuous Integration & Continuous Deployment) Infrastructure as Code (IaC) Automation
DA

Senior Technical Solutions Engineer (platform)

Databricks

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Overview: We are seeking a highly skilled Frontline Senior Technical Solutions Engineer with over 5 years of experience to join our Platform Support team. This role is pivotal in delivering exceptional support for our Databricks Data Intelligence platform, addressing complex technical challenges, and ensuring the seamless operation of our data solutions. As a frontline engineer, you will be the primary point of contact for critical issues, working closely with both internal teams and customers to resolve high-impact problems and drive platform improvements. Key Responsibilities: Frontline Support: Serve as the primary technical point of contact for escalated issues related to the Databricks Data Intelligence platform. Provide expert-level troubleshooting, diagnostics, and resolution for complex problems affecting system performance and reliability. Customer Interaction: Engage with customers directly to understand their technical issues and requirements. Provide timely, clear, and actionable solutions to ensure high levels of customer satisfaction. Incident Management: Lead the resolution of high-priority incidents, coordinating with various teams to address and mitigate issues swiftly. Conduct thorough root cause analyses and develop preventive measures to avoid recurrence. Collaboration: Work closely with engineering, product management, and DevOps teams to share insights, identify recurring issues, and drive improvements to the Databricks Data Intelligence platform. Documentation and Knowledge Sharing: Create and maintain detailed documentation on support procedures, known issues, and solutions. Contribute to internal knowledge bases and create training materials to assist other support engineers. Performance Monitoring: Monitor and analyze platform performance metrics to identify potential issues before they impact customers. Implement optimizations and enhancements to improve platform stability and efficiency. 
Platform Upgrades: Manage and oversee the deployment of Databricks Data Intelligence platform upgrades and patches, ensuring minimal disruption to services and maintaining system integrity. Innovation and Improvement: Stay abreast of industry trends and advancements in Databricks technology. Propose and drive initiatives to enhance platform capabilities and support processes. Customer Feedback: Collect and analyze customer feedback to drive continuous improvement in support processes and platform features. Qualifications: Experience: Minimum of 5 years of hands-on experience in a technical support or engineering role related to the Databricks Data Intelligence platform, cloud data platforms, or big data technologies. Technical Skills: A deep understanding of Databricks architecture and Apache Spark, along with experience in cloud platforms like AWS, Azure, or GCP, is essential. Strong capabilities in designing and managing data pipelines and distributed computing are required. Proficiency in Unix/Linux administration, familiarity with DevOps practices, and skills in log analysis and monitoring tools are also crucial for effective troubleshooting and system optimization. Problem-Solving: Demonstrated ability to diagnose and resolve complex technical issues with a strong analytical and methodical approach. Communication: Exceptional verbal and written communication skills, with the ability to effectively convey technical information to both technical and non-technical stakeholders. Customer Focus: Proven experience in managing high-impact customer interactions and ensuring a positive customer experience. Collaboration: Ability to work effectively in a team environment, collaborating with engineering, product, and customer-facing teams. Education: Bachelor's degree in Computer Science, Engineering, or a related field. Advanced degree or relevant certifications are highly desirable. 
Preferred Skills: Experience with additional big data tools and technologies such as Hadoop, Kafka, or NoSQL databases. Familiarity with automation tools and CI/CD pipelines. Understanding of data governance and compliance requirements. Innovative Environment: Work with cutting-edge technology in a fast-paced, innovative company. Career Growth: Opportunities for professional development and career advancement. Team Culture: Collaborate with a talented and motivated team dedicated to excellence and continuous improvement. About Databricks Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks. Our Commitment to Diversity and Inclusion At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics. 
Compliance If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to pr...

Senior Technical Senior technical Solutions Technical solutions
