Reliability Jobs in Bengaluru

205 Jobs Found

Engineering Manager

Talview

6+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Engineering Manager
Location: Bengaluru

Hiring is still shaped by outdated processes: manual screening, unconscious bias, and delayed feedback. Talview is transforming this with AI that actually works. We build GenAI-powered hiring and assessment platforms that make recruitment faster, fairer, and scalable.

Our AI Products
Alvy: The world's first AI Proctoring Agent for intelligent global exam monitoring.
Ivy: A conversational AI Interviewer delivering unbiased first-round assessments.
Impact: 10M+ assessments delivered across 120+ countries.

The Role
We're looking for an Engineering Manager to lead high-performing teams, drive architectural excellence, and deliver scalable products globally. You'll guide engineers across backend, frontend, QA, and DevOps, while partnering closely with Product and Design to drive meaningful outcomes.

What You'll Do
Leadership: Lead, mentor, and grow cross-functional engineering teams.
Architecture: Own architecture and system design for cloud-native, distributed systems.
Excellence: Champion code reviews, testing, automation, and security practices.
Operations: Strengthen engineering processes including CI/CD, observability, and monitoring.
Delivery: Own delivery outcomes, sprint planning, and team performance.
People: Conduct 1:1s, performance reviews, and career development planning.

You Might Be a Fit If You Have
Required Qualifications:
6+ years of overall engineering experience; 5+ years in backend (Node.js, Go, or Python).
2+ years with Docker, Kubernetes, and public cloud platforms (AWS, GCP, or Azure).
2+ years in Agile delivery environments (Scrum, Squads, or Chapters).
1+ year experience managing a team of 4+ engineers.
Deep understanding of cloud monitoring, deployments, and cost optimization.
Bonus Points For:
Building SaaS or high-scale distributed systems.
Experience with AI-assisted coding tools (Cursor, Windsurf, Codex, etc.).
Strong system design and architectural fundamentals.

Our Culture: The 5Cs
We are guided by Collaboration, Commitment, Credence (trust), Customer-centricity, and Candor. We work together, ship quality, and communicate openly.

What You Get
Competitive compensation and best-in-class hardware.
5-day work week with flexibility.
Monthly team lunches and annual offsites.
Accelerated growth in a fast-scaling product organization.

Engineering Manager Engineering manager Manager engineering Full-Time

Engineering Manager, Collections

Postman

7+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Engineering Manager, Collections
Location: Bengaluru
Work Type: Full-Time

About Postman
Postman is the world's leading API platform, enabling over 40 million developers and 500,000 organizations, including 98% of the Fortune 500, to design, test, and collaborate on APIs efficiently. Founded in Bengaluru and headquartered in San Francisco, Postman simplifies the API lifecycle to help teams build better APIs, faster.

The Opportunity
The Collections team is at the heart of Postman's platform, enabling seamless API collaboration for millions of users. We manage tier-0/1 critical systems handling ~21M requests daily, supporting pillars like API development, testing, prototyping, discovery, distribution, and change management. We are seeking an experienced Engineering Manager to take Collections to the next level: leading technical strategy, scaling systems, improving user experience, and growing a high-performing team. This role combines technical leadership, people management, and product vision, directly impacting Postman's growth and user engagement goals.

Key Responsibilities

Leadership & Team Development
Grow and mentor engineers, aligning career growth with business goals.
Participate in recruiting, hiring, and onboarding top engineering talent.
Define and measure team performance with clear OKRs and real-time feedback.

Technical & Strategic Ownership
Drive engineering strategy and roadmap for the Collections team.
Lead design and code reviews, ensuring high standards across frontend and backend systems.
Ensure product reliability, performance, security, and 99.99% availability.
Prioritize multi-quarter roadmaps while balancing technical constraints and business needs.

Collaboration & Cross-functional Impact
Partner with Product, Design, and Engineering teams to deliver a unified, high-quality API collaboration experience.
Champion operational and customer excellence through incident management, performance monitoring, and UX issue resolution.

About You

Experience & Skills
Bachelor's degree in Computer Science or equivalent practical experience.
7+ years of software development experience (C, C++, Java, JavaScript, NodeJS).
3+ years in technical leadership roles building impactful products.
2+ years in people management.
Experience with microservices architecture and scalable systems.
Exceptional written and verbal communication and stakeholder management skills.
Empathetic, collaborative, and committed to creating a positive team culture.

Nice-to-Have
Experience building customer-focused products at scale.
Familiarity with standardizing engineering processes in a growing organization.

Flexible hybrid work model with a collaborative and inclusive team.
Full medical coverage, flexible PTO, wellness reimbursement, and monthly lunch stipend.
Wellness programs, team-building events, and donation-matching initiatives.
Opportunities for growth, ownership, and making a measurable impact on Postman's global platform.

Our Values
Curiosity: Explore boldly and innovate.
Transparency: Communicate openly about successes and failures.
Focus: Align work with Postman's larger vision.
Inclusion: Every team member's voice matters.
Excellence: Deliver high-quality products and experiences.

Engineering Manager Engineering manager Manager engineering Collections

Lead Platform Engineer

Team Vunet Systems

6-10 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Lead Platform Engineer - Observability Solutions
Location: Bengaluru
Experience: 6-10 Years
Function: Observability Engineering | Platform Architecture | SRE Enablement

Join VuNet: Redefining Digital Observability at Scale
VuNet is transforming the future of digital experiences through Business Journey Observability, combining Big Data and AI/ML to empower real-time visibility across payments, banking, and financial services. Monitoring 28+ billion transactions/month, our platform is trusted by top financial institutions and powers over 300 million users. Backed by Series B funding and recognized by Gartner, NASSCOM, and Forbes, we are leading the charge in building a new category of observability, proudly Made in India for global impact.

Your Role: Lead Platform Engineer
As the Lead Platform Engineer, you will architect and drive the development of packaged observability solutions across 100+ infrastructure and application technologies. You will define golden signals, build data collection strategies, and lead the standardization of alerts, dashboards, and RCA workflows for platforms like Kubernetes, Oracle DB, and Tomcat. This is a cross-functional leadership role that sits at the intersection of product, platform, DevOps, and SRE. You will lead a team and influence how observability is delivered, scaled, and adopted across complex environments.

Key Responsibilities

Observability Solution Development
Design and lead the delivery of observability packages for databases, middleware, cloud-native, and legacy platforms.
Define and implement data collection pipelines, including agents, APIs, logs, metrics, traces, and service discovery.
Establish golden signals, SLIs/SLOs, and health KPIs for performance, availability, and anomaly detection.

Dashboards, Alerts & RCA
Develop standardized, reusable dashboards, alerts, reports, and troubleshooting playbooks.
Automate RCA workflows to improve MTTR and reduce alert fatigue.

Platform Enablement & Integration
Work with engineering to enhance agent capabilities and support new data sources/formats.
Guide implementation of platform features for better observability at scale.

Team Leadership & Governance
Lead and mentor a team of observability engineers and specialists.
Define design patterns, reusable modules, and version-controlled libraries.

Stakeholder Collaboration
Partner with product managers, DevOps, SREs, and customer teams to gather requirements, align priorities, and validate use cases.
Ensure deliverables are scalable, well-documented, and production-ready.

What You Bring

Must-Have Skills
6-10 years of experience in observability, platform engineering, or SRE roles.
Hands-on with tools like Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, Splunk.
Strong understanding of logs, metrics, traces, profiling, and collection strategies.
Experience developing solutions for platforms like Kubernetes, Oracle, PostgreSQL, Tomcat, etc.
Proficient in Python, Shell scripting, APIs, and automation tools (Terraform, etc.).
Familiar with alert fatigue mitigation, anomaly detection, and RCA frameworks.
Excellent communication, technical leadership, and documentation skills.

Nice to Have
Experience managing an observability marketplace or solution catalog.
Contributions to open-source observability projects.
Certifications in Kubernetes, Observability platforms, or cloud providers (AWS/GCP/Azure).
Background in ITSM tools, CMDBs, or incident workflow automation.

At VuNet, you'll help build a category-defining observability platform that's already transforming critical infrastructure for leading financial institutions. You'll work with passionate engineers, push technical boundaries, and grow in a high-trust, high-impact environment.

What You'll Experience:
Ownership of key observability initiatives impacting 300M+ users.
Collaboration with SRE, DevOps, and product teams across real-time financial systems.
Opportunity to experiment with and shape Gen AI, ML, and emerging telemetry trends.

Perks & Benefits
Health insurance for you, your parents, and dependents.
1:1 mental wellness support.
Training programs, certifications, and career growth opportunities.
Transparent, inclusive, and high-trust work culture.
Access to cutting-edge technology and Gen AI-powered workspaces.

Lead Platform Engineer Lead Engineer Engineer lead

DevOps Engineering Manager

Medi Assist

5-10 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: DevOps Engineering Manager
Location: Bangalore
Experience: 5-10 years
Education: BE/BTech/MCA/MTech/MSc

Role Overview:
We're looking for an experienced DevOps Engineering Manager to lead our cloud infrastructure, automation, and DevOps initiatives. This is a hands-on leadership role focused on driving efficiency, security, and scalability across our IT operations and development pipelines.

Key Responsibilities:

Cloud & Infrastructure Management:
Administer and manage Google Workspace, including user accounts, security policies, and compliance settings.
Oversee and optimize AWS resources (EC2, IAM, S3, VPC), ensuring cost-effective and secure cloud operations.
Configure and manage A10 vThunder for load balancing and network performance optimization.
Serve as Active Directory Administrator, maintaining AD, DNS, and Group Policy Objects (GPOs).
Deploy, maintain, and troubleshoot VMware environments to support virtual infrastructure.

Security & Compliance:
Manage domain and SSL certificates including installation, renewal, and issue resolution.
Handle ADFS token certificate renewals to support uninterrupted authentication services.
Enforce security best practices across cloud and on-prem environments.

Automation & Scripting:
Create and maintain automation scripts using Bash, PowerShell, or Python to streamline workflows.
Reduce manual intervention and boost system efficiency through smart scripting and task automation.

Monitoring & Troubleshooting:
Proactively monitor system logs, performance metrics, and security alerts to prevent downtime.
Investigate and resolve issues related to network, infrastructure, and cloud environments promptly.

Required Skills & Experience:
Proven experience with infrastructure automation tools such as Terraform or CloudFormation.
Strong understanding of DevOps practices and implementing CI/CD pipelines for cloud deployments.
Solid scripting skills in Bash, PowerShell, or Python.
Expertise in managing both cloud-based and on-premise infrastructure.
Strong troubleshooting capabilities and a proactive approach to system monitoring.

DevOps Engineering Devops engineering Manager Devops manager

Site Reliability Engineer

Groww

4-6 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Site Reliability Engineer
Location: Bengaluru

About Groww
At Groww, we're on a mission to make financial services simple, accessible, and transparent for every Indian. As one of India's fastest-growing financial platforms, we help millions take control of their financial future through a wide range of products. We're a team driven by ownership, radical customer-centricity, and a deep passion for challenging the status quo. From intuitive design to robust engineering, everything we build is grounded in what our customers need. If you're excited about building systems that power the future of finance in India, we'd love to hear from you.

Our Vision
To empower every Indian with the knowledge, tools, and confidence to make sound financial decisions. Our goal is to be the most trusted financial partner for millions across the country.

Our Core Values
Customer Obsession: We put our users first, always.
Extreme Ownership: We own everything we do, end-to-end.
Simplicity: We keep things simple, effective, and intuitive.
Long-term Thinking: We focus on sustainable, impactful decisions.
Transparency: We believe in open communication and collaboration.

Role Overview:
As a Site Reliability Engineer (SRE) at Groww, you will be responsible for ensuring our systems are highly available, performant, and secure. You will work closely with engineering and infrastructure teams to improve reliability, automate deployments, and manage mission-critical services that power our platform.

Key Responsibilities:
Monitor and troubleshoot issues related to system performance, availability, and security.
Define and maintain SLIs, SLOs, and Error Budgets to improve system reliability.
Use tools like Grafana to analyze and report on metrics and trace data.
Participate in the on-call rotation for 24/7 support of production systems.
Collaborate with developers to ensure scalability and reliability are built into new services.
Roll out security and infrastructure features proactively.
Manage automated deployments, version control, and release rollouts.
Perform Root Cause Analysis (RCA) for incidents and implement long-term fixes.
Optimize system performance, conduct capacity planning, and create recovery strategies.
Identify and automate repetitive tasks to reduce toil.
Leverage tools such as Git, Jira, and Jenkins to streamline development workflows.

Requirements:
4-6 years of relevant experience in SRE, DevOps, or infrastructure engineering.
Bachelor's or Master's degree in Computer Science or a related field.
Strong background in Linux/Unix system administration and networking.
Hands-on experience with cloud platforms like GCP or AWS.
Proficiency in programming languages such as Python, Java, or Go.
Experience with monitoring and alerting tools: Grafana, Prometheus, New Relic, etc.
Familiarity with configuration management tools.
Experience with Kubernetes, Docker, and container orchestration tools is a strong plus.
Excellent problem-solving, communication, and team collaboration skills.

Be a part of one of India's fastest-growing fintech startups.
Build and scale systems that impact millions of users daily.
Work with passionate, driven teammates who are redefining financial services.
A culture that encourages continuous learning, ownership, and transparency.

If you're ready to help shape the future of fintech infrastructure in India, Groww is the place for you. Let's build something extraordinary together.

Site Reliability Site reliability Engineer Site engineer

Technical Lead - DevOps

Subex Limited

3-6 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Technical Lead - DevOps
Location: Bangalore Rural, Karnataka, India
Department: Data Platform and DevOps
Employment Type: Subexian
Experience Required: 3 to 6 years

Job Overview:
We are seeking an experienced Kubernetes Administrator with a strong background in managing containerized environments. The ideal candidate will have 4+ years of hands-on experience in deploying, configuring, and optimizing Kubernetes clusters to drive scalability, reliability, and performance. This is an excellent opportunity to leverage your expertise in Kubernetes orchestration while contributing to the overall success of our platform.

Key Responsibilities:
Cluster Management: Deploy, configure, and manage Kubernetes clusters both on-premises and across cloud platforms such as AWS, Azure, and GCP.
Security & Compliance: Implement best practices for cluster security, including role-based access control (RBAC), network policies, and data encryption at rest and in transit.
Automation: Automate cluster provisioning and ongoing management using tools like Terraform, Ansible, or Helm charts, streamlining operations and reducing manual tasks by 40%.
Monitoring & Performance: Continuously monitor cluster health and performance metrics using tools like Prometheus and Grafana, ensuring high availability and optimal performance.
CI/CD Pipelines: Design and implement CI/CD pipelines for containerized applications using tools such as Jenkins, GitLab CI/CD, and CircleCI to enable smooth continuous delivery.
Collaboration: Work closely with development teams to troubleshoot issues, optimize application performance, and ensure compatibility with Kubernetes environments.
Security Audits: Conduct regular security audits to identify vulnerabilities and ensure compliance with industry standards.
Documentation: Maintain clear and comprehensive documentation for deployment procedures, configuration settings, and troubleshooting guides to enhance knowledge sharing within the team.
Infrastructure Management: Administer and maintain Linux/Unix servers and virtualization platforms such as VMware or KVM, ensuring seamless operations across the infrastructure.
Backup & Recovery: Implement and manage robust backup and disaster recovery solutions to ensure data integrity and minimize system downtime.
Technical Support: Provide expert-level technical support for server and network infrastructure-related issues.

Required Skills & Qualifications:
Proven experience in Kubernetes deployment, configuration, and administration.
Strong command of containerization technologies, including Docker and containerd.
Hands-on experience with cloud platforms such as AWS, Azure, and GCP.
Proficiency in Infrastructure as Code (IaC) tools like Terraform and Ansible.
Familiarity with CI/CD pipelines and automation tools like Jenkins and GitLab CI/CD.
Excellent troubleshooting and problem-solving skills.
Strong communication and collaboration abilities, with the capability to work effectively across cross-functional teams.

If you're passionate about DevOps, Kubernetes, and driving the success of containerized environments, we'd love to hear from you!

Technical Lead Technical lead DevOps Lead devops

Deputy Manager - Mechanical Maintenance

Jindal Aluminium

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Deputy Manager - Mechanical Maintenance
Department: Maintenance
Location: Bengaluru

Role Overview:
We are seeking an experienced and proactive Deputy Manager - Mechanical Maintenance to lead and manage the mechanical maintenance function at our Bengaluru facility. The ideal candidate will be responsible for ensuring the optimal performance, reliability, and safety of mechanical equipment through strategic planning and execution of maintenance activities. This role demands a hands-on leader who can drive operational efficiency, reduce downtime, and ensure compliance with industry standards.

Key Responsibilities:
Plan, schedule, and implement preventive and predictive maintenance programs to maximize equipment uptime and longevity.
Manage the maintenance budget, ensuring efficient allocation of resources while maintaining quality and performance standards.
Troubleshoot and resolve mechanical failures promptly to support uninterrupted production operations.
Lead and supervise a team of maintenance technicians, ensuring adherence to safety procedures, SOPs, and company policies.
Collaborate with production teams to coordinate planned shutdowns and maintenance activities with minimal disruption.
Maintain accurate documentation of maintenance activities, equipment history, spare parts usage, and performance metrics.
Develop and implement strategies to improve equipment reliability, reduce breakdowns, and enhance operational performance.
Ensure all maintenance practices comply with relevant statutory and regulatory requirements.
Lead mechanical maintenance projects, including new equipment installations, upgrades, and commissioning.
Mentor and train team members to build technical capabilities and foster a culture of continuous improvement.

Qualifications & Skills:
Bachelor's degree (B.E/B.Tech) in Mechanical Engineering.
Proven experience in a mechanical maintenance leadership role, preferably in a manufacturing or industrial environment.
Strong knowledge of preventive and predictive maintenance techniques.
Experience in managing budgets, projects, and cross-functional teams.
Excellent problem-solving, communication, and leadership skills.
Familiarity with regulatory requirements and industry safety standards.

Manager Deputy manager Mechanical Manager mechanical Mechanical manager

DevOps + Tester

Sourcefuse

4-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: DevOps + Tester
Location: Bangalore, India
Experience: 4-5 years
Industry: IT Services
Job Type: Full-time

Role Overview
This hybrid DevOps + QA role focuses on:
Ensuring mobile application performance and reliability.
Driving automation, CI/CD, and continuous improvement.
Designing and executing automated test scripts and performing integration, regression, and performance testing.
Supporting innovation and scalable software deployment in alignment with Rakuten's standards.
You'll collaborate closely with development, QA, and operations teams, while improving infrastructure and testing frameworks.

Key Skills & Tools
CI/CD Tools: Jenkins, Bamboo, Docker
Testing: Automation, Integration, Regression, Performance Testing
Cloud Platforms: AWS, Azure, GCP
Salesforce Ecosystem: 1-2 years hands-on experience preferred
API Integration: Including legacy systems
Test Scripting Tools: Open-source or commercial frameworks
Solid grasp of software architecture, high availability, and transaction-intensive systems

Responsibilities
Monitor and optimize app performance
Develop and maintain automated test scripts
Execute integration and regression testing
Conduct performance tests during pipeline integration
Collaborate across DevOps, development, and QA teams
Maintain detailed test documentation
Conduct unit tests, code reviews, and QA validations
Ensure service quality and customer satisfaction

Education & Qualifications
Bachelor's degree in CS, IT, Engineering, or related field (required)
MBA or advanced degree (preferred)
Salesforce Admin or PD certification (preferred)

Ideal Candidate Traits
Strong DevOps + Testing blend with cloud experience
Effective communication with technical and non-technical teams
Strategic thinker with planning skills
Thrives in fast-paced environments, managing multiple priorities

Interview Process
2 Technical Rounds

DevOps Full-Time Test automation Continuous integration Continuous deployment

Principal Engineer, Google Cloud VMware Engine

Google Careers

15+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Principal Engineer, Google Cloud VMware Engine
Google - Bengaluru, Karnataka, India

Join Google Cloud in Bengaluru, Karnataka, India and play a pivotal role in the Google Cloud VMware Engine (GCVE) team! Our mission is to empower customers to seamlessly move their VMware-based applications to Google Cloud without altering their existing apps, tools, or processes. As a Principal Engineer, you will spearhead the technical strategy for GCVE, driving the development of highly performant and scalable infrastructure with fully redundant and dedicated networking. You will ensure the availability necessary for demanding enterprise workloads and pioneer innovative solutions to dramatically reduce cloud migration timelines.

In this critical role, you will lead the execution and delivery of the overall GCVE Systems Design. This includes influencing compute server design, platform architecture, network fabric, large cluster level Service Level Objectives (SLOs), reliability, observability, and alerting. You will also be instrumental in how these systems integrate with the broader Google Cloud Infrastructure and Services. Collaboration with external partners, particularly VMware, to co-create groundbreaking solutions will be a key aspect of this role. Leveraging your deep understanding of emerging cloud technologies, you will address the unique cloud requirements of VMware workloads. Your technical expertise will be crucial in bringing innovative software solutions to market, understanding enterprise workload requirements, and utilizing open source technologies.

Google Cloud is dedicated to accelerating every organization's digital transformation. We provide enterprise-grade solutions leveraging Google's cutting-edge technology and developer-friendly tools. Join us as a trusted partner for customers in over 200 countries and territories, enabling their growth and solving their most critical business challenges.

Responsibilities:
Develop a forward-thinking technical roadmap for the Google Cloud VMware Engine organization, fostering continuous innovation and the implementation of novel systems solutions.
Design, build, and deploy highly scalable solutions leveraging Google compute platforms, robust hardware, advanced networking, and sophisticated software infrastructure to deliver exceptional systems for VMware workloads at scale.
Collaborate effectively across engineering teams involved in the build, design, and implementation of hardware and software spanning infrastructure domains such as platforms, server architecture, compute, storage, networking, and data analytics.
Influence and establish best engineering practices for managing robust and scalable systems to proactively address exponential market demand.
Provide technical leadership for cloud developer technology within Google and manage collaborations with cross-functional Engineering teams to optimize the adoption of Google Cloud Platform capabilities, both internally and for the broader cloud industry.

Minimum Qualifications:
Bachelor's degree in Computer Science, Electrical Engineering, or equivalent practical experience.
15 years of experience in hardware and software design, data structures, and algorithms.

Preferred Qualifications:
Master's degree.
Proven experience in delivering successfully within stipulated timelines.
Demonstrated experience in successfully building software and large-scale distributed systems.
Comprehensive understanding of private and public cloud design considerations and limitations in virtualization, global infrastructure, hypervisor technologies, networking, data storage, and security.
Exceptional ability to work cross-functionally, partnering effectively with groups such as Sales, Engineering, Product Management, Product Marketing, UX, and UI, skillfully brokering trade-offs with stakeholders and understanding their diverse needs.
Excellent narrative and storytelling skills with a proven ability to drive usage, adoption, and market momentum.

Principal Engineer Principal engineer Google Cloud

Site Reliability Developer 2/3

Oracle

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Description: Site Reliability Engineer - OCI Cloud Engineering Team Role: Site Reliability Engineer (SRE) Team: OCI OLTP (Online Transaction Processing) Location: Kiev Career Level: IC2 Experience: 5+ years Overview: Oracle Cloud Infrastructure s (OCI) OLTP organization is seeking a Site Reliability Engineer (SRE) to join our dynamic and fast-paced Cloud engineering team. The team is responsible for mission-critical distributed systems and cloud services, and we are looking for an engineer who is deeply interested in databases, distributed systems, and cloud services. If you thrive in an environment where innovation, problem-solving, and operational excellence intersect, this is an exciting opportunity for you! As a member of the SRE services, you will focus on Cloud Services, building deployments, operations, security vulnerability mitigation, and automation. You will be instrumental in fostering a culture of Site Reliability Engineering (SRE) within the team, and your work will directly contribute to ensuring the stability, performance, and reliability of Oracle s global cloud service infrastructure. This role requires someone who is adaptable, highly motivated, and capable of managing large-scale cloud environments with a focus on continuous improvement. Key Responsibilities: Cloud Service Operations & Reliability: Deploy, operate, and maintain large-scale cloud service products in a highly available, fault-tolerant, and scalable environment. Collaborate with internal teams to identify and mitigate cross-team issues that pose operational risks to cloud services. Focus on systems reliability and ensure the continuous availability of cloud services by automating tasks and eliminating manual interventions. Automation & Improvements: Automate operational tasks and improve service deployments, focusing on scaling, performance, and uptime. Contribute to CI/CD systems, ensuring seamless integration and continuous delivery for cloud-based services. 
Leverage automation tools such as Terraform, Grafana, and Bitbucket to streamline operations. Security & Incident Response: Mitigate security vulnerabilities within cloud services and ensure compliance with Oracle's security standards. Participate in on-call rotations to provide immediate troubleshooting support and ensure rapid issue resolution. Perform deep analysis of service performance and collaborate with team members to diagnose and resolve issues that affect service availability or performance. Collaborative Problem-Solving: Work closely with cross-functional teams, including development, database, networking, and storage experts, to ensure the reliability and performance of services. Identify systemic issues and potential risks, develop solutions, and ensure proper documentation and communication with stakeholders. Documentation & Knowledge Sharing: Contribute to documentation such as runbooks, operational guides, and troubleshooting manuals. Mentor junior engineers and share knowledge on best practices for site reliability engineering and cloud service operations. Continuous Learning: Stay up to date with new cloud technologies, trends, and best practices, and actively implement them in your day-to-day work. Technical and Professional Requirements: Cloud Services & Infrastructure: 5+ years of experience in SRE, DevOps, or Automation roles with a focus on large-scale infrastructure and cloud services. Hands-on experience with cloud platforms (e.g., OCI, AWS, Azure) and expertise in compute, database, networking, and storage services within cloud environments. Automation & Tooling: Proficiency with automation tools such as Terraform, Grafana, LumberJack, and Shepherd. Solid experience in using CI/CD tools and processes for cloud service deployments and operations. Scripting & Systems: Strong knowledge of scripting languages, particularly Python and Java. 
Familiarity with Linux systems, Docker containers, virtualized infrastructure, and orchestration (e.g., Kubernetes). Performance & Troubleshooting: Excellent troubleshooting skills with a focus on performance, availability, reliability, and scalability of distributed systems. Experience in operating fault-tolerant, highly available, high-throughput distributed systems. Security & Incident Management: Familiarity with security practices and mitigating security vulnerabilities in cloud services. Proven ability to handle incident response and provide efficient troubleshooting during on-call rotations. Collaboration & Communication: Strong verbal and written communication skills, capable of working effectively with diverse teams across multiple geographies. Ability to work in a highly collaborative environment, driving operational excellence and customer satisfaction. Preferred Qualifications: Experience in operating and maintaining multi-tenant, cloud-based infrastructure with a focus on scalability and high availability. Familiarity with tools and platforms like Grafana, Prometheus, and other observability and monitoring tools. Experience in networking and storage technologies in a cloud environment. Joining OCI's OLTP team as an SRE gives you the opportunity to work with cutting-edge technologies and contribute to the operational excellence of Oracle's global cloud infrastructure. This is a chance to grow your skills in a highly dynamic environment and to solve complex problems that directly impact mission-critical cloud services. With a focus on automation, scalability, and high performance, you will be an essential part of a team that powers Oracle's leading cloud services. If you are an experienced engineer passionate about cloud technologies, automation, and ensuring the reliability of large-scale systems, we encourage you to apply and join us in this exciting journey!

Site Reliability Site reliability Developer Site developer

Senior DevOps / Site Reliability Engineer

Blue Yonder

10-13 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior DevOps / Site Reliability Engineer Location: Pune, India Company: Blue Yonder Experience: 10 to 13 years Education: Bachelor's Degree in Computer Science, Engineering, or related STEM fields Company Overview Blue Yonder is a leading AI-driven Global Supply Chain Solutions provider and consistently recognized as one of Glassdoor's Best Places to Work. We are driving the next wave of digital transformation in manufacturing and retail, delivering innovative SaaS solutions that power intelligent supply chains across the globe. We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to lead the design, development, deployment, and operational management of our Azure SaaS solution. This role requires strong DevOps, cloud delivery, and infrastructure automation expertise, along with leadership capabilities to guide a growing global team. Role Overview In this role, you will be responsible for architecting, planning, and executing end-to-end delivery pipelines, supporting both product development and operational stability. Working closely with platform, product, and architecture teams, you will implement best-in-class DevOps and SRE practices, ensuring scalability, resilience, and cost optimization. Key Responsibilities Architect, design, and manage CI/CD pipelines and infrastructure for a cloud-native, multi-tenant SaaS solution on Azure. Lead sprint planning, backlog grooming, and architecture discussions. Develop quality automation scripts and tools to reduce manual efforts and enable self-healing, self-service capabilities. Identify and resolve operational bottlenecks and proactively improve observability (monitoring, alerting, logging). Participate in code reviews, ensure secure and scalable designs, and mentor junior and mid-level engineers. Collaborate with stakeholders to understand business and technical requirements and translate them into actionable user stories. Implement and enforce cloud cost optimization strategies.
Conduct post-incident reviews with a blameless culture to identify root causes and drive continuous improvements. Automate service requests and standard operational procedures. Drive improvements to the team's continuous integration pipeline, ensuring rapid and reliable deployments. Stay updated with the latest DevOps, SRE, and cloud technologies and bring innovative ideas to the table. Participate in team hiring and actively contribute to onboarding new team members. Technical Environment Languages: Java, Python, PowerShell, Shell Scripting DevOps Tools: Azure DevOps, GitHub Actions, Jenkins Cloud: Microsoft Azure (ARM Templates, AKS, Event Hub, HDInsight, Azure AD, Application Gateway, Virtual Networks) Architecture: Microservices, Kubernetes, Docker, Event-driven architecture Frameworks: Spring Boot, Hibernate Monitoring & Logging: Elasticsearch, Spark, Kafka Databases: RDBMS, NoSQL Version Control: Git Required Skills & Experience Bachelor's Degree (STEM preferred) with 10 to 13 years of experience in DevOps, Cloud Delivery, or Site Reliability Engineering. Proven hands-on experience with Azure Cloud Services. Expertise in setting up and optimizing CI/CD pipelines. Strong scripting experience: Shell and PowerShell are mandatory; Python is a plus. Strong understanding of container technologies (Docker, Kubernetes) and microservices architecture. Experience integrating and managing third-party monitoring and logging tools. Strong problem-solving skills and ability to work with global, cross-functional teams. Excellent communication and stakeholder management skills. Nice to Have Development experience in Java or Python. Experience working in agile teams with a product-centric mindset. Experience working in manufacturing or retail domains. Exposure to AI/ML-driven monitoring and observability tools. Work with cutting-edge technologies on globally impactful solutions. Collaborate with diverse and talented teams across the US, India, and the UK.
Foster your career growth through mentorship, continuous learning, and leadership opportunities. Experience an inclusive, flexible work culture where innovation and creativity thrive. Diversity, Inclusion, Value & Equality (DIVE) At Blue Yonder, we are committed to building an inclusive environment where everyone feels empowered to be themselves. All qualified applicants will receive consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. Qualification: Bachelor's Degree in Computer Science, Engineering, or related STEM fields

Software Engineer Staff Engineer Software Engineer Staff software engineer

Senior Site Reliability Engineer

Couchbase

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Site Reliability Engineer (SRE), Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency, from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime.
Cloud Optimization: Stay up to date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts.
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!

Senior Site Reliability Site reliability Engineer

Software Engineer Iii, Infrastructure, Core

Google Careers

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Software Engineer About the Role: At Google, our Software Engineers are at the forefront of innovation, designing and developing cutting-edge technologies that shape how billions of users connect, explore, and interact with information. Our products operate at an immense scale, extending far beyond web search, and require engineers who bring fresh perspectives from diverse technical domains, including information retrieval, distributed computing, large-scale system design, networking, security, artificial intelligence, natural language processing, UI design, and mobile development. As a Software Engineer, you will contribute to mission-critical projects, collaborating with teams across Google to develop, test, deploy, maintain, and enhance software solutions. Your versatility, leadership abilities, and enthusiasm for solving complex challenges will be crucial as you navigate projects across the full technology stack. The Core Team serves as the backbone of Google's technical infrastructure, building the foundational elements behind our flagship products. This team is responsible for developing essential developer platforms, product components, and infrastructure that drive innovation across Google's ecosystem. As a member of this team, you will play a pivotal role in breaking down technical barriers, optimizing existing systems, and making key architectural decisions that influence the entire organization. Key Responsibilities: Design, develop, and maintain high-quality software solutions that support Google's technical infrastructure and products. Participate in and lead design reviews with peers and stakeholders, evaluating available technologies to determine optimal solutions. Conduct thorough code reviews to ensure adherence to best practices, including code quality, efficiency, accuracy, testability, and compliance with style guidelines.
Contribute to documentation and educational resources, updating content based on product enhancements and user feedback. Troubleshoot and debug complex system issues, analyzing their impact on hardware, networks, and service operations to maintain optimal performance and reliability. At Google, we foster a culture of continuous learning, innovation, and technical excellence. If you're passionate about solving challenging problems and building world-class technology, we invite you to be part of our journey. Qualification: Bachelor's degree or equivalent practical experience.

Software Engineer Software Engineer Engineer software Engineer iii

Senior Performance Engineer

Boomi Software

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Senior Performance Engineer Are you ready to work on world-changing technologies? Today, organizations need to move with increased agility and insight to grow and thrive. Boomi is one of the hottest tech companies in the SaaS/Cloud industry, named a Leader for the eighth year in a row in the Gartner Enterprise iPaaS Magic Quadrant and recently recognized by Inc. Magazine as one of the best workplaces. Our award-winning, patented technology is transforming the world of integration by making enterprise-class integration technology accessible and affordable to companies of all sizes. Boomi provides the foundation on which your business can evolve and innovate. According to a recent survey by Vanson Bourne, connected businesses are far outpacing their competitors. We help organizations connect everything and engage everywhere across any channel, device or platform. More than 7,000 organizations are using Boomi to run better, faster and smarter. Working at Boomi means doing what you love. We hire trailblazers with an entrepreneurial spirit who can solve challenging problems, make a real impact in technology and want to build something big. If you are passionate about solving hard problems, enjoy working with world-class people and developing cutting-edge technology, you should explore a career with Boomi. Learn more at http://www.boomi.com/ or visit Boomi Careers. Join us as a Performance Engineer on our Performance, Scalability and Resiliency (PSR) Engineering team in Bangalore/Hyderabad, India to do the best work of your career and make a profound social impact. What you'll achieve As a Performance Engineer, you will be responsible for validating and recommending performance optimizations in Boomi's computing infrastructure and software. You will work with our Product Development and Site Reliability Engineering teams on Performance monitoring, tuning and tooling.
You will: Analyze Software Architecture (monolith and micro-service) and identify potential areas of performance, scalability and resiliency improvements. Identify KPIs, perform trending and analysis, identify patterns and engineer remedial solutions for a highly performant, fault-tolerant and resilient platform and application stack. Design, automate and perform scalability and resiliency tests using various tools like JMeter, Chaos Monkey or similar. Use the observability stack to improve diagnosability and trending around Performance bottlenecks. Identify performance tuning opportunities and recommend remedial solutions. Take the first step towards your dream career Every Boomer brings something unique to the table. Here's what we are looking for with this role: Essential Requirements Expert in performance engineering fundamentals: arrival rate, workload models, responsiveness, computing resource utilization, time complexity, scalability, resiliency, etc. Expert in monitoring performance using native Linux OS, Application Performance Management (APM) and Infrastructure monitoring tools. Experience in analyzing crash dumps, thread dumps and SQL slow query logs to identify performance bottlenecks. Expert in recommending optimal resource configurations in Cloud, Virtual Machine, Container and Container Orchestration technologies. Flexibility to work in a remote and geographically distributed team environment. Desirable Requirements Experience in writing data extraction and custom monitoring tools using any programming language: Java, Python, R, Bash or similar. Experience in capacity planning and modelling using AI/ML, queueing models or similar approaches. Performance tuning experience in Java or similar application code.

Senior Performance Engineer Senior engineer Performance engineer

Software Principal Engineer - Sre

Boomi Software

7+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Senior Site Reliability Engineer Join us as a Senior Site Reliability Engineer on our Reliability Team and do the best work of your career while making a profound social impact. In this role, you will design and build sophisticated systems and software that align with our customers' business goals and environments. You will collaborate with product management, engineering teams, customer success, and support to deliver innovative features and enhancements across Boomi's product offerings. Key Responsibilities Incident Management & SLAs: Participate in detecting, remediating, and reporting production incidents, ensuring that SLAs and SLOs are well-defined and consistently met. On-Call Rotation: Provide on-call support for planned and unplanned events. Collaboration: Partner with engineering teams to implement improvements, standardize processes, and drive consistent results. Disaster Recovery: Lead DR exercises, game days, and readiness training with SRE and engineering counterparts. Observability & Tooling: Collaborate with service engineering teams to build and automate tooling, implement best practices in observability, and ensure the scalability and reliability of Boomi's production services. Infrastructure Automation: Automate provisioning and maintenance of Boomi's infrastructure using tools like Terraform and Ansible. Technical Mentorship: Guide and mentor other engineers through design collaboration and code reviews. What You'll Bring Essential Requirements Expertise in defining, measuring, and improving reliability metrics (SLOs, SLIs, error budgets). Strong experience in observability practices (monitoring, logging, distributed tracing), preferably using Splunk and New Relic, including the ability to create custom dashboards from scratch. Proficiency in infrastructure automation using Terraform, CloudFormation, and Ansible playbooks, with scripting experience in Python.
Hands-on experience conducting and automating disaster recovery (DR) exercises in AWS, validating RPOs and RTOs. Deep understanding of AWS components and the ability to design and implement APIs for internal use. Desirable Requirements 7+ years of experience in the software engineering industry, with exposure to large-scale production systems. Cloud certification (AWS, Azure, GCP, Oracle), with experience in services such as compute, containers, and databases. Experience in containerization best practices, cloud-native concepts, and security awareness in the cloud. Working at Boomi means doing what you love, surrounded by trailblazers with an entrepreneurial spirit. Our culture fosters innovation, encourages collaboration, and celebrates the unique contributions of every individual. Take the first step toward your dream career at Boomi, where ideas shape the future of technology.

Software Principal Engineer Software Engineer Engineer software

Site Reliability Engineer - Z Platform

Ibm India

Fresher | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Introduction: The IBM CIO Technology Platform Transformation team plays a crucial role in modernizing IBM's technology infrastructure and platforms. By leveraging emerging technologies such as AI, machine learning, and cloud computing, the team aims to enhance security, streamline processes, and improve user experience. The team's mission is to optimize IT functions, reduce technical debt, and drive automation while fostering a culture of innovation and continuous improvement. In this role, you will be joining the CIO Hybrid Cloud Z Platform & Strategy Team, where you will maintain, support, and enhance multiple aspects of the Z environment, including z/OS storage management, performance, and networking tools. You will play a key role in automation, identifying improvements, and collaborating with various teams to improve operational efficiency. Responsibilities: Technical Support & Problem Resolution: Provide problem determination and source identification to resolve technical issues within the Z environment. Automation & Optimization: Recommend and implement optimization strategies and automation processes for technical support tools, procedures, and systems. Collaboration: Work closely with global teams to diagnose, prioritize, and resolve issues, providing guidance and fostering collaboration across various teams. Mentorship & Training: Offer technical training and mentorship to other members of the team, sharing knowledge and best practices. System Design & Implementation: Lead system design discussions for z/OS systems, plan and implement new solutions, and ensure they align with business goals and objectives. Continuous Improvement: Identify points of improvement in technical processes and propose innovative solutions through automation to enhance operational efficiency. Required Education & Experience: Bachelor's Degree in a relevant field (required). Master's Degree (preferred). 
Technical Expertise: Deep Knowledge of z/OS & z/OS Storage Support: Proficient in managing z/OS and storage in a mainframe environment, including installation, configuration, high availability, performance tuning, and security. Cloud Infrastructure & Network Knowledge: Experience in cloud infrastructure and network technologies, particularly in the context of z/OS environments. Problem Solving & Autonomy: Strong problem-solving skills, with the ability to work autonomously, meet goals, and apply innovative thinking to solve complex problems. Global Team Collaboration: Experience working with global teams across different locations, contributing to a collaborative and solution-driven environment. System Design & Leadership: Proven ability to lead system design discussions and plan and implement z/OS system configurations and changes. Fluent in English: Strong written and verbal communication skills in English. Preferred Technical Experience: REXX Programming: Experience with REXX programming for automation and scripting in a z/OS environment. Ansible on Mainframe: Familiarity with Ansible for automation on Mainframe systems. Zowe, ZOAU, R3S Knowledge: Knowledge of Zowe, ZOAU, and R3S for modernizing mainframe environments. Middleware & Systems Management Experience: Experience with z/OS middleware systems such as DB2, IMS, Base/Storage/TWS/Network/Netview. Automation & Testing: Experience in automating workloads, performing test automation, and optimizing operational workflows. Basic Container Technology Knowledge: Familiarity with container technologies and tools such as Docker, Kubernetes, and Red Hat OpenShift. About the Business Unit: The IBM Finance Organization is responsible for driving enterprise performance and transformation. As the financial stewards of IBM, we deliver IBM's financial strategy, develop new business models, and mitigate enterprise risk. 
The group is focused on creating value and improving the financial aspects of IBM's business across a variety of sectors, including accounting, financial planning, business controls, tax, treasury, and business development. Why Join IBM? Innovative Environment: Work on cutting-edge technologies and collaborate with experts in the field. Global Impact: Contribute to the transformation of IBM's hybrid cloud platform and technology infrastructure. Career Development: Gain exposure to advanced systems and automation techniques while working with global teams. Competitive Compensation & Benefits: Enjoy competitive pay, performance-based rewards, and a range of employee benefits. If you're passionate about technology, innovation, and driving automation in a hybrid cloud environment, join the IBM CIO Hybrid Cloud Z Platform & Strategy Team today!

Site Reliability Site reliability Engineer Site engineer

Site Reliability Engineering Professional - Windows

Ibm India

3+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Introduction System Engineers at IBM are integral to the company's strategic initiatives, ensuring the design, coding, testing, and delivery of cutting-edge solutions that power critical global systems. From ensuring transportation runs seamlessly to enabling secure financial transactions, the role of System Engineers is pivotal. At IBM, you'll leverage advanced development tools and work with industry-leading experts to create solutions you can take pride in. Your Role and Responsibilities We are seeking a skilled Windows Administrator to manage and maintain Windows operating systems and server networks within a critical cloud environment. In this role, you will: Install or upgrade Windows-based systems and servers. Manage user access to servers and maintain network stability and security. Troubleshoot and resolve complex IT issues. Enhance operational efficiency through automation and collaboration. This position demands a high level of technical expertise and a proactive approach to solving challenges. A successful Windows Administrator ensures seamless operations while upholding the highest security standards. Responsibilities Include: Managing critical IaaS infrastructure supporting customer workloads. Allocating and managing tools, frameworks, and assets to enhance engineering productivity and service delivery. Promoting consistent and efficient practices across teams. Who You Are You are a curious and passionate technologist eager to innovate and adopt emerging technologies. Your technical foundation is solid, and you thrive in dynamic environments, juggling multiple responsibilities to deliver integrated solutions. Who You'll Work With You will collaborate with a diverse and dynamic team, including architects, QA specialists, product managers, and delivery teams. This role promises variety, innovation, and the opportunity to make meaningful contributions daily.
Required Education Bachelor's Degree Required Technical and Professional Expertise 3+ years of experience in Windows OS administration, including installation, upgrades, and troubleshooting. Expertise in setting up and configuring Windows Active Directory. 3+ years of hands-on experience with PowerShell scripting. Preferred Technical and Professional Expertise Proficiency in network configuration and troubleshooting. Experience with backup and recovery solutions. By joining IBM, you'll contribute to a collaborative environment where innovation meets practical application, driving solutions that make the world run better. Qualification: Bachelor's Degree

Site Reliability Site reliability Engineering Site Engineering

Staff Software Engineer (backend)

6sense

8+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Role: Staff Software Engineer (Backend) We are seeking an experienced Staff Software Engineer (Backend) with a deep understanding of building highly scalable and reliable systems. This role involves working with large datasets, distributed systems, and cutting-edge technologies in a fast-paced, growth-driven environment. Required Skills & Experience 8+ years of industry experience, particularly in technology-focused organizations, with a preference for experience in startups. Proven experience working with large-scale datasets (10s of millions of documents). Expertise in building scalable and available system architectures. Hands-on experience with in-memory cache systems such as Redis, and distributed NoSQL stores (e.g., ElasticSearch, Cassandra, HBase, MongoDB). Strong proficiency in languages like Java, Python, or Scala. Ability to handle complex business processes and work with vast amounts of data. Experience in building microservices and distributed systems is highly preferred. Our Dual Missions For the world: Improve transparency and trust in the B2B ecosystem. For ourselves: Lead fulfilling, impactful lives. Our Core Values (How We Act) Have Empathy. Push boundaries. Make data-driven decisions. Take smart risks. Have fun at work. Our Benefits As a full-time employee, you'll enjoy a range of benefits designed to support your well-being and work-life balance: Comprehensive health coverage, paid parental leave, and generous paid time off (PTO) and holidays. Quarterly self-care days off to recharge. Stock options for long-term growth. Remote work flexibility with the tools and support needed to connect with your team, whether at home or in the office. A growth-focused culture, with access to LinkedIn Learning and other learning and development initiatives. Employee well-being programs, including quarterly wellness education sessions and activities celebrating diversity and inclusion.
We prioritize your well-being and provide opportunities for personal and professional growth. Qualification : 8+ years of industry experience, primarily within technology-focused organizations, with a preference for start-ups.

Software Engineer Staff Engineer Software Engineer Engineer software

Backend Developer

Ibm

4+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

At IBM, work is more than a job; it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk. Your Role and Responsibilities We are looking for a developer highly interested in developing an innovative, future-oriented solution that automates and simplifies the installation, configuration and testing of Linux on Z systems. Your main duties will include designing and implementing new features, optimizing and maintaining existing code, and ensuring the software meets high-quality standards through testing and debugging. You will also work closely with the customers to ensure the software meets their needs. Step in and be part of the IBM System Development Lab community, outstanding for its innovation and team spirit, offering one of the broadest project portfolios of hardware and software technologies within the IBM Corporation. Engineers in our team work inside a highly agile development environment and are responsible for the full software development life cycle, ranging from designing and implementing new product features and testing for industry-leading quality assurance to continuous product delivery and supporting our global customers. You should be thrilled by emerging technologies with our software products for future Mainframe and Cloud-based markets. What you will do: The IBM Z Hyper Protect Virtual Servers team is seeking an experienced Backend Developer. As a Backend Developer, you will be part of a highly focused, self-managed team that designs, develops and tests secure solutions created for Z Systems workloads and applications. Responsible for all aspects of management, improvement, and support of the microservice platform's Linux-based infrastructure.
Provide feedback to architects regarding any issues that present themselves. Manage projects with various priority levels and timelines from start to finish. Act as escalation point for internal support departments in resolving a wide variety of customer-facing issues regarding environment deployment, service issues, and technical questions. Consistently meet deadlines for complex issues and new projects involving multiple teams. Demonstrate best practices in all aspects of administration. Leverage various languages to build features based on an architectural design. Develop and maintain accurate documentation for internal procedures and services. Continuously stay abreast of new developments in supported operating systems to ensure consistent compatibility with established infrastructure. Must collaborate with other departments to resolve complex issues and be detail oriented. Ability to automate solutions to repetitive problems/tasks. Required Technical and Professional Expertise Up to 4 years of working experience with Linux distributions (Ubuntu/RHEL) in a production environment. Strong background in software development with in-depth knowledge of Python, designing REST APIs and working knowledge of distributed services. Strong development skills in OpenStack and its components. Knowledge of core Linux development. Strong skills in GitHub, shell scripting (ksh/bash), containers and orchestration, system monitoring, Jenkins, CI/CD pipeline integration and end-to-end tests, playbooks and process automation. Understanding of container technologies like Docker/Podman, orchestration tools (Kubernetes, OpenShift), and digital certificates. Good understanding of the IBM Carbon Design System. Knowledge of React.js and Playwright. Working experience with security scan tools: VA Scan/App Scan, OWASP ZAP, Contrast. A self-starter with excellent problem-solving skills, able to work independently and as part of a team.
Excellent presentation and soft skills, and strong English communication skills, both written and verbal. Preferred Technical and Professional Expertise Knowledge of deployment on OpenShift. Knowledge of cloud technology including network, storage and compute. Knowledge of zLinux operating systems and virtualization/hypervisors. Qualification: Strong background in software development with in-depth knowledge of Python, designing REST APIs and working knowledge of distributed services.

Backend Developer Full-Time Server-Side Development

Site Reliability Engineer -- Logging And Monitoring

IBM (International Business Machines)

2-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Introduction A career in IBM Software means you'll be part of a team that transforms our customer's challenges into solutions. Seeking new possibilities and always staying curious, we are a team dedicated to creating the world's leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career. IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive. Your role and responsibilities In this role, you will build and maintain an observability stack for IBM's Cloud Object Storage service using managed services as well as custom-built services. This stack is used by Cloud Object Storage SREs and devs to understand the health of the service. Work duties and responsibilities include: Design, set up, configure and implement the COS Monitoring System using technologies such as Elasticsearch, Logstash, Kibana, Kafka, Kafka Mirrors, Filebeat, Grafana and Sysdig. Automate CI/CD tasks and infrastructure using Ansible, Terraform, Jenkins, and Travis. Experience with microservices and distributed application architecture, such as containers and Kubernetes. Experience with Linux administration and programming languages such as Java, Python and SQL. Performance and configuration tuning to support the increasing load of data flowing into the COS Monitoring System. Provide design recommendations and thought leadership to provide best-in-class observability as part of the COS Monitoring System. Provide 24x7 on-call customer support on a rotational basis. Design and develop dashboards for metrics analysis. Design, develop and configure an alerting solution for an end-to-end incident management and recovery process by integrating Sysdig with PagerDuty, email and Slack.
Required education Bachelor's Degree Preferred education Bachelor's Degree Required technical and professional expertise Ability and tenacity to solve increasingly complex technical issues through analysis and a variety of problem-solving techniques. Working knowledge of Object-Oriented Python with demonstrable experience in applying these skills. Working knowledge of Linux environments. Experience working in an Agile-Scrum development environment. Experience using tools such as Jira, GitHub, and logging and monitoring tools. BS in CS, CE or similar field, plus 2 to 5 years relevant work experience. Qualification: BS in CS, CE or similar field, plus 2 to 5 years relevant work experience.

Site Reliability Site reliability Engineer Site engineer
