Senior Site Reliability Engineer Job at Couchbase

Job Summary

Job Title: Site Reliability Engineer (SRE), Cloud Platform & Production Pipeline Initiatives

Location: Bangalore, India (Office-based role)

About Couchbase:

As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission!

Job Overview:

As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools, providing guidance and mentorship and contributing to the strategic direction of cloud operations.

Responsibilities:

  • Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations.
  • Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.).
  • CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime.
  • Cloud Optimization: Stay up to date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business.
  • Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture.
  • Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance.
  • Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations.
  • Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery.

Requirements:

  • Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments.
  • Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby.
  • Linux Expertise: High proficiency with Linux operating systems.
  • Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS).
  • Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck.
  • Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc.
  • Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker).
  • Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts.

Preferred Skills:

  • Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts.
  • Incident Management: Experience with on-call rotations and incident management.
  • Database Experience: Familiarity with databases, particularly Couchbase.
  • Security Certifications: Relevant certifications in security or cloud technologies are a plus.

Why Couchbase?

Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform.

Benefits at Couchbase:

  • Generous Time Off Program: Flexibility to care for yourself and your family.
  • Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs.
  • Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance.
  • Career Growth: Focused on your career development and success.
  • Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!

Experience Required: Minimum 5 Years

Vacancy: 2-4 Hires
