Platform Engineer GO Kubernetes Jobs in Bengaluru
963 Jobs Found
MTS - Software Development (Cloud AI Network Security Developer)
Aviatrix Systems
MTS - Software Developer (Cloud AI Network Security Developer)
Location: Bengaluru
Company: Aviatrix
Experience: 1-3 years
About Aviatrix: Aviatrix is a cloud network security leader trusted by over 500 enterprises. We specialize in securing multi-cloud environments, offering runtime protection and advanced control for modern cloud infrastructures.
Role Strategy & Impact
In this role, you will build next-generation intelligent cloud network security solutions. You will focus on developing Python/Go microservices that fuse network visibility with LLM-driven insights to redefine cloud firewall capabilities.
Technical Requirements
Core Competencies:
Development: Professional experience in Go (Golang) or Python.
Cloud Networking: Fundamentals of Routing, NAT, VPNs, and Subnets.
Security: Understanding of Firewall concepts (ACLs) and Zero Trust architecture.
AI Integration: Experience using AI/LLM APIs (OpenAI, Vertex AI, etc.).
Data Infrastructure: Workflows involving Kafka, data ingestion, and stream processing.
Cloud Ecosystem: Hands-on familiarity with AWS, Azure, or GCP.
Preferred Qualifications:
Network Observability: Experience with NetFlow, IPFIX, or VPC Flow Logs.
Modern DevOps: Hands-on with Kubernetes, Container Networking, and Terraform.
Generative AI: Knowledge of Prompt Engineering or RAG-based systems.
Key Responsibilities
Control Plane Development: Build services for firewall rules and policy orchestration.
AI Workflows: Integrate LLM-based assistants for anomaly detection and alert summarization.
Telemetry Pipelines: Maintain high-performance data pipelines for security event metrics.
Security Logic: Design logic for threat pattern recognition and posture scoring.
Benefits & Why Join Us
Global Benefits: Private medical, pension, and life assurance.
Work-Life Balance: Generous holiday allowance and annual wellbeing stipend.
Growth Mindset: We value diverse paths. If you are passionate about AI and Security, we want to hear from you.
Staff Engineer - Software Development
Aviatrix Systems
Staff Engineer - Software Development (Cloud AI & Network Security)
Location: Bengaluru
Company: Aviatrix
Experience Required: 7+ Years
About Aviatrix: Aviatrix is a global leader in cloud network security, trusted by over 500 enterprises. We provide a specialized platform for securing multi-cloud environments, giving organizations the control and visibility needed to modernize their cloud strategies.
Architectural Focus & Impact
As a Staff Engineer, you will architect and deliver advanced AI-driven network security solutions. This role bridges the gap between Distributed Systems (Python/Go), Real-time Telemetry, and LLM-integrated automation to build self-learning, adaptive security infrastructures.
Technical Expertise
Core Software Engineering:
Languages: Deep proficiency in Python and Go (Golang).
Distributed Systems: Mastery of Kubernetes, Microservices, and high-scale observability (Prometheus, ELK).
Data Pipelines: Experience with real-time stream processing using Kafka, Flink, Kinesis, or Pub/Sub.
Networking & Security Domain:
Cloud Infrastructure: Expert knowledge of VPC/VNet design, Routing, Load Balancers, and Overlays.
Firewall Technologies: Hands-on with Deep Packet Inspection (DPI), NGFW/IDS/IPS, and Cloud-native firewalls (AWS, Azure, GCP).
Security Frameworks: Alignment with Zero Trust, NIST CSF, and CIS Benchmarks.
AI & Machine Learning Integration:
Model Serving: Experience serving ML models via REST or gRPC.
Generative AI: Familiarity with LLM integration, RAG (Retrieval-Augmented Generation), LangChain, and vector databases.
Key Responsibilities
System Architecture: Lead the design of cloud-native microservices for security control planes.
AI-Driven Features: Integrate LLMs for Natural Language-to-Firewall Rule translation and automated incident summarization.
Technical Leadership: Mentor junior engineers and set high standards through rigorous Design and Code Reviews.
Cross-Functional Collaboration: Partner with Data Scientists and Cloud Networking teams to deliver production-grade AI features.
Benefits & Why Join Us
Regional Package: Comprehensive pension, private medical coverage, and life assurance.
Wellbeing: Annual wellbeing stipend and generous holiday allowance.
Growth Culture: We value unique career paths and prioritize candidates who are passionate about the intersection of AI and Security.
Lead Software Engineer - Scale & Performance
Team VuNet Systems
Lead Software Engineer - Scale & Performance
Location: Bengaluru
Experience: 6-12 years
About VuNet
VuNet is a pioneer in Business Journey Observability, using Big Data and Machine Learning to revolutionize digital experiences in the financial services industry. Our platform delivers end-to-end visibility into customer journeys, helping organizations proactively resolve issues, ensure operational resilience, and deliver superior user satisfaction. With over 28 billion digital transactions monitored every month and serving more than 300 million users globally, VuNet is shaping the future of observability for some of the largest banks and financial institutions. We are Series B funded, part of NASSCOM's DeepTech Club, and recognized by global analysts such as Gartner and Omdia.
Your Role: Lead Software Engineer - Scale & Performance
As a Lead Software Engineer for Scale & Performance, you'll own the performance and scalability benchmarks for VuNet's observability platform. You will work with cutting-edge technologies, design robust test frameworks, and ensure that our platform scales seamlessly to meet the demands of millions of users.
Roles & Responsibilities
Own performance and scalability benchmarking for key platform components (ingestion pipelines, data storage, and query services). Design and execute load, stress, soak, and capacity tests across microservices, agents, and ingestion layers. Identify and resolve performance bottlenecks in both infrastructure (CPU/memory/IO) and application layers (API latency, throughput, GC behavior). Develop and maintain performance test frameworks, preferably using Kubernetes-based environments. Collaborate with DevOps and SRE teams to optimize system configurations (Kubernetes, Postgres/TimescaleDB, ClickHouse, Kafka) for scale. Implement OpenTelemetry for service instrumentation to monitor system health and latency (p50/p95/p99 metrics). Contribute to capacity planning, scaling strategies (horizontal/vertical), and resource optimization. Analyze production incidents related to scaling issues and drive permanent fixes. Work with engineering teams to design scalable architecture patterns and define SLIs/SLOs for system performance. Document performance baselines, tuning guides, and scalability best practices for internal use.
What You Bring
Mandatory Skills: Strong background in performance engineering for large-scale distributed systems or SaaS platforms. Expertise in Kubernetes, container runtimes (containerd/Docker), and resource profiling in containerized environments. Solid understanding of Linux internals, CPU/memory profiling, and network stack tuning. Hands-on experience with observability tools (Prometheus, Grafana, OpenTelemetry, Jaeger, Loki, Tempo, etc.). Familiarity with observability platform datastores like ClickHouse, PostgreSQL/TimescaleDB, Elasticsearch, or Cassandra. Experience with performance benchmarking tools such as k6, Locust, JMeter, or custom Golang/Python scripts. Ability to interpret system metrics (CPU usage, memory, GC, latency) and correlate across different layers.
Nice-to-Have Skills: Experience with agent benchmarking (OpenTelemetry Collector, custom data shippers). Exposure to streaming systems like Kafka, NATS, or Pulsar. Familiarity with CI/CD pipelines for performance testing and regression tracking. Knowledge of cost optimization and capacity forecasting in cloud environments (AWS/GCP/Azure). Proficiency in Go, Python, or Bash scripting for automation and data analysis.
Life at VuNet: At VuNet, we're building a world-class observability platform, and we're just getting started. You'll be part of a passionate, problem-solving team that embraces collaboration, fast learning, and staying ahead of emerging technologies like Gen AI. We foster a high-trust, inclusive culture where collaboration, ownership, and innovation are central to our success. If you're looking to work on cutting-edge tech, make a real impact, and grow with a supportive team, you'll fit right in at VuNet.
Benefits: Comprehensive health insurance coverage for you, your parents, and dependents. Mental wellness and 1:1 counseling support. A culture that promotes continuous learning, innovation, and career growth. Transparent, inclusive, and high-trust workplace. Opportunities for skill enhancement with training programs focused on new Gen AI technologies.
Senior QA Engineer
Team VuNet Systems
Senior QA Engineer - AI-Powered Observability Platform
Location: Bengaluru
Experience: 6-10 years
About VuNet
VuNet is at the forefront of Business Journey Observability, revolutionizing the financial services industry with Big Data and Machine Learning. Our deep-tech platform provides comprehensive visibility into customer journeys, enabling proactive issue resolution, operational resilience, and superior user experiences. We monitor over 28 billion digital transactions monthly, serving 300 million users globally, and we're powering some of the largest banks and financial institutions in India and MEA. VuNet is Series B funded, part of NASSCOM's DeepTech Club, and recognized by analysts like Gartner and Omdia.
Your Role: Senior QA Engineer - AI-Powered Observability Platform
As a Senior QA Engineer at VuNet, you'll play a crucial role in ensuring the quality and reliability of our VuSmartMaps Observability Platform. You'll lead the design and implementation of cutting-edge test automation, performance validation, and reliability frameworks across distributed systems that handle billions of telemetry events. Working closely with development, operations, and QA teams, you will drive quality across the entire platform and play a key role in ensuring that our systems are scalable, resilient, and performant.
Roles & Responsibilities
Quality Strategy Ownership: Own the end-to-end quality strategy for observability platform components (metrics, logs, tracing, alerting, dashboards, MLOps).
Automated Testing: Build and maintain automated test suites for data pipelines, APIs, and integration flows involving tools like Prometheus, Grafana, Loki, Elastic, and OpenTelemetry.
Performance Validation: Design and execute tests to validate high-throughput, distributed systems under real-world load conditions, ensuring performance benchmarks are met.
Test Frameworks Development: Develop and maintain test frameworks and tools using Python, Go, Bash, pytest, k6, Playwright, and others.
System Reliability & Alerting: Define and implement test coverage for system reliability, alerting accuracy, and visualization correctness.
Collaboration: Partner with developers, SREs, and DevOps teams to shift quality left in the development lifecycle, contributing to CI/CD pipelines and automation workflows using GitOps tools.
Automation Integration: Integrate automated test suites into smoke, functional, and regression pipelines using Jenkins, Spinnaker, and other CI/CD tools.
Mentorship: Mentor junior QA engineers, establish best practices, and ensure consistency in the QA discipline across the team.
What You Bring
Mandatory Skills:
Experience: Minimum 6+ years in software quality engineering, with a focus on automated testing, performance, and reliability.
Scripting/Programming: Proficiency in at least one scripting or programming language (JavaScript, Python, Go).
CI/CD Systems: Experience with CI/CD systems such as GitHub Actions, Jenkins, or ArgoCD.
Debugging Skills: Excellent debugging skills and the ability to analyze code quality and system performance.
Distributed Systems Knowledge: Familiarity with Kafka, Kafka Streams, ClickHouse DB, and distributed systems.
Kubernetes & Microservices: Strong experience testing Kubernetes-native systems, Helm deployments, and microservices.
Observability Tools: Knowledge of observability tools like Prometheus, Grafana, Elastic Stack, OpenTelemetry, Loki, or Jaeger.
Tooling & Deployment: Proficiency in Jenkins, Spinnaker, GitOps, Kubernetes, and Docker.
Testing Experience: Hands-on experience in various types of testing (functional, performance, load, etc.) and knowledge of testing tools.
Documentation Skills: Ability to create clear documentation (e.g., release notes, troubleshooting guides, and migration guides).
Nice-to-Have Skills:
Performance Testing: Experience designing and executing performance and load testing for high-traffic applications.
Web Services & Systems Design: Understanding of web services and distributed systems architecture.
Cross-Functional Communication: Excellent communication skills with the ability to coordinate across multiple teams.
Life at VuNet: At VuNet, we're building a world-class observability platform, proudly Made in India, and we're just getting started. Join a passionate team of problem-solvers who love tackling complex challenges and stay ahead of the curve with technologies like Gen AI. We offer an environment where collaboration, innovation, and learning are at the core of everything we do. You'll have the opportunity to work on cutting-edge technologies and make a real impact on a product that powers leading banks and financial institutions globally.
Benefits: Comprehensive health insurance coverage for you, your parents, and dependents. Mental wellness support and 1:1 counseling. A learning culture that promotes growth, innovation, and ownership. Transparent, inclusive, and high-trust workplace culture. Exposure to Gen AI and integrated technology workspaces. Support for career development with various training programs to enhance your skills and expertise.
Principal Architect
AliveCor India
Principal Architect
Location: Bangalore
Company: AliveCor
About AliveCor
AliveCor is on a mission to revolutionize heart health by making it accessible to everyone, everywhere. We have pioneered over-the-counter medical ECG devices, trusted by millions, and are leaders in empowering consumers to take control of their heart health. With our FDA-cleared medical-grade hardware and software, users have performed over 300 million heart health measurements. We are a team driven by a shared passion to make a real difference in people's lives.
The Opportunity & Role
As the Principal Architect at AliveCor, you will play a pivotal role in shaping the future of heart health technology. You will lead the architecture of our platform, ensuring it meets the evolving needs of our customers and the healthcare industry. This role requires strong technical expertise in GoLang, Java, AWS, and modern software architecture.
Key Responsibilities
Architect & Design Solutions: Collaborate closely with product and engineering leadership to design scalable, secure, and efficient solutions for both consumer and clinician-facing applications.
Hands-On Engineering: Actively engage in software development (up to 50% of your time), working with technologies such as Go, Java, Ruby on Rails, PostgreSQL, AWS, React, JavaScript, and mobile (iOS & Android) apps.
AI Integration: Leverage cutting-edge AI technologies, including LLMs (Large Language Models), to enhance the customer experience and drive product innovation.
Ownership & Scalability: Own the architecture for your suite of products, ensuring it aligns with the organization's technical vision and is built to scale. Lead the review process and ensure timely implementation of changes.
Mentorship & Best Practices: Drive code reviews, design reviews, and architecture discussions to establish and uphold engineering best practices across teams.
Backend Infrastructure & Improvement: Maintain and continuously enhance backend systems to solve technical limitations before they affect production.
Experiment with New Technologies: Explore emerging technologies and drive adoption where relevant, measuring impact and scalability.
Qualifications & Skills
Hands-on Engineering Leadership: 15+ years of hands-on software engineering experience, including significant exposure to architecture and systems design.
Core Proficiency: Strong knowledge of GoLang and Java, with experience in designing and building large-scale systems.
Cloud & Platform Expertise: Expertise in AWS and Kubernetes, with the ability to architect and manage cloud-based systems.
Architecture Design: Extensive experience with multi-tier architectures, high-performance web-scale systems, and large databases.
System Design Expertise: Strong system design skills and experience designing clean interfaces and working at the right level of abstraction.
Microservices & SOA: Experience designing and implementing Service-Oriented Architectures (SOA) or Microservices.
CI/CD & Automation: Familiarity with build process automation and CI/CD (e.g., GitLab, Travis).
Product Focused: Ability to work with product teams to identify customer pain points and iterate on solutions quickly.
Willingness to Learn: Eagerness to learn new technologies and adapt to the ever-changing landscape of software engineering. Knowledge of Ruby on Rails is a plus, but not a requirement.
Perks & Benefits
Working Model: Hybrid Working Model (Flexibility to work both remotely and in the office).
Leave: Generous Vacation Policy and comprehensive Family Leave.
Medical Benefits: Above-market family floater medical insurance, covering both parents.
Office Perks: Complimentary lunch provided at the office and convenient metro connectivity.
Culture: Supportive, collaborative team culture.
Staff Software Engineer, AI & Automation
Okta
Staff Software Engineer, AI & Automation
Location: Bengaluru
Company: Okta, The World's Identity Company
Experience: 8+ Years
Type: Full-Time
About Okta
Okta is the world's leading identity platform. We empower people to securely access any technology, anywhere, on any device. With products like the Okta Platform and Auth0, we place identity at the core of business security, enabling growth through safe digital transformation. At Okta, we value diverse perspectives and experiences. We believe in learning, collaboration, and building an inclusive environment where everyone belongs.
About the Team
The Business Technology - Shared Services team is at the forefront of Okta's internal digital transformation. We focus on building intelligent, automated platforms that simplify operations and deliver smarter, faster experiences to both employees and customers. We collaborate across engineering, data science, security, and business units to deliver cutting-edge solutions powered by Generative AI (GenAI), virtual agents, workflow orchestration, and intelligent recommendations.
The Opportunity
As a Staff Software Engineer, you'll play a critical role in designing and developing AI-powered platforms that drive automation, scale, and intelligence across Okta's business. You'll help make LLM-powered solutions and intelligent automation a reality for the enterprise, ensuring performance, security, and reliability at scale. This is a hands-on, individual contributor (IC) role, ideal for engineers who are passionate about solving complex problems, architecting scalable systems, and pushing the boundaries of AI integration.
What You'll Do
Design & Build: Develop scalable backend services that embed GenAI and automation into core business workflows (e.g., virtual agents, document intelligence, smart routing).
Collaborate Across Teams: Work closely with product managers, data scientists, and other engineers from ideation to production.
Architect for Scale: Make key architectural decisions around LLM integration, API design, data flow, and observability.
Code with Excellence: Write clean, secure, and maintainable code in Python, Java, or similar languages.
Build for Production: Use Docker, Kubernetes, and CI/CD pipelines to build and deploy high-availability services.
Champion Best Practices: Promote high standards for testing, security, code reviews, and operational readiness.
Mentor & Guide: Support a collaborative team culture through peer mentorship and design reviews.
What You'll Bring
8+ years of experience in software engineering with a strong track record of building and maintaining production-grade, cloud-native services. Expertise in distributed systems, API development, and cloud infrastructure (AWS, GCP, or Azure). Proficiency in Python, Java, or Go. Experience with Docker, Kubernetes, and observability tools (e.g., Prometheus, Grafana, ELK). Exposure to AI/ML concepts and eagerness to work with LLMs, NLP, or automation platforms. A strong sense of ownership, collaborative mindset, and a bias toward action. Passion for learning and working with emerging technologies, especially in the AI and automation space.
Why Join Okta
Make AI Real: Help move GenAI from experimentation to enterprise-wide impact.
Build with Purpose: Work on challenges that simplify and secure Okta's internal operations.
Grow in a Human-Centered Culture: Join a humble, technically driven team that values learning, excellence, and personal growth.
Join Okta and shape how identity, AI, and automation come together to power the modern enterprise.
Distinguished Engineer - Machine Learning Engineering
Capital One
Distinguished Engineer - Machine Learning Engineering
Location: Bangalore
Company: Capital One India
About Us
At Capital One India, we're redefining how technology powers financial services. Our teams work in a fast-paced, intellectually rigorous environment to tackle complex business challenges at scale. By harnessing the power of advanced analytics, data science, and machine learning, we create innovative, patentable solutions that transform customer experiences and drive the business forward.
Team Overview: Machine Learning Experience (MLX)
The MLX team leads Capital One's mission to build scalable, well-managed ML systems and platforms. We empower teams across the enterprise to develop, govern, and deploy machine learning models efficiently, securely, and at scale. From automated model governance to observability platforms, MLX enables end-to-end ML lifecycle management, laying the foundation for AI-driven innovation across the organization.
Role Overview
We're looking for a Distinguished Engineer - Machine Learning Engineering to join our MLX team. In this high-impact role, you'll architect and implement the platforms and tools that support model observability, automated governance, and ML model deployment at scale. This is an opportunity to drive enterprise-wide innovation and shape how ML is integrated into Capital One's core business systems.
What You'll Do
Design and build systems that capture and analyze large-scale model and feature metadata, including training metrics and runtime performance, to power model observability and governance automation. Partner with cross-functional teams including product managers, designers, and platform engineers to create scalable solutions that accelerate ML model lifecycle management. Lead efforts to enable automated governance decisions for ML models, ensuring compliance, auditability, and operational integrity. Architect and implement high-performance data pipelines that feed ML models with real-time and batch data. Contribute to the design and implementation of cloud-native ML systems using tools such as AWS, Kubernetes, and Terraform. Write clean, scalable, production-grade code in languages like Python, Go, or Java. Implement CI/CD pipelines, testing frameworks, and monitoring systems for ML applications. Drive the adoption of best practices in ML Ops, observability, and platform resilience.
Basic Qualifications
Master's Degree in Computer Science or related field. 15+ years of experience in software engineering or solution architecture. 10+ years building data-intensive, distributed computing systems. 10+ years programming in Python, Go, or Java. 8+ years of hands-on experience with industry-leading ML frameworks (e.g., Scikit-learn, TensorFlow, PyTorch, Dask, Spark).
Preferred Qualifications
PhD or Master's in Computer Science, Electrical Engineering, Mathematics, or related field. 5+ years of experience building, scaling, and optimizing production ML systems. Deep expertise in data preparation, feature engineering, and ML pipeline optimization. 10+ years writing performant, maintainable, and resilient production code. Strong experience deploying ML solutions on public cloud platforms (AWS, Azure, GCP). Expertise in distributed systems, file systems, or multi-node databases. Open-source contributor to ML tools or libraries. Published work in ML (papers, patents, blogs, etc.). 5+ years of experience in ML Ops (using MLflow, TFX, Kubeflow, etc.). Experience with LLMs and Generative AI applications (open-source or commercial models). Proven experience designing production-ready observability platforms for ML applications.
Be at the forefront of building scalable, secure, and enterprise-grade ML platforms. Shape the future of AI and ML adoption in a top-tier financial institution. Collaborate with world-class engineers and data scientists. Solve real-world problems with high business impact. Thrive in a diverse, inclusive, and innovation-focused culture.
Qualification: PhD or Master's in Computer Science, Electrical Engineering, Mathematics, or related field
Lead Fullstack Engineer
Capital One
Lead Fullstack Engineer
Location: Bangalore
Company: Capital One India
About Capital One India
At Capital One India, we thrive in a fast-paced, intellectually rigorous environment where we solve real-world business problems at scale. By combining advanced analytics, data science, and machine learning, we unlock powerful insights and build cutting-edge, patentable technology products that drive innovation and impact.
Team Overview: Machine Learning Experiences (MLX)
The MLX team is at the core of Capital One's AI/ML transformation. We're shaping the way the enterprise builds, deploys, and monitors ML models and features, providing the tools and platforms that enable every business line to deliver AI-powered experiences. From platform onboarding to observability, MLX sets the foundation for responsible, scalable machine learning across Capital One.
Role Overview
We are seeking a Lead Fullstack Engineer to architect and develop scalable, cloud-native applications that power our ML platforms. You'll be responsible for leading technical initiatives, mentoring team members, and driving the development of full-stack systems with a primary focus on Python and Go, and a working proficiency in React. This is a hands-on leadership role requiring deep cloud expertise (AWS), backend scalability know-how, and familiarity with ML observability concepts.
What You'll Do
Lead cross-functional teams on full-stack development projects that support machine learning platforms and observability tools. Design and develop robust applications using Python, Go, and React, adhering to scalable and maintainable coding principles. Architect cloud-native solutions on AWS, leveraging services such as Lambda, ECS, S3, DynamoDB, SQS, and IAM. Implement and optimize CI/CD pipelines, containerization with Docker/Kubernetes, and other DevOps best practices. Work closely with ML engineers and data scientists to enhance observability and monitoring for ML models in production. Utilize open-source frameworks to accelerate development and maintain system performance. Collaborate with product managers to deliver user-centric, cloud-based solutions that drive real customer impact. Conduct code reviews, mentor engineers, and maintain high technical standards across the team. Contribute to backend optimization, API development, and integration with real-time data systems.
Required Qualifications
7-10 years of experience in software development, with a strong background in backend and full-stack systems. Expertise in Python and Go; strong ability to build scalable, maintainable systems. Proficiency in React for frontend development (medium level). Hands-on experience with AWS services such as Lambda, ECS, S3, DynamoDB, IAM, and SQS. Experience with ML observability frameworks and tools. Knowledge of open-source development frameworks and cloud architecture patterns. Proficiency in CI/CD, containerization (Docker, Kubernetes), and modern DevOps practices. Strong problem-solving abilities and a track record of leading technical initiatives in agile environments.
Preferred Qualifications
Experience monitoring ML models in production and integrating observability tools. Knowledge of event-driven architecture and microservices. Familiarity with GraphQL, WebSockets, and real-time data communication. Understanding of data streaming architectures and processing pipelines.
Why Join Capital One
Be part of a team building enterprise-scale platforms that power ML for millions of customers. Work on impactful, cloud-native solutions in a collaborative and high-growth environment. Collaborate with world-class engineers, data scientists, and product leaders. Enjoy a culture that values experimentation, learning, and technical excellence.
DevOps Engineer-2
Cashfree Payments India Private Limited
Position: DevOps Engineer-2
Location: Bengaluru
Employment Type: Full-Time
Department: Engineering
Job Description: We are looking for a skilled DevOps Engineer-2 to design, implement, and maintain secure, scalable, and highly available infrastructure. You will play a key role in automating infrastructure provisioning, capacity planning, and building robust monitoring and CI/CD pipelines.
Responsibilities: Design and implement secure, scalable infrastructure solutions. Automate infrastructure provisioning, demand forecasting, and capacity planning. Develop automation tools and frameworks to enhance system observability, availability, reliability, performance, and latency monitoring. Monitor system health, application performance, security controls, and cost optimization. Participate in sustainable incident response, peer reviews, and blameless postmortems. Lead the adoption and rollout of best DevOps tools and automation practices across services. Build and maintain continuous integration and continuous deployment (CI/CD) pipelines.
Required Skills and Experience: Minimum 3 years of experience in DevOps and cloud technologies. Expertise in at least one major cloud platform: AWS, Azure, or GCP. Strong production experience with Kubernetes, including deployment, management, and troubleshooting. Proven ability to design scalable and resilient infrastructure architectures. Proficiency with infrastructure-as-code tools such as Terraform, Pulumi, or CloudFormation. Strong debugging and troubleshooting skills. Deep knowledge of Linux servers and networking fundamentals. Hands-on experience with scripting or programming languages like Python, Shell, Go, or Java. Familiarity with monitoring and observability tools such as DataDog, NewRelic, ELK stack, Prometheus, or Grafana. Understanding of modern cloud-native development practices including microservices architecture and RESTful APIs. Ability to thrive in a fast-paced, dynamic work environment.
Site Reliability Engineer
Groww
Position: Site Reliability Engineer Location: Bengaluru About Groww At Groww, we re on a mission to make financial services simple, accessible, and transparent for every Indian. As one of India s fastest-growing financial platforms, we help millions take control of their financial future through a wide range of products. We re a team driven by ownership, radical customer-centricity, and a deep passion for challenging the status quo. From intuitive design to robust engineering, everything we build is grounded in what our customers need. If you re excited about building systems that power the future of finance in India, we d love to hear from you. Our Vision To empower every Indian with the knowledge, tools, and confidence to make sound financial decisions. Our goal is to be the most trusted financial partner for millions across the country. Our Core Values Customer Obsession We put our users first, always. Extreme Ownership We own everything we do, end-to-end. Simplicity We keep things simple, effective, and intuitive. Long-term Thinking We focus on sustainable, impactful decisions. Transparency We believe in open communication and collaboration. Role Overview: As a Site Reliability Engineer (SRE) at Groww, you will be responsible for ensuring our systems are highly available, performant, and secure. You will work closely with engineering and infrastructure teams to improve reliability, automate deployments, and manage mission-critical services that power our platform. Key Responsibilities: Monitor and troubleshoot issues related to system performance, availability, and security. Define and maintain SLIs, SLOs, and Error Budgets to improve system reliability. Use tools like Grafana to analyze and report on metrics and trace data. Participate in the on-call rotation for 24/7 support of production systems. Collaborate with developers to ensure scalability and reliability are built into new services. Roll out security and infrastructure features proactively. 
Manage automated deployments, version control, and release rollouts. Perform Root Cause Analysis (RCA) for incidents and implement long-term fixes. Optimize system performance, conduct capacity planning, and create recovery strategies. Identify and automate repetitive tasks to reduce toil. Leverage CI/CD tools such as Git, Jira, and Jenkins to streamline development workflows. Requirements: 4-6 years of relevant experience in SRE, DevOps, or infrastructure engineering. Bachelor's or Master's degree in Computer Science or a related field. Strong background in Linux/Unix system administration and networking. Hands-on experience with cloud platforms like GCP or AWS. Proficiency in programming languages such as Python, Java, or Go. Experience with monitoring and alerting tools: Grafana, Prometheus, New Relic, etc. Familiarity with configuration management tools. Experience with Kubernetes, Docker, and container orchestration tools is a strong plus. Excellent problem-solving, communication, and team collaboration skills. Be a part of one of India's fastest-growing fintech startups. Build and scale systems that impact millions of users daily. Work with passionate, driven teammates who are redefining financial services. A culture that encourages continuous learning, ownership, and transparency. If you're ready to help shape the future of fintech infrastructure in India, Groww is the place for you. Let's build something extraordinary together.
Platform Engineer
ColorTokens
Platform Engineer Location: Bengaluru, Karnataka, India Full-time, partially remote About ColorTokens At ColorTokens, we empower businesses to stay operational and resilient in an increasingly complex cybersecurity landscape. Breaches happen, but with our cutting-edge ColorTokens Xshield platform, companies can minimize the impact of breaches by preventing the lateral spread of ransomware and advanced malware. We enable organizations to continue operating while breaches are contained, ensuring critical assets remain protected. Our innovative platform provides unparalleled visibility into traffic patterns between workloads, OT/IoT/IoMT devices, and users, allowing businesses to enforce granular micro-perimeters, swiftly isolate key assets, and respond to breaches with agility. Recognized as a Leader in the Forrester Wave: Microsegmentation Solutions (Q3 2024), ColorTokens safeguards global enterprises and delivers significant savings by preventing costly disruptions. Our culture We foster an environment that values customer focus, innovation, collaboration, mutual respect, and informed decision-making. We believe in alignment and empowerment so you can own and drive initiatives autonomously. Self-starters and highly motivated individuals will enjoy the rewarding experience of solving complex challenges that protect some of the world's most impactful organizations, be it a children's hospital, a city, or the defense department of an entire country. Position Overview: ColorTokens is looking for a Junior Platform Administrator to assist in managing, maintaining, and optimizing our NextGen Security Information and Event Management (SIEM) platform. The ideal candidate will support the day-to-day operations, help onboard customer log sources, troubleshoot integration issues, and provide technical assistance to the security operations team. This role is ideal for a motivated professional with 3+ years of experience in SIEM administration, security operations, or log management. 
Key Responsibilities: SIEM Platform Administration Assist in deploying, configuring, and maintaining the NextGen SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, QRadar, Chronicle, Exabeam). Perform basic updates and patches to ensure platform security and functionality. Monitor SIEM health, performance, and uptime under the guidance of senior administrators. Log Source Management Onboard new log sources and validate data ingestion. Help troubleshoot log ingestion, parsing, and formatting issues. Maintain log retention policies for compliance. Rule and Use Case Management Support the development and deployment of detection rules, correlation use cases, and alerts. Tune existing use cases to minimize false positives. Work closely with security analysts to refine alerting strategies. Integration and Automation Assist in integrating SIEM with other security tools (e.g., EDR, microsegmentation, vulnerability scanners). Work on basic automation tasks using scripting (Python, PowerShell) to enhance SIEM efficiency. Platform Security and Compliance Support role-based access control (RBAC) and platform security policies. Help ensure SIEM adheres to compliance standards like SOC2, ISO 27001. Participate in periodic security audits. Network Debugging & Troubleshooting Have a basic understanding of TCP/IP, networking concepts, and protocols. Assist in debugging network connectivity issues related to SIEM log ingestion. Use basic network troubleshooting tools. Collaboration and Support Work alongside SOC analysts, threat hunters, and security engineers. Provide basic technical support for SIEM users. Assist in training and documentation for security teams. Performance Monitoring and Optimization Monitor storage and indexing performance to ensure optimal operations. Report any performance issues to senior administrators. Contribute to platform health reports and alerting metrics. 
Incident Support: Assist SOC teams in log analysis, incident response, and forensic investigations. Ensure log data is readily available for security incidents. Education and Certifications: Bachelor's degree in Computer Science, Information Security, or a related field. Certifications (preferred but not mandatory): Splunk Certified User/Admin; Microsoft Certified: Security Operations Analyst Associate; QRadar Certification; any SIEM-related certification. Experience: 3+ years of experience in SIEM administration, security operations, or log management. Hands-on experience with at least one SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, Chronicle, Exabeam). Basic knowledge of log ingestion, rule creation, and data parsing. Exposure to scripting (Python, PowerShell) for automation. Basic understanding of TCP/IP networking concepts and network debugging. Technical Skills: Understanding of log formats, Syslog, JSON, XML, and data pipelines. Basic knowledge of querying languages (KQL, SPL, AQL). Familiarity with SIEM integration with security tools like EDR, SOAR, NDR. Awareness of MITRE ATT&CK, NIST, or CIS security frameworks. Basic experience with network troubleshooting tools (ping, traceroute, netcat (nc)). Soft Skills: Strong problem-solving and troubleshooting abilities. Good verbal and written communication skills. Ability to work collaboratively in a security operations environment. Preferred Skills: Basic understanding of cloud-based security solutions (AWS, Azure, Google Cloud). Exposure to SOAR tools (e.g., Cortex XSOAR, Splunk Phantom). Interest in machine learning-based anomaly detection for SIEM. Key Metrics for Success: Successful onboarding of log sources. Improvement in log ingestion and parsing accuracy. Contribution to fine-tuning detection rules. Timely resolution of SIEM-related support requests. Ability to identify and troubleshoot basic network connectivity issues.
Lead Product Engineer
TheMathCompany
Job Title: Lead Product Engineer Location: Bengaluru, Karnataka, India Department: Product Engineering Experience: 6-10 years About TheMathCompany TheMathCompany (MathCo) is a global Enterprise AI and analytics firm helping Fortune 500 and Global 2000 organizations drive better decision-making through custom-built AI solutions. With our proprietary platform NucliOS, we power scalable, reusable, and intelligent products that accelerate digital transformation across industries. At MathCo, we empower our engineers to innovate, take ownership, and work on cutting-edge enterprise-grade products from scratch. If you're passionate about full-stack product development and thrive in a dynamic, high-impact environment, we'd love to talk to you. About the Role As a Lead Product Engineer, you will be responsible for the architecture, development, and deployment of full-stack applications that are critical to our enterprise clients' success. You'll lead a cross-functional team and work closely with product managers, designers, and data scientists to deliver high-performance solutions that scale and transform. This role combines hands-on coding, system architecture, and team leadership, and is ideal for engineers looking to make a direct impact on product direction and execution. Key Responsibilities 1. Full Stack Development Build and maintain RESTful APIs and microservices using languages like Python or Node.js. Design and develop scalable and maintainable UI components using modern front-end frameworks (e.g., React, Vue, Angular). Architect and manage application databases (SQL & NoSQL). Integrate with authentication/authorization protocols (e.g., OAuth2, OIDC, SAML). Participate in the full product lifecycle from ideation and development to deployment and support. 2. Product Development & Collaboration Translate business requirements into technical solutions in collaboration with product managers and designers. 
Provide technical leadership and guidance to the development team. Optimize systems for performance, scalability, and reliability. 3. Code Quality & Testing Write clean, efficient, well-documented code following best practices and industry standards. Conduct and participate in code reviews. Develop and maintain automated test frameworks; ensure unit, integration, and end-to-end testing coverage. Oversee CI/CD pipelines and manage cloud-based deployments. Must-Have Technical Skills 6-10 years of experience in full-stack development and system design. Proficiency in backend languages like Python, Node.js (bonus: Rust, Go). Strong front-end experience with HTML, CSS, JavaScript, and frameworks like React, Angular, or Vue.js. Expertise in REST APIs, microservices, and database design (ER modeling, data flow, API integration). Hands-on with PostgreSQL, MySQL, MongoDB, or similar. Experience deploying applications on cloud platforms (AWS, Azure, GCP). Familiarity with Docker, Kubernetes, and infrastructure tools like Terraform, CloudWatch. Version control and DevOps experience with GitHub, Azure DevOps, and CI/CD pipelines. Non-Technical & Soft Skills Strong knowledge of Agile methodologies (SCRUM). Excellent problem-solving and debugging capabilities. Strong communication and collaboration skills across teams. Ability to manage multiple projects in a fast-paced environment. Work with some of the best minds in data science, engineering, and AI. Drive high-impact product development used by Fortune 500 companies. Be part of a flat, inclusive, and innovation-driven workplace culture. Access to cutting-edge tools, custom-built platforms, and rapid experimentation. Apply now and help build world-class AI-driven solutions with MathCo.
Senior Software Engineer - Cloud Native Protection
Rubrik
Senior Software Engineer - Cloud Native Protection Location: Bangalore, India About the Team Rubrik's Cloud Native Protection team safeguards customer data on public cloud platforms. With cloud data growing rapidly and cyber threats increasing, the team builds scalable, secure solutions to protect, search, and analyze cloud data efficiently. Operating like a startup within a startup, the team tackles complex engineering challenges in a culture driven by strong engineering values and collaboration. About the Role As a Senior Software Engineer, you will be a key technical leader responsible for driving complex projects, designing scalable cloud-native software, and mentoring team members. You'll work closely with other engineers and cross-functional partners, bringing technical expertise, initiative, and leadership to deliver impactful solutions. What You'll Do Design, develop, test, deploy, maintain, and improve cloud native protection software. Tackle open-ended, complex problems, leading investigation and scoping efforts. Own project execution and ensure successful delivery of assigned work. Mentor and guide junior engineers, fostering their growth. Collaborate with product management, QA, UI/UX, documentation, and support teams. Experience & Qualifications Education & Experience: Bachelor's or Master's degree in Computer Science or related field. 4+ years of professional experience in software development. Technical Skills: Proficient in one or more programming languages: Go, Java, C/C++, Scala, Python. Experience with public cloud platforms (AWS, Azure, GCP) is a plus. Familiarity with Docker, containers, Kubernetes, and microservices architectures is a bonus. Strong understanding of SDLC, design patterns, and software engineering best practices. Leadership & Collaboration: Proven problem-solving skills and attention to detail. Experience reviewing and designing software artifacts with high quality. 
Strong leadership and communication skills with a track record of mentoring others. Ability to work independently and deliver impactful results on complex projects. Rubrik is committed to securing the world's data through Zero Trust Data Security. Our platform combines machine learning and cloud-native technology to protect enterprises against cyberattacks, insider threats, and operational disruptions, ensuring data availability and integrity even under adverse conditions.
Backend Developer Intern (SDE)
CloudSEK
Job Title: Backend Developer Intern (SDE), Cybersecurity | CloudSEK Location: Bengaluru, Karnataka, India About CloudSEK CloudSEK is a fast-growing AI-powered cybersecurity company that specializes in digital risk monitoring. Founded in 2015 and headquartered in Singapore, we are building the world's fastest and most reliable AI platform to identify and mitigate digital threats in real-time. We've received multiple accolades, including: NASSCOM-DSCI Excellence Award for Security Product Company of the Year; NetApp Excellerator's Best Growth Strategy Award; Raised $7M Series A funding led by MassMutual Ventures. At CloudSEK, we foster a culture that values curiosity, creativity, and collaboration. If you're passionate about cybersecurity and backend development, we want you on our team! What You'll Do (Internship Responsibilities) As a Backend Development Intern (SDE), you will: Work with experienced developers to design, build, and maintain scalable backend systems for our cybersecurity products. Write clean and efficient code using modern programming languages like Node.js or Go. Collaborate with cross-functional teams including front-end developers, data scientists, and product managers to deliver innovative features. Participate in code reviews and contribute to maintaining high-quality coding standards. Troubleshoot and fix bugs, performance bottlenecks, and security issues. Document technical specifications and support materials. Stay updated with emerging technologies in backend development and cybersecurity. What We're Looking For (Intern Requirements) Pursuing or recently completed a degree in Computer Science, Software Engineering, or related field. Solid understanding of backend development with skills in Node.js or Go. Basic knowledge of APIs, databases (SQL), and web technologies. Strong analytical and problem-solving abilities. Excellent communication and teamwork skills. Eagerness to learn, adapt, and grow in a dynamic environment. Passion for cybersecurity and innovation. 
Preferred Qualifications (Nice to Have) Exposure to cloud platforms like AWS, GCP, or Azure. Experience with Docker, Kubernetes, or similar containerization tools. Familiarity with CI/CD pipelines, Git, Jenkins, and DevOps workflows. Understanding of NoSQL databases, caching, and distributed systems. Hands-on experience working on real-world cybersecurity products. Mentorship from a team of industry experts. Flexible working hours and an open, vibrant office culture. Free food, unlimited snacks, and drinks at our Bengaluru office. Fun team events, games, and music: we work hard and celebrate harder! Note: This is a paid internship opportunity. Duration and stipend will be discussed during the interview process. Join CloudSEK as a Backend Intern and work on cutting-edge technology that protects global digital assets. Apply now to be part of a team that's shaping the future of digital risk management.
Sr. Software Engineer, .NET Fullstack ReactJS
Apttus
Senior Software Engineer - .NET Core | Cloud | Microservices | Bangalore, India Location: Bangalore, India Department: Software Engineering Reports To: Sr. Manager, Software Engineering Experience: 5+ Years Tech Stack: .NET Core 6, C#, JavaScript, Angular/React, AWS/Azure, Kubernetes, ElasticSearch, NoSQL About Conga: At Conga, we thrive on innovation, career growth, and team collaboration. We are a leading provider of Revenue Lifecycle Management solutions, helping businesses streamline configuration, fulfillment, and renewal with AI-driven insights and a unified data model. Our culture is rooted in the Conga Way, which emphasizes employee empowerment, customer transformation, and product excellence. About the Role: We are looking for a Senior Software Engineer to join our Platform Services team in Bangalore. You will play a pivotal role in building cloud-agnostic, scalable platforms that power the Conga product suite. You will collaborate with cross-functional teams, design high-performance backend systems, and contribute to all phases of the Agile development lifecycle. This is a career-defining opportunity to work on enterprise-scale microservices architecture and cutting-edge cloud platforms. As a Senior Software Engineer, you will not only write clean, scalable code but also mentor junior engineers and solve complex technical challenges in a fast-paced SaaS environment. Why This Role Matters: This isn't just a coding job; it's an opportunity to shape the future of enterprise software. You will have the chance to influence technical direction, ensure high-quality engineering practices, and contribute to scalable solutions that are deployed globally. As part of a forward-thinking team, you'll drive innovation and build solutions that make a real impact. Key Responsibilities: Backend Development: Design, develop, and maintain scalable services using .NET Core 6 and C#, ensuring code quality and adherence to best practices. 
Frontend Collaboration: Work with React or AngularJS, JavaScript, HTML5, and CSS to support full-stack development. Cloud Platforms: Implement solutions on AWS or Azure, utilizing modern cloud architecture and design patterns. Microservices Architecture: Develop cloud-agnostic microservices, using containers and Kubernetes, optimized for performance and reliability. Agile Participation: Actively participate in Agile ceremonies (planning, backlog grooming, reviews) and contribute to end-to-end project delivery. Problem Solving: Act as a technical subject matter expert (SME) for issue resolution, architecture discussions, and performance optimization. Collaboration: Work closely with global teams across time zones, fostering a highly agile and transparent development culture. Mentorship: Provide guidance to junior developers, conduct code reviews, and promote a culture of technical excellence. Qualifications & Experience: 5+ years of software development experience with a strong background in .NET Core, C#, and JavaScript frameworks (React or AngularJS). Hands-on experience with React or AngularJS for front-end development. Proficiency in cloud platforms such as AWS or Azure, as well as ElasticSearch and NoSQL databases. Experience in building and deploying containerized applications using Kubernetes. Deep understanding of cloud-native architecture, CI/CD pipelines, and scalable microservices. Bachelor's degree in Computer Science, Engineering, or a related field. What Sets You Apart: Strong Communication Skills: Ability to clearly explain complex technical concepts and collaborate across globally distributed teams. Leadership Potential: Ability to take ownership of deliverables, motivate others, and contribute meaningfully to team goals. Problem Solver: A passion for diagnosing and resolving issues across stacks with elegant solutions. Global Mindset: Comfort working outside traditional hours to coordinate with international teams and ensure timely delivery. 
At Conga, we're passionate about building a collaborative culture where innovation and growth go hand-in-hand. As a Senior Software Engineer, you'll be at the forefront of developing cutting-edge solutions that directly impact the way businesses manage their revenue lifecycle. If you're excited about scalable solutions, microservices architecture, and the cloud-first future, this is the role for you.
Principal Cloud Development Engineer
Cloud Software Group
Job Title: Principal Cloud Development Engineer Location: Bengaluru, India About Cloud Software Group: Cloud Software Group (CSG), home to Citrix and TIBCO, is one of the largest global providers of cloud-based technologies, empowering over 100 million users worldwide. As a Principal Cloud Development Engineer, you will play a pivotal role in shaping the future of Desktop-as-a-Service (DaaS) solutions, helping deliver secure, scalable, and intelligent platforms that drive modern work experiences from anywhere. We're entering an era of accelerated innovation and transformation; now is the perfect time to bring your technical leadership, cloud expertise, and mentorship mindset to the forefront. About This Team: The DaaS team at CSG is responsible for designing and building scalable and resilient cloud-native microservices that power Citrix's core virtualization offerings. This team collaborates across product, architecture, operations, and customer success groups to build next-gen capabilities on Azure, AWS, and other hybrid environments. Your Role and Responsibilities: As a Principal Cloud Development Engineer, you will be expected to: Lead design and architecture discussions for cloud-native solutions within the Citrix DaaS product line. Drive the development of scalable and secure backend features, with emphasis on business logic, cloud security, and performance. Mentor junior and senior engineers, guiding them in coding best practices, design decisions, and technical growth. Collaborate with Product Managers, UX Designers, Support, and Site Reliability Engineers to build customer-centric features and maintain high service uptime. Contribute to strategic technical initiatives, including the adoption of Gen AI tools, DevSecOps automation, and performance tuning of production systems. Participate in on-call escalation support, helping debug complex issues and lead incident resolution. 
Promote a culture of continuous learning and improvement through code reviews, technical sessions, and post-incident analysis. Required Experience and Skills: 14+ years of experience in cloud software development using .NET (C#), Java, or equivalent Object-Oriented Programming languages. Strong computer science fundamentals (algorithms, data structures, systems design). Proven track record in building and leading cloud-native microservices with modern deployment practices (CI/CD, IaC, Kubernetes, Docker). Strong cloud platform expertise, especially in Microsoft Azure or Amazon EC2. Deep understanding of cloud security, including identity/access management, encryption, compliance, and incident response. Advanced knowledge in automation scripting (Python, PowerShell). Familiarity with troubleshooting tools like Sumo Logic, Splunk, or equivalent observability platforms. Experience with Terraform, CI/CD pipelines, and managing Kubernetes-based deployments. Strong communication, collaboration, and mentoring abilities. Preferred Qualifications: Prior experience building secure services in the DaaS, VDI, or enterprise SaaS domain. Hands-on experience with Azure Active Directory, Microsoft AD, or other identity solutions. Moderate understanding of cryptographic protocols and encryption standards. Familiarity with Agile/SAFe development methodologies. Contributions to open-source or technical publications are a plus. Impact: Influence the architecture and direction of mission-critical cloud platforms used globally. Mentorship: Be a technical leader shaping the next generation of engineers. Innovation: Work with a company at the edge of a "Cambrian leap" in cloud evolution. Culture: Inclusive, forward-thinking, and driven by curiosity and collaboration. Flexibility & Benefits: Competitive salary, performance bonus, flexible work model, health insurance, wellness programs, and more. 
Equal Opportunity Statement: Cloud Software Group is committed to Equal Employment Opportunity and prohibits unlawful discrimination of any kind. All qualified applicants will receive consideration without regard to race, color, religion, gender, gender identity or expression, national origin, age, disability, veteran status, or any other characteristic protected by law.
Senior Site Reliability Engineer
Couchbase
Job Title: Site Reliability Engineer (SRE), Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime. 
Cloud Optimization: Stay up-to-date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts. 
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!
Autoit Solutioning Engineer, Lead
Qualcomm
Job Title: Site Reliability Engineer (SRE) General Summary: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. This role is critical in ensuring the stability, scalability, and security of our infrastructure and services. As an SRE, you will work collaboratively with software engineers, data scientists, and product managers to optimize system reliability while driving automation and continuous improvement. You will be responsible for modernizing traditional services, implementing cutting-edge technology, and proactively managing infrastructure to maintain operational excellence. If you are passionate about automation, DevSecOps, system performance, and infrastructure resilience, this role offers an exciting opportunity to make a meaningful impact. Key Responsibilities: System Monitoring & Incident Response: Continuously monitor system health, detect anomalies, and respond to incidents promptly. Investigate and troubleshoot service-related issues, ensuring minimal disruption. Implement proactive measures to prevent downtime and optimize system stability. Infrastructure Automation & DevOps Implementation: Develop and maintain Infrastructure-as-Code (IaC) scripts to automate deployments and scaling. Automate routine operational tasks to improve efficiency and reduce manual intervention. Leverage DevSecOps practices to ensure secure and resilient deployments. Performance Optimization & Capacity Planning: Collaborate with development teams to enhance software performance and system responsiveness. Identify and resolve system bottlenecks to improve speed, efficiency, and reliability. Forecast resource requirements based on traffic patterns and business growth. Security, Compliance & Risk Management: Implement security best practices and compliance measures across all infrastructure layers. Conduct security audits and ensure systems meet industry-standard security guidelines. 
Proactively assess and mitigate risks associated with infrastructure and deployments. Required Qualifications & Skills: Technical Expertise: Extensive experience with Linux-based environments (Ubuntu, RedHat), including system administration and troubleshooting. Strong proficiency in scripting and automation using Python, Bash, or Go. Experience with containerization and orchestration technologies such as Docker and Kubernetes. Familiarity with CI/CD pipelines and tools like Jenkins, Puppet, Vault, and Splunk. Hands-on experience with cloud platforms (AWS, Azure, or GCP). Problem-Solving & Leadership: Strong analytical skills with the ability to diagnose and resolve complex system issues. Self-driven, highly motivated, and able to work independently in a fast-paced environment. Ability to collaborate cross-functionally and communicate technical solutions effectively. Security & Reliability Focus: Solid understanding of DevSecOps principles and secure system design. Ability to implement monitoring, logging, and alerting solutions to maintain system resilience. Passion for continuous learning and leveraging data-driven approaches for system improvement. Work in a high-impact role that directly contributes to the reliability and scalability of mission-critical systems. Be part of an innovative, forward-thinking team that values automation, collaboration, and continuous improvement. Competitive salary, professional development opportunities, and an environment that fosters growth and innovation. If you are a passionate, results-driven SRE, we invite you to join us and play a pivotal role in shaping the future of our infrastructure.
Senior Technical Solutions Engineer (platform)
Databricks
Job Overview: We are seeking a highly skilled Frontline Senior Technical Solutions Engineer with over 5 years of experience to join our Platform Support team. This role is pivotal in delivering exceptional support for our Databricks Data Intelligence platform, addressing complex technical challenges, and ensuring the seamless operation of our data solutions. As a frontline engineer, you will be the primary point of contact for critical issues, working closely with both internal teams and customers to resolve high-impact problems and drive platform improvements. Key Responsibilities: Frontline Support: Serve as the primary technical point of contact for escalated issues related to the Databricks Data Intelligence platform. Provide expert-level troubleshooting, diagnostics, and resolution for complex problems affecting system performance and reliability. Customer Interaction: Engage with customers directly to understand their technical issues and requirements. Provide timely, clear, and actionable solutions to ensure high levels of customer satisfaction. Incident Management: Lead the resolution of high-priority incidents, coordinating with various teams to address and mitigate issues swiftly. Conduct thorough root cause analyses and develop preventive measures to avoid recurrence. Collaboration: Work closely with engineering, product management, and DevOps teams to share insights, identify recurring issues, and drive improvements to the Databricks Data Intelligence platform. Documentation and Knowledge Sharing: Create and maintain detailed documentation on support procedures, known issues, and solutions. Contribute to internal knowledge bases and create training materials to assist other support engineers. Performance Monitoring: Monitor and analyze platform performance metrics to identify potential issues before they impact customers. Implement optimizations and enhancements to improve platform stability and efficiency. 
Platform Upgrades: Manage and oversee the deployment of Databricks Data Intelligence platform upgrades and patches, ensuring minimal disruption to services and maintaining system integrity. Innovation and Improvement: Stay abreast of industry trends and advancements in Databricks technology. Propose and drive initiatives to enhance platform capabilities and support processes. Customer Feedback: Collect and analyze customer feedback to drive continuous improvement in support processes and platform features. Qualifications: Experience: Minimum of 5 years of hands-on experience in a technical support or engineering role related to the Databricks Data Intelligence platform, cloud data platforms, or big data technologies. Technical Skills: A deep understanding of Databricks architecture and Apache Spark, along with experience in cloud platforms like AWS, Azure, or GCP, is essential. Strong capabilities in designing and managing data pipelines and distributed computing are required. Proficiency in Unix/Linux administration, familiarity with DevOps practices, and skills in log analysis and monitoring tools are also crucial for effective troubleshooting and system optimization. Problem-Solving: Demonstrated ability to diagnose and resolve complex technical issues with a strong analytical and methodical approach. Communication: Exceptional verbal and written communication skills, with the ability to effectively convey technical information to both technical and non-technical stakeholders. Customer Focus: Proven experience in managing high-impact customer interactions and ensuring a positive customer experience. Collaboration: Ability to work effectively in a team environment, collaborating with engineering, product, and customer-facing teams. Education: Bachelor's degree in Computer Science, Engineering, or a related field. Advanced degree or relevant certifications are highly desirable.
Preferred Skills: Experience with additional big data tools and technologies such as Hadoop, Kafka, or NoSQL databases. Familiarity with automation tools and CI/CD pipelines. Understanding of data governance and compliance requirements. Innovative Environment: Work with cutting-edge technology in a fast-paced, innovative company. Career Growth: Opportunities for professional development and career advancement. Team Culture: Collaborate with a talented and motivated team dedicated to excellence and continuous improvement. About Databricks: Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits: At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks. Our Commitment to Diversity and Inclusion: At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.
Compliance If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to pr...
Senior Software Engineer - Backend
Nvidia
NVIDIA is searching for a highly motivated senior software engineer for the team that is building capabilities for a next-generation network management and telemetry system in the cloud, using modern design principles at internet scale. The person will be responsible for building distributed cloud applications. It will be a highly scalable, modern network operations toolset that provides visibility, troubleshooting, validation and telemetry for Ethernet networks. What you'll be doing: Development of distributed cloud applications, microservices and a SaaS platform with high throughput and reliability. Contribute to applications like data ingestion, distributed computing, near-real-time analytic engines, RESTful APIs and user interfaces. Drive requirement discussions, design and product improvements. Drive improvements in areas like performance, team productivity, automation, quality, monitoring and reliability of applications. Work closely with the system architects, UI/UX and test engineers. What we need to see: Bachelor's/Master's degree in Computer Science/Engineering. 5+ years of experience in complex microservices-based architectures. Extensive programming experience in Scala, Go, Python. Fluent in coding and rapid prototyping. Strong experience in developing, maintaining, and testing scalable distributed applications. Experience with stream processing frameworks such as Kafka, Flink, Spark Streaming, Samza, etc. Background with NoSQL databases such as Cassandra, MongoDB. Experience with orchestration/scheduling technologies like Kubernetes, SLURM, Nomad, etc. Ways to stand out from the crowd: Experience with public clouds like AWS. Worked in Reactive application designs (https://www.reactivemanifesto.org/). Experience in network stacks, protocols, SDN. NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us.
If you're creative, passionate and self-motivated, we want to hear from you! NVIDIA is leading the way in ground-breaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Qualification: Bachelor's/Master's degree in Computer Science/Engineering