Kubernetes Clusters Jobs in Bengaluru
323 Jobs Found
Senior Software Engineer (backend)
Talview
Senior Software Engineer (Backend) Location: Bengaluru Hiring is still rooted in outdated processes: manual screening, unconscious bias, long feedback cycles, and excessive administrative overhead. Organizations spend more time managing hiring than discovering real talent. At Talview, we're changing that with AI that actually works. We build GenAI-powered hiring and assessment platforms that make recruitment faster, fairer, and scalable. Our AI Products: Alvy: The world's first AI Proctoring Agent. Ivy: A conversational AI Interviewer transforming first-round screenings. Our impact: 10M+ assessments delivered across 120+ countries. The Role We are looking for a Senior Software Engineer (Backend) to build high-performance systems. You will work across Talview's backend ecosystem, including PostgreSQL, Redis, AWS Lambda, GraphQL, Hasura, Temporal, and microservices, to deliver secure, reliable, and globally scalable solutions. What You'll Do Design and operate scalable microservices supporting millions of assessments. Monitor and optimize system performance and efficiency. Own backend services end-to-end, from development to maintenance. Collaborate with Frontend, QA, Product, and Customer Success teams. Solve engineering challenges directly impacting platform stability. You Might Be a Fit If You Have Required Qualifications: 4-6 years of backend engineering experience, primarily with Node.js. Deep understanding of OOP, design patterns, and SOLID principles. Experience with unit, integration, and functional testing. Strong knowledge of distributed systems and SQL (PostgreSQL). Working knowledge of Docker, Kubernetes, and microservices. Familiarity with: Time-series & Messaging: InfluxDB, Prometheus, Kafka, RabbitMQ, SQS. Caching & Frameworks: Redis, Hasura, Apollo. Cloud: AWS S3, Azure Blob. Bonus Points: Experience in SaaS or high-scale product environments. Hands-on experience with AI-assisted coding tools (Cursor, Windsurf, etc.).
Understanding of Agile/Scrum methodologies. Our Culture: The 5Cs We're driven by Collaboration, Commitment, Credence, Customer-centricity, and Candor. We work together, ship quality, and communicate openly. What You Get Competitive compensation and best-in-class hardware. Flexibility: 5-day work week and flexible hours. Growth: Accelerated career paths in a fast-scaling organization. Perks: Stocked pantry, monthly lunches, and annual offsites.
Lead Software Engineer - Scale & Performance
Team Vunet Systems
Lead Software Engineer - Scale & Performance Location: Bengaluru Experience: 6-12 years About VuNet VuNet is a pioneer in Business Journey Observability, using Big Data and Machine Learning to revolutionize digital experiences in the financial services industry. Our platform delivers end-to-end visibility into customer journeys, helping organizations proactively resolve issues, ensure operational resilience, and deliver superior user satisfaction. With over 28 billion digital transactions monitored every month and serving more than 300 million users globally, VuNet is shaping the future of observability for some of the largest banks and financial institutions. We are Series B funded, part of NASSCOM's DeepTech Club, and recognized by global analysts such as Gartner and Omdia. Your Role: Lead Software Engineer - Scale & Performance As a Lead Software Engineer for Scale & Performance, you'll own the performance and scalability benchmarks for VuNet's observability platform. You will work with cutting-edge technologies, design robust test frameworks, and ensure that our platform scales seamlessly to meet the demands of millions of users. Roles & Responsibilities Own performance and scalability benchmarking for key platform components (ingestion pipelines, data storage, and query services). Design and execute load, stress, soak, and capacity tests across microservices, agents, and ingestion layers. Identify and resolve performance bottlenecks in both infrastructure (CPU/memory/IO) and application layers (API latency, throughput, GC behavior). Develop and maintain performance test frameworks, preferably using Kubernetes-based environments. Collaborate with DevOps and SRE teams to optimize system configurations (Kubernetes, Postgres/TimescaleDB, ClickHouse, Kafka) for scale. Implement OpenTelemetry for service instrumentation to monitor system health and latency (p50/p95/p99 metrics).
Contribute to capacity planning, scaling strategies (horizontal/vertical), and resource optimization. Analyze production incidents related to scaling issues and drive permanent fixes. Work with engineering teams to design scalable architecture patterns and define SLIs/SLOs for system performance. Document performance baselines, tuning guides, and scalability best practices for internal use. What You Bring Mandatory Skills: Strong background in performance engineering for large-scale distributed systems or SaaS platforms. Expertise in Kubernetes, container runtimes (containerd/Docker), and resource profiling in containerized environments. Solid understanding of Linux internals, CPU/memory profiling, and network stack tuning. Hands-on experience with observability tools (Prometheus, Grafana, OpenTelemetry, Jaeger, Loki, Tempo, etc.). Familiarity with observability platform datastores like ClickHouse, PostgreSQL/TimescaleDB, Elasticsearch, or Cassandra. Experience with performance benchmarking tools such as k6, Locust, JMeter, or custom Golang/Python scripts. Ability to interpret system metrics (CPU usage, memory, GC, latency) and correlate across different layers. Nice-to-Have Skills: Experience with agent benchmarking (OpenTelemetry Collector, custom data shippers). Exposure to streaming systems like Kafka, NATS, or Pulsar. Familiarity with CI/CD pipelines for performance testing and regression tracking. Knowledge of cost optimization and capacity forecasting in cloud environments (AWS/GCP/Azure). Proficiency in Go, Python, or Bash scripting for automation and data analysis. Life at VuNet: At VuNet, we're building a world-class observability platform, and we're just getting started. You'll be part of a passionate, problem-solving team that embraces collaboration, fast learning, and staying ahead of emerging technologies like Gen AI. We foster a high-trust, inclusive culture where collaboration, ownership, and innovation are central to our success.
If you're looking to work on cutting-edge tech, make a real impact, and grow with a supportive team, you'll fit right in at VuNet. Benefits: Comprehensive health insurance coverage for you, your parents, and dependents. Mental wellness and 1:1 counseling support. A culture that promotes continuous learning, innovation, and career growth. Transparent, inclusive, and high-trust workplace. Opportunities for skill enhancement with training programs focused on new Gen AI technologies.
Gen AI Support Engineer-2
Exotel
Gen AI Support Engineer-2 Location: Bengaluru Experience: 4-7+ years Employment Type: Full-time About Us Exotel is the leading full-stack customer engagement platform and virtual telecom operator for emerging markets. Since its inception in 2011, Exotel has been powering 50 million daily engagements across voice, video, and messaging channels. We provide our unified customer engagement solutions to over 6000 companies globally, including industry leaders like Ola, Swiggy, Flipkart, GoJek, Byjus, Urban Company, HDFC Bank, Zomato, and Oyo. With $100 million in Series D funding and an ARR of $60 million, Exotel is a growth-stage company poised for massive impact. Overview We're seeking a Gen AI Support Engineer-2 to join our team. As an L2 Support Engineer, you will be the highest level of technical escalation within the support organization. Your role will encompass system reliability, platform integrity, troubleshooting mission-critical production issues, and collaborating with engineering teams for architecture feedback. Additionally, you'll help mentor junior engineers and improve operational processes and tools for large-scale environments. If you're passionate about writing clean code with Python and Django and want to contribute to a fast-paced, mission-driven company, this role is for you! Responsibilities Mission-Critical Issue Resolution: Own the resolution of high-priority, time-sensitive production issues. Root Cause Analysis (RCA): Lead RCA reviews and push for systemic improvements in system architecture and processes. Performance Optimization: Identify bottlenecks and propose architectural changes to improve system performance and scalability. Patch Management: Assist in configuring, deploying, and testing patches, releases, and application updates to production environments. SME for Production Systems: Serve as the Subject Matter Expert (SME) for Exotel's production systems and integrations.
Cross-Team Collaboration: Work with Delivery, Product, and Engineering teams to influence system design, rollout strategies, and improvement plans. Mentorship: Lead and mentor L1/L2 engineers on troubleshooting best practices and continuous learning. Code Writing & Automation: Write clean, maintainable code for internal tools, scripts, and automation using Python and Django. Support Tooling: Automate recovery workflows and design support tools for proactive monitoring. Operational Excellence: Establish and improve SLAs, monitoring dashboards, alerting systems, and operational runbooks to ensure system reliability. Must Have Skills Backend Development Support: 3+ years of experience in backend development support, production support, or DevOps/SRE roles. Core Technologies: Proficiency in Python, Django, SQL, and troubleshooting in Linux. Web Technologies: Strong understanding of HTML, CSS, JavaScript, and other web technologies. Distributed Systems & Cloud: Experience working with distributed systems, cloud architecture (AWS), Docker, and Kubernetes. Automation: Strong scripting skills with Bash/Python for automation and operational support. CI/CD & Observability: Good understanding of CI/CD, observability tools, and release management workflows. Communication Skills: Excellent communication, leadership, and incident command skills for managing production issues and cross-functional collaboration. Nice to Have Experience with AI-powered systems and machine learning technologies. Familiarity with monitoring systems like Prometheus, Grafana, or Elasticsearch. Knowledge of microservices architectures and scaling distributed systems. Innovative Work: Be at the forefront of cloud-based communications technology and AI-driven customer engagement platforms. Impact: Play a key role in maintaining and optimizing systems that power millions of customer interactions daily. 
Growth Opportunities: Be part of a fast-growing company with ample learning opportunities and career development. Collaborative Environment: Work in a supportive, inclusive environment where your input and ideas matter. Competitive Benefits: Comprehensive benefits package including health insurance, mental wellness support, and more.
Senior Backend Engineer, Cloud Platform
Postman
Senior Backend Engineer, Cloud Platform Location: Bengaluru Work Type: Full-Time About Postman Postman is the world's leading API platform, empowering over 40 million developers and 500,000 organizations, including 98% of the Fortune 500, to build and manage APIs efficiently. We simplify each step of the API lifecycle, enabling teams to create better APIs, faster. Founded in Bengaluru, Postman is headquartered in San Francisco, with offices in Boston, New York, and Bengaluru. We are privately held, backed by Battery Ventures, BOND, Coatue, CRV, Insight Partners, and Nexus Venture Partners. The Opportunity We are looking for a Senior Backend Engineer to join our Cloud Platform Team. You will design and implement robust backend systems and APIs that support internal tooling and enhance the productivity of Postman's engineering organization. This role requires a combination of strong backend expertise, problem-solving ability, and collaboration skills. Key Responsibilities Design & Development Build scalable and maintainable backend solutions. Develop high-quality APIs using modern backend technologies. Optimize applications for speed, performance, and scalability. Collaboration & Communication Work closely with frontend engineers, designers, and product managers to deliver innovative solutions. Participate in code reviews, sharing knowledge and maintaining code quality. Tooling & Integration Develop tools and components to support other engineering teams' workflows. Integrate with backend services and APIs to deliver comprehensive solutions. Maintenance & Improvement Continuously refactor and optimize backend systems. Stay updated on industry trends and emerging technologies to keep Postman's platform cutting-edge. About You Education & Experience Bachelor's degree in Computer Science, Engineering, or equivalent experience. 6+ years of backend development experience, preferably in SaaS or cloud platforms.
Technical Skills Strong proficiency in Java, Spring, and Kubernetes. Experience with Helm and Argo is a plus. Knowledge of RESTful APIs and asynchronous request handling. Familiarity with Git and version control best practices. Soft Skills Excellent problem-solving skills and attention to detail. Strong collaboration and communication abilities. Comfortable in a fast-paced, agile development environment. Flexible hybrid work model with a collaborative team environment. Full medical coverage, flexible PTO, wellness reimbursement, and monthly lunch stipend. Wellness programs, team-building events, and donation-matching initiatives. Inclusive, growth-oriented culture where every team member can thrive. Our Values Curiosity: Explore and innovate boldly. Transparency: Communicate openly about successes and failures. Focus: Deliver results aligned with Postman's vision. Inclusion: Every voice matters. Excellence: Strive for the best products and experiences.
Staff Software Engineer, AI & Automation
Okta
Staff Software Engineer, AI & Automation Location: Bengaluru Company: Okta, The World's Identity Company Experience: 8+ Years Type: Full-Time About Okta Okta is the world's leading identity platform. We empower people to securely access any technology, anywhere, on any device. With products like the Okta Platform and Auth0, we place identity at the core of business security, enabling growth through safe digital transformation. At Okta, we value diverse perspectives and experiences. We believe in learning, collaboration, and building an inclusive environment where everyone belongs. About the Team The Business Technology - Shared Services team is at the forefront of Okta's internal digital transformation. We focus on building intelligent, automated platforms that simplify operations and deliver smarter, faster experiences to both employees and customers. We collaborate across engineering, data science, security, and business units to deliver cutting-edge solutions powered by Generative AI (GenAI), virtual agents, workflow orchestration, and intelligent recommendations. The Opportunity As a Staff Software Engineer, you'll play a critical role in designing and developing AI-powered platforms that drive automation, scale, and intelligence across Okta's business. You'll help make LLM-powered solutions and intelligent automation a reality for the enterprise, ensuring performance, security, and reliability at scale. This is a hands-on, individual contributor (IC) role, ideal for engineers who are passionate about solving complex problems, architecting scalable systems, and pushing the boundaries of AI integration. What You'll Do Design & Build: Develop scalable backend services that embed GenAI and automation into core business workflows (e.g., virtual agents, document intelligence, smart routing). Collaborate Across Teams: Work closely with product managers, data scientists, and other engineers from ideation to production.
Architect for Scale: Make key architectural decisions around LLM integration, API design, data flow, and observability. Code with Excellence: Write clean, secure, and maintainable code in Python, Java, or similar languages. Build for Production: Use Docker, Kubernetes, and CI/CD pipelines to build and deploy high-availability services. Champion Best Practices: Promote high standards for testing, security, code reviews, and operational readiness. Mentor & Guide: Support a collaborative team culture through peer mentorship and design reviews. What You'll Bring 8+ years of experience in software engineering with a strong track record of building and maintaining production-grade, cloud-native services. Expertise in distributed systems, API development, and cloud infrastructure (AWS, GCP, or Azure). Proficiency in Python, Java, or Go. Experience with Docker, Kubernetes, and observability tools (e.g., Prometheus, Grafana, ELK). Exposure to AI/ML concepts and eagerness to work with LLMs, NLP, or automation platforms. A strong sense of ownership, collaborative mindset, and a bias toward action. Passion for learning and working with emerging technologies, especially in the AI and automation space. Why Join Okta Make AI Real: Help move GenAI from experimentation to enterprise-wide impact. Build with Purpose: Work on challenges that simplify and secure Okta's internal operations. Grow in a Human-Centered Culture: Join a humble, technically driven team that values learning, excellence, and personal growth. Join Okta and shape how identity, AI, and automation come together to power the modern enterprise.
Lead Platform Engineer
Team Vunet Systems
Lead Platform Engineer, Observability Solutions Location: Bengaluru Experience: 6-10 Years Function: Observability Engineering | Platform Architecture | SRE Enablement Join VuNet: Redefining Digital Observability at Scale VuNet is transforming the future of digital experiences through Business Journey Observability, combining Big Data and AI/ML to empower real-time visibility across payments, banking, and financial services. Monitoring 28+ billion transactions/month, our platform is trusted by top financial institutions and powers over 300 million users. Backed by Series B funding and recognized by Gartner, NASSCOM, and Forbes, we are leading the charge in building a new category of observability, proudly Made in India for global impact. Your Role: Lead Platform Engineer As the Lead Platform Engineer, you will architect and drive the development of packaged observability solutions across 100+ infrastructure and application technologies. You will define golden signals, build data collection strategies, and lead the standardization of alerts, dashboards, and RCA workflows for platforms like Kubernetes, Oracle DB, and Tomcat. This is a cross-functional leadership role that sits at the intersection of product, platform, DevOps, and SRE. You will lead a team and influence how observability is delivered, scaled, and adopted across complex environments. Key Responsibilities Observability Solution Development Design and lead the delivery of observability packages for databases, middleware, cloud-native, and legacy platforms. Define and implement data collection pipelines, including agents, APIs, logs, metrics, traces, and service discovery. Establish golden signals, SLIs/SLOs, and health KPIs for performance, availability, and anomaly detection. Dashboards, Alerts & RCA Develop standardized, reusable dashboards, alerts, reports, and troubleshooting playbooks. Automate RCA workflows to improve MTTR and reduce alert fatigue.
Platform Enablement & Integration Work with engineering to enhance agent capabilities and support new data sources/formats. Guide implementation of platform features for better observability at scale. Team Leadership & Governance Lead and mentor a team of observability engineers and specialists. Define design patterns, reusable modules, and version-controlled libraries. Stakeholder Collaboration Partner with product managers, DevOps, SREs, and customer teams to gather requirements, align priorities, and validate use cases. Ensure deliverables are scalable, well-documented, and production-ready. What You Bring Must-Have Skills 6-10 years of experience in observability, platform engineering, or SRE roles. Hands-on with tools like Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, Splunk. Strong understanding of logs, metrics, traces, profiling, and collection strategies. Experience developing solutions for platforms like Kubernetes, Oracle, PostgreSQL, Tomcat, etc. Proficient in Python, Shell scripting, APIs, and automation tools (Terraform, etc.). Familiar with alert fatigue mitigation, anomaly detection, and RCA frameworks. Excellent communication, technical leadership, and documentation skills. Nice to Have Experience managing an observability marketplace or solution catalog. Contributions to open-source observability projects. Certifications in Kubernetes, Observability platforms, or cloud providers (AWS/GCP/Azure). Background in ITSM tools, CMDBs, or incident workflow automation. At VuNet, you'll help build a category-defining observability platform that's already transforming critical infrastructure for leading financial institutions. You'll work with passionate engineers, push technical boundaries, and grow in a high-trust, high-impact environment. What You'll Experience: Ownership of key observability initiatives impacting 300M+ users. Collaboration with SRE, DevOps, and product teams across real-time financial systems.
Opportunity to experiment with and shape Gen AI, ML, and emerging telemetry trends. Perks & Benefits Health insurance for you, your parents, and dependents. 1:1 mental wellness support. Training programs, certifications, and career growth opportunities. Transparent, inclusive, and high-trust work culture. Access to cutting-edge technology and Gen AI-powered workspaces.
Lead / Senior Software Engineer
Ultraviolette Automotive
Job Title: Lead / Senior Software Engineer (Microservices, IoT, Kafka, AWS) Location: Bengaluru Experience Required: 5-8 years Industry: Automotive (Electric Mobility) Employment Type: Full-time About Ultraviolette Join the Charge. Create the Future. At Ultraviolette, we're not just building electric vehicles; we're redefining what mobility looks and feels like. From creating India's fastest electric motorcycle to designing the world's most advanced electric scooter, we're on a mission to engineer machines that are not only sustainable, but also exhilarating. Our team is a diverse mix of engineers, designers, dreamers, and doers, united by a bold vision to accelerate the global shift to electric mobility. If you're driven by innovation and looking to work on cutting-edge products that challenge the status quo, this is your platform to make a real impact. About the Role We are looking for a Senior/Lead Software Engineer who thrives in designing scalable, high-performance, and cloud-native microservices. This role is ideal for someone with hands-on experience in Java, Spring Boot, Kafka, and AWS, and a keen interest in IoT and real-time data architectures. You will work at the intersection of hardware and software, helping us bring our vision of connected, intelligent, and high-performance electric vehicles to life. Key Responsibilities Design, develop, and deploy Java-based microservices using Spring Boot and related technologies. Architect and implement event-driven systems using Apache Kafka for real-time IoT data streaming. Build and manage RESTful APIs with AWS API Gateway for seamless service integration. Leverage AWS services such as Lambda, DynamoDB, and MemoryDB to build scalable, serverless solutions. Collaborate with cross-functional teams (front-end developers, DevOps, and product managers) to convert ideas into working features. Optimize application performance, troubleshoot issues, and ensure high reliability and availability.
Mentor junior engineers, conduct code reviews, and enforce clean coding practices. Contribute to architecture decisions, technical roadmaps, and innovation initiatives. Stay on top of emerging technologies in cloud, microservices, and IoT ecosystems. Participate actively in Agile processes including sprint planning, stand-ups, and retrospectives. Required Qualifications Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 5+ years of hands-on experience in Java development, with a strong foundation in Spring Boot. Proven experience designing and deploying microservices architectures. Strong knowledge of Kafka and real-time streaming/data pipeline architectures. Hands-on experience with Docker and Kubernetes for containerization and orchestration. Understanding of IoT protocols such as MQTT or CoAP and device connectivity. Proficiency with version control (Git) and Agile methodologies. Excellent problem-solving, communication, and collaboration skills. Nice to Have AWS Certifications (e.g., AWS Certified Developer or Solutions Architect). Experience with additional languages like Python or Rust. Familiarity with NoSQL databases such as MongoDB or Cassandra. Knowledge of DevOps practices, CI/CD pipelines, and Infrastructure as Code (e.g., Terraform). Exposure to stream processing frameworks (e.g., Apache Flink, Spark Streaming). Experience in edge computing and distributed systems. Strong understanding of cloud security best practices. Work on India's most futuristic electric vehicles at the intersection of technology, performance, and sustainability. Be part of a mission-driven company shaping the future of mobility. Collaborate with some of the most talented engineers and designers in the industry. A fast-paced environment that celebrates innovation, learning, and growth.
DevOps Engineer-2
Cashfree Payments India Private Limited
Position: DevOps Engineer-2 Location: Bengaluru Employment Type: Full-Time Department: Engineering Job Description: We are looking for a skilled DevOps Engineer-2 to design, implement, and maintain secure, scalable, and highly available infrastructure. You will play a key role in automating infrastructure provisioning, capacity planning, and building robust monitoring and CI/CD pipelines. Responsibilities: Design and implement secure, scalable infrastructure solutions. Automate infrastructure provisioning, demand forecasting, and capacity planning. Develop automation tools and frameworks to enhance system observability, availability, reliability, performance, and latency monitoring. Monitor system health, application performance, security controls, and cost optimization. Participate in sustainable incident response, peer reviews, and blameless postmortems. Lead the adoption and rollout of best DevOps tools and automation practices across services. Build and maintain continuous integration and continuous deployment (CI/CD) pipelines. Required Skills and Experience: Minimum 3 years of experience in DevOps and cloud technologies. Expertise in at least one major cloud platform: AWS, Azure, or GCP. Strong production experience with Kubernetes, including deployment, management, and troubleshooting. Proven ability to design scalable and resilient infrastructure architectures. Proficiency with infrastructure-as-code tools such as Terraform, Pulumi, or CloudFormation. Strong debugging and troubleshooting skills. Deep knowledge of Linux servers and networking fundamentals. Hands-on experience with scripting or programming languages like Python, Shell, Go, or Java. Familiarity with monitoring and observability tools such as DataDog, NewRelic, ELK stack, Prometheus, or Grafana. Understanding of modern cloud-native development practices including microservices architecture and RESTful APIs. Ability to thrive in a fast-paced, dynamic work environment.
Technical Lead - DevOps
Subex Limited
Position: Technical Lead - DevOps Location: Bangalore Rural, Karnataka, India Department: Data Platform and DevOps Employment Type: Subexian Experience Required: 3 to 6 years Job Overview: We are seeking an experienced Kubernetes Administrator with a strong background in managing containerized environments. The ideal candidate will have 4+ years of hands-on experience in deploying, configuring, and optimizing Kubernetes clusters to drive scalability, reliability, and performance. This is an excellent opportunity to leverage your expertise in Kubernetes orchestration while contributing to the overall success of our platform. Key Responsibilities: Cluster Management: Deploy, configure, and manage Kubernetes clusters both on-premises and across cloud platforms such as AWS, Azure, and GCP. Security & Compliance: Implement best practices for cluster security, including role-based access control (RBAC), network policies, and data encryption at rest and in transit. Automation: Automate cluster provisioning and ongoing management using tools like Terraform, Ansible, or Helm charts, streamlining operations and reducing manual tasks by 40%. Monitoring & Performance: Continuously monitor cluster health and performance metrics using tools like Prometheus and Grafana, ensuring high availability and optimal performance. CI/CD Pipelines: Design and implement CI/CD pipelines for containerized applications using tools such as Jenkins, GitLab CI/CD, and CircleCI to enable smooth continuous delivery. Collaboration: Work closely with development teams to troubleshoot issues, optimize application performance, and ensure compatibility with Kubernetes environments. Security Audits: Conduct regular security audits to identify vulnerabilities and ensure compliance with industry standards. Documentation: Maintain clear and comprehensive documentation for deployment procedures, configuration settings, and troubleshooting guides to enhance knowledge sharing within the team.
Infrastructure Management: Administer and maintain Linux/Unix servers and virtualization platforms such as VMware or KVM, ensuring seamless operations across the infrastructure. Backup & Recovery: Implement and manage robust backup and disaster recovery solutions to ensure data integrity and minimize system downtime. Technical Support: Provide expert-level technical support for server and network infrastructure-related issues. Required Skills & Qualifications: Proven experience in Kubernetes deployment, configuration, and administration. Strong command of containerization technologies, including Docker and containerd. Hands-on experience with cloud platforms such as AWS, Azure, and GCP. Proficiency in Infrastructure as Code (IaC) tools like Terraform and Ansible. Familiarity with CI/CD pipelines and automation tools like Jenkins and GitLab CI/CD. Excellent troubleshooting and problem-solving skills. Strong communication and collaboration abilities, with the capability to work effectively across cross-functional teams. If you're passionate about DevOps, Kubernetes, and driving the success of containerized environments, we'd love to hear from you!
Senior AI Engineer
Themathcompany
Job Title: Senior AI Engineer Location: Bengaluru, Karnataka, India Department: GenAI Experience: 4.5 to 7 years Open Positions: 5 About the Role As a Senior AI Engineer, you will design, build, and maintain scalable AI solutions with a strong focus on Generative AI technologies such as large language models (LLMs), embeddings, and retrieval techniques. You will lead a team of AI engineers and collaborate with stakeholders to deliver impactful AI-driven products aligned with business goals. Your role includes mentoring, project planning, ensuring data quality, and driving continuous process improvements. Key Responsibilities Design, develop, and deploy scalable AI/ML solutions, specializing in advanced Generative AI (LLMs, embeddings, retrieval-augmented generation, prompt engineering). Lead, mentor, and develop a team of AI engineers in a collaborative, inclusive environment. Coordinate with stakeholders to gather requirements, prioritize tasks, and define project timelines. Ensure projects align with overall business objectives and data strategies. Oversee data quality, integrity, and security in AI engineering projects. Build reusable frameworks to enhance the efficiency and scalability of AI systems. Manage client communications to translate requirements into technical outcomes. Identify skill gaps and create opportunities for professional development. Drive initiatives for improving data operations and AI delivery efficiency. Required Technical Skills 4.5 to 7 years of experience developing and deploying scalable AI/ML solutions. Strong expertise in data modeling, relational and NoSQL databases, software development lifecycle, unit testing, and functional programming. Proficient in designing and implementing advanced Generative AI solutions including LLMs, embeddings, retrieval techniques, and prompt engineering. Experience designing and optimizing Retrieval-Augmented Generation (RAG) systems. 
Proficiency with Databricks workflows, including job and cluster management, and API usage. Solid understanding of data structures, algorithms, multiprocessing, and optimization techniques. Skilled in Python libraries such as Pandas, NumPy, FastAPI for data processing and API development. Expertise in SQL optimization and database schema design. Experience deploying AI models using Docker and Kubernetes. Familiarity with version control using GitHub. Hands-on experience with cloud platforms like Azure, AWS, or GCP for AI deployments. Optional experience with PySpark for data processing. Basic understanding of CI/CD pipelines and deployment best practices. Required Non-Technical Skills Strong problem-solving ability with financial impact awareness in both team management and solution delivery. Excellent verbal and written communication skills, comfortable interacting with mid-level client management. Ability to balance pragmatic solutions versus perfect outcomes and rally teams accordingly. Strong interpersonal skills including conflict resolution, empathy, negotiation, and active listening. Demonstrated leadership and mentorship capabilities. Self-motivated with a strong sense of ownership. Good to Have Familiarity with data visualization tools and techniques. Understanding of data security, privacy, governance, and compliance frameworks. Experience with graph databases and graph processing frameworks. Knowledge of data virtualization and federation methods. Skills in data profiling and data quality management. Education Bachelor's degree in Engineering, Computer Science, or a related field. Qualification: Bachelor's degree in Engineering, Computer Science, or a related field.
Senior Data Engineer
Synechron
Position Title: Senior Data Engineer - Databricks, PySpark, Cloud Platforms Location: Bengaluru - Bellandur (GTP) Employment Type: Full-time Job Summary Synechron is looking for a Senior Data Engineer to join our advanced analytics team in Bengaluru. In this role, you will architect and build scalable, high-performance data pipelines that power data science, analytics, and business intelligence initiatives. You'll work with modern tools including Databricks, PySpark, and cloud data platforms, while collaborating across teams to ensure high-quality, secure, and efficient data solutions. Key Responsibilities Design, develop, and maintain large-scale, secure, and efficient data pipelines using Databricks, PySpark, and cloud-native tools. Partner with data scientists, analysts, and business stakeholders to translate requirements into robust data solutions. Integrate data from various structured, semi-structured, and streaming sources. Ensure high standards for data quality, performance optimization, security, and cost efficiency. Drive data pipeline automation, orchestration, and monitoring using tools like Airflow. Lead troubleshooting efforts, performance tuning, and enhancements of existing pipelines. Stay informed about emerging data technologies and recommend adoption where relevant. Technical Skills Core Expertise Programming: Python (expert), SQL (advanced), PySpark. Platforms: Databricks (clusters, notebooks, workflows), AWS/Azure/GCP. Data Orchestration: Apache Airflow (or similar). Data Warehousing: Snowflake (preferred), data modeling, ETL/ELT pipelines. Streaming: Kafka or other stream processing tools. DevOps: CI/CD (GitLab CI, Jenkins), version control (Git), containerization (Docker/Kubernetes preferred). Security: Familiarity with encryption, access controls, and compliance best practices. Experience 8+ years of experience in data engineering or related roles. 
Proven expertise in developing and deploying scalable data pipelines using Databricks, PySpark, and SQL. Hands-on experience with cloud platforms (AWS, Azure, or GCP). Strong background in data warehousing, especially with Snowflake. Exposure to real-time data processing and orchestration tools. Experience implementing CI/CD pipelines for data workflows is a plus. Daily Responsibilities Build and optimize data ingestion, transformation, and storage workflows. Collaborate with cross-functional teams to align data solutions with business objectives. Monitor, troubleshoot, and continuously improve pipeline performance. Conduct data quality checks and ensure governance and compliance standards are met. Contribute to technical documentation, code reviews, and team knowledge sharing. Qualifications Bachelor's or Master's degree in Computer Science, IT, or related field. Relevant certifications (e.g., Databricks Certified Data Engineer, AWS Certified Data Analytics) are preferred. Professional Competencies Strong problem-solving and analytical mindset. Effective communicator with ability to collaborate across technical and non-technical teams. Time management and prioritization skills under tight deadlines. Proactive leadership and a passion for innovation. Commitment to ethical data use and data security. Diversity & Inclusion at Synechron Synechron is committed to building an inclusive, diverse, and equitable workplace. Through our global Same Difference DEI initiative, we celebrate and support people from all backgrounds, including race, gender, sexual orientation, religion, age, disability, and more. We offer flexible work arrangements, continuous learning, internal mobility, and mentoring programs to support every employee's growth. Qualification: Bachelor's or Master's degree in Computer Science, IT, or related field.
DevOps Engineer
Sarvam
DevOps Engineer Location: Bengaluru, Karnataka, India (On-Site) Department: Engineering Employment Type: Full-Time About Sarvam.ai Sarvam.ai is a cutting-edge generative AI startup headquartered in Bengaluru, India, with a mission to make generative AI accessible and impactful for Bharat. Founded by AI experts, we are dedicated to developing high-performance, cost-effective AI agents tailored for the Indian market. We enable enterprises to tap into new opportunities, build deeper customer connections, and reshape the future of AI for India and beyond. Role Overview We are looking for a DevOps Engineer to join our team and help build and manage scalable, secure, and high-performance infrastructure. In this role, you will be a key contributor to automating deployments, managing cloud infrastructure, optimizing CI/CD workflows, and ensuring system reliability. You will work with cutting-edge technologies, including cloud platforms, containerization, and infrastructure as code (IaC), to deliver impactful solutions for AI-driven products. Key Responsibilities CI/CD Pipelines: Design, implement, and manage CI/CD pipelines for seamless software deployment and integration. Cloud Infrastructure: Deploy and manage cloud infrastructure using Terraform, Kubernetes, and Docker for scalability and high performance. Automation & Scaling: Automate infrastructure provisioning, scaling, and security compliance to support high-availability environments. Monitoring & Optimization: Implement logging, monitoring, and alerting solutions using tools like Prometheus, Grafana, ELK Stack, or CloudWatch to monitor system performance and optimize resource utilization. Security & Compliance: Enhance security and compliance by managing IAM policies, encryption, and vulnerability scanning. Troubleshooting & Root Cause Analysis: Troubleshoot system failures, perform root cause analysis, and implement improvements to ensure reliability and uptime. 
Collaboration: Work closely with development teams to ensure smooth deployment and operation of AI models and applications. Must-Have Skills & Qualifications Educational Background: Bachelor's degree in Computer Science, Engineering, or related field (2024/2025 graduates). Cloud Expertise: Strong experience with AWS, Azure, or GCP for deploying and managing cloud-based applications. Containerization: Proficiency in Docker and Kubernetes for building and managing containerized applications. Infrastructure as Code (IaC): Experience with Terraform, Ansible, or CloudFormation to automate infrastructure management. CI/CD Pipelines: Experience in setting up automated workflows using tools like GitHub Actions, Jenkins, or GitLab CI/CD for smooth deployments. Monitoring & Logging: Experience with Prometheus, Grafana, ELK, or similar tools to implement effective monitoring and logging solutions. Networking & Security: Strong understanding of firewalls, VPNs, SSL, and cloud security best practices for secure infrastructure. Version Control: Proficiency with Git for managing code repositories and version control workflows. Problem Solving: Strong debugging, troubleshooting, and analytical skills to resolve complex system issues. Good to Have (Preferred Experience) Serverless Computing: Exposure to serverless computing models such as AWS Lambda or Azure Functions. Message Queues: Experience with message queues like Kafka, RabbitMQ, or SQS. Site Reliability Engineering (SRE): Familiarity with SRE practices to ensure the reliability and availability of large-scale systems. Open Source Contributions: Contributions to open-source projects or a strong GitHub portfolio showcasing DevOps expertise and best practices. Impactful Work: Work on AI-driven products that are reshaping the future of technology in India. Innovative Team: Collaborate with a team of AI experts and engineers pushing the boundaries of technology. 
Career Growth: Opportunity to grow in a fast-growing startup at the forefront of the generative AI revolution. Cutting-edge Technologies: Work with cloud technologies, automation, and AI infrastructure to create high-impact products. Qualification: Bachelor's degree in Computer Science, Engineering, or related field.
Machine Learning Engineer
Test Company
Machine Learning Engineer Full-Time - Bengaluru, India - Data Science / Artificial Intelligence / Engineering Join our dynamic Data Science / Artificial Intelligence / Engineering team in Bengaluru, India as a Full-Time Machine Learning Engineer and play a key role in driving data-driven innovation! We are seeking a skilled and results-oriented Machine Learning Engineer to design, build, and deploy scalable machine learning models that address real-world business challenges. You will collaborate closely with data scientists, engineers, and product managers to transform raw data into actionable insights and integrate intelligent features into our products. As a Machine Learning Engineer, you will be responsible for the complete lifecycle of machine learning models and pipelines, from design and development to seamless deployment for a variety of applications. This includes classification, regression, clustering, recommendation systems, and time-series forecasting. You will leverage your expertise to preprocess and analyze large and complex datasets, extracting meaningful features and valuable insights. Collaboration with cross-functional teams will be crucial as you identify strategic ML opportunities and define clear success metrics. A key aspect of this role involves optimizing machine learning models for peak performance, scalability, and accuracy within production environments. You will build robust APIs or efficient microservices to integrate these models seamlessly into our applications, utilizing tools such as Flask or FastAPI. Continuous improvement is paramount, and you will be responsible for the ongoing monitoring and retraining of models based on their performance and any signs of data drift. Staying at the forefront of the field is essential, and you will be expected to stay updated with the latest ML research and emerging technologies, applying them to continuously enhance our product capabilities. 
Key Responsibilities: Design, develop, and deploy machine learning models and pipelines for diverse applications including classification, regression, clustering, recommendation, and time-series forecasting. Preprocess and analyze large datasets to extract meaningful features and actionable insights. Collaborate effectively with cross-functional teams to identify strategic ML opportunities and define clear success metrics. Optimize models for maximum performance, scalability, and accuracy in production environments. Build robust APIs or efficient microservices to integrate ML models into applications using tools like Flask or FastAPI. Continuously monitor and retrain models based on performance metrics and potential data drift. Stay updated with the latest ML research and technologies and apply them to enhance product capabilities. Minimum Qualifications: Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related field. 2+ years of proven experience as a Machine Learning Engineer or in a similar role. Strong proficiency in Python and key ML libraries such as Scikit-learn, XGBoost, TensorFlow, or PyTorch. Practical experience working with both SQL and NoSQL databases. Solid knowledge of essential data preprocessing, effective feature engineering, and robust model evaluation techniques. Familiarity with standard software engineering practices, including version control (Git), thorough code reviews, and efficient CI/CD pipelines. Preferred Qualifications: Prior experience with deep learning, natural language processing (NLP), or computer vision. Familiarity with major cloud services like AWS, GCP, or Azure (especially SageMaker, Vertex AI, etc.). Understanding of modern MLOps tools and practices (e.g., MLflow, Kubeflow, DVC). Practical experience with containerization and orchestration tools (Docker, Kubernetes). Knowledge of big data tools (e.g., Spark, Hadoop) is considered a significant plus. 
What We Offer: Competitive salary and performance-based incentives to reward your contributions. Comprehensive health insurance and valuable wellness benefits to support your well-being. Dedicated learning and development programs for continuous professional growth. Exciting opportunities to work on impactful, real-world AI/ML projects with significant scale. A collaborative, inclusive, and innovative work culture that fosters teamwork and creativity. Flexible working hours and a hybrid work model to promote a healthy work-life balance.
Lead Cloud Engineer - HPC
Chevron Corporation
Lead Cloud Engineer - HPC (High Performance Computing) Location: Bengaluru, India Company: Chevron Experience Level: 5-10 Years Department: IT Cloud Engineering Work Mode: Hybrid (Global Operations Support) About the Role Chevron is seeking a Lead Cloud Engineer - HPC to deliver next-generation High Performance Computing (HPC) infrastructure and application solutions. This position plays a key role in supporting compute-intensive workloads such as geophysics, reservoir simulations, AI/ML models, and parallel file systems, with a focus on cloud-native solutions and low-latency architectures. This role is ideal for someone experienced in cloud engineering, Linux system administration, and HPC architecture. Key Responsibilities Design, deploy, and support HPC environments for compute-intensive workloads. Configure and manage HPC job scheduling systems (e.g., Slurm, PBS). Implement and maintain parallel file systems such as Lustre. Manage Azure-based cloud infrastructure using VM Scale Sets. Collaborate with data scientists and developers to optimize infrastructure. Leverage Ansible, Satellite, and Python for automation. Optimize low-latency networks for distributed computing. Modernize HPC architecture with an eye on cost control and performance. Required Qualifications Bachelor's degree in Computer Science, Information Systems, or related field. 5-10 years of experience in: Linux system administration in a large-scale environment; Microsoft Azure, including VM Scale Sets; HPC job schedulers (Slurm, PBS); parallel file systems (Lustre); automation tools (Ansible, Satellite, Python); supporting compute-heavy scientific/engineering applications. Understanding of storage systems, networking, and performance tuning in HPC. Willingness to support global operations and participate in after-hours support. Preferred Skills Experience deploying HPC workloads in hybrid cloud environments. Familiarity with reservoir simulation or geophysical applications. 
Knowledge of security compliance and cost optimization in HPC/cloud. Working Hours Chevron supports global operations. Work hours may be aligned across international teams. Standard Work Days: Monday to Friday. Shift Options: 8:00 AM to 5:00 PM or 1:30 PM to 10:30 PM IST. Benefits & Perks Competitive compensation package. Health, life, and accident insurance. Flexible work schedule and hybrid options. Professional development and certification support. Work on cutting-edge HPC and AI infrastructure. Equal Opportunity Employer Chevron is committed to a diverse and inclusive workforce. All qualified applicants will receive consideration without regard to race, gender, religion, sexual orientation, nationality, age, or disability. Chevron participates in E-Verify in applicable regions. Apply Now If you're passionate about building scalable HPC infrastructure in the cloud, apply now to join Chevron's Cloud Engineering team in Bengaluru. Qualification: Bachelor's degree in Computer Science, Information Systems, or related field.
Software Development Manager for CephFS
International Business Machines
Software Development Manager - CephFS Location: Bangalore, Karnataka, India Job Type: Full-Time Experience Level: Senior / Leadership Company: IBM Ceph Engineering Team Education: Bachelor's Degree (Master's preferred) Introduction: At IBM, we're not just redefining business; we're redefining what's possible through technology, collaboration, and innovation. As one of the world's leading technology companies, IBM is transforming industries through the power of AI, Cloud, Analytics, Security, and IoT. With a presence in over 170 countries, we bring together diverse minds to solve complex challenges and build a smarter future. Join IBM's Ceph Engineering Organization and be a part of shaping the future of software-defined distributed storage. We're looking for an experienced and visionary Software Development Manager to lead the CephFS team, the group responsible for the file system layer of the Ceph ecosystem. About the Role: As a Software Development Manager for CephFS, you'll play a key leadership role in designing, developing, and delivering new capabilities in CephFS, the scalable and highly available POSIX-compliant distributed file system built atop Ceph. You'll lead a global team of engineers, working in an open-source community, to develop enterprise-grade storage solutions for modern workloads. You will focus on building next-generation distributed file system features like instant cloning, file overlays, coherent snapshots, and advanced client-side caching. This role is a mix of hands-on technical leadership and people management, with strong collaboration across open-source communities, IBM product teams, and clients. Key Responsibilities: Lead and mentor a team of talented engineers working on CephFS. Drive the design and implementation of new distributed features and algorithms for file system scalability, performance, and resiliency. Collaborate with the global Ceph open-source community to contribute and review code, resolve issues, and plan new features. 
Guide the team in debugging complex production issues, both live and offline. Collaborate with client-facing and support teams to perform root cause analysis for customer-reported issues. Contribute to the architecture and roadmap of CephFS in alignment with product and client needs. Engage in code reviews, triaging, and architectural discussions. Promote engineering best practices and a culture of continuous improvement. Required Skills & Experience: Bachelor's degree in Computer Science or related field. Strong experience working with C++ or other systems programming languages. Excellent debugging skills (live system and core file analysis). Hands-on experience in open-source development (preferably with contributions to GitHub). Good understanding of large-scale codebases and the ability to design and implement major features or changes. Comfortable working with Python for automation and testing. Proficiency with Git and GitHub workflows. Excellent verbal and written English communication skills to coordinate with a distributed, global team. Proven ability to mentor and support engineers while driving technical excellence. Preferred Qualifications: Master's degree in Computer Science or related field. Experience in building or maintaining file systems or distributed storage platforms. Prior work in distributed systems, high-performance computing, or cloud-native storage. Experience working in remote/distributed teams. Familiarity with systems like OpenStack, Kubernetes, or NFS-Ganesha. Work with world-class engineering teams on products that power global enterprise infrastructure. Be part of a vibrant open-source community, contributing to widely adopted storage technologies. Enjoy a culture of continuous learning, innovation, and impact. Competitive salary, benefits, and flexible work arrangements. Be essential. Be a leader in redefining how the world stores and accesses data. 
Apply today to join IBM's CephFS team and help build the future of enterprise file storage. Qualification: Bachelor's degree in Computer Science or related field.
Site Reliability Developer 2/3
Oracle
Job Description: Site Reliability Engineer - OCI Cloud Engineering Team Role: Site Reliability Engineer (SRE) Team: OCI OLTP (Online Transaction Processing) Location: Kiev Career Level: IC2 Experience: 5+ years Overview: Oracle Cloud Infrastructure's (OCI) OLTP organization is seeking a Site Reliability Engineer (SRE) to join our dynamic and fast-paced Cloud engineering team. The team is responsible for mission-critical distributed systems and cloud services, and we are looking for an engineer who is deeply interested in databases, distributed systems, and cloud services. If you thrive in an environment where innovation, problem-solving, and operational excellence intersect, this is an exciting opportunity for you! As a member of the SRE services, you will focus on Cloud Services, building deployments, operations, security vulnerability mitigation, and automation. You will be instrumental in fostering a culture of Site Reliability Engineering (SRE) within the team, and your work will directly contribute to ensuring the stability, performance, and reliability of Oracle's global cloud service infrastructure. This role requires someone who is adaptable, highly motivated, and capable of managing large-scale cloud environments with a focus on continuous improvement. Key Responsibilities: Cloud Service Operations & Reliability: Deploy, operate, and maintain large-scale cloud service products in a highly available, fault-tolerant, and scalable environment. Collaborate with internal teams to identify and mitigate cross-team issues that pose operational risks to cloud services. Focus on systems reliability and ensure the continuous availability of cloud services by automating tasks and eliminating manual interventions. Automation & Improvements: Automate operational tasks and improve service deployments, focusing on scaling, performance, and uptime. Contribute to CI/CD systems, ensuring seamless integration and continuous delivery for cloud-based services. 
Leverage automation tools such as Terraform, Grafana, and Bitbucket to streamline operations. Security & Incident Response: Mitigate security vulnerabilities within cloud services and ensure compliance with Oracle's security standards. Participate in on-call rotations to provide immediate troubleshooting support and ensure rapid issue resolution. Perform deep analysis of service performance and collaborate with team members to diagnose and resolve issues that affect service availability or performance. Collaborative Problem-Solving: Work closely with cross-functional teams, including development, database, networking, and storage experts, to ensure the reliability and performance of services. Identify systemic issues and potential risks, develop solutions, and ensure proper documentation and communication with stakeholders. Documentation & Knowledge Sharing: Contribute to documentation such as runbooks, operational guides, and troubleshooting manuals. Mentor junior engineers and share knowledge on best practices for site reliability engineering and cloud service operations. Continuous Learning: Stay up to date with new cloud technologies, trends, and best practices, and actively implement them in your day-to-day work. Technical and Professional Requirements: Cloud Services & Infrastructure: 5+ years of experience in SRE, DevOps, or Automation roles with a focus on large-scale infrastructure and cloud services. Hands-on experience with cloud platforms (e.g., OCI, AWS, Azure) and expertise in compute, database, networking, and storage services within cloud environments. Automation & Tooling: Proficiency with automation tools such as Terraform, Grafana, LumberJack, and Shepherd. Solid experience in using CI/CD tools and processes for cloud service deployments and operations. Scripting & Systems: Strong knowledge of scripting languages, particularly Python and Java. 
Familiarity with Linux systems, Docker containers, virtualized infrastructure, and orchestration (e.g., Kubernetes). Performance & Troubleshooting: Excellent troubleshooting skills with a focus on performance, availability, reliability, and scalability of distributed systems. Experience in operating fault-tolerant, highly available, high-throughput distributed systems. Security & Incident Management: Familiarity with security practices and mitigating security vulnerabilities in cloud services. Proven ability to handle incident response and provide efficient troubleshooting during on-call rotations. Collaboration & Communication: Strong verbal and written communication skills, capable of working effectively with diverse teams across multiple geographies. Ability to work in a highly collaborative environment, driving operational excellence and customer satisfaction. Preferred Qualifications: Experience in operating and maintaining multi-tenant, cloud-based infrastructure with a focus on scalability and high availability. Familiarity with tools and platforms like Grafana, Prometheus, and other observability and monitoring tools. Experience in networking and storage technologies in a cloud environment. Joining OCI's OLTP team as an SRE gives you the opportunity to work with cutting-edge technologies and contribute to the operational excellence of Oracle's global cloud infrastructure. This is a chance to grow your skills in a highly dynamic environment and to solve complex problems that directly impact mission-critical cloud services. With a focus on automation, scalability, and high performance, you will be an essential part of a team that powers Oracle's leading cloud services. If you are an experienced engineer passionate about cloud technologies, automation, and ensuring the reliability of large-scale systems, we encourage you to apply and join us in this exciting journey!
DevOps Engineer
Pure Storage
Join Us in Revolutionizing the Data Storage Industry We're leading an exciting transformation in the data storage industry. By joining us, you will have the opportunity to make a significant impact, grow alongside a brilliant team, and be at the forefront of cutting-edge technology. If you're passionate about reshaping the future of tech, this is your chance to leave your mark. The Challenge: As a DevOps Engineer, you will be responsible for managing, automating, and optimizing our VMware, cloud, and on-prem Kubernetes infrastructure. Your role will be critical in ensuring the high availability, performance, and scalability of our services. You will work closely with cross-functional teams to architect solutions that improve our engineering operations and align with business needs. Key Responsibilities: Infrastructure Design & Management: Design, deploy, and maintain scalable VMware, Cloud (AWS, GCP, Azure, IBM), and Kubernetes environments. Custom Solutions Development: Develop and implement tailored OpenStack/VMware/Kubernetes solutions based on organizational needs. Automation & CI/CD: Build automation tools and frameworks (CI/CD pipelines) to streamline infrastructure operations and manage workloads (VMs/Pods) lifecycles. Observability & Best Practices: Implement observability best practices for OpenStack/VMware/Kubernetes environments, ensuring reliability and performance. Continuous Improvement: Provide expertise in DevOps methodologies to support continuous improvement and optimize infrastructure efficiency. Troubleshooting & Support: Assist with deployment, configuration, troubleshooting, and provide infrastructure support for AWS/Azure environments. Documentation & Knowledge Sharing: Create and maintain documentation for OpenStack/VMware/Kubernetes architecture and stay current with emerging technologies and industry best practices. On-Call Support: Participate in the follow-the-sun on-call rotation for infrastructure support. 
What You'll Need to Bring: Education: Bachelor's degree in Computer Science, Information Technology, or a related field. Experience: 5+ years of experience managing and automating large-scale VMware, OpenStack, or Kubernetes environments. Programming Skills: Proficiency in Python, Java, or Go for scripting and automation. DevOps Knowledge: Strong understanding of DevOps principles with hands-on experience using tools like Ansible, Terraform, or Puppet. Observability Expertise: Experience implementing observability solutions with tools like Prometheus, Grafana, Logstash, Elastic, Fluent Bit, or Fluentd. Linux & Networking: Familiarity with Linux, TCP/IP, DNS, DHCP, and related networking concepts. OpenStack/VMware/Kubernetes: Proven experience working with OpenStack (e.g., OpenStack Yoga), VMware, or Kubernetes (particularly enterprise Kubernetes clusters on bare metal). Virtualization & Containerization: Expertise in virtualization technologies (KVM, VMware, KubeVirt) and container technologies (Docker, Kubernetes). Cloud Experience: Experience with infrastructure support and automation for AWS, Azure, or GCP. Problem-Solving Skills: Excellent troubleshooting abilities with great attention to detail. Communication & Collaboration: Strong interpersonal skills and the ability to collaborate across teams. Preferred Qualifications: Experience with Pure Storage products such as FlashArray, FlashBlade, or Portworx. Certifications in Kubernetes (CKA, CKAD), VMware (VCP), or OpenStack (COA). Experience running production VMs in KubeVirt. Familiarity with agile environments and project management tools. What You Can Expect from Us: Pure Innovation: We celebrate critical thinkers, challenges, and trailblazers who strive to innovate. Pure Growth: We support your growth and provide the space for you to contribute meaningfully. We're proud to be named in Fortune's Best Workplaces and certified as a Great Place to Work. 
Pure Team: We focus on collaboration and teamwork, setting aside ego for the greater good. Qualification: Bachelor's degree in Computer Science, Information Technology, or related field.
Senior Site Reliability Engineer
Couchbase
Job Title: Site Reliability Engineer (SRE) - Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime. 
Cloud Optimization: Stay up-to-date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts.
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!
Senior Software Engineer - Data Platform
Databricks
About Databricks At Databricks, we are passionate about enabling data teams to solve the world's toughest challenges, from creating the next mode of transportation to accelerating the development of medical breakthroughs. We build and run the world's best data and AI infrastructure platform, empowering our customers to use deep data insights to transform their businesses. Databricks Mosaic AI offers a data-centric approach to building enterprise-quality Machine Learning (ML) and Generative AI solutions, enabling organizations to securely and cost-effectively own and host ML and Generative AI models, trained and augmented with their enterprise data. We're only getting started in Bengaluru, India, where we are currently setting up 10 new engineering teams from scratch! The Opportunity: Senior Software Engineer As a Senior Software Engineer at Databricks India, you'll have the opportunity to work on a variety of challenging projects across multiple domains: Backend Engineering, Distributed Data Systems (DDS), and Full-Stack Development. The Impact You'll Have 1. Backend Engineering Join our Backend teams and tackle challenges that range from product to infrastructure: Solve complex problems in distributed systems, large-scale service architecture, monitoring, workflow orchestration, and developer experience. Build reliable, high-performance services and client libraries for managing massive amounts of data on cloud storage backends like AWS S3 and Azure Blob Store. Work on scalable services (e.g., Scala, Kubernetes) and data pipelines (e.g., Apache Spark, Databricks) that support our pricing infrastructure, processing millions of cluster-hours per day. 2. Distributed Data Systems (DDS) Work across a range of exciting DDS projects: Apache Spark, Data Plane Storage, Delta Lake, Delta Pipelines, Performance Engineering. 3.
Full-Stack Engineering As a Full-Stack Software Engineer, collaborate closely with your team and product managers to create intuitive user experiences that delight our customers. What We Look For BS or higher in Computer Science or a related field. 7+ years of production-level experience in one or more of the following languages: Python, Java, Scala, C++, or similar. Proven experience developing large-scale distributed systems from scratch. Experience working on a SaaS platform or with Service-Oriented Architectures (SOA). About Databricks Databricks is the data and AI company trusted by over 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500. We help unify and democratize data, analytics, and AI through the Databricks Data Intelligence Platform. Headquartered in San Francisco, Databricks was founded by the original creators of Apache Spark, Delta Lake, MLflow, and the Lakehouse architecture, with offices around the globe. Qualification: BS (or higher) in Computer Science, or a related field.
Staff Software Engineer (Go, Microservices, Kubernetes)
NetApp
About NetApp NetApp is the intelligent data infrastructure company, turning disruption into opportunity for every customer. We help customers unlock new business possibilities, no matter the data type, workload, or environment. At NetApp, it all starts with our people. We embrace diversity and openness because it's part of our DNA. Collaboration is at the core of what we do: asking for help, partnering across teams, and driving innovation together. "At NetApp, we fully embrace and advance a diverse, inclusive global workforce that fosters belonging and high performance." George Kurian, CEO Job Summary As a Senior Software Engineer on the AI Data Platform team, you will be involved in the design and development of the AI Data Platform, built on NetApp's flagship ONTAP storage operating system, the #1 Storage Operating System in the world, trusted by over 30,000 customers and managing hundreds of exabytes of data. Join us in transforming how data shapes the world. Your work will support cutting-edge technologies that enable life-saving medical analytics, improve autonomous vehicle navigation, monitor environmental hazards, and unlock new possibilities for businesses globally. An ideal candidate is results-driven, curious, creative, and collaborative, with broad experience in Big Data processing, AI/ML workflows, MLOps, Kubernetes, and distributed systems. Job Responsibilities Design, develop, and support AI Data Platform components built on NetApp ONTAP. Build and maintain microservices and REST APIs for scalable, reliable solutions. Work closely with cross-functional teams to solve complex, data-intensive problems and deliver innovative solutions. Participate in technical discussions and contribute to system design, architecture, and best practices. Support and collaborate with other engineers to ensure seamless development, testing, and deployment processes.
Stay current with emerging technologies, continuously improving your skill set and applying new concepts to ongoing projects. Required Skills Programming Languages: Proficiency in Go and Python. AI/ML Experience: Familiarity with PyTorch, TensorFlow, Keras, OpenAI frameworks, LLMs (Open Source), LangChain. Cloud & Kubernetes: Hands-on experience with Linux, Kubernetes control plane, auto-scaling, orchestration, and containerization in AWS/Azure/GCP environments. Big Data Technologies: Experience with platforms like Spark, Hadoop, and distributed storage systems for large-scale data processing. NoSQL Databases: Proficiency in MongoDB, Cassandra, Cosmos DB, and DocumentDB. Microservices Architecture: Proven experience building microservices and developing REST APIs and related frameworks. Preferred Skills Experience in the storage domain or with distributed file systems, networking, or file/cloud protocols. Familiarity with MLOps practices and workflows. Proven experience leading mid- to large-sized projects and collaborating across teams. Strong understanding of computer architecture, data structures, and programming best practices. Education and Experience Bachelor's degree with 12+ years of experience, Master's degree with 12 years, or PhD with 10 years of experience. Equivalent experience is also considered. Work Environment NetApp offers a hybrid work environment to enhance connection, collaboration, and culture. In-office expectations will be discussed during the recruitment process. Equal Opportunity Employer NetApp is an Equal Employment Opportunity (EEO) employer, committed to providing a workplace free of discrimination. We do not discriminate based on age, race, color, gender, sexual orientation, gender identity, national origin, religion, disability, genetic information, pregnancy, or any protected classification. A Note to Applicants Research shows that women often apply only if they meet 100% of the qualifications, but no one is ever 100% qualified.
If this role excites you, we encourage you to apply anyway! Qualification: Bachelor's degree with 12+ years of experience, Master's degree with 12 years, or PhD with 10 years of experience. Equivalent experience is also considered.