Reliability Engineering Jobs in Bengaluru
941 Jobs Found
Senior Technical Architect
Locus
Senior Technical Architect Location: Bengaluru Work Type: Full-Time About Locus Locus is a battle-tested, agentic Transportation Management System powering all-mile, all-channel logistics across 30+ countries. In 2025, Locus joined the Ingka Group (IKEA Retail), gaining the scale of a global leader while continuing to operate independently. Headquartered in Bangalore with a global footprint, we are a team of 170+ problem-solvers united by a mission to reinvent how the world moves goods. What We Value Global Mindset: Curiosity about diverse markets. Driven: Energized by complex challenges. Thoughtful: Analytical and deliberate. Adaptive: Decisive in fast-moving environments. Exacting: Commitment to excellence and detail. Role Overview You will define the architectural backbone of Locus's enterprise SaaS platform. This role bridges the gap between product vision and technical execution, ensuring our systems remain scalable, extensible, and resilient as we grow globally. Key Responsibilities Platform Vision: Define high-level system and module architectures aligned with business goals. Domain-Driven Design: Build domain-centric, loosely coupled systems for logistics and enterprise SaaS. Scale & Performance: Architect resilient, high-throughput systems for real-time data processing. Technical Mentorship: Lead design reviews and mentor engineers on architectural patterns and best practices. Modernization: Identify technical debt and define phased roadmaps for platform evolution. Ideal Candidate Profile 10+ years in software architecture, including 3+ years in enterprise SaaS platform architecture. Domain Expertise: Deep knowledge of TMS, WMS, routing, or optimization systems. Technical Stack: Mastery of Microservices, Event-Driven Architectures, REST APIs, and Cloud-Agnostic design. Leadership: Proven ability to communicate complex blueprints to both executive stakeholders and engineering teams.
Shape the future of logistics automation, establish a foundation for global innovation, and solve complex, real-world problems with the backing of the IKEA ecosystem.
Site Reliability Engineer
Groww
Position: Site Reliability Engineer Location: Bengaluru About Groww At Groww, we're on a mission to make financial services simple, accessible, and transparent for every Indian. As one of India's fastest-growing financial platforms, we help millions take control of their financial future through a wide range of products. We're a team driven by ownership, radical customer-centricity, and a deep passion for challenging the status quo. From intuitive design to robust engineering, everything we build is grounded in what our customers need. If you're excited about building systems that power the future of finance in India, we'd love to hear from you. Our Vision To empower every Indian with the knowledge, tools, and confidence to make sound financial decisions. Our goal is to be the most trusted financial partner for millions across the country. Our Core Values Customer Obsession: We put our users first, always. Extreme Ownership: We own everything we do, end-to-end. Simplicity: We keep things simple, effective, and intuitive. Long-term Thinking: We focus on sustainable, impactful decisions. Transparency: We believe in open communication and collaboration. Role Overview: As a Site Reliability Engineer (SRE) at Groww, you will be responsible for ensuring our systems are highly available, performant, and secure. You will work closely with engineering and infrastructure teams to improve reliability, automate deployments, and manage mission-critical services that power our platform. Key Responsibilities: Monitor and troubleshoot issues related to system performance, availability, and security. Define and maintain SLIs, SLOs, and Error Budgets to improve system reliability. Use tools like Grafana to analyze and report on metrics and trace data. Participate in the on-call rotation for 24/7 support of production systems. Collaborate with developers to ensure scalability and reliability are built into new services. Roll out security and infrastructure features proactively.
Manage automated deployments, version control, and release rollouts. Perform Root Cause Analysis (RCA) for incidents and implement long-term fixes. Optimize system performance, conduct capacity planning, and create recovery strategies. Identify and automate repetitive tasks to reduce toil. Leverage CI/CD tools such as Git, Jira, and Jenkins to streamline development workflows. Requirements: 4-6 years of relevant experience in SRE, DevOps, or infrastructure engineering. Bachelor's or Master's degree in Computer Science or a related field. Strong background in Linux/Unix system administration and networking. Hands-on experience with cloud platforms like GCP or AWS. Proficiency in programming languages such as Python, Java, or Go. Experience with monitoring and alerting tools: Grafana, Prometheus, New Relic, etc. Familiarity with configuration management tools. Experience with Kubernetes, Docker, and container orchestration tools is a strong plus. Excellent problem-solving, communication, and team collaboration skills. Be a part of one of India's fastest-growing fintech startups. Build and scale systems that impact millions of users daily. Work with passionate, driven teammates who are redefining financial services. A culture that encourages continuous learning, ownership, and transparency. If you're ready to help shape the future of fintech infrastructure in India, Groww is the place for you. Let's build something extraordinary together. Qualification: Bachelor's or Master's degree in Computer Science or a related field
Technical Lead Devops
Subex Limited
Position: Technical Lead - DevOps Location: Bangalore Rural, Karnataka, India Department: Data Platform and DevOps Employment Type: Subexian Experience Required: 3 to 6 years Job Overview: We are seeking an experienced Kubernetes Administrator with a strong background in managing containerized environments. The ideal candidate will have 4+ years of hands-on experience in deploying, configuring, and optimizing Kubernetes clusters to drive scalability, reliability, and performance. This is an excellent opportunity to leverage your expertise in Kubernetes orchestration while contributing to the overall success of our platform. Key Responsibilities: Cluster Management: Deploy, configure, and manage Kubernetes clusters both on-premises and across cloud platforms such as AWS, Azure, and GCP. Security & Compliance: Implement best practices for cluster security, including role-based access control (RBAC), network policies, and data encryption at rest and in transit. Automation: Automate cluster provisioning and ongoing management using tools like Terraform, Ansible, or Helm charts, streamlining operations and reducing manual tasks by 40%. Monitoring & Performance: Continuously monitor cluster health and performance metrics using tools like Prometheus, Grafana, ensuring high availability and optimal performance. CI/CD Pipelines: Design and implement CI/CD pipelines for containerized applications using tools such as Jenkins, GitLab CI/CD, and CircleCI to enable smooth continuous delivery. Collaboration: Work closely with development teams to troubleshoot issues, optimize application performance, and ensure compatibility with Kubernetes environments. Security Audits: Conduct regular security audits to identify vulnerabilities and ensure compliance with industry standards. Documentation: Maintain clear and comprehensive documentation for deployment procedures, configuration settings, and troubleshooting guides to enhance knowledge sharing within the team. 
Infrastructure Management: Administer and maintain Linux/Unix servers and virtualization platforms such as VMware or KVM, ensuring seamless operations across the infrastructure. Backup & Recovery: Implement and manage robust backup and disaster recovery solutions to ensure data integrity and minimize system downtime. Technical Support: Provide expert-level technical support for server and network infrastructure-related issues. Required Skills & Qualifications: Proven experience in Kubernetes deployment, configuration, and administration. Strong command of containerization technologies, including Docker and containerd. Hands-on experience with cloud platforms such as AWS, Azure, and GCP. Proficiency in Infrastructure as Code (IaC) tools like Terraform and Ansible. Familiarity with CI/CD pipelines and automation tools like Jenkins and GitLab CI/CD. Excellent troubleshooting and problem-solving skills. Strong communication and collaboration abilities, with the capability to work effectively across cross-functional teams. If you're passionate about DevOps, Kubernetes, and driving the success of containerized environments, we'd love to hear from you!
Platform Engineer
Colortokens
Platform Engineer Location: Bengaluru, Karnataka, India Full-time, partially remote About ColorTokens At ColorTokens, we empower businesses to stay operational and resilient in an increasingly complex cybersecurity landscape. Breaches happen, but with our cutting-edge ColorTokens Xshield platform, companies can minimize the impact of breaches by preventing the lateral spread of ransomware and advanced malware. We enable organizations to continue operating while breaches are contained, ensuring critical assets remain protected. Our innovative platform provides unparalleled visibility into traffic patterns between workloads, OT/IoT/IoMT devices, and users, allowing businesses to enforce granular micro-perimeters, swiftly isolate key assets, and respond to breaches with agility. Recognized as a Leader in the Forrester Wave: Microsegmentation Solutions (Q3 2024), ColorTokens safeguards global enterprises and delivers significant savings by preventing costly disruptions. Our culture We foster an environment that values customer focus, innovation, collaboration, mutual respect, and informed decision-making. We believe in alignment and empowerment so you can own and drive initiatives autonomously. Self-starters and highly motivated individuals will enjoy the rewarding experience of solving complex challenges that protect some of the world's impactful organizations, be it a children's hospital, a city, or the defense department of an entire country. Position Overview: ColorTokens is looking for a Junior Platform Administrator to assist in managing, maintaining, and optimizing our NextGen Security Information and Event Management (SIEM) platform. The ideal candidate will support the day-to-day operations, help onboard customer log sources, troubleshoot integration issues, and provide technical assistance to the security operations team. This role is ideal for a motivated professional with 3+ years of experience in SIEM administration, security operations, or log management.
Key Responsibilities: SIEM Platform Administration: Assist in deploying, configuring, and maintaining the NextGen SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, QRadar, Chronicle, Exabeam). Perform basic updates and patches to ensure platform security and functionality. Monitor SIEM health, performance, and uptime under the guidance of senior administrators. Log Source Management: Onboard new log sources and validate data ingestion. Help troubleshoot log ingestion, parsing, and formatting issues. Maintain log retention policies for compliance. Rule and Use Case Management: Support the development and deployment of detection rules, correlation use cases, and alerts. Tune existing use cases to minimize false positives. Work closely with security analysts to refine alerting strategies. Integration and Automation: Assist in integrating SIEM with other security tools (e.g., EDR, microsegmentation, vulnerability scanners). Work on basic automation tasks using scripting (Python, PowerShell) to enhance SIEM efficiency. Platform Security and Compliance: Support role-based access control (RBAC) and platform security policies. Help ensure SIEM adheres to compliance standards like SOC2, ISO 27001. Participate in periodic security audits. Network Debugging & Troubleshooting: Have a basic understanding of TCP/IP, networking concepts, and protocols. Assist in debugging network connectivity issues related to SIEM log ingestion. Use basic network troubleshooting tools. Collaboration and Support: Work alongside SOC analysts, threat hunters, and security engineers. Provide basic technical support for SIEM users. Assist in training and documentation for security teams. Performance Monitoring and Optimization: Monitor storage and indexing performance to ensure optimal operations. Report any performance issues to senior administrators. Contribute to platform health reports and alerting metrics.
Incident Support: Assist SOC teams in log analysis, incident response, and forensic investigations. Ensure log data is readily available for security incidents. Education and Certifications: Bachelor's degree in Computer Science, Information Security, or a related field. Certifications (Preferred but not mandatory): Splunk Certified User/Admin; Microsoft Certified: Security Operations Analyst Associate; QRadar Certification; any SIEM-related certification. Experience: 3+ years of experience in SIEM administration, security operations, or log management. Hands-on experience with at least one SIEM platform (e.g., Stellar Cyber, Splunk, Sentinel, Chronicle, Exabeam). Basic knowledge of log ingestion, rule creation, and data parsing. Exposure to scripting (Python, PowerShell) for automation. Basic understanding of TCP/IP networking concepts and network debugging. Technical Skills: Understanding of log formats, Syslog, JSON, XML, and data pipelines. Basic knowledge of querying languages (KQL, SPL, AQL). Familiarity with SIEM integration with security tools like EDR, SOAR, NDR. Awareness of MITRE ATT&CK, NIST, or CIS security frameworks. Basic experience with network troubleshooting tools (ping, traceroute, netcat (nc)). Soft Skills: Strong problem-solving and troubleshooting abilities. Good verbal and written communication skills. Ability to work collaboratively in a security operations environment. Preferred Skills: Basic understanding of cloud-based security solutions (AWS, Azure, Google Cloud). Exposure to SOAR tools (e.g., Cortex XSOAR, Splunk Phantom). Interest in machine learning-based anomaly detection for SIEM. Key Metrics for Success: Successful onboarding of log sources. Improvement in log ingestion and parsing accuracy. Contribution to fine-tuning detection rules. Timely resolution of SIEM-related support requests. Ability to identify and troubleshoot basic network connectivity issues.
Deputy Manager- Mechanical Maintenance
Jindal Aluminium
Position: Deputy Manager - Mechanical Maintenance Department: Maintenance Location: Bengaluru Role Overview: We are seeking an experienced and proactive Deputy Manager - Mechanical Maintenance to lead and manage the mechanical maintenance function at our Bengaluru facility. The ideal candidate will be responsible for ensuring the optimal performance, reliability, and safety of mechanical equipment through strategic planning and execution of maintenance activities. This role demands a hands-on leader who can drive operational efficiency, reduce downtime, and ensure compliance with industry standards. Key Responsibilities: Plan, schedule, and implement preventive and predictive maintenance programs to maximize equipment uptime and longevity. Manage the maintenance budget, ensuring efficient allocation of resources while maintaining quality and performance standards. Troubleshoot and resolve mechanical failures promptly to support uninterrupted production operations. Lead and supervise a team of maintenance technicians, ensuring adherence to safety procedures, SOPs, and company policies. Collaborate with production teams to coordinate planned shutdowns and maintenance activities with minimal disruption. Maintain accurate documentation of maintenance activities, equipment history, spare parts usage, and performance metrics. Develop and implement strategies to improve equipment reliability, reduce breakdowns, and enhance operational performance. Ensure all maintenance practices comply with relevant statutory and regulatory requirements. Lead mechanical maintenance projects, including new equipment installations, upgrades, and commissioning. Mentor and train team members to build technical capabilities and foster a culture of continuous improvement. Qualifications & Skills: Bachelor's degree (B.E/B.Tech) in Mechanical Engineering. Proven experience in a mechanical maintenance leadership role, preferably in a manufacturing or industrial environment.
Strong knowledge of preventive and predictive maintenance techniques. Experience in managing budgets, projects, and cross-functional teams. Excellent problem-solving, communication, and leadership skills. Familiarity with regulatory requirements and industry safety standards. Qualification: Bachelor's degree (B.E/B.Tech) in Mechanical Engineering.
Systems Development Engineer, Google Cloud
Google Careers
Systems Development Engineer, Google Cloud Location: Bengaluru, Karnataka, India Company: Google Minimum Qualifications Bachelor's degree in Computer Science, Information Technology, or a related field; or equivalent practical experience. 2+ years of experience with systems automation. 2+ years of experience in technical infrastructure (e.g., deployment, maintenance, troubleshooting). Preferred Qualifications 3+ years of experience in systems design and implementation. About the Role As a Systems Development Engineer (SDE) at Google Cloud, you will be part of a team responsible for managing and scaling critical services and infrastructure. This role emphasizes automation, reliability, and observability, using engineering practices to eliminate manual toil and improve system efficiency. Google SDEs design and build the tools and systems that power the infrastructure for Google's services, transforming telemetry into actionable insights and proactively solving operational challenges. You'll have the opportunity to work on impactful, large-scale projects in an environment that fosters learning, collaboration, and growth. Key Responsibilities Participate in on-call rotations and incident response, managing services within your domain. Troubleshoot infrastructure and system issues, evaluate diagnostic data, and recommend solutions. Resolve tickets and bugs within defined service-level objectives (SLOs). Collaborate with primary responders to maintain high availability and reliability of systems. Contribute to the design and implementation of systems and services in related domains. Work directly with customers to gather requirements, define distributed system needs, and propose solutions. Develop automation tools and systems to improve efficiency and reduce operational overhead. About Google Cloud Google Cloud helps organizations transform their business with advanced technologies and enterprise-grade solutions.
With a focus on sustainability, innovation, and scalability, Google Cloud serves customers in over 200 countries and territories, providing the tools and infrastructure necessary to solve the world's most complex business challenges. Qualification: Bachelor's degree in Computer Science or IT-related field, or equivalent practical experience.
Electrical Principal Engineer
Dell Technologies
Electrical Principal Engineer, FPGA Team Location: Bengaluru, India Team: Electrical Engineering Company: Dell Technologies Role Overview As a Principal Electrical Engineer, you will contribute to the architecture, design, and validation of FPGA-based hardware systems for Dell's next-generation enterprise servers. This role involves working across global teams to deliver robust, scalable, and efficient PCBA (Printed Circuit Board Assembly) and logic solutions that align with industry standards and internal requirements. Key Responsibilities Architect and design next-gen hardware features in collaboration with front-end teams and partners. Analyze and recommend trade-offs in design features and costs. Guide global teams with best practices in electronic hardware design. Own and deliver system interfaces and support cross-functional development efforts. Create comprehensive documentation for testing and validation. Essential Requirements 8-12 years of experience in FPGA hardware verification using Verilog, SystemVerilog, VHDL. Expertise in UVM, ABV (Assertion-Based Verification), code coverage, and unit-level simulation. Knowledge of digital design methodologies: CDC (Clock Domain Crossing), RDC (Reset Domain Crossing), and static timing analysis. Experience with x86 or ARM architectures. Familiarity with peripheral protocols: I2C, I3C, SMBus, IPMI, IPMB. Strong background in both analog and digital design. Understanding of hardware/software co-design and debugging complex systems. Desirable Qualifications Experience with Intel/AMD x86 and ARM-based systems. Hands-on with FPGA tools: Xilinx, Lattice, Altera Quartus, ModelSim/QuestaSim. Passion for mentoring and knowledge sharing. Dell Technologies offers a collaborative and innovative environment where hardware engineers work at the forefront of industry advancements. You'll be empowered to lead cutting-edge hardware projects, influence product design, and make a lasting impact on the future of enterprise technology.
Principal Systems Development Engineer
Dell Technologies
Principal Systems Development Engineer, Thermal Controls Location: Bengaluru, India Team: Systems Development Engineering Company: Dell Technologies Role Overview As a Principal Systems Development Engineer specializing in Thermal Controls, you'll lead the design and validation of advanced cooling algorithms for Dell's server products, especially those geared towards AI, HPC, and enterprise applications. Your role bridges hardware, firmware, and software to deliver efficient, reliable thermal management systems and uncover innovation opportunities. Key Responsibilities Design, document, implement, and validate thermal control algorithms in collaboration with Dell's global systems management software and thermal teams. Debug and validate thermal control features on live server hardware. Set up and operate servers with varying configurations to test thermal control functionality. Diagnose and route field issues, contributing to resolution strategies. Develop or enhance automation tools to support thermal design and validation workflows. Essential Requirements 5+ years of industry experience in mechatronics, thermal controls, electronics cooling, and hardware/firmware debugging. Strong programming skills in Python, C, or similar languages. Familiarity with server systems: BIOS, CPLD, BMC, device drivers. Proficiency in Linux OS and strong understanding of general OS principles. Excellent debugging, problem-solving, and communication skills. Desirable Qualifications Bachelor's or Master's degree in Computer Engineering, or in Mechanical Engineering with a focus on controls or mechatronics. Experience designing and implementing electronic cooling solutions (hardware to firmware). Proficiency with CAD tools (especially CREO) for mechanical and thermal design. You'll work on thermal control systems that enable the latest in AI and high-performance computing, shaping the future of Dell's data center and server technologies.
Dell's engineering culture offers hands-on collaboration, career development, and the opportunity to lead innovations at the intersection of hardware, firmware, and software.
Staff Engineer - Core Infrastructure
Eightfold
Staff Engineer - Core Infrastructure Location: Bangalore, Karnataka, India Employment Type: Full-Time | Hybrid Work Model About Eightfold.ai At Eightfold.ai, we're transforming the future of work by leveraging artificial intelligence to connect individuals with career opportunities based on their skills and potential, not just their network. Our Talent Intelligence Platform powers a more diverse, inclusive workforce by helping organizations plan, hire, develop, and retain top talent. With $410M+ in funding and a $2B+ valuation, we are revolutionizing how the world thinks about skills, potential, and careers. If you're passionate about cutting-edge technology, infrastructure, and creating scalable solutions that impact the world, we want you to join us. The Opportunity We're looking for a Staff Engineer to join our Core Infrastructure Team and help scale the backbone of Eightfold's platform. This high-impact role will involve designing, building, and optimizing foundational systems that power everything from search and machine learning infrastructure to developer platforms and observability tools. You will drive system design across our stack and mentor engineering teams to build scalable, resilient systems that enable Eightfold to grow and deliver AI-powered solutions for our customers. What You'll Own & Drive Architect & Scale Core Systems: Design and build scalable infrastructure systems that support Eightfold's AI-driven products, including search, compute, storage, and machine learning infrastructure. Cross-Functional Leadership: Lead cross-team technical initiatives, collaborating with Product, Security, Data, and Platform teams to align with company-wide goals. Hands-On Development: Contribute directly to system design, code reviews, and incident response, ensuring best practices are followed. Mentorship & Leadership: Guide and mentor engineers to help them grow into future leaders, fostering a culture of technical excellence across teams.
Advocate for Engineering Excellence: Champion best practices across areas such as cloud architecture, CI/CD, security, and observability. Solve Complex Infrastructure Challenges: Tackle problems around reliability, scalability, and infrastructure performance, ensuring the systems are robust and perform well at scale. Bring Emerging Tech to Life: Stay on top of the latest trends and technologies, incorporating new scalable design patterns into our architecture. What You Bring 10+ years of experience in backend or infrastructure engineering, with a strong background in building distributed, cloud-native systems. Proven track record in designing and delivering reliable, high-scale services (ideally in AWS, GCP, or Azure environments). Expertise in Infrastructure Technologies: Deep knowledge of containerization, orchestration (Kubernetes), and infrastructure-as-code. Experience with one or more of the following: search infrastructure, ML/AI infrastructure, databases/data warehouses, developer tooling, or platform security. Leadership Experience: A passion for mentoring and guiding engineers, influencing teams and peers, and driving excellence across projects. Strong communication skills, able to translate complex technical challenges into strategic business impact. (Bonus) Experience with SRE principles, cloud security, and compliance for enterprise/government environments. Our Engineering Culture At Eightfold, we believe in ownership over tasks. You won't just be given directions; you'll be trusted to take responsibility and make a measurable impact. We have a growth mindset and continuously improve in all aspects of our work. Collaboration, transparency, and speed are core to everything we do. You'll work in a dynamic, supportive environment where your work directly influences the success of the company and its mission. Meaningful Work: Help shape the future of work by building products that impact careers and businesses globally.
Growth Opportunities: Be part of a rapidly scaling company where your contributions are highly valued. Competitive Compensation: Attractive salary, equity, and comprehensive benefits package (including medical, vision, and dental coverage). Hybrid Work Model: Work from our Bangalore office twice a week, with flexibility for remote work. Inclusive Culture: We are committed to fostering a diverse and inclusive work environment where everyone feels valued. Equal Opportunity Employer Eightfold.ai is an Equal Opportunity Employer. We do not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, or disability. If you're a hands-on, innovative engineer with a passion for building scalable systems and tackling infrastructure challenges, we want to hear from you.
Engineering Manager- Platform Engineering
Meesho
Engineering Manager, Platform Engineering Location: Bangalore, Karnataka | Department: Tech About the Team At Meesho, we support 5% of Indian households with high-scale e-commerce solutions, and we do it with zero downtime. We value speed over perfection, embrace failures as learning opportunities, and empower teams with a Founder's Mindset. As part of the Platform Engineering team, you'll be building resilient, low-latency, high-throughput systems that serve millions of users daily. We invest in the growth of every engineer through continuous feedback, open communication, and a supportive culture. And yes, we know how to party as hard as we code. About the Role We are looking for a skilled Engineering Manager, Platform Engineering to lead a team responsible for designing, scaling, and optimizing our core infrastructure. This role involves managing large-scale distributed systems, fostering engineering excellence, and collaborating across teams to drive innovation. You'll ensure technical quality, delivery speed, and scalable architecture for all projects under your ownership. What You Will Do Design and allocate technical tasks while maintaining Meesho's engineering standards. Own execution of platform projects from inception to deployment, ensuring scalability and reliability. Conduct regular 1:1s, drive feedback cycles, and support career growth of engineers. Partner closely with Product and Design teams to develop new platform capabilities. Coach engineers on best practices for architecture, performance, and scalability. Monitor project health, sprint progress, and engineering KPIs. Foster a high-performing team culture with strong engineering ownership. What You Will Need Bachelor's or Master's degree in Computer Science or a related technical field. 8+ years of professional software development experience, including 1+ year in team management. Proven experience building large-scale distributed systems.
Strong coding skills in Java, Python, or Go, and multithreading expertise. Deep understanding of messaging systems (Kafka, etc.), transactional and NoSQL databases. Experience working on cloud platforms like GCP or AWS. Exceptional communication, leadership, and stakeholder management skills. Good to have: Exposure to Elasticsearch, data pipelines, or stream processing systems. About Us Meesho is India's leading e-commerce platform built for the next billion users. With 1.75M+ sellers and a customer base spread across every serviceable pin code, we are democratizing internet commerce by enabling small businesses to sell online at zero commission and with the lowest logistics costs in the industry. From affordable products that reflect local demand to a robust pan-India tech infrastructure, Meesho is transforming how India shops and sells online. Our Culture & Total Rewards At Meesho, we believe in creating a culture of impact, inclusion, and innovation. Our values, reflected in 11 guiding principles or "Mantras", shape how we work, collaborate, and grow together. Why You'll Love Working Here: Compensation: Competitive salary with equity-based rewards tailored to your experience and impact. Wellness: Extensive health insurance for you and your family through our MeeCare Program, mental wellness support, gym discounts, and more. Flexibility & Leave: Generous time off, parental benefits, and relocation support. Growth & Learning: Continuous learning through workshops, internal mobility, and performance coaching. Culture of Recognition: Personalized gifts, fun rituals, and regular engagement programs celebrating wins big and small. Join us to build the platform powering the future of digital commerce in India. Apply now and be part of a tech-first, people-driven journey at Meesho. Qualification: Bachelor's or Master's degree in Computer Science or a related technical field.
Site Reliability Developer 2/3
Oracle
Job Description: Site Reliability Engineer - OCI Cloud Engineering Team Role: Site Reliability Engineer (SRE) Team: OCI OLTP (Online Transaction Processing) Location: Kiev Career Level: IC2 Experience: 5+ years Overview: Oracle Cloud Infrastructure's (OCI) OLTP organization is seeking a Site Reliability Engineer (SRE) to join our dynamic and fast-paced Cloud engineering team. The team is responsible for mission-critical distributed systems and cloud services, and we are looking for an engineer who is deeply interested in databases, distributed systems, and cloud services. If you thrive in an environment where innovation, problem-solving, and operational excellence intersect, this is an exciting opportunity for you! As a member of the SRE services, you will focus on Cloud Services, building deployments, operations, security vulnerability mitigation, and automation. You will be instrumental in fostering a culture of Site Reliability Engineering (SRE) within the team, and your work will directly contribute to ensuring the stability, performance, and reliability of Oracle's global cloud service infrastructure. This role requires someone who is adaptable, highly motivated, and capable of managing large-scale cloud environments with a focus on continuous improvement. Key Responsibilities: Cloud Service Operations & Reliability: Deploy, operate, and maintain large-scale cloud service products in a highly available, fault-tolerant, and scalable environment. Collaborate with internal teams to identify and mitigate cross-team issues that pose operational risks to cloud services. Focus on systems reliability and ensure the continuous availability of cloud services by automating tasks and eliminating manual interventions. Automation & Improvements: Automate operational tasks and improve service deployments, focusing on scaling, performance, and uptime. Contribute to CI/CD systems, ensuring seamless integration and continuous delivery for cloud-based services. 
Leverage automation tools such as Terraform, Grafana, and Bitbucket to streamline operations. Security & Incident Response: Mitigate security vulnerabilities within cloud services and ensure compliance with Oracle's security standards. Participate in on-call rotations to provide immediate troubleshooting support and ensure rapid issue resolution. Perform deep analysis of service performance and collaborate with team members to diagnose and resolve issues that affect service availability or performance. Collaborative Problem-Solving: Work closely with cross-functional teams, including development, database, networking, and storage experts, to ensure the reliability and performance of services. Identify systemic issues and potential risks, develop solutions, and ensure proper documentation and communication with stakeholders. Documentation & Knowledge Sharing: Contribute to documentation such as runbooks, operational guides, and troubleshooting manuals. Mentor junior engineers and share knowledge on best practices for site reliability engineering and cloud service operations. Continuous Learning: Stay up to date with new cloud technologies, trends, and best practices, and actively implement them in your day-to-day work. Technical and Professional Requirements: Cloud Services & Infrastructure: 5+ years of experience in SRE, DevOps, or Automation roles with a focus on large-scale infrastructure and cloud services. Hands-on experience with cloud platforms (e.g., OCI, AWS, Azure) and expertise in compute, database, networking, and storage services within cloud environments. Automation & Tooling: Proficiency with automation tools such as Terraform, Grafana, LumberJack, and Shepherd. Solid experience in using CI/CD tools and processes for cloud service deployments and operations. Scripting & Systems: Strong knowledge of scripting languages, particularly Python and Java. 
Familiarity with Linux systems, Docker containers, virtualized infrastructure, and orchestration (e.g., Kubernetes). Performance & Troubleshooting: Excellent troubleshooting skills with a focus on performance, availability, reliability, and scalability of distributed systems. Experience in operating fault-tolerant, highly available, high-throughput distributed systems. Security & Incident Management: Familiarity with security practices and mitigating security vulnerabilities in cloud services. Proven ability to handle incident response and provide efficient troubleshooting during on-call rotations. Collaboration & Communication: Strong verbal and written communication skills, capable of working effectively with diverse teams across multiple geographies. Ability to work in a highly collaborative environment, driving operational excellence and customer satisfaction. Preferred Qualifications: Experience in operating and maintaining multi-tenant, cloud-based infrastructure with a focus on scalability and high availability. Familiarity with tools and platforms like Grafana, Prometheus, and other observability and monitoring tools. Experience in networking and storage technologies in a cloud environment. Joining OCI's OLTP team as an SRE gives you the opportunity to work with cutting-edge technologies and contribute to the operational excellence of Oracle's global cloud infrastructure. This is a chance to grow your skills in a highly dynamic environment and to solve complex problems that directly impact mission-critical cloud services. With a focus on automation, scalability, and high performance, you will be an essential part of a team that powers Oracle's leading cloud services. If you are an experienced engineer passionate about cloud technologies, automation, and ensuring the reliability of large-scale systems, we encourage you to apply and join us in this exciting journey!
Senior Engineer - IT Software Development & Operations
Sasken Technologies
Job Title: Senior Engineer - IT Software Development & Operations Location: Bengaluru Job Summary The Senior Engineer will be responsible for applying their technical expertise in various aspects of software development and operations, including design, coding, testing, documentation, and technical support. This role requires the ability to handle complex issues, adapt existing methods to solve problems, and deliver results with minimal supervision. The ideal candidate will have strong collaboration skills, consistently seek to improve their technical capabilities, and actively participate in technical initiatives to enhance organizational success. Roles & Responsibilities Design & Development: Responsible for the design, coding, testing, bug fixing, documentation, and technical support within the assigned area. Ensure timely delivery of solutions while meeting quality and productivity goals. Collaboration & Customer Interaction: Regularly collaborate with customer teams to clarify technical issues, resolve queries, and ensure smooth project execution. Participate in key project and work-related activities, providing input on identifying important issues and risks. Process Improvement: Actively seek opportunities to enhance existing skills and acquire new complex technical skills. Participate in technical initiatives related to the project and organization, delivering training and contributing to process improvements. Project Execution: Adhere to organizational guidelines and checklists during deliverable reviews. Provide regular status reports to the Team Lead and ensure that relevant organizational processes are followed. Skill Development: Enhance technical capabilities by attending training sessions, engaging in self-study, and undergoing periodic technical assessments. Education and Experience Education: Engineering Graduate, MCA, or equivalent. Experience: 2-5 years of relevant experience. 
Competencies Description Digital Automation Engineer: Experienced in designing and implementing engineering processes and automation across phases of the DevOps-based SDLC, including Configuration Management, Build & Release, Test Automation, Deployment, Infrastructure Automation, and Continuous Operations. Configuration Management Specialist: Design, configure, and implement version control, branching, and configuration strategies using source code and version control systems like GIT, GitLab, BitBucket, SVN, CVS, Clearcase. Build Automation Specialist: Experience in Continuous Integration (CI) and Build Automation tools like Jenkins, Bamboo, ANT, Maven, Gradle. Test Automation Specialist: Experience in designing and authoring Test Automation scripts for Mobile, Web, Cross-platform, Web Services, Microservices, and infrastructure testing. Proficient in Black Box, White Box, Functional, Performance, UI, Security, and Regression testing, along with experience in BDD frameworks and device test clouds like Sauce Labs and Xamarin Test Cloud. Deployment Specialist: Expertise in release management strategies, managing package repositories, AMIs, and deploying applications and service packages across cloud and container-based infrastructure. Infrastructure Automation Specialist: Expertise in designing and implementing programmable infrastructure on virtualized and cloud-based environments. Ability to manage IaaS, Configuration Management, Container Management, and Environment Management across cloud platforms (AWS, Azure, etc.). Continuous Operations Specialist: Design, implement, and operate elastic infrastructure, manage application and service monitoring, failover scenarios, scalability, SLAs, and operational dashboards across cloud and virtualized environments. 
Platforms Linux, Windows, Android, iOS, VMware, OpenStack, Hyper-V Technology Standards AWS, Azure, RESTful APIs, SOAP, Test-Driven Development (TDD), Microservices patterns, Service Mesh, CloudFormation templates. Tools Configuration Management: GIT, GitLab, BitBucket, SVN, Clearcase, Perforce. Build Tools: GNU Make, NMake, ANT, Maven, Gradle, Ivy. CI Tools: Jenkins, Bamboo, CircleCI, AWS DevOps tools, Azure DevOps. Requirement Management: Bugzilla, Jira. Code Review: Gerrit, GitLab, ReviewBoard. Containers: Docker, Docker Swarm, Kubernetes, ECS (Amazon), AKS (Azure). Automation & Configuration Management: Ansible, Chef, Puppet. Cloud-Native DevOps Services (AWS, Azure): Cloud-Native DevOps Services. Testing Tools: Appium, Visual Studio App Center, SauceLabs, Selenium, Black Duck, SOAP UI, Protractor, JUnit, NUnit, LoadRunner, JMeter. Monitoring & Dashboarding: Prometheus, ELK Stack, Grafana. Languages Scripting Languages: Perl, Python, Groovy, Shell Script, PowerShell, YAML, Ansible. Other Programming Languages: Java, C#, XML. Test Automation Languages: Java, Python (for Appium and Sauce Labs). Specialization Key Areas: Configuration Management, Test Automation, Build and Release Automation, Infrastructure Automation, Continuous Operations, Deployment, RPA (Robotic Process Automation). Desired Skills Strong collaboration and communication skills. Ability to manage multiple projects and tasks while ensuring quality delivery. Experience working in an agile development environment. Proactive in identifying and resolving technical challenges. Strong analytical and problem-solving abilities. This is an exciting opportunity for a skilled Senior Engineer to advance their career in the IT Software Development and Operations domain, work on innovative projects, and gain experience across cutting-edge technologies. Qualification : Engineering Graduate, MCA, or equivalent.
Senior DevOps / Site Reliability Engineer
Blue Yonder
Job Title: Senior DevOps / Site Reliability Engineer Location: Pune, India Company: Blue Yonder Experience: 10 to 13 years Education: Bachelor's Degree in Computer Science, Engineering, or related STEM fields Company Overview Blue Yonder is a leading AI-driven Global Supply Chain Solutions provider and consistently recognized as one of Glassdoor's Best Places to Work. We are driving the next wave of digital transformation in manufacturing and retail, delivering innovative SaaS solutions that power intelligent supply chains across the globe. We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to lead the design, development, deployment, and operational management of our Azure SaaS solution. This role requires strong DevOps, cloud delivery, and infrastructure automation expertise, along with leadership capabilities to guide a growing global team. Role Overview In this role, you will be responsible for architecting, planning, and executing end-to-end delivery pipelines, supporting both product development and operational stability. Working closely with platform, product, and architecture teams, you will implement best-in-class DevOps and SRE practices, ensuring scalability, resilience, and cost optimization. Key Responsibilities Architect, design, and manage CI/CD pipelines and infrastructure for a cloud-native, multi-tenant SaaS solution on Azure. Lead sprint planning, backlog grooming, and architecture discussions. Develop quality automation scripts and tools to reduce manual efforts and enable self-healing, self-service capabilities. Identify and resolve operational bottlenecks and proactively improve observability (monitoring, alerting, logging). Participate in code reviews, ensure secure and scalable designs, and mentor junior and mid-level engineers. Collaborate with stakeholders to understand business and technical requirements and translate them into actionable user stories. Implement and enforce cloud cost optimization strategies. 
Conduct post-incident reviews with a blameless culture to identify root causes and drive continuous improvements. Automate service requests and standard operational procedures. Drive improvements to the team's continuous integration pipeline, ensuring rapid and reliable deployments. Stay updated with the latest DevOps, SRE, and cloud technologies and bring innovative ideas to the table. Participate in team hiring and actively contribute to onboarding new team members. Technical Environment Languages: Java, Python, PowerShell, Shell Scripting DevOps Tools: Azure DevOps, GitHub Actions, Jenkins Cloud: Microsoft Azure (ARM Templates, AKS, Event Hub, HDInsight, Azure AD, Application Gateway, Virtual Networks) Architecture: Microservices, Kubernetes, Docker, Event-driven architecture Frameworks: Spring Boot, Hibernate Monitoring & Logging: Elasticsearch, Spark, Kafka Databases: RDBMS, NoSQL Version Control: Git Required Skills & Experience Bachelor's Degree (STEM preferred) with 10 to 13 years of experience in DevOps, Cloud Delivery, or Site Reliability Engineering. Proven hands-on experience with Azure Cloud Services. Expertise in setting up and optimizing CI/CD pipelines. Strong scripting experience: Shell and PowerShell are mandatory; Python is a plus. Strong understanding of container technologies (Docker, Kubernetes) and microservices architecture. Experience integrating and managing third-party monitoring and logging tools. Strong problem-solving skills and ability to work with global, cross-functional teams. Excellent communication and stakeholder management skills. Nice to Have Development experience in Java or Python. Experience working in agile teams with a product-centric mindset. Experience working in manufacturing or retail domains. Exposure to AI/ML-driven monitoring and observability tools. Work with cutting-edge technologies on globally impactful solutions. Collaborate with diverse and talented teams across the US, India, and the UK. 
Foster your career growth through mentorship, continuous learning, and leadership opportunities. Experience an inclusive, flexible work culture where innovation and creativity thrive. Diversity, Inclusion, Value & Equality (DIVE) At Blue Yonder, we are committed to building an inclusive environment where everyone feels empowered to be themselves. All qualified applicants will receive consideration for employment regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. Qualification : Bachelor's Degree in Computer Science, Engineering, or related STEM fields
Senior Site Reliability Engineer
Couchbase
Job Title: Site Reliability Engineer (SRE), Cloud Platform & Production Pipeline Initiatives Location: Bangalore, India (Office-based role) About Couchbase: As industries race to embrace AI, traditional database solutions fall short of rising demands for versatility, performance, and affordability. Couchbase is leading the way with Capella, the developer data platform for critical applications in our AI-driven world. By uniting transactional, analytical, mobile, and AI workloads into a seamless, fully managed solution, Couchbase empowers developers and enterprises to build and scale applications with unmatched flexibility, performance, and cost-efficiency from cloud to edge. Trusted by over 30% of the Fortune 100, Couchbase is unlocking innovation, accelerating AI transformation, and redefining customer experiences. Come join our mission! Job Overview: As a Site Reliability Engineer (SRE), you will play a pivotal role in managing, optimizing, and maintaining Couchbase's cloud infrastructure for Capella, our Database as a Service (DBaaS) platform. You will be responsible for ensuring the reliability and performance of our cloud service while collaborating closely with engineering teams to improve deployment pipelines, security practices, and overall system health. You will work across cloud platforms and multiple tools to provide guidance, mentorship, and contribute to the strategic direction of cloud operations. Responsibilities: Infrastructure Management: Manage, monitor, and maintain the infrastructure for Capella to ensure reliable operations. Security & Compliance: Implement and manage cloud environments in accordance with company security guidelines, including vulnerability management, penetration testing, and compliance requirements (SOC 2, PCI-DSS, GDPR, HIPAA, etc.). CI/CD & Release Pipeline: Collaborate with engineering teams to optimize CI/CD processes, aiming for a highly resilient deployment strategy, ideally with zero downtime. 
Cloud Optimization: Stay up-to-date with new technologies and industry trends to continuously improve cloud platform architecture and meet the evolving needs of the business. Security Integration: Work with development teams to integrate security scanners within the DevOps lifecycle, enhancing security posture. Leadership & Mentorship: Provide guidance on architecture, code reviews, and technical feedback to improve service reliability, security, cost, and performance. Incident Management: Demonstrate exceptional problem-solving skills, proactively identifying and addressing potential issues before they affect business operations. Collaboration: Partner with development teams, application owners, and stakeholders to integrate best practices and ensure seamless service delivery. Requirements: Experience: 5+ years in Site Reliability Engineering (SRE), DevSecOps, or similar roles, with significant experience working in public cloud environments. Programming & Scripting: Proficiency in languages such as Go, Python, Java, or Ruby. Linux Expertise: High proficiency with Linux operating systems. Kubernetes Management: Experience in managing and maintaining Kubernetes clusters (both self-managed and managed platforms like AWS EKS). Security & Vulnerability Management: In-depth knowledge of security tools and practices (vulnerability management, pen testing, SCA, DAST, SAST), with hands-on experience using tools like Sysdig, Snyk, and Black Duck. Cloud Platforms & Tools: Strong experience with cloud platforms (AWS, GCP, Azure) and open-source tools like Artifactory, Jira, Jenkins, Grafana, Prometheus, Datadog, Thanos, etc. Configuration Management: Proficiency with Terraform, Git, and CI/CD platforms (e.g., CircleCI, GitHub, Spinnaker). Networking Security: Solid understanding of TCP/IP, DNS, HTTP, Firewalls, VPNs, and other networking security concepts. Preferred Skills: Availability & Reliability: Knowledge of SLO/SLA, availability, reliability, and performance concepts. 
Incident Management: Experience with on-call rotations and incident management. Database Experience: Familiarity with databases, particularly Couchbase. Security Certifications: Relevant certifications in security or cloud technologies are a plus. Couchbase reimagines database technology to deliver a fast, flexible, and affordable cloud database platform, empowering developers to build applications with exceptional customer experiences. Trusted by over 30% of the Fortune 100, Couchbase drives innovation and customer success through its Capella platform. Benefits at Couchbase: Generous Time Off Program: Flexibility to care for yourself and your family. Wellness Benefits: Access to world-class medical plans, dental, vision, life insurance, and employee assistance programs. Financial Planning: RSU equity program, ESPP, retirement planning, and business travel insurance. Career Growth: Focused on your career development and success. Fun Perks: Ergonomic and comfortable office setup, food & snacks for in-office employees, and more!
DevOps
Mirafra Technologies
DevOps Engineer Location: Bangalore Experience: 5+ Years Education Qualification: B.E. in Computer Science / Electronics About Mirafra Founded in 2004, Mirafra is a fast-growing global product engineering services company specializing in Semiconductor Design, Embedded Systems, Digital Solutions, and Application Software. With 1,500+ professionals worldwide, we provide cutting-edge solutions to Fortune 500 clients across industries such as Semiconductor, Internet, Aerospace, Networking, Telecom, Medical Devices, and Consumer Electronics. Recognitions: Best Company to Work For, SiliconIndia (2016); Most Promising Design Services Provider, SiliconIndia (2018); Top 10 Admired Companies for Software Services, DigiTech Insight (2022). Key Responsibilities DevOps & Automation Develop automated CI/CD pipelines and manage build & deployment processes. Implement infrastructure automation using scripting (Shell, Batch, Python). Manage configuration, integration, and deployment using DevOps tools. Version Control & Build Management Work with Git, GitLab, Bitbucket for version control. Maintain build systems like Make, CMake and manage dependencies using Pip, Conda, Poetry, Maven. Handle binary management tools like Artifactory, Nexus. Code Quality & Security Utilize Static Code Analysis tools (SonarQube, Pylint, Coverity) for code quality enforcement. Monitor and ensure security compliance in the DevOps lifecycle. Cloud & Containerization Manage cloud-based deployments and monitoring using ELK, Docker, Kubernetes. Implement scalable and resilient infrastructure solutions. Agile & Collaboration Work in an Agile/Scrum environment, collaborating with cross-functional teams. Utilize UML modeling and software development best practices. 
Skills & Qualifications Education: B.E. in Computer Science / Electronics Technical Expertise: Scripting & Automation: Shell, Batch, Python CI/CD & Build Tools: Jenkins, GitLab, Make, CMake Version Control: Git, Bitbucket, GitLab SCM Static Code Analysis: SonarQube, Pylint, Coverity Package Management: Pip, Conda, Poetry, Maven Binary Management: Artifactory, Nexus Cloud & Containerization: Docker, Kubernetes, ELK Stack Programming Languages: Python, C, C++ Operating Systems: Linux, Unix, Windows Soft Skills: Strong problem-solving and analytical skills. Excellent communication and team collaboration. Ability to work in fast-paced Agile environments. Cutting-edge projects in Semiconductor, Aerospace, Networking, and IoT. Global work environment with top-tier clients. Career growth opportunities and exposure to the latest technologies. Award-winning workplace culture and industry recognition. Excited to take on a challenging DevOps role? Apply now!
Software Engineer III, Infrastructure, Core
Google Careers
Job Title: Software Engineer About the Role: At Google, our Software Engineers are at the forefront of innovation, designing and developing cutting-edge technologies that shape how billions of users connect, explore, and interact with information. Our products operate at an immense scale, extending far beyond web search, and require engineers who bring fresh perspectives from diverse technical domains, including information retrieval, distributed computing, large-scale system design, networking, security, artificial intelligence, natural language processing, UI design, and mobile development. As a Software Engineer, you will contribute to mission-critical projects, collaborating with teams across Google to develop, test, deploy, maintain, and enhance software solutions. Your versatility, leadership abilities, and enthusiasm for solving complex challenges will be crucial as you navigate projects across the full technology stack. The Core Team serves as the backbone of Google's technical infrastructure, building the foundational elements behind our flagship products. This team is responsible for developing essential developer platforms, product components, and infrastructure that drive innovation across Google's ecosystem. As a member of this team, you will play a pivotal role in breaking down technical barriers, optimizing existing systems, and making key architectural decisions that influence the entire organization. Key Responsibilities: Design, develop, and maintain high-quality software solutions that support Google's technical infrastructure and products. Participate in and lead design reviews with peers and stakeholders, evaluating available technologies to determine optimal solutions. Conduct thorough code reviews to ensure adherence to best practices, including code quality, efficiency, accuracy, testability, and compliance with style guidelines. 
Contribute to documentation and educational resources, updating content based on product enhancements and user feedback. Troubleshoot and debug complex system issues, analyzing their impact on hardware, networks, and service operations to maintain optimal performance and reliability. At Google, we foster a culture of continuous learning, innovation, and technical excellence. If you're passionate about solving challenging problems and building world-class technology, we invite you to be part of our journey. Qualification : Bachelor's degree or equivalent practical experience.
Systems Development Engineer III, Silicon Infrastructure
Google Careers
Minimum Qualifications: Bachelor's degree in Computer Science or IT-related field, or equivalent practical experience. 3 years of experience with systems automation and tooling using Python, Go, or any other programming language. 3 years of experience in managing technical infrastructure (e.g., deployment, maintenance, troubleshooting). 3 years of experience in Linux Internals, Networking, and Systems administration. Preferred Qualifications: 6 years of experience in implementing, troubleshooting, and supporting computing systems. Experience with DevOps tools such as Terraform, Puppet, Ansible, Jenkins. Experience with the ecosystems of cloud service providers. Experience in supporting large-scale complex cloud-based infrastructure. Experience in one of the High Performance Computing (HPC) Schedulers such as LSF, Symphony, NC, or SLURM. Excellent investigative, problem-solving, and communication skills. About the Job: Systems Development Engineering (SDE) at Google is a role where you manage services and systems at scale. SDEs creatively put their engineering discipline to use automating the mundane and reducing toil. We don't just write code to fix bugs but emphasize the development of tools and solutions that fix classes of problems. We know it's hard to control what you can't measure, so we focus on observability: instrumenting first, then turning data into knowledge, and finally knowledge into action. We know that the operational efficiency of Google systems, services, virtual compute environments, and the operating systems that power them impact the environment, not just the bottom line. We know that working together we can do more, and that community matters. Google brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a blame-free environment. 
We promote self-direction to work on meaningful projects while striving to create an environment that provides the support and mentorship needed to learn and grow. Together we engineer and build the infrastructure, tools, access, and telemetry for systems that enable orchestration of Google-scale services. Come build things that matter. In this role, you will be responsible for building and operating the underlying foundation infrastructure used for designing custom silicon. This includes compute platforms, CI/CD, Cloud, groups and Access Control Lists (ACLs), source control, dashboards, resource economy, and bench labs used by silicon engineers in their daily work. Google's mission is to organize the world's information and make it universally accessible and useful. Our team combines the best of Google AI, Software, and Hardware to create radically helpful experiences. We research, design, and develop new technologies and hardware to make computing faster, seamless, and more powerful. We aim to make people's lives better through technology. Responsibilities: Ensure the timely delivery of silicon designs using Google Cloud Platform (GCP) within a fluid environment. Monitor and uphold high standards for internal services on Google Cloud. Provide support, maintain, and deploy team-supported infrastructure and documentation. Collaborate closely with engineering teams to implement and develop on Google Cloud infrastructure. Automate tasks wherever needed. Qualification : Bachelor's degree in Computer Science or IT-related field, or equivalent practical experience.
Engineering Manager - Money (gatekeeper)
Databricks
At Databricks, we are obsessed with Data + AI to solve the world's toughest problems, from security threat detection to cancer drug development. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their missions. Founded in 2013 by the original creators of Apache Spark, Databricks has grown from a tiny corner office in Berkeley, CA to a global organization with over 6500 employees. Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest-growing SaaS companies in the world. The Money team's mission at Databricks is to maximize the value that our customers derive from their investments in data projects. We accomplish this through innovative commercialization strategies, timely & accurate billing, cost optimization tools, intelligent resource usage controls, and cutting-edge engineering. We provide a seamless and consistent set of platforms to enable all Databricks products to reach customers quickly and sustainably. As the first Engineering Manager for Money at Databricks India, you will be key to building a base for one of Databricks' most central engineering orgs. You will own critical components that form the backbone of our business, starting with Databricks resource admission control and usage governance infrastructure. Your role is crucial in helping bring diverse business needs together, including abuse prevention, product commercialization motions, and reliable product availability at scale. You will work closely with infrastructure as well as product teams in bringing critical governance functionality to Databricks customers. 
The impact you will have: Hire great engineers to build an outstanding team and support their career development by providing clear and timely feedback. Ensure high technical standards by instituting processes (architecture reviews, testing) and culture (engineering excellence). Work with engineering and product leadership to build a long-term roadmap. Coordinate execution and collaborate across teams to successfully deliver cross-cutting strategic projects. What we look for: 10+ years of extensive experience with large-scale distributed systems alongside processes around testing, monitoring, SLAs, etc. 5+ years of engineering management experience, building, managing and mentoring high-performing software engineering teams. Demonstrated success in collaborating with multiple cross-functional stakeholders to align system features and architecture, to deliver impactful platform projects. Experience in developing and implementing proactive mechanisms for failure detection and incident prevention. Excellent leadership, communication, and project management skills. BS (or higher) in Computer Science, or a related field. About Databricks Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks. 
Our Commitment to Diversity and Inclusion At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics. Compliance If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone. Qualification: BS (or higher) in Computer Science, or a related field.
Senior Engineering Manager - Backend
Databricks
About Databricks At Databricks, we are passionate about empowering data teams to solve the world's toughest challenges, from creating the next mode of transportation to accelerating medical breakthroughs. We build and run the world's best data and AI infrastructure platform, enabling our customers to leverage deep data insights to transform their businesses. Founded by engineers and customer-obsessed, we thrive on solving technical challenges, whether it's designing next-gen UI/UX for interacting with data or scaling our services and infrastructure across millions of virtual machines. And we're only getting started. The Opportunity As one of the first Engineering Managers in the Software Engineering team at Databricks India, you will lead a team of talented engineers to build infrastructure and products at scale for the Databricks platform. Our teams work across diverse domains, including: Resource Management Infrastructure: Power big data and machine learning workloads on a scalable, secure, and cloud-agnostic platform. Reliable, Scalable Services and Client Libraries: Handle massive amounts of data across multiple regions and cloud providers. Developer Tools: Help Databricks engineers operate services across different clouds and environments. Services and Infrastructure at the Intersection of Machine Learning and Distributed Systems. The Impact You Will Have Hire and grow a world-class engineering team, fostering a supportive and collaborative environment. Support engineers' career development, providing regular feedback and cultivating future engineering leaders. Ensure high technical standards by implementing effective processes (e.g., architecture reviews, testing) and promoting a culture of engineering excellence. Collaborate with engineering and product leadership to define and execute long-term roadmaps. Coordinate execution and unblock cross-team projects, ensuring timely delivery and alignment with business objectives.
What We Look For 12+ years of experience with large-scale distributed systems, including expertise in testing, monitoring, and defining SLAs. Proven track record as a Software Engineering Leader, with experience building and scaling engineering teams from the ground up. Extensive experience managing high-performing software engineering teams. Strong collaboration skills to partner with Product Management, Sales, and Customers in developing innovative features and products. BS (or higher) in Computer Science or a related field. About Databricks Databricks is the data and AI company, trusted by over 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and more than 50% of the Fortune 500. We help unify and democratize data, analytics, and AI through the Databricks Data Intelligence Platform. Headquartered in San Francisco, Databricks was founded by the original creators of Apache Spark, Delta Lake, MLflow, and the Lakehouse architecture, with offices across the globe. Qualification: BS (or higher) in Computer Science, or a related field.
Senior Performance Engineer
Boomi Software
Senior Performance Engineer Are you ready to work on world-changing technologies? Today, organizations need to move with increased agility and insight to grow and thrive. Boomi is one of the hottest tech companies in the SaaS/Cloud industry, named a Leader for the eighth year in a row in the Gartner Enterprise iPaaS Magic Quadrant and recently recognized by Inc. Magazine as one of the best workplaces. Our award-winning, patented technology is transforming the world of integration by making enterprise-class integration technology accessible and affordable to companies of all sizes. Boomi provides the foundation on which your business can evolve and innovate. According to a recent survey by Vanson Bourne, connected businesses are far outpacing their competitors. We help organizations connect everything and engage everywhere across any channel, device or platform. More than 7,000 organizations are using Boomi to run better, faster and smarter. Working at Boomi means doing what you love. We hire trailblazers with an entrepreneurial spirit who can solve challenging problems, make a real impact in technology and want to build something big. If you are passionate about solving hard problems, enjoy working with world-class people and developing cutting-edge technology, you should explore a career with Boomi. Learn more at http://www.boomi.com/ or visit Boomi Careers. Join us as a Performance Engineer on our Performance, Scalability and Resiliency (PSR) Engineering team in Bangalore/Hyderabad, India to do the best work of your career and make a profound social impact. What you'll achieve As a Performance Engineer, you will be responsible for validating and recommending performance optimizations in Boomi's computing infrastructure and software. You will work with our Product Development and Site Reliability Engineering teams on Performance monitoring, tuning and tooling.
You will: Analyze Software Architecture (monolith and micro-service) and identify potential areas of performance, scalability and resiliency improvements. Identify KPIs, perform trending and analysis, identify patterns and engineer remedial solutions for a highly performant, fault-tolerant and resilient platform and application stack. Design, automate and perform scalability and resiliency tests using various tools like JMeter, Chaos Monkey or similar. Use the observability stack to improve diagnosability and trending around Performance bottlenecks. Identify performance tuning opportunities and recommend remedial solutions. Take the first step towards your dream career Every Boomer brings something unique to the table. Here's what we are looking for with this role: Essential Requirements Expert in performance engineering fundamentals: arrival rate, workload models, responsiveness, computing resource utilization, time complexity, scalability, resiliency, etc. Expert in monitoring performance using native Linux OS, Application Performance Management (APM) and Infrastructure monitoring tools. Experience in analyzing crash dumps, thread dumps and SQL slow query logs to identify performance bottlenecks. Expert in recommending optimal resource configurations in Cloud, Virtual Machine, Container and Container Orchestration technologies. Flexibility to work in a remote and geographically distributed team environment. Desirable Requirements Experience in writing data extraction and custom monitoring tools using any programming language: Java, Python, R, Bash or similar. Experience in capacity planning and modelling using AI/ML, queueing models or similar approaches. Performance tuning experience in Java or similar application code.