GitOps Jobs in Pune
5 Jobs Found
DevOps SRE Manager
Talentica Software (i) Pvt. Ltd.
About Talentica Software: Talentica Software is a boutique software development company founded by industry veterans and alumni from IIT Bombay. We specialize in helping startups build innovative products by leveraging the latest tools and technologies to solve real-world challenges. With over 21 years of experience, we've partnered with 180+ startups, primarily in the US, and contributed to numerous successful exits. In 2022, Talentica Software was recognized by Great Place to Work as one of India's Great Mid-Size Workplaces.

What We're Looking For: We are seeking a DevOps SRE Manager to lead our cloud operations, with a primary focus on Google Cloud Platform (GCP) and secondary support for AWS. In this role, you will manage two critical teams: one DevOps team responsible for GCP infrastructure, and a CloudOps/SRE team ensuring 24/7 uptime for our mission-critical services. This position requires a blend of technical expertise, leadership skills, and customer relationship management. You'll be responsible for ensuring the reliability, scalability, and security of our infrastructure while overseeing smooth cloud operations.

What You'll Be Doing: As a DevOps SRE Manager, your responsibilities will include:
Managing GCP Operations: Oversee DevOps operations within Google Cloud Platform using tools like Terraform, Kubernetes (GKE), Prometheus, and Grafana.
Infrastructure Automation: Ensure timely execution of tasks and optimize infrastructure automation to improve operational efficiency.
CI/CD Enhancement: Drive improvements to CI/CD pipelines, enforce cloud security best practices, and enhance software delivery processes.
System Reliability: Improve system reliability through advanced monitoring, logging, and alerting solutions.
Cloud Optimization: Optimize cloud infrastructure for cost-effectiveness, scalability, and security, ensuring long-term operational efficiency.
Leading CloudOps/SRE Teams: Manage a 24x7 CloudOps/SRE team focused on maintaining service uptime and providing prompt incident response.
Incident Management: Lead incident management processes, including conducting Root Cause Analysis (RCA) and ensuring adherence to SLAs.
Implement Observability Best Practices: Utilize Grafana, Prometheus, and Opsgenie to implement observability best practices.
Promote Automation: Foster self-healing, automated infrastructure to reduce manual interventions and improve operational efficiency.
Customer Relationship Management: Build and maintain strong customer relationships through transparent and clear communication.
Mentorship and Leadership: Lead and mentor cross-functional teams of DevOps and CloudOps/SRE engineers, ensuring high productivity, continuous professional growth, and performance reviews.
AWS Support: Provide basic-to-intermediate support for AWS services (IAM, EC2, S3, Lambda, CloudFormation) and assist in hybrid cloud integration when required.

To Be Successful in This Role, You Should Have:
Qualifications: BE/BTech from a reputable engineering institute.
Experience: 8-12 years of experience in DevOps, CloudOps, or SRE roles.
Technical Expertise:
Primary Cloud Platform: Expertise in Google Cloud Platform (GCP).
Secondary Cloud Platform: Experience with AWS.
Infrastructure as Code (IaC): Strong experience with Terraform.
Containerization & Orchestration: Hands-on experience with Kubernetes (GKE).
CI/CD & Automation: Expertise in tools such as Jenkins, GitOps, and Ansible.
Monitoring & Observability: Proficient in Prometheus and Grafana.
Incident & Alerting: Familiarity with Opsgenie.
Big Data & Streaming: Experience with Kafka, Airflow, and Druid.
AWS Services: Experience with IAM, EC2, S3, Lambda, CloudFormation, and CloudWatch.
Additional Skills: Proven experience managing 24x7 operations and multi-cloud environments. Hands-on expertise with GCP infrastructure, Terraform, Kubernetes, and CI/CD pipelines.
Experience with incident management, RCA, monitoring, and alerting. Strong understanding of reliability engineering, automation, and cloud security best practices.

Bonus Points If You Have: Experience working with Kafka, Airflow, and Druid in large-scale environments. Certifications such as GCP Professional DevOps Engineer, AWS Solutions Architect, or Kubernetes. Working knowledge of AWS cloud services, especially in hybrid-cloud environments.

What You'll Find Here:
A Culture of Innovation: We focus exclusively on cutting-edge development. Our clients seek our expertise for innovative solutions, not maintenance work.
Endless Learning Opportunities: Constantly expand your skills and stay on top of the latest trends and advancements in cloud technologies.
Talented Peers: Work alongside top-tier engineers from India's best institutes (IITs, NITs, and others), fostering a collaborative and growth-oriented environment.
Work-Life Balance: We value your well-being and offer flexible schedules and remote work options to help you maintain a healthy work-life balance.
A Great Culture: 82% of our employees recommend Talentica to their peers (according to Glassdoor), which speaks to the positive work environment we've built.
Recognition & Rewards: We celebrate success and ensure that your contributions are recognized and appreciated.

At Talentica, we invite you to take ownership of large-scale, impactful projects and work with cutting-edge technologies. If you're ready to make a real difference in shaping the future of our industry, we'd love to have you join us.
Senior Cloud Engineer
Allianz Technology
Job Description: Senior Cloud Engineer

We are seeking a highly skilled Senior Cloud Engineer with extensive experience in software development and DevOps to lead our efforts in automating Cloud infrastructure. The ideal candidate will focus on building automation for Cloud Landing Zones and collaborate with cross-functional teams to ensure the successful implementation and maintenance of cloud solutions.

Responsibilities: Design, implement, and maintain scalable and efficient cloud-based solutions on AWS and Azure. Lead initiatives to automate cloud infrastructure. Collaborate with teams to integrate best practices in development, code quality, and automation. Guide and mentor development teams, providing expertise in DevOps and automation practices. Contribute to the design and implementation of cloud applications using serverless architectures, Kubernetes, and event-driven patterns. Develop and maintain CI/CD pipelines to streamline deployments, utilizing GitOps methodologies. Apply security best practices to design and implement secure authentication and authorization mechanisms. Monitor and optimize the performance, scalability, and reliability of cloud applications. Stay updated with the latest cloud technologies and development trends, applying new tools and frameworks as needed. Ensure software systems meet functional and non-functional requirements while adhering to best practices in software design, testing, and security. Foster continuous improvement by sharing knowledge, conducting team reviews, and mentoring junior developers.

Requirements: Proven experience as a Cloud Engineer or similar role, with a strong focus on AWS (Azure is a plus). Solid experience in software development and DevOps practices. Expertise in AWS/Azure infrastructure automation. Proficiency in programming languages such as Python, Golang, or JavaScript. Experience with serverless architectures, Kubernetes, and event-driven patterns.
Knowledge of CI/CD pipelines and GitOps methodologies. Strong understanding of cloud security best practices. Excellent problem-solving skills and ability to work collaboratively in a team environment. Strong communication skills and the ability to convey complex technical concepts to non-technical stakeholders.

Preferred Qualifications: Experience in designing and working with NoSQL databases such as DynamoDB. Experience in leading and mentoring development teams. Expertise in software architecture, development, and systems testing with a strong focus on cloud technologies. Strong technical guidance and decision-making abilities to shape solutions and enforce development best practices. Proficient in applying quality gates, including code reviews, pair programming, and team review meetings. Experience in code management and release processes, with familiarity in Monorepo and Multirepo strategies. Solid understanding of functional programming principles, including list/map/reduce/compose techniques and familiarity with monads. Knowledge of SDLC, and adherence to DRY, KISS, and SOLID design principles. Proficient in managing security protocols such as ABAC, RBAC, JWT, SAML, AAD, and OIDC for authentication and authorization. Expertise in event-driven architecture, including queues, streams, batches, and pub/sub systems. Strong understanding of scalability, concurrency, and distributed systems. Experience with cloud networking and proxies. Expertise in CI/CD pipelines, GitFlow, and GitOps frameworks like Flux and ArgoCD. Polyglot programmer with expert-level proficiency in at least two languages (e.g., Python, TypeScript, GoLang). Experience in operating Kubernetes clusters from a developer's perspective, including custom CRDs, operators, and controllers. Experience in building serverless cloud applications. Strong team player with the ability to communicate and collaborate well in a fast-paced environment.
Proficient in using GitHub for version control, code reviews, and collaborative development. Experience working in agile teams, participating in sprints, and collaborating effectively in cross-functional teams. Fluency in UI development using React, Hooks, and TypeScript is a plus. Deep knowledge of AWS cloud services, with a basic understanding of Azure as a plus. Experience in developing and managing cloud infrastructures using Crossplane.io is a plus. Knowledge equivalent to AWS Certified DevOps Engineer Professional is a plus.
Senior Cloud Engineer
Allianz
Job Title: Senior Cloud Engineer

Job Description: We are seeking a highly skilled Senior Cloud Engineer with extensive experience in software development and DevOps to lead efforts in automating Cloud infrastructure. The ideal candidate will focus on building automation for Cloud Landing Zones and collaborate with cross-functional teams to ensure the successful implementation and maintenance of cloud solutions. You will leverage your expertise in cloud technologies, coding practices, and automation to help scale and optimize the organization's cloud infrastructure.

Responsibilities:
Cloud Infrastructure Design & Implementation: Design, implement, and maintain scalable and efficient cloud-based solutions, primarily on AWS and Azure. Lead initiatives to automate cloud infrastructure, improving efficiency and reducing manual interventions.
Collaboration & Best Practices: Collaborate with cross-functional teams to integrate best practices in development, code quality, and automation. Mentor and guide development teams, offering expertise in DevOps and automation practices.
Cloud Application Development: Contribute to the design and implementation of cloud applications, leveraging serverless architectures, Kubernetes, and event-driven patterns. Ensure high-quality design, security, and testing practices in cloud-based applications.
CI/CD & Automation: Develop and maintain CI/CD pipelines for seamless, automated deployments using GitOps methodologies.
Security: Apply security best practices in cloud application architecture, including designing secure authentication and authorization mechanisms.
Performance Optimization: Continuously monitor and optimize the performance, scalability, and reliability of cloud applications and infrastructure.
Innovation: Stay updated with the latest cloud technologies, trends, and tools to drive innovation and improve the organization's cloud architecture.
Continuous Improvement: Promote continuous improvement by sharing knowledge, conducting team reviews, and mentoring junior developers.

Requirements:
Experience: Proven experience as a Cloud Engineer or in a similar role with a strong focus on AWS (experience with Azure is a plus). Solid experience in software development and DevOps practices. Expertise in AWS/Azure infrastructure automation.
Technical Skills: Proficiency in programming languages such as Python, Golang, or JavaScript. Experience with serverless architectures, Kubernetes, and event-driven patterns. Strong understanding of CI/CD pipelines, GitOps methodologies, and cloud security best practices.
Collaboration & Communication: Excellent problem-solving skills and ability to work collaboratively within a team. Strong communication skills to convey complex technical concepts to non-technical stakeholders.

Preferred Qualifications:
Cloud Technologies & Architecture: Experience in designing and working with NoSQL databases, such as DynamoDB. Expertise in software architecture, development, and systems testing, with a strong focus on cloud technologies.
Leadership & Mentoring: Experience in leading and mentoring development teams. Ability to guide and influence decision-making in the development and design of cloud solutions.
Quality & Code Management: Proficient in applying quality gates such as code reviews, pair programming, and team review meetings. Experience in code management and release processes, with familiarity in Monorepo and Multirepo strategies.
Technical Competence: Solid understanding of functional programming principles. Knowledge of SDLC, and adherence to DRY, KISS, and SOLID design principles. Expertise in managing security protocols such as ABAC, RBAC, JWT, SAML, AAD, and OIDC for authentication and authorization.
Event-Driven Architecture & Cloud Networking: Expertise in event-driven architecture including queues, streams, batches, and pub/sub systems.
Strong understanding of scalability, concurrency, and distributed systems. Experience with cloud networking and proxies.
Additional Skills: Polyglot programmer with expert-level proficiency in at least two languages (e.g., Python, TypeScript, GoLang). Experience operating Kubernetes clusters from a developer's perspective, including custom CRDs, operators, and controllers. Strong experience in GitHub for version control, code reviews, and collaborative development.
Agile & Development Environment: Experience working in agile teams, participating in sprints, and collaborating effectively in cross-functional teams.
UI Development (Optional): Fluency in UI development using React, Hooks, and TypeScript is a plus.
Cloud Certifications (Optional): AWS or Azure certifications. AWS Certified DevOps Engineer Professional or similar certification is a plus.

What We Offer:
Professional Development: Opportunities for continued professional growth through internal programs, mentorship, and certifications.
Collaborative Work Environment: A culture that fosters innovation, collaboration, and knowledge sharing among teams.
Competitive Benefits: Comprehensive benefits package including health, wellness, and work-life balance programs.
Career Progression: Clear paths for career advancement and personal growth in a rapidly expanding cloud-focused company.

Join our dynamic team and help shape the future of cloud infrastructure! Let's build the cloud solutions of tomorrow, together.
Senior Site Reliability Engineer
NVIDIA
NVIDIA's Infrastructure, Planning and Processes (IPP) organization is seeking a hard-working and experienced Site Reliability/DevOps Engineer, with a strong background in Infrastructure Management, Monitoring, Automation, and System Administration, to join our Sanity Operations Team in Pune. The IPP Org provides Infrastructure, Products & Services for multiple software teams including GPU, Mobile, and Automotive divisions working on NVIDIA's extraordinary products and services. The team is responsible for hosting, enabling, and running the large-scale private cloud systems and services for our in-house Testing CI framework. The cloud hosts a heterogeneous mix of machines and devices with various operating systems (Windows/Linux/Android, etc.), running with NVIDIA GPUs and Tegra Processors.

What you'll be doing: Create resilient, scalable, and efficient test and deployment pipelines. Design and implement complex automation platforms to identify and resolve operational inefficiencies. Triage software, hardware, and infrastructure issues and maintain high availability for our infrastructure and services. Deploy and monitor critical high-performance, large-scale services running on geo-distributed systems. Continuously strive for efficient utilization and management of the infrastructure. Automate processes for enabling developers to adopt self-service practices, while ensuring compliance with security standards. Work with architects and engineers across the teams to review the designs and solutions during development and deployment phases. Collaborate with our other engineering teams to deliver reliable, robust, and high-performance capability of the underlying infra. Mine and analyze data from multiple sources for identifying scaling and optimization opportunities.

What we need to see: Bachelor's or Master's degree in Computer Science, Software Engineering, or equivalent experience with 7+ years of experience in a DevOps environment.
Strong hands-on experience in configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitLab, Jira, etc.). Working experience in monitoring and maintaining large-scale infrastructure applications running in a microservice-based architecture. Proficient with virtualization architecture, with strong experience in Kubernetes, VMs, and Docker. Experience with continuous integration and continuous delivery systems such as GitLab, GitOps, Jenkins, Packer, and Terraform. Strong Python scripting skills, with a proven background of using/writing JSON/REST APIs. Fluency in writing queries for MySQL or equivalent NoSQL databases. Solid understanding of configuration management tools like Chef, Puppet, Ansible, etc. Working experience with Perforce, Git, or any other version control system is necessary. Experience with telemetry and alerting systems such as Kibana, Elasticsearch, Grafana, and Prometheus to create rich visualizations of system health over time. Ability to self-manage, show leadership, mentor others, and communicate well.

Ways to stand out from the crowd: Understanding of networking concepts like TCP/IP and firewall management. Exposure to web apps/dashboards on frameworks like Django, AngularJS, VueJS, etc. High-level understanding of Build and Test systems. Experience in building regression detection systems by analyzing real-time production data, emphasizing important metrics. Innovating with industry-standard tools and collaborating with the open source community.
Lead Site Reliability Engineer (Azure)
EPAM Systems
We are seeking a highly skilled and experienced Lead Site Reliability Engineer with a focus on Azure environments to join our team. In this crucial role, you will leverage your expertise to enhance the reliability and scalability of our cloud-based platforms, ensuring efficient operation and optimal performance. This position involves collaborating closely with cross-functional teams to migrate existing services to the OpenShift platform and make our infrastructure Cloud agnostic. As a leader, you'll guide your team in creating resilient systems and processes that support both internal and external customers relying on our desktop applications and services.

Responsibilities:
Oversee migration of services to OpenShift and work towards making our infrastructure Cloud agnostic.
Run pipelines using Azure DevOps for environment configuration and application deployment.
Leverage Python, Bash, and PowerShell to automate routine and complex tasks.
Implement and manage Kubernetes and container-based environments.
Monitor cloud resources efficiently and improve system performance in line with SLI metrics.
Debug and resolve operational issues swiftly and effectively.
Collaborate with development and operations teams to ensure system reliability and security.
Mentor team members and lead by example in maintaining best practices for site reliability.
Continuously assess, improve, and optimize existing system architecture and applications.
Stay up to date with technological advancements and integrate innovative tools and techniques.

Requirements:
5+ years of experience as a Systems Engineer with a development background.
1+ years of relevant leadership experience.
Proficiency in Linux and Docker with hands-on experience in Kubernetes.
Capability to use at least one of the following scripting languages: Python, Bash, PowerShell.
Background in infrastructure management including networking and operating systems.
Familiarity with monitoring tools in cloud environments and understanding of SLI concepts.
Familiarity with Azure and/or GCP as cloud service providers.

Nice to have:
Experience working with Windows.
Knowledge of CI/CD pipelines, particularly Azure DevOps.
Understanding of Istio and GitOps tools like ArgoCD.

We offer:
Opportunity to work on technical challenges that may have impact across geographies.
Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications.
Opportunity to share your ideas on international platforms.
Sponsored Tech Talks & Hackathons.
Unlimited access to LinkedIn learning solutions.
Possibility to relocate to any EPAM office for short and long-term projects.
Focused individual development.
Benefit package: health benefits, retirement benefits, paid time off, flexible benefits.
Forums to explore passions beyond work (CSR, photography, painting, sports, etc.).