15 PySpark Job Vacancies in Bengaluru - June 2026

Sr. Data Engineer - AWS - Big Data

Infocepts

7-10 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Sr. Data Engineer - AWS - Big Data
Location: Bangalore
Type of Employment: Full-Time
Experience Required: 7 to 10 years
Job Overview: We are seeking a highly skilled Sr. Data Engineer with expertise in AWS cloud technologies and Big Data to join our Cloud Data Architect Team at Infocepts. In this critical role, you will design and implement robust data solutions using technologies like EMR, Athena, PySpark, AWS Lambda, S3, and other AWS services. The ideal candidate will have a strong foundation in database concepts and SQL and will be responsible for building scalable data pipelines to support high-performance data processing.
Key Responsibilities:
Technology Assessment and Design: Study the existing technology landscape and evaluate current data integration frameworks. Assist in designing complex Big Data use cases leveraging AWS services.
Documentation and Stakeholder Communication: Prepare and maintain comprehensive project documentation, adhering to quality guidelines and schedules. Work closely with Architects and Project Managers to provide accurate estimations, scoping, and scheduling assistance. Clearly communicate design decisions and conduct Proof-of-Concepts to validate new solutions before implementation.
Process Improvement and Automation: Identify areas for process automation to improve efficiency and team productivity. Provide expert guidance and troubleshooting support to junior Data Engineers.
Training and Knowledge Sharing: Develop and deliver technology-focused training sessions for the team, ensuring continuous knowledge sharing. Share expertise through Expert Knowledge Sharing sessions with Client Stakeholders.
Essential Skills:
AWS Services Expertise: In-depth knowledge of S3, EC2, EMR, Athena, AWS Glue, and Lambda.
Big Data Technologies: Proficiency with Apache Spark, Databricks, and Big Data table formats such as Delta Lake (open-source).
Data Warehousing: Strong understanding of data warehousing concepts and architectures.
Programming Skills: Advanced programming skills in Python for building data pipelines.
SQL Expertise: Strong SQL skills for data transformation, aggregation, and querying large datasets.
ETL Workflow Development: Expertise in creating ETL workflows with complex transformations (e.g., SCD, deduplication, aggregation).
Orchestration Tools: Familiarity with orchestration tools like Apache Airflow.
MPP Databases: Experience with at least one MPP database (e.g., AWS Redshift, Snowflake, SingleStore).
Cloud Databases: Exposure to cloud databases like Snowflake or AWS Aurora.
Desirable Skills:
Cloud Databases: Familiarity with Snowflake, AWS Aurora.
Big Data Technologies: Experience with Hadoop and Hive.
AWS Certification: Associate or Professional Level AWS Certification.
Advanced Knowledge of Big Data Solutions: Exposure to big data tools and frameworks on cloud platforms.
Qualifications:
Experience: 7+ years of overall IT experience, with 5+ years specifically focused on AWS-related projects.
Educational Background: Bachelor's degree in Computer Science, Engineering, or a related field (Master's degree is a plus).
Technical Certifications: Demonstrated commitment to continuous learning through certifications or relevant training.
Qualities: Strong analytical and problem-solving skills to deep dive into complex technical challenges.

Sr. Data Engineer Sr. engineer Data Engineer

Cloud Data Engineer - AWS Big Data

Infocepts

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Cloud Data Engineer - AWS Big Data
Location: Bangalore, India
Employment Type: Full-time
Experience Required: 5 to 8 years
Purpose of the Position: Join the Infocepts Cloud Data Architect Team as a Cloud Data Engineer and help design and implement cutting-edge big data solutions on AWS. You will leverage your expertise in EMR, Athena, PySpark, S3, AWS Lambda, and SQL to develop robust and scalable data platforms.
Key Responsibilities:
Technology Assessment and Design: Assess existing technology landscape and data integration frameworks. Design complex Big Data use cases using AWS services under guidance of the Architect. Support architectural decision-making by evaluating trade-offs in cost, performance, and durability. Recommend optimizations to existing data infrastructure.
Documentation and Stakeholder Communication: Create project documentation adhering to quality and delivery standards. Collaborate closely with Architects and Project Managers for scoping, estimation, and planning. Present design decisions to technical and business stakeholders clearly. Conduct PoCs and design review sessions.
Process Improvement and Automation: Identify and suggest opportunities for automation and process enhancements. Mentor junior engineers and support technical problem solving.
Training and Knowledge Sharing: Prepare and deliver internal training on AWS and Big Data topics. Lead client knowledge sharing sessions and contribute to case studies.
Essential Skills:
In-depth experience with AWS services: S3, EC2, EMR, Athena, Glue, Lambda
Familiarity with MPP databases like Redshift, Snowflake, or SingleStore
Proficiency in Apache Spark and Databricks
Strong programming skills in Python
Experience building data pipelines using AWS and Databricks
Knowledge of Big Data file formats such as Delta Lake
Advanced SQL skills for large-scale data manipulation
Hands-on experience with Apache Airflow or similar orchestration tools
Strong understanding of ETL workflows and data warehousing concepts
Desirable Skills:
Cloud databases: AWS Aurora, Snowflake
Experience with Hadoop and Hive
AWS Certifications (Associate or Professional level) are a plus
Qualifications:
Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred)
Overall 5+ years of IT experience with at least 3 years in AWS Big Data projects
Ongoing learning and technical certifications are strongly encouraged
Key Qualities:
Strong problem-solving and analytical thinking
Self-driven with a passion for emerging data technologies
Excellent communication and client presentation skills
Ability to work in cross-functional, agile teams
Apply now to be part of a high-impact data transformation team working on large-scale cloud data projects!
Qualification: Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred)

Cloud Data Cloud data Engineer Cloud engineer

Data Engineer

International Business Machines

7+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Azure Data Engineer - IBM Consulting Client Innovation Center
Location: Bengaluru, India
Experience: 7+ Years (Minimum 4+ Years in Azure Technologies)
Job Type: Full-Time
Education Required: Bachelor's Degree (Master's preferred)
Introduction: Join IBM Consulting at our Client Innovation Center in Bengaluru, where we deliver deep technical and industry expertise to a wide range of public and private sector clients. Our delivery centers focus on innovation, agility, and adoption of next-gen technologies to transform businesses locally and globally.
Your Role & Responsibilities:
Design and develop scalable data engineering solutions on Microsoft Azure.
Build, optimize, and manage data pipelines and ETL workflows using tools like Azure Data Factory, ADLS, and Databricks.
Write clean, efficient code using PySpark, PL/SQL, and Spark SQL.
Integrate and manage data from various sources, both structured and unstructured.
Work with SQL, Postgres, Cassandra, and Cosmos DB.
Utilize Azure services such as Stream Analytics, SQL DW, Azure Functions, ARM Templates, and Analysis Services.
Implement and maintain serverless architectures for modern data solutions.
Ensure high-quality data delivery, security, and compliance across systems.
Collaborate across teams and contribute to solution design, code reviews, and documentation.
Required Skills & Experience:
Bachelor's degree in Computer Science, Information Systems, or related technical field.
7+ years total experience in Data Engineering, with 4+ years in Azure-based data projects.
Strong hands-on experience with:
Azure Data Factory, Data Lake (ADLS), Databricks
Programming: Python, PySpark, PL/SQL, Spark SQL
Databases: SQL, Postgres, Cassandra, Cosmos DB
Solid grasp of data warehousing, relational databases, and cloud-based architectures.
Familiarity with version control tools like Git and CI/CD pipelines.
Preferred Skills & Qualifications:
Master's Degree in a relevant field.
Experience with:
ARM Templates, Azure Functions, Serverless Architectures
Object-oriented scripting languages (Python, Scala, etc.)
Excellent problem-solving and communication skills.
Ability to work in a fast-paced environment and collaborate effectively with teams and stakeholders.
What You'll Get:
Work on high-impact, global projects with cutting-edge Azure technologies.
Be part of a collaborative and forward-thinking IBM team.
Opportunities for professional growth, upskilling, and certifications.
A dynamic work culture focused on innovation, agility, and client success.
Join us in Bengaluru and be part of IBM's journey in shaping the future of data engineering.
Qualification: Bachelor's degree in Computer Science, Information Systems, or related technical field.

Data Engineer Data Engineer Platforms Data Platforms

Lead Data Scientist

Playsimple

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Lead Data Scientist
Location: Bangalore North, Karnataka, India
Job Type: Full-Time
Industry: Entertainment / Mobile Gaming
About Us
We are one of India's most exciting and fast-growing mobile gaming companies. Founded in 2014 and partnered with Modern Times Group (MTG), our vision is to create simple, impactful casual game experiences at massive scale. Our portfolio includes evergreen hits such as Daily Themed Crossword, WordTrip, WordJam, WordWars, WordTrek, TileMatch, and Jigsaw. We have built a global network of chart-topping games supported by powerful tech and analytics infrastructure that fuels rapid growth.
Position Summary
As a Lead Data Scientist in our Central Analytics team, you will play a critical role in shaping data-driven strategies that enhance player experience and business performance. This fast-paced role offers abundant opportunities to work alongside product leaders and game teams, transforming complex data into actionable insights that drive user acquisition, engagement, and monetization.
Key Responsibilities
Collaborate closely with product leaders to provide data-driven advisory on strategic decisions.
Partner with game development teams to analyze gameplay data and generate actionable insights that improve user acquisition, engagement, and monetization.
Perform advanced exploratory data analyses and ad-hoc reporting to identify trends, issues, and opportunities across our game portfolio.
Design, execute, and lead data research projects, delivering practical recommendations based on rigorous statistical analyses.
Drive continuous improvement in game performance through innovative machine learning models and analytics techniques.
Requirements
Bachelor's/Master's/PhD degree in Computer Science, Statistics, or a related field.
Proven experience with machine learning, statistical modeling, and data science projects.
Hands-on proficiency in Python and/or Spark for data manipulation, visualization, and building ML models.
Strong SQL skills with experience querying large, complex datasets from data lakes or warehouses.
Demonstrated ability to lead research projects and translate findings into actionable business recommendations.
Excellent interpersonal skills and a collaborative approach to working with cross-functional teams.
Knowledge of Deep Learning frameworks and techniques is highly desirable.
Work with a top-tier gaming company known for its innovative and data-driven culture.
Influence millions of users worldwide through impactful analytics.
Collaborate with talented teams in a high-growth, dynamic environment.
Access to cutting-edge tools and technologies for data science and machine learning.
Competitive compensation and career growth opportunities.
Qualification: Bachelor's/Master's/PhD degree in Computer Science, Statistics, or a related field.

Lead Data Data lead Scientist Data scientist

Chief Analytics Office (CAO) - Data Scientist

International Business Machines

2-4 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Chief Analytics Office (CAO) - Data Scientist
Location: Bangalore, Karnataka, India
Job Type: Full-Time
Experience Level: 2-4 years
Company: IBM Chief Analytics Office (CAO)
Introduction: At IBM, we believe in the power of AI and data science to transform enterprises. In the Chief Analytics Office (CAO), we work on the front lines of AI-driven enterprise transformations. By joining our team, you will collaborate with innovative leaders and contribute to solving complex problems using the latest AI and machine learning technologies. Here, your skills will not only be valued but nurtured in a dynamic, collaborative environment designed for growth and impact.
Role Overview: As a Data Scientist within IBM's Chief Analytics Office, you will play a critical role in driving AI and data science initiatives across the enterprise. You will use your technical expertise to develop and deploy machine learning models and data analytics solutions that align with business objectives. This role is focused on translating data insights into actionable recommendations and leading data science projects from concept to execution.
Key Responsibilities:
Technical Execution and Leadership: Develop and deploy AI models and data analytics solutions. Assist in the implementation and optimization of AI-driven strategies aligned with business stakeholder requirements. Refine data-driven methodologies for enterprise transformation projects.
Data Science and AI: Design, implement, and optimize machine learning solutions and statistical models. Perform end-to-end data analysis, from problem formulation through to deployment. Ensure the scalability of AI solutions by leveraging cloud platforms. Apply IBM's data science standards and reusable assets to ensure quality and consistency.
Project Support: Lead and contribute across various stages of AI and data science projects, from data exploration to model development and deployment. Monitor project timelines and assist in resolving any technical challenges. Design and implement measurement frameworks to assess the business impact of AI solutions using KPIs.
Collaboration: Work closely with stakeholders to ensure alignment with strategic goals. Collaborate with data engineers, software developers, and other team members to integrate AI solutions into existing systems. Provide technical expertise to cross-functional teams, fostering a collaborative environment.
Required Education and Experience:
Education: Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related field. An advanced degree is strongly preferred.
Experience: 2-4 years of experience in data science, AI, or analytics, with a focus on implementing data-driven solutions. Experience with data cleaning, data analysis, A/B testing, and data visualization. Hands-on experience with AI technologies through coursework or projects.
Required Technical Skills:
Programming and Data Analysis: Proficiency in SQL and Python for data analysis and developing machine learning models. Familiarity with machine learning algorithms such as linear regression, decision trees, random forests, gradient boosting (e.g., XGBoost, LightGBM), neural networks, and deep learning frameworks (e.g., TensorFlow, PyTorch).
Cloud Platforms & Data Processing: Experience with cloud-based platforms (e.g., AWS, Azure, IBM Cloud) and data processing frameworks.
Specialized Knowledge: Knowledge of large language models (LLMs) and familiarity with IBM's WatsonX product suite. Familiarity with object-oriented programming (OOP) principles.
Analytical and Soft Skills:
Problem-Solving: Strong analytical skills with the ability to solve complex problems and derive actionable insights from datasets.
Communication: Excellent communication skills to explain technical concepts clearly to stakeholders at various levels.
Project Management: Ability to handle multiple initiatives simultaneously, prioritize effectively, and meet deadlines in a fast-paced environment.
Preferred Experience:
Advanced Degree: A Master's degree in a related field is strongly preferred.
Impact: Work on innovative projects that directly impact enterprise transformation using the latest AI and data science techniques.
Collaboration: Be part of a collaborative environment with a global team of AI experts.
Growth: Develop professionally with continuous learning opportunities in the ever-evolving AI field.
Work-Life Balance: IBM offers a dynamic work environment with flexibility and career growth opportunities.
Apply now and join IBM's Chief Analytics Office as a Data Scientist and be part of the next generation of data-driven innovations.
Qualification: A Master's degree in a related field is strongly preferred.

Chief Analytics Office Data Data Analytics

Software Development Manager

Kaleidofin Private Limited

8+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Software Development Manager
Experience: 8+ Years
Location: Bangalore
Company: Kaleidofin
About Us
Kaleidofin is a mission-driven fintech platform committed to delivering financial solutions for underserved communities. Our proprietary ki credit platform combines advanced credit health analytics, middleware integration, risk dashboards, and debt capital structuring, empowering financial institutions to serve low-income customers, informal sector businesses, and MSMEs more effectively. Operating across India, Bangladesh, and Kenya, we've already impacted over 6 million customers, enabling over $3 billion in debt capital for underserved communities. Backed by global investors including the Gates Foundation, Omidyar Network, and Blume Ventures, Kaleidofin is recognized as a leader in inclusive finance.
Featured on Forbes Asia's "100 to Watch"
Winner of RBI's Swanari TechSprint 2023
Winner of G20 TechSprint 2024 by BIS
Certified as a Great Place to Work
Role Overview
As a Software Development Manager, you will play a key leadership role in shaping the technical roadmap and executing high-impact fintech products. You'll lead a team of engineers to build scalable, secure, and robust systems. This is a hands-on leadership role requiring active involvement in design, architecture, and code-level decisions.
What You'll Do
Own the product's engineering roadmap, architecture, scalability, and code quality.
Guide system design decisions and lead architecture/code reviews.
Collaborate closely with Product and Business teams to align on product strategy and feature development.
Provide hands-on technical leadership, especially in solving complex problems in fintech and lending domains.
Mentor and develop engineering talent; foster a culture of ownership, innovation, and quality.
Champion engineering best practices including test automation, CI/CD, and DevOps.
Lead the adoption of data-driven practices including Big Data, PySpark, and data pipeline architectures.
Balance short-term deliverables with long-term technical sustainability and scalability.
Support team growth through career development, learning plans, and succession planning.
8+ years of software development experience, with 1-2+ years in people/team management roles.
Strong fundamentals in computer science, system design, and problem-solving.
Bachelor's or Master's degree in CS or related field from a top engineering institution (IITs, NITs, etc.).
Proven experience building distributed, scalable, and secure systems.
Strong hands-on experience with Python, Java, Spring Boot, and microservice architectures.
Proficient in SQL/NoSQL, Kafka, CI/CD pipelines, and DevOps tools (Docker, Kubernetes, Jenkins, etc.).
Familiarity with AI/ML model integration and credit scoring systems is a strong plus.
Experience with data engineering: Big Data, Spark, Airflow, AWS stack (S3, Glue, Athena), etc.
Knowledge of front-end and mobile (Android) development is a plus.
Experience working in fintech, lending tech, or high-growth startup environments preferred.
Excellent leadership, communication, and cross-functional collaboration skills.
Be part of a high-impact team driving financial inclusion at scale.
Work on cutting-edge technology in fintech and data infrastructure.
Grow with a company that's winning global recognition and delivering real-world impact.
Contribute to a people-first culture that values learning, innovation, and work-life balance.
Qualification: Bachelor's or Master's degree in CS or related field from a top engineering institution

Software Development Software Development Manager Software manager

Associate ML Ops

Mpokket Financial Services Private Limited

1-2 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Associate ML Ops
Location: Bangalore
Department: Data Science
Employment Type: Full-time
Experience: 1-2 years
Job Overview
We are seeking a motivated and detail-oriented Associate ML Ops to join our Data Science team. In this role, you will be responsible for supporting the deployment, monitoring, and scaling of machine learning models in production environments. You'll collaborate closely with data scientists and engineers to build robust MLOps pipelines and ensure model reliability, scalability, and performance. If you are passionate about bringing machine learning models to life and have hands-on experience in productionizing ML systems, we'd love to hear from you.
Key Responsibilities
Deploy and maintain machine learning models in production environments using best-in-class tools like Databricks and MLflow.
Collaborate with data scientists to translate experimental models into scalable, production-ready systems.
Monitor model performance, accuracy, and overall health through automated tools and custom strategies.
Build and maintain RESTful APIs using Python frameworks such as Flask or Django to serve ML models.
Write efficient and optimized SQL and NoSQL queries for data extraction and transformation.
Apply software engineering best practices, including version control, testing, and documentation, to MLOps workflows.
Work with Python libraries like Pandas, PySpark, scikit-learn, SQLAlchemy, and Requests.
Troubleshoot issues related to model deployment, API performance, or data integration pipelines.
Minimum Qualifications
Bachelor's or Master's degree in Computer Science, Statistics, Econometrics, Operations Research, or a related technical field.
1-2 years of hands-on experience in solving analytical or machine learning problems in production settings.
Must-Have Technical Skills
Hands-on experience with Databricks and MLflow
Proven expertise in deploying ML models in real-world applications
Strong understanding of data structures, algorithms, OOP, and software engineering principles
Experience building and maintaining REST APIs using Python
Proficiency in SQL and NoSQL
Excellent Python programming and debugging skills
Familiarity with core Python libraries used in ML and data processing: Pandas, scikit-learn, PySpark, SQLAlchemy, etc.
Nice-to-Have Skills
Exposure to Kafka for streaming and batch data processing
Familiarity with Git and CI/CD pipelines
Experience with Python multiprocessing or worker/queue systems
Understanding of event-driven or asynchronous programming models
This is an exciting opportunity to work at the intersection of data science and engineering. You'll play a key role in productionizing cutting-edge models and ensuring they deliver real business impact.
Qualification: Bachelor's or Master's degree in Computer Science, Statistics, Econometrics, Operations Research, or a related technical field

Associate Ops ML Ops Full-Time MLOps

ML Ops Engineer

Mpokket Financial Services Private Limited

3-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: ML Ops Engineer
Location: Bangalore
Department: Data Science
Employee Type: Full-time
Experience Required: 3-5 years
Position Overview
We are seeking an experienced and motivated ML Ops Engineer to join our Data Science team. In this role, you will be responsible for deploying, monitoring, and maintaining machine learning models in production environments. You will work closely with data scientists, engineers, and product teams to ensure models are scalable, reliable, and aligned with business objectives. This role is ideal for professionals who are passionate about building robust ML pipelines and bringing machine learning solutions into real-world applications at scale.
Key Responsibilities
Deploy and manage machine learning models in production environments, ensuring scalability, reliability, and performance.
Build and maintain MLOps pipelines using platforms like Databricks and MLflow.
Monitor model performance, accuracy, and health; implement alerting and diagnostics as needed.
Develop and maintain RESTful APIs using Python frameworks such as Flask or Django to serve ML models.
Optimize data workflows and collaborate with engineering teams to improve model integration and performance.
Design strategies for automated model retraining, deployment, and version control.
Write clean, maintainable, and efficient code using Python, adhering to OOP principles and best practices.
Write complex queries using SQL and work with NoSQL databases to support data pipelines and feature stores.
Leverage Python libraries such as PySpark, Pandas, scikit-learn, SQLAlchemy, and Requests.
Minimum Qualifications
Bachelor's or Master's degree in Computer Science, Statistics, Econometrics, Operations Research, or a related technical field.
3-5 years of experience in building, deploying, and monitoring machine learning solutions in production.
Must-Have Skills
Experience with Databricks and MLflow for model training and deployment.
Proven expertise in machine learning model deployment and monitoring in live environments.
Strong programming skills in Python, with solid understanding of data structures, algorithms, and OOP concepts.
Experience developing RESTful APIs using Flask or Django.
Proficient in SQL and NoSQL database operations.
Hands-on knowledge of libraries such as Pandas, PySpark, scikit-learn, SQLAlchemy, and Requests.
Strong analytical, problem-solving, and debugging skills.
Good-to-Have Skills
Experience with Kafka streaming and batch processing.
Familiarity with CI/CD pipelines and version control systems like Git.
Understanding of Python multiprocessing, worker/queue systems, and asynchronous/event-driven programming.
This is a unique opportunity to work at the intersection of machine learning and DevOps. You'll play a critical role in operationalizing AI models and making them a core part of our product offerings. If you enjoy building scalable systems and solving real-world ML engineering challenges, we'd love to meet you.
Qualification: Bachelor's or Master's degree in Computer Science, Statistics, Econometrics, Operations Research, or a related technical field

Ops ML Ops Engineer Ml engineer ML Ops Engineer

Data Scientist

Subex Limited

1-3 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position: Data Scientist (AI/ML Expert)
Location: Pritech Park SEZ, Block 09, 4th Floor B Wing, Survey No. 51 to 64/4, Outer Ring Road, Bellandur V, Bangalore, Karnataka, India
Department: Advanced Analytics
Employment Type: Subexian
Experience Required: 1 to 3 years
Job Overview: We are looking for a talented Data Scientist with expertise in AI/ML to join our Advanced Analytics team. As a key contributor, you'll design, develop, and validate predictive models, recommendation systems, and forecasting solutions, while also collaborating with cross-functional teams to deliver cutting-edge solutions using the latest technologies.
Key Responsibilities:
Model Development: Design, develop, and validate predictive models, recommendation systems, and forecasting solutions using a mix of statistical, machine learning, and deep learning techniques. You will work both independently and as part of a collaborative team.
Data Visualization & Reporting: Communicate actionable insights effectively through compelling dashboards, reports, and visualizations using tools such as Superset, Power BI, and Python libraries (Matplotlib, Seaborn, Plotly).
AI & Tech Solutions: Collaborate with teams to design and deliver flexible, scalable solutions using advanced technologies such as AI and large language models (LLMs).
API Development: Develop and integrate REST APIs and frameworks such as Flask or FastAPI for seamless deployment of machine learning models.
Documentation: Maintain clear, comprehensive documentation for data workflows, model development, and analytical methodologies to ensure knowledge sharing and transparency across teams.
Continuous Learning: Stay up-to-date with the latest trends and advancements in data science, algorithms, and technologies, ensuring your skills and knowledge remain cutting-edge.
Required Technical Skills:
Python Proficiency: Strong experience with Python and libraries like Scikit-learn, TensorFlow/PyTorch, and data visualization libraries (Matplotlib, Seaborn, Plotly).
SQL: Solid hands-on experience in SQL for efficient data querying.
ML Ops & Pipelines: Understanding of machine learning operations (ML Ops) and ML pipelines for streamlined model deployment.
Cloud & Distributed Computing: Exposure to cloud platforms such as AWS, Azure, or GCP and distributed computing tools like Hadoop, Spark, or PySpark is a plus.
Soft Skills:
Effective Communication: Strong ability to communicate complex analytical findings in a clear and engaging manner, tailoring insights for both technical and non-technical audiences.
Problem-Solving: A proactive problem-solver with the ability to adapt and thrive in a fast-paced, dynamic environment.
Continuous Growth: Self-motivated, curious, and always seeking opportunities for professional growth and learning.
At Subex, we encourage a collaborative, innovative, and growth-driven work environment. If you're passionate about applying data science techniques to real-world challenges and want to work with cutting-edge AI/ML technologies, we'd love to hear from you!

Data Scientist Data scientist Full-Time Machine Learning

Senior AI Engineer

Themathcompany

4-7 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Job Title: Senior AI Engineer
Location: Bengaluru, Karnataka, India
Department: GenAI
Experience: 4.5 to 7 years
Open Positions: 5
About the Role
As a Senior AI Engineer, you will design, build, and maintain scalable AI solutions with a strong focus on Generative AI technologies such as large language models (LLMs), embeddings, and retrieval techniques. You will lead a team of AI engineers and collaborate with stakeholders to deliver impactful AI-driven products aligned with business goals. Your role includes mentoring, project planning, ensuring data quality, and driving continuous process improvements.
Key Responsibilities
Design, develop, and deploy scalable AI/ML solutions, specializing in advanced Generative AI (LLMs, embeddings, retrieval-augmented generation, prompt engineering).
Lead, mentor, and develop a team of AI engineers in a collaborative, inclusive environment.
Coordinate with stakeholders to gather requirements, prioritize tasks, and define project timelines.
Ensure projects align with overall business objectives and data strategies.
Oversee data quality, integrity, and security in AI engineering projects.
Build reusable frameworks to enhance the efficiency and scalability of AI systems.
Manage client communications to translate requirements into technical outcomes.
Identify skill gaps and create opportunities for professional development.
Drive initiatives for improving data operations and AI delivery efficiency.
Required Technical Skills
4.5 to 7 years of experience developing and deploying scalable AI/ML solutions.
Strong expertise in data modeling, relational and NoSQL databases, software development lifecycle, unit testing, and functional programming.
Proficient in designing and implementing advanced Generative AI solutions including LLMs, embeddings, retrieval techniques, and prompt engineering.
Experience designing and optimizing Retrieval-Augmented Generation (RAG) systems.
Proficiency with Databricks workflows, including job and cluster management, and API usage.
Solid understanding of data structures, algorithms, multiprocessing, and optimization techniques.
Skilled in Python libraries such as Pandas, NumPy, FastAPI for data processing and API development.
Expertise in SQL optimization and database schema design.
Experience deploying AI models using Docker and Kubernetes.
Familiarity with version control using GitHub.
Hands-on experience with cloud platforms like Azure, AWS, or GCP for AI deployments.
Optional experience with PySpark for data processing.
Basic understanding of CI/CD pipelines and deployment best practices.
Required Non-Technical Skills
Strong problem-solving ability with financial impact awareness in both team management and solution delivery.
Excellent verbal and written communication skills, comfortable interacting with mid-level client management.
Ability to balance pragmatic solutions versus perfect outcomes and rally teams accordingly.
Strong interpersonal skills including conflict resolution, empathy, negotiation, and active listening.
Demonstrated leadership and mentorship capabilities.
Self-motivated with a strong sense of ownership.
Good to Have
Familiarity with data visualization tools and techniques.
Understanding of data security, privacy, governance, and compliance frameworks.
Experience with graph databases and graph processing frameworks.
Knowledge of data virtualization and federation methods.
Skills in data profiling and data quality management.
Education
Bachelor's degree in Engineering, Computer Science, or a related field.
Qualification: Bachelor's degree in Engineering, Computer Science, or a related field.

Senior Ai Engineer Senior engineer Ai engineer

Senior Data Engineer

Synechron

8+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Position Title: Senior Data Engineer - Databricks, PySpark, Cloud Platforms
Location: Bengaluru - Bellandur (GTP)
Employment Type: Full-time

Job Summary
Synechron is looking for a Senior Data Engineer to join our advanced analytics team in Bengaluru. In this role, you will architect and build scalable, high-performance data pipelines that power data science, analytics, and business intelligence initiatives. You'll work with modern tools including Databricks, PySpark, and cloud data platforms, while collaborating across teams to ensure high-quality, secure, and efficient data solutions.

Key Responsibilities
Design, develop, and maintain large-scale, secure, and efficient data pipelines using Databricks, PySpark, and cloud-native tools. Partner with data scientists, analysts, and business stakeholders to translate requirements into robust data solutions. Integrate data from various structured, semi-structured, and streaming sources. Ensure high standards for data quality, performance optimization, security, and cost efficiency. Drive data pipeline automation, orchestration, and monitoring using tools like Airflow. Lead troubleshooting efforts, performance tuning, and enhancements of existing pipelines. Stay informed about emerging data technologies and recommend adoption where relevant.

Technical Skills
Core Expertise
Programming: Python (expert), SQL (advanced), PySpark.
Platforms: Databricks (clusters, notebooks, workflows), AWS/Azure/GCP.
Data Orchestration: Apache Airflow (or similar).
Data Warehousing: Snowflake (preferred), data modeling, ETL/ELT pipelines.
Streaming: Kafka or other stream processing tools.
DevOps: CI/CD (GitLab CI, Jenkins), version control (Git), containerization (Docker/Kubernetes preferred).
Security: Familiarity with encryption, access controls, and compliance best practices.

Experience
8+ years of experience in data engineering or related roles. Proven expertise in developing and deploying scalable data pipelines using Databricks, PySpark, and SQL. Hands-on experience with cloud platforms (AWS, Azure, or GCP). Strong background in data warehousing, especially with Snowflake. Exposure to real-time data processing and orchestration tools. Experience implementing CI/CD pipelines for data workflows is a plus.

Daily Responsibilities
Build and optimize data ingestion, transformation, and storage workflows. Collaborate with cross-functional teams to align data solutions with business objectives. Monitor, troubleshoot, and continuously improve pipeline performance. Conduct data quality checks and ensure that governance and compliance standards are met. Contribute to technical documentation, code reviews, and team knowledge sharing.

Qualifications
Bachelor's or Master's degree in Computer Science, IT, or related field. Relevant certifications (e.g., Databricks Certified Data Engineer, AWS Certified Data Analytics) are preferred.

Professional Competencies
Strong problem-solving and analytical mindset. Effective communicator with ability to collaborate across technical and non-technical teams. Time management and prioritization skills under tight deadlines. Proactive leadership and a passion for innovation. Commitment to ethical data use and data security.

Diversity & Inclusion at Synechron
Synechron is committed to building an inclusive, diverse, and equitable workplace. Through our global Same Difference DEI initiative, we celebrate and support people from all backgrounds, including race, gender, sexual orientation, religion, age, disability, and more. We offer flexible work arrangements, continuous learning, internal mobility, and mentoring programs to support every employee's growth.

Senior Data Engineer Senior engineer Data Engineer

Data Scientist

Colan Infotech

5+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Data Scientist
Experience: 5+ Years
Location: Bangalore, Karnataka, India
Job Type: Full Time

Job Summary
We are seeking an experienced Data Scientist with over 5 years of expertise to join our innovative team in Bangalore. The ideal candidate will have hands-on experience in advanced statistical methods, machine learning, and deep learning frameworks, combined with strong skills in deploying models at scale.

Key Responsibilities
Apply advanced Statistics and Operations Research methods to solve complex business problems. Develop and deploy predictive and machine learning models using frameworks such as PyTorch, TensorFlow, Keras, and XGBoost. Utilize data engineering and big data tools including PySpark, Databricks, and Flask for building scalable data solutions. Implement and fine-tune NLP models like RNN, LSTM, and attention-based architectures, leveraging pre-trained models from Stanford, IBM, Azure, and OpenAI. Design and optimize efficient SQL queries to extract data from large databases. Work with visualization tools like D3.js and Dash Plotly, and deploy interactive dashboards using Streamlit. Utilize graph databases such as Neo4j for complex data relationships. Manage version control with tools like GitHub or Bitbucket. Deploy machine learning models into production environments using MLOps practices on cloud platforms such as AWS or Azure.

Required Skills
5+ years of professional experience as a Data Scientist or in a similar role. Strong foundation in statistics, machine learning, and deep learning techniques. Proficient in Python-based data science libraries and frameworks. Hands-on experience with cloud-based MLOps and deployment. Excellent SQL skills for data extraction and manipulation. Experience with version control systems and collaborative development. Familiarity with NLP techniques and modern pre-trained language models. Strong problem-solving and communication skills.

Qualifications
Any graduate degree in Computer Science, Statistics, Mathematics, Engineering, or related field.

Join a forward-thinking company in Bangalore, where you will work with cutting-edge technologies to create impactful data-driven solutions. Grow your career in an environment that values innovation, diversity, and continuous learning.

Data Scientist Data scientist Full-Time Machine Learning

Lead Software Engineer - Backend

Neuron7.ai

10+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Lead Software Engineer - Backend
Location: Bengaluru, India
Employment Type: Full-time, Hybrid

About Neuron7.ai
Neuron7.ai is a rapidly growing AI-first SaaS company that is redefining service intelligence. Backed by top-tier venture capitalists in Silicon Valley and a distinguished group of angel investors, we are recognized as a startup to watch. Our AI-driven platform helps enterprises make accurate service decisions at scale by delivering service predictions in seconds through the analysis of structured and unstructured data. We work across industries, including high-tech devices, manufacturing, and medical devices, enabling service leaders to excel in critical metrics like first-call resolution, turnaround time, and service margins. At Neuron7.ai, you'll be part of an innovative and dynamic team that is shaping the future of service intelligence. We value creativity, collaboration, and a commitment to pushing boundaries. This is your chance to make an impact in a fast-growing startup and work with cutting-edge technology.

About the Team
As a Lead Software Engineer - Backend, you will play a crucial role in architecting, developing, and optimizing backend systems for our groundbreaking products. You will lead a talented team, collaborating across functions to deliver scalable and high-performance systems. As a leader, you'll also mentor and guide junior engineers to help them grow in their careers.

What You'll Do:
Backend Systems Design: Lead the design and implementation of scalable, reliable, and high-performance backend systems.
Distributed Systems: Develop and optimize distributed systems to handle large-scale traffic and data.
System Architecture: Lead the architecture and design of critical components, ensuring robust and efficient solutions.
Code Development: Write clean, maintainable, and efficient code that powers our AI-driven platform.
Cross-Functional Collaboration: Work closely with product managers, frontend engineers, and other stakeholders to deliver end-to-end solutions.
Troubleshooting & Debugging: Solve complex production issues and implement reliable, performance-optimized solutions.
Mentorship: Provide guidance and mentorship to junior and mid-level engineers, fostering a collaborative environment.
Continuous Learning: Stay updated on the latest trends and advancements in backend development, AI, and distributed systems.

What We're Looking For:
Experience: Minimum 10 years of professional experience in backend development, with at least 4 years of experience in AI, NLP, or ML.
Programming Skills: Expertise in one of the following programming languages: Java, Python, or Golang.
Distributed Systems: Proven experience in building and maintaining distributed systems at scale.
Cloud Platforms: Familiarity with at least one major cloud platform (Azure, AWS, or GCP).
System Design: Strong understanding of system design, architecture patterns, and data structures.
Database Knowledge: Experience with relational and NoSQL databases.
API Development: Proficiency in developing RESTful APIs and microservices.
Security & Performance: Solid understanding of security best practices and performance optimization in backend systems.
Problem-Solving Skills: Strong analytical skills with a proven track record of solving complex engineering challenges.
Communication: Excellent communication skills with the ability to collaborate across teams and interact with non-technical stakeholders.

Preferred Skills:
Containerization: Experience with Kubernetes, Docker, or other containerization tools for scalable deployment.
Messaging Systems: Familiarity with Kafka, RabbitMQ, or similar messaging systems for real-time data processing.
CI/CD: Knowledge of CI/CD pipelines and automation tools to ensure efficient development and deployment.
Data Pipelines: Familiarity with data pipelining frameworks such as PySpark, Flink, or similar for real-time processing.

What We Do and Value:
At Neuron7.ai, we prioritize integrity, innovation, and a customer-centric approach. Our mission is to enhance service decision-making through advanced AI technology, and we are dedicated to delivering excellence in every aspect of our work.

Company Perks & Benefits:
Competitive salary, equity, and spot bonuses
Paid sick leave
Latest MacBook Pro for your work
Comprehensive health insurance
Paid parental leave
Work from home or from our vibrant Bengaluru office with hybrid work arrangements

Our Commitment to Diversity and Inclusion:
Neuron7.ai is committed to fostering a diverse and inclusive workplace. We ensure equal employment opportunities without discrimination based on race, color, religion, sex, sexual orientation, gender identity, age, disability, national origin, marital status, or any other characteristic protected by law.

If you're excited about building innovative backend solutions and want to work with cutting-edge technologies in a forward-thinking team, we'd love to hear from you!

Lead Software Software lead Engineer Lead Engineer

Data Engineer II

McKinsey & Company

2-5 Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Your Impact
As a Data Engineer at QuantumBlack, you will collaborate with stakeholders, data scientists, and internal teams to develop and implement impactful data products and solutions. Your key responsibilities will include building and maintaining technical platforms for advanced analytics, designing scalable and reproducible data pipelines for machine learning, and ensuring information security within data environments. You will assess clients' data quality, map data fields to hypotheses, and prepare data for use in analytics models. Additionally, you will contribute to R&D projects and internal asset development, and participate in cross-functional problem-solving sessions with a variety of stakeholders, including C-level executives, to create innovative analytics solutions.

You will be based in Gurugram, joining a global data engineering community, and work within cross-functional and Agile project teams alongside project managers, data scientists, machine learning engineers, other data engineers, and industry experts. You will collaborate directly with our clients, ranging from data owners and users to C-level executives.

You will be aligned with one of our industry-focused practices: Pharma & Medical Products (PMP) or Global Energy & Materials (GEM). In these practices, you'll work on solving the most critical challenges for our clients in these sectors. PMP focuses on advancing the development and delivery of life-saving medicines and medical treatments, while GEM supports industries like chemicals, steel, mining, and energy to achieve operational excellence. GEMx and PMPx, the assetization arms of these practices, focus on creating reusable digital and analytics assets to support client work. As part of this team, you will help shape impactful solutions for large organizations, developing capabilities for sustained impact.

Your Growth
In this role, you will contribute to the frameworks and libraries that our teams of Data Scientists and Engineers use to progress from data to meaningful impact. You will have the opportunity to guide global companies through data science solutions, helping them transform and enhance performance across industries including healthcare, automotive, energy, and elite sports.

Real-World Impact: You'll gain unique learning and development opportunities globally.
Fusing Tech & Leadership: Work with the latest technologies and methodologies, with access to top-tier learning programs.
Multidisciplinary Teamwork: Collaborate with data scientists, engineers, project managers, UX designers, and more to enhance performance.
Innovative Work Culture: Creativity, passion, and wellness are central to our modern work environment, which includes insightful talks, training sessions, and a focus on work-life balance.
Striving for Diversity: We celebrate diversity, with colleagues from over 40 nationalities, appreciating the value that diverse perspectives bring to the workplace.

You are a highly collaborative individual who prioritizes impact over agenda. You enjoy learning from colleagues, challenging ideas thoughtfully, and working together to improve processes and solve problems. You believe in iterative change, experimenting with new approaches, and advancing quickly through constant learning and improvement. While we value using the right tech for the right task, our team often leverages technologies such as Python, PySpark, SQL, Airflow, Databricks, Kedro (our open-source data pipelining framework), Dask/RAPIDS, Docker, Kubernetes, and cloud solutions like AWS, GCP, and Azure.

Your Role as a Data Engineer
Collaboration: Work with business stakeholders, data scientists, and internal teams to create extraordinary, domain-focused data products (reusable assets) and deliver them to clients.
Domain Expertise: Develop deep understanding of client industries and use creative techniques to deliver meaningful impact.
Technical Platforms: Build and maintain technical platforms for advanced analytics engagements, spanning both data science and data engineering work.
Data Pipelines: Design and implement robust, modular, scalable, deployable, and reproducible data pipelines for machine learning.
Data Management: Ensure the security of data environments and compliance with information security standards.
Data Wrangling: Assess data quality, map data fields to hypotheses, and prepare data for analytics models.
Contribute to R&D: Participate in internal asset development and contribute to R&D projects to drive innovation.
Cross-functional Problem Solving: Collaborate with internal teams and clients, including data owners and C-level executives, to create impactful analytics solutions.

Your Qualifications and Skills
Bachelor's degree in computer science or related field; Master's degree is a plus. 2-5 years of relevant work experience. Proficiency in at least one programming language such as Python, Scala, or Java. Strong experience with distributed processing frameworks (e.g., Spark, Hadoop, EMR) and SQL. Experience in commercial client-facing projects, particularly in close-knit teams. Ability to work with structured, semi-structured, and unstructured data, and identify linkages across disparate data sets. Clear communication skills to explain complex solutions effectively. Understanding of information security principles to ensure compliant handling of client data. Experience with cloud platforms (AWS, Azure, Google Cloud, Databricks) is highly desirable. Experience with CI/CD processes using GitHub Actions, CircleCI, or similar, and end-to-end pipeline development including application deployment is a plus.

Data Engineer Data Engineer Ii Engineer ii

Sr. Data Engineer

Orange Mantra

6+ Years | Not Disclosed | Bengaluru, Karnataka, India | Full-time

Skills Required: Should have a minimum of 6 years of experience in Data Engineering and Data Analytics platforms. Should have a strong hands-on design and engineering background in AWS, across a wide range of AWS services, with the ability to demonstrate work on large engagements. Should be involved in requirements gathering and in transforming requirements into functional and technical designs. Maintain and optimize the data infrastructure required for accurate extraction, transformation, and loading of data from a wide variety of data sources. Design, build, and maintain batch or real-time data pipelines in production. Develop ETL/ELT (extract, transform, load) data pipeline processes to help extract and manipulate data from multiple sources. Automate data workflows such as data ingestion, aggregation, and ETL processing. Should have good experience with different types of data ingestion techniques: file-based, API-based, streaming data sources (OLTP, OLAP, ODS, etc.), and heterogeneous databases. Prepare raw data in Data Warehouses into consumable datasets for both technical and non-technical stakeholders. Strong experience with, and implementation of, Data Lake, Data Warehouse, and Data Lakehouse architectures. Ensure data accuracy, integrity, privacy, security, and compliance through quality control procedures. Monitor data systems performance and implement optimization strategies. Leverage data controls to maintain data privacy, security, compliance, and quality for allocated areas of ownership. Experience with AWS tools (AWS S3, EC2, Athena, Redshift, Glue, EMR, Lambda, RDS, Kinesis, DynamoDB, QuickSight, etc.). Strong experience with Python, SQL, PySpark, Scala, Shell Scripting, etc. Strong experience with workflow management and orchestration tools (Airflow, Should hold decent experience and understanding of data manipulation/wrangling techniques. Demonstrable knowledge of applying Data Engineering best practices (coding practices to DS, unit testing, version control, code review).
Big Data ecosystems: Cloudera/Hortonworks, AWS EMR, etc. Snowflake Data Warehouse/Platform. Streaming technologies and processing engines: Kinesis, Kafka, Pub/Sub, and Spark Streaming. Experience working with CI/CD technologies: Git, Jenkins, Spinnaker, Ansible, etc. Experience building and deploying solutions to AWS Cloud. Good experience with NoSQL databases like DynamoDB, Redis, Cassandra, MongoDB, or Neo4j. Experience working with large data sets and distributed computing (e.g., Hive/Hadoop/Spark/Presto/MapReduce). Working knowledge of data visualization tools like Tableau, Amazon QuickSight, Power BI, or QlikView is good to have. Experience in the Insurance domain preferred.

Sr. Data Engineer Sr. engineer Data Engineer

PySpark Jobs in Bengaluru

15 Jobs Found

Sr. Data Engineer- Aws- Big Data

Cloud Data Engineer - AWS Big Data

Data Engineer

Lead Data Scientist

Chief Analytics Office (CAO) - Data Scientist

Software Development Manager

Associate ML Ops

ML Ops Engineer

Data Scientist

Senior AI Engineer

Senior Data Engineer

Data Scientist

Lead Software Engineer - Backend

Data Engineer II

Sr. Data Engineer
