Data Science Engineer Job at Rakuten India Development Center
Data Science Engineer
- Bengaluru, Bangalore Urban, Karnataka
- Not Disclosed
- Full-time
Position: Data Science Engineer, Customer DNA
Rakuten is one of the largest internet services companies globally and provides more than 70 services spanning eCommerce, finance, telecommunication, sports, and much more to approximately 1.4 billion customers worldwide. Following our strategic vision of Rakuten as a data-driven membership company, we are expanding our data activities across multiple Rakuten group companies.
Customer Program, one of our core science programs, aims to build universal customer models and applications. Customer DNA (CDNA) is a platform that manages Rakuten customer profiles by utilizing data from different Rakuten businesses. CDNA-IAs (Inferred Attributes), a part of CDNA, are customer profiles predicted using machine learning algorithms. The team builds and operates a large number of models for CDNA-IAs, and applies recent research advances to develop efficient data pipelines and to create models.
We are looking for a Data Science Engineer for the Customer DNA (CDNA) group.
Responsibilities:
- Profile tens of millions of customers with machine learning
- Design and tune machine learning algorithms
- Design and implement frameworks to serve hundreds of models
- Spend around 50% of your time on research and 50% on development
- Create internal tools for monitoring, visualization, and other automation processes
Requirements:
- Computer science or related background
- 3+ years of experience in software development, especially using Python as a programming language
- Experience with common Linux commands and Linux scripting languages
- Experience with Hadoop, MapReduce, HDFS and Big Data querying tools, such as Tez, Hive, and Impala
- Experience with machine learning projects
- Experience with data analysis and visualization
Preferred:
- Experience in web service development
- Solid knowledge of large-volume data processing
- Experience with machine learning libraries, such as TensorFlow, PyTorch, and Scikit-Learn
- Experience with NoSQL databases, such as HBase, Redis, and Couchbase
- Familiarity with data mining concepts and machine learning algorithms
- Experience with Spark and stream-processing systems
- Knowledge of various ETL techniques and frameworks, such as Flume

