About the business: A global leader in information and analytics, we help researchers and healthcare professionals advance science and improve health outcomes for the benefit of society. Building on our publishing heritage, we combine quality information and vast data sets with analytics to support visionary science and research, health education and interactive learning, as well as exceptional healthcare and clinical practice. At Elsevier, your work helps address the world's grand challenges and contributes to a more sustainable future. We harness innovative technologies to support science and healthcare, partnering for a better world.
About the team: You will be part of a team that combines quality information and vast data sets with analytics to support visionary science and research that helps address the world's grand challenges and contributes to a more sustainable future.
About the Role: As a Senior Data Scientist, you will lead and drive the strategic development and implementation of our AI solutions, overseeing the entire lifecycle of data science projects. Your role will involve not just the development and refinement of models, but also the mentorship of junior data scientists and collaboration with cross-functional teams to innovate and scale our data science initiatives.
Responsibilities
- Strategic Data Insights and Model Development: Spearhead data collection, analysis, and advanced model development, focusing on classification, deep learning, and innovative techniques. Define and assess quality metrics, presenting high-level insights to stakeholders while guiding junior team members.
- Advanced Production Solutions: Design and oversee the creation of sophisticated, production-ready Python packages for data science pipelines. Collaborate extensively with technology teams to ensure seamless deployment and scalability.
- End-to-End Integration and Quality Assurance Leadership: Take charge of integrating data science components and conducting rigorous quality assessments, leveraging expertise in large language models. Establish resilience against model drift and develop comprehensive maintenance strategies, including automated model re-training protocols.
- Performance Evaluation and Strategic Development: Develop comprehensive reporting mechanisms for pipeline performance and lead in the implementation of automatic re-training strategies for existing pipelines, ensuring continuous optimization.
Requirements
- Education and Experience: A Master's degree or higher in computer science, data science, artificial intelligence, mathematics, statistics, or a related quantitative field, plus a minimum of 3 years of relevant applied experience. Alternatively, at least 4 years of relevant experience in place of the degree. Considerable experience leading complex data science projects is highly valued.
- Advanced Programming Proficiency: Demonstrated expertise in Python, with a proven track record of delivering high-quality, production-ready code that follows best practices, and of mentoring junior team members in those practices.
- Advanced Machine Learning Expertise: Extensive hands-on experience in advanced classification, regression, clustering, and deep learning techniques. Mastery of neural networks, large language models, and cutting-edge ML algorithms.
- Deep Knowledge of Large Language Models: Mastery of using and integrating large language models for sophisticated natural language processing tasks, and the ability to guide the team in applying these models effectively.
- Expert Data Manipulation Skills: Mastery of data processing, cleaning, and analysis, with advanced expertise in tools such as Pandas, NumPy, Matplotlib, and SciPy.
- Advanced Communication Skills: Exceptional communication and presentation skills, especially in conveying complex data science concepts to both technical and non-technical stakeholders.
- Strategic Analytical Thinking: Demonstrated ability to strategically solve complex problems and translate intricate requirements into effective solutions.
- Expert Technical Competence: Advanced proficiency in Git, DevOps practices, and CI/CD, plus extensive experience with cloud computing platforms such as AWS and Azure.
- Continuous Learning and Mentorship: Demonstrated commitment to continuous learning, with a keen interest in mentoring junior team members, driving innovation, and staying up to date with MLOps and data science productionization.
Nice to Have
- Extensive experience in optimizing production processes through parallelization, multi-threading, and automated model re-training.
- Proficiency in MLOps frameworks (e.g., SageMaker, Kubeflow, MLflow) and big data processing frameworks (e.g., Spark, Hadoop, Databricks).
- Advanced software engineering skills, including proficiency in additional languages such as Java and SQL, along with comprehensive knowledge of relational databases, semi-structured and unstructured document formats (e.g., JSON and XML), REST interfaces, microservices, and UML.
Work in a way that works for you:
We promote a healthy work/life balance across the organisation and aim to offer our people an appealing place to work. With numerous wellbeing initiatives, shared parental leave, study assistance and sabbaticals, we will help you meet your immediate responsibilities and your long-term goals.