We are looking for a Senior Data Scientist to lead the design, development, and deployment of data science solutions for large-scale information analysis. The role requires proven experience bringing machine learning and deep learning models to production on massive datasets, and applying A/B testing, supervised learning, anomaly detection, and pattern recognition.
The ideal candidate is hands-on, has a solid background in statistics, algorithms, and programming, and can translate business problems (especially in the accounting, financial, and tax domains) into scalable, secure, and high-impact solutions.
Responsibilities
- Design, train, validate, and deploy machine learning and deep learning models in production environments with big data
- Implement advanced anomaly detection and pattern recognition techniques to identify irregularities, fraud, operational risks, or atypical behavior in the data
- Execute A/B testing and statistical experimentation to validate hypotheses, measure impact, and optimize information analysis products
- Collaborate with cross-functional teams (product, engineering, business, tax/accounting) to translate needs into data science use cases
- Ensure data quality through data cleaning and validation, pipeline orchestration, and monitoring
- Develop and maintain technical documentation, metrics dashboards, and model performance reports
- Propose new solutions based on predictive models, advanced analytics, and generative AI techniques that add strategic value
Profile Requirements
Academic
- Bachelor's degree in Systems Engineering, Mathematics, Statistics, Computer Science, or a related field (Master's or Doctorate desirable)
Experience
- 6-12 years of experience in data science, with at least 3 years leading projects in production
- Solid experience in supervised learning, A/B testing, anomaly detection, and pattern recognition
- Experience deploying ML/DL models to production on datasets with millions of records or transactions
Technical
- Languages: Python (required), R, and SQL (advanced)
- Experience with ML pipelines, MLOps, and cloud deployment (AWS, GCP, or Azure)
- Knowledge of ML/DL frameworks (scikit-learn, TensorFlow, PyTorch)
- Experience with anomaly detection (Isolation Forest, LOF, autoencoders, Prophet, ARIMA, robust statistics)
- Experience in pattern recognition and predictive modeling (clustering, time series, sequences, recurrent neural networks)
- SQL and NoSQL databases; experience with vector databases (Pinecone, pgvector, Milvus)
- Strong data visualization skills (Matplotlib, Seaborn, Plotly, Power BI, Tableau)
- Experience with model testing and cross-validation
Nice to Have
- Knowledge of tax, accounting, ERPs, or the financial sector (banks, fintechs, insurance companies)
- Experience in NLP and LLMs for information extraction and document classification
- Experience in transaction fraud detection, credit risk monitoring, or tax irregularities
- Familiarity with big data environments (Spark, Databricks, Hadoop)
- Knowledge of other programming languages such as Java, Scala, or C++
- Publications, presentations, or participation in data science communities