Personal details

Guichong L. - Remote data scientist

Based in: 🇨🇦 Canada
Timezone: Bogota (UTC-5)

About

EXPERIENCE SUMMARY
• 8 years of experience as a Senior Data Scientist in biotechnology, market research, telecom, automotive, and network security, and 4 years of experience as an AI engineer building cloud Spark applications on AWS and Azure.
• Recent research and development on anomalous traffic detection in CAN workflows, high-level feature extraction using Bayesian variables, and object detection and action classification.
• Designed and implemented machine learning algorithms for churn prediction and Add a Line analysis, as well as machine learning pipelines for telecom business analysis.
• Developed advanced machine learning algorithms for classifying text, POS outlet items, and category hierarchies, including multilabel and multitask classification, text classification, and NLP.
• Developed advanced machine learning regression and classification algorithms for food component analysis using chemometrics and spectroscopy.
• Postdoctoral research on uniform and unbiased sampling/crawling of online social networks using advanced Markov Chain Monte Carlo techniques: developed an innovative sampling algorithm based on a new coupling technique, implemented with Ruby on Rails, the Twitter API, and DataMapper on Unix/Linux and Amazon EC2; social media analysis using Python, NLTK, and scikit-learn.

Work Experience

AI developer/Python programmer
IRCC, Altis | Aug 2023 - Dec 2023
Python
Google BigQuery
Google Cloud Platform
Dataflow
AI
AWS

Project I: IBM SPSS to Python conversion. Implemented an end-to-end Python pipeline with Spark on the AWS cloud to replace the original SPSS stream models. The main tasks included data collection; conversion of SPSS stream nodes (type, filter, select, filler, merge, aggregate, flag, supernodes, cache, and statistical outputs) to PySpark running on Spark on AWS; unit tests; and a Spark storage plan for optimization. The work also covered techniques for migrating to Google Cloud Platform (GCP) services, including BigQuery, Dataflow, Pub/Sub, Bigtable, Data Fusion, Dataproc, Cloud Composer, Cloud SQL, Compute Engine, Cloud Functions, and App Engine, as well as BigQuery ML, AutoML, and Vertex AI.

Data Scientist/Machine learning engineer
SpruceInfotech | Jan 2022 - Jul 2022
Python
Pandas
Machine learning
DevOps

Developed and implemented Python scripts for data parsing, data imputation, and data encoding using scikit-learn and pandas. Developed Python scripts to train and build models and to run tests evaluating the system performance of AI solutions, also with scikit-learn and pandas. Developed and implemented Python scripts for AI solutions covering forecast, regression, and classification modeling, and for log time series analysis. Set up MLflow for model tracking, training, logging, registration, inference, and Hyperopt/parameter sweeps. Worked on univariate and multivariate forecasting and regression. Analyzed and validated business requirements and reviewed solutions with relevant stakeholders. Wrote the technical report on MLflow project development and the production solution for the Azure cloud AI solution, including anomaly detection and Sentinel, and the research report on improving forecast models for rail transportation with Azure DevOps and Databricks.
Technology: PySpark, pandas, scikit-learn, MLflow, Azure Databricks, Azure DevOps

Education

University of Ottawa
Doctor's degree・Computer Science
Jan 2006 - Jun 2010
University of Regina
Master's degree・Computer Science
Feb 2001 - Sep 2004