Personal details

K - Remote data scientist

Timezone: Kolkata (UTC+5:30)

Summary

I mostly solve problems in Natural Language Processing and Computer Vision, covering the full lifecycle from ideation and experimentation to deployment and monitoring.
Whether it is quantizing a huge transformer model or training it in a distributed setting with limited resources, I have got your back.

Personal Projects

FINANCIAL DOCUMENT UNDERSTANDING PLATFORM
2021
Django
OpenCV
NumPy
BERT
TensorFlow
Hugging Face
A platform to automate identification and information extraction from various financial and legal documents
• Developed a large-scale data pipeline to classify 2M+ documents into 1400+ categories
• Worked on the layout analyzer, table detection, table data extraction, and forms and free-text clustering
• Extracted information using BERT for named entity recognition in unstructured documents and ROI-based template matching on structured documents
• Set up highly available, fault-tolerant, model-pluggable infrastructure using AWS Lambda, DynamoDB, and API Gateway
• Built an event-driven feedback loop and user interaction platform
• Reduced the SLA of single document processing by 70%
• Helped Intellect cut extraneous escrow staff and services cost by 34%
E-SIGNING & NOTARIZATION
2019
Django
BERT
TensorFlow
CNN
Signature and Notary Detection and Verification toolkit
• Designed and implemented a modular, extensible Signature and Notary Detection API
• Implemented a false-positive removal API on top, increasing model precision from 64% to 87%
• Developed an algorithm to validate whether a notary seal follows US state notarization laws
• Set up a data extraction pipeline for notary seals
• Integrated the API with the DocuSign and First American document processing services platforms
• Reduced the SLA of signature and notary verification by 70%
• Helped the company cut escrow services costs and increase document recording services revenue