Personal details

João R. - Remote data engineer

Timezone: Bucharest (UTC+3)

Summary

I’m a self-motivated engineer passionate about data warehousing and data engineering, capable of easily adapting to new environments. I enjoy discussions about technology and creative ways to approach complex problems.

I have extensive experience developing data applications across different industry sectors such as finance, telco, retail and marketing. During my career, I have worked with different data architectures and made use of multiple relational databases, the Hadoop ecosystem and cloud infrastructure.

I hold myself to a high standard of attention to detail and thoroughness, and I take a pragmatic view of the benefits of each project I participate in.

Work Experience

Senior Analyst, Data Engineer
KNEIP | Dec 2018 - Mar 2020
Java
HBase
Apache Spark
Apache Kafka
Kubernetes
Apache Hadoop
CI/CD
Kafka Streams
Apache NiFi
- Developed Kafka Streams applications supporting an event-driven architecture with microservices.
- Developed Spark Streaming applications to consume data from Kafka and load a Fund Data Management data model in HBase (a minimal sketch of this kind of pipeline follows below).
- Participated in creating a CI/CD pipeline for Kafka Streams applications, migrating to containers and Kubernetes orchestration.
- Implemented NiFi processor groups to integrate data sourced from files via FTP.
- Participated in data modelling for Fund Data Management.
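
Illustrative only: a minimal PySpark Structured Streaming sketch of the kind of Kafka-to-Spark consumption described above. The broker address, topic name, event schema and checkpoint path are assumptions made for the example, not details of the actual KNEIP platform, and the console sink stands in for the real HBase load.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StringType, StructField, StructType

    # Hypothetical schema for fund events; the real data model is not shown here.
    event_schema = StructType([
        StructField("fund_id", StringType()),
        StructField("attribute", StringType()),
        StructField("value", StringType()),
    ])

    spark = SparkSession.builder.appName("fund-event-ingest").getOrCreate()

    # Read the event stream from Kafka (broker and topic names are placeholders).
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "kafka:9092")
        .option("subscribe", "fund-events")
        .load()
        .select(from_json(col("value").cast("string"), event_schema).alias("event"))
        .select("event.*")
    )

    # In production the sink was HBase; the console sink keeps the sketch self-contained.
    query = (
        events.writeStream
        .outputMode("append")
        .format("console")
        .option("checkpointLocation", "/tmp/checkpoints/fund-events")
        .start()
    )
    query.awaitTermination()
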
Senior Developer
Sagacity Solutions | Feb 2017 - Nov 2018
Python
SQL
MySQL
Teradata
Apache Spark
Apache Hadoop
Apache Hive
- Developed a bespoke Value Based Management analytics solution for telecommunications company Telstra. The solution, built within a data warehouse supported by Teradata, included modules for tenure and cashflow forecasts as well as investment data integration.
- Designed and developed a configuration-driven product for Value Based Management using Apache Spark, standardizing the core algorithms (a minimal sketch of a configuration-driven step follows below).
- Supported the implementation of the Value Based Management product for telecommunications group Tele2 in three countries: Estonia, Latvia and Lithuania.
- Oversaw the Value Based Management product operating in a Software-as-a-Service model on AWS.
- Developed ETL to enable a Revenue Assurance process related to call-center operations for telecommunications company TalkTalk, using a data warehouse supported by Netezza.
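
Illustrative only: a minimal sketch of what "configuration-driven" can look like in practice with PySpark. The configuration keys, table name and column names are invented for the example; they are not taken from the actual product.

    import json

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("vbm-config-driven-etl").getOrCreate()

    # Hypothetical per-client configuration: source table, column mapping and a
    # row filter are the kind of settings that vary between implementations.
    config = json.loads("""
    {
      "source_table": "billing.monthly_revenue",
      "columns": {"cust_id": "customer_id", "rev_amt": "revenue"},
      "filter": "revenue > 0"
    }
    """)

    df = spark.table(config["source_table"])

    # Rename source columns to the standardized names expected by the shared
    # core algorithms, then apply the client-specific filter.
    for source_name, standard_name in config["columns"].items():
        df = df.withColumnRenamed(source_name, standard_name)
    df = df.filter(config["filter"])

    df.groupBy("customer_id").agg(F.sum("revenue").alias("total_revenue")).show()
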

Personal Projects

KNEIP Digital Platform
2020
Java
HBase
Apache Spark
Apache Kafka
Kubernetes
Spark Streaming
Apache Hadoop
CI/CD
Kafka Streams
Apache NiFi
A complete digital platform for Fund Data Management, capable of handling the entire life cycle of fund data, integrating multiple sources and supporting multiple reporting and publishing targets across different media. I was a senior data engineer within a cross-functional team responsible for real-time data integration from different sources into a data model capable of supporting multiple products. The platform implemented an event-driven architecture with microservices. I was heavily involved in developing the data ingest pipeline, making use of Apache NiFi, Kafka Streams, Apache Spark and HBase.
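
Illustrative only: one way to land such a stream in HBase from Python is Spark's foreachBatch combined with the happybase client. The Thrift host, table name, column family and row-key layout below are assumptions for the example, not the platform's actual data model.

    import happybase
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("fund-model-loader").getOrCreate()

    def write_to_hbase(batch_df, batch_id):
        """Write one micro-batch of fund attribute rows into HBase."""
        # Connection details and table layout are placeholders.
        connection = happybase.Connection("hbase-thrift-host")
        table = connection.table("fund_data")
        with table.batch(batch_size=1000) as hbase_batch:
            # collect() keeps the sketch short; a real loader would write per partition.
            for row in batch_df.collect():
                row_key = f"{row.fund_id}#{row.attribute}".encode()
                hbase_batch.put(row_key, {b"d:value": str(row.value).encode()})
        connection.close()

    # `events` is assumed to be a streaming DataFrame of (fund_id, attribute, value),
    # e.g. the one from the earlier Kafka sketch:
    # events.writeStream.foreachBatch(write_to_hbase).start()
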
Value Based Management (VBM) Product
2018
Python
SQL
Apache Spark
Apache Hadoop
VBM stands for ‘Value Based Management’, a solution that allows businesses to improve their profitability by providing detailed customer-level insight into which customers deliver the most value. It also looks to create an appropriate and sustainable approach to governance and a culture focused on long-term value creation. I was the lead developer of a configuration-driven product containing VBM’s core modules: tenure and cashflow forecasts as well as investment data integration. I also participated in different implementations of this product, delivering client-specific customization and supporting technical deployment in different environments, both cloud (AWS) and on-premises Hadoop clusters. The product is written in Python and supported by Apache Spark.
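
Illustrative only: a toy version of what a tenure-forecast module in such a product might look like in PySpark. The column names and the constant-churn (geometric) assumption are invented for the example and do not reflect the product's actual algorithms.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("vbm-tenure-forecast").getOrCreate()

    # Hypothetical customer base: one row per customer with a monthly churn rate.
    customers = spark.createDataFrame(
        [("c1", 0.02), ("c2", 0.05)],
        ["customer_id", "monthly_churn_rate"],
    )

    # Under a constant monthly churn rate p, expected remaining tenure is 1 / p months.
    forecast = customers.withColumn(
        "expected_remaining_tenure_months",
        F.lit(1.0) / F.col("monthly_churn_rate"),
    )

    forecast.show()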