Personal details

Josiah B. - Remote

Timezone: Central Time (US & Canada) (UTC-5)

Summary

I have specialized in Big Data technologies, especially Hadoop-ecosystem tools such as Apache Spark, Flume, HBase, HDFS, Hive LLAP, and Impala. This career has led me to develop applications that implement Machine Learning models, predictive algorithms, and NLP algorithms, and that ingest large datasets. I'm well versed in concurrent and parallel programming and comfortable with both object-oriented and functional programming approaches.

I really love teaching people and sharing my knowledge. I promise that in the time that I spend mentoring you, I will pour into you as much of my knowledge as I can to give you the best chance possible in the industry.

Work Experience

Senior Data Engineer
Pinsight Media | Apr 2018 - Present
Scala
Linux
Shell
Pandas
Apache Spark
Apache Kafka
Apache Hadoop
Apache Airflow
Mostly writing Spark processing pipelines over very large datasets (half a petabyte or more).
Hadoop Architect
Triple-I Corporation | AMC Theatres | May 2017 - Apr 2018
Java
Scala
Pandas
Machine Learning
NLP (Natural Language Processing)
Apache Spark
Apache Hadoop
Apache Flume
Python 2
Played a lead role in getting AMC Theatres' Big Data initiative off the ground.
Responsibilities and Accomplishments:
- Extended a Spark Sentiment Analyzer written in Scala using Stanford CoreNLP to analyze complex customer feedback.
- Wrote a custom Flume Source plugin in Java and Scala + Cats for ingesting a vendor's real-time HTTPS event stream.
- Used Scala, Akka, Scalatra, and Cats to develop an HTTP-based custom Flume client.
- Co-administered a CDH 5 (Cloudera) cluster.
- Trained peers and engineers in Hadoop software development and Scala programming.
- Advised on development process and workflow.
- Carried out exploratory research and project idea generation.
- Developed new solutions and apps leveraging Hadoop technologies including Flume, Spark, Impala, Hive, and HBase.
- Deployed new Hadoop apps and plugins to a Kerberized CDH 5 cluster.
- Set up applications to run under SysVinit, Upstart, or systemd.
- Set up system-initiated applications to auto-authenticate to Kerberos using keytabs.
- Served as automation engineer and advisor.
- Applied Haskell-style functional programming in Scala using Cats.
- Imported deeply nested JSON files into Hive and Impala and flattened them into a traditional SQL table structure.
- Wrote real-time apps that ingest data into HDFS using Linux shell scripting, Python, Java, and Scala.
- Created a Docker CDH 5 development sandbox for prototyping.

Personal Projects

Overnight Website Challenge
2014
HTML/CSS
Ruby on Rails
PostgreSQL
Heroku
JavaScript
Built a new website for KVC Health Systems in 24 hours.
Overnight Website Challenge
2017
HTML/CSS
Ruby on Rails
PostgreSQL
Heroku
Continuous Integration
Docker
React
JavaScript
Continuous Deployment
Redux
Built a website in 24 hours for PrincipalsConnect.