Wilson Bittencourt is in direct contact with the company and can answer any questions you may have.
Role Overview
We are seeking a skilled Spark Optimization Expert focused on improving the performance of Spark jobs that execute Python UDF (user-defined function) code. We use Spark to parallelize custom optimization problems that run on medium-sized data, which is regularly cached to tables. These jobs appear to use far more RAM than we expect given the size and partitioning scheme of the data.
Responsibilities
- Analyze and optimize Spark jobs to improve performance and efficiency.
- Work with Python UDF code to enhance execution within Spark environments.
- Identify bottlenecks and implement solutions to improve processing speed and resource utilization.
- Collaborate with data engineering teams to ensure seamless integration of optimized Spark jobs into broader data workflows.
Required Skills
- Extensive experience with Apache Spark job configuration and optimization techniques.
- Proficiency in writing and optimizing Python UDF code for Spark.
- Strong analytical skills to identify and resolve performance issues in Spark jobs.
- Familiarity with distributed computing concepts and best practices.
- Relevant experience working with Kubernetes is required.
Nice to Have
- Experience with data engineering and integration in complex data environments.
- Knowledge of additional data processing frameworks and tools.
- Experience with or knowledge of the Fugue API.