About the job – Data Scientist II at Green Cabbage
At Green Cabbage, we empower our customers in the procurement and negotiation process with industry data that helps them save money and time and reduce risk. The company is growing rapidly, and we want to set the foundation for accelerating our development team. We are currently designing, building, and optimizing our core data science offerings, and we are looking for a senior member to join our team to steward experimentation and innovation.
Examples of the skills, knowledge, and experiences you need to lead and deliver value at this level include but are not limited to:
- Analyze and identify the linkages and interactions between the component parts of an entire system.
- Take ownership of projects, ensuring their successful planning and technical execution.
- Partner with team leadership to ensure collective ownership of quality, timelines, and deliverables.
- Develop skills outside your comfort zone and encourage others to do the same.
- Effectively mentor others.
- Use the review of work as an opportunity to deepen the expertise of team members.
- Address conflicts or issues, engaging in difficult conversations with clients, team members and other stakeholders, escalating where appropriate.
Minimum Degree Required
Bachelor's Degree
Minimum Year(s) of Experience
7 years
Demonstrates extensive experience:
- Contributing heavily to building AI and GenAI solutions, including but not limited to analytical modeling, prompt engineering, general-purpose programming (e.g., Python), testing, communication of results, front-end and back-end integration, and iterative development with clients;
- Documenting and analyzing business processes for AI and Generative AI opportunities, including gathering of requirements, creation of initial hypotheses, and development of an AI/GenAI solution approach;
- Managing teams to process unstructured and structured data to be consumed as context for LLMs, including but not limited to embedding of large text corpora, generative development of SQL queries, and building of connectors to structured databases; and
- Directing data engineers and other data scientists to deliver efficient solutions.
Demonstrates extensive abilities and/or a proven record of success learning and performing in functional and technical capacities, including the following areas:
- Managing GenAI applications, including back-end and front-end integrations
- Using Python (e.g., Pandas, NLTK, Scikit-learn, Keras), common LLM development frameworks (e.g., LangChain, Semantic Kernel), relational storage (SQL), and non-relational storage (NoSQL)
- Experience with analytical techniques such as machine learning, deep learning, and optimization
- Vectorization and embedding, prompt engineering, and retrieval-augmented generation (RAG) workflow development
- Understanding of or hands-on experience with Azure, AWS, and/or Google Cloud platforms
- Experience with Git version control, unit/integration/end-to-end testing, CI/CD, release management, etc.