Job opportunities in the Octopus Ventures portfolio

Senior Data Engineer (with Data Science Skills)

PeakData

Data Science
Wrocław, Poland
Posted on Jul 1, 2025

About PeakData

PeakData provides AI-powered market intelligence to optimize drug launch execution and resource allocation for pharmaceutical companies. Our platform delivers actionable insights on healthcare professionals (HCPs) and healthcare organizations (HCOs), empowering commercial leaders with real-time, data-driven decision-making.

Role Overview

We're looking for a Senior Data Engineer with strong data science capabilities to join our Data Platform team. In this role, you'll design and build cloud-native data solutions that support large-scale processing, analytics, and AI-powered automation across our platform.

This is a hands-on, senior-level role. You will be expected to work independently, own end-to-end pipelines and infrastructure, and drive initiatives forward both individually and within the team. You should have a strong foundation in Python and SQL, hands-on experience with AWS and/or GCP, and experience integrating LLMs into data workflows.

Tech Environment

You’ll work with and expand upon:

  • Python for data pipelines and automation
  • SQL (PostgreSQL) for transformation and analytics
  • AWS (S3, Glue, Lambda, ECS, Bedrock) as the primary cloud environment
  • GCP (Vertex AI) for select workloads and integrations
  • Medallion architecture with RAW/CLEANED/CURATED layers (sketched in the example below)
  • LLM integrations for automation, enrichment, and insight generation
  • Data quality frameworks and orchestration tools (e.g., Argo)
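
To make the layered setup concrete, here is a minimal, hypothetical sketch of a medallion-style RAW → CLEANED → CURATED flow in Python. The table, column names, and pandas-based transforms are illustrative assumptions for this posting, not PeakData's actual pipeline, which runs on AWS and GCP services rather than local DataFrames:

    # Hypothetical sketch of a medallion-style (RAW -> CLEANED -> CURATED) flow.
    # All names and transforms are illustrative assumptions, not PeakData's
    # production pipeline, which runs on S3/Glue rather than local frames.
    import pandas as pd

    def ingest_raw() -> pd.DataFrame:
        # RAW layer: data lands as-is, duplicates and nulls included.
        return pd.DataFrame({
            "hcp_id": [101, 101, 102, None],
            "specialty": ["oncology", "oncology", "cardiology", "oncology"],
            "publications": [12, 12, 5, 3],
        })

    def clean(raw: pd.DataFrame) -> pd.DataFrame:
        # CLEANED layer: drop unusable rows and duplicates, enforce types.
        cleaned = raw.dropna(subset=["hcp_id"]).drop_duplicates()
        return cleaned.astype({"hcp_id": int})

    def curate(cleaned: pd.DataFrame) -> pd.DataFrame:
        # CURATED layer: aggregate into an analytics-ready data product.
        return (cleaned.groupby("specialty", as_index=False)
                       .agg(hcp_count=("hcp_id", "nunique"),
                            avg_publications=("publications", "mean")))

    if __name__ == "__main__":
        print(curate(clean(ingest_raw())))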

Key Responsibilities

Engineering Ownership

  • Design, implement, and maintain scalable and efficient data pipelines across AWS and GCP
  • Build data products and services supporting both internal analytics and client-facing insights
  • Own ETL/ELT workflows from ingestion to curation
  • Implement observability and alerting for pipeline health and data integrity
  • Integrate LLMs into workflows to support enrichment, automation, or intelligent data handling

Team Leadership & Initiative

  • Act as a technical lead for data engineering projects, driving execution independently
  • Collaborate cross-functionally with Data Science, Product, and Engineering teams
  • Contribute to architectural decisions and long-term data platform evolution
  • Champion best practices for performance, security, and scalability

Data Science & LLM Integration

  • Apply data science techniques where appropriate (e.g., clustering, statistical inference)
  • Prototype and validate LLM-powered solutions using tools like AWS Bedrock or Vertex AI (see the sketch after this list)
  • Use prompt engineering and evaluation frameworks to refine LLM interactions
  • Help bridge engineering and AI innovation across the platform
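
As a hedged illustration of the LLM prototyping mentioned above, the sketch below calls a model through the AWS Bedrock converse API via boto3 to normalize a free-text field. The model ID, prompt, and field are illustrative assumptions, not PeakData's actual workflow, and running it requires AWS credentials with Bedrock access:

    # Hypothetical sketch: enriching a record with an LLM via AWS Bedrock.
    # The model ID, prompt, and field are illustrative assumptions, not
    # PeakData's actual enrichment workflow. Requires AWS credentials.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    def classify_specialty(free_text: str) -> str:
        # Ask the model to normalize a free-text specialty description.
        response = client.converse(
            modelId="anthropic.claude-3-haiku-20240307-v1:0",
            messages=[{
                "role": "user",
                "content": [{"text": f"Return only the medical specialty mentioned in: {free_text}"}],
            }],
            inferenceConfig={"maxTokens": 20, "temperature": 0.0},
        )
        return response["output"]["message"]["content"][0]["text"].strip()

    if __name__ == "__main__":
        print(classify_specialty("Dr. Kowalski treats tumours on the oncology ward"))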

Qualifications

Required Skills & Experience

  • 6+ years of experience in data engineering or back-end systems with data-heavy workloads
  • Strong hands-on skills with Python and SQL
  • Deep understanding of AWS cloud data tooling (S3, Lambda, Glue, Step Functions, etc.)
  • Working experience with GCP services, especially BigQuery and Vertex AI
  • Exposure to LLMs and how they integrate into data workflows
  • Experience building data pipelines at scale with monitoring and alerting
  • Ability to work independently and take ownership of technical topics

Bonus Skills

  • Experience with Argo, Airflow, or similar orchestration frameworks
  • Familiarity with infrastructure-as-code (IaC) tools such as Terraform
  • Experience with data quality monitoring, validation frameworks, or anomaly detection
  • Previous work in healthcare, life sciences, or regulated data environments

Personal Attributes

  • Proactive: You take initiative and don’t wait for tasks to be assigned
  • Autonomous: You can own projects from design to production with minimal oversight
  • Curious: You explore new approaches (especially LLMs/AI) and bring them to the table
  • Collaborative: You work well with cross-functional teams
  • Customer-aware: You understand the real-world impact of your pipelines and models

What We Offer

  • Purpose-driven work: support pharmaceutical innovation and better patient outcomes
  • Ownership: real autonomy in shaping our data systems and how they scale
  • Innovation: work on LLM integration and next-gen data workflows
  • A collaborative, fast-moving environment
  • Competitive compensation
  • Access to both AWS and GCP ecosystems in production

If you're a hands-on data engineer who enjoys owning end-to-end systems, loves solving real business problems, and thrives in a hybrid cloud + AI environment — we want to talk to you.