Lead Product Data Engineer
Inovia Bio
Job Title: Lead Product Data Engineer
Location: London, UK (Hybrid)
Salary: £95,000 - £105,000/year + options + private medical insurance + unlimited holiday
Company: Inovia Bio
This role is for candidates only, not recruiters.
About Us:
At Inovia Bio we believe that drug development is in dire need of disruptive thinking.
The current ecosystem supporting biopharmaceutical companies is typified by slow, archaic incumbents that extract as much value as possible from drug developers and their companies while often delivering utter garbage. We’re changing that!
From the way data is used to make early decisions through to how development plans can be accelerated and de-risked, there are significant opportunities to get drugs to patients faster. Your role will be central to our mission of revolutionising drug development.
Who are we?
Inovia Bio is a TechBio startup based in London. We partner with innovative biopharmaceutical companies, providing both our proprietary technology platforms and our expertise to deliver unprecedented impact to their programs. We are the company drug developers dream of working with and the company CROs dread.
An Atypical Role:
The Lead Product Data Engineer at Inovia operates differently. We believe that drug development can only be solved by merging medicine and technology, and so will you!
We are seeking a seasoned Lead Product Data Engineer with a strong product-oriented mindset who fully embraces collaboration with medical colleagues. You will feel comfortable moving into a new industry, and learning is a key part of your nature.
You will take ownership of end-to-end data processes and embrace collaboration. You will work closely with cross-functional teams to design, build, and maintain robust data infrastructure on the Google Cloud Platform (GCP), enabling data-driven decision-making and facilitating our mission to accelerate drug development.
Key Responsibilities:
- Collaboration: Work with data epidemiologists, product managers, and stakeholders to define requirements, iterate on solutions, and ensure alignment with business objectives.
- Design & Development: Develop, test, and deploy scalable data pipelines using Python, Flask, Apache Spark/Beam, and GCP Dataflow to meet the needs of our customers while collaborating with the medical, biostatistics and epidemiology teams.
- Data Orchestration & Automation: Utilize Airflow for workflow orchestration and Cloud Functions for seamless integration with GCP services, automating critical processes.
- Real-Time Data Processing: Implement data streaming solutions with Pub/Sub or Kafka, ensuring low-latency processing and real-time analytics capabilities.
- Data Warehouse Optimisation & Modelling: Set up and optimise BigQuery, SQL and dbt models, and Vertex AI workflows, for advanced data transformations and machine learning model serving.
- Data Visualization: Collaborate with stakeholders to define requirements and build interactive dashboards on Looker, translating complex datasets into actionable insights.
- Infrastructure as Code: Use Terraform to define and manage infrastructure in a reproducible manner, ensuring a scalable, secure, and consistent data platform.
- CI/CD Pipelines: Implement and monitor CI/CD pipelines to support continuous integration and automated deployment of data workflows.
- ML Models: Implement and optimise ML models, drawing on a good mathematical understanding of LLMs and general machine learning models.
Key Requirements:
· Collaboration: Work with the medical team, product managers, and stakeholders to define requirements, iterate on solutions, and ensure alignment with business objectives.
· Product-Oriented Mindset: Ability to prioritize customer needs and translate them into actionable data solutions.
· You believe in “strong opinions loosely held”: No idea or approach is sacred – you’re comfortable speaking up and challenging ideas and comfortable receiving feedback.
· Technical Proficiency: Extensive experience with Python, Flask, Airflow, GCP Dataflow, Apache Spark/Beam, Cloud Functions, Pub/Sub (or Kafka), Vertex AI, CI/CD, Linux, BigQuery, SQL, dbt, Looker, Git and GitFlow.
· Google Cloud Platform Knowledge: Deep understanding of GCP services and architecture best practices and use of Terraform.
· You have worked in an Agile environment
· 3+ years of experience
· Eligibility to work in the UK
· All successful applicants will need to pass a DBS check.
Good-to-Haves:
- MLOps Experience: Familiarity with ML model deployment and monitoring, especially within a GCP/Vertex AI environment.
- Full-Stack Development Skills: Experience with web development frameworks and integrating front-end and back-end services.
- Web Crawling: Hands-on experience with web scraping and data extraction techniques.
- Looker Dashboard Expertise: Proven experience in building and optimizing Looker dashboards for data storytelling.
What we offer:
- Competitive salary in the range of £95,000 to £105,000/year.
- Equity options to share in our success.
- Private medical insurance.
- Unlimited holiday policy.
- An opportunity to work on impactful data projects in a dynamic and innovative environment.
- Hybrid work environment.
- Rapid progression opportunities.
- A zero-politics work culture.
- Learning budget.