Senior / Staff ML Researcher (London)

Recurse ML

Software Engineering, Data Science
London, UK
Posted on Friday, July 5, 2024

Our Mission

Recurse’s mission is to make software engineering great again! We want to enable software engineers to spend more time solving hard problems and less time performing mundane codebase maintenance tasks. That’s why we are building ML tools that automate those tasks in large codebases, starting with dependency upgrades. Think: turning a story into a submitted PR.
Our product is an on-prem K8s service that interacts with the client’s Git repo. It consists of a mix of ML and old-school algorithmic components for analyzing build logs, navigating large (100K+ files) codebases, and editing source code. This will save thousands of engineering hours currently spent on dependency upgrades, freeing engineers to build new features and solve hard problems instead.
We’re looking for a Founding Engineer to join us on our journey! You’ll primarily work directly with the CTO, and as we grow, you’ll have the opportunity to choose between building and technical leadership.

About you

The ideal team member has designed and trained ML models that outperform previous baselines on a given task. In return, we offer technically stimulating work and the opportunity to deliver more impact than either the best academic institutions or commercial AI labs.
Ideally you:
Have published deep learning research at top-tier conferences (AAAI, NeurIPS, EMNLP, ICML, ACL). If your models were specific to code or natural language domains, that’s even better.
Have a PhD in machine learning or previous work demonstrating equivalent experience.
Have hands-on experience implementing ML papers. We use PyTorch, but we trust that you can learn a different framework.

About the problems you’ll be solving

You will be responsible for driving research to build the best code editing models for enterprise code. This is a challenging ML task as:
Most existing work focuses on generation of new code rather than guided editing.
The models need to understand huge projects (100K+ files), containing massive source files (3K+ lines).
As the models are expected to perform the task autonomously, the margin for error is very slim.
Although you’ll have freedom to choose the most impactful research problems, here are some representative examples that you might tackle:
Design a novel framework for code editing based on runtime feedback. See https://arxiv.org/pdf/1911.01205.pdf, https://arxiv.org/pdf/2304.05128.pdf, https://arxiv.org/pdf/2306.10763.pdf
Design an LLM decoding algorithm for code editing rather than greenfield generation. Most existing code generation strategies are designed for writing code from scratch rather than editing existing code. You'll have a chance to change that.
Tackle other high-impact problems, as identified by you!
If this resonates with your interests and experience, please shoot me an email (jack@recurse.ml) describing what aspect of Recurse excites you, and include your CV or a URL to your LinkedIn/GitHub.
Bonus points if you mention your favourite paper/blogpost.

Hiring Process

Send your application to jack@recurse.ml, including a CV and a URL to your GitHub profile.
Our hiring process consists of three steps:
Technical interview (Algorithms)
Technical interview (Systems Design)
Cultural assessment