Data Center Operations Manager - Europe Remote
Fluidstack
About Fluidstack
Fluidstack is the AI Cloud Platform. We build GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
Our team is highly motivated and focused on providing a world-class supercomputing experience. We put our customers first in everything we do, working hard not just to win the sale, but to win repeat business and customer referrals.
We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every interaction with us.
You must work hard, take ownership from inception to delivery, and approach every problem with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset.
About the Role
We’re looking for a Data Center Operations Manager to manage the ongoing operational performance of the GPU clusters that Fluidstack owns and operates. This is a “player-coach” role with both oversight and hands-on responsibilities, focused on ensuring the availability and performance of our data center infrastructure.
You’ll be the owner of everything that lives within our data centers, managing our Data Center Operations team as well as third parties, from installation through ongoing maintenance and the coordination of upgrades. Your primary responsibility is to ensure the continuous and efficient operation of the data center by managing on-site technicians and third-party vendors, creating and maintaining operational procedures, diagnosing issues, and providing hands-on technical support when higher-level intervention is required. This role is ideal for individuals who excel in environments that demand both operational discipline and the ability to navigate complex technical challenges.
Focus
Ensure high availability of our GPU infrastructure.
Manage the on-site team of data center technicians and third-party vendors in daily operations, including server maintenance, equipment installation, and troubleshooting.
Respond to and resolve technical issues and emergencies in a timely manner, ensuring minimal downtime and disruption.
Act as the interface between FDEs and the on-site team to ensure fast, effective technical remediation and incident resolution.
Undertake regular data center maintenance, performing inspections and audits of equipment to maintain optimal performance and reliability.
Proactively manage infrastructure by defining and continuously improving standard operating procedures (SOPs) for routine data center maintenance.
Manage third-party hardware vendors, including initiating and coordinating the RMA process.
Travel to various locations in the US and Europe on short notice, potentially for extended periods, when on-site support requires elevated, hands-on expertise.
About You
5+ years of experience in data center operations.
Proven ability to lead remote teams and manage vendors.
In-depth knowledge of data center infrastructure, including servers, networking equipment, and cooling systems.
Capable of training on-site data center technicians to perform routine physical maintenance.
Capable of remotely diagnosing hardware issues using common Linux and OOB utilities (dmesg, journalctl, dmidecode, lspci, mcelog, dcgmi, nvidia-smi, Redfish, IPMI, etc.).
Familiar with common inventory management systems (e.g. NetBox).
Strong communication and organizational skills.
Willing to travel internationally on short notice and to be based on-site for extended periods as required.
Nice-to-haves
Strong troubleshooting skills and the ability to quickly diagnose and resolve technical issues.
Experience with data center management tools and software.
Strong time management, communication, and interpersonal skills, with the ability to manage a team.
Benefits
Competitive total compensation package (cash + equity).
Retirement or pension plan, in line with local norms.
Health, dental, and vision insurance.
Generous PTO policy, in line with local norms.
Fluidstack is remote first, but has offices in key hubs. For all other locations, we provide access to WeWork.