Principal Site Reliability Engineer

UiPath

Administration, Software Engineering
Bengaluru, Karnataka, India
Posted on Feb 10, 2026

Location

Bangalore - Engineering

Employment Type

Full time

Location Type

On-site

Department

Engineering

Life at UiPath

The people at UiPath believe in the transformative power of automation to change how the world works. We’re committed to creating category-leading enterprise software that unleashes that power.

To make that happen, we need people who are curious, self-propelled, generous, and genuine. People who love being part of a fast-moving, fast-thinking growth company. And people who care—about each other, about UiPath, and about our larger purpose.

Could that be you?

Your mission

UiPath is seeking a Principal Site Reliability Engineer who excels across the full reliability stack — not limited to a single silo. You will help define how reliability is architected, scaled, measured, and automated across our large-scale, cloud-native systems. This role requires broad technical judgment, systems thinking, and the ability to influence reliability outcomes across product and platform teams.

This role is about shaping how reliability works at UiPath — not just firefighting outages or writing code. You will partner with engineering and platform teams to embed reliability into systems, workflows, and culture. You will help raise the bar for how we observe, automate, and ensure our systems scale reliably under real-world load and failure conditions.

You will own service reliability, observability, automation, and continuous improvement initiatives, and partner with our Romania- and India-based application teams as needed.

What you'll do at UiPath

End-to-End Reliability Ownership - Define and evolve reliability strategy for distributed systems, balancing availability, performance, velocity, and cost through clear SLIs/SLOs and error budgets.

Incident Response & Operational Excellence - Lead and contribute to the response to high-severity incidents, drive structured troubleshooting under ambiguity, and ensure durable systemic improvements.

Observability & Operational Insights - Define and promote strong observability practices so that service health and performance risks are visible and actionable.

Automation, Tooling & Engineering Rigor - Automate manual operational work through tooling and self-service, applying disciplined engineering practices.

Infrastructure, Cloud & IaC - Drive reliable, scalable cloud infrastructure using Infrastructure as Code and collaborate with platform teams on best practices.

Technical Leadership & Org Impact - Influence standards, mentor senior engineers, and elevate operational reliability across the organization.


What you'll bring to the team

Engineering & Reliability Experience

• 7+ years of experience in SRE, platform, cloud, or infrastructure engineering roles with a track record of improving reliability for production systems.

• Demonstrated ability to define and operationalize SLIs and SLOs, and to use frameworks such as error budgets to align reliability with user impact and business goals.

Systems Thinking & Distributed Systems Fundamentals

• Strong conceptual understanding of distributed systems, performance bottlenecks, failure modes, and trade-offs inherent to large-scale systems.

Scripting & Tooling

• Proficiency in at least one programming language (e.g., Python, Go, or similar) used to build automation, internal tooling, and reliability workflows.

• Experience developing tools and automation to reduce operational toil and improve system reliability.

Cloud & Infrastructure Expertise

• Hands-on experience working with one or more major cloud providers (Azure, AWS, GCP), with practical knowledge of networking, deployments, and scaling.

• Experience with Infrastructure as Code (e.g., Terraform, Pulumi) and container orchestration (e.g., Kubernetes) in production contexts.

Observability & Operational Practices

• Proven experience with monitoring/observability stacks (metrics, logs, traces) and building meaningful dashboards and alerts that improve reliability signals.

Incident Response & Post-Incident Learning

• Experience participating in and improving incident response, blameless postmortems, and implementing systemic fixes rather than symptomatic patches.

Collaboration & Influence

• Ability to partner with product, infrastructure, and engineering teams to influence architecture and reliability practices without direct authority.

Nice to Have

• Experience with chaos engineering, resilience testing, or performance optimization.

• Exposure to service mesh, reliability scoring frameworks, or AIOps tooling.

Maybe you don’t tick all the boxes above—but still think you’d be great for the job? Go ahead, apply anyway. Please. Because we know that experience comes in all shapes and sizes—and passion can’t be learned.

Many of our roles allow for flexibility in when and where work gets done. Depending on the needs of the business and the role, the number of hybrid, office-based, and remote workers will vary from team to team. Applications are assessed on a rolling basis and there is no fixed deadline for this requisition. The application window may change depending on the volume of applications received or may close immediately if a qualified candidate is selected.

We value a range of diverse backgrounds, experiences and ideas. We pride ourselves on our diverse and inclusive workplace that provides equal opportunities to all persons regardless of age, race, color, religion, sex, sexual orientation, gender identity and expression, national origin, disability, neurodiversity, military and/or veteran status, or any other protected classes. Additionally, UiPath provides reasonable accommodations for candidates on request and respects applicants' privacy rights. To review these and other legal disclosures, visit our privacy policy.