At Docplanner, we love building software that makes a real difference. Our site reliability engineers (SREs) play a key role in making sure our users get powerful features, fast performance, and rock-solid reliability, so they can focus on what matters most to them. As more and more customers rely on our platform, we’re looking for an experienced SRE to help us build great foundations. We’re after someone who brings fresh ideas and a unique perspective and who, like us, thinks like an owner, building practical solutions and great user experiences every step of the way.
Objectives of this role
Operate production environments by monitoring availability and taking a holistic view of system health.
Measure and optimize system performance to stay ahead of customer needs and drive continuous innovation.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Provide primary operational support and engineering expertise for multiple large-scale, distributed software applications.
Responsibilities
Ensure reliability and availability of systems through monitoring, alerting, and incident response.
Investigate and resolve incidents, perform root cause analysis, and implement long-term fixes.
Define and maintain SLOs/SLIs to measure and drive service quality.
Continuously improve performance and optimize infrastructure cost and resource usage.
Collaborate with developers to build scalable, fault-tolerant systems and improve deployment practices.
Automate operational tasks to reduce manual toil and improve efficiency.
What will help you thrive?
Monitoring and observability - Experience with a monitoring stack such as Datadog, OpenTelemetry (OTEL), or Prometheus.
Detective mindset - A strong investigative approach to troubleshooting and resolving complex issues.
.NET experience - Familiarity with the .NET environment and the ability to code.
AWS experience - Experience working with AWS services and cloud-native architectures.
Kubernetes - Practical experience deploying, managing, and troubleshooting applications in Kubernetes; understanding of containers, Helm, and scaling strategies.
Think like an owner - Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Communicator - Equally fluent when talking to humans or machines; clear, effective communication across teams and tools.
Nice to have
Proficiency in scripting or programming with languages such as Python or Go – to support automation and tooling development.
Hands-on experience in Site Reliability Engineering practices – including incident management and service-level objectives.
Understanding of microservices architecture – with experience in designing, observing, and troubleshooting distributed systems.
Let’s talk money
True flexibility and work-life balance
Health comes first
We promote and embrace equal opportunities in our hiring process, and also every day at work. When you apply for our roles you receive equal treatment regardless of age, disabilities, gender reassignment, marital or civil partner status, pregnancy or parental status, race, colour, nationality, ethnic or national origin, religion or belief, sex, sexual orientation or any other dimension of human difference. If you require additional support in your recruitment process, we kindly encourage you to let us know. Behind those words you’re reading, there’s a person (hi!) who already helped a candidate by adapting the interviews, and now we’re lucky to have this person with us. So, even if you’ve never asked for it before, may this serve as a sign that, now, you can do so. We can only truly be equal if we adapt to each other.
“We believe all humans, in all their beautiful diversity, should have equal rights, dignity and respect. Period.” Mariusz Gralewski, CEO