Site Reliability Engineer (SRE)

Tenbyte Limited is a fast-growing cloud tech company building cloud infrastructure and video streaming infrastructure as a service. We operate across Bangladesh and Malaysia, serving real customers at real scale. High standards. Small team. Big impact.

The role

We’re looking for a hands-on Site Reliability Engineer who thrives on building and operating systems from the ground up. This role is for someone who prefers understanding infrastructure deeply rather than relying heavily on managed services.

You’ll be responsible for ensuring high availability, performance, and scalability across our systems while owning uptime end-to-end.

What you’ll do

  • Design, build, and maintain highly reliable and scalable infrastructure across production environments
  • Work hands-on with microservices and distributed systems in real-world, high-availability setups
  • Build automation for deployments, configuration management, and incident response workflows
  • Implement and improve observability systems including metrics, logging, and distributed tracing
  • Troubleshoot and resolve complex production issues across the stack (application, OS, network, and infrastructure layers)
  • Support on-call operations and participate in incident response when required
  • Collaborate with engineering teams to improve system performance, reliability, and deployment practices

What you’re expected to deliver

  • Consistent high availability across all production systems with minimal downtime
  • Scalable infrastructure designs capable of handling growing traffic and system load efficiently
  • Fully automated deployment, monitoring, and recovery processes with reduced manual intervention
  • Strong observability coverage across all critical services enabling fast detection and diagnosis of issues
  • Rapid and effective incident resolution with structured root cause analysis and permanent fixes
  • Continuous improvement of system performance, resilience, and operational efficiency
  • Clear capacity planning practices ensuring systems scale predictably under load

Must-haves

  • Strong hands-on experience with microservices and distributed systems in production
  • Deep expertise in Linux, networking, and bare metal infrastructure
  • Strong ability to operate and troubleshoot infrastructure-level systems with a full-ownership mindset
  • Solid scripting and automation skills (Bash, Python, or similar)
  • Practical experience with monitoring, logging, and observability tools
  • Strong production debugging and incident handling experience
  • Proven track record of improving system reliability over time

Compensation and benefits

  • Competitive salary with performance-based incentives
  • Paid maternity and paternity leave
  • Two days off per week
  • Annual increments and two festival bonuses
  • Yearly refresh tour
  • Fully subsidized lunch & snacks
  • Other benefits per company policy

How to apply

Send your resume along with a cover letter. In your cover letter, tell us specifically why you’re the right person to own this role, what you’ve shipped, how you’ve led delivery, and why Tenbyte is the right next step for you.
