DevOps/SRE Team Lead
Telestream, LLC
United States | Full-Time | Lead | Machine Learning
Posted: May 5, 2026
Skills & Technologies
Python, Kubernetes, Go, Terraform, Jenkins, CI/CD, Git, Scala
Job Description
About Telestream
Telestream is a leading provider of digital media tools and software solutions for the broadcast, streaming, and media industries. We empower content creators and distributors to produce and deliver high-quality video content while optimizing operations and maximizing revenue. Our teams work diligently to innovate and support world-class services, and we are seeking a DevOps/SRE Team Lead with proven, hands-on Kubernetes expertise to drive the reliability and scalability of our video processing infrastructure and oversee a small team of SREs and DevOps engineers.
This is a deeply technical lead role, requiring real-world experience administering production Kubernetes clusters—not theoretical familiarity. You will own CI/CD pipelines, infrastructure automation, and cloud platform operations in a fully remote environment where independent execution is essential. If you have built, broken, and fixed things in Kubernetes at scale, while managing and mentoring a team, we want to hear from you.
Location: US Remote
Work Authorization: Candidates must be legally authorized to work in the United States. This role is not eligible for employer-sponsored work authorization or visa sponsorship of any kind, now or in the future.
OUR INTERVIEW PROCESS
Our process includes a live, hands-on technical interview conducted via shared terminal and screen share. You will be asked to work through real Kubernetes and infrastructure scenarios in real time: no take-home exercises, no slides. Candidates who are comfortable with the skills listed in this posting will do well; candidates who are not will find this stage difficult to navigate. We value people who are direct about what they know and what they’re still learning.
Requirements
What You Will Do
You will spend 70-80% of your day hands-on in the following areas:
Design, deploy, and administer production Kubernetes clusters, including workload scheduling, namespace management, RBAC, network policies, and cluster upgrades
Design and maintain continuous integration/deployment pipelines to automate testing and deployment, including Kubernetes-native delivery workflows using Helm and ArgoCD or equivalent
Track software performance, fix errors, troubleshoot systems, and implement preventative measures to ensure smooth workflows
Implement and manage infrastructure
Utilize Terraform or CloudFormation for IaC management
Optimize cloud resources by implementing cost-effective solutions
Collaborate with various teams to ensure smooth deployment
Monitor and create new processes based on performance analysis
Implement security best practices, including automated compliance checks and secure code deployment
You will spend 20-30% of your time on the following management responsibilities:
Manage the technical roadmap and architecture while mentoring SRE and DevOps engineers (player/coach)
Hire, coach, and manage a team of DevOps engineers and Site Reliability Engineers.
Communicate clearly, resolve conflicts, and influence without authority
Define DevOps/Platform roadmap aligned with business goals (e.g., cloud cost optimization, automation maturity).
Excellent communication and collaboration skills
What You Will Bring
Bachelor’s degree in Computer Science, Engineering, or equivalent
5-8+ years of experience in DevOps/SRE, with 2-3+ years in a leadership role.
Hands-on experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or equivalent) with direct integration into Kubernetes deployment workflows
Production-level experience with infrastructure as code (Terraform required; CloudFormation or Pulumi a plus), including managing cloud-hosted Kubernetes clusters (EKS, GKE, or AKS)
Experience with monitoring, logging, and observability tooling in Kubernetes environments (Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent); ability to build dashboards and alerts from scratch, not just consume existing ones
Demonstrated, hands-on Kubernetes experience in production environments: cluster administration, Helm chart authoring and management, RBAC configuration, persistent storage, horizontal/vertical pod autoscaling, and diagnosing and resolving real production failures (CrashLoopBackOff, OOMKilled, networking issues, etc.)
Strong troubleshooting skills with the ability to diagnose infrastructure and application issues live, under pressure, without reference materials—this is evaluated directly in our interview process
Proficiency in scripting languages (Python, Go, Bash, or PowerShell); ability to write and own automation scripts, not just modify existing ones
Benefits
Perks That Power Your Life
We offer a comprehensive package designed to support your health, financial security, and work-life balance. Our benefits are built to keep you healthy, supported, and free to do your best work.
Day-one medical, dental & vision coverage
100% company-paid life + disability insurance
401(k) with a sweet company match (up to 8%)
Quarterly HSA boosts & flexible spending accounts
Flexible time off (salaried) or PTO (hourly) + generous paid holidays
Pet insurance (yes, your dog gets benefits too)
Legal plan + extras like accident & critical illness coverage
Telestream is committed to a fair and transparent hiring process. We do not use artificial intelligence (AI) to screen, evaluate, or make selection decisions about applicants. All applications are reviewed by our recruiting team and hiring managers to ensure each candidate receives thoughtful, human consideration based on their qualifications and experience.