JobDir — Jobs Directory

Description

About Telestream

Telestream is a leading provider of digital media tools and software solutions for the broadcast, streaming, and media industries. We empower content creators and distributors to produce and deliver high-quality video content while optimizing operations and maximizing revenue. Our teams work diligently to innovate and support world-class services, and we are seeking a DevOps/SRE Team Lead with proven, hands-on Kubernetes expertise to drive the reliability and scalability of our video processing infrastructure and oversee a small team of SREs and DevOps Engineers.

This is a deeply technical lead role requiring real-world experience administering production Kubernetes clusters, not theoretical familiarity. You will own CI/CD pipelines, infrastructure automation, and cloud platform operations in a fully remote environment where independent execution is essential. If you have built, broken, and fixed things in Kubernetes at scale while managing and mentoring a team, we want to hear from you.

Location: US Remote

Work Authorization: Candidates must be legally authorized to work in the United States. This role is not eligible for employer-sponsored work authorization or visa sponsorship of any kind, now or in the future.

Our Interview Process

Our process includes a live, hands-on technical interview conducted via shared terminal and screen share. You will be asked to work through real Kubernetes and infrastructure scenarios in real time: no take-home exercises, no slides. Candidates who are comfortable with the skills listed in this posting will do well. Candidates who are not will find this stage difficult to navigate. We value people who are direct about what they know and what they're still learning.

Requirements

What You Will Do

You will spend 70-80% of your day hands-on in the following areas:

- Design, deploy, and administer production Kubernetes clusters, including workload scheduling, namespace management, RBAC, network policies, and cluster upgrades
- Design and maintain continuous integration/deployment pipelines to automate testing and deployment, including Kubernetes-native delivery workflows using Helm and ArgoCD or equivalent
- Track software performance, fix errors, troubleshoot systems, and implement preventative measures to ensure smooth workflows
- Implement and manage infrastructure, using Terraform or CloudFormation for IaC management
- Optimize cloud resources by implementing cost-effective solutions
- Collaborate with various teams to ensure smooth deployment
- Monitor and create new processes based on performance analysis
- Implement security best practices, including automated compliance checks and secure code deployment

You will spend 20-30% of your time on the following management areas:

- Manage the technical roadmap and architecture while mentoring SRE and DevOps Engineers (player/coach)
- Hire, coach, and manage a team of DevOps Engineers and Site Reliability Engineers
- Strong communication, conflict resolution, and the ability to influence without authority
- Define a DevOps/Platform roadmap aligned with business goals (e.g., cloud cost optimization, automation maturity)
- Excellent communication and collaboration skills

What You Will Bring

- Bachelor's degree in Computer Science, Engineering, or equivalent
- 5-8+ years of experience in DevOps/SRE, with 2-3+ years in a leadership role
- Hands-on experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or equivalent) with direct integration into Kubernetes deployment workflows
- Production-level experience with infrastructure as code (Terraform required; CloudFormation or Pulumi a plus), including managing cloud-hosted Kubernetes clusters (EKS, GKE, or AKS)
- Experience with monitoring, logging, and observability tooling in Kubernetes environments (Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent); ability to build dashboards and alerts from scratch, not just consume existing ones
- Demonstrated, hands-on Kubernetes experience in production environments: cluster administration, Helm chart authoring and management, RBAC configuration, persistent storage, horizontal/vertical pod autoscaling, and diagnosing and resolving real production failures (CrashLoopBackOff, OOMKilled, networking issues, etc.)
- Strong troubleshooting skills with the ability to diagnose infrastructure and application issues live, under pressure, without reference materials; this is evaluated directly in our interview process
- Proficiency in scripting languages (Python, Go, Bash, or PowerShell); ability to write and own automation scripts, not just modify existing ones

Benefits

Perks That Power Your Life

We offer a comprehensive package designed to support your health, financial security, and work-life balance. Our benefits are built to keep you healthy, supported, and free to do your best work.

- Day-one medical, dental & vision coverage
- 100% company-paid life + disability insurance
- 401(k) with a sweet company match (up to 8%)
- Quarterly HSA boosts & flexible spending accounts
- Flexible time off (salaried) or PTO (hourly) + generous paid holidays
- Pet insurance (yes, your dog gets benefits too)
- Legal plan + extras like accident & critical illness coverage

Telestream is committed to a fair and transparent hiring process. We do not use artificial intelligence (AI) to screen, evaluate, or make selection decisions about applicants. All applications are reviewed by our recruiting team and hiring managers to ensure each candidate receives thoughtful, human consideration based on their qualifications and experience.

DevOps/SRE Team Lead
