Job Details

Analyst, Site Reliability Engineering


Date Opened: 11/02/2023

Job Type:

Job Number: 23000485

Job Description

Who We Are: 

Saks Cloud Services (SCS) is an operating company within Saks, the premier digital platform for luxury fashion. SCS provides IT infrastructure services, technology consulting, and systems integration services, while also serving as a software reseller and service provider.


Role Summary: 

The successful candidate will primarily focus on identifying and analyzing technical problems on systems and applications across all supported divisions. They will work closely with cross-functional IT teams to troubleshoot and resolve application-related issues, and play a key role in implementing new solutions that improve the efficiency and effectiveness of the team and organization. The ideal candidate has a strong technical background and communicates effectively with technical and non-technical stakeholders.


Key Qualifications: 

  • 2-3 years of related work experience, preferably in SRE or DevOps-related fields.
  • Understand customer business processes and transactions.
  • Understand application architecture and design, and analyze non-functional requirements and SLIs/SLOs.
  • Independently troubleshoot performance, scalability, capacity, resilience, and reliability issues and correlate them to application code and configurations.

Role Description: 

  • 2-3 years of experience working within DevOps or SRE teams.
  • 2+ years of experience with any cloud platform
  • Ability to program (structured and OO) with one or more high-level languages, such as Python, Go, Java, and JavaScript
  • Be on a PagerDuty rotation to respond to availability incidents and provide support for service engineers.
  • Run the production environment by monitoring availability and taking a holistic view of system health
  • Take part in building and implementing services that make IT and support teams better at their jobs.
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Validate NFRs and SLx against production logs or business analytics.
  • Conduct proofs of concept to showcase the benefits of recommendations.
  • Instrument the target environment to capture relevant monitoring metrics for analysis.
  • Contribute to grooming SREs in core concepts and build a knowledge repository by adding point-of-view documents and blog posts.
  • Document the engineering strategy and analysis reports.
  • Document every action so your findings turn into repeatable actions, and then into automation.
  • Hands-on experience with a distributed version control system such as Git, AWS CodeCommit, or equivalent
  • Must have experience with Ansible, Helm, Terraform, and Kubernetes.
  • Know your way around Linux and the Unix Shell.
  • Experience or familiarity with the ELK stack
  • Balance feature development speed and reliability with well-defined service level objectives

Job Qualifications


Your Life and Career at SCS: 

  • Exposure to rewarding career advancement opportunities
  • A culture that promotes a healthy, fulfilling work/life balance 
  • Benefits package for all eligible full-time employees (including medical, vision and dental).

Thank you for your interest in SCS. We look forward to reviewing your application.

SCS provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability

SCS welcomes all applicants for this position. Should you be individually selected to participate in an assessment or selection process, accommodations are available upon request in relation to the materials or processes to be used.