Job Details

Analyst, Site Reliability Engineer


Date Opened: 07/04/2022

Job Type:

Job Number: 220003C5

Job Description

Who we are:


As North America’s oldest startup and Canada’s purpose-driven digital marketplace, The Bay is on a high-growth mission to rewrite the rules of retail to help Canadians live a colorful life. If you believe in the power of our iconic brand and thrive on problem-solving at scale, we want you to join our team.

At The Bay, smart, high-performing team members will challenge you to learn and grow every day. We value ambitious work and great ideas grounded in data and insights. We're looking for talented people who love a fast-paced environment, embrace change, and are looking to make an impact with groundbreaking ideas.

We are building a digital-first company and brand for a diverse world and we need an inclusive team to reach our potential. We strongly encourage applications from everyone to come and join a winning team that supports diverse thinking and demonstrates innovation, energy, creativity, and vision every day.

You can learn more and view available positions in Bengaluru, by visiting

What This Position is All About

The Site Reliability Engineering Analyst role assists in planning, monitoring, and controlling the day-to-day operations and delivery aspects of the Site Reliability Engineering teams. The role assists in managing team productivity and works to ensure the optimal health of The Bay eCommerce & CRM platforms by overseeing platform performance, resilience, and stability. This role is also an active participant in all aspects of Site Reliability Engineering, including technical vision, telemetry and observation decisions, automation strategy, solution delivery, and platform incident and problem management.

Who You Are:

  • Bachelor’s Degree in Computer Science or equivalent
  • Programming experience in JavaScript and Angular/React/Node.js
  • Azure/AWS, Microsoft, and Red Hat certifications, and knowledge of ITIL/MOF practices
  • Highly experienced with monitoring, logging & telemetry tools like New Relic, Splunk, ELK, Nagios, SolarWinds, Prometheus, AWS CloudWatch, Datadog, etc.
  • Advanced understanding of Networking, Content Delivery Networks (CDNs, e.g. Akamai, Cloudflare), and Cloud Platforms.
  • Understanding of, and hands-on experience in, monitoring streaming platform technologies such as Apache Kafka.
  • Highly experienced with automation and tools such as (but not limited to) Jenkins, Chef, Terraform, Ansible, etc.
  • Expert in architecting, creating, and supporting automation (PowerShell, Python, Ruby, AWK, SED, etc.) that runs health checks and provides self-healing capabilities for the platforms.
  • Advanced experience in the use of the following platforms and tools:
      • Cloud: MS Azure/AWS Cloud
      • Networking fundamentals: TCP/IP, DNS, WINS, DHCP, etc.
      • Collaboration & Change Management tools: Jira, ServiceNow, Cherwell, etc.
      • Databases: Oracle, MS SQL, Teradata, DB2, etc.
  • 3-5 years of experience working in global organizations with the ability to effectively communicate with executives, leaders and individual contributors across the organization.
  • 3-5 years of SRE experience working on telemetry, observation, self-healing solutions, and platform automation.



Job Qualifications

Thank you for your interest in HBC. We look forward to reviewing your application.


HBC provides equal employment opportunities (EEO) to all employees and applicants for employment.