a dynamic information technology staffing firm

Site Reliability Engineer

Summary:
BGI has the following contract opportunity with our direct client in Union, NJ.
 
Job ID/Number:
201904-2093
 
Posted Date:
4/5/2019
 
Job Location:
Union, NJ
 
Position Type:
Contractor
 
Division:
Information Technology
 
Description:
The Site Reliability Team at Bed Bath and Beyond is looking for a Site Reliability Engineer (SRE) who can build, instrument, troubleshoot, automate and triage highly scalable legacy and modern systems.
 
The candidate will join a team whose mission is to blend a variety of skill sets and work collaboratively, not only to ensure that we deliver quality but also to take an active role in determining which architectures and technologies perform, scale, and deliver services reliably.
 
Responsibilities
 
Troubleshoot issues across the entire stack - hardware, software, applications and network.
 
Design, build, test, and automate discovery, instrumentation, alerting, and escalation of monitoring.
 
Document all work clearly, and communicate and demonstrate it to the team with ease.
 
Take on on-call responsibilities.
 
 
 
Qualifications
 
Capable of responding to major/critical events and actively participating in determining solutions and instrumentation.
 
Hands-on experience building fault-tolerant infrastructure and monitoring instrumentation with technologies such as Kubernetes, Kafka, Cassandra, AWS, and GCP.
 
Experience instrumenting and researching issues with CA Monitoring Suite, Nagios, InfluxDB, Grafana, Prometheus, Stackdriver, Sumo Logic, New Relic, Quantum Metric, Tealeaf, etc.
 
Familiarity with tools such as Puppet, Ansible, Salt, Chef, or CFEngine would be a plus.
 
Additional familiarity with log analysis tools such as Sumo Logic, ELK, and Splunk would also be helpful.
 
Practical knowledge of shell scripting and at least one scripting language, such as Python or Ruby.
 
 
 