Site Reliability Engineer
- Location HOBOKEN, NJ
- Requisition GH2079548
What you'll do at
Intelligent Retail Lab (IRL) is part of Walmart's Store No.8, an innovation hub formed by the world's largest retailer to identify and invest in the trends and technologies reshaping the shopping experience.
IRL's mission is to revolutionize in-store experiences, leveraging emerging technology to help define and deliver on evolving customer expectations. Its success requires a cross-functional, mission-based team that is highly entrepreneurial, collaborative, and passionate about solving unsolved problems. As Walmart’s Applied Artificial Intelligence incubator, we work at the bleeding edge of technology to define the future of retail shopping.
As a Site Reliability Engineer, you will work with a range of technologies that power the platform and development of IRL. Through standards, best practices, and sound technology choices, we aim to maintain a robust environment capable of moving at the speed IRL requires to remain at the forefront of innovation.
What you'll do:
- Work as part of a team developing production-ready applied artificial intelligence software and massively scalable distributed systems.
- Work with the global Systems and Infrastructure Platform team to implement organizational practices for software development, CI, CD, containerization, and Kubernetes operations.
- Work with your development team and the Systems and Infrastructure Platform team to analyze software and system performance and identify optimization opportunities.
- Work with the Systems and Infrastructure Platform team to provide real-world usage and failure scenarios in order to continually improve the reliability and stability of AI systems running in real-world environments.
- Participate in an on-call rotation to ensure site availability and reliability.
Skills & Experience Required:
- 2+ years of experience in a technical support, DevOps, Systems Administration, or SRE position.
- BS in Computer Science or a similar field is desired.
- Ability to distill complex technical challenges into actionable, explainable decisions in a fast-paced CI/CD environment.
- Comfortable working in a variety of programming or scripting languages (primarily Python, TypeScript, and Rust).
- Experience with creation and maintenance of CI/CD systems (for example, Azure DevOps) is desirable.
- Experience with Apache Kafka or other stream-processing platforms is desirable.
- Experience working with Microsoft Azure and/or Google Cloud Platform is desirable.
- A desire to learn, improve, and help solve unsolved problems is a must.