Senior Lead, Site Reliability Engineer
Who We Are
Kyndryl is a market leader that thinks and acts like a start-up. We design, build, manage, and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward – always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers, and our communities.
- Serve on the SRE on-call rotation (PagerDuty), responding to incidents that impact service availability (as defined by country regulations).
- Manage infrastructure on Azure and AWS using IaC tools, including Terraform and Ansible.
- Build monitoring that alerts on symptoms before they become outages.
- Document every action so your findings turn into repeatable actions and then into automation.
- Prevent incidents from recurring through blameless postmortems.
- Improve operational processes (such as deployments and upgrades) to make them as simple and streamlined as possible.
- Design, build and maintain core infrastructure that enables scaling to many terabytes of data.
- Debug production issues across all services and levels of the stack.
- Plan the growth of our infrastructure.
- Think about systems: edge cases, failure modes, behaviors, specific implementations.
- Stay current with emerging technologies, business requirements, and enhancements, and develop proposals for any changes that may be required.
- Create and maintain architectural documentation.
- Assist and engage other system owners and project development teams that have integration requirements with the various enterprise security systems.
- Assist/engage other engineering teams for problem determination of incidents.
- Provide expert advice to the Security Technical Design Authority.
- Act in accordance with and be an advocate for Core Values (Respect, Collaboration, Accountability, and Transparency).
- Be highly motivated with a strong desire to obtain a deep understanding of the supported environments and integrations. Possess the ability to work independently and as part of a team to research/resolve technical issues and develop quality solutions.
- Professionally evolve and inspire others to do the same.
- Work is generally done in a remote home office.
- Be available for occasional night or weekend work.
Who You Are
Able to self-organize and report asynchronously.
Familiar with agile methodologies; use epics and issues to drive projects.
Experience managing complex security solutions in large environments.
Strong understanding of Linux, network troubleshooting analysis, and current security methodologies.
Strong understanding of cybersecurity technologies, protocols, and applications.
Detailed technical experience in the installation, configuration, and operation of high-end security solutions.
Experience with log management platforms, including Splunk and the Elastic Stack (ELK: Elasticsearch, Logstash, Kibana).
Experience with container services, including Docker and Kubernetes.
Experience with IDS/IPS, SIEM, and endpoint solutions and technologies.
Experience completing Root Cause Analysis (RCA) investigations and performing operational readiness reviews.
Experience improving team practices through code reviews and handoffs of work and incidents.
Thorough (advanced to expert) understanding of IT security, the implementation of security-related guidelines, and their impact on IT infrastructure.
Problem-solving abilities across multiple enterprise technology environments with complex integrations.
Effective time management skills.
Strong verbal and written communication skills; must be able to communicate effectively with a wide variety of audiences, both business and technical.
Ability to work collaboratively and cooperatively with diverse geographical and cultural groups.
Diversity is a whole lot more than what we look like or where we come from; it’s how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we’re not doing it single-handedly: Our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you – and everyone next to you – the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That’s the Kyndryl Way.
What You Can Expect
With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate, build new capabilities, new relationships, new processes, and new value. Our employee learning hub gives you access to the best learning in the industry to receive certifications and accreditations, including Microsoft University, AWS Cloud Center of Excellence, Udemy, and the Harvard Business Review. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you! We want you to succeed so that together, we will all succeed.