Chaos engineering started at Netflix in 2011 with the invention of the Chaos Monkey, a tool that intentionally disrupted systems on the production network to discover systemic weaknesses so that they could be removed. Since then, the Chaos Monkey has grown to become the Simian Army, and chaos engineering has spread to a global community that develops free & commercial tools to facilitate experiments in QA and production.
My journey to chaos & resilience engineering started in 2009 with my desire to find a better way, leading me to the world of safety science and to its connection to the work at Netflix, Etsy, and elsewhere. In this talk, I’ll explain chaos engineering, the prerequisites for doing it in production, and how it relates to resilience. I will share some of the work I’ve done in chaos engineering (in a small way) and resilience engineering (in a larger way, including research), and also ask attendees to share their own experiences in chaos & resilience engineering. You might not realize how easy it is to get started, or know that you’re already doing it!
- What is chaos engineering, and what problems does it solve?
- What is the connection between chaos engineering and safety science?
- How can I get started with chaos and resilience engineering?
About John Benninghoff
JOHN BENNINGHOFF is a long-time student and practitioner of managing information risk. He currently leads the Application Security team at Express Scripts, integrating security into the company’s emerging DevOps practice through better quality engineering. His 20-year career in Information Security includes diverse experience in financial services, retail, and government: building a Network IDS and a vulnerability management platform using open-source software, leading security incident response, identity and access management, policy & standards, security architecture, and many compliance initiatives. He is currently pursuing a Master of Science in Managing Risk and Systems Change at the School of Psychology of Trinity College Dublin (online), with the goal of adapting safety science to information technology in the emerging field of resilience engineering.