
When Quality is Everything

Today’s connected world expects near-infallibility

Last week’s outage is a sobering reminder of how connected today’s world is. We all rely on the quality control of critical vendors, and end users expect it, yet that quality control must be balanced against the number of updates it takes to keep businesses secure against a highly organized, automated, and aggressive threat landscape.

It also means that we all have a responsibility to support one another when things don’t go to plan. 

The Enterprise Security Group at Broadcom supports all of the vendors and organizations impacted by the outage and extends our gratitude to the teams working tirelessly to get everything up and running again. The security and IT communities always bounce back from challenges stronger than before. We know this because Symantec has been in this very position.

We’ve Been Here Before

In 2007, back when Symantec was a standalone company, we released a faulty update to our endpoint security products. Security products, especially endpoint security, need constant updates to keep up with increasingly clever attackers. But updating software on millions of computers is challenging. Operating systems are complex, and thousands of other software products running on the same computers could conflict with our updates.

Most of the updates occur without incident. But in 2007, one of our updates caused a dreaded “blue screen of death” on millions of PCs in the Asia-Pacific region. There was a conflict between our software and the operating system that our testing did not catch. It was a huge nightmare for a subset of our customers. 

What We Learned

That event was a turning point for me and my team. We had violated our first oath — do no harm. There were plenty of excuses for why the crash had occurred. But the responsibility was squarely with us. 

Just as we learned from our own experience, we’re sure that the team at CrowdStrike will learn from theirs. That may sound like something any company could say. For us, avoiding a repeat of what happened in 2007 meant accepting that new features must all take a back seat to quality controls.

This led us to make choices and employ strategies that ensure an increasing baseline of quality: 

  • We built a sandbox environment and have moved nearly all of our code updates into it. Inside the sandbox, it is impossible for the code to cause a crash, and guardrails prevent possible conflicts.
  • We also developed “short-circuit” mechanisms that work like the circuit breaker in your home: when an issue is detected, an automatic rollback to a working state is triggered.
  • Updates are pushed incrementally over time, first to Broadcom employees, and then in moderated stages to our customers. This is combined with careful planning to avoid weekends, holidays, and other critical time periods.
  • Endpoint security products were completely rearchitected to remove code from the operating system kernel. We found ways to stay out, while still preserving the security benefits that our products require.
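For readers who want to see how staged delivery and an automatic rollback can fit together, here is a minimal Python sketch. It is an illustration only, not Symantec's or Broadcom's implementation: the names `Ring`, `RolloutResult`, `staged_rollout`, and `max_failure_rate` are hypothetical, and the update, health-check, and rollback steps are passed in as plain functions so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Ring:
    """One rollout stage (hypothetical), e.g. employees first, then customers."""
    name: str
    hosts: List[str]


@dataclass
class RolloutResult:
    completed_rings: List[str] = field(default_factory=list)
    rolled_back: bool = False
    failed_ring: Optional[str] = None


def staged_rollout(
    rings: List[Ring],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_rate: float = 0.01,
) -> RolloutResult:
    """Push an update ring by ring; undo everything if a ring looks unhealthy."""
    result = RolloutResult()
    updated: List[str] = []  # every host touched so far, in order

    for ring in rings:
        failures = 0
        for host in ring.hosts:
            apply_update(host)
            updated.append(host)
            if not is_healthy(host):
                failures += 1

        # "Circuit breaker": too many unhealthy hosts in this ring trips it.
        if ring.hosts and failures / len(ring.hosts) > max_failure_rate:
            for host in reversed(updated):
                rollback(host)  # return each host to its last working state
            result.rolled_back = True
            result.failed_ring = ring.name
            return result  # later rings never receive the update

        result.completed_rings.append(ring.name)

    return result
```

In this sketch a problem found in an early ring stops the rollout before later rings are touched, which is the point of delivering to a small internal group first. A real system would also need to consider timing, such as the weekends and holidays mentioned above, and hosts that cannot report their health at all.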

While no vendor is perfect (us included), we have adopted an approach we call “mountain with no top,” where we’re constantly thinking about and investing in new and better ways to ensure we do no harm.

Our mission, as always, is to prevent attacks. But first, we must do no harm.

To understand more about how we are committed to our quality-first culture and the responsibility to do no harm, reach out to your local Symantec representative. They’ll be happy to organize a briefing with our development teams.

 

About the Author

Adam Bromwich

CTO and Head of R&D, Enterprise Security Group, Broadcom

Adam leads a global team of engineers and analysts who develop the game-changing security technologies, attack intelligence, and security content that protects Symantec and Carbon Black customers.
