Code to Chaos: Crowdstrike and our fragile tech infrastructure

Plus what I'm reading in the world of AI

It’s been a nonstop week for everyone, especially in America. And while the upcoming election will have significant ramifications for AI, the tech industry and society at large, for now I want to set politics aside and focus on the biggest tech news of the last week.

(Also, I know it’s been a while since my last newsletter. Sorry about that! More on my plans at the end of this one.)

Millions of people learned about Crowdstrike on Friday. Unfortunately, it was for all the wrong reasons.

Crowdstrike, founded in 2011, is a public cybersecurity company that provides security software to over half of the Fortune 1000, including, critically, many of the major airlines and banks, numerous hospitals, and thousands of other businesses across almost every industry. Its IPO in 2019 was a significant win for early investors Accel and Warburg Pincus. Since then, its market cap had more than quadrupled.

That momentum came to a screeching halt on Thursday night, though, when a single faulty update was pushed to production. As many of you know by now, an update to its Falcon Sensor software triggered a logic error that caused millions of Microsoft Windows systems to crash and display the dreaded “Blue Screen of Death.” The Verge has a good summary of what happened. (Patrick Wardle goes into the technical details if you are interested.)

It’s hard to overstate the impact a single faulty update had on the global economy. Major airlines around the world use Crowdstrike. The result: thousands of flight cancellations and angry, stranded passengers. Countless surgeries were canceled or rescheduled. Credit card machines stopped working. Some reports estimate the losses due to supply chain disruption at over $100 million. I suspect the number will be much higher when everything shakes out.

Poor Crowdstrike Falcon…

An Interconnected Economy with Single Points of Failure

The Crowdstrike catastrophe is just the latest in a series of outages that have taken down large parts of our global technology infrastructure. A Cloudflare outage in 2022 took down Shopify, Discord, Grindr and many other apps. An AWS outage in 2021 took down Disney+, Netflix, Slack, Coinbase and more. One or two global outages like these occur each year, though few have been as painful as Crowdstrike’s.

The incident is a stark reminder that our tech infrastructure is more fragile than we want to admit. A single line of bad code can take down hundreds of websites or cancel thousands of flights. And while almost all of these incidents have been the result of human error rather than coordinated cyberattacks, I suspect foreign adversaries and criminals will look to Friday’s incident as a blueprint for how to truly disrupt global trade and business.

The Opportunity

Not everything is bad, though. The Crowdstrike outage will prompt many companies to reevaluate their systems and their redundancies. Companies like Crowdstrike, which are responsible for keeping thousands of businesses safe and secure, will almost certainly implement more safeguards to prevent another catastrophic error like this one.

But the bigger change may come from the customer end. Spirit, Delta, United, TD Bank, Starbucks, Home Depot and every major hospital are surely reevaluating their critical dependencies. What happens if one of their software providers is knocked out? Is there a backup?

Startups and firms that can help businesses, both big and small, reduce their risk of widespread outages will get a closer look. On the provider side, quality assurance (QA) startups and teams will be in greater demand. Last week’s outage is sure to produce more ripple effects and unexpected opportunities.

What I’m Reading in the World of AI

  • Access to publicly shared content, the kind of data used to train the first generation of LLMs, has been greatly restricted since the rise of LLMs, according to The New York Times. Terms of service have been updated, companies like Reddit and Stack Overflow now charge for access to their data, and publishers are either suing or striking deals.

  • OpenAI launched GPT-4o Mini last week. It’s a more cost-efficient version of GPT-4o, OpenAI’s latest flagship model: smaller and faster, but not as strong at complex tasks. However, while GPT-4o remains the best-ranked model on most leaderboards, models like Gemini and Claude 3 are gaining traction and eating into OpenAI’s lead. (P.S. Why can’t these companies use more differentiated names for their models? I can barely keep track of 4os and 4o-minis and Claude’s opuses, sonnets and haikus!)

  • MIT researchers are using machine learning to measure atomic patterns in metals, which could be supremely helpful in developing new alloys for medicine and aerospace. Side note: while we talk a lot about generative AI in Silicon Valley, it’s still such a tiny part of the AI universe. There is so much interesting AI research being conducted that will change our lives — and the public will never know about it.

On a Personal Note

We got married in April in Austin!

It’s been a few months since my last newsletter. A big reason is that I got married to the love of my life, Deborah. The New York Times even wrote about our wedding and the pandemic road trip that brought us together. We took some time off for our honeymoon in New Zealand and Fiji. I’ve also been heads-down with my co-founder Matt on our AI businesses — more to announce soon!

Expect more frequent newsletters moving forward. And stay tuned for different newsletter formats here on The A.I. Analyst as I tinker with the best approach. I hope you’re all well and that the Crowdstrike outage didn’t disrupt your life!

~ Ben