Someone pointed me to a high-level overview of Google’s Spanner database which included this gem:
A second refinement is that there are many other sources of outages, some of which take out the users in addition to Spanner (“fate sharing”). We actually care about the differential availability, in which the user is up (and making a request) to notice that Spanner is down. This number is strictly higher (more available) than Spanner’s actual availability — that is, you have to hear the tree fall to count it as a problem.
In other words, it doesn’t matter if your distributed database fails if its users are also gone. Keep this concept in mind every time you’re designing a high-availability solution: some corner cases are simply not worth solving.
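A quick sketch with made-up numbers shows how the two figures differ. The function name `availability`, the minute-level granularity, and the outage windows are all assumptions chosen for illustration; none of them come from the Spanner overview.

```python
def availability(total_minutes, service_down, users_down):
    """Return (actual, differential) availability as fractions.

    service_down and users_down are sets of minute indexes in
    range(total_minutes) during which the service or its users are down.
    """
    all_minutes = set(range(total_minutes))

    # Actual availability: every minute of service downtime counts.
    actual = 1 - len(service_down & all_minutes) / total_minutes

    # Differential availability: only count service downtime that happens
    # while the users are up and could notice it.
    users_up = all_minutes - users_down
    noticed = service_down & users_up
    differential = 1 - len(noticed) / len(users_up)
    return actual, differential


# Hypothetical outage windows: the service is down for 100 minutes and the
# users are down for 100 minutes, with a 50-minute overlap (fate sharing).
actual, differential = availability(
    total_minutes=10_000,
    service_down=set(range(100, 200)),
    users_down=set(range(150, 250)),
)
print(f"{actual:.4f} {differential:.4f}")  # prints "0.9900 0.9949"
```

Half of the service outage in this example overlaps a period when the users were down anyway, so the availability the users experience is higher than the raw number suggests.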
After figuring out what business problem you’re trying to solve and what the users expect to get from you, it’s time for the next crucial question: should you buy a shrink-wrapped product/solution or build your own? I addressed that question in the third part of the Focus on Business Challenges First presentation.
Not surprisingly, the same dilemma applies to network automation solutions, and is often the source of endless time-wasting discussions that I really should have stopped engaging in, but sometimes duty calls ;)
Continuing our Fast Failover saga, let’s focus on techniques and technologies available to implement it (assuming you still think it’s worth the effort).
There are numerous technologies you can use to implement fast reroute, from the most complex to the easiest one:
One of my readers encountered an interesting problem when upgrading a data center fabric to 100 Gbps leaf-to-spine links:
Fortunately my reader took a closer look at the data before they requested a wholesale replacement… and spotted an interesting pattern:
A while ago we had an interesting exchange of ideas around inserting a high-availability network appliance into a public cloud environment (TL&DR: it was really hard until AWS introduced Gateway Load Balancing), and someone quickly pointed out we’re solving the wrong challenge because…
Azure Firewall […] is a fully stateful firewall-as-a-service with built-in high-availability.
Somehow he wasn’t too happy when I pointed out that there’s more to high availability than vendor marketing ;)
Remember my “BGP route selection rules are a clear failure of intent-based networking paradigm” blog post? I wrote it almost three years ago, so maybe you want to start by rereading it…
Making a long story short: every large network is a unique snowflake, and every sufficiently convoluted network architect has unique ideas of how BGP route selection should work, resulting in all sorts of crazy extended BGP communities, dozens if not hundreds of nerd knobs, and 2000+ pages of BGP documentation for a recent network operating system (no, unfortunately I’m not joking).
Dealing with protocols that embed network-layer addresses into application-layer messages (like FTP or SIP) is great fun, more so if the said protocol traverses a NAT device that has to find the IP addresses embedded in application messages while translating the addresses in IP headers. For whatever reason, the content rewriting functionality is called application-level gateway (ALG).
Even when we’re faced with a monstrosity like FTP or SIP that should have been killed with napalm a microsecond after it was created, there’s a proper way of doing things and a fast way of doing things. You could implement a protocol-level proxy that would intercept control-plane sessions… or you could implement a hack that tries to snoop TCP payload without tracking TCP session state.
Not surprisingly, the fast way of doing things usually results in a wonderful attack surface, more so if the attacker is smart enough to construct HTTP requests that look like SIP messages. Enjoy ;)
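The difference between the two approaches can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular NAT device works: the function names, the two SIP methods checked, and the `example.test` host are all hypothetical, and the segments are assumed to arrive in order.

```python
SIP_METHODS = (b"REGISTER ", b"INVITE ")


def looks_like_sip(data: bytes) -> bool:
    """True if data starts with what looks like a SIP request line."""
    return data.startswith(SIP_METHODS)


def stateless_snoop(segments):
    """The hack: inspect each TCP segment on its own, ignoring session state."""
    return any(looks_like_sip(segment) for segment in segments)


def stream_aware_check(segments):
    """Closer to a proxy: reassemble the byte stream and parse it from the start."""
    return looks_like_sip(b"".join(segments))


# An HTTP POST whose body looks like a SIP message, split so that the body
# happens to start a new TCP segment.
body = b"REGISTER sip:example.test SIP/2.0\r\n\r\n"
headers = (
    b"POST / HTTP/1.1\r\nHost: example.test\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n"
)
http_segments = [headers, body]

print(stateless_snoop(http_segments))     # True: fooled by the second segment
print(stream_aware_check(http_segments))  # False: the session starts with HTTP
```

The per-segment check has no idea that the second segment is the middle of an HTTP session, which is exactly the gap an attacker can exploit.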
More than a decade ago I published tons of materials on a web site that eventually disappeared into digital nirvana, leaving heaps of broken links on my blog. I decided to clean up those links, and managed to save some of the vanished content from the Internet Archive:
I also updated dozens of blog posts while pretending to be Indiana Jones, including:
It’s amazing how far you can get if you keep doing something for a long-enough time. In a bit over 10 years (the initial versions of the earliest still-active webinars were created in October 2010), we accumulated over 300 hours of online content available with ipSpace.net subscription, plus another 130 hours of online course content.
Obviously I couldn’t have done that myself. Thanks a million to Irena who took over most of the day-to-day business a few years ago, dozens of authors, and thousands of subscribers who enabled us to make it all happen.
One of my subscribers trying to figure out how to improve his career choices sent me this question:
I am Sr. Network Engineer with 12+ Years’ experience. I was quite happy with my networking skills but with all the recent changes I’m confused. I am not able to understand what are the key skills I should learn as a network engineer to keep myself demandable.
Before reading the rest of this blog post, please read Cloud and the Three IT Geographies by Massimo Re Ferre.