Author Archives: Ivan Pepelnjak
Some OpenFlow-focused startups are desperately trying to tell you how redundant their architecture is. Unfortunately, all the whitepapers (and the prancing unicorns) cannot change a simple fact: an SDN controller (OpenFlow-based or otherwise) is, in some aspects, a single failure domain.
One of my readers sent me an interesting challenge: they’re deploying a new DMVPN WAN, and as they cannot expect all locations to have native (non-NAT) IPv4 access, they plan to build the new DMVPN over IPv6. He was wondering whether it would work.
Apart from “you’re definitely going in the right direction,” all I could tell him was “looking at the documentation, I couldn’t see why it wouldn’t work.” Has anyone deployed DMVPN over IPv6 in a production network? Any hiccups? Please share your experience in the comments. Thank you!
TL;DR: I'll be in Bern on September 9th. If you'd like to drop by and discuss network design or automation challenges, read on…
The latest release of Cisco Nexus 1000V for vSphere can handle twice as many vSphere hosts as the previous one (250 instead of 128). Cisco probably did a lot of code polishing to improve Nexus 1000V scalability, but I’m positive most of the improvement comes from interesting architectural changes.
The pilot episode of Software Gone Wild podcast featuring Snabb Switch created plenty of additional queries (and thousands of downloads) – it was obviously time for another deep dive episode discussing the intricate innards of this interesting virtual switch.
During the deep dive Luke Gorrie, the mastermind behind the Snabb Switch, answered a long list of questions, including:
If you’re a regular reader of my blog, you know that I spent a lot of time during the last three years debunking SDN myths, explaining the limitations of OpenFlow and pointing out other technologies one could use to program the network.
During the summer of 2014 I organized my SDN- and OpenFlow-related blog posts into a digital book. I want to make this information as useful and as widely distributed as possible – for a limited time you can download the PDF free of charge.
A while ago I wrote about the idea of treating network infrastructure (and all other infrastructure) as code, and using the same processes application developers use to write, test and deploy code when designing and implementing networks.
That approach clearly works well if you can virtualize (and clone ad infinitum) everything. We can virtualize appliances or even routers, but installed equipment and high-speed physical infrastructure remain somewhat resistant to that idea. We need a different paradigm, and the best analogy I could come up with is a database.
A reader sent me this question:
My company will have 10GE dark fiber across our DCs with possibly OTV as the DCI. The VM team has also expressed interest in DC-to-DC vMotion (<4ms). Based on your blogs it looks like overall you don't recommend long-distance vMotion across DCI. Will the "Data Center trilogy" package be the right fit to help me better understand why?
Unfortunately, long-distance vMotion seems to be a persistent craze that peaks with a predictable period of approximately 12 months, and while it seems nothing can inoculate your peers against it, having technical arguments might help.
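One of the simplest technical arguments is pure physics. Here’s a back-of-the-envelope Python sketch (the propagation figure is a rough rule of thumb, not a measurement from any particular gear) showing how little geographic distance a 4 ms round-trip budget buys you:

```python
# Rough sanity check: how far apart can two data centers be if the
# vMotion requirement is a sub-4-ms round-trip time?
# Assumption: light propagates through fiber at roughly 200,000 km/s,
# i.e. about 200 km per millisecond one way; real fiber paths are longer
# than the straight-line distance, and every box in the path adds latency.
KM_PER_MS_ONE_WAY = 200

rtt_budget_ms = 4.0
one_way_budget_ms = rtt_budget_ms / 2
max_fiber_km = one_way_budget_ms * KM_PER_MS_ONE_WAY

print(f"Theoretical maximum fiber distance: ~{max_fiber_km:.0f} km")
# ~400 km of fiber, before DWDM gear, switches, OTV encapsulation and
# queuing delays take their share of the budget.
```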
My good friend Tiziano complained about the fact that BGP considers a next hop reachable if there’s an entry for it in the IP routing table, even when the router cannot even ping the next hop.
That behavior is one of the fundamental aspects of IP networks: networks built with IP routing protocols rely on fate sharing between the control and data planes instead of on path liveness checks.
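If it helps to visualize the difference, here’s a toy Python sketch (prefixes and addresses are made up) of what “reachable” means to BGP: the next hop resolves as soon as some routing-table entry covers it, and no ping or liveness check ever enters the picture:

```python
# Toy model of BGP next-hop resolution: the next hop is considered usable
# as soon as a routing-table lookup succeeds. Whether the device owning
# that address is actually alive is never checked.
import ipaddress

routing_table = [
    ipaddress.ip_network("192.0.2.0/24"),   # learned via IGP or static route
    ipaddress.ip_network("0.0.0.0/0"),      # default route (makes everything "reachable")
]

def next_hop_resolves(next_hop: str) -> bool:
    addr = ipaddress.ip_address(next_hop)
    return any(addr in prefix for prefix in routing_table)

print(next_hop_resolves("192.0.2.1"))   # True, even if 192.0.2.1 stopped responding long ago
```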
I first met Elisa Jasinska when she had one of the coolest job titles I ever saw: Senior Packet Herder. Her current job title is almost as cool: Senior Network Toolsmith @ Netflix – obviously an ideal guest for the Software Gone Wild podcast.
In our short chat she described some of the tools she’s working on, including an adaptation of pmacct to environments with numerous BGP exit points (more details in her NANOG presentation).
Building a private cloud infrastructure tends to be a cumbersome process. Even if you do it right, you often have to deal with four to six different components: orchestration system, hypervisors, servers, storage arrays, networking infrastructure, and network services appliances.
A few days ago I had an interesting interview with Christoph Jaggi discussing the challenges, changes in mindsets and processes, and other “minor details” one must undertake to gain something from the SDDC concepts. The German version of the interview is published on Inside-IT.ch; you’ll find the English version below.
Nexus 1000V release 5.2(1)SV3(1.1) was published on August 22nd (I’m positive that has nothing to do with VMworld starting tomorrow) and I found this gem in the release notes:
Enabling BPDU guard causes the Cisco Nexus 1000V to detect these spurious BPDUs and shut down the virtual machine adapters (the ones originating the BPDUs), thereby avoiding loops.
It took them almost three years, but we finally have BPDU guard on a layer-2 virtual switch (why does it matter). Nice!
After a week of testing, I decided to move the main ipSpace.net web site (www.ipspace.net) as well as some of the resource-serving hostnames to CloudFlare CDN. Everything should work fine, but if you experience any problems with my web site, please let me know ASAP.
2014-08-27: Had to turn off CloudFlare (and thus IPv6). They don’t seem to support HTTP range requests, which makes video startup times unacceptable. I’ll have to move all video URLs (where streaming clients are expected to issue HTTP range requests) to a different host name, which will take time.
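In case you want to check your own setup, here’s roughly how you could test whether a server (or the CDN in front of it) honors range requests – a quick Python sketch using the requests library; the URL is a placeholder, not one of my real video URLs:

```python
# Ask for the first kilobyte of a resource; a server that supports HTTP
# range requests replies with 206 Partial Content and a Content-Range header.
import requests

url = "https://www.example.com/videos/sample.mp4"   # placeholder URL

resp = requests.get(url, headers={"Range": "bytes=0-1023"}, stream=True)

if resp.status_code == 206 and "Content-Range" in resp.headers:
    print("Range requests supported:", resp.headers["Content-Range"])
else:
    print(f"Range header ignored (HTTP {resp.status_code}); "
          "streaming clients will have to download from byte zero")
```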
Collateral benefit: ipSpace.net is now fully accessible over IPv6 – register for the Enterprise IPv6 101 webinar if you think that doesn’t matter ;)
A while ago I explained why OpenFlow might be the wrong tool for some jobs, and why a centralized control plane might not make sense, and quickly got misquoted as saying “controllers don’t scale.” Nothing could be further from the truth: properly architected controller-based architectures can reach enormous scale – Amazon VPC is the best possible example.
Here’s an interesting story illustrating the potential pitfalls of multi-DC deployments and the impact of data gravity on application performance.
Long long time ago on a cloudy planet far far away, a multinational organization decided to centralize their IT operations and move all workloads into a central private cloud.
SDN evangelists talking about centralized traffic engineering, flow steering or bandwidth calendaring sometimes tend to gloss over the first rule of successful traffic engineering: Know Thy Traffic.
In a world ruled by OpenFlow you’d expect the OpenFlow controller to know all the traffic; in more traditional networks we use technologies like NetFlow, sFlow or IPFIX to report the traffic statistics. Regardless of the underlying mechanism, you need a tool that collects the statistics, aggregates them in a way that makes them usable to network operators, reports them, and potentially acts on deviations.
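The aggregation step doesn’t have to be complicated. Here’s a minimal Python sketch (the flow records and field names are purely illustrative and not tied to any particular NetFlow/sFlow/IPFIX collector) that turns raw flow records into a top-talkers report:

```python
# Aggregate exported flow records into per-source byte counts and report
# the top talkers: the simplest possible "make the statistics usable" step.
from collections import Counter

flows = [
    {"src": "10.0.1.10", "dst": "10.0.2.20", "bytes": 1200000},
    {"src": "10.0.1.10", "dst": "10.0.3.30", "bytes": 800000},
    {"src": "10.0.4.40", "dst": "10.0.2.20", "bytes": 300000},
]

bytes_per_source = Counter()
for flow in flows:
    bytes_per_source[flow["src"]] += flow["bytes"]

for src, total in bytes_per_source.most_common(3):
    print(f"{src}: {total / 1e6:.1f} MB")
```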
I’m still getting questions about layer-2 data center interconnect; it seems this particular bad idea isn’t going away any time soon. In the face of that sad reality, let’s revisit what I wrote about layer-2 DCI over VXLAN.
VXLAN hasn’t changed much since the time I explained why it’s not the right technology for long-distance VLANs.
Last week the global routing table (as seen from some perspectives) supposedly exceeded 512K routes, and weird things started happening to people using old platforms that by default support 512K IPv4 routes in their switching hardware.
I’m still wondering whether the BGP table size was the root cause of the observed outages. Cisco’s documentation (at least this document) is pretty sloppy when it comes to the fact that usually 1K = 1024, not 1000 – I’d expect the hard limit to be at 524,288 routes … but then maybe Cisco’s hardware works with decimal arithmetic.
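The arithmetic is trivial, but worth spelling out (assuming the platform limit really is expressed in binary K):

```python
# 512K in forwarding-hardware terms usually means binary K (1 K = 1024),
# which sits well above a decimal 512,000 routes.
binary_limit = 512 * 1024      # 524,288 routes
decimal_512k = 512 * 1000      # 512,000 routes

print(binary_limit - decimal_512k)   # 12,288 routes of difference
# If the hardware limit really is 524,288 entries, merely crossing 512,000
# routes shouldn't have been enough to push anyone over the edge.
```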