Author Archives: Marten Terpstra
Back in December of 2014, I wrote a blog post about the complexities of the network as a distributed system. In it I pointed out that networks have traditionally been built as distributed systems, and that our entire management and knowledge base for networks is based on this. But, is this the best approach for current and future networking needs?
As humans we have our own ideas of how to best solve problems. While we are immensely creative, our solutions aren’t always the best (and may never be). We often look to nature as a guide for how to improve our man-made solutions. When we look at how large complex systems in nature have been created and evolved, perhaps we should look no further than ourselves. The human body is possibly one of the most complex systems we know. If we think of the brain as the centralized control system, it is easy to see that we are, in fact, highly centralized beings.
We live and function, however, in a social environment that has no clearly established central control system. We are organized by different permanent and temporary control systems. We create environments with centralized direction of work to be Continue reading
Many years ago Gartner introduced their technology Hype Cycle, which maps visibility against maturity for new technology. The Hype Cycle in essence states that many new technologies get a large amount of visibility early in their maturity cycle. The visibility and enthusiasm drops significantly when reality sets in: technologies early in their maturity cycle will have low adoption rates. The vast majority of customers of technology are conservative in their choices, especially if this new technology is not (yet) fundamental to this customer’s business.
I call it common-sense reality; Gartner calls it the Trough of Disillusionment. Fine. It is the realization that the technology may hold lots of promise, but isn’t ready to be consumed.
That is where the real work starts, maturing the technology, driving solutions and use cases, creating the economic viability of the technology and tons of other stuff that needs to be done to get a customer base to actually buy into this technology. Not with words and attention, but with the only thing that matters ultimately, money. Gartner calls delivering these absolutely necessary components the Slope of Enlightenment.
Not every technology follows this cycle, not every technology survives the downward turn after the inflated Continue reading
Whenever we get to the end of a year we have this tendency to reflect on what has happened in the past year and how we can improve in the coming year. It’s natural to use the change of calendar year as a point in time to think back, even though practically speaking it is usually the most chaotic time of the year, between shopping, family, and year- and quarter-end at work.
Almost every industry will go through waves of change and transformation. Real change and transformation is driven by powerful market forces of demand coupled with technology leaps that allow an escape from the incremental changes that drive day-to-day improvements. Networking has gone through several of these transformations. From dedicated mainframe-based connectivity, to coax-based shared ethernet, to switched ethernet in local area networks. From 1200 baud dialup serial connections through X.25 (yes, that’s the European in me) to leased T1, to ATM, to Frame Relay, to Packet over SONET, to MPLS and various flavors of wide area ethernet services. Some of these were incremental, some of them truly transformational.
When you look back, each of these changes in network technology was very much Continue reading
Through http://blog.ipspace.net I landed on this article on acm.org discussing the complexity of distributed systems. Through some good examples, George Neville-Neil makes it clear that creating and scaling distributed systems is very complex and “any one that tells you it is easy is either drunk or lying, and possibly both”.
Networks are of course inherently distributed systems. Almost everyone who has managed a good-sized network knows that, as in the example in the article, minor changes in traffic or connectivity can have huge implications for the overall performance of a network. In my time supporting some very large networks I have seen huge chain reactions of events triggered by what appeared to be minor issues.
Very few networks are extensively modeled before they are implemented. Manufacturers of machines, cars and many other things go through extensive modeling to understand the behaviors of what they created and their design choices. Using modeling they will look at all possible inputs and outputs, conditions, failure scenarios and anything else they can think of to see how their product behaves.
There are few if any true modeling tools for networks. We build networks with extensive distributed protocols to control connectivity Continue reading
A few weeks ago Facebook announced their new datacenter architecture in a post on their network engineering blog. Facebook is one of the few large web scale companies that is fairly open about their network architecture and designs and it gives many others the opportunity to see how a network can be scaled, even though the scale is well beyond what most will need in the foreseeable future, if not forever.
In the post, Alexey walks through some of the thought process behind the architecture, which is ultimately the most important part of any architecture and design. Too often we simply build whatever seems to be popular or common, or mandated/pushed by a specific vendor. The network however is a product, a deliverable, and has requirements like just about anything else we produce.
Facebook’s and the other web properties’ scale is at a different order of magnitude from most everyone else, but their requirements should sound pretty familiar to many:
I always enjoy reading the IPspace blog and as Ivan has stated about our blog, I don’t always agree with his opinion, but they are informative and cover just about everything networking. So this may come as a surprise, but in response to his “Do we have too many knobs” post from about a week ago I have one simple response: “Amen”.
Networking is unnecessarily complicated. We have written several blogs on this topic and related items. I used to run the sustaining organization for all data products at my previous company, and when you analyze the customer-reported issues that come into the support organization, you find that a very large percentage stem from configuration mistakes.
Many of those mistakes are not typos. We like to refer to fat fingered configurations often as a reason to move to a more automated configuration and provisioning environment, but most of the configuration mistakes that are made are simply because we have made it so difficult to configure these devices. Type something in the wrong order and it may not work right or behave slightly differently. Simple checks across configurations that could avoid many problems are Continue reading
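As a sketch of the kind of simple cross-configuration check alluded to above, here is an illustrative example that flags links whose two ends disagree on MTU. The device names, field names and data layout are invented for the example; real tooling would parse actual device configurations.

```python
# Hypothetical cross-configuration sanity check: verify that both
# ends of each link agree on MTU, catching a class of mistakes that
# is not a typo, just an easy inconsistency to introduce by hand.
configs = {
    "switch1": {"eth0": {"mtu": 9000, "peer": ("switch2", "eth3")}},
    "switch2": {"eth3": {"mtu": 1500, "peer": ("switch1", "eth0")}},
}

def mtu_mismatches(configs):
    """Return (device, port) pairs whose link peers disagree on MTU."""
    problems = []
    for dev, ports in configs.items():
        for port, cfg in ports.items():
            peer_dev, peer_port = cfg["peer"]
            peer_mtu = configs[peer_dev][peer_port]["mtu"]
            if cfg["mtu"] != peer_mtu:
                problems.append((dev, port))
    return problems

print(mtu_mismatches(configs))  # → [('switch1', 'eth0'), ('switch2', 'eth3')]
```

A check like this is trivial to express once configurations are available as structured data, which is part of the argument for automated configuration and provisioning environments.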
A few weeks ago I read this article from Craig Matsumoto on SDN Central.
At first I read it with a bit of a smile, but for some reason it has actually started to bother me a little. In this article, Craig summarizes a talk by Scott Shenker about SDN and a proposal for an SDNv2 that would fix the things that are wrong with SDNv1. In a way this represents what is wrong with our industry. We create a new version of, or a new name for, a concept that is not particularly well defined to begin with, and that in many interpretations is far broader than the version-numbered label assumes.
Many folks still believe that OpenFlow defines SDN. And that all the limitations of a basic protocol invalidate or limit the capabilities of an evolving concept like SDN. Why do we feel such a need to increment a version of an undefined term to make it sound like we are creating something new and different?
In SDNv2, we would still have separation of control and data (at least all that work is not wasted), but there are three major differences between it and the “old” SDN concepts. Continue reading
The requirements for next generation applications in the Third Platform era have a profound impact on the network. No longer can we treat the network as a piece of infrastructure that just needs to be present. It has to drastically change to become a fundamental component of the next generation application. Mike went through some of the network implications of the new era application properties in his post yesterday:
The change towards Third Platform IT infrastructures is more than evolutionary. The compute, storage and application frameworks and infrastructures started their transformation a while ago. These types of shifts take time, but networking has not kept pace. Until recently, networking’s great contribution to the changing IT world was a move from a multi-tier network to a two-tier network with a new name. Hardly transformational, to say the least.
A move towards a new platform does not happen overnight. It takes time and more importantly, it takes several technology iterations to get there. A migration from the current platform requires migration technologies: pieces and parts of what we will ultimately Continue reading
In the world of Anything-as-a-Service (I will leave the acronym to your imagination), Network-as-a-Service is not a new term. In fact, it even has its own Wikipedia page, which will tell you the term has been in use for many years, well before the current set of service-related terms in IT became popular.
Like most high tech industries, we get somewhat carried away when we have some new terminology and quickly overuse and overload them, watering them down to be meaningless or at least highly confusing. But when you cut through the clutter a bit, the as-a-Service terminology most certainly articulates a shift in thought process and behaviors on how we provide and consume IT resources.
The IT organization has always been a service organization, there is nothing much new there. From the days of mainframes and supercomputers, their job was to provide access to these expensive resources and maintain them. They provided environments that allowed the users to conveniently consume these abilities, and the business applications that ran on top of them, whether those were financial systems, email, uucp news (remember those days) or the basic ability to run user created jobs.
With the distribution of compute and Continue reading
I am sure our work environment is not all that different from many others. There are large whiteboards everywhere and you cannot find a meeting room that does not have circles, lines and squares drawn on them. Some of our favorite bloggers have written blogs about network drawing tools and aids. Probably not restricted to just networking folks, but we certainly love to visualize the things we do. Out of all the customers I have visited, the number of visits where one of us did not end up at a whiteboard can probably be counted on one hand.
It is not surprising that we are drawn to diagrams of the networks we have created. We build our network one device at a time, then use network links to connect the next and on we go until our network is complete. Which of course it never is. To track how we have connected all our devices we need diagrams. They tell us what devices we have, how they are attached to each other, how they are addressed and what protocols we have used to govern their connectivity. They are multi layered and the layers are semi independent.
I have previously said Continue reading
You have probably realized we are having a Big Data kind of week here at the Plexxi blog. And for good reason. The amount of development and change in this big bucket of applications we conveniently label “Big Data”, is astonishing.
Walking around at Hadoopworld in New York last week, I initially felt somewhat lost as a “networking guy”. But that feeling of “not belonging” is only superficial, the network has a tremendously important role in these applications. The challenge is that many “networking” folks don’t quite understand or realize that yet, but contrary to what I believed not too long ago, Big Data Application folks have a pretty good understanding of the role of the network in their overall application and its performance.
As an industry we have been talking about the increase in east-west traffic for quite a few years now. For your typical datacenter infrastructure today this is based on loosely coupled applications and semi-distributed storage. A web based application has many components that together make up the application we see as users. There are application load balancers, web server front ends, application back ends that in turn have databases for their data storage. And those databases Continue reading
Triggered by a discussion with a customer yesterday, it occurred to me (again?) that network engineers are creatures of habit and control. We have strong beliefs of how networks should be architected, designed and built. We have done so for a long time and understand it well. We have tweaked our methods, our tools, our configuration templates. We understand our networks inside out. We have a very clear mental view of how they behave and how packets get forwarded, how they should be forwarded. It’s comfort, it’s habit, we feel (mostly) in control of the network because we have a clear model in our head.
I don’t believe this is a network engineering trait per se. Software engineers want to understand algorithms inside out, they want to understand the data modeling, types structures and relationships.
Many of us know the feeling. Something new comes around and it’s hard to wrap your head around it. It challenges the status quo, it changes how we do things, it changes what we (think we) know. When we are given responsibility for something new, there is a desire to understand “it” inside out, as a mechanism to be able to control “it”.
Throughout the development cycle of new features and functions for any network platform (or probably most other products not targeted at the mass market consumer) this one question will always come up: should we protect the user of our product from doing this? And “this” is always something that would allow the user of the product to really mess things up if not done right. As a product management organization you almost have to take a philosophical stand when it comes to these questions.
Sure enough, the question came up last week as part of the development of one of our features. When putting the finishing touches on a feature that allows very direct control over some of the fundamental portions of what creates a Plexxi fabric, our QA team (very appropriately) raised the concern: if the user does this, bad things can happen, so should we not allow the user to change this portion of the feature?
This balancing act is part of what has made networking as complex as it has become. As an industry we have been extremely flexible in what we have exposed to our users. We have given access to portions of our products Continue reading
In the past few weeks at Plexxi we have spent probably an unreasonable amount of time talking about, discussing and even arguing over ethernet cables and connectors. As mundane as it may sound, the options, variations, restrictions and cost of something that is usually an afterthought are mind-boggling. And as a buyer of ethernet networks, you have probably felt that the choices you make will significantly change the price you pay for the total solution.
During our quarterly Product Management get together, my colleague Andre Viera took 25GbE as a trigger to walk the rest of the team through all the variations of cables and transceivers. As a vendor it is a rather complicated topic and as a customer I can only imagine how the choices may put you in a bad mood.
Most of today’s 10GbE switches ship with SFP+ cages and a handful of QSFP cages. Now comes the hard part. What do I plug into these cages? There are lots of choices all with their own pros and cons.
The cheapest solution is a Direct Attach Cable or DAC. These are copper based cables that have SFP+ transceivers molded onto the cable. It Continue reading
In a blog week dedicated to applications and the policies that govern them, I wanted to add some detail on a discussion I have with customers quite often. It should be clear that we at Plexxi believe in application-policy-driven network behaviors. Our Affinities allow you to specify desired behavior between network endpoints, which will evolve with the enormous amount of policy work Mat described in his three-part article earlier this week.
Many times when I discuss Affinities and policies with customers, or more generically with network engineering types, the explanation almost always lands at Access Control Lists (ACLs). Cisco created the concept of ACLs (and its many variations used for other policy constructs) way back as a mechanism to instruct the switching chips inside their routers to accept or drop traffic. It started with a very simple “traffic from this source to this destination is dropped” and has evolved very significantly since then, both in Cisco’s implementation and in those of many other router and switch vendors.
There are two basic components in an ACL:
1. What to match a packet on.
2. What action to take once a match is found.
Both Continue reading
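The two components above can be sketched in a few lines. This is an illustrative model only, assuming prefix-based matching with first-match-wins, implicit-deny semantics; it is not any vendor’s ACL syntax or implementation.

```python
# Minimal model of an ACL: each rule pairs a match (source and
# destination prefixes) with an action (permit or deny). The first
# matching rule wins; no match means an implicit deny.
from ipaddress import ip_address, ip_network

ACL = [
    # (source prefix, destination prefix, action)
    ("10.0.0.0/8", "192.168.1.0/24", "deny"),
    ("0.0.0.0/0", "0.0.0.0/0", "permit"),  # explicit permit-all
]

def evaluate(acl, src, dst):
    """Return the action of the first rule matching src -> dst."""
    for src_net, dst_net, action in acl:
        if ip_address(src) in ip_network(src_net) and \
           ip_address(dst) in ip_network(dst_net):
            return action
    return "deny"  # implicit deny when no rule matches

print(evaluate(ACL, "10.1.2.3", "192.168.1.10"))    # → deny
print(evaluate(ACL, "172.16.0.1", "192.168.1.10"))  # → permit
```

Real ACLs match on far more than prefixes (protocols, ports, flags) and carry many more actions, but the match-plus-action shape is the same.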
Last week Ivan Pepelnjak wrote an article about the failure domains of controller based network architectures. At the core of SDN solutions is the concept of a controller, which in most cases lives outside the network devices themselves. A controller as a central entity controlling the network (hence its name) provides very significant values and capabilities to the network. We have talked about these in this blog many times.
When introducing a centralized entity into any inherently distributed system, the architecture of such a system needs to carefully consider failure domains and scenarios. Networks have been distributed entities, with each device more or less independent and a huge suite of protocols defined to manage the distributed state between all of them. When you think about it, the extent of distribution we have created in networks is actually quite impressive. We have created an extremely large distributed system with local decision making and control. I am not sure there are too many other examples of complex distributed systems that truly run without some form of central authority.
It is exactly that last point that we networking folks tend to forget or ignore. Many control systems in Continue reading