Archive

Category Archives for "Russ White"

Being an Effective Interviewer

One challenging aspect of being an engineer is interviewing other engineers. The interview process is rife with various problems, including the discomfort of interviewing someone who you perceive as being a better engineer than you are, or figuring out how to draw out actual engineering skill versus simply finding out how much someone has memorized. Does that CCIE or degree on their resume really mean anything? What does an effective network engineering interview look like?

There are, of course, several different theories “out there.” For instance, some companies focus on giving candidates “real world” problems to solve, and checking with them several days later to see how they’ve done. This is potentially useful, but quite often I find my best work is done with a team, rather than by myself. Such systems tend to pit the candidate either against well-known, well-established problem sets, which can easily revert to testing memorization, or against difficult and obscure problems. Either way, this doesn’t test the candidate’s ability to work in a team, or to interact with others in solving difficult problems.

What about tossing other sorts of puzzles towards the candidate to see how they do? This also seems problematic ...

snaproute Go BGP Code Dive (11): Moving to Open Confirm

In the last post in this series, we began looking at the BGP code that handles the open message, which begins moving a new peer to the OpenConfirm state. This is the code of interest:

case BGPEventBGPOpen:
  st.fsm.StopConnectRetryTimer()
  bgpMsg := data.(*packet.BGPMessage)
  if st.fsm.ProcessOpenMessage(bgpMsg) {
    st.fsm.sendKeepAliveMessage()
    st.fsm.StartHoldTimer()
    st.fsm.ChangeState(NewOpenConfirmState(st.fsm))
  }

We looked at how this code assigns the contents of the received packet to bgpMsg; now we need to look at how this information is actually processed. The next line passes bgpMsg to st.fsm.ProcessOpenMessage(). The st.fsm in front of the call means the function is a method of the FSM, so it should be found in fsm.go. Indeed, func (fsm *FSM) ProcessOpenMessage... is around line 1172 in fsm.go:

func (fsm *FSM) ProcessOpenMessage(pkt *packet.BGPMessage) bool {
 body := pkt.Body.(*packet.BGPOpen)

 if uint32(body.HoldTime) < fsm.holdTime {
  fsm.SetHoldTime(uint32(body.HoldTime), uint32(body.HoldTime/3))
 }

 if body.MyAS == fsm.Manager.gConf.AS {
  fsm.peerType = config.PeerTypeInternal
 } else {
  fsm.peerType = config.PeerTypeExternal
 }

 afiSafiMap := packet.GetProtocolFromOpenMsg(body)
 for protoFamily, _ := range afiSafiMap {
  if fsm. ...
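
The first two steps visible in the excerpt above, choosing the hold time and classifying the peer, can be sketched in isolation. This is a minimal, self-contained illustration and not the snaproute implementation: the names timers, negotiateTimers, and classifyPeer are hypothetical, and the uint16 hold time is an assumption based on the two-octet hold time field in the BGP OPEN message. The sketch ignores the special case of a zero hold time.

```go
package main

import "fmt"

// peerType mirrors the internal/external distinction made in the excerpt.
// (Hypothetical type; the real code uses config.PeerTypeInternal/External.)
type peerType int

const (
	peerInternal peerType = iota
	peerExternal
)

// timers holds a hold time and its keepalive interval, both in seconds.
type timers struct {
	hold      uint32
	keepalive uint32
}

// negotiateTimers keeps the local timers unless the peer advertised a
// smaller hold time, in which case the peer's value is used and the
// keepalive interval is set to one third of it (integer division).
func negotiateTimers(local timers, peerHold uint16) timers {
	if uint32(peerHold) < local.hold {
		return timers{hold: uint32(peerHold), keepalive: uint32(peerHold / 3)}
	}
	return local
}

// classifyPeer treats a peer in the same AS as internal, otherwise external.
func classifyPeer(localAS, peerAS uint32) peerType {
	if peerAS == localAS {
		return peerInternal
	}
	return peerExternal
}

func main() {
	local := timers{hold: 180, keepalive: 60}

	t := negotiateTimers(local, 90)
	fmt.Println(t.hold, t.keepalive) // prints "90 30"

	fmt.Println(classifyPeer(65000, 65000) == peerInternal) // prints "true"
	fmt.Println(classifyPeer(65000, 65001) == peerExternal) // prints "true"
}
```

Keeping these two decisions as small pure functions makes them easy to test apart from the state machine; in the excerpt they are performed inline on the FSM itself.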

Can I2RS Keep Up? (I2RS Performance)

What about I2RS performance?

The first post in this series provides a basic overview of I2RS; there I used a simple diagram to illustrate how I2RS interacts with the RIB—

[Figure: rib-fib-remote-proxy]

One question that comes to mind when looking at a data flow like this (or rather should come to mind!) is what kind of performance this setup will provide. Before diving into the answer to this question, though, perhaps it’s important to ask a different question—what kind of performance do you really need? There are (at least) two distinct performance profiles in routing—the time it takes to initially start up a routing peer, and the time it takes to converge on a single topology and/or route change. In reality, this second profile can be further broken down into multiple profiles (with or without an equal cost path, with or without a loop free alternate, etc.), but for our purposes I’ll just deal with the two broad categories here.

If your first instinct is to say that initial convergence time doesn’t matter, go back and review the recent Delta Airlines outage carefully. If you are still not convinced initial convergence time matters, go back and reread what you can ...

Reaction: Devops and Dumpster Fires

Networking is often a “best effort” type of configuration. We monkey around with something until it works, then roll it into production and hope it holds. As we keep building more patches on top of patches or try to implement new features that require something to be disabled or bypassed, that creates a house of cards that is only as strong as the first stiff wind. It’s far too easy to cause a network to fall over because of a change in a routing table or a series of bad decisions that aren’t enough to cause chaos unless done together. (Networking Nerd)

Precisely.

But what are we to do about it? Tom’s Take is that we need to push back on applications. With this, too, I completely agree. But it only brings us to another problem: how do we make the case that applications need to be rewritten to work on a simpler network? The simple answer is to teach coders how networks really work, so they can figure out how to better code to the environment in which their applications live. Let me be helpful here: I’ve been working on networks since somewhere around 1986, and on computers and electronics since ...