Network Break 243: Zoom Changes Tone On Security Vulnerabilities; Cisco Spends $2.6 Billion For Acacia

Today's Network Break analyzes Zoom's change of course on security vulnerabilities, discusses the reasons behind Cisco's multibillion acquisition of Acacia, examines IBM's closing of its Red Hat purchase, and more tech news.

The post Network Break 243: Zoom Changes Tone On Security Vulnerabilities; Cisco Spends $2.6 Billion For Acacia appeared first on Packet Pushers.

The Week in Internet News: Amazon, Microsoft Look to Expand Internet Access

More access, please: Two large tech companies have announced plans to expand Internet access. First, Microsoft and Ohio-based telecom firm Watch Communications have announced an agreement to extend broadband service to underserved areas in Ohio, Indiana, and Illinois, the Associated Press reports, via the Jacksonville Journal Courier. The project is part of Microsoft’s Airband Initiative, an effort to expand service across the U.S.

Look to the sky: Secondly, Amazon has asked the U.S. Federal Communications Commission for permission to launch more than 3,200 satellites, with plans to launch a global broadband network, Smart Cities Dive says. The Amazon plan would target underserved areas across the globe as well as aircraft, ships, and submarines.

That’s a long time without service: More than 350 Internet shutdowns during the last three years have caused the equivalent of 15 years of lost access, The Telegraph reports. About two-third of those shutdowns were in India, and protests or political instability were the reasons for the government actions, according to a report from Access Now.

Warning shot: Two companies, British Airways and Marriott, are facing nine-figure fines (in U.S. dollars) under the European Union’s General Data Protection Regulation for past data breaches Continue reading

Service-defined Firewall Benchmark and Solution Architecture

Today we are happy to introduce the Service-defined Firewall Validation Benchmark report and Solution Architecture document. Firewalls and firewalling technology have come a very long way in thirty years. To understand how VMware is addressing the demands of modern application frameworks, while addressing top concerns for present day CISO’s, let’s take a brief look at the history of this technology.

 

A Brief Firewall History

Over time, the network firewall has grown up, from initially being very basic to more advanced with the inclusion of additional features and functionality. The network firewall incrementally incorporated increasingly complex functionality to address many threats in the modern security landscape.

While the network firewall initially progressed rapidly to keep pace with the development of network technology and rapid evolution of network threat vectors, over the past decade there has been very little in terms of innovation in this space. The requirements of next-generation (NGFW) haven’t changed tremendously since its late 2000’s introduction to the market, and with the uptick in adoption of modern micro-services based architectures into the modern enterprise, applications are becoming more and more distributed in nature, with growing scale and security concerns around the ephemeral nature of the infrastructure.

Micro-services, which Continue reading

Data Shapley: equitable valuation of data for machine learning

Data Shapley: equitable valuation of data for machine learning Ghorbani & Zou et al., ICML’19

It’s incredibly difficult from afar to make sense of the almost 800 papers published at ICML this year! In practical terms I was reduced to looking at papers highlighted by others (e.g. via best paper awards), and scanning the list of paper titles looking for potentially interesting topics. For the next few days we’ll be looking at some of the papers that caught my eye during this process.

The now somewhat tired phrase “data is the new oil” (something we can consume in great quantities to eventually destroy the world as we know it???) suggests that data has value. But pinning down that value can be tricky – how much is a given data point worth, and what framework can we use for thinking about that question?

As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions…. In this work we develop a principled framework to address data valuation in the context of supervised machine learning.

One of the nice outcomes is that once you’ve understood Continue reading

BrandPost: The Latest in Innovation in the SD-WAN Managed Services Market

As enterprisesincreasingly focus on improving network performance to support applications and deliver a better customer experience, SD-WAN solutions are in the spotlight. One of the key components of providing ongoing IT support is ensuring that networks have the agility needed to adapt to changing business priorities at speed.In one recent IDG survey, 91% of enterprises that implemented SD-WAN technologies saw an increase in network speed. SD-WAN managed services have come to the forefront as a choice that allows enterprises to capture the benefits of SD-WAN, along with the expertise to make the most of the technology. Solutions offer access to the knowledge needed to design, deploy and manage SD-WAN networks, while letting the enterprise maintain visibility and control as desired.To read this article in full, please click here

Arista BGP FlowSpec


The video of a talk by Peter Lundqvist from DKNOG9 describes BGP FlowSpec, use cases, and details of Arista's implementation.

FlowSpec for real-time control and sFlow telemetry for real-time visibility is a powerful combination that can be used to automate DDoS mitigation and traffic engineering. The article, Real-time DDoS mitigation using sFlow and BGP FlowSpec, gives an example using the sFlow-RT analytics software.

EOS 4.22 includes support for BGP FlowSpec. This article uses a virtual machine running vEOS-4.22 to demonstrate how to configure FlowSpec and sFlow so that the switch can be controlled by an sFlow-RT application (such as the DDoS mitigation application referenced earlier).

The following output shows the EOS configuration statements related to sFlow and FlowSpec:
!
service routing protocols model multi-agent
!
sflow sample 16384
sflow polling-interval 30
sflow destination 10.0.0.70
sflow run
!
interface Ethernet1
flow-spec ipv4 ipv6
!
interface Management1
ip address 10.0.0.96/24
!
ip routing
!
router bgp 65096
router-id 10.0.0.96
neighbor 10.0.0.70 remote-as 65070
neighbor 10.0.0.70 transport remote-port 1179
neighbor 10.0.0.70 send-community extended
neighbor 10.0.0.70 maximum-routes 12000
!
address-family flow-spec ipv4
neighbor 10.0.0.70 Continue reading

Details of the Cloudflare outage on July 2, 2019

Almost nine years ago, Cloudflare was a tiny company and I was a customer not an employee. Cloudflare had launched a month earlier and one day alerting told me that my little site, jgc.org, didn’t seem to have working DNS any more. Cloudflare had pushed out a change to its use of Protocol Buffers and it had broken DNS.

I wrote to Matthew Prince directly with an email titled “Where’s my dns?” and he replied with a long, detailed, technical response (you can read the full email exchange here) to which I replied:

From: John Graham-Cumming
Date: Thu, Oct 7, 2010 at 9:14 AM
Subject: Re: Where's my dns?
To: Matthew Prince

Awesome report, thanks. I'll make sure to call you if there's a
problem.  At some point it would probably be good to write this up as
a blog post when you have all the technical details because I think
people really appreciate openness and honesty about these things.
Especially if you couple it with charts showing your post launch
traffic increase.

I have pretty robust monitoring of my sites so I get an SMS when
anything fails.  Monitoring shows I was down from 13:03:07  Continue reading

Details zum Cloudflare-Ausfall am 2. Juli 2019

Vor etwa neun Jahren war Cloudflare noch ein winziges Unternehmen und ich war ein Kunde, kein Mitarbeiter. Cloudflare gab es erst seit einem Monat. Eines Tages wurde ich darüber benachrichtigt, dass bei meiner kleinen Website jgc.org der DNS-Service nicht mehr funktionierte. Cloudflare hat seine Verwendung von Protocol Buffers angepasst und dadurch wurde der DNS-Service unterbrochen.

Ich habe eine E-Mail mit dem Titel „Where‘s my dns?“ (Wo ist mein DNS) direkt an Matthew Prince gesendet und er hat mit einer langen, detaillierten, technischen Erklärung reagiert (Sie können den vollständigen E-Mail-Austausch hier lesen), auf die ich antwortete:

Von: John Graham-Cumming
Datum: Do., 7. Okt. 2010 um 09:14
Betreff: Re: Wo ist mein DNS?
An: Matthew Prince

Toller Bericht, danke. Ich werde auf jeden Fall anrufen, wenn es ein
Problem geben sollte.  Es wäre wahrscheinlich sinnvoll, all das in
einem Blog-Beitrag festzuhalten, wenn Sie alle technischen Details haben. Ich glaube nämlich,
dass es Kunden wirklich zu schätzen wissen, wenn mit solchen Dingen offen und ehrlich umgegangen wird.
Sie könnten auch die Traffic-Zunahme nach der Implementierung mit
Diagrammen veranschaulichen.

Ich habe eine recht zuverlässige Überwachung für meine Websites eingerichtet, deshalb bekomme ich eine SMS, wenn
etwas ausfällt.  Meine Daten zeigen,  Continue reading

关于 2019 年 7 月 2 日 Cloudflare 中断的详情

大约九年前,Cloudflare 还是一家小公司,我也还是客户,而不是员工。当时,Cloudflare 早在一个月前就已发布了  jgc.org,有一天警报消息显示,这个小网站似乎不再支持 DNS 了。Cloudflare 实施了一项对 Protocol Buffers 使用的改动,这破坏了 DNS。

我直接给 Matthew Prince 写了一封题为“我的 DNS 在哪儿?”的邮件,他回复了一封篇幅很长、内容详实的技术性解答邮件(您可以点击此处查看往来邮件的全部内容),我对该邮件的回复是:

发件人:John Graham-Cumming
日期:2010 年 10 月 7 日星期四上午 9:14
主题:回复:我的 DNS 在哪儿?
收件人:Matthew Prince

谢谢,这是一篇很棒的报告。如果有问题,我一定会去电
问询。 就某种程度而言,在掌握了所有技术细节后,
将它们撰写为一篇博客文章可能会更好,因为我认为
读者会非常感谢博主对这些信息的坦诚公开。
这一点在您看到文章发布后流量增加的图表时,
会感触更深。

我在密切监控着网站,以便在出现任何故障时能够
收到短信通知。 监控显示,我的网站在 13:03:07 至
14:04:12 期间流量下降。 我会每五分钟测试一次。

这只是个小插曲,我相信您会解决这个问题。 但您确定您不需要
有人在欧洲为您分忧吗?:-)

他的回复是:

发件人:Matthew Prince
日期:2010 年 10 月 7 日星期四上午 9:57
主题:回复:我的 DNS 在哪儿?
收件人:John Graham-Cumming

谢谢。我们已经回复了所有来信。我现在要去办公室,
我们会在博客上发布些信息,或在我们的公告栏系统中
置顶一篇官方帖文。我同意 100%
透明度是最好的。

因此,今天,作为规模远胜以往的 Cloudflare 公司的一员,我要写一篇文章,清楚讲述我们所犯的错误、它的影响以及我们正在为此采取的行动。

7 月 2 日事件

7 月 2 日,我们在 WAF 托管规则中部署了一项新规则,导致全球 Cloudflare 网络上负责处理 HTTP/HTTPS 流量的各 CPU 核心上的 CPU 耗尽。我们在不断改进 WAF 托管规则,以应对新的漏洞和威胁。例如,我们在 5 月份以更新 WAF 的速度出台了一项规则,以防范严重的 SharePoint 漏洞。能够快速地全局部署规则是 WAF 的一个重要特征。

遗憾的是,上周二的更新中包含了一个规则表达式,它在极大程度上回溯并耗尽了用于 HTTP/HTTPS 服务的 CPU。这降低了 Cloudflare 的核心代理、CDN 和 WAF 功能。下图显示了专用于服务 HTTP/HTTPS 流量的 CPU,在我们网络中的服务器上,这些 CPU 的使用率几乎达到了 100%。

事件发生期间某个 PoP 的 CPU 利用率

这导致我们的客户(以及他们的客户)在访问任何 Cloudflare 域时都会看到 502 错误页面。502 错误是由前端 Cloudflare Web 服务器生成的,这些服务器仍有可用的 CPU 内核,但无法访问服务 HTTP/HTTPS 流量的进程。

我们知道这对我们的客户造成了多大的伤害。我们为发生这种事件感到羞耻。在我们处理这一事件时,它也对我们自身的运营产生了负面影响。

如果您是我们的客户,您也一定感受到了难以置信的压力、沮丧和恐惧。更令人懊恼的是,我们的六年零全球中断记录也就此打破。

CPU 耗尽是由一个 WAF 规则引起的,该规则里包含不严谨的正则表达式,最终导致了过多的回溯。作为中断核心诱因的正则表达式是 (?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

尽管正则表达式本身成为很多人关注的焦点(下文将进行详细讨论),但 Cloudflare 服务中断 27 分钟的真实情况要比“正则表达式出错”复杂得多。我们已经花时间写下了导致中断并使我们无法快速响应的一系列事件。如果您想了解更多关于正则表达式回溯以及如何处理该问题的信息,可在本文末尾的附录中查找。

发生了什么情况

我们按事情发生的先后次序讲述。本博客中的所有时间均为协调世界时 (UTC)。

在 13:42,防火墙团队的一名工程师通过一个自动过程对 XSS 检测规则进行了微小改动。这生成了变更请求票证。我们使用 Jira 管理这些票证,下面是截图。

三分钟后,第一个 PagerDuty 页面出现,显示 WAF 故障。这是一项综合测试,从 Cloudflare 外部检查 WAF 的功能(我们会进行数百个此类测试),以确保其正常工作。紧接着出现了多个页面,显示许多其他的 Cloudflare 服务端到端测试失败、全球流量下降警报、众多的 502 错误,之后便是我们在全球各城市的网点 (PoP) 发来的大量指示 CPU 耗尽的报告。



我收到了其中部分警告并立马起身走出会议室,而正在我回到办公桌的途中,解决方案工程师团队的一名负责人告诉我,我们的流量已经减少了 80%。我跑向 SRE 团队,他们正在排除故障。在中断的最初时刻,有人猜测这是某种我们从未见过的攻击。

Cloudflare 的 SRE 团队成员分布在世界各地,他们全天持续监控着网络。绝大多数此类警报都指出了局部区域有限范围内的非常具体的问题,这些警报均在内部仪表板中监控,并且每天会进行多次处理。但这种页面和警报模式表明发生了严重问题,SRE 立即宣布发生 P0 事件,并上报给工程领导层和系统工程部门。

当时,伦敦工程团队正在我们的主要活动场地听取一场内部技术讲座。讲座被打断,所有人都聚集在大型会议室中,商讨问题或是接打电话。这不是 SRE 能够独立处理的一般问题,它需要所有相关团队立即在线联合处理。

在 14:00,WAF 被确定为导致问题的部分原因,并排除了攻击的可能性。性能团队从一台清楚表明 WAF Continue reading

Détails de la panne Cloudflare du 2 juillet 2019

Il y a près de neuf ans, Cloudflare était une toute petite entreprise dont j’étais le client, et non l’employé. Cloudflare était sorti depuis un mois et un jour, une notification m’alerte que mon petit site,  jgc.org, semblait ne plus disposer d’un DNS fonctionnel. Cloudflare avait effectué une modification dans l’utilisation de Protocol Buffers qui avait endommagé le DNS.

J’ai contacté directement Matthew Prince avec un e-mail intitulé « Où est mon DNS ? » et il m’a envoyé une longue réponse technique et détaillée (vous pouvez lire tous nos échanges d’e-mails ici) à laquelle j’ai répondu :

De: John Graham-Cumming
Date: Jeudi 7 octobre 2010 à 09:14
Objet: Re: Où est mon DNS?
À: Matthew Prince

Superbe rapport, merci. Je veillerai à vous appeler s’il y a un
problème.  Il serait peut-être judicieux, à un certain moment, d’écrire tout cela dans un article de blog, lorsque vous aurez tous les détails techniques, car je pense que les gens apprécient beaucoup la franchise et l’honnêteté sur ce genre de choses. Surtout si vous y ajoutez les tableaux qui montrent l’augmentation du trafic suite à votre lancement.

Je dispose d’un système robuste de surveillance de mes sites qui m’envoie un  Continue reading