

At Cloudflare, we're constantly vigilant when it comes to identifying vulnerabilities that could potentially affect the Internet ecosystem. Recently, on September 12, 2023, Google announced a security issue in Google Chrome, titled "Heap buffer overflow in WebP in Google Chrome," which caught our attention. Initially, it seemed like just another bug in the popular web browser. However, what we discovered was far more significant and had implications that extended well beyond Chrome.
The vulnerability, tracked under CVE-2023-4863, was described as a heap buffer overflow in WebP within Google Chrome. While this description might lead one to believe that it's a problem confined solely to Chrome, the reality was quite different. It turned out to be a bug deeply rooted in the libwebp library, which is not only used by Chrome but by virtually every application that handles WebP images.
Digging deeper, this vulnerability was in fact first reported in an earlier CVE from Apple, CVE-2023-41064, although the connection was not immediately obvious. In early September, Citizen Lab, a research lab based out of the University of Toronto, reported on an apparent exploit that was being used to attempt to install spyware on the iPhone Continue reading
IXP Metrics is available on Github. The application provides real-time monitoring of traffic between members of an Internet eXchange Provider (IXP) network.
This article will use Arista switches as an example to illustrate the steps needed to deploy the monitoring solution, however, these steps should work for other network equipment vendors (provided you modify the vendor specific elements in this example).
git clone https://github.com/sflow-rt/prometheus-grafana.git cd prometheus-grafana env RT_IMAGE=ixp-metrics ./start.sh
The easiest way to get started is to use Docker, see Deploy real-time network dashboards using Docker compose, and deploy the sflow/ixp-metrics image bundling the IXP Metrics application.
scrape_configs:
- job_name: sflow-rt-ixp-metrics
metrics_path: /app/ixp-metrics/scripts/metrics.js/prometheus/txt
static_configs:
- targets: ['sflow-rt:8008']
Follow the directions in the article to add a Prometheus scrape task to retrieve the metrics.
sflow source-interface management 1 sflow destination 10.0.0.50 sflow polling-interval 20 sflow sample 50000 sflow run
Enable sFlow on all exchange switches, directing sFlow telemetry to the Docker host (in this case 10.0.0.50).
Use the sFlow-RT Status page to confirm that sFlow is being received from the switches. In this case 286 sFlow datagrams per second are being received from 9 switches. The IX-F Member Export JSON Schema Continue readingIn this episode Tom and Scott explore Zero Trust Architecture (ZTA), where it aligns (and doesn't) with IPv6, and what the future might hold for both technologies.
The post IPv6 Buzz 136: IPv6 And Zero Trust Architecture appeared first on Packet Pushers.


We have always strived to make Cloudflare somewhere where our entire team feels safe and empowered to bring their whole selves to work. It’s the best way to enable the many incredible people we have working here to be able to do their best work. With that as context, we are proud to share that Cloudflare has been certified and recognized as one of the Top 100 Most Loved Workplaces in 2023 by Newsweek and the Best Practice Institute (BPI) for the second consecutive year.

Cloudflare’s ranking follows surveys of more than 2 million employees at companies with team sizes ranging from 50 to 10,000+, and includes US-based firms and international companies with a strong US presence. As part of the qualification for the certification, Cloudflare participated in a company-wide global employee survey — so this award isn’t a hypothetical, it’s driven by our employees’ sentiment and responses.
With this recognition, we wanted to reflect on what’s new, what’s remained the same, and what’s ahead for the team at Cloudflare. There are a few things that especially stand out:
Helping to build a better Internet.
If you speak to any member of Continue reading
This post introduces Cisco's approach to Intent-based Networking (IBN) through their Centralized SDN Controller, DNA Center, rebranded as Catalyst Center. We focus on the network green field installation, showing workflows, configuration parameters, and relationships and dependencies between building blocks.
Figure 1-1 is divided into three main areas: a) Onboard and Provisioning, b) Network Hierarchy and Global Network Settings, c) and Configuration Templates and Site Profiles.
We start a green field network deployment by creating a Network Design. In this phase, we first build a Network Hierarchy for our sites. For example, a hierarchy can define Continent/Country/City/Building/Floor structure. Then, we configure global Network Settings. This phase includes both Network and Device Credentials configuration. AAA, DHCP, DNS serves, DNS name, and Time Zone, which are automatically inherited throughout the hierarchy, are part of the Network portion. Device Credentials, in turn, define CLI, SNMP read/write, HTTP(S) read/write username/password, and CLI enable password. The credentials are used later in the Discovery phase.
Next, we build a site and device type-specific configuration templates. As a first step, we create a Project, a folder for our templates. In Figure 1-1, we have a Composite template into which we attach two Regular templates. Regular templates include Continue reading
netlab release 1.6.2 improved reporting capabilities:
In other news:
netlab release 1.6.2 improved reporting capabilities:
In other news:


On 4 October 2023, Cloudflare experienced DNS resolution problems starting at 07:00 UTC and ending at 11:00 UTC. Some users of 1.1.1.1 or products like WARP, Zero Trust, or third party DNS resolvers which use 1.1.1.1 may have received SERVFAIL DNS responses to valid queries. We’re very sorry for this outage. This outage was an internal software error and not the result of an attack. In this blog, we’re going to talk about what the failure was, why it occurred, and what we’re doing to make sure this doesn’t happen again.
In the Domain Name System (DNS), every domain name exists within a DNS zone. The zone is a collection of domain names and host names that are controlled together. For example, Cloudflare is responsible for the domain name cloudflare.com, which we say is in the “cloudflare.com” zone. The .com top-level domain (TLD) is owned by a third party and is in the “com” zone. It gives directions on how to reach cloudflare.com. Above all of the TLDs is the root zone, which gives directions on how to reach TLDs. This means that the root zone is important Continue reading

在 2023 年 10 月 4 日,从 7:00 UTC 到 11:00 UTC 的这段时间里,Cloudflare 出现了 DNS 解析问题。1.1.1.1 或 Warp、Zero Trust 等产品或使用 1.1.1.1 的第三方 DNS 解析器的一些用户可能会收到对有效查询的 SERVFAIL DNS 响应。对于本次故障,我们深表歉意。本次故障是内部软件错误造成的,不是遭到攻击所致。在这篇博文中,我们将讨论出现了什么故障、发生原因以及我们采取了哪些措施来确保这种情况不再发生。
在域名系统 (DNS) 中,每个域名都位于一个 DNS 区域中。区域是一起控制的域名和主机名的集合。例如,Cloudflare 负责域名 cloudflare.com,我们称其为在 "cloudflare.com" 区域中。.com 顶级域名 (TLD) 归第三方所有,位于 "com" 区域中。它提供如何访问 cloudflare.com 的指示。对于 TLD,最重要的是根区,它提供如何访问 TLD 的指示。这意味着根区对于解析所有其他域名非常重要。与 DNS 的其他重要部分一样,根区也使用 DNSSEC 进行签名,这意味着根区本身包含加密签名。
根区在根服务器上发布,但 DNS 运营商通常也会自动检索并保留根区的副本,使得万一根服务器无法访问时,根区中的信息仍然可用。Cloudflare 的递归 DNS 基础设施也采用了这种方法,因为它还可以加快解析过程。根区的新版本通常每天发布两次。1.1.1.1 有一个名为 static_zone 的 WebAssembly 应用程序,它在主 DNS 逻辑的基础上运行,在新版本可用时为它们提供服务。

在 9 月 21 日,作为根区管理已知和计划变更的一部分,根区中首次加入了一种新的资源记录类型。新资源记录名为 ZONEMD,实际上是根区内容的校验和。
根区通过在 Cloudflare 核心网络中运行的软件检索。随后,它被重新分配到 Cloudflare 遍布全球的数据中心。更改后,包含 ZONEMD 记录的根区仍可正常检索和分配。但是,使用该数据的 1.1.1.1 解析器系统在解析 ZONEMD 记录时遇到了问题。由于区域必须完整加载并提供,如果系统无法解析 ZONEMD,则意味着 Cloudflare 解析器系统未使用新版本的根区。一些托管 Cloudflare 解析器基础架构的服务器在未收到新的根区时,无法逐个请求直接查询 DNS 根服务器。然而,继续依赖于根区的已知正常工作版本的其他用户在内存缓存中仍可使用该版本,即在变更前于 9 月 21 日提取的版本。
在 2023 年 10 月 4 日 07:00 UTC,9 月 21 日发布的根区版本中的 DNSSEC 签名已过期。由于 Cloudflare 解析器系统无法使用本来应该可以使用的较新版本,Cloudflare 的一些解析器系统无法验证 DNSSEC 签名,因此开始发送错误响应 (SERVFAIL)。Cloudflare 解析器生成 SERVFAIL 响应的比率增长了 12%。下图说明了故障的发展过程以及用户如何看到故障。

9 月 21 日 6:30 UTC:最后一次成功提取根区
10 月 4 日 7:00 UTC:在 9 月 21 日获得的根区中的 DNSSEC 签名过期,导致对客户端查询的 SERVFAIL 响应增加。
7:57:开始收到第一份关于意外 SERVFAIL 的外部报告。
8:03:宣布内部 Cloudflare 事件。
8:50:首次尝试通过覆盖规则阻止 1.1.1.1 使用过期的根区文件提供响应。10:30:完全阻止 1.1.1.1 预装根区文件。
10:32:响应恢复正常。
11:02:事件关闭。
下图显示了受影响的时间以及返回 SERVFAIL 错误的 Continue reading

2023 年 10 月 4 日,Cloudflare 於世界標準時 7:00 開始至 11:00 結束期間遇到 DNS 解析問題。1.1.1.1 或 Warp 、 Zero Trust 等產品的一些使用者,或使用 1.1.1.1 的第三方 DNS 解析程式可能已經收到對有效查詢的 SERVFAIL DNS 回應。對於此次服務中斷,我們深感抱歉。此次服務中斷為內部軟體錯誤,而非攻擊造成的結果。在這篇部落格中,我們將討論失敗的內容、發生的原因,以及我們可以採取哪些措施來確保這種情況不再發生。
在 Domain Name System (DNS) 中,每一個網域名稱存在於 DNS 區域內。區域是在一起接受控制的網域名稱和主機名稱的集合。例如,Cloudflare 負責網域 cloudflare.com,我們稱之為「cloudflare.com」區域。頂級網域 (TLD) .com 由第三方擁有,位於「com」區域。它提供如何連線 cloudflare.com 的指示。所有 TLD 之上為根區域,提供如何連線 TLD 的指示。這意味著根區域對於解析所有其他網域名稱很重要。與 DNS 的其他重要部分一樣,根區域使用 DNSSEC 進行簽署,這也意味著根區域本身包含加密簽章。
根區域發布於根伺服器上,但 DNS 營運商自動擷取並保留根區域副本的情況也很常見, 以便在無法連線根伺服器的情況下,根區域中的資訊仍然可供使用。Cloudflare 的遞迴 DNS 基礎架構會採用此方法,因為它還可加速解析程序。新版根區域通常一天發布兩次。1.1.1.1 具有稱為 static_zone 的 WebAssembly 應用程式,該應用程式執行於主 DNS 邏輯之上,當新版本可供使用時,即可提供這些新的版本。

9 月 21 日,作為根區域管理中的已知計畫內變更的一部分,新的資源記錄類型首次納入根區域。新資源記錄稱為 ZONEMD,實際上是根區域內容的總和檢查碼。
藉由執行於 Cloudflare 核心網路的軟體來擷取根區域。隨後,根區域被重新分散到 Cloudflare 在世界各地的資料中心。變更之後,可繼續正常擷取和分散包含 ZONEMD 記錄的根區域。然而,使用該資料的 1.1.1.1 解析程式系統在剖析 ZONEMD 記錄時遇到問題。由於區域必須完整載入和提供,因此系統無法剖析 ZONEMD,這意味著新版根區域未在 Cloudflare 的解析程式系統中使用。若託管 Cloudflare 解析程式基礎架構的某些伺服器未收到新的根區域,則會發生容錯移轉,直接逐個請求地查詢 DNS 根伺服器。不過,其他伺服器會繼續依賴其記憶體快取仍然可用的已知工作版根區域,這是在變更之前於 9 月 21 日提取的版本。
2023 年 10 月 4 日世界標準時 7:00,根區域版本中自 9 月 21 日開始的 DNSSEC 簽章到期。由於 Cloudflare 解析程式系統沒有能夠使用的更新版本,某些 Cloudflare 解析程式系統無法驗證 DNSSEC 簽章,並因此開始傳送錯誤回應 (SERVFAIL)。Cloudflare 解析程式產生 SERVFAIL 回應的速度提升了 12%。下圖說明了失敗的進度,以及如何顯示給使用者。

9 月 21 日世界標準時 6:30:根區最後一次成功提取
10 月 4 日世界標準時 7:00:根區域中於 9 月 21 日取得的 DNSSEC 簽章到期,導致對用戶端查詢的 SERVFAIL 回應增加。
7:57︰第一個外部非預期 SERVFAIL 報告開始出現。
8:03︰正式宣佈發生內部 Cloudflare 事件。
8:50:初次嘗試阻止 1.1.1.1 使用具有覆寫規則的過時根區域檔案來提供回應。
10:30:完全阻止 1.1.1.1 預先載入根區域檔案。
10:32:回應恢復正常。
11:02︰事件結束。
下圖顯示了影響時間表,以及傳回 SERVFAIL 錯誤的 DNS 查詢百分比:

我們預計,在正常操作期間,常規流量的 SERVFAIL 錯誤數量會達到基準。該比例通常在 Continue reading