Thomas Habets blog Archives - NetworkingNexus.net

The strange webserver hot potato — sending file descriptors

I’ve previously mentioned my io_uring webserver tarweb. I’ve now added another interesting aspect to it.

As you may or may not be aware, on Linux it’s possible to send a file descriptor from one process to another over a unix domain socket. That’s actually pretty magic if you think about it.

You can also send unix credentials and SELinux security contexts, but that’s a story for another day.

My goal

I want to run some domains using my webserver “tarweb”. But not all. And I want to host them on a single IP address, on the normal HTTPS port 443.

Simple, right? Just use nginx’s proxy_pass?

Ah, but I don’t want nginx to stay in the path. After SNI (read: “browser saying which domain it wants”) has been identified I want the TCP connection to go directly from the browser to the correct backend.

I’m sure somewhere on the internet there’s already an SNI router that does this, but all the ones I found stay in line with the request path, adding a hop.

Why?

A few reasons:

Having all bytes bounce on the SNI router triples the total number of file descriptors for the connection. (one on the backend, […]

Ideal programming language

My last post about Go got some attention.

In fact, two of my posts got attention that day, which broke my nginx since I was running livecount behind nginx, making me run out of file descriptors when thousands of people had the page open.

It’s a shame that I had to turn off livecount, since it’d be cool to see the stats. But I was out of the country, with unreliable access to both Internet and even electricity in hotels, so I couldn’t implement the real fix until I got back, when it had already mostly died down.

I knew this was a problem with livecount, of course, and I even allude to it in its blog post.

Anyway, back to programming languages.

The reactions to my post can be summarized as:

Oh yes, these things are definite flaws in the language.
What you’re saying is true, but it’s not a problem. Your post is pointless.
You’re dumb. You don’t understand Go. Here let me explain your own blog post to you […]

I respect the first two. The last one has to be from people who are too emotionally invested in their tools, and take articles like this […]

Go is still not good

Previous posts Why Go is not my favourite language and Go programs are not portable have me critiquing Go for over a decade.

These things about Go are bugging me more and more. Mostly because they’re so unnecessary. The world knew better, and yet Go was created the way it was.

Readers of previous posts will find some things repeated here. Sorry about that.

Error variable scope is forced to be wrong

Here’s an example of the language forcing you to do the wrong thing. It’s very helpful for the reader of code (and code is read more often than it’s written) to minimize the scope of a variable. If by mere syntax you can tell the reader that a variable is just used in these two lines, then that’s a good thing.

Example:

if err := foo(); err != nil {
   return err
}

(enough has been said about this verbose repeated boilerplate that I don’t have to. I also don’t particularly care)

So that’s fine. The reader knows err is here and only here.

But then you encounter this:

bar, err := foo()
if err != nil {
  return err
}
if err = […]

Setting clock source with GNU Radio

I bought a GPS Disciplined Oscillator (GPSDO), because I thought it’d be fun for various projects. Specifically I bought this one.

I started by calibrating my ICOM IC-9700. I made sure it got a GPS lock, and connected it to the 9700 10MHz reference port, with a 20dB attenuator inline, just in case. Ok, the receive frequency moved a bit, but how do I know it was improved? My D75 was still about 200Hz off frequency.

Segal’s law paraphrased: “Someone with one radio knows what frequency they’re on. Someone with two radios is never sure”.

Unless, of course, that person has two radios with disciplined oscillators. Which I do. I also have a USRP B200 with an added GPSDO accessory.

Sidenote: wow, that’s gotten expensive. Today I’d probably use the same GPSDO from DXPatrol instead. Note that if you do have the GPSDO installed in the B200, then you cannot use an external 10MHz reference. It’s a known issue. Then again if you paid this much, why would you not use it?

Configuring GNU Radio to use the GPSDO

First I thought that surely the best reference would be the default, so I should be able to just send […]

Software defined KISS modem

I’ve kept working on my SDR framework in Rust called RustRadio, that I’ve blogged about twice before. I’ve been adding a little bit here, a little bit there, with one of my goals being to control a whole AX.25 stack.

As seen in the diagram in this post, we need:

Applications, client and server — I’ve made those.
AX.25 connected mode stack (OSI layer 4, basically) — The kernel’s sucks, so I made that too.
A modem (OSI layer 2), turning digital packets into analog radio — The topic of this post.

The job of the modem

Applications talk in terms of streams. The AX.25 implementation turns that into individual data frames. The most common protocol for sending and receiving frames is KISS.
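
To make "sending and receiving frames" a bit more concrete, here is a minimal sketch of KISS framing in Rust. The special byte values come from the KISS TNC spec; the function names `kiss_encode` and `kiss_decode` are made up for this example and are not RustRadio's API. It only handles data frames, and it escapes the command byte along with the payload, which is an implementation choice rather than something the post prescribes.

```rust
const FEND: u8 = 0xC0; // frame delimiter
const FESC: u8 = 0xDB; // escape marker
const TFEND: u8 = 0xDC; // follows FESC, stands for a literal FEND
const TFESC: u8 = 0xDD; // follows FESC, stands for a literal FESC

/// Wrap one packet as a KISS data frame for TNC port 0-15.
/// (Hypothetical helper name, for illustration only.)
fn kiss_encode(port: u8, payload: &[u8]) -> Vec<u8> {
    let command = (port & 0x0F) << 4; // high nibble: port, low nibble 0: data
    let mut out = vec![FEND];
    for &b in std::iter::once(&command).chain(payload) {
        match b {
            FEND => out.extend_from_slice(&[FESC, TFEND]),
            FESC => out.extend_from_slice(&[FESC, TFESC]),
            _ => out.push(b),
        }
    }
    out.push(FEND);
    out
}

/// Decode exactly one complete frame. Returns (port, payload), or None if
/// the frame is malformed or is not a data frame.
fn kiss_decode(frame: &[u8]) -> Option<(u8, Vec<u8>)> {
    let inner = frame.strip_prefix(&[FEND])?.strip_suffix(&[FEND])?;
    let mut raw = Vec::with_capacity(inner.len());
    let mut it = inner.iter().copied();
    while let Some(b) = it.next() {
        match b {
            FESC => match it.next()? {
                TFEND => raw.push(FEND),
                TFESC => raw.push(FESC),
                _ => return None, // invalid escape sequence
            },
            FEND => return None, // unescaped delimiter inside the frame
            _ => raw.push(b),
        }
    }
    let (&command, payload) = raw.split_first()?;
    if command & 0x0F != 0 {
        return None; // some other KISS command, not data
    }
    Some((command >> 4, payload.to_vec()))
}

fn main() {
    let payload = [0x01, FEND, 0x02, FESC, 0x03];
    let framed = kiss_encode(0, &payload);
    assert_eq!(
        framed,
        [FEND, 0x00, 0x01, FESC, TFEND, 0x02, FESC, TFESC, 0x03, FEND]
    );
    assert_eq!(kiss_decode(&framed), Some((0, payload.to_vec())));
}
```

Note that nothing in here knows about audio or I/Q. KISS only delimits packets; what happens on the radio side of the modem is a separate concern.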

I’ve not been happy with the existing KISS modems for a few reasons. The main one is that they just convert between packets and audio. I don’t want audio, I want I/Q signals suitable for SDRs.

On the transmit side it’s less of a problem for regular 1200bps AX.25, since either the radio will turn audio into an FM-modulated signal, or if using an SDR it’s trivial to add the audio-to-I/Q step.

QO100 early success

I have heard and been heard via QO-100! As a licensed radio amateur I have sent signals via satellite as far away as Brazil.

What it is

QO-100 is the first geostationary satellite with an amateur radio payload. A “repeater”, if you will. Geostationary means that you just aim your antenna (dish) once, and you can use it forever.

This is amazing for tweaking and experimenting. Other amateur radio satellites are only visible in the sky for minutes at a time, and you have to chase them across the sky to make a contact before they’re gone.

They also fly lower, meaning they can only see a small part of the world at a time. QO-100 can at all times see and be seen by all of Africa, Europe, India, and parts of Brazil.

Needs a bit more equipment, though

Other “birds” (satellites) can be accessed using a normal handheld FM radio and something like an arrow antenna. Well, you should actually have two radios, so that you can hear yourself on the downlink while transmitting.

There are also linear amateur radio satellites. For them you need SSB radios, which narrows down which radios you can use. And you still need […]

io_uring, kTLS and Rust for zero syscall HTTPS server

Around the turn of the century we started to get a bigger need for high capacity web servers. For example there was the C10k problem paper.

At the time, one of the things done to reduce the work per request was pre-forking the web server. This means a request could be handled without an expensive process creation.

Because yes, creating a new process for every request was something perfectly normal.

Things did get better. People learned how to create threads, making things more light weight. Then they switched to using poll()/select(), in order to not just spare the process/thread creation, but the whole context switch.

I remember a comment on Kuro5hin from anakata, the creator of both The Pirate Bay and the webserver that powered it, along the lines of “I am select() of borg, resistance is futile”, mocking someone for not understanding how to write a scalable webserver.

But select()/poll() also doesn’t scale. If you have ten thousand connections, that’s an array of ten thousand integers that need to be sent to the kernel for every single iteration of your request handling loop.

Enter epoll (kqueue on other operating systems, but I’m focusing […]

Exploring RISC-V vector instructions

It finally happened! A Raspberry Pi-like device, with a RISC-V CPU supporting the v extension. Aka RVV. Aka vector instructions.

I bought one, and explored it a bit.

SIMD background

First some background on SIMD.

SIMD is a set of instructions allowing you to do the same operation to multiple independent pieces of data. As an example, say you had four 8-bit integers, and you wanted to multiply them all by 2, then add 1. You could do this with a single operation without any special instructions.

    # x86 example assembly.

    mov eax, [myvalues]  # load our four bytes.
    mov ebx, 2           # we want to multiply by two
    imul eax, ebx        # single operation, multiple data!
                         # After this, eax contains 0x08060402
                         # (x86 is little-endian: bytes 2,4,6,8)
    add eax, 0x01010101  # single operation, multiple data!
                         # After this, eax contains 0x09070503
    mov [myvalues], eax  # store back the new value.

section .data
  myvalues db 1,2,3,4

Success, right? No, of course not. This naive code doesn’t handle over/underflow, and doesn’t even remotely work for floating point data. For that, we need special SIMD instructions.
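
To see the overflow problem concretely, here is the same trick sketched in Rust, compared against doing the arithmetic lane by lane. The function names are made up for this example. As long as every result fits in 8 bits the two agree, but as soon as one lane overflows, the carry leaks into its neighbour.

```rust
/// Multiply four packed 8-bit lanes by 2 and add 1 using ordinary 32-bit
/// integer arithmetic, like the assembly above.
fn double_plus_one_packed(lanes: [u8; 4]) -> [u8; 4] {
    let packed = u32::from_le_bytes(lanes);
    packed
        .wrapping_mul(2)
        .wrapping_add(0x0101_0101)
        .to_le_bytes()
}

/// Reference: the same operation applied to each lane independently,
/// wrapping around within that lane only.
fn double_plus_one_per_lane(lanes: [u8; 4]) -> [u8; 4] {
    lanes.map(|v| v.wrapping_mul(2).wrapping_add(1))
}

fn main() {
    // Small values: the packed trick gives the right answer.
    let small = [1, 2, 3, 4];
    assert_eq!(double_plus_one_packed(small), [3, 5, 7, 9]);
    assert_eq!(double_plus_one_per_lane(small), [3, 5, 7, 9]);

    // 200 * 2 does not fit in 8 bits. Per lane it simply wraps...
    let big = [200, 2, 3, 4];
    assert_eq!(double_plus_one_per_lane(big), [145, 5, 7, 9]);
    // ...but packed, the carry out of lane 0 corrupts lane 1 (6, not 5).
    assert_eq!(double_plus_one_packed(big), [145, 6, 7, 9]);
}
```

Real SIMD instructions keep the lanes separate in hardware, and typically come in both wrapping and saturating flavours.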

x86 and ARM have gone the way of fixed-size registers. In 1997 Intel introduced MMX, to great […]

Rebuilding FRR with pim6d

Short post today.

Turns out that Debian, in its infinite wisdom, disables pim6d in frr. Here’s a short howto on rebuilding the package with pim6d enabled.

$ sudo apt build-dep frr
[…]
$ apt source frr
[…]
$ cd frr-8*
$ DEB_BUILD_PROFILES=pkg.frr.pim6d dpkg-buildpackage -us -uc -b
$ sudo dpkg -i ../frr_*.deb

Then you can enable pim6d in /etc/frr/daemons and restart frr.

Not that I managed to get IPv6 multicast routing to work over wireguard interfaces anyway. Not sure what’s wrong. Though it didn’t fix it, here’s an interesting command that made stuff like ip -6 mroute look like it should work:

$ sudo smcroutectl  add LAN ff38:40:fd11:222:3333:44:0:1122 wg-foo

Pike is wrong on bloat

This is my response to Rob Pike’s words On Bloat.

I’m not surprised to see this from Pike. He’s a NIH extremist. And yes, in this aspect he’s my spirit animal when coding for fun. I’ll avoid using a framework or a dependency because it’s not the way that I would have done it, and it doesn’t do it quite right… for me.

And he correctly recognizes the technical debt that an added dependency involves.

But I would say that he has two big blind spots.

He doesn’t recognize that not using the available dependency is also adding huge technical debt. Every line of code you write is code that you have to maintain, forever.
The option for most software isn’t “use the dependency” vs “implement it yourself”. It’s “use the dependency” vs “don’t do it at all”. If the latter means adding 10 human years to the product, then most of the time the trade-off makes it not worth doing at all.

He shows a dependency graph of Kubernetes. Great. So are you going to write your own Kubernetes now?

Pike is a good enough coder that he can write his own editor (Wikipedia: “Pike has written many text […]

Connection coalescing breaks the Internet

Connection coalescing is the dumbest idea to ever reach RFC status. I can’t believe nobody stopped it before it got this far.

It breaks everything.

Thus starts my latest opinion post.

What is connection coalescing?

It’s specified in the RFC for HTTP/2 as connection reuse, but tl;dr: If the IP addresses of hosts A and B overlap, and host A presents a TLS cert that also includes B (via explicit CN/SAN or wildcard cert), then the client is allowed to send HTTP requests directed to B on the connection that was established to A.
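
As a rough sketch, the client-side rule in that tl;dr could be written like this in Rust. Everything here is simplified and hypothetical: the function names are invented, the certificate is reduced to a list of names, and real clients apply further checks that are left out. It only illustrates the two conditions named above.

```rust
use std::net::IpAddr;

/// Does one certificate name (exact, or a `*.` wildcard) cover `host`?
/// A wildcard is taken to stand in for exactly one leftmost label.
fn cert_name_covers(cert_name: &str, host: &str) -> bool {
    let cert_name = cert_name.to_ascii_lowercase();
    let host = host.to_ascii_lowercase();
    match cert_name.strip_prefix("*.") {
        Some(suffix) => match host.split_once('.') {
            Some((label, rest)) => !label.is_empty() && rest == suffix,
            None => false,
        },
        None => cert_name == host,
    }
}

/// May a request for `host_b` reuse a connection that was opened to host A?
fn may_coalesce(
    ips_a: &[IpAddr],      // addresses host A resolved to
    ips_b: &[IpAddr],      // addresses host B resolved to
    cert_names_a: &[&str], // names in the cert presented on A's connection
    host_b: &str,
) -> bool {
    let ip_overlap = ips_b.iter().any(|ip| ips_a.contains(ip));
    let cert_ok = cert_names_a
        .iter()
        .any(|name| cert_name_covers(name, host_b));
    ip_overlap && cert_ok
}

fn main() {
    // Documentation-only addresses and domains.
    let shared: IpAddr = "192.0.2.1".parse().unwrap();
    let other: IpAddr = "192.0.2.2".parse().unwrap();
    let cert = ["a.example.com", "*.example.net"];

    // Overlapping IP and the wildcard covers B: coalescing is allowed.
    assert!(may_coalesce(&[shared], &[shared, other], &cert, "b.example.net"));
    // No IP overlap: not allowed.
    assert!(!may_coalesce(&[shared], &[other], &cert, "b.example.net"));
    // Cert does not cover B: not allowed.
    assert!(!may_coalesce(&[shared], &[shared], &cert, "b.example.org"));
}
```

Note what is missing from the inputs: anything the server has to say about whether it wants this to happen.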

Why did they do that?

To save roundtrips and TLS handshakes. It seems like a good idea if you don’t think about it too much.

Why does it break everything?

I’ll resist just yelling “layering violation”, because that’s not helpful. Instead I’ll be more concrete.

Performing connection coalescing is a client side (e.g. browser) decision. But it implicitly mandates a very strict server architecture. It assumes that ALL affected hostnames are configured exactly the same in many regards, and indeed that the HTTP server even has the config for all hostnames.

Concrete things that this breaks:

The server can’t have a freestanding TLS termination layer, […]

An AX.25 implementation in Rust

After having written a user space AX.25 stack in C++, I got bitten by the Rust bug. So this is the third time I’ve written an AX.25 stack, and I’ve become exceedingly efficient at it.

Here it is:

The reason for a user space stack remains from last time, but this time:

It’s written in Rust. Yay! I know people say Rust has a honeymoon period, but I guess that’s where I am, still.
It’s a normal library first. The previous C++ implementation started off as microservices, which in retrospect was needlessly complex and put the cart before the horse.

I’ve added almost an excessive amount of comments to the code, to cross reference with the specs. The specs that have a few bugs, by the way.

Rust

I’m not an expert in Rust, but it allows for so much more confidence in your code than any other language I’ve tried.

I think I know enough Rust to know what I don’t fully know. Sure, I’ve successfully added lifetime annotations, created macros, and built async code, but I’m not fluent in those yet.

Interestingly, […]

Is your TLS resuming?

There are two main ways that a TLS handshake can go: Full handshake, or resume.

There are two benefits to resumption:

it can save a round trip between the client and server.
it saves CPU cost of a public key operation.

Round trip

Saving a round trip is important for latency. Some websites don’t use a CDN, so a roundtrip could take a while. And even those on a CDN can be tens of milliseconds away. Maybe won’t matter much for a human, but roundtrips can kill the performance of something that needs to do sequential connections.

E.g. Australia is far away:

$ ping -c 1 -n www.treasury.gov.au
PING treasury.gov.au (3.104.80.4) 56(84) bytes of data.
64 bytes from 3.104.80.4: icmp_seq=1 ttl=39 time=369 ms

That’s about a third of a second. Certainly noticeable to a human. Especially since rendering a web page usually requires many connections to different hosts.

For TCP based web requests (in other words: not QUIC), there are usually four roundtrips involved (slightly simplified):

TCP connection establishment.
ClientHello & ServerHello.
Client & Server ChangeCipherSpec.
HTTP request & response.
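
As a back-of-the-envelope sketch of what those roundtrips cost, here is the arithmetic in Rust. It assumes every roundtrip costs exactly the single ping sample above (369 ms), and that resumption removes exactly one of the four roundtrips. Both are simplifications, and `network_wait` is just a name made up for this example.

```rust
use std::time::Duration;

/// Time spent purely waiting on the network for a number of sequential
/// roundtrips, ignoring all processing time.
fn network_wait(rtt: Duration, round_trips: u32) -> Duration {
    rtt * round_trips
}

fn main() {
    // Assumption: the one ping sample above is representative.
    let rtt = Duration::from_millis(369);

    let full = network_wait(rtt, 4); // the four roundtrips listed above
    let resumed = network_wait(rtt, 3); // assuming resumption saves one

    println!("full handshake: {full:?}, resumed: {resumed:?}");
    assert_eq!(full, Duration::from_millis(1476));
    assert_eq!(resumed, Duration::from_millis(1107));
}
```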

So from the UK to Australia, that’s about […]

Rust is faster than C, even before I added SIMD

I found some old C code of mine from around 2001 or so. I vaguely remember trying to make it as optimized as possible. Sure, I was still a teenager, so it’s not state of the art. But it’s not half bad. I vaguely suspect I could do better with better optimization for cache lines, but it’s pretty good.

On my current laptop it does about 12 million passwords per second, single threaded.

Because I’m learning Rust, I decided to port it, and see how fast Rust is.

Imagine my surprise when even the first version in Rust was faster. (Yes, I rebuilt the old C code with a modern compiler and its optimizations)

The first Rust version was about 13 million passwords per second.

Why is that? It’s basically the same as the C code. Maybe Rust can take advantage of knowing there’s no pointer aliasing (the reason usually quoted for why Fortran can be faster than C)? Or maybe the memory layout just happened to become more cache friendly?

In any case, I think we can already say that Rust is at least as fast as C.

The code is on GitHub.

SIMD (with Rust)

I realized, of […]

Cross compiling Rust — Fixed

Set up build environment

rustup toolchain install nightly
rustup component add rust-src --toolchain nightly
apt install {binutils,gcc}-mips-linux-gnu

Create test project

cargo new foo
cd foo

Configure linker

mkdir .cargo
cat > .cargo/config.toml <<'EOF'
[target.mips-unknown-linux-gnu]
linker = "mips-linux-gnu-gcc"
EOF

Build

cargo +nightly build --release -Zbuild-std --target mips-unknown-linux-gnu

Change the “interpreter” to what the Ubiquiti system expects

cd target/mips-unknown-linux-gnu/release
patchelf --remove-needed ld.so.1 foo
patchelf --set-interpreter /lib/ld-musl-mips-sf.so.1 foo

Does it work?

$ ./foo
Hello, world!

Yay!

Cross compiling Rust to Ubiquiti access point

This is not the right way to do it, as will become abundantly clear. But it works.

Set up build environment

rustup toolchain install nightly
rustup component add rust-src --toolchain nightly
apt install {binutils,gcc}-mips-linux-gnu

Create test project

cargo new foo
cd foo

Build most of it

This will build for a while, then fail.

cargo +nightly build --release -Zbuild-std --target mips-unknown-linux-gnu

For some reason it’s trying to use cc to link. I tried putting this in Cargo.toml, but it does nothing:

[target.mips-unknown-linux-gnu]
linker = "mips-linux-gnu-gcc"

But I found a workaround.

Temporarily change `/usr/bin/cc` to point to the mips gcc

It does not work if you do this before the previous step.

PREV="$(readlink -v /usr/bin/cc)"
sudo rm /usr/bin/cc
sudo ln -s /usr/bin/mips-linux-gnu-gcc /usr/bin/cc

Link the program

Same command again

cargo +nightly build --release -Zbuild-std --target mips-unknown-linux-gnu

It should succeed. Yay.

Restore `/usr/bin/cc`

sudo rm /usr/bin/cc
sudo ln -s "${PREV?}" /usr/bin/cc

Change the “interpreter” to what the Ubiquiti system expects

cd target/mips-unknown-linux-gnu/release
patchelf --remove-needed ld.so.1 foo
patchelf --set-interpreter /lib/ld-musl-mips-sf.so.1 foo

Building it again

Probably easiest to rm -fr target, and go back to the step “Build most of it”.

Does it work?

$ ./foo
Hello, world!

Yay!

Use AGW for packet radio applications

When creating packet radio applications, there are several options on how to get the packets “out there”, and get them back. That is, how to interface with the modem.

Sure, you can write your own modem, and have the interface to the outside world be plain audio and PTT (push to talk, i.e. trigger transmit). But now you’re writing a modem, not an application. You should probably split the two, and have an interface between them.

KISS

You can use KISS, but it’s very limited. You can only send individual packets, so it’s only really good for sending unconnected (think UDP) packets like APRS. It’s not good for querying metadata, such as port information and outstanding transmit queue.

Think of KISS like a lower layer that applications shouldn’t think about. Like ethernet. Sure, as a good engineer you should know about KISS, but it’s not what your application should be interfacing with.

Linux kernel implementation

On Linux you can use AF_AX25 sockets, and program exactly like you do for regular internet/IP programs. SOCK_DGRAM for UI frames (UDP-like), and SOCK_STREAM for connected mode (TCP-like).

But the Linux kernel implementation is way too buggy. SOCK_STREAM works kinda OK, but does […]

Meshtastic quick setup

I wanted some nice offline mid range chat app, for when I don’t have data, or data roaming is too expensive. I also want it to work for people who are not amateur radio licensed, since my girlfriend stubbornly refuses to be interested in that.

Looks like the answer I’m looking for is Meshtastic, preferably with LoRa. I bought a couple of Heltec V3 ESP32 LoRa OLED and the matching case.

Maybe I’ll buy a battery, but I’m fine just powering it from a USB power bank.

The documentation makes a fair number of assumptions about the user knowing the name for what they want, and what firmware provides what.

In short, what I think I want is to ignore the Heltec firmware, and instead just treat the Heltec V3 as the hardware that Meshtastic runs on.

The recommended way to flash, and for some cases even use, is the Meshtastic Web UI. It uses browser integration for serial ports and Bluetooth. A nice idea, but it was extremely unreliable for me. The flasher worked for one device, but not the other. The chat client never worked at all.

Here’s what worked reliably for me:

Download “stable” firmware […]

Apollo 11 notes

I was re-reading the Apollo 11 mission reports, as one does, and decided to take some notes along the way.

If you’re interested in these things, I also highly recommend curiousmarc’s series on the Apollo comms hardware.

Notes

First time I’ve seen the word “doff”. Can’t wait to use it in daily conversation.

The rocket equation is a beast. The LM descent stage had 8’210kg of propellant. The ascent stage only 2’365kg.
– Volume 1, Page 50

In total 10’849kg out of 15’061 (72%) of the LM was propellant. (excluding the astronauts themselves)

The LM flown on Apollo 10 did not have the landing program in its computer. To prevent the temptation to land?
– Volume 1, Page 62

Armstrong’s parents were “Mr. and Mrs. Stephen Armstrong”. Michael Collins’ mother is mentioned, but her name is also lost to history, as she’s referred to as “Mrs. James L. Collins”. Only Buzz Aldrin’s mother is named (and what a name!), as Marion Moon Aldrin.

All three were born in 1930, making them turn 39 in 1969.
– Volume 1, Page 76-78

“High speed” data mode is 2400bps, divided into 240 bit blocks.
– Volume 1, Page 93

Aside from the […]

RustRadio improved API 0.4

Since last time, I’ve improved the API a bit. That last post was about API version 0.3. Now it’s on 0.4, and I think it’s getting pretty decent.

0.3 could never have worked very well. The API was VecDeque-based, which means it could not provide a linear view (a slice) of all the data in the buffer.
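
Here is a small sketch of that limitation, using nothing but the standard library. It is not RustRadio code, and `wrapped_deque` is a name invented for the example. Once the contents wrap around the end of the ring, `as_slices()` hands back two pieces, and the only way to get a single slice is `make_contiguous()`, which needs mutable access and may physically move elements.

```rust
use std::collections::VecDeque;

/// Build a deque whose contents wrap around the end of its ring buffer.
fn wrapped_deque() -> VecDeque<u8> {
    let mut q: VecDeque<u8> = VecDeque::with_capacity(8);
    let cap = q.capacity();
    // Fill the ring completely, then free two slots at the front...
    for i in 0..cap {
        q.push_back(i as u8);
    }
    q.pop_front();
    q.pop_front();
    // ...so that new data has to wrap around to the start of the allocation.
    q.push_back(100);
    q.push_back(101);
    q
}

fn main() {
    let mut q = wrapped_deque();

    // Logically one stream of bytes, physically two separate pieces.
    // Exactly where the split lands is an implementation detail.
    let (front, back) = q.as_slices();
    assert!(!front.is_empty() && !back.is_empty());
    assert_eq!(front.len() + back.len(), q.len());

    // A single linear view requires rearranging the buffer first.
    let len = q.len();
    let linear: &mut [u8] = q.make_contiguous();
    assert_eq!(linear.len(), len);
}
```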

The 0.4 API is simpler. You get a typed slice, and you read from or write to it, as appropriate. Because all streams are currently single writer, single reader, the code is simple, and requires a minimal amount of locking.

It’s simpler, but I switched to using memory mapped circular buffers, with a slice as the stream interface. This means that the buffer is allocated only once, yet both reader and writer can use all space available to them, linearly, without having to worry about wrapping around.

The code is still at https://github.com/ThomasHabets/rustradio. I registered the github org rustyradio, too. rustradio was taken. I sent a message to the owner, since it seems to not have any real content, but have not heard back.

Unsafe code

To make this multiuser stream I did have to write some […]
