Benchmarking spreadsheet systems
Benchmarking spreadsheet systems, Rahman et al., Preprint
A recent Twitter thread drew my attention to this pre-print paper. When spreadsheets were originally conceived, data and formulas were entered by hand, so everything operated at human scale. Increasingly we're dealing with larger and larger datasets (data imported via csv files, for example) and spreadsheets are creaking. I'm certainly familiar with the sinking feeling on realising I've accidentally asked a spreadsheet to open a file with tens of thousands of rows, and that my computer is now going to be locked up for an age. Rahman et al. construct a set of benchmarks to try to understand what might be going on under the covers in Microsoft Excel, Google Sheets, and LibreOffice Calc.
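To make the benchmarking idea concrete, here's a minimal sketch of one such scaling experiment: generate synthetic CSV files of increasing row counts and time how long a headless LibreOffice instance takes to import (convert) each one. This is my own illustration, not the authors' actual harness; the row counts, column count, and use of the `soffice --headless --convert-to` CLI are assumptions, and the script assumes `soffice` is on the PATH.

```python
import csv
import random
import subprocess
import tempfile
import time
from pathlib import Path


def make_csv(path: Path, rows: int, cols: int = 10) -> None:
    """Write a synthetic numeric dataset with the given dimensions."""
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"col{i}" for i in range(cols)])
        for _ in range(rows):
            writer.writerow([random.random() for _ in range(cols)])


def time_import(csv_path: Path, out_dir: Path) -> float:
    """Time a headless LibreOffice import of the CSV (converted to .ods)."""
    start = time.perf_counter()
    subprocess.run(
        ["soffice", "--headless", "--convert-to", "ods",
         "--outdir", str(out_dir), str(csv_path)],
        check=True,
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        # Hypothetical sizes chosen to expose scaling behaviour.
        for rows in (1_000, 10_000, 100_000, 500_000):
            csv_path = tmp_dir / f"data_{rows}.csv"
            make_csv(csv_path, rows)
            elapsed = time_import(csv_path, tmp_dir)
            print(f"{rows:>8} rows: {elapsed:.2f}s")
```

Plotting import time against row count in an experiment like this is one way to see whether a system scales roughly linearly with data size or degrades much faster; the paper's benchmarks probe this and other operations across all three systems.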
Spreadsheets claim to support pretty large datasets these days (e.g. five million cells for Google Sheets, and even more than that for Excel). But in practice, they struggle at sizes well below these limits.
With increasing data sizes… spreadsheets have started to break down to the point of being unusable, displaying a number of scalability problems. They often freeze during computation, and are unable to import datasets well below the size limits posed by…