My name is Arseny Kapoulkine and this is my blog where I write about computer graphics, optimization, programming languages and related topics. I’m the author of pugixml, meshoptimizer and other projects.

20 April 2019 qgrep internals

In 2011-2012 I worked on FIFA Street, followed by FIFA EURO 2012 DLC and finally FIFA 13 - all of these games were based on the same codebase, and this codebase was HUGE. Given an unknown codebase, you need a way to quickly get around it - since you don’t know the code, you resort to search-based navigation, aka grep. Using Visual Studio Ctrl+Shift+F search on a HDD on a codebase this size means that every search takes minutes. This was frustrating and as such I decided to solve this problem.

11 March 2019 Small, fast, web

When implementing vertex/index decoders in meshoptimizer, the main focus was on lean implementation and decompression performance.

When your streaming source is capable of delivering hundreds of megabytes per second, as is the case with SSD drives, and you want to accelerate loading by compressing the data further, you need to be decompressing at multiple hundreds of megabytes per second, ideally a gigabyte, to make sure a small number of CPU cores can keep up with IO. Keeping implementation lean meant it was easy to understand and optimize. To supplant the inevitable loss of compression ratio, the codecs were designed in such a way that their output can be compressed further using lossless general purpose compressors such as lz4 or zstd, thus offering an easy tradeoff between compression ratio and performance.

This set of implementation decisions unexpectedly resulted in algorithms that are a pretty good fit for delivering web content. The performance penalty often induced by running code in the browser is offset by the incredibly high baseline performance, and most web content has “free” gzip compression efficiently applied during the download process. This article will describe the evolution of decoder.js, a WebAssembly port of geometry decoders from meshoptimizer.

17 February 2019 Flavors of SIMD

During development of meshoptimizer a question that comes up relatively often is “should this algorithm use SIMD?”. The library is performance-oriented, but SIMD doesn’t always provide significant performance benefits - unfortunately, the use of SIMD can make the code less portable and less maintainable, so this tradeoff has to be resolved on a case by case basis. When performance is of utmost importance, such as vertex/index codecs, separate SIMD implementations for SSE and NEON instruction sets need to be developed and maintained. In other cases it’s helpful to understand how much SIMD can help to make the decision. Today we will go through the exercise of accelerating sloppy mesh simplifier, a new algorithm that was recently added to the library, using SSEn/AVXn instruction sets.

17 January 2019 Is C++ fast?

A library that I work on often these days, meshoptimizer, has changed over time to use fewer and fewer C++ library features, up until the current state where the code closely resembles C even though it uses some C++ features. There have been many reasons behind the changes - dropping C++11 requirement allowed me to make sure anybody can compile the library on any platform, removing std::vector substantially improved performance of unoptimized builds, removing algorithm includes sped up compilation. However, I’ve never quite taken the leap all the way to C with this codebase. Today we’ll explore the gamut of possible C++ implementations for one specific algorithm, mesh simplifier, henceforth known as simplifier.cpp, and see if going all the way to C is worthwhile.

30 December 2017 Voxel terrain: physics

In the last article we’ve discussed the particulars of voxel data definition and storage for voxel terrain we use at Roblox. From there on a lot of other systems read & write data from the storage and interpret it in different ways - the implementation for each system (rendering, networking, physics) is completely separate and not tied too much to decisions storage or other systems are making, so we can study them independently.

While logically speaking it would make sense to look at mesher next (which is how we call the component that is capable of taking a box of voxel data and producing triangle data representing the terrain surface with material attributes), since it is used by both physics and rendering systems, the algorithm is pretty involved and has quite a bit of “magic” so we will leave that for some other time and will instead look at physics today.