I got myself into the optimization rabbit hole a few months ago, as I had a slow quant finance library to take care of. So far my most successful optimizations have been using local memory allocators (see my C++ post; I also played with mimalloc, which helped, but custom local allocators are even better) and rethinking class layouts in a more “data-oriented” way, mostly going from array-of-structs to struct-of-arrays layouts whenever that’s advantageous (see for example this talk).
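If it helps to see what I mean, here’s a rough, self-contained sketch of both ideas. The names (price_batch, QuoteAoS, QuotesSoA) are made up for illustration and aren’t from the actual library; the allocator part just uses std::pmr rather than the custom allocators I ended up with.

```cpp
// Minimal sketch: a scoped arena allocator, and AoS turned into SoA.
#include <cstddef>
#include <memory_resource>
#include <vector>

// Local (arena) allocation: every vector below carves memory out of one
// upfront buffer, and everything is released in one shot at scope exit.
void price_batch(std::size_t n) {
    std::pmr::monotonic_buffer_resource arena(1 << 20);  // 1 MiB upfront
    std::pmr::vector<double> spots(n, 0.0, &arena);
    std::pmr::vector<double> vols(n, 0.0, &arena);
    // ... fill and use; no per-element heap traffic, no individual frees
}

// Array-of-structs: a loop that only needs `mid` still drags bid, ask and
// timestamp through the cache, since each record is laid out contiguously.
struct QuoteAoS {
    double bid, ask, mid;
    long long timestamp;
};

// Struct-of-arrays: fields that are scanned together live together, so a
// pass over `mid` streams one dense array of doubles and vectorizes easily.
struct QuotesSoA {
    std::vector<double> bid, ask, mid;
    std::vector<long long> timestamp;

    double sum_mid() const {
        double s = 0.0;
        for (double m : mid) s += m;
        return s;
    }
};
```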
What are some of your preferred optimizations that yielded sizeable gains in speed and/or memory usage? I realize that many optimizations aren’t necessarily specific to any given language so I’m asking in [email protected].
We had a service that compiles a dataset once per quarter; the total size is ~30 GB. We stored the compiled data on an EFS volume and mounted it into each container like any other disk.
Every time a pod started it had to read that data into memory, so pods came up quickly but still took a while before they were ready to serve traffic.
Since we didn’t need to update it very often, we decided to just package the compiled dataset into the container image and skip the EFS volume. We set the image pull policy to IfNotPresent, which cut egress traffic pricing from EFS to zero. There is now a cost to pull the image from ECR, but only when a pod is scheduled onto a node it hasn’t run on before. There was no noticeable change in behavior or performance and we saved a bunch on cost. Sometimes the big, dumb option is the right choice.
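For reference, the change boiled down to something like this (image name, tag, and resource names are made up): the compiled dataset gets baked into the image at build time, and the pod spec tells the kubelet to reuse a locally cached image instead of pulling it again.

```yaml
# Illustrative pod spec fragment, not our real manifest.
apiVersion: v1
kind: Pod
metadata:
  name: dataset-service
spec:
  containers:
    - name: app
      # The ~30 GB compiled dataset is COPY'd into this image at build time.
      image: <account>.dkr.ecr.<region>.amazonaws.com/dataset-service:2024q1
      # Only pull from ECR if this node has never seen the tag before.
      imagePullPolicy: IfNotPresent
```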