I have created renders with different variations, but it is hard to preview every possible combination. So I decided that the best way is to have an interactive preview.
On this webpage, the final image is composed from multiple layers (frame, earpads, headpad). Each layer has a separate file for each required variant (color and angle).
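As a rough illustration of the layering idea, here is a minimal sketch of how the image files for one selection could be looked up. The file layout (`<base>/<layer>/<color>_<angle>.webp`), the stacking order, and all names such as `layerUrl` and `urlsForSelection` are assumptions made up for this example, not the page's actual code.

```typescript
// Layers that make up the final image (taken from the description above).
type Layer = "frame" | "earpads" | "headpad";

// One chosen color per layer plus a shared camera angle index.
interface Selection {
  colors: Record<Layer, string>;
  angle: number;
}

// Assumed bottom-to-top stacking order.
const LAYER_ORDER: Layer[] = ["frame", "earpads", "headpad"];

// Assumed (hypothetical) file layout: <base>/<layer>/<color>_<angle>.webp
function layerUrl(base: string, layer: Layer, color: string, angle: number): string {
  return `${base}/${layer}/${color}_${angle}.webp`;
}

// Returns one image URL per layer, in stacking order, for the given selection.
function urlsForSelection(base: string, sel: Selection): string[] {
  return LAYER_ORDER.map((layer) => layerUrl(base, layer, sel.colors[layer], sel.angle));
}
```

The resulting images could then be stacked on top of each other, for example as absolutely positioned `<img>` elements or drawn in order onto a canvas.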
In total there are 1008 unique images encoded in WebP format. Together they take up less than 50 MB of space. When you open the page for the first time, the initial images are loaded and displayed, and all other images are queued to load into memory for a smooth preview.
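The loading order described above could be sketched roughly like this. It is only an illustration under assumptions: `orderForPreload` and `preloadAll` are hypothetical names, and the actual loading step is passed in as a callback so the ordering logic stays independent of the browser.

```typescript
// Puts the initially visible images first, followed by every other image,
// without listing any URL twice.
function orderForPreload(allUrls: string[], initialUrls: string[]): string[] {
  const rest = allUrls.filter((url) => initialUrls.indexOf(url) === -1);
  return initialUrls.concat(rest);
}

// Walks the queue in order and hands each URL to the supplied loader.
function preloadAll(urls: string[], load: (url: string) => void): void {
  urls.forEach((url) => load(url));
}
```

In a browser, the `load` callback could create an `Image` object and set its `src`, which asks the browser to fetch and cache the file so that later variant switches do not wait on the network.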