
> It's a preprocessor to lossless codecs shaping noise...

So it's lossy compression? Why use FLAC then?



Other lossy compression methods generally happen in the frequency domain and, depending on the type of music, can introduce audible distortions.

This preprocessor, on the other hand, throws away some of the least significant bits to save data. This increases the quantization noise but introduces no other kind of artifact. The quantization noise can be shaped to fall in the higher frequencies (noise shaping), where it is generally not perceptible.
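To make the idea concrete, here is a minimal sketch of dropping low bits with first-order error feedback, which pushes the added noise toward high frequencies. This is not the project's actual algorithm: the function name `drop_lsbs` is made up for illustration, there is no random dither, and clipping at the 16-bit limits is ignored.

```python
def drop_lsbs(samples, bits):
    """Round integer samples to multiples of 2**bits with first-order noise shaping.

    Hypothetical helper for illustration only; clipping is not handled.
    """
    step = 1 << bits
    out = []
    err = 0  # quantization error made on the previous sample
    for s in samples:
        # Error feedback: subtract the previous error before quantizing,
        # so the output noise is e[n] - e[n-1], a high-pass shape.
        target = s - err
        # Round to the nearest multiple of step (>> floors, also for negatives).
        q = ((target + step // 2) >> bits) << bits
        err = q - target
        out.append(q)
    return out


if __name__ == "__main__":
    print(drop_lsbs([100, 101, 102, 103], 2))
```

Every output sample ends in `bits` zero bits. As far as I know, that is what lets a lossless codec like FLAC save space here, since it can signal low bits that are zero across a whole block instead of coding them.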


Doesn't 16 bit audio already require dithering to capture the full audible dynamic range? That makes me worry somewhat about cutting more bits and layering dithers on dithers.


While it's good practice, a 16-bit recording is usually not exactly ruined by not dithering it. 96dB dynamic range is a lot.
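For reference, the 96dB figure is just the ratio between the largest and smallest step of a 16-bit integer, expressed in decibels. A quick sketch (the function name is mine):

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (8, 12, 16, 24):
        print(bits, round(dynamic_range_db(bits), 2))
```

Each bit is worth roughly 6dB, so every least significant bit you throw away raises the noise floor by about that much.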


You always want dithering no matter the bit depth, but 16-bit is actually quite a lot - there's literally no point in better than CD-quality audio. (Except of course that real life is in surround not stereo.)

But yes, if you're doing it repeatedly you'd want an un-dithering filter. Noise reduction tends to do this by accident but it helps if you know what the dither shape was.


Sounds more like trellis quantization in near-lossless H.264, explicitly trading coding cost of details against psychovisual impact of said details.


That's psy-rd (in x264 terms).

Trellis quantization is just a more optimal way to divide numbers - think of it like rounding to nearest instead of down. "optimal" means "optimal rate-distortion tradeoff" and "distortion" means whatever you want it to, but usually it's difference between original and compressed pixels (absolute error/SAD/PSNR).

That can look blurry, because given all alternatives with the same SAD, blurry ones compress more. So psy-rd changes the definition of distortion to add a "has similar amount of noise" factor. That's very far from human optimal (if anything it's SSIM optimal) but it's free detail. Uses the same quantization to get there, though.
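A toy sketch of the rate-distortion choice described above, assuming a made-up bit cost model and squared error as the distortion. Real trellis quantization also accounts for how each choice changes the cost of coding neighboring coefficients, which this per-coefficient version ignores. `rd_quantize` and `toy_bit_cost` are hypothetical names, not x264 functions.

```python
def toy_bit_cost(level):
    # Hypothetical rate model: larger magnitudes cost more bits.
    return 1 + 2 * abs(level).bit_length()


def rd_quantize(coef, step, lam, bit_cost=toy_bit_cost):
    """Pick the quantized level that minimizes distortion + lam * rate."""
    lower = int(coef // step)  # rounding down
    # Candidates: round down, round up, or zero; ties go to the smaller magnitude.
    candidates = sorted({lower, lower + 1, 0}, key=abs)

    def cost(level):
        distortion = (coef - level * step) ** 2  # squared error
        return distortion + lam * bit_cost(level)

    return min(candidates, key=cost)


if __name__ == "__main__":
    for lam in (0.0, 10.0, 100.0):
        print(lam, rd_quantize(7.0, 4.0, lam))
```

With `lam = 0` this is plain round-to-nearest; as `lam` grows, cheaper levels win even though they are further from the original value. psy-rd would correspond to swapping in a different `distortion` term.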


It's just a hobby project, and the structure isn't too different from how other lossy codecs work. (Or lossless ones - you can construct one of those from any lossy one just by sticking the difference from the original on the end.)

Most codecs sacrifice transparency to reach a bitrate and this one does the opposite.
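The parenthetical about building a lossless codec from a lossy one can be sketched like this. The "lossy codec" here is a stand-in that just drops low bits; all function names are made up, and the trick only works if the lossy decoder is bit-exact and deterministic.

```python
def lossy_encode(samples, bits=4):
    # Stand-in lossy codec: discard the low bits of each sample.
    return [s >> bits for s in samples]


def lossy_decode(codes, bits=4):
    return [c << bits for c in codes]


def lossless_encode(samples, bits=4):
    """Lossy stream plus the difference between the original and its decode."""
    codes = lossy_encode(samples, bits)
    approx = lossy_decode(codes, bits)
    residual = [s - a for s, a in zip(samples, approx)]
    return codes, residual


def lossless_decode(codes, residual, bits=4):
    approx = lossy_decode(codes, bits)
    return [a + r for a, r in zip(approx, residual)]


if __name__ == "__main__":
    original = [0, 1, -1, 1000, -32768, 32767]
    codes, residual = lossless_encode(original)
    print(lossless_decode(codes, residual) == original)
```

Whether the result is any good as a lossless codec depends on how cheaply the residual can be coded.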


> Most codecs sacrifice transparency to reach a bitrate and this one does the opposite.

That's what LAME's presets and the Ogg Vorbis quality settings do as well, isn't it?


Yep, but codecs tend to have maximum bitrates, either because of design tradeoffs or to work with hardware decoders. MP3's maximum is too low to be perfectly transparent on some material, like cymbals.



