CD, Good-bye, Good-job
The CD (Compact Disc) has long been the most attractive medium for music in the digital era, but now even the thin, small CD is gradually losing its luster to MP3. However, the sound quality of MP3 is not enough to satisfy enthusiasts. Manufacturers created a standard called SACD, but it failed to gain popularity. Now we have a very high-quality medium called HFPA (High Fidelity Pure Audio) on Blu-ray, but I’m not sure it will ever be as popular as the CD. It’s not a matter of sound quality; it’s a matter of accessibility.
HFPA supports 24-bit/192kHz audio, which is what we’re talking about here. It is also said to include copy-protection technology, but I’m personally skeptical that copy protection will be as useful as intended in audio, where it is so easy to simply re-record the output.

So, What Is the CD Spec?
As we all know, CDs are 16-bit/44.1kHz.
This means that the analog signal is digitized by dividing the x-axis (time) into 44,100 steps per second and the y-axis (amplitude) into 2^16 = 65,536 levels.
To put it simply, a CD-quality digital file is a grid with 44,100 columns per second on the x-axis and 65,536 rows on the y-axis. The wavy pattern (the sound source) is drawn on it as shown in [Figure 2] below, and the grid points nearest to where the pattern passes are stored as data; the sound itself is not stored. (The figure below uses 4-bit quantization.)
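The grid idea can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the 1 kHz sine test tone, the value range of -1.0 to 1.0, and the function names `quantize` and `sample_sine` are all hypothetical and not part of any CD specification.

```python
import math

def quantize(value, bits):
    """Map a value in [-1.0, 1.0] to the nearest of 2**bits integer levels."""
    levels = 2 ** bits
    # Scale to 0 .. levels-1 and snap to the nearest grid row.
    index = round((value + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, index))

def sample_sine(freq_hz, sample_rate, bits, count):
    """Take `count` samples of a sine wave and quantize each one."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate), bits)
        for n in range(count)
    ]

# Hypothetical 1 kHz test tone: CD-style 16-bit grid vs. the coarse 4-bit grid.
cd_samples = sample_sine(1000, 44_100, 16, 8)
coarse_samples = sample_sine(1000, 44_100, 4, 8)
print(cd_samples)
print(coarse_samples)
```

The 4-bit list only ever contains values from 0 to 15, so many different input heights collapse onto the same row, which is the rounding the figure illustrates.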

You can quickly find explanations on the internet saying that 44,100 is simply double the upper limit of human hearing (about 20 kHz, following the Nyquist theorem) plus a little extra, but technology doesn’t work that way. The answer is that 44,100 Hz is a rate that exceeds the Nyquist minimum of roughly 40 kHz needed to capture all of the sound that humans can hear, while still being compatible with the video formats that were used to store digital audio when Sony and Philips defined the CD.
More technically, the product of field rate and effective lines per field is the same for both old analog scanning systems, NTSC (525 lines, 60 Hz) and PAL (625 lines, 50 Hz): 14,700. 44,100 is the first multiple of that common value that exceeds the required rate.
- NTSC effective lines: 490/2 = 245, and 245 * 180 = 44,100
- PAL effective lines: 588/2 = 294, and 294 * 150 = 44,100
(The division by 2 is there because both are interlaced systems.)
So if you sample at this rate, the audio fits on either NTSC or PAL video equipment and you won’t have any sync issues.
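The arithmetic above can be checked with a short Python sketch. The split of the multipliers 180 and 150 into "field rate times 3 samples per line" is my reading of the numbers, and the function name `samples_per_second` is hypothetical.

```python
def samples_per_second(effective_lines, fields_per_second, samples_per_line=3):
    """Audio samples per second that fit on an interlaced video signal."""
    # Interlacing splits the effective lines of a frame across two fields.
    lines_per_field = effective_lines // 2
    return lines_per_field * fields_per_second * samples_per_line

ntsc_rate = samples_per_second(490, 60)  # 245 lines * 60 fields * 3 = 245 * 180
pal_rate = samples_per_second(588, 50)   # 294 lines * 50 fields * 3 = 294 * 150
print(ntsc_rate, pal_rate)
```

Both calls give the same rate, which is the whole point: one sampling rate that lines up with either video system.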

Then, Why Is 192kHz Needed?
But why 192kHz when 44.1kHz, which covers frequencies up to about 22,000 Hz, is more than enough for human hearing?
For one thing, as video became more digitized and more standardized, there was no need to stick to 44.1kHz.
Secondly, there’s the question of whether 44.1kHz is really providing enough sound quality.
With 22.05 kHz of bandwidth we can still understand music or speech, but when we digitize it, as shown in Figure 2 above, the lower the bit depth and sample rate, the farther apart the stored points become, and the computer has to estimate the values in between using interpolation algorithms. Of course, there are many different ways and algorithms to calculate these values. In any case, because the computer is calculating and filling in values that weren’t actually stored, the result is somewhat different from the original sound, and that affects the sound quality. To put it a bit more bluntly, in this sense digital will never be able to match analog.
In addition to this direct waveform aspect, there is a technique called dithering, which deliberately adds a small amount of noise to the signal and is used in A/D and D/A conversion, bit depth conversion, and so on.
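As a rough illustration of "estimating the values in between", here is a Python sketch that uses plain linear interpolation. That choice is an assumption made only to keep the example short; real converters use band-limited reconstruction filters rather than straight lines, so the numbers here overstate the error. The 10 kHz test tone and the function names are also hypothetical.

```python
import math

def linear_interpolate(samples, position):
    """Estimate the value at a fractional sample index by linear interpolation."""
    left = int(math.floor(position))
    right = min(left + 1, len(samples) - 1)
    fraction = position - left
    return samples[left] * (1 - fraction) + samples[right] * fraction

def midpoint_error(freq_hz, sample_rate, count=200):
    """Worst gap between the true sine and the straight-line guess at midpoints."""
    samples = [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]
    worst = 0.0
    for n in range(count - 1):
        true_value = math.sin(2 * math.pi * freq_hz * (n + 0.5) / sample_rate)
        guess = linear_interpolate(samples, n + 0.5)
        worst = max(worst, abs(true_value - guess))
    return worst

error_44k = midpoint_error(10_000, 44_100)
error_192k = midpoint_error(10_000, 192_000)
print(error_44k, error_192k)
```

With points closer together at 192 kHz, the straight-line guess lands much nearer the true curve than it does at 44.1 kHz.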
The inserted noise (the dither signal) is masked by human hearing characteristics (we can hear sounds around 4 kHz best) or by masking effects, but what’s clear is that the overall noise level is increased. Of course, few people will be able to pick up on this noise.
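The rise in overall noise can be seen in a small Python sketch. It assumes TPDF dither (the sum of two uniform random values, spanning plus or minus one quantization step), which is one common choice and not something this post specifies; the random test signal and the function names are hypothetical too.

```python
import random

def quantize_to_step(value, step):
    """Round a value to the nearest multiple of the quantization step."""
    return round(value / step) * step

def rms_error(values, step, use_dither, rng):
    """RMS difference between the original values and their quantized versions."""
    total = 0.0
    for v in values:
        noise = 0.0
        if use_dither:
            # Assumed TPDF dither: two uniform values added together.
            noise = (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) * step
        error = quantize_to_step(v + noise, step) - v
        total += error * error
    return (total / len(values)) ** 0.5

rng = random.Random(0)        # fixed seed so the run is repeatable
step = 2.0 / 2 ** 16          # one 16-bit step over a -1.0 .. 1.0 range
signal = [rng.uniform(-0.5, 0.5) for _ in range(20_000)]

plain_rms = rms_error(signal, step, False, rng)
dithered_rms = rms_error(signal, step, True, rng)
print(plain_rms / step, dithered_rms / step)
```

The dithered run shows a larger RMS error than the plain one, which matches the point above: the noise floor goes up, even if the error becomes less tied to the signal.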
Anyway, as a sound engineer, you want to record the highest quality sound possible, so the most basic way to counteract quantization errors, dithering noise, and the like is to maximize the bit depth and the sample rate. That is what HFPA supports: the 24-bit/192kHz technology in the title.
24-bit/192kHz means that one second is broken down into 192,000 steps on the x-axis and the amplitude into 2^24 = 16,777,216 levels on the y-axis, which means that there is less error in the quantization step and less noise in the dithering process.
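To put numbers on "less error", the following Python sketch compares the level counts and the textbook signal-to-quantization-noise ratio for a full-scale sine wave, roughly 6.02 dB per bit plus 1.76 dB. That formula is an idealized figure I am bringing in for illustration, not a measurement of any real HFPA or CD player, and the function names are hypothetical.

```python
import math

def quantization_levels(bits):
    """Number of amplitude levels available at a given bit depth."""
    return 2 ** bits

def theoretical_snr_db(bits):
    """Ideal signal-to-quantization-noise ratio for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (16, 24):
    print(bits, quantization_levels(bits), round(theoretical_snr_db(bits), 2))
```

Going from 16 to 24 bits multiplies the number of levels by 256, so each step on the y-axis becomes 256 times finer.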
But here’s the irony.
If you search the internet for 24-bit/192kHz, you’ll find a lot of “do we really need 24-bit/192kHz” articles. In fact, a lot of the 24-bit music that’s readily available on the internet is ripped from CDs or re-recorded from vinyl, and converting it to 24-bit/192kHz is like putting a pumpkin in a watermelon basket and calling it a watermelon, without even drawing stripes on it.
A common mistake some hobbyists make is thinking that CDs contain sound, when in fact they contain only numerical data. When you listen to one, a chip called a DAC (Digital-to-Analog Converter) in the CD player turns that data into an analog signal that we can hear. This is why the same CD can sound different on different CD players.
LPs are similar. Even though the format is analog, the LP experience has a lot of sonic coloring, starting with the cartridge that turns the needle’s movement into an electrical signal, then the amplifier, then the speakers, and so on. There’s a lot of investment in that chain to make it sound better to the ear, and it does sound very good.
But if you record that at 24-bit/192kHz, it’s not going to sound better; it’s going to sound like that particular person’s LP playback.
To get the full effect, you’d have to listen to the original 24-bit/192kHz studio recording. In the past, computers weren’t powerful enough to play it back (similar to how UHD 4K video can’t be decoded in real time on general-purpose devices), but now it’s possible to listen to it on general-purpose devices through a lossless compression codec called FLAC. Heck, even smartphones are starting to support it. To be honest, the converter (DAC) plays a very important role when it comes to listening to digital files. However, since the converter only comes into play after decoding, I think it is very meaningful that real-time 24-bit/192kHz decoding is possible even on smartphones, leaving aside the performance of the converter.
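For a sense of how much more data a player has to handle, here is a small Python sketch of the raw (uncompressed) PCM data rate. Two-channel stereo is my assumption, the function name is hypothetical, and the size of an actual FLAC file is not computed here because it depends on the music.

```python
def pcm_bits_per_second(sample_rate, bits, channels=2):
    """Raw PCM data rate; stereo (2 channels) is assumed by default."""
    return sample_rate * bits * channels

cd_rate = pcm_bits_per_second(44_100, 16)
hires_rate = pcm_bits_per_second(192_000, 24)
print(cd_rate, hires_rate, round(hires_rate / cd_rate, 2))
```

Under these assumptions the 24-bit/192kHz stream carries about six and a half times as much raw data per second as CD audio.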
The reason I question the success of HFPA, as mentioned above, is that 24-bit/192kHz can now be decoded almost anywhere through the open FLAC format. Whatever market there is for high-quality audio will likely run through FLAC, and I doubt that general consumers will buy HFPA media.
Thank you.