Amadeus32 is a good place to begin. Martin Hairer's program "Amadeus II", which was used to generate all the sonograms you'll see (making possible the graphs below them), can also export mp3. It uses a version of the LAME encoder as a shared library. LAME was never intended for low bit rate applications, so the data we have here is full of distortions and sonic problems that we'll see again and again in later, diminished forms as the bit rates increase.

Essentially, the white bands at the top are where the encoder threw away band-limited 'noise'- they exactly correspond to time periods where the test sound overlays a 1K tone (seen as the black bands in the middle of the sonogram frequency range) with high noise and a low or subsonic rumble. The intention with those parts of the test sound was to torture test the encoders: make them choose which frequencies to reproduce, and force them to throw away something. At a bit rate of 32K, they all throw away the extreme highs (the first band is around 12-22 kilohertz, the second is around 16-22 kilohertz). The vertical bars to the right are the encoder failing to cleanly handle a series of frequency wavelets. All the way to the left, the dark diagonal line that abruptly turns white is a pure frequency sweep- the encoder just stops reproducing the tone at a certain point. Pay attention to the area between the white bars and the dark 1 kilohertz bar- because the next sonogram, Amadeus64, is strikingly different there...

Amadeus64 (again, LAME encoders are not designed to encode at these bit rates at all!) is strikingly different in this area! Particularly in the first noise burst (the broader one that extends down farther), the area below, a critical area for ear fatigue and by far the most unpleasant area to have distortion in, is so seriously distorted that it looks _furry_, positively saturated with inaccuracy!
LAME really, truly, should not be used for these bit rates. If you look at the first noise band you will see that Amadeus's LAME export is attempting, at 64K, to include some >16K information- the band is in two shades of white and it somehow manages to be marginally more accurate _above_ 16K. You will also see in the initial tone sweep that LAME quits reproducing the tone and then, surprisingly, starts up again. All in all a messy performance, but I can't emphasise enough that LAME is NOT INTENDED for bit rates below 128K (it is actually meant for bit rates well above 128K). Thus, it's a safe way to get into acute criticism of the encoders, as the major problems and errors can be introduced without offending the programmers too much (except for the Fraunhofer guys- they will be offended in the final, high bit rate section ;) ). Look closely at the area just to the right of the dark, horizontal 1 kilohertz bars. You'll see a small white area- this is over-ring. The concept of over-ring will become extremely important with...

BladeEnc. Like Amadeus, Blade isn't really geared for the extreme low bit rates- however, its approach to the problem is markedly different. Compare it with Amadeus's LAME encoder, above- the difference is really striking. Basically, BladeEnc (at all bit rates) is markedly more willing to incur pre-echo and over-ring in the cause of smoother, less colored frequency response. At 32K, Blade hurts its charts by totally failing to do the tone sweep (you're seeing it give up at 1 kilohertz, shockingly low), and the center area, which contains the strongest transient attacks a sound can have, is also severely compromised- complete blackness is the ideal and a good performance will show first one, then two vertical bars, but Blade produces just a big blur of distortion. This is hell for Blade- its worst area, at a torturous 32K bit rate. Yet- look at the noise bands.
This is far from a pristine showing (nothing is pristine at 32K, period), but see how that 'shrill' 3-8K area, where LAME failed so badly, is relatively smooth and clean? The soft vertical bands are in part being produced by low frequency artifacts- later you'll see versions of LAME adding a single big low frequency artifact (arguably inaudible), and see that it produces smooth vertical lines on the sonogram. Also note that, while Blade is not attempting to deal with >16K noise, in the leftmost noise band it is making some cautious attempts to encode >12K information, and succeeding enough to fade the white 'noise band' slightly at the bottom- this, at 32K! The essence of the encoder's sound is being shown- at the far right, it does absolutely miserably at the tone wavelets, but the amount of sonic degradation in the frequency domain is cunningly minimised.

Blade64 is the same puzzle, magnified. You can see that it's dealing with the initial tone sweep right up to 16K- but the amount of visual noise immediately around the sweep shows that Blade is having a rough time keeping the tone pure and uncolored- that's what extensive artifacts look like on a sonogram. Across the first noise burst, Blade astonishingly handles the 12-16K area almost more cleanly than the 3-8K area- it tries so hard to encode the high frequency noise it finds there that it entirely screws up the pure 1K tone it's also being asked to reproduce, and the center of it is heavily washed out. The over-ring from the 1K tone is substantial- and above it, you can see that there is massive over-ring from the same 12-16K area that it was trying so hard to reproduce. Entering the second noise band, you can see massive pre-echo around the 1K tone. Finally and startlingly, when given a 1K tone, 16-22K noise and some subsonics in the second noise band, Blade suddenly decides to make a good hard effort at encoding the >16K content- and succeeds to a surprising extent!
A strange and interesting approach to low bit rate encoding- grabbing bits of high fidelity mixed with dreadful distortion. Blade will continue to be very much its own encoder, right through the remainder of the tests.

CC32 is CokaCoda, the Japanese LAME-based encoder, at 32K bit rate. Here you see it taking its own path through the torturous challenges of the test sound, aptly named 'EncoderHell'. Unlike the Amadeus version of LAME, CokaCoda is much clearer about its desire to throw away the frequency extremes in the interest of midrange purity, and it shows- the areas that were so dense with error and distortion are comparatively smooth and dim with CokaCoda, and by contrast the noise bands are completely discarded. CokaCoda's LAME version has no trouble with the 1K tone, but check out the huge vertical stripe in the middle! You'll be seeing that a lot from LAME, right through the 128K bit rates. What's happening here is that the waveform being reproduced is basically a chunk of a square wave- sustaining a peak voltage for impossibly long. The frequency content of this wave, naturally, goes well below 20 hertz- many encoders will accept that, but CokaCoda is refusing to deal with those components- and as a result, is introducing a seriously large low frequency artifact into the sound. You can see it on the actual waveform- it's a downwards-pointing lump. No other encoder seems to find this necessary- it is a quirk of the CokaCoda implementation of LAME, and unless you have extremely impressive subwoofers this won't be heard. If you do have extremely impressive subwoofers, you'll be feeling/hearing some unusual ultra-low artifacts that have nothing to do with what the true low frequency content is. To put this in perspective, I'd note that CDs absolutely cannot contain sonic information above 22K under any circumstances- but they are quite capable of containing any sort of bass, no matter how deep, and many do contain some sort of bass that pushes the audibility limit.
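To make the point about the sustained peak concrete, here is a minimal Python sketch (nothing taken from the encoders themselves). It measures the spectrum of a flat-topped chunk held for a tenth of a second, and checks how much stronger its content is at 5 hertz than at 45 hertz. The sample rate, plateau length and function names are made up for illustration; the 44100 figure is the standard CD sample rate, which is where the roughly 22K ceiling comes from.

```python
import cmath
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency a sampled signal can represent: half the sample rate."""
    return sample_rate_hz / 2.0


def dft_magnitude(samples, freq_hz, sample_rate_hz):
    """Magnitude of the signal's content at one frequency (direct DFT sum)."""
    acc = 0j
    for n, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
    return abs(acc)


# Hypothetical illustration values: a 1 second buffer at 1000 samples/second,
# holding a peak level for the first 100 samples (0.1 s), then silence.
SAMPLE_RATE = 1000
PLATEAU_SAMPLES = 100
plateau = [1.0] * PLATEAU_SAMPLES + [0.0] * (SAMPLE_RATE - PLATEAU_SAMPLES)

# Compare the content at DC, at a subsonic frequency, and at an audible bass one.
for freq in (0, 5, 45):
    print(freq, "Hz:", round(dft_magnitude(plateau, freq, SAMPLE_RATE), 2))

# Upper limit for CD audio at the standard 44100 samples/second.
print("CD ceiling:", nyquist_hz(44100), "Hz")
```

The 5 hertz component comes out several times stronger than the 45 hertz one, which is the sense in which a long held peak "goes well below 20 hertz". An encoder that refuses to carry those components has to do something else with the plateau.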
(obSelfPlug- airwindows CDs have frequency content as low as 9 hz, charted with Amadeus sonograms. Good mixing and mastering setups can produce this content as a byproduct of cleanly mixing other low frequency content. End of selfplug, you can start reading again now)

CC64 is an even purer representation of what CokaCoda LAME can produce- again, it neatly ignores everything over 16K and for that matter, 12K- again, it refuses to countenance a squarewave cycle that extends for too long- and again, what it does reproduce it reproduces pretty cleanly for 64K bit rate. Not much difference here- so it may come as a surprise to move on to...

SWA32 (a Fraunhofer encoder, used via the hack that allows use of the SWA Export Xtra) and find many aspects even cleaner than the very well behaved CokaCoda at 64K! There is no question that Fraunhofer wins by a mile for low bit rate encoding- the story is told not only by the sonogram, but most notably by the underlying charts derived from the sonogram. The chart on the left, deviation from frequency response accuracy, is shockingly low for 32K bit rate. The chart on the right tells an even subtler tale to those who can read it. It represents pre-echo and overhang, and you will see a very small peak that equates to around 1 kilohertz- and to the right of that, heading into the area where over-ring is most sonically annoying and obtrusive, you will see a small dip indicating that Fraunhofer is actually choosing to spend those bits and bytes to mute that particular sonic area. Compare with Amadeus LAME at 64K, which has extensive ringing in just this problem area! There's not a lot of room for debate. The frequency response deviations even outdo Blade at 64K- and Blade is the toughest competitor for that, if you're willing to overlook pre-echo and over-ring. Fraunhofer mp3 at 32K is truly an impressive engineering feat.

SWA64 is a puzzle- see the broad, smooth grey areas?
I have reproduced this result repeatedly, as I thought I'd made some sort of mistake generating the sonogram. It looks completely unlike any other sonogram and I'm at a loss to explain why. If we believe it, it says that Fraunhofer at 64K is slightly more distorted in the frequency domain and slightly cleaner in pre-echo and over-ring, but without the special favoring of the 'shrill' band from 3K to 8K. The unusual appearance may be a result of some odd handling of the bass frequencies, which can cause similar artifacts in the sonogram, but again, no other sonogram is anything like this. I can't be certain it has any validity as a result- the inexplicable evenness of the coloration is beyond my understanding. The mp3 file's available- if anyone wants to de-mystify this they're welcome to try analysing it themselves.

There are many more sonograms to investigate, so we can leave this one for now and go on to 128K-land, where many indie musicians reside.
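For anyone who does want to poke at the files themselves, a sonogram is at heart just a stack of short-window spectra. The following is a rough Python sketch of that idea, not a description of how Amadeus II builds its plots. It assumes the mp3 has already been decoded to a plain list of sample values by some other tool, and the frame size, hop, Hann window and function names are all my own illustrative choices.

```python
import cmath
import math


def hann(n_points):
    """Periodic Hann window, used to soften the edges of each frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]


def sonogram(samples, frame_size=64, hop=32):
    """Return one row of spectral magnitudes per overlapping frame.

    Each row covers bins 0..frame_size/2, from 0 Hz up to half the sample rate.
    Uses a direct DFT, so it is slow and only suitable for short clips.
    """
    window = hann(frame_size)
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            row.append(abs(acc))
        rows.append(row)
    return rows


def bin_to_hz(k, sample_rate, frame_size):
    """Centre frequency of bin k."""
    return k * sample_rate / frame_size


# Stand-in for decoded audio: a pure 1 kilohertz tone at an assumed 8000 samples/second.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)]

rows = sonogram(tone)
peak_bin = max(range(len(rows[0])), key=lambda k: rows[0][k])
print("frames:", len(rows))
print("strongest bin in first frame:", bin_to_hz(peak_bin, SAMPLE_RATE, 64), "Hz")
```

Plotting each row as a column of grey levels, with time running left to right, gives the kind of picture discussed on this page. Comparing the rows for an encoded file against the rows for the original test sound is one plausible way to look for the odd evenness seen in SWA64.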
page created Mon, Oct 23, 2000
last modified Fri, Oct 27, 2000

Analysis of encoders using sonogram plotting

Amadeus32.mp3  17 K  MPEG Audio
Amadeus64.mp3  34 K  MPEG Audio
Blade32.mp3  17 K  MPEG Audio
Blade64.mp3  34 K  MPEG Audio
CC32.mp3  17 K  MPEG Audio
CC64.mp3  34 K  MPEG Audio
SWA32.mp3  17 K  MPEG Audio
SWA64.mp3  35 K  MPEG Audio
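The file sizes listed above line up with the bit rates: a constant bit rate file is roughly bit rate times duration. Here is a small Python sketch of that arithmetic, assuming "K" in the listing means 1024 bytes and "32K" means 32,000 bits per second, and ignoring headers and frame padding. The function name is hypothetical, and the length of the test sound is inferred here rather than stated anywhere on this page.

```python
def implied_duration_seconds(size_kib, bitrate_kbps):
    """Estimate clip length from file size and constant bit rate.

    Assumes size_kib is in units of 1024 bytes and bitrate_kbps in
    units of 1000 bits per second; container overhead is ignored.
    """
    size_bits = size_kib * 1024 * 8
    return size_bits / (bitrate_kbps * 1000)


# Sizes taken from the file listing; both bit rates should imply the same clip length.
print("32K file:", implied_duration_seconds(17, 32), "seconds")
print("64K file:", implied_duration_seconds(34, 64), "seconds")
```

Both work out to the same few seconds, which is what you would expect from one short test sound encoded at two bit rates.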