Wave RIFF format Header and Data description by Nathan Davidson
---------------------------------------------------------------

	I've had a lot of people asking how to load in a .WAV file
and/or what the format of a .WAV file looks like. So here's my attempt
at explaining exactly what you've got in a wave file.  

	The wave file is Microsoft's application of the RIFF (Resource
Interchange File Format) standard.  The original definitions for these
files were written in the C language, but they can be easily ported to
other languages (like Pascal).  RIFF files are separated into blocks
of data called chunks. The chunks specifically look like this:

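#include <stdint.h>

typedef uint32_t DWORD;      //32 bits (unsigned long can be 64 bits on newer systems)
typedef unsigned char BYTE;  //8 bits
typedef DWORD FOURCC;        //four character code

typedef struct {
	FOURCC ckID;         //chunk identifier
	DWORD ckSize;        //number of bytes in ckData
	BYTE ckData[];       //ckSize bytes of data (C99 flexible array member)
} CK;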

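FOURCC stands for four character code, which means you'll have
four identifying characters. The ckSize specifies how long the chunk's
data is; it does not count the 4 bytes of the FOURCC ckID or the 4 bytes
of ckSize itself. Then you have the actual data in the chunk. If ckSize
is odd, one padding byte follows the data so that the next chunk starts
on an even offset.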
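
Here's a minimal sketch of pulling a chunk header out of a buffer of
bytes. It assumes ckSize counts only the ckData bytes and is stored
little endian (LSB first), as the RIFF spec says. The ChunkHeader
struct and parse_chunk_header() are names I made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of a chunk header. */
typedef struct {
    char     id[5];  /* the FOURCC plus a terminating NUL for easy printing */
    uint32_t size;   /* ckSize: number of bytes in ckData */
} ChunkHeader;

/* Parse the 8-byte chunk header at buf.
   Returns 0 on success, -1 if fewer than 8 bytes are available. */
static int parse_chunk_header(const unsigned char *buf, size_t len,
                              ChunkHeader *out)
{
    if (len < 8)
        return -1;
    memcpy(out->id, buf, 4);
    out->id[4] = '\0';
    /* Assemble the 32 bit size byte by byte, least significant first. */
    out->size = (uint32_t)buf[4]
              | ((uint32_t)buf[5] << 8)
              | ((uint32_t)buf[6] << 16)
              | ((uint32_t)buf[7] << 24);
    return 0;
}
```

Building the size from single bytes means the same code works no matter
how the host machine orders its bytes.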

I. The First Chunk
	The first chunk in a wave file starts with 4 bytes: the FOURCC
ckID of 'RIFF'. Always check that the first four characters in a wave
file are 'RIFF'; if the first four bytes in a file aren't equal to this
then you don't have a wave file. Then we have the file size (a 32 bit
value), which is the size of the entire file minus 8 bytes: it doesn't
count the four bytes we used for the 'RIFF' identifier or the four
bytes of the size value itself.

II. The 'WAVE' RIFF id
	The following four bytes should say 'WAVE', which specifies
what type of RIFF file this is. This is another good thing to check for.
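
The two checks above can be sketched like this, assuming you've already
read at least the first 12 bytes of the file into memory. The function
name looks_like_wave() is just something I picked for the example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first 12 bytes look like a RIFF/WAVE header, else 0.
   Bytes 0-3 must be 'RIFF', bytes 4-7 are the size (not checked here),
   and bytes 8-11 must be 'WAVE'. */
static int looks_like_wave(const unsigned char *buf, size_t len)
{
    if (len < 12)
        return 0;
    return memcmp(buf, "RIFF", 4) == 0 &&
           memcmp(buf + 8, "WAVE", 4) == 0;
}
```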

III. The Format Chunk
	Now we start the format chunk. Since we're starting
a new chunk we have another 4 byte FOURCC ckID: 'fmt '
(notice the intentional space at the end of 'fmt ' and the lower case!!),
which lets us know the format chunk is coming up. The format chunk holds
info about the wave file, like how many channels it's using, whether
it's 16 or 8 bit, the sample rate, etc. Immediately after 'fmt '
is a 32 bit number that is our format chunk length.  For a PCM wave
file we use 16 bytes to describe the format, so the format chunk length
should be 16 (other formats can use a longer format chunk).
IIIb. Format Chunk 
 	First up in the data of the format chunk is the format tag, which
is 2 bytes long and tells us what type of format the data is in. The
most common format is PCM (Pulse Code Modulation) and the value that
represents it is 1.  Next we have 2 more bytes specifying the number
of channels: 1 represents mono and 2 represents stereo. Following this
is a 32 bit number that holds the sample rate, such as 44100, 22050,
11025, or 8000.
	Next is another 32 bit number that holds the average bytes
per second. Most people don't really need this number and just discard
it, but it is nice to know if you're getting nitty gritty. To find the
average bytes per second you use this formula:
SampleRate*Channels*(Bits/8)
So if we had a 16bit stereo 44.1kHz wave file the avg. bytes per second
would be: 44100(samplerate)*2(stereo)*(16(Bits)/8)=176400
This lets you know that your computer is gonna be crunching on 176,400
bytes every second it's playing a CD quality wave file!!
	Next up in our wave file we have a 2 byte number used for block
alignment. This is another number that's usually discarded (it gives
the number of bytes used for one sample across all channels), but if
needed you use this formula to calculate it: (Bits/8)*Channels
So for an 8bit mono wave you'd get 1 (meaning 1 byte for each sample),
and for a 16bit stereo wave you'd get 4 (meaning 4 bytes for each
sample, 2 per channel).
	Ok, next up is another 2 byte number that stores the number of bits
used for each sample; this is usually 8 or 16.  That finishes up the
16 bytes used in the format chunk data.
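
To tie the six fields together, here's a rough sketch that decodes the
16 bytes of format data and checks the two calculated fields against
the formulas above. It assumes plain PCM and little endian storage.
The WaveFormat struct and the function names are my own inventions
for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct holding the six fields of a 16-byte format chunk. */
typedef struct {
    uint16_t format_tag;        /* 1 = PCM */
    uint16_t channels;          /* 1 = mono, 2 = stereo */
    uint32_t sample_rate;       /* e.g. 44100 */
    uint32_t avg_bytes_per_sec; /* SampleRate*Channels*(Bits/8) */
    uint16_t block_align;       /* (Bits/8)*Channels */
    uint16_t bits_per_sample;   /* usually 8 or 16 */
} WaveFormat;

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the format data. Returns 0 on success, -1 if it's too short. */
static int parse_fmt(const unsigned char *body, size_t len, WaveFormat *f)
{
    if (len < 16)
        return -1;
    f->format_tag        = rd16(body + 0);
    f->channels          = rd16(body + 2);
    f->sample_rate       = rd32(body + 4);
    f->avg_bytes_per_sec = rd32(body + 8);
    f->block_align       = rd16(body + 12);
    f->bits_per_sample   = rd16(body + 14);
    return 0;
}

/* Returns 1 when both calculated fields match the formulas (PCM only). */
static int fmt_is_consistent(const WaveFormat *f)
{
    uint32_t bytes_per_sample = f->bits_per_sample / 8u;
    return f->block_align == bytes_per_sample * f->channels &&
           f->avg_bytes_per_sec ==
               f->sample_rate * f->channels * bytes_per_sample;
}
```

Reading each field from its byte offset, instead of reading the whole
struct in one go, avoids trouble with struct padding and byte order.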

IV. Data Chunk
	Now we get to the actual data chunk.  To start off the data chunk
we, of course, have 4 bytes (FOURCC ckID) that store the word 'data'. 
Immediately after that is another 32bit number (ckSize) that says how 
many bytes the sample data is. 
IVb. Actual Raw Sample Data
	Now we've finally come to the actual data that makes up the wave
file. The way the data is packed in PCM is determined by the Bits (8 or
16) and the Channels (Mono or Stereo).  An 8 bit mono file is simple
and looks like:
 byte		,byte		,byte		,byte
|__1sample______|__1sample______|__1sample______|__

With a 16 bit stereo file you'll have data as follows:
[channel 0 (2bytes)],[channel 1(2bytes)],[channel 0 (2bytes)],[chan..etc.]
|______1 sample________________________|________1 sample_________________|

The data alternates between channel zero and channel one, using 16 bits
for each channel's value.
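
If you want each channel in its own array, a sketch like the following
splits the interleaved data apart. It assumes 16 bit stereo PCM stored
little endian; the names le_s16() and deinterleave_stereo16() are made
up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one signed 16 bit value stored LSB first. */
static int16_t le_s16(const unsigned char *p)
{
    int v = p[0] | (p[1] << 8);                   /* 0..65535 */
    return (int16_t)(v >= 32768 ? v - 65536 : v); /* map to -32768..32767 */
}

/* Split `frames` stereo samples (4 bytes each) into two arrays.
   ch0 and ch1 must each have room for `frames` values. */
static void deinterleave_stereo16(const unsigned char *data, size_t frames,
                                  int16_t *ch0, int16_t *ch1)
{
    for (size_t i = 0; i < frames; i++) {
        ch0[i] = le_s16(data + i * 4);      /* first 2 bytes: channel 0 */
        ch1[i] = le_s16(data + i * 4 + 2);  /* next 2 bytes: channel 1 */
    }
}
```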

8 bit samples are unsigned and have values between 0 and 255,
meaning the midpoint of an 8bit wave (volume at 0) is 128.
16 bit samples are signed and have values between -32768 and 32767,
and the midpoint is at 0.
That's why 16 bit sound is sooo much better than 8 bit: you get
65536 volume levels per sample with a 16 bit sound and
a measly 256 with 8 bit.
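
Since the two sample sizes use different midpoints, converting 8 bit
data to 16 bit takes a shift of the midpoint as well as a scale.
Here's one simple way to do it (a sketch, not the only way; the
function names are my own):

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one unsigned 8 bit sample (midpoint 128) to a signed
   16 bit sample (midpoint 0): subtract the midpoint, then scale by 256. */
static int16_t u8_to_s16(uint8_t s)
{
    return (int16_t)(((int)s - 128) * 256);
}

/* Convert n samples; out must have room for n values. */
static void convert_u8_to_s16(const uint8_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = u8_to_s16(in[i]);
}
```

With this scaling, 0 maps to -32768, 128 maps to 0, and 255 maps to
32512, so the loudest positive value falls a little short of 32767.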

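The raw data runs for exactly the number of bytes given in the data
chunk's ckSize. In simple files that takes you to the end of the file,
however you should know that some programs like to attach extra chunks
with notes or comments (before or after the data chunk). These can be
ignored: read the unknown chunk's ckSize and skip that many bytes.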
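
Because extra chunks can show up in a file, a safer loader walks the
chunks one by one instead of assuming fixed offsets. Here's a sketch
that searches a whole file held in memory. It assumes little endian
sizes and the RIFF rule that an odd-sized chunk is followed by one
padding byte; find_chunk() is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search the chunks that follow the 12-byte 'RIFF'/size/'WAVE' header.
   On success returns the offset of the chunk's data and stores its
   ckSize in *size_out. Returns 0 if the chunk isn't found or if the
   file is cut short. */
static size_t find_chunk(const unsigned char *buf, size_t len,
                         const char *fourcc, uint32_t *size_out)
{
    size_t pos = 12;
    while (pos + 8 <= len) {
        uint32_t size = (uint32_t)buf[pos + 4]
                      | ((uint32_t)buf[pos + 5] << 8)
                      | ((uint32_t)buf[pos + 6] << 16)
                      | ((uint32_t)buf[pos + 7] << 24);
        if (size > len - pos - 8)
            return 0;                        /* chunk runs past the end */
        if (memcmp(buf + pos, fourcc, 4) == 0) {
            *size_out = size;
            return pos + 8;
        }
        /* Skip header, data, and the pad byte if the size is odd. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return 0;
}
```

Calling find_chunk(buf, len, "data", &size) then gives you the start
of the raw sample data and its length, even if a comment chunk sits
in front of it.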

Now let me try and represent all this with some pretty ASCII graphics.

|<-----------------------------32 bits---------------------------->|
|<--------------16 bits----------->|<-------------16 bits--------->|
|<----8 bits---->|<-----8 bits---->|<----8 bits--->|<----8 bits--->|

File starts:
--------------------------------------------------------------------
|      'R'       |       'I'       |       'F'     |     'F'       |
--------------------------------------------------------------------
|                          RIFF Chunk Length                       |
--------------------------------------------------------------------
|      'W'       |       'A'       |       'V'     |     'E'       |
--------------------------------------------------------------------
|      'f'       |       'm'       |       't'     |     ' '       |
--------------------------------------------------------------------
|                         FORMAT Chunk Length (16)                 |
--------------------------------------------------------------------
|        Format Tag (1=PCM)        |  Channels (1=Mono 2=Stereo)   |
--------------------------------------------------------------------
|                  Sample Rate (44100,22050,11025,or 8000)         |
--------------------------------------------------------------------
|    Average # of Bytes P/Second (Sample rate*Channels*(Bits/8))   |
--------------------------------------------------------------------
| Block Align ((Bits/8)*Channels)  |   Bits per Sample (8 or 16)   |
--------------------------------------------------------------------
|     'd'        |       'a'       |      't'      |     'a'       |
--------------------------------------------------------------------
|                Data Length (actual length of raw data)           |
--------------------------------------------------------------------
|                                                                  |
|                                                                  |
|                                                                  |
|                                                                  |
|                               raw data                           |
|                                                                  |
|                                                                  |
----------------------------------EOF-------------------------------


	Well, there you have it. If you're programming this on a big
endian system (older Macs and some Unix machines, for example) then you
need to do some reading about little and big endian.  Wave files store
every 16 and 32 bit number little endian (LSB first), which is also how
a PC stores them. Big endian computers store them the other way around,
so the way you pull up data if it's 16 bits is backwards (MSB and LSB
are switched) and you have to swap the bytes.  Unfortunately this is a
pain to explain and I'm already tired.  That's a topic for another day =)
(look for my .WAV C source code and other sound/game programming stuff
at my web page)
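
In the meantime, here's a small sketch of the byte swapping idea. It
assumes wave files store their numbers little endian (LSB first) and
swaps only when the host machine orders bytes the other way. All three
function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if this machine stores the low byte of a number first. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Exchange the MSB and LSB of a 16 bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Load a 16 bit number exactly as it sits in the file, then swap it
   only if the host is big endian. */
static uint16_t load_le16(const unsigned char *p)
{
    uint16_t raw;
    memcpy(&raw, p, sizeof raw);  /* the same bytes fread() would store */
    return host_is_little_endian() ? raw : swap16(raw);
}
```

For signed 16 bit samples you'd still need to reinterpret the result
as a signed value afterwards.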

questions/comments/complaints/donations send to:
npawn@geocities.com (try this address first)
or
npawn@juno.com

and my web page is currently at:
http://www.geocities.com/SiliconValley/Pines/4223/

-------------------------------------------------------------------------
Copyright 1996,1997 Nathan Davidson
Feel free to distribute this file wherever you want - as long as it's for
non-profit and the contents of this file aren't changed.
-------------------------------------------------------------------------