WAV Audio Format Explained

2017-05-18 21:26:04 Audio

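If you fully understood what I covered in my previous article, you are probably wondering: can I generate an audio file myself? In fact, once you understand the principles, it all turns out to be quite simple.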

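WAV is about the simplest audio format there is: in its usual PCM form nothing is compressed, and every sample is simply written to the file unchanged. The goal of this article is therefore to show you how to generate a WAV audio file.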

WAV File Header

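WAV follows RIFF (Resource Interchange File Format), so a WAV file is really a three-level structure (the RIFF chunk, the fmt sub-chunk, and the data sub-chunk). Simplified a little, its file header is laid out as in the table below. All multi-byte integers are stored little-endian.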

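Address   Size  Type            Content
00H-03H   4     char*4          RIFF marker: "RIFF"
04H-07H   4     unsigned int    Number of bytes from the next address to the end of the file
08H-0BH   4     char*4          WAV file marker: "WAVE"
0CH-0FH   4     char*4          Format chunk marker: "fmt " (the last byte is a 0x20 space)
10H-13H   4     unsigned int    Size of the sub-chunk header; for this WAV sub-chunk the value is 0x10
14H-15H   2     unsigned short  Format type; 1 means the data is linear PCM
16H-17H   2     unsigned short  Number of channels
18H-1BH   4     unsigned int    Sample rate (Hz)
1CH-1FH   4     unsigned int    Bytes per second = sample rate * PCM bit depth / 8 * number of channels
20H-21H   2     unsigned short  Data block unit length (bytes) = number of channels * PCM bit depth / 8
22H-23H   2     unsigned short  PCM bit depth
24H-27H   4     char*4          Data marker: "data"
28H-2BH   4     unsigned int    Total length of the data section (bytes)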
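
To make the derived fields concrete, here is a small sketch that evaluates the formulas from the table. The function names are made up for illustration, and the constant 44 assumes exactly the simplified header shown above.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second = sample rate * bit depth / 8 * channels (offset 1CH).
std::uint32_t byteRate(std::uint32_t sampleRate, std::uint16_t bitDepth, std::uint16_t channels)
{
    assert(bitDepth % 8 == 0); // this sketch only handles whole-byte samples
    return sampleRate * bitDepth / 8 * channels;
}

// Data block unit length = channels * bit depth / 8 (offset 20H).
std::uint16_t blockAlign(std::uint16_t channels, std::uint16_t bitDepth)
{
    assert(bitDepth % 8 == 0);
    return static_cast<std::uint16_t>(channels * bitDepth / 8);
}

// Length in bytes of the data section for a number of sample points (offset 28H).
std::uint32_t dataLength(std::uint32_t samplePoints, std::uint16_t channels, std::uint16_t bitDepth)
{
    return samplePoints * blockAlign(channels, bitDepth);
}

// Value stored at offset 04H: 44-byte header plus data, minus the first 8 bytes.
std::uint32_t riffLength(std::uint32_t dataBytes)
{
    return dataBytes + 44 - 8;
}
```

For one second of 16-bit stereo audio at 48000 Hz, byteRate(48000, 16, 2) gives 192000, blockAlign(2, 16) gives 4, and riffLength(192000) gives 192036.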

WAV Data Layout

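The data section of the WAV file follows immediately after the header. For a stereo file the values are interleaved: the left-channel value of the first sample point, the right-channel value of the first sample point, ..., the left-channel value of the last sample point, the right-channel value of the last sample point.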

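Each value occupies exactly as many bits as the bit depth. With 16-bit PCM, each value is a signed little-endian integer; 8-bit PCM values are unsigned.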
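
The interleaving described above can be sketched as follows. This is only an illustration for 16-bit stereo; the function name interleaveStereo is an assumption and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave two equally long channels as L0, R0, L1, R1, ...
std::vector<std::int16_t> interleaveStereo(const std::vector<std::int16_t>& left,
                                           const std::vector<std::int16_t>& right)
{
    assert(left.size() == right.size()); // both channels need the same number of sample points
    std::vector<std::int16_t> out;
    out.reserve(left.size() * 2);
    for (std::size_t i = 0; i < left.size(); ++i) {
        out.push_back(left[i]);  // left value of sample point i
        out.push_back(right[i]); // right value of sample point i
    }
    return out;
}
```

The resulting vector is in the order the data section expects, so it can be written out directly after the header.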

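The diagram below explains the layout in more detail. Its source is linked here, and that page is also worth reading if you want a more detailed look at the WAV format.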

C++ Implementation

File Header

First, the file header struct:

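#include <cstdint>

struct WAVHeader
{
    char RIFF[4]; ///RIFF marker "RIFF"
    std::uint32_t LEN; ///number of bytes from the next address to the end of the file
    char WAV[4]; ///WAV file marker "WAVE"
    char FMT[4]; ///format chunk marker "fmt "; the last byte is a 0x20 space
    std::uint32_t SubchunkSize; ///size of the sub-chunk header; 0x10 for this WAV sub-chunk
    std::uint16_t DATATYPE; ///format type; 1 means the data is linear PCM
    std::uint16_t CH; ///number of channels
    std::uint32_t F; ///sample rate
    std::uint32_t BYTERATE; ///bytes per second = sample rate * PCM bit depth / 8 * number of channels
    std::uint16_t DATAUNITLEN; ///data block unit length = number of channels * PCM bit depth / 8
    std::uint16_t BITDEPTH; ///PCM bit depth
    char DATA[4]; ///data marker "data"
    std::uint32_t DATALEN; ///total length of the data section in bytes
};
static_assert(sizeof(WAVHeader)==44,"the header must be exactly 44 bytes, with no padding");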

How to fill it in:

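#include <cstring> ///for memcpy

///Fill in the header for num sample points; F and DEPTH are the global constants defined further down
WAVHeader getHeader(int num)
{
    WAVHeader res;
    memcpy(res.RIFF,"RIFF",sizeof(res.RIFF)); ///copies exactly 4 bytes, without the terminating '\0'
    memcpy(res.WAV,"WAVE",sizeof(res.WAV));
    memcpy(res.FMT,"fmt ",sizeof(res.FMT));
    res.SubchunkSize=0x10;
    res.DATATYPE=1; ///linear PCM
    res.CH=2; ///stereo
    res.F=F;
    res.BITDEPTH=DEPTH;
    res.BYTERATE=res.F*res.BITDEPTH/8*res.CH;
    res.DATAUNITLEN=res.CH*res.BITDEPTH/8;
    memcpy(res.DATA,"data",sizeof(res.DATA));
    res.DATALEN=num*res.DATAUNITLEN;
    res.LEN=res.DATALEN+44-8; ///whole file (44-byte header + data) minus the first 8 bytes
    return res;
}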

Here num is the number of sample points (each sample point holds one value per channel).

Implementing FamiTracker-Style Sounds

First, define the lookup tables that map key names to frequencies.

const double keyf[]=
{
    27.5,29.1352,30.8677,
    32.7032,34.6478,36.7081,38.8909,41.2034,43.6535,46.2493,48.9994,51.9131,55,58.2705,61.7354,
    65.4064,69.2957,73.4162,77.7817,82.4069,87.3071,92.4986,97.9989,103.826,110,116.541,123.471,
    130.813,138.591,146.832,155.563,164.814,174.614,184.997,195.998,207.652,220,233.082,246.942,
    261.626,277.183,293.665,311.127,329.628,349.228,369.994,391.995,415.305,440,466.164,493.883,
    523.251,554.365,587.33,622.254,659.255,698.456,739.989,783.991,830.609,880,932.328,987.767,
    1046.5,1108.73,1174.66,1244.51,1318.51,1396.91,1479.98,1567.98,1661.22,1760,1864.66,1975.53,
    2093,2217.46,2349.32,2489.02,2637.02,2793.83,2959.96,3135.96,3322.44,3520,3729.31,3951.07,
    4186.01
}; ///frequencies (Hz) of the 88 piano keys, one row per octave (twelve-tone equal temperament)
string keyname[]=
{
    "A-0","A#0","B-0",
    "C-1","C#1","D-1","D#1","E-1","F-1","F#1","G-1","G#1","A-1","A#1","B-1",
    "C-2","C#2","D-2","D#2","E-2","F-2","F#2","G-2","G#2","A-2","A#2","B-2",
    "C-3","C#3","D-3","D#3","E-3","F-3","F#3","G-3","G#3","A-3","A#3","B-3",
    "C-4","C#4","D-4","D#4","E-4","F-4","F#4","G-4","G#4","A-4","A#4","B-4",
    "C-5","C#5","D-5","D#5","E-5","F-5","F#5","G-5","G#5","A-5","A#5","B-5",
    "C-6","C#6","D-6","D#6","E-6","F-6","F#6","G-6","G#6","A-6","A#6","B-6",
    "C-7","C#7","D-7","D#7","E-7","F-7","F#7","G-7","G#7","A-7","A#7","B-7",
    "C-8"
};
string noisename[]=
{
    "0-#","1-#","2-#","3-#","4-#","5-#","6-#","7-#","8-#","9-#","A-#","B-#","C-#","D-#","E-#","F-#"
};
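
The frequency table does not have to be typed in by hand: in twelve-tone equal temperament each semitone multiplies the frequency by the twelfth root of 2, so key k (counting from A0 = 27.5 Hz at index 0) has frequency 27.5 * 2^(k/12). A minimal sketch of that formula, where the function name keyFrequency is an assumption:

```cpp
#include <cassert>
#include <cmath>

// Frequency in Hz of piano key k (0 = A0, 48 = A4, 87 = C8)
// in twelve-tone equal temperament.
double keyFrequency(int k)
{
    assert(k >= 0 && k < 88); // the piano has 88 keys
    return 27.5 * std::pow(2.0, k / 12.0);
}
```

The results agree with the table to the precision printed there, for example 440 Hz for index 48 (A-4).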

Define some constants:

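const int F=48000; ///sample rate of the music, in Hz
const int DEPTH=16; ///bit depth of the music
const int LVL=(1<<14)-1; ///max volume per channel, so that 4 ch * LVL still fits in an unsigned short
map<string,int> nametokey; ///lookup from key name to index in keyf
const double SecondPerKey=0.065; ///duration of each note, in seconds
const int S=F*SecondPerKey; ///number of sample points needed for each note
typedef unsigned short levelval; ///the bit depth is 16, so use a short (note that players read 16-bit WAV samples as signed)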
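
The article does not show how nametokey gets filled. One plausible way, sketched here under that assumption, is to derive each key name from its index and store the reverse mapping. The helper names keyName and buildNameToKey are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Name of piano key k (0 = "A-0", 87 = "C-8") in the notation of the keyname table.
std::string keyName(int k)
{
    static const char* const notes[12] =
        {"C-", "C#", "D-", "D#", "E-", "F-", "F#", "G-", "G#", "A-", "A#", "B-"};
    assert(k >= 0 && k < 88);
    int semitone = k + 9; // semitones above C-0; key 0 is A-0, nine semitones higher
    return std::string(notes[semitone % 12]) + std::to_string(semitone / 12);
}

// Build the reverse lookup from key name to key index.
std::map<std::string, int> buildNameToKey()
{
    std::map<std::string, int> table;
    for (int k = 0; k < 88; ++k) table[keyName(k)] = k;
    return table;
}
```

With this, looking up "A-4" yields index 48, which is the 440 Hz entry of the frequency table.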

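Below are the functions that generate the square (pulse), triangle, and noise waves; the rnd array holds random numbers. The noise produced this way does not sound very good, though. I studied the waveforms that FamiTracker generates, but I could not work out how it produces them. The getWave function returns the sampled waveform.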

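#include <cmath> ///for fmod

///Value of each waveform at a given time t (in seconds)
levelval getPulse(int key,double t)
{
    double T=1/keyf[key]; ///period of the note
    double percent=fmod(t,T)/T; ///position within the current period, in [0,1)
    if (percent<0.5) return LVL; ///50% duty cycle: high for the first half of the period
    else return 0;
}
levelval getTriangle(int key,double t)
{
    double T=1/keyf[key];
    double percent=fmod(t,T)/T;
    if (percent<0.5) return LVL*percent; ///rising half; the peak is LVL/2
    else return LVL*(1-percent); ///falling half
}
levelval rnd[50005]; ///random values; must be filled before getNoise is called
levelval getNoise(int key,double t)
{
    double T=1/keyf[key];
    double percent=fmod(t,T)/T;
    ///keep the random value only while percent is below rnd/LVL, otherwise output 0
    if (percent<rnd[(int)(percent*S)]*1.0/LVL) return rnd[(int)(percent*S)];
    else return 0;
}
///Sample one note (S points) of the given type (0,1: pulse, 2: triangle, 3: noise) into res at volume vol (0-15)
void getWave(int type,int key,levelval res[],int vol=15)
{
    levelval (*f[])(int,double)= {getPulse,getPulse,getTriangle,getNoise};
    for (int i=0; i<S; i++)
        res[i]=f[type](key,i*1.0/F)*1.0*vol/15;
}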
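
As for the noise: the NES sound hardware that FamiTracker emulates is commonly documented as producing noise with a 15-bit linear-feedback shift register rather than a table of random numbers. The following is a minimal sketch of that idea, not FamiTracker's actual code. The names lfsrStep and lfsrNoise are made up here, and the timing (how often the register is stepped for each of the 16 noise pitches) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Advance a 15-bit LFSR by one step.
// Feedback is bit 0 XOR bit 1, or bit 0 XOR bit 6 in the short (more tonal) mode.
std::uint16_t lfsrStep(std::uint16_t state, bool shortMode = false)
{
    assert(state != 0 && state < (1u << 15)); // the all-zero state would never change
    std::uint16_t feedback = (state ^ (state >> (shortMode ? 6 : 1))) & 1u;
    return static_cast<std::uint16_t>((state >> 1) | (feedback << 14));
}

// Produce n noise values, each either 0 or level, stepping the register once per value.
std::vector<std::uint16_t> lfsrNoise(std::size_t n, std::uint16_t level)
{
    std::vector<std::uint16_t> out;
    out.reserve(n);
    std::uint16_t state = 1; // any non-zero start value works
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back((state & 1u) ? 0 : level); // silent while bit 0 is set
        state = lfsrStep(state);
    }
    return out;
}
```

To use this in place of getNoise, the values would still have to be held for several output samples each, depending on the selected noise pitch; that mapping is not covered here.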

Output

Open the file in binary mode and use fwrite for the binary output.

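#include <cstdio>
FILE *fp=fopen("test.wav","wb"); ///"wb": binary mode, so the bytes are written exactly as they are
WAVHeader header=getHeader(num);
fwrite(&header,sizeof(header),1,fp); ///the 44-byte header comes first
///...then fwrite the sample data right after the header...
fclose(fp);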
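
Writing the struct with a single fwrite relies on the compiler adding no padding and on the machine being little-endian. As an alternative sketch, the header can be assembled byte by byte, which avoids both assumptions. The names putLE, putTag, and writeWav16 are hypothetical, and the sketch assumes signed 16-bit PCM only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Append v to out as `bytes` little-endian bytes.
static void putLE(std::vector<unsigned char>& out, std::uint32_t v, int bytes)
{
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFF));
}

// Append a 4-character chunk tag such as "RIFF" (without a terminating '\0').
static void putTag(std::vector<unsigned char>& out, const char* tag)
{
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<unsigned char>(tag[i]));
}

// Write interleaved signed 16-bit PCM samples as a WAV file; returns true on success.
bool writeWav16(const std::string& path, const std::vector<std::int16_t>& samples,
                std::uint32_t sampleRate, std::uint16_t channels)
{
    assert(channels > 0 && samples.size() % channels == 0);
    const std::uint32_t dataBytes = static_cast<std::uint32_t>(samples.size() * 2);
    std::vector<unsigned char> buf;
    buf.reserve(44 + dataBytes);
    putTag(buf, "RIFF");
    putLE(buf, dataBytes + 44 - 8, 4);        // bytes remaining after this field
    putTag(buf, "WAVE");
    putTag(buf, "fmt ");
    putLE(buf, 16, 4);                        // size of the fmt sub-chunk
    putLE(buf, 1, 2);                         // format type 1: linear PCM
    putLE(buf, channels, 2);
    putLE(buf, sampleRate, 4);
    putLE(buf, sampleRate * channels * 2, 4); // bytes per second
    putLE(buf, channels * 2u, 2);             // data block unit length
    putLE(buf, 16, 2);                        // bit depth
    putTag(buf, "data");
    putLE(buf, dataBytes, 4);
    for (std::int16_t s : samples) putLE(buf, static_cast<std::uint16_t>(s), 2);

    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (!fp) return false;
    const bool ok = std::fwrite(buf.data(), 1, buf.size(), fp) == buf.size();
    return std::fclose(fp) == 0 && ok;
}
```

For example, filling a vector with one second of a square wave and calling writeWav16("test.wav", samples, 48000, 1) should give a file that ordinary players can open.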

Now go and try generating a WAV file yourself!

Web Version

2018-03-11 UPDATE:

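I recently put together a rough implementation in JS, so the music can now be played in a web page. The functionality is there, but the implementation is not very elegant and it is fairly slow: after clicking play you have to wait a few seconds while the WAV is generated.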

Clicking the button plays the score below. I also wrote a small program that converts FamiTracker txt scores, which can handle some simple conversions.

Besides the score already written out below, I also provide another one here that you can paste in and play.
