WAV Audio Format Explained
2017-05-18 21:26:04
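If you fully understood my previous article, you are probably wondering: can I generate an audio file myself? In fact, once you understand the principle, it all becomes very simple.
WAV, in its usual PCM form, is about the simplest audio format there is, because it involves no compression: every sample is written to the file exactly as it is. The goal of this article is to show you how to generate a WAV audio file.
WAV file header
WAV follows RIFF (Resource Interchange File Format), so a WAV file is really a three-level structure. Simplified a little, its header looks like the table below: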
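Offset | Size (bytes) | Type | Content |
00H-03H | 4 | char*4 | RIFF container tag "RIFF" |
04H-07H | 4 | unsigned int | Number of bytes from the next offset (08H) to the end of the file, i.e. file size minus 8 |
08H-0BH | 4 | char*4 | WAV file tag "WAVE" |
0CH-0FH | 4 | char*4 | Format chunk tag "fmt "; the last character is a space (0x20) |
10H-13H | 4 | unsigned int | Size of the fmt sub-chunk body; 0x10 for PCM WAV |
14H-15H | 2 | unsigned short | Format type; 1 means the data is linear PCM |
16H-17H | 2 | unsigned short | Number of channels |
18H-1BH | 4 | unsigned int | Sample rate (Hz) |
1CH-1FH | 4 | unsigned int | Bytes per second = sample rate * PCM bit depth / 8 * channels |
20H-21H | 2 | unsigned short | Size of one data block (one sample point across all channels) = channels * PCM bit depth / 8 |
22H-23H | 2 | unsigned short | PCM bit depth |
24H-27H | 4 | char*4 | Data chunk tag "data" |
28H-2BH | 4 | unsigned int | Total length of the data section, in bytes |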
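As a worked example of the formulas in the table, here is a minimal sketch that computes the derived fields from the basic parameters. The names `DerivedFields` and `deriveFields` are made up for illustration and are not part of the original program; the sketch assumes uncompressed PCM with a 44-byte header.

```cpp
#include <cassert>
#include <cstdint>

// Derived header fields for uncompressed PCM (hypothetical helper, for illustration).
struct DerivedFields {
    std::uint32_t byteRate;    // bytes per second (offset 1CH)
    std::uint16_t blockAlign;  // bytes per sample point across all channels (offset 20H)
    std::uint32_t dataBytes;   // length of the data section (offset 28H)
    std::uint32_t riffSize;    // value stored at offset 04H
};

DerivedFields deriveFields(std::uint32_t sampleRate, std::uint16_t bitDepth,
                           std::uint16_t channels, std::uint32_t samplePoints) {
    assert(bitDepth % 8 == 0);  // this sketch only handles whole-byte samples
    DerivedFields d;
    d.blockAlign = static_cast<std::uint16_t>(channels * bitDepth / 8);
    d.byteRate = sampleRate * d.blockAlign;
    d.dataBytes = samplePoints * d.blockAlign;
    d.riffSize = d.dataBytes + 44 - 8;  // 44-byte header minus the first 8 bytes
    return d;
}
```

For one second of 48000 Hz, 16-bit stereo audio, `deriveFields(48000, 16, 2, 48000)` gives a block size of 4, a byte rate of 192000, a data length of 192000, and 192036 for the field at 04H.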
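How WAV data is organized
The data section follows directly after the header. For a two-channel file the order is: the left-channel value of the first sample point, the right-channel value of the first sample point, ..., the left-channel value of the last sample point, the right-channel value of the last sample point.
Each value occupies as many bits as the bit depth.
The diagram below explains this in more detail (its source is linked here); that page is also worth reading if you want a deeper look at the WAV format.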
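The interleaved layout described in this section can be sketched as follows. This is only an illustration under the assumption of two channels and 16-bit samples; the function name `interleave` is made up and does not appear in the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleave two equally long channels into the on-disk order L0 R0 L1 R1 ...
std::vector<std::int16_t> interleave(const std::vector<std::int16_t>& left,
                                     const std::vector<std::int16_t>& right) {
    assert(left.size() == right.size());
    std::vector<std::int16_t> out;
    out.reserve(left.size() * 2);
    for (std::size_t i = 0; i < left.size(); ++i) {
        out.push_back(left[i]);   // left-channel value of sample point i
        out.push_back(right[i]);  // right-channel value of sample point i
    }
    return out;
}
```

With `left = {1, 2, 3}` and `right = {-1, -2, -3}`, the result is `1, -1, 2, -2, 3, -3`, which is the order in which the values would be written after the header.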
C++ implementation
File header
First, the header struct:
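struct WAVHeader
{
char RIFF[4]; ///RIFF container tag "RIFF"
unsigned LEN; ///number of bytes from the next offset to the end of the file
char WAV[4]; ///WAV file tag "WAVE"
char FMT[4]; ///format chunk tag "fmt ", the last character is a space (0x20)
unsigned SubchunkSize; ///size of the fmt sub-chunk body, 0x10 for PCM WAV
unsigned short DATATYPE; ///format type, 1 means the data is linear PCM
unsigned short CH; ///number of channels
unsigned F; ///sample rate
unsigned BYTERATE; ///bytes per second = sample rate * PCM bit depth / 8 * channels
unsigned short DATAUNITLEN; ///size of one data block = channels * PCM bit depth / 8
unsigned short BITDEPTH; ///PCM bit depth
char DATA[4]; ///data chunk tag "data"
unsigned DATALEN; ///total length of the data section, in bytes
};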
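Writing this struct to disk byte for byte only produces a valid header if `unsigned` is 4 bytes, `unsigned short` is 2 bytes, the compiler inserts no padding, and the machine is little-endian (RIFF stores integers in little-endian order). A hedged sketch of how the size assumptions could be checked at compile time is shown below; `WavHeaderFixed` and its field names are hypothetical and simply mirror the struct above with fixed-width types.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width variant of the header struct, for illustration only.
struct WavHeaderFixed {
    char          riff[4];     // "RIFF"
    std::uint32_t riffSize;    // file size minus 8
    char          wave[4];     // "WAVE"
    char          fmt[4];      // "fmt "
    std::uint32_t fmtSize;     // 16 for PCM
    std::uint16_t formatType;  // 1 = linear PCM
    std::uint16_t channels;
    std::uint32_t sampleRate;
    std::uint32_t byteRate;
    std::uint16_t blockAlign;
    std::uint16_t bitDepth;
    char          dataTag[4];  // "data"
    std::uint32_t dataSize;
};

// Compilation fails if the layout does not match the table of offsets.
static_assert(sizeof(WavHeaderFixed) == 44, "header must be exactly 44 bytes");
static_assert(offsetof(WavHeaderFixed, byteRate) == 0x1C, "byteRate must sit at 1CH");
static_assert(offsetof(WavHeaderFixed, dataTag) == 0x24, "data tag must sit at 24H");
```

The checks cover size and offsets only; byte order still has to be handled separately on a big-endian machine.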
How to fill it in:
///needs <cstring> for memcpy; F and DEPTH are the constants defined further below
WAVHeader getHeader(int num)
{
WAVHeader res;
memcpy(res.RIFF,"RIFF",sizeof(res.RIFF));
memcpy(res.WAV,"WAVE",sizeof(res.WAV));
memcpy(res.FMT,"fmt ",sizeof(res.FMT));
res.SubchunkSize=0x10;
res.DATATYPE=1; ///linear PCM
res.CH=2; ///stereo
res.F=F;
res.BITDEPTH=DEPTH;
res.BYTERATE=res.F*res.BITDEPTH/8*res.CH;
res.DATAUNITLEN=res.CH*res.BITDEPTH/8;
memcpy(res.DATA,"data",sizeof(res.DATA));
res.DATALEN=num*res.DATAUNITLEN;
res.LEN=res.DATALEN+44-8; ///44-byte header minus the first 8 bytes
return res;
}
Here, num is the number of sample points.
Implementing the FamiTracker effect
First, define the tables that map key names to frequencies.
const double keyf[]=
{
27.5,29.1352,30.8677,
32.7032,34.6478,36.7081,38.8909,41.2034,43.6535,46.2493,48.9994,51.9131,55,58.2705,61.7354,
65.4064,69.2957,73.4162,77.7817,82.4069,87.3071,92.4986,97.9989,103.826,110,116.541,123.471,
130.813,138.591,146.832,155.563,164.814,174.614,184.997,195.998,207.652,220,233.082,246.942,
261.626,277.183,293.665,311.127,329.628,349.228,369.994,391.995,415.305,440,466.164,493.883,
523.251,554.365,587.33,622.254,659.255,698.456,739.989,783.991,830.609,880,932.328,987.767,
1046.5,1108.73,1174.66,1244.51,1318.51,1396.91,1479.98,1567.98,1661.22,1760,1864.66,1975.53,
2093,2217.46,2349.32,2489.02,2637.02,2793.83,2959.96,3135.96,3322.44,3520,3729.31,3951.07,
4186.01
}; ///frequency table for the 88 piano keys, grouped by octave (twelve-tone equal temperament)
string keyname[]=
{
"A-0","A#0","B-0",
"C-1","C#1","D-1","D#1","E-1","F-1","F#1","G-1","G#1","A-1","A#1","B-1",
"C-2","C#2","D-2","D#2","E-2","F-2","F#2","G-2","G#2","A-2","A#2","B-2",
"C-3","C#3","D-3","D#3","E-3","F-3","F#3","G-3","G#3","A-3","A#3","B-3",
"C-4","C#4","D-4","D#4","E-4","F-4","F#4","G-4","G#4","A-4","A#4","B-4",
"C-5","C#5","D-5","D#5","E-5","F-5","F#5","G-5","G#5","A-5","A#5","B-5",
"C-6","C#6","D-6","D#6","E-6","F-6","F#6","G-6","G#6","A-6","A#6","B-6",
"C-7","C#7","D-7","D#7","E-7","F-7","F#7","G-7","G#7","A-7","A#7","B-7",
"C-8"
};
string noisename[]=
{
"0-#","1-#","2-#","3-#","4-#","5-#","6-#","7-#","8-#","9-#","A-#","B-#","C-#","D-#","E-#","F-#"
};
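Because the frequency table follows twelve-tone equal temperament, each entry can also be computed instead of typed in: every semitone multiplies the frequency by 2^(1/12), and A-4 (index 48 in the table) is 440 Hz. The sketch below illustrates this; the function name `keyFrequency` is made up and is not part of the original program.

```cpp
#include <cassert>
#include <cmath>

// Frequency in Hz of piano key `index`, using the same indexing as the table:
// 0 = A-0 (27.5 Hz), 48 = A-4 (440 Hz), 87 = C-8.
double keyFrequency(int index) {
    assert(index >= 0 && index < 88);
    return 440.0 * std::pow(2.0, (index - 48) / 12.0);
}
```

For example, `keyFrequency(39)` is the frequency of C-4 and agrees with the table entry 261.626 to the precision shown there.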
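Define some constants:
const int F=48000; ///sample rate of the music, in Hz
const int DEPTH=16; ///bit depth of the music
const int LVL=(1<<14)-1; ///max volume, 4 ch * LVL < unsigned short
map<string,int> nametokey; ///maps a key name to its index in the tables
const double SecondPerKey=0.065; ///duration of each note, in seconds
const int S=F*SecondPerKey; ///number of sample points per note
typedef unsigned short levelval; ///bit depth is 16, so use a short (note: players read 16-bit PCM as signed, so values above 32767 come out negative)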
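Below are the functions that generate the square, triangle, and noise waves; the rnd array holds random numbers. The noise produced this way does not sound very good, though. I studied the waveform that FamiTracker generates, but I could not work out how it is produced. The getWave function returns the sampled waveform.
///value of each waveform at a given time t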
levelval getPulse(int key,double t)
{
double T=1/keyf[key];
double percent=fmod(t,T)/T;
if (percent<0.5) return LVL;
else return 0;
}
levelval getTriangle(int key,double t)
{
double T=1/keyf[key];
double percent=fmod(t,T)/T;
if (percent<0.5) return LVL*percent;
else return LVL*(1-percent);
}
levelval rnd[50005];
levelval getNoise(int key,double t)
{
double T=1/keyf[key];
double percent=fmod(t,T)/T;
if (percent<rnd[(int)(percent*S)]*1.0/LVL) return rnd[(int)(percent*S)];
else return 0;
}
void getWave(int type,int key,levelval res[],int vol=15)
{
levelval (*f[])(int,double)= {getPulse,getPulse,getTriangle,getNoise};
for (int i=0; i<S; i++)
res[i]=f[type](key,i*1.0/F)*1.0*vol/15;
}
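One possible alternative to the random-array noise above: FamiTracker targets the NES sound hardware, and the NES noise channel is commonly documented as a 15-bit linear feedback shift register (LFSR). The sketch below follows that description. It is an assumption that this is close to what FamiTracker outputs, it has not been checked against FamiTracker itself, and the names `NoiseLfsr` and `lfsrPeriod` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// 15-bit linear feedback shift register, modelled on the documented NES noise channel.
struct NoiseLfsr {
    std::uint16_t reg = 1;  // must start non-zero, otherwise the register stays at 0

    // Advance one step and return the new bit 0 of the register.
    int step() {
        assert(reg != 0);
        std::uint16_t feedback = (reg ^ (reg >> 1)) & 1u;  // bit 0 XOR bit 1
        reg = static_cast<std::uint16_t>((reg >> 1) | (feedback << 14));
        return reg & 1;
    }
};

// Count the steps until the register returns to its starting value.
int lfsrPeriod() {
    NoiseLfsr n;
    int steps = 0;
    do {
        n.step();
        ++steps;
    } while (n.reg != 1);
    return steps;
}
```

A noise sample could then be produced as `lfsr.step() ? LVL : 0`, stepping the register at a rate chosen by the noise pitch. The sequence only repeats after 32767 steps, which is why it sounds like noise rather than a tone.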
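Output
Open the file in binary mode and use fwrite for binary output.
FILE *fp=fopen("test.wav","wb"); ///"wb": write, binary
WAVHeader header=getHeader(num); ///num is the number of sample points
fwrite(&header,sizeof(header),1,fp); ///the 44-byte header goes first
///...the sample data is written after the header, also with fwrite...
fclose(fp);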
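To tie the header and the data together, here is a self-contained sketch that builds a complete 16-bit PCM WAV file in memory and then writes it out. It is not the original program: the helpers `putLE`, `putTag`, `buildWav`, and `writeFile` are hypothetical, and the bytes are written one at a time in little-endian order so the result does not depend on struct layout or on the machine's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Append the low `bytes` bytes of `value` in little-endian order.
static void putLE(std::vector<unsigned char>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<unsigned char>((value >> (8 * i)) & 0xFF));
}

// Append a 4-character chunk tag such as "RIFF".
static void putTag(std::vector<unsigned char>& out, const char* tag) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<unsigned char>(tag[i]));
}

// Build a whole WAV file image from interleaved 16-bit samples.
std::vector<unsigned char> buildWav(const std::vector<std::int16_t>& samples,
                                    std::uint32_t sampleRate, std::uint16_t channels) {
    assert(channels > 0 && samples.size() % channels == 0);
    const std::uint16_t blockAlign = static_cast<std::uint16_t>(channels * 2);
    const std::uint32_t dataBytes = static_cast<std::uint32_t>(samples.size() * 2);
    std::vector<unsigned char> out;
    putTag(out, "RIFF");
    putLE(out, dataBytes + 44 - 8, 4);          // bytes after this field
    putTag(out, "WAVE");
    putTag(out, "fmt ");
    putLE(out, 16, 4);                          // size of the fmt sub-chunk body
    putLE(out, 1, 2);                           // 1 = linear PCM
    putLE(out, channels, 2);
    putLE(out, sampleRate, 4);
    putLE(out, sampleRate * blockAlign, 4);     // bytes per second
    putLE(out, blockAlign, 2);
    putLE(out, 16, 2);                          // bit depth
    putTag(out, "data");
    putLE(out, dataBytes, 4);
    for (std::int16_t s : samples) putLE(out, static_cast<std::uint16_t>(s), 2);
    return out;
}

// Write the bytes to `path` in binary mode; returns true on success.
bool writeFile(const std::string& path, const std::vector<unsigned char>& bytes) {
    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (!fp) return false;
    const std::size_t written = std::fwrite(bytes.data(), 1, bytes.size(), fp);
    const bool closed = std::fclose(fp) == 0;
    return closed && written == bytes.size();
}
```

A call such as `writeFile("test.wav", buildWav(samples, 48000, 2))` would then produce a playable file, given a vector of interleaved left and right samples.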
Now go and try generating a WAV file of your own!
Web version
2018-03-11 UPDATE:
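I recently put together a rough implementation in JS that can play the result right in the web page. It works, but the implementation is not elegant and it is fairly slow: after clicking play, you have to wait a few seconds while the WAV is generated.
Click the button to play the score below. I also wrote a small converter for FamiTracker txt scores, which can handle some simple conversions.
Besides the score already filled in below, I have provided another one here that you can paste in and play.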