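Parsing MP4 Files
MP4 is arguably the most popular video format today, and playing an MP4 file starts with parsing its structure. Put simply, that structure resembles Russian nesting dolls: boxes inside boxes. Put less simply, there are many kinds of boxes, and reading them involves some parsing and arithmetic. This post walks through the main boxes in an MP4 file in the order they are nested. Note that the ISO standard defines a great many box types; only the more important ones are covered here.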
Mp4Box
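First we need to understand the structure of a box. Every box in an MP4 file consists of a BoxHeader and BoxData. The BoxHeader holds the box type and the length of the whole box; the BoxData holds either child boxes or raw data. Given this structure, we can design a common parent class for all boxes. One thing to watch for: the length of a Media Data Box can exceed the range of a 32-bit int. In that case size is set to 1 and the 64 bits that follow the type field store the actual length. If the box type is "uuid", the next 16 bytes store the uuid.

```java
public class Mp4Box {
    public int headSize = 8;          // size (4 bytes) + type (4 bytes)
    public int size;
    public String type;
    public long largeSize;
    public boolean hasSubBox = true;
    public int index;                 // current read position in the buffer
    public long end;                  // position just past the end of this box

    public Mp4Box(byte[] byteBuffer, int start) {
        this.index = start;
        size = getSize(byteBuffer);
        end = start + size;
        type = getType(byteBuffer);
        if (size == 1) {
            // the 64-bit largesize follows the type field
            largeSize = getLongFromBuffer(byteBuffer);
            end = start + largeSize;
            headSize += 8;
        } else if (size == 0) {
            // box extends to the end of the file
            // (assuming the buffer holds the whole file)
            end = byteBuffer.length;
        }
        if (type.equals("uuid")) {
            // the 16-byte extended type comes after size, type and largesize
            headSize += 16;
        }
    }

    // Helper methods (getSize, getType, getIntFromBuffer, getLongFromBuffer,
    // getStringFromBuffer, parseSub) omitted.
}
```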
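The listing relies on helpers such as getSize and getType whose bodies are not shown. As a rough, self-contained sketch of the same header logic, the following reads the size, type, and optional largesize with `java.nio.ByteBuffer`. The class name `BoxHeaderSketch` and its fields are hypothetical and not part of the original classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone header reader; not one of the post's classes. */
final class BoxHeaderSketch {
    final long size;      // total box length in bytes, header included (0 = runs to end of file)
    final String type;    // four-character box type
    final int headerSize; // 8, plus 8 if largesize is present, plus 16 for a uuid box

    private BoxHeaderSketch(long size, String type, int headerSize) {
        this.size = size;
        this.type = type;
        this.headerSize = headerSize;
    }

    static BoxHeaderSketch read(byte[] data, int start) {
        // ByteBuffer is big-endian by default, which matches the MP4 byte order.
        ByteBuffer buf = ByteBuffer.wrap(data, start, data.length - start);
        long size = buf.getInt() & 0xFFFFFFFFL;   // treat the 32-bit size as unsigned
        byte[] fourcc = new byte[4];
        buf.get(fourcc);
        String type = new String(fourcc, StandardCharsets.ISO_8859_1);
        int headerSize = 8;
        if (size == 1) {
            size = buf.getLong();                 // 64-bit largesize follows the type
            headerSize += 8;
        }
        if (type.equals("uuid")) {
            headerSize += 16;                     // 16-byte extended type
        }
        return new BoxHeaderSketch(size, type, headerSize);
    }
}
```

Calling `BoxHeaderSketch.read(data, 0)` on a buffer that starts with the bytes `00 00 00 10 'f' 't' 'y' 'p'` yields size 16, type "ftyp", and a header size of 8.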
Fullbox
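The Mp4Box parent class above covers the common case, where the box header is 8 bytes long. Some boxes, however, carry an extra 1-byte version and 3-byte flags field in the header. We therefore create a FullBox parent class for all boxes of this kind to inherit from.

```java
public class FullBox extends Mp4Box {
    public int version;
    public int flag;

    public FullBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        version = getIntFromBuffer(byteBuffer, 1);
        flag = getIntFromBuffer(byteBuffer, 3);
        // version + flags add 4 bytes; += keeps any largesize/uuid bytes
        // already counted by Mp4Box
        headSize += 4;
    }
}
```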
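If the four bytes after the box header are read as one 32-bit word instead of two separate reads, the version and flags can be split with shifts and masks. This is a small sketch of that idea; `FullBoxFieldsSketch` is a made-up name.

```java
/** Hypothetical helper for splitting the version/flags word of a full box. */
final class FullBoxFieldsSketch {
    /** The version is the most significant byte of the word. */
    static int version(int word) {
        return word >>> 24;
    }

    /** The flags are the remaining 24 bits. */
    static int flags(int word) {
        return word & 0x00FFFFFF;
    }
}
```

For the word `0x01000007`, `version` returns 1 and `flags` returns 7.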
ftyp
Stands for File Type Box. It normally sits at the very start of the file (only a fixed-size file signature may precede it) and mainly records the file's brands. It is defined as follows:

```java
// ftyp is a plain box: it has no version/flags, so it extends Mp4Box
public class FtypBox extends Mp4Box {
    public int majorBrand;
    public int minor_version;
    public int[] compatibleBrands;

    public FtypBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        hasSubBox = false;
        majorBrand = getIntFromBuffer(byteBuffer, 4);
        minor_version = getIntFromBuffer(byteBuffer, 4);
        // everything from the current position to the end of the box
        // is a list of 4-byte compatible brands
        int num = (int) ((end - index) / 4);
        compatibleBrands = new int[num];
        for (int i = 0; i < num; i++) {
            compatibleBrands[i] = getIntFromBuffer(byteBuffer, 4);
        }
    }
}
```
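The brands are stored here as ints, but each one is really four ASCII characters. A small sketch for turning such an int back into readable text (the name `FourCcSketch` is an assumption for illustration):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that renders a 32-bit brand or box type as text. */
final class FourCcSketch {
    static String toFourCc(int value) {
        // Most significant byte first, matching the big-endian file layout.
        byte[] bytes = {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

For example, `FourCcSketch.toFourCc(0x69736F6D)` returns "isom".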
moov
Stands for Movie Box. It contains one mvhd box and several trak boxes (video, audio, and so on). It usually comes right after ftyp near the start of the file, though some files place it at the end.

```java
import java.util.ArrayList;

public class MoovBox extends Mp4Box {
    public MvhdBox mvhdBox;
    public ArrayList<TrakBox> trakBox = new ArrayList<>();

    public MoovBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        do {
            // read the header of the next child box
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            if (size > 8) {
                if (type.equals("mvhd")) {
                    mvhdBox = new MvhdBox(byteBuffer, subStart);
                } else if (type.equals("trak")) {
                    trakBox.add(new TrakBox(byteBuffer, subStart));
                }
                // jump to the start of the next sibling box
                subStart += size;
                index = subStart;
            } else {
                break;
            }
        } while (subStart < end);
    }
}
```
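The parseSub loop is the pattern every container box in this post repeats: read a child header, handle the child, then skip ahead by its size. As a stand-alone sketch of that walk, the following lists the types of the child boxes in a byte range. `BoxWalkerSketch` is a hypothetical name, and the sketch deliberately stops at sizes it does not handle (0, the largesize marker 1, or anything that overruns the range).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical walker over sibling boxes in data[start, end). */
final class BoxWalkerSketch {
    static List<String> childTypes(byte[] data, int start, int end) {
        List<String> types = new ArrayList<>();
        int pos = start;
        while (pos + 8 <= end) {
            ByteBuffer buf = ByteBuffer.wrap(data, pos, 8);
            long size = buf.getInt() & 0xFFFFFFFFL;
            byte[] fourcc = new byte[4];
            buf.get(fourcc);
            types.add(new String(fourcc, StandardCharsets.ISO_8859_1));
            // Stop on sizes this sketch does not handle instead of looping forever.
            if (size < 8 || pos + size > end) {
                break;
            }
            pos += (int) size;
        }
        return types;
    }
}
```

Given an 8-byte "mvhd" box followed by a 12-byte "trak" box, `childTypes(data, 0, 20)` returns `[mvhd, trak]`.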
mvhd
Stands for Movie Header Box. It holds information about the media file as a whole.

```java
public class MvhdBox extends FullBox {
    public long creationTime;
    public long modificationTime;
    public int timescale;
    public long duration;
    public int rate;
    public int volume;
    public int reserved = 0;
    public int[] matrix = new int[9];
    public int[] preDefined = new int[6];
    public int nextTrackId;

    public MvhdBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        if (version == 1) {
            // version 1 uses 64-bit times and duration
            creationTime = getLongFromBuffer(byteBuffer);
            modificationTime = getLongFromBuffer(byteBuffer);
            timescale = getIntFromBuffer(byteBuffer, 4);
            duration = getLongFromBuffer(byteBuffer);
        } else {
            // version 0 uses unsigned 32-bit values; mask to avoid negative numbers
            creationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            modificationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            timescale = getIntFromBuffer(byteBuffer, 4);
            duration = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
        }
        rate = getIntFromBuffer(byteBuffer, 4);
        volume = getIntFromBuffer(byteBuffer, 2);
        // skip reserved
        index += 10;
        for (int i = 0; i < matrix.length; i++) {
            matrix[i] = getIntFromBuffer(byteBuffer, 4);
        }
        for (int i = 0; i < preDefined.length; i++) {
            preDefined[i] = getIntFromBuffer(byteBuffer, 4);
        }
        nextTrackId = getIntFromBuffer(byteBuffer, 4);
    }
}
```
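- creationTime: the creation time of the file, as an integer number of seconds since midnight on January 1, 1904
- modificationTime: the modification time, defined the same way
- timescale: the number of time units in one second; combined with duration below, it gives the length of the media
- duration: the length of the media in timescale units, taken from the longest track
- rate: the playback rate; 1.0 is normal speed (stored as a 16.16 fixed-point number)
- volume: the playback volume (stored as an 8.8 fixed-point number)
- matrix: the transformation matrix for the video
- nextTrackId: the ID to use for the next track added to the file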
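The raw mvhd fields need a little arithmetic before they are useful. The sketch below shows the conversions described above: duration divided by timescale gives seconds, the 1904-based timestamps shift to the Unix epoch by a fixed offset, and rate and volume are fixed-point values. `MvhdFieldsSketch` and its method names are assumptions for illustration, and the 1904 epoch is taken to be UTC.

```java
import java.time.Instant;

/** Hypothetical conversions for values read from an mvhd box. */
final class MvhdFieldsSketch {
    /** Seconds from 1904-01-01T00:00:00Z to 1970-01-01T00:00:00Z. */
    static final long MP4_EPOCH_OFFSET = 2_082_844_800L;

    /** Converts a 1904-based timestamp to a java.time.Instant. */
    static Instant toInstant(long secondsSince1904) {
        return Instant.ofEpochSecond(secondsSince1904 - MP4_EPOCH_OFFSET);
    }

    /** Media length in seconds: duration is counted in timescale units. */
    static double durationSeconds(long duration, int timescale) {
        return (double) duration / timescale;
    }

    /** rate is 16.16 fixed-point: 0x00010000 means 1.0. */
    static double rate(int raw) {
        return raw / 65536.0;
    }

    /** volume is 8.8 fixed-point: 0x0100 means 1.0 (full volume). */
    static double volume(int raw) {
        return raw / 256.0;
    }
}
```

For instance, a duration of 90000 with a timescale of 600 is 150 seconds, and a raw rate of `0x00010000` is 1.0.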
trak
Stands for Track Box. It contains a single track. A video typically has two tracks, video and audio; there may also be a hint track intended for streaming.
A trak box contains two kinds of box, tkhd and mdia.

```java
public class TrakBox extends Mp4Box {
    public TkhdBox mTkhdBox;
    public MdiaBox mMdiaBox;

    public TrakBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        do {
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            if (size > 8) {
                if (type.equals("tkhd")) {
                    mTkhdBox = new TkhdBox(byteBuffer, subStart);
                } else if (type.equals("mdia")) {
                    mMdiaBox = new MdiaBox(byteBuffer, subStart);
                }
                subStart += size;
                index = subStart;
            } else {
                break;
            }
        } while (subStart < end);
    }
}
```
tkhd
Stands for Track Header Box. Each trak box contains one tkhd box that stores information about the track.

```java
public class TkhdBox extends FullBox {
    public long creationTime;
    public long modificationTime;
    public int trackId;
    public long duration;
    public int layer;
    public int alternateGroup;
    public int volume;
    public int[] matrix = new int[9];
    public int width;
    public int height;

    public TkhdBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        if (version == 1) {
            creationTime = getLongFromBuffer(byteBuffer);
            modificationTime = getLongFromBuffer(byteBuffer);
            trackId = getIntFromBuffer(byteBuffer, 4);
            index += 4; // skip reserved
            duration = getLongFromBuffer(byteBuffer);
        } else {
            // version 0 uses unsigned 32-bit values; mask to avoid negative numbers
            creationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            modificationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            trackId = getIntFromBuffer(byteBuffer, 4);
            index += 4; // skip reserved
            duration = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
        }
        // skip reserved
        index += 8;
        layer = getIntFromBuffer(byteBuffer, 2);
        alternateGroup = getIntFromBuffer(byteBuffer, 2);
        volume = getIntFromBuffer(byteBuffer, 2);
        index += 2; // skip reserved
        for (int i = 0; i < matrix.length; i++) {
            matrix[i] = getIntFromBuffer(byteBuffer, 4);
        }
        // width and height are stored as 16.16 fixed-point values
        width = getIntFromBuffer(byteBuffer, 4);
        height = getIntFromBuffer(byteBuffer, 4);
        Logger.i(toString());
    }
}
```
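- creationTime: the creation time of this track (same definition as in mvhd)
- modificationTime: the modification time of this track (same definition as in mvhd)
- trackId: the ID of the current track; it must not be 0
- duration: the length of the current track
- layer: the front-to-back ordering of video tracks during playback; lower numbers are closer to the top
- alternateGroup: groups tracks together; only one track in a group can play at a time
- volume: the volume; it is 0 for a video track
- matrix: the transformation matrix for the video
- width/height: the width and height of the video in pixels, stored as 16.16 fixed-point values, so the whole-pixel count is the upper 16 bits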
mdia
Stands for Media Box, which holds the information for the current track. The mdia box has three main child boxes, hdlr, mdhd, and minf, each analyzed below.
Note that componentName is not something the mdia box itself needs to parse; we use it here to record the type of the current track, for example video or audio.

```java
public class MdiaBox extends Mp4Box {
    public HdlrBox mHdlrBox;
    public MdhdBox mMdhdBox;
    public MinfBox mMinfBox;
    public String componentName;

    public MdiaBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        do {
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            if (size > 8) {
                if (type.equals("hdlr")) {
                    mHdlrBox = new HdlrBox(byteBuffer, subStart);
                    // remember the track type so minf/stbl/stsd can use it
                    componentName = mHdlrBox.componentName;
                } else if (type.equals("mdhd")) {
                    mMdhdBox = new MdhdBox(byteBuffer, subStart);
                } else if (type.equals("minf")) {
                    mMinfBox = new MinfBox(byteBuffer, subStart, componentName);
                }
                subStart += size;
                index = subStart;
            } else {
                break;
            }
        } while (subStart < end);
    }
}
```
mdhd
Stands for Media Header Box. It holds some information about the current track, such as creationTime; these fields were covered above and are not repeated here.

```java
public class MdhdBox extends FullBox {
    public long creationTime;
    public long modificationTime;
    public int timescale;
    public long duration;
    public int language;
    public int predefined;

    public MdhdBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        if (version == 1) {
            creationTime = getLongFromBuffer(byteBuffer);
            modificationTime = getLongFromBuffer(byteBuffer);
            timescale = getIntFromBuffer(byteBuffer, 4);
            duration = getLongFromBuffer(byteBuffer);
        } else {
            // version 0 uses unsigned 32-bit values; mask to avoid negative numbers
            creationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            modificationTime = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
            timescale = getIntFromBuffer(byteBuffer, 4);
            duration = getIntFromBuffer(byteBuffer, 4) & 0xFFFFFFFFL;
        }
        language = getIntFromBuffer(byteBuffer, 2);
        predefined = getIntFromBuffer(byteBuffer, 2);
    }
}
```

- language: the language code of the current media, an ISO 639-2/T code packed as three 5-bit characters
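The language field is read above as a plain 16-bit int. In the ISO layout that value is one padding bit followed by three 5-bit characters, each stored as the letter's ASCII code minus 0x60. A minimal decoding sketch, with `MdhdLanguageSketch` as a made-up name:

```java
/** Hypothetical decoder for the packed mdhd language field. */
final class MdhdLanguageSketch {
    static String decode(int packed) {
        char[] letters = new char[3];
        // Each letter occupies 5 bits; adding 0x60 restores the lowercase ASCII code.
        letters[0] = (char) (((packed >> 10) & 0x1F) + 0x60);
        letters[1] = (char) (((packed >> 5) & 0x1F) + 0x60);
        letters[2] = (char) ((packed & 0x1F) + 0x60);
        return new String(letters);
    }
}
```

For example, `decode(0x55C4)` returns "und" (undetermined) and `decode(0x15C7)` returns "eng".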
hdlr
Stands for Handler Reference Box. It records the type of the current track, for example whether it is a video, sound, or hint track.

```java
public class HdlrBox extends FullBox {
    public int componentType;
    public int componentSubType;
    public String componentName;

    public HdlrBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        componentType = getIntFromBuffer(byteBuffer, 4);
        // four-character handler type, e.g. 'vide', 'soun' or 'hint'
        componentSubType = getIntFromBuffer(byteBuffer, 4);
        index += 12; // skip reserved
        // human-readable handler name; only the first 12 bytes are read here
        componentName = getStringFromBuffer(byteBuffer, 12);
    }
}
```
minf
Stands for Media Information Box. Its main content is an stbl box.

```java
public class MinfBox extends Mp4Box {
    StblBox mStblBox;
    private String mType;

    public MinfBox(byte[] byteBuffer, int start, String type) {
        super(byteBuffer, start);
        mType = type;
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        do {
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            if (size > 8) {
                if (type.equals("stbl")) {
                    // pass the track type down so stsd can pick the entry class
                    mStblBox = new StblBox(byteBuffer, subStart, mType);
                }
                subStart += size;
                index = subStart;
            } else {
                break;
            }
        } while (subStart < end);
    }
}
```
stbl
Stands for Sample Table Box. This box holds the time and data indexes for every sample in the current track, so from it you can locate the sample at a given point in time, along with its size and other details.
It contains many child boxes; combining the information from each of them yields the full details of the samples. How to do that calculation will be analyzed at the end.

```java
public class StblBox extends Mp4Box {
    private final String mType;
    public StsdBox mStsdBox;
    public StssBox mStssBox;   // declared but not parsed in this listing
    public SttsBox mSttsBox;
    public StszBox mStszBox;
    public StscBox mStscBox;
    public StcoBox mStcoBox;
    public Co64Box mCo64Box;
    public CttsBox mCttsBox;

    public StblBox(byte[] byteBuffer, int start, String type) {
        super(byteBuffer, start);
        mType = type;
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        do {
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            if (size < 8) {
                break; // malformed or unsupported size; avoids looping forever
            }
            if (type.equals("stsd")) {
                mStsdBox = new StsdBox(byteBuffer, subStart, mType);
            } else if (type.equals("stts")) {
                mSttsBox = new SttsBox(byteBuffer, subStart);
            } else if (type.equals("stsz")) {
                mStszBox = new StszBox(byteBuffer, subStart);
            } else if (type.equals("stsc")) {
                mStscBox = new StscBox(byteBuffer, subStart);
            } else if (type.equals("stco")) {
                mStcoBox = new StcoBox(byteBuffer, subStart);
            } else if (type.equals("co64")) { // box types are case-sensitive
                mCo64Box = new Co64Box(byteBuffer, subStart);
            } else if (type.equals("ctts")) {
                mCttsBox = new CttsBox(byteBuffer, subStart);
            }
            subStart += size;
            index = subStart;
        } while (subStart < end);
    }
}
```
stts
Stands for Decoding Time to Sample Box. It contains a table that maps a decoding time to a sample number.

```java
public class SttsBox extends FullBox {
    private final int entryCount;
    public int[] sampleCount;
    public int[] sampleDelta;

    public SttsBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        entryCount = getIntFromBuffer(byteBuffer, 4);
        sampleCount = new int[entryCount];
        sampleDelta = new int[entryCount];
        // each entry is a (count, delta) pair
        for (int i = 0; i < entryCount; i++) {
            sampleCount[i] = getIntFromBuffer(byteBuffer, 4);
            sampleDelta[i] = getIntFromBuffer(byteBuffer, 4);
        }
    }
}
```

- sampleCount: the number of consecutive samples within one time span
- sampleDelta: the delta (duration in timescale units) of each sample in that span
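To make the table lookup concrete, here is a sketch of how the two arrays can be walked to find the sample that is being decoded at a given time. It is an illustration under stated assumptions, not the post's implementation: `SttsLookupSketch` is a hypothetical name, the time is in the track's timescale units, and the returned index is zero-based.

```java
/** Hypothetical lookup from decode time to sample index using stts entries. */
final class SttsLookupSketch {
    /**
     * Returns the zero-based index of the sample whose decode interval
     * contains the given time, or -1 if the time is outside the track.
     */
    static int sampleIndexAt(int[] sampleCount, int[] sampleDelta, long time) {
        if (time < 0) {
            return -1;
        }
        long elapsed = 0;     // decode time at the start of the current entry
        int firstSample = 0;  // index of the first sample in the current entry
        for (int i = 0; i < sampleCount.length; i++) {
            long span = (long) sampleCount[i] * sampleDelta[i];
            if (time < elapsed + span) {
                // span > 0 here, so sampleDelta[i] cannot be zero
                return firstSample + (int) ((time - elapsed) / sampleDelta[i]);
            }
            elapsed += span;
            firstSample += sampleCount[i];
        }
        return -1;
    }
}
```

With entries `{3, 2}` for sampleCount and `{1000, 500}` for sampleDelta, time 2999 falls in sample 2, time 3000 in sample 3, and time 3500 in sample 4.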
ctts
Stands for Composition Time to Sample Box. It contains the offsets between decoding time and composition time.

```java
public class CttsBox extends FullBox {
    private final int mEntryCount;
    public int[] sampleCount;
    public int[] sampleOffset;

    public CttsBox(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        mEntryCount = getIntFromBuffer(byteBuffer, 4);
        sampleCount = new int[mEntryCount];
        sampleOffset = new int[mEntryCount];
        // each entry is a (count, offset) pair
        for (int i = 0; i < mEntryCount; i++) {
            sampleCount[i] = getIntFromBuffer(byteBuffer, 4);
            sampleOffset[i] = getIntFromBuffer(byteBuffer, 4);
        }
    }
}
```

- sampleCount: the number of consecutive samples that share a given offset
- sampleOffset: the offset between CT and DT, so that CT(n) = DT(n) + CTTS(n)
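The relation CT(n) = DT(n) + CTTS(n) can be applied by first expanding the stts entries into one decode time per sample and then adding the ctts offsets run by run. The sketch below does exactly that; `CompositionTimeSketch` and its parameter names are assumptions for illustration, and all values are in timescale units.

```java
/** Hypothetical expansion of stts and ctts tables into per-sample composition times. */
final class CompositionTimeSketch {
    static long[] compositionTimes(int[] sttsCount, int[] sttsDelta,
                                   int[] cttsCount, int[] cttsOffset) {
        int total = 0;
        for (int count : sttsCount) {
            total += count;
        }
        long[] times = new long[total];

        // Step 1: fill in the decode time DT(n) of every sample.
        int n = 0;
        long dt = 0;
        for (int i = 0; i < sttsCount.length; i++) {
            for (int j = 0; j < sttsCount[i]; j++) {
                times[n++] = dt;
                dt += sttsDelta[i];
            }
        }

        // Step 2: add the ctts offset to get CT(n) = DT(n) + CTTS(n).
        n = 0;
        for (int i = 0; i < cttsCount.length; i++) {
            for (int j = 0; j < cttsCount[i] && n < total; j++) {
                times[n++] += cttsOffset[i];
            }
        }
        return times;
    }
}
```

With four samples of delta 1000 and ctts entries (1, 1000), (2, 3000), (1, 0), the decode times 0, 1000, 2000, 3000 become composition times 1000, 4000, 5000, 3000.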
stsd
Stands for Sample Description Box. It contains child boxes that store the codec information; the type of the child boxes depends on the type of the current track.

```java
public class StsdBox extends FullBox {
    private final String mType;
    public int numberOfEntries;
    public SampleEntry[] entrys;

    public StsdBox(byte[] byteBuffer, int start, String type) {
        super(byteBuffer, start);
        numberOfEntries = getIntFromBuffer(byteBuffer, 4);
        entrys = new SampleEntry[numberOfEntries];
        mType = type;
        parseSub(byteBuffer);
    }

    @Override
    public void parseSub(byte[] byteBuffer) {
        int subStart = index;
        for (int i = 0; i < numberOfEntries; i++) {
            int size = getSize(byteBuffer);
            String type = getType(byteBuffer);
            Logger.i("type:" + type);
            // mType is the hdlr component name; this check relies on it
            // starting with "Video" for video tracks
            entrys[i] = mType.startsWith("Video")
                    ? new VisualSampleEntry(byteBuffer, subStart)
                    : new AudioSampleEntry(byteBuffer, subStart);
            subStart += size;
            index = subStart; // move on to the next entry
        }
    }
}
```
VisualSampleEntry

```java
public class VisualSampleEntry extends SampleEntry {
    private final int mWidth;
    private final int mHeight;
    private final int mHorizeresolution;
    private final int mVertresolution;
    private final int mFrameCount;
    private final String mCompressorName;

    public VisualSampleEntry(byte[] byteBuffer, int start) {
        super(byteBuffer, start);
        index += 16; // skip pre_defined and reserved fields
        mWidth = getIntFromBuffer(byteBuffer, 2);
        mHeight = getIntFromBuffer(byteBuffer, 2);
        mHorizeresolution = getIntFromBuffer(byteBuffer, 4);
        mVertresolution = getIntFromBuffer(byteBuffer, 4);
        index += 4; // skip reserved
        mFrameCount = getIntFromBuffer(byteBuffer, 2);
        // compressorname is a fixed 32-byte field whose first byte
        // is the number of bytes to display
        mCompressorName = getStringFromBuffer(byteBuffer, 32);
        Logger.i(toString());
    }
}
```
//todo
Summary