ELF PLT/GOT Symbol Relocation on Android and an ELF Hook Implementation
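#Introduction
I wrote this article for two main reasons:
- First, most articles online that describe PLT/GOT symbol relocation target x86; [Redirecting functions in shared ELF libraries](http://www.codeproject.com/Articles/70302/Redirecting-functions-in-shared-ELF-libraries#_Toc257815978) is a very good example. The process on ARM is very similar, but because the CPU architecture differs, the instructions that implement it are quite different.
- Second, most introductions to the ELF file format are based on the linking view, which parses an ELF file by its sections. When a shared library is loaded, however, the linker only looks at the segment information. The section information can therefore be tampered with or even removed entirely without affecting loading. Doing so hinders static analysis tools (such as IDA and readelf), and packed ELF files usually get this treatment. To hook such a file, symbols have to be resolved through the execution view.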
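#Preparation
Before reading on, make sure you have a rough understanding of the ELF file format and ARM assembly. References:
- [ELF file format analysis](http://staff.ustc.edu.cn/~sycheng/sst/exp_crack/ELF.pdf)
- [ARM documentation](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204ic/Cihbiggi.html#)
Tools:
- readelf (included in the NDK)
- objdump (included in the NDK)
- IDA Pro 6.4 or later
- An Android device or emulator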
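#Symbol Relocation
On ARM, three relocation types matter most for imported functions: **R_ARM_JUMP_SLOT**, **R_ARM_ABS32**, and **R_ARM_GLOB_DAT**. To hook an ELF function we need to handle all three.
##Example
Here is the sample code: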
```
typedef int (*strlen_fun)(const char *);
strlen_fun global_strlen1 = (strlen_fun)strlen;
strlen_fun global_strlen2 = (strlen_fun)strlen;
#define SHOW(x) LOGI("%s is %d", #x, x)
extern "C" jint Java_com_example_allhookinone_HookUtils_elfhook(JNIEnv *env, jobject thiz){
  const char *str = "helloworld";
  strlen_fun local_strlen1 = (strlen_fun)strlen;
  strlen_fun local_strlen2 = (strlen_fun)strlen;
  int len0 = global_strlen1(str);
  int len1 = global_strlen2(str);
  int len2 = local_strlen1(str);
  int len3 = local_strlen2(str);
  int len4 = strlen(str);
  int len5 = strlen(str);
  SHOW(len0);
  SHOW(len1);
  SHOW(len2);
  SHOW(len3);
  SHOW(len4);
  SHOW(len5);
  return 0;
}
```
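This code calls strlen in three different ways: through a global function pointer, through a local function pointer, and directly. Below we analyze each kind of call in turn.
First, use readelf to look at the relocation tables: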
```
Relocation section '.rel.dyn' at offset 0x2a48 contains 17 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000ade0  00000017 R_ARM_RELATIVE
0000af00  00000017 R_ARM_RELATIVE
0000af0c  00000017 R_ARM_RELATIVE
0000af10  00000017 R_ARM_RELATIVE
0000af18  00000017 R_ARM_RELATIVE
0000af1c  00000017 R_ARM_RELATIVE
0000af20  00000017 R_ARM_RELATIVE
0000af24  00000017 R_ARM_RELATIVE
0000af28  00000017 R_ARM_RELATIVE
0000af30  00000017 R_ARM_RELATIVE
0000aefc  00003215 R_ARM_GLOB_DAT    00000000   __stack_chk_guard
0000af04  00003715 R_ARM_GLOB_DAT    00000000   __page_size
0000af08  00004e15 R_ARM_GLOB_DAT    00000000   strlen
0000b004  00004e02 R_ARM_ABS32       00000000   strlen
0000b008  00004e02 R_ARM_ABS32       00000000   strlen
0000af14  00006615 R_ARM_GLOB_DAT    00000000   __gnu_Unwind_Find_exid
0000af2c  00007415 R_ARM_GLOB_DAT    00000000   __cxa_call_unexpected
...
...
Relocation section '.rel.plt' at offset 0x2ad0 contains 48 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000af40  00000216 R_ARM_JUMP_SLOT   00000000   __cxa_atexit
0000af44  00000116 R_ARM_JUMP_SLOT   00000000   __cxa_finalize
0000af48  00001716 R_ARM_JUMP_SLOT   00000000   memcpy
...
0000afd4  00004c16 R_ARM_JUMP_SLOT   00000000   fgets
0000afd8  00004d16 R_ARM_JUMP_SLOT   00000000   fclose
0000afdc  00004e16 R_ARM_JUMP_SLOT   00000000   strlen
0000afe0  00004f16 R_ARM_JUMP_SLOT   00000000   strncmp
...
...
```
Across the .rel.plt and .rel.dyn sections, strlen appears four times in total. Let's note the key information for each entry, since it will be very useful later:
> .rel.dyn 0000AF08 R_ARM_GLOB_DAT
> .rel.dyn 0000B004 R_ARM_ABS32
> .rel.dyn 0000B008 R_ARM_ABS32
> .rel.plt 0000AFDC R_ARM_JUMP_SLOT
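The `Info` column in the readelf output packs the symbol index and the relocation type into one word. As a small sketch (the helper names `rel_sym` and `rel_type` are my own; they mirror the standard `ELF32_R_SYM` and `ELF32_R_TYPE` macros), decoding the four strlen entries shows that they all reference symbol index 0x4e and differ only in type:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring the ELF32_R_SYM / ELF32_R_TYPE macros.
constexpr uint32_t rel_sym(uint32_t r_info)  { return r_info >> 8; }
constexpr uint32_t rel_type(uint32_t r_info) { return r_info & 0xffu; }

// Relocation type numbers, as defined later in the article's hook code.
constexpr uint32_t kArmAbs32    = 0x02;
constexpr uint32_t kArmGlobDat  = 0x15;
constexpr uint32_t kArmJumpSlot = 0x16;

// True when every strlen entry from the readelf output decodes to
// symbol index 0x4e with the type readelf printed for it.
constexpr bool check_strlen_entries() {
  return rel_sym(0x00004e15) == 0x4e && rel_type(0x00004e15) == kArmGlobDat &&
         rel_sym(0x00004e02) == 0x4e && rel_type(0x00004e02) == kArmAbs32 &&
         rel_sym(0x00004e16) == 0x4e && rel_type(0x00004e16) == kArmJumpSlot;
}
```

This is the same comparison the hook performs later: match the symbol index first, then filter by relocation type.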
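The code calls strlen six times, so why are there only four entries? And how do the entries map to the calls? With these questions in mind, let's look at the assembly. Loading the compiled .so into IDA shows the instructions for the sample code: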
```
.text:000050BC                 EXPORT Java_com_example_allhookinone_HookUtils_elfhook
.text:000050BC Java_com_example_allhookinone_HookUtils_elfhook
.text:000050BC
.text:000050BC var_40          = -0x40
.text:000050BC var_38          = -0x38
.text:000050BC var_34          = -0x34
.text:000050BC s               = -0x2C
.text:000050BC var_28          = -0x28
.text:000050BC var_24          = -0x24
.text:000050BC var_20          = -0x20
.text:000050BC var_1C          = -0x1C
.text:000050BC var_18          = -0x18
.text:000050BC var_14          = -0x14
.text:000050BC var_10          = -0x10
.text:000050BC var_C           = -0xC
.text:000050BC
.text:000050BC                 PUSH            {R4,LR}
.text:000050BE                 SUB             SP, SP, #0x38
.text:000050C0                 STR             R0, [SP,#0x40+var_34]
.text:000050C2                 STR             R1, [SP,#0x40+var_38]
.text:000050C4                 LDR             R4, =(_GLOBAL_OFFSET_TABLE_ - 0x50CA)
.text:000050C6                 ADD             R4, PC ; _GLOBAL_OFFSET_TABLE_
.text:000050C8                 LDR             R3, =(aHelloworld - 0x50CE)
.text:000050CA                 ADD             R3, PC  ; "helloworld"
.text:000050CC                 STR             R3, [SP,#0x40+s]
.text:000050CE                 LDR             R3, =(strlen_ptr - 0xAF34)
.text:000050D0                 LDR             R3, [R4,R3] ; __imp_strlen
.text:000050D2                 STR             R3, [SP,#0x40+var_28]
.text:000050D4                 LDR             R3, =(strlen_ptr - 0xAF34)
.text:000050D6                 LDR             R3, [R4,R3] ; __imp_strlen
.text:000050D8                 STR             R3, [SP,#0x40+var_24]
.text:000050DA                 LDR             R3, =(global_strlen1_ptr - 0xAF34)
.text:000050DC                 LDR             R3, [R4,R3] ; global_strlen1
.text:000050DE                 LDR             R3, [R3]
.text:000050E0                 LDR             R2, [SP,#0x40+s]
.text:000050E2                 MOVS            R0, R2
.text:000050E4                 BLX             R3
.text:000050E6                 MOVS            R3, R0
.text:000050E8                 STR             R3, [SP,#0x40+var_20]
.text:000050EA                 LDR             R3, =(global_strlen2_ptr - 0xAF34)
.text:000050EC                 LDR             R3, [R4,R3] ; global_strlen2
.text:000050EE                 LDR             R3, [R3]
.text:000050F0                 LDR             R2, [SP,#0x40+s]
.text:000050F2                 MOVS            R0, R2
.text:000050F4                 BLX             R3
.text:000050F6                 MOVS            R3, R0
.text:000050F8                 STR             R3, [SP,#0x40+var_1C]
.text:000050FA                 LDR             R2, [SP,#0x40+s]
.text:000050FC                 LDR             R3, [SP,#0x40+var_28]
.text:000050FE                 MOVS            R0, R2
.text:00005100                 BLX             R3
.text:00005102                 MOVS            R3, R0
.text:00005104                 STR             R3, [SP,#0x40+var_18]
.text:00005106                 LDR             R2, [SP,#0x40+s]
.text:00005108                 LDR             R3, [SP,#0x40+var_24]
.text:0000510A                 MOVS            R0, R2
.text:0000510C                 BLX             R3
.text:0000510E                 MOVS            R3, R0
.text:00005110                 STR             R3, [SP,#0x40+var_14]
.text:00005112                 LDR             R3, [SP,#0x40+s]
.text:00005114                 MOVS            R0, R3  ; s
.text:00005116                 BLX             strlen
.text:0000511A                 MOVS            R3, R0
.text:0000511C                 STR             R3, [SP,#0x40+var_10]
.text:0000511E                 LDR             R3, [SP,#0x40+s]
.text:00005120                 MOVS            R0, R3  ; s
.text:00005122                 BLX             strlen
.text:00005126                 MOVS            R3, R0
  ...
  ...
.text:000051CA                 ADD             SP, SP, #0x38
.text:000051CC                 POP             {R4,PC}
.text:000051CC ; End of function Java_com_example_allhookinone_HookUtils_elfhook
```
First, pick out a few important addresses:
- _GLOBAL_OFFSET_TABLE_: 0x0000AF34
- strlen_ptr: 0x0000AF08
- __imp_strlen: 0x0000B0C8
- global_strlen1_ptr: 0x0000AF0C
- global_strlen1: 0x0000B004
- global_strlen2_ptr: 0x0000AF10
- global_strlen2: 0x0000B008
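The GOT-relative addressing in the listing can be checked by hand. The sketch below redoes the arithmetic for the Thumb instructions at 0x50C4 to 0x50D0, using the fact that a Thumb instruction reading PC sees its own address plus 4. The function names are illustrative only and do not come from the original project.

```cpp
#include <cassert>
#include <cstdint>

// In Thumb state, an instruction that reads PC sees its own address + 4.
constexpr uint32_t thumb_pc(uint32_t insn_addr) { return insn_addr + 4; }

constexpr uint32_t kGot = 0x0000AF34;  // _GLOBAL_OFFSET_TABLE_

// LDR R4, =(_GLOBAL_OFFSET_TABLE_ - 0x50CA) ; ADD R4, PC   (the ADD is at 0x50C6)
constexpr uint32_t got_base() {
  return (kGot - 0x50CA) + thumb_pc(0x50C6);
}

// Signed offset stored in the literal pool for "=(slot - 0xAF34)".
constexpr int32_t got_offset(uint32_t slot) {
  return static_cast<int32_t>(slot) - static_cast<int32_t>(kGot);
}

// LDR R3, =(slot - 0xAF34) ; LDR R3, [R4,R3]  -> address of the slot that is read
constexpr uint32_t slot_addr(uint32_t slot) {
  return got_base() + static_cast<uint32_t>(got_offset(slot));
}
```

For strlen_ptr the stored offset is -0x2C, which is why IDA describes these slots as sitting before _GLOBAL_OFFSET_TABLE_.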
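##Calling an External Function Through a Global Function Pointer
The calls through global_strlen1 and global_strlen2 correspond to the BLX instructions at 0x000050E4 and 0x000050F4. Working through the loads, R3 ends up as \*global_strlen1 and \*global_strlen2 respectively, and the addresses of global_strlen1 and global_strlen2 (0x0000B004 and 0x0000B008) are exactly the offsets of the two R_ARM_ABS32 entries in .rel.dyn. We conclude: **when an external function is called through a global function pointer, the relocation type is R_ARM_ABS32 and the entry lives in the .rel.dyn section**.
We only trace the call through global_strlen1. First we locate global_strlen1_ptr (0x0000AF0C), which lies in the .got section, just before _GLOBAL_OFFSET_TABLE_. From global_strlen1_ptr we reach 0x0000B004 (in the .data section), and from 0x0000B004 we reach the final function address. **So the Offset of an R_ARM_ABS32 entry points to the location that holds the address of the function finally called (in other words, it is a pointer to a function pointer)**. The whole lookup goes to .got first and then from .got to .data. Here is a hex dump of the relevant parts of .got and .data: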
```
...
0000AF0C  04 B0 00 00 08 B0 00 00  DC B0 00 00 B4 87 00 00
0000AF1C  F4 84 00 00 60 5B 00 00  58 5B 00 00 50 5B 00 00
0000AF2C  EC B0 00 00 FC 8C 00 00  00 00 00 00 00 00 00 00
...
0000B004  C8 B0 00 00 C8 B0 00 00
0000B014
0000B024  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
...
0000B0C8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
0000B0D8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
...
```
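Finally, the bytes at 0x0000B0C8 are all zero. At dynamic link time the linker overwrites the value at 0x0000B004 so that it holds the real address of strlen (rather than the current 0x0000B0C8, which is a little confusing).
##Calling an External Function Through a Local Function Pointer
The calls through local_strlen1 and local_strlen2 correspond to the BLX instructions at 0x00005100 and 0x0000510C. Working through the loads, R3 is \*strlen_ptr in both cases, that is, the value stored at 0x0000AF08, which is exactly the offset of the R_ARM_GLOB_DAT entry in .rel.dyn. We conclude: **when an external function is called through a local function pointer, the relocation type is R_ARM_GLOB_DAT and the entry lives in the .rel.dyn section**.
We only trace the call through local_strlen1. First we locate strlen_ptr (0x0000AF08), which lies in the .got section, just before _GLOBAL_OFFSET_TABLE_. Following strlen_ptr leads to 0x0000B0C8, the same place as in the previous analysis. **So the Offset of an R_ARM_GLOB_DAT entry also points to the location that holds the address of the function finally called (a pointer to a function pointer)**. Here is a hex dump of the relevant part of .got: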
```
0000AF08  C8 B0 00 00 04 B0 00 00  08 B0 00 00 DC B0 00 00
0000AF18  B4 87 00 00 F4 84 00 00  60 5B 00 00 58 5B 00 00
0000AF28  50 5B 00 00 EC B0 00 00  FC 8C 00 00 00 00 00 00
...
0000B0C8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
0000B0D8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
...
```
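Note the instruction at 0x000050D8, "STR R3, [SP,#0x40+var_24]": at this point the real function address has already been saved on the stack. **So even if we later modify the GOT, the value on the stack does not change, and a pointer that has already been copied this way cannot be hooked by patching the relocation target**.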
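The effect is easy to reproduce without any ELF machinery. In the sketch below a plain global variable stands in for the GOT slot, and `real_len`, `hooked_len`, and `call_after_patch` are made-up names for illustration. The copy taken before the patch keeps calling the original function, while a fresh read of the slot sees the replacement.

```cpp
#include <cassert>
#include <cstring>

using strlen_fun = int (*)(const char *);

// Stand-ins for the real strlen and for the hook; both names are illustrative.
int real_len(const char *s) { return static_cast<int>(std::strlen(s)); }
int hooked_len(const char *s) { return real_len(s) * 2; }

// Models the GOT slot that the R_ARM_GLOB_DAT relocation fills in.
strlen_fun got_slot = real_len;

struct CallResult {
  int via_local_copy;  // call through the pointer saved before the patch
  int via_slot;        // call through a fresh read of the slot
};

CallResult call_after_patch(const char *s) {
  strlen_fun local_copy = got_slot;  // like STR R3, [SP,#0x40+var_24]
  got_slot = hooked_len;             // the hook patches the slot afterwards
  CallResult r{local_copy(s), got_slot(s)};
  got_slot = real_len;               // restore so the sketch can be re-run
  return r;
}
```

With "helloworld" as input, the saved copy still reports the original length while the slot reports the doubled one.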
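##Calling an External Function Directly
Finally, the direct calls to strlen correspond to the BLX instructions at 0x00005116 and 0x00005122. Both branch to a stub in the .plt section: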
```
.plt:00002E38                 ADR             R12, 0x2E40
.plt:00002E3C                 ADD             R12, R12, #0x8000
.plt:00002E40                 LDR             PC, [R12,#(strlen_ptr_0 - 0xAE40)]! ; __imp_strlen
...
0000AFDC  C8 B0 00 00 CC B0 00 00  D0 B0 00 00 D4 B0 00 00
0000AFEC  D8 B0 00 00 DC B0 00 00  E0 B0 00 00 E4 B0 00 00
0000AFFC  E8 B0 00 00 00 00 00 00  C8 B0 00 00 C8 B0 00 00
...
```
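In the end, PC is loaded with \*strlen_ptr_0, the value stored at strlen_ptr_0 (0x0000AFDC). That address lies in the .got section, and the value stored there is again 0x0000B0C8, a familiar sight. We conclude: **when an external function is called directly, the relocation type is R_ARM_JUMP_SLOT, the entry lives in the .rel.plt section, and its Offset points to the location that holds the address of the function finally called (a pointer to a function pointer)**. The whole path goes to .plt first, then to .got, and only then to the real function address.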
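The three-instruction PLT stub is plain address arithmetic, and it can be verified against the relocation table. The sketch below assumes ARM state, where an instruction reading PC sees its own address plus 8; the helper names are mine. The LDR immediate is strlen_ptr_0 - 0xAE40 = 0xAFDC - 0xAE40 = 0x19C.

```cpp
#include <cassert>
#include <cstdint>

// In ARM state, an instruction that reads PC sees its own address + 8.
constexpr uint32_t arm_pc(uint32_t insn_addr) { return insn_addr + 8; }

// Computes the address of the GOT slot that a PLT stub loads PC from.
//   stub + 0: ADR R12, <stub + 8>
//   stub + 4: ADD R12, R12, #add_imm
//   stub + 8: LDR PC, [R12, #ldr_imm]!
constexpr uint32_t plt_slot_addr(uint32_t stub, uint32_t add_imm, uint32_t ldr_imm) {
  uint32_t ip = arm_pc(stub);  // ADR with a zero offset from PC
  ip += add_imm;               // 0x2E40 + 0x8000 = 0xAE40 for strlen@plt
  return ip + ldr_imm;         // address whose contents become the new PC
}
```

For the strlen stub at 0x2E38 this yields 0xAFDC, the Offset of the R_ARM_JUMP_SLOT entry, so patching that slot redirects every direct call.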
For this part, the output of IDA and objdump differs somewhat. Here is what objdump produces:
```
00002e38 <strlen@plt>:
    2e38:  e28fc600   add  ip, pc, #0, 12
    2e3c:  e28cca08   add  ip, ip, #8, 20  ; 0x8000
    2e40:  e5bcf19c   ldr  pc, [ip, #412]!  ; 0x19c
...
...
    afd8:  00002c50   andeq  r2, r0, r0, asr ip
    afdc:  00002c50   andeq  r2, r0, r0, asr ip
    afe0:  00002c50   andeq  r2, r0, r0, asr ip
    afe4:  00002c50   andeq  r2, r0, r0, asr ip
```
The value at afdc is 0x00002c50, and 0x00002c50 is exactly PLT[0]:
```
00002c50 <__cxa_atexit@plt-0x14>:
    2c50:  e52de004   push  {lr}    ; (str lr, [sp, #-4]!)
    2c54:  e59fe004   ldr  lr, [pc, #4]  ; 2c60 <__cxa_atexit@plt-0x4>
    2c58:  e08fe00e   add  lr, pc, lr
    2c5c:  e5bef008   ldr  pc, [lr, #8]!
    2c60:  000082d4   ldrdeq  r8, [r0], -r4
```
When the instruction at 2c5c executes, PC is loaded from 0x0000af3c, which is _GLOBAL_OFFSET_TABLE_ + 8, that is, GOT[2]. Here is what is stored at 0x0000af3c:
```
0000AF3C  00 00 00 00 28 B0 00 00  24 B0 00 00 2C B0 00 00
0000AF4C  30 B0 00 00 34 B0 00 00  38 B0 00 00 3C B0 00 00
```
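The function address in GOT[2] turns out to be 0. This is because symbol binding on Android does not support lazy binding: when the .so is loaded, the linker resolves the functions for all the slots GOT\[n\] (n>=3, the first .rel.plt Offset being 0xAF40) up front, so the resolver path through GOT\[2\] is never actually executed. On current Android, therefore, the complete lazy PLT/GOT linking sequence does not exist. My guess is that this is mainly for stability.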
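The claim that PLT[0] ends up reading GOT[2] can be checked with the same kind of arithmetic. The sketch below assumes ARM state (PC reads as instruction address plus 8) and uses the literal 0x000082d4 stored at 0x2c60; the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// In ARM state, an instruction that reads PC sees its own address + 8.
constexpr uint32_t arm_pc(uint32_t insn_addr) { return insn_addr + 8; }

// plt0 + 0x4: ldr lr, [pc, #4]  -> address of the literal that is loaded
constexpr uint32_t plt0_literal_addr(uint32_t plt0) {
  return arm_pc(plt0 + 0x4) + 4;
}

// plt0 + 0x8: add lr, pc, lr    -> lr becomes the GOT base
constexpr uint32_t plt0_got_base(uint32_t plt0, uint32_t literal) {
  return arm_pc(plt0 + 0x8) + literal;
}

// plt0 + 0xc: ldr pc, [lr, #8]! -> PC is loaded from GOT base + 8, i.e. GOT[2]
constexpr uint32_t plt0_target_slot(uint32_t plt0, uint32_t literal) {
  return plt0_got_base(plt0, literal) + 8;
}
```

With plt0 = 0x2c50 and the literal 0x82d4, the base comes out as 0xAF34 (_GLOBAL_OFFSET_TABLE_) and the slot as 0xAF3C, matching the dump above.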
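##Summary
Although IDA and objdump show somewhat different views of the PLT/GOT process, the difference has no practical effect on Android, because Android does not support lazy binding. We also reach a very important conclusion: **although R_ARM_ABS32, R_ARM_GLOB_DAT, and R_ARM_JUMP_SLOT relocations are used differently in code, the offset of each one is the address of a slot that holds a function pointer (a pointer to a function pointer)**. This is very useful for the ELF hook below.
#Parsing ELF Through the Execution View
The example in [Redirecting functions in shared ELF libraries](http://www.codeproject.com/Articles/70302/Redirecting-functions-in-shared-ELF-libraries#_Toc257815978) parses the ELF through the linking view. Compared with parsing through the execution view, the later logic is basically the same; the key is to find .dynsym, .dynstr, .rel.plt, and .rel.dyn, along with their entry counts, through the segments.
First, use the Program Header Table to find the segment of type PT_DYNAMIC. Its content corresponds to the .dynamic section and is an array of Elf32_Dyn, whose structure is: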
```
/* Dynamic structure */
typedef struct {
  Elf32_Sword  d_tag;    /* controls meaning of d_val */
  union {
    Elf32_Word  d_val;  /* Multiple meanings - see d_tag */
    Elf32_Addr  d_ptr;  /* program virtual address */
  } d_un;
} Elf32_Dyn;
```
By walking this array we can find everything we need. The mapping is:
- DT_HASH -> .hash
- DT_SYMTAB & DT_SYMENT -> .dynsym
- DT_STRTAB & DT_STRSZ -> .dynstr
- (DT_REL | DT_RELA) & (DT_RELSZ | DT_RELASZ) & (DT_RELENT | DT_RELAENT) -> .rel.dyn
- DT_JMPREL & DT_PLTRELSZ & DT_PLTREL (selects REL or RELA entries) & (DT_RELENT | DT_RELAENT) -> .rel.plt
- DT_FINI_ARRAY & DT_FINI_ARRAYSZ -> .fini_array
- DT_INIT_ARRAY & DT_INIT_ARRAYSZ -> .init_array
Here is the lookup code:
```
void getElfInfoBySegmentView(ElfInfo &info, const ElfHandle *handle){
  info.handle = handle;
  info.elf_base = (uint8_t *) handle->base;
  info.ehdr = reinterpret_cast<Elf32_Ehdr *>(info.elf_base);
  // may be wrong if the section headers were stripped or tampered with
  info.shdr = reinterpret_cast<Elf32_Shdr *>(info.elf_base + info.ehdr->e_shoff);
  info.phdr = reinterpret_cast<Elf32_Phdr *>(info.elf_base + info.ehdr->e_phoff);
  info.shstr = NULL;
  Elf32_Phdr *dynamic = NULL;
  Elf32_Word size = 0;
  getSegmentInfo(info, PT_DYNAMIC, &dynamic, &size, &info.dyn);
  if(!dynamic){
    LOGE("[-] couldn't find PT_DYNAMIC segment");
    exit(-1);
  }
  info.dynsz = size / sizeof(Elf32_Dyn);
  Elf32_Dyn *dyn = info.dyn;
  // walk every Elf32_Dyn entry and record the tables we care about
  for(int i=0; i<info.dynsz; i++, dyn++){
    switch(dyn->d_tag){
    case DT_SYMTAB:
      info.sym = reinterpret_cast<Elf32_Sym *>(info.elf_base + dyn->d_un.d_ptr);
      break;
    case DT_STRTAB:
      info.symstr = reinterpret_cast<const char *>(info.elf_base + dyn->d_un.d_ptr);
      break;
    case DT_REL:
      info.reldyn = reinterpret_cast<Elf32_Rel *>(info.elf_base + dyn->d_un.d_ptr);
      break;
    case DT_RELSZ:
      info.reldynsz = dyn->d_un.d_val / sizeof(Elf32_Rel);
      break;
    case DT_JMPREL:
      info.relplt = reinterpret_cast<Elf32_Rel *>(info.elf_base + dyn->d_un.d_ptr);
      break;
    case DT_PLTRELSZ:
      info.relpltsz = dyn->d_un.d_val / sizeof(Elf32_Rel);
      break;
    case DT_HASH: {
      // layout: nbucket, nchain, bucket[nbucket], chain[nchain]
      uint32_t *rawdata = reinterpret_cast<uint32_t *>(info.elf_base + dyn->d_un.d_ptr);
      info.nbucket = rawdata[0];
      info.nchain = rawdata[1];
      info.bucket = rawdata + 2;
      info.chain = info.bucket + info.nbucket;
      break;
    }
    }
  }
  // because .dynsym is next to .dynstr, we can calculate symsz simply
  info.symsz = ((uint32_t)info.symstr - (uint32_t)info.sym)/sizeof(Elf32_Sym);
}
```
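However, one value cannot be read directly from the PT_DYNAMIC segment: the number of entries in .dynsym. I worked around it. Because the .dynsym and .dynstr sections are adjacent, subtracting their addresses gives the total size of .dynsym, and dividing by sizeof(Elf32_Sym) gives the entry count. If you know a better way, please tell me.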
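One alternative, assuming the library carries a classic System V DT_HASH table (this does not apply to DT_GNU_HASH), is to use nchain: the ELF specification states that the number of symbol table entries equals nchain. The same table also supports a name lookup of the kind findSymByName presumably performs. The sketch below is not the project's code; `Sym32`, `dynsym_count`, and `lookup_symbol` are hypothetical names, and `Sym32` is a minimal stand-in for Elf32_Sym.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Minimal stand-in for Elf32_Sym (same field order and sizes).
struct Sym32 {
  uint32_t st_name;
  uint32_t st_value;
  uint32_t st_size;
  uint8_t  st_info;
  uint8_t  st_other;
  uint16_t st_shndx;
};

// Standard System V ELF hash function.
inline uint32_t elf_hash(const char *name) {
  uint32_t h = 0;
  for (const unsigned char *p = reinterpret_cast<const unsigned char *>(name); *p; ++p) {
    h = (h << 4) + *p;
    uint32_t g = h & 0xf0000000u;
    if (g) h ^= g >> 24;
    h &= ~g;
  }
  return h;
}

// hash points at the DT_HASH data: nbucket, nchain, bucket[nbucket], chain[nchain].
// nchain equals the number of .dynsym entries.
inline uint32_t dynsym_count(const uint32_t *hash) { return hash[1]; }

// Returns the symbol index of name, or 0 (STN_UNDEF) when it is not present.
inline uint32_t lookup_symbol(const uint32_t *hash, const Sym32 *symtab,
                              const char *strtab, const char *name) {
  const uint32_t nbucket = hash[0];
  if (nbucket == 0) return 0;
  const uint32_t *bucket = hash + 2;
  const uint32_t *chain = bucket + nbucket;
  for (uint32_t i = bucket[elf_hash(name) % nbucket]; i != 0; i = chain[i]) {
    if (std::strcmp(strtab + symtab[i].st_name, name) == 0) return i;
  }
  return 0;
}
```

With this approach the symbol count no longer depends on .dynsym and .dynstr being adjacent, which a packer is free to change.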
#ELF Hook
With the background above, writing an ELF hook is straightforward. Here is the key code:
```
#define R_ARM_ABS32 0x02
#define R_ARM_GLOB_DAT 0x15
#define R_ARM_JUMP_SLOT 0x16
int elfHook(const char *soname, const char *symbol, void *replace_func, void **old_func){
  assert(old_func);
  assert(replace_func);
  assert(symbol);
  ElfHandle* handle = openElfBySoname(soname);
  ElfInfo info;
  getElfInfoBySegmentView(info, handle);
  Elf32_Sym *sym = NULL;
  int symidx = 0;
  findSymByName(info, symbol, &sym, &symidx);
  if(!sym){
    LOGE("[-] Could not find symbol %s", symbol);
    goto fails;
  }else{
    LOGI("[+] sym %p, symidx %d.", sym, symidx);
  }
  // direct calls: patch the single R_ARM_JUMP_SLOT entry in .rel.plt
  for (int i = 0; i < info.relpltsz; i++) {
    Elf32_Rel& rel = info.relplt[i];
    if (ELF32_R_SYM(rel.r_info) == symidx && ELF32_R_TYPE(rel.r_info) == R_ARM_JUMP_SLOT) {
      void *addr = (void *) (info.elf_base + rel.r_offset);
      if (replaceFunc(addr, replace_func, old_func))
        goto fails;
      //only once
      break;
    }
  }
  // function pointers: patch every R_ARM_ABS32 / R_ARM_GLOB_DAT entry in .rel.dyn
  for (int i = 0; i < info.reldynsz; i++) {
    Elf32_Rel& rel = info.reldyn[i];
    if (ELF32_R_SYM(rel.r_info) == symidx &&
        (ELF32_R_TYPE(rel.r_info) == R_ARM_ABS32
            || ELF32_R_TYPE(rel.r_info) == R_ARM_GLOB_DAT)) {
      void *addr = (void *) (info.elf_base + rel.r_offset);
      if (replaceFunc(addr, replace_func, old_func))
        goto fails;
    }
  }
  // reached on success as well as on failure
  fails:
  closeElfBySoname(handle);
  return 0;
}
```
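The article does not show replaceFunc. A minimal sketch of what such a helper might look like follows; it is an assumption, not the project's implementation, and `replace_slot` is a hypothetical name. It follows the convention implied by elfHook above (non-zero means failure) and uses the POSIX calls `sysconf` and `mprotect` to make the page holding the slot writable before storing the new pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for replaceFunc: returns 0 on success, non-zero on failure.
// addr is the relocation target, i.e. the address of the slot holding the function pointer.
inline int replace_slot(void *addr, void *replace_func, void **old_func) {
  void **slot = static_cast<void **>(addr);
  if (*slot == replace_func) return 0;  // already hooked; keep the saved original intact

  const uintptr_t page_size = static_cast<uintptr_t>(sysconf(_SC_PAGESIZE));
  void *page = reinterpret_cast<void *>(
      reinterpret_cast<uintptr_t>(slot) & ~(page_size - 1));

  // The slot may live in a read-only mapping, so request write access first.
  // This sketch leaves the page writable afterwards.
  if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0) return -1;

  if (old_func) *old_func = *slot;  // hand the original pointer back to the caller
  *slot = replace_func;             // redirect all later loads of this slot
  return 0;
}
```

The early return matters when several relocation entries are patched for one symbol: without it, a second call on an already patched slot would save the replacement as the "original" and the hook would call itself.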
Finally, the test code:
```
typedef int (*strlen_fun)(const char *);
strlen_fun old_strlen = NULL;
size_t my_strlen(const char *str){
  LOGI("strlen was called.");
  int len = old_strlen(str);
  return len * 2;
}
strlen_fun global_strlen1 = (strlen_fun)strlen;
strlen_fun global_strlen2 = (strlen_fun)strlen;
#define SHOW(x) LOGI("%s is %d", #x, x)
extern "C" jint Java_com_example_allhookinone_HookUtils_elfhook(JNIEnv *env, jobject thiz){
  const char *str = "helloworld";
  strlen_fun local_strlen1 = (strlen_fun)strlen;
  strlen_fun local_strlen2 = (strlen_fun)strlen;
  int len0 = global_strlen1(str);
  int len1 = global_strlen2(str);
  int len2 = local_strlen1(str);
  int len3 = local_strlen2(str);
  int len4 = strlen(str);
  int len5 = strlen(str);
  LOGI("hook before:");
  SHOW(len0);
  SHOW(len1);
  SHOW(len2);
  SHOW(len3);
  SHOW(len4);
  SHOW(len5);
  elfHook("libonehook.so", "strlen", (void *)my_strlen, (void **)&old_strlen);
  len0 = global_strlen1(str);
  len1 = global_strlen2(str);
  len2 = local_strlen1(str);
  len3 = local_strlen2(str);
  len4 = strlen(str);
  len5 = strlen(str);
  LOGI("hook after:");
  SHOW(len0);
  SHOW(len1);
  SHOW(len2);
  SHOW(len3);
  SHOW(len4);
  SHOW(len5);
  return 0;
}
```
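The output shows that local_strlen1 and local_strlen2, as described above, are not affected. If the function is called again, however, the hook does take effect, because the local pointers are then loaded afresh from the patched GOT slot. I won't post the test results; try it yourself.
#GitHub Repository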
The complete code is at https://github.com/boyliang/AllHookInOne.git
Original post: Kanxue Forum @PEdiy.com http://bbs.pediy.com/showthread.php?t=193720