Lessons Learned Optimizing Game Performance with Unity 3D
Author: Amir Fassihi
Reposted from: http://gamerboom.com/archives/76214#
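Smooth gameplay comes from a smooth frame rate, and for our upcoming action platformer Shadow Blade, reaching 60 frames per second on standard iPhone and iPad devices was a key goal.
Below are the things we had to consider during an intense optimization period to raise the game's performance and reach the target frame rate.
Once the basic game features were in place, it was time to make sure performance met the target. Our main measurement tools were the built-in Unity profiler and the Xcode profiling tools. Being able to profile code running on the device with the Unity profiler proved invaluable.
Here is what we learned from measuring, tweaking and re-measuring until the target devices held 60 fps.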
shadow blade(from deadmage.com)
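1 – Going head to head with the garbage collector (GC)
Coming from a C/C++ game programming background, we were not used to the specific behavior of a garbage collector. Having unused memory cleaned up automatically is nice at first, but you soon find the profiler showing regular CPU spikes while the garbage collector reclaims garbage memory. This is an especially big problem on mobile devices. Tracking down memory allocations and eliminating them became our top priority, and these are the main steps we took:
1. Remove all string concatenation from the code, since it leaves a lot of garbage for the GC.
2. Replace “foreach” loops with simple “for” loops. For some reason, every iteration of every “foreach” loop generated 24 bytes of garbage memory, so a simple loop iterating 10 times left 240 bytes behind.
3. Change how we check game object tags: use “if (go.CompareTag("Enemy"))” instead of “if (go.tag == "Enemy")”. Reading the tag property allocates and copies extra memory, which is very costly inside an inner loop.
4. Object pools are great. We built and used pools for all dynamic game objects, so nothing is allocated dynamically while a level is running and everything is recycled back into its pool when no longer needed.
5. Avoid LINQ commands, since they tend to allocate intermediate buffers that quickly become garbage.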
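2 – Watch the communication overhead between high-level scripts and native C++ engine code
All gameplay code in a Unity3D game is script code; in our project it was C# running on the Mono runtime. Any access to engine data requires a call from the high-level scripting language into native engine code. That call has its own overhead, and reducing such calls in game code was our second priority.
1. Moving objects around in the scene requires calls from script code into engine code. We therefore cached an object's transform changes in gameplay code during a frame and sent a single request to the engine, which reduced the call overhead. We used the same pattern in other similar places, not only for moving and rotating objects.
2. Caching component references locally removes the need to fetch them with “GetComponent” on a game object every time, which is another example of a call into native engine code.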
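3 – Physics
1. Set the physics simulation timestep to the minimum possible. In our project we could not set it lower than 16 milliseconds.
2. Reduce calls to the character controller's move command. Moving the character controller happens synchronously, and every call can be expensive. We cached the movement requests for each frame and applied them only once.
3. Modify code so it does not rely on “ControllerColliderHit” callbacks. These callbacks turned out not to be handled very quickly.
4. On weaker devices, replace physics cloth with a skinned mesh. Cloth parameters also matter a great deal for performance, and it pays to spend time finding the balance between aesthetics and performance.
5. Keep ragdolls out of the physics simulation and enable them only when necessary.
6. Assess the “OnInside” callbacks of triggers carefully. In our project we tried to model the logic without relying on them.
7. Use layers instead of tags. Both are easy to assign to objects and to use for querying specific objects, but for collision logic layers have a clear performance advantage: quicker physics calculations and fewer unwanted memory allocations.
8. Never use mesh colliders.
9. Minimize collision detection requests such as ray casts and sphere checks, and get as much information as possible out of each check.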
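4 – Make the AI code faster
We use AI-driven enemies to block our ninja hero and fight him. The following applies to AI performance:
1. AI logic such as visibility checks generates a lot of physics queries. Running the AI update loop at a lower rate than the graphics update loop reduces CPU load.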
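5 – The best performance comes from no code at all!
When nothing is happening, performance is good. That was our guiding principle for turning off everything not needed at the moment. Our game is a side-scrolling action game, so many dynamic level objects can be switched off when they are not visible.
1. Turn off distant enemy AI using a custom level-of-detail scheme.
2. Turn off moving platforms, hazards and their physics colliders when they are far away.
3. Use Unity's built-in “animation culling” system to turn off animations on objects that are not being rendered.
4. Apply the same disabling mechanism to all in-level particle systems.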
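6 – Callbacks! What about empty callbacks?
Reduce Unity callbacks as much as possible. Even empty callbacks carry a performance cost. There is no reason to keep empty callbacks in the code base, though they sometimes get left behind during heavy rewriting and refactoring.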
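7 – Artists to the rescue
While programmers tear their hair out trying to squeeze out a few more frames per second, artists can always work magic.
1. Sharing materials between game objects and marking the objects static in Unity lets them be batched together, and the resulting drop in draw calls is critical for good mobile performance.
2. Texture atlases are especially useful for UI elements.
3. Square, power-of-two textures with proper compression are a must.
4. Our artists removed all far-background meshes and converted them to simple 2D planes.
5. Light maps are very valuable.
6. Our artists removed extra vertices over several passes.
7. Using proper texture mip levels is a good idea (gamerboom note: especially for good frame rates on devices with different resolutions).
8. Combining meshes is another way artists can help performance.
9. Our animator tried to share animations between different characters.
10. Finding the aesthetics/performance balance took many iterations on the particle effects. Reducing the number of emitters and the amount of transparency were major challenges.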
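8 – Reduce memory usage
High memory use hurts performance, but in our project the more critical problem was repeated crashes on iPods from exceeding the memory limit. Textures were the biggest memory consumers in our game.
1. Use different texture sizes for different devices, especially for UI and large background textures. Shadow Blade ships as a universal build, but loads different assets once the device size and resolution are detected at startup.
2. Make sure unused assets are not loaded into memory. We found out rather late in the project that any asset referenced only by a prefab instance is fully loaded into memory even if it is never instantiated.
3. Stripping extra polygons from meshes also helps.
4. We had to rework the lifecycle management of some assets, for example tuning when main menu assets, end-of-level assets or game music are loaded and unloaded.
5. Each level needs its own object pool tailored to its dynamic object requirements and optimized for minimal memory. Pools can be flexible and hold many objects during development, but should be made specific once the game's object requirements are known.
6. Keeping sound files compressed in memory is also necessary.
Improving game performance is a long and challenging process. The knowledge shared by the game development community and the excellent profiling tools provided by Unity were a great help in reaching Shadow Blade's performance targets. (Translated and compiled by gamerboom.com; reposting without the copyright notice is not permitted. To repost, contact gamerboom.)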
“0 – 60 fps in 14 days!” What we learned trying to optimize our game using Unity3D.
by Amir Fassihi
The following blog post, unless otherwise noted, was written by a member of Gamasutra’s community.
The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.
Smooth gameplay is built on a smooth frame rate, and hitting the 60 frames per second target on standard iPhone and iPad devices was a significant goal during the development of our upcoming action platformer, Shadow Blade (http://shadowblade.deadmage.com).
The following is a summary of the things we had to consider and change in the game to increase performance and reach the target frame rate during the intense optimization sessions.
Once the basic game functionalities were in place, it was time to make sure the game performance would meet its target. Our main tool for measuring the performance was the built-in Unity profiler and the Xcode profiling tools. Being able to profile the running code on the device using the Unity profiler proved to be an invaluable feature.
So here is our summary of what we learned from this intense journey of measuring, tweaking and re-measuring, which paid off in the end with a fixed 60 fps on our target devices.
1 – Head to head with a ferocious monster called the Garbage Collector.
Coming from a C/C++ game programming background, we were not used to the specific behaviors of the garbage collector. Making sure your unused memory is cleaned up automatically for you is nice at first but soon the reality kicks in and you witness regular spikes in your profiler showing the CPU load caused by the garbage collector doing what it is supposed to do, collecting the garbage memory. This proved to be a huge issue specifically for the mobile devices. Chasing down memory allocations and trying to eliminate them became priority number one and here are some of the main actions we took:
Remove any string concatenation in code since this leaves a lot of garbage for the GC to collect.
Replace the “foreach” loops with simple “for” loops. For some reason, every iteration of every “foreach” loop generated 24 bytes of garbage memory. A simple loop iterating 10 times left 240 bytes of memory ready to be collected, which was just unacceptable.
Replace the way we checked for game object tags. Instead of “if (go.tag == "Enemy")” we used “if (go.CompareTag("Enemy"))”. Calling the tag property on an object allocates and copies additional memory, and this is really bad if such a check resides in an inner loop.
Object pools are great. We made and used pools for all dynamic game objects so that nothing is ever allocated dynamically in the middle of a level, and everything is recycled back to the pool when not needed.
Stop using LINQ commands, since they tended to allocate intermediate buffers: food for the GC.
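As a rough illustration of the first three points, here is a minimal sketch of a Unity script that counts enemies every frame without producing per-frame garbage. The class name `EnemyScanner`, its fields and the label text are hypothetical and not taken from Shadow Blade; the sketch also assumes a tag named "Enemy" is defined in the project, since `CompareTag` reports an error for undefined tags.

```csharp
using System.Collections.Generic;
using System.Text;
using UnityEngine;

// Hypothetical example script; names are illustrative, not from the article.
public class EnemyScanner : MonoBehaviour
{
    // Assumed to be filled elsewhere, e.g. by a spawner.
    public List<GameObject> candidates = new List<GameObject>();

    // One builder reused for the label instead of "Enemies: " + count.
    private readonly StringBuilder label = new StringBuilder(32);
    private int lastCount = -1;

    public string LabelText { get; private set; }

    void Update()
    {
        int count = 0;

        // Indexed "for" loop: no enumerator is involved.
        for (int i = 0; i < candidates.Count; i++)
        {
            // CompareTag avoids the string copy made by reading go.tag.
            if (candidates[i].CompareTag("Enemy"))
            {
                count++;
            }
        }

        // Rebuild the text only when the value changes, so the remaining
        // string allocation does not happen every frame.
        if (count != lastCount)
        {
            lastCount = count;
            label.Length = 0; // clears the builder and keeps its buffer
            label.Append("Enemies: ").Append(count);
            LabelText = label.ToString();
        }
    }
}
```

The builder reduces garbage rather than removing it entirely: `ToString()` still allocates one string, which is why the sketch only rebuilds the label on change.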
2 – Careful with the communication overhead between high level scripts and native engine C++ code.
All gameplay code written for a game using Unity3D is script code which in our case was C# that was handled using the Mono runtime. Any requirements to communicate with the engine data would require a call into the native engine code from the high level scripting language. This of course has its own overhead and trying to reduce such calls in game code was the second priority.
Moving objects around in the scene requires calls from the script code to the engine code. We ended up caching the transformation changes for an object during a frame in the gameplay code and sending the request to the engine only once, which reduced the call overhead. We used this pattern in other similar places too, not only for moving and rotating objects.
Caching references to components locally eliminates the need to fetch a component reference with the “GetComponent” method on a game object every time, which is another example of a call into the native engine code.
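The two ideas above could be sketched roughly as follows. This is an assumption about how such a component might look, not the code used in Shadow Blade; `BatchedMover`, `RequestMove` and `SetVisible` are made-up names.

```csharp
using UnityEngine;

// Hypothetical example; class and member names are not from the article.
public class BatchedMover : MonoBehaviour
{
    private Transform cachedTransform; // fetched once in Awake
    private Renderer cachedRenderer;   // fetched once; may be null
    private Vector3 pendingMove;       // accumulated on the script side

    void Awake()
    {
        cachedTransform = transform;
        cachedRenderer = GetComponent<Renderer>();
    }

    // Gameplay code may call this many times per frame.
    // Nothing crosses into the engine here.
    public void RequestMove(Vector3 delta)
    {
        pendingMove += delta;
    }

    // Uses the cached reference instead of calling GetComponent again.
    public void SetVisible(bool visible)
    {
        if (cachedRenderer != null)
        {
            cachedRenderer.enabled = visible;
        }
    }

    // LateUpdate runs after every Update, so all requests made during
    // this frame are applied with a single engine call.
    void LateUpdate()
    {
        if (pendingMove == Vector3.zero)
        {
            return;
        }
        cachedTransform.Translate(pendingMove, Space.World);
        pendingMove = Vector3.zero;
    }
}
```

Applying the accumulated movement in `LateUpdate` is one possible design choice; it keeps the result independent of the order in which other scripts' `Update` methods run.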
3 – Physics, Physics and more Physics.
Setting the physics simulation timestep to the minimum possible. For our case we could not set it lower than 16 milliseconds.
Reducing calls to character controller move commands. Moving the character controller happens synchronously and every call can have a significant performance cost. What we did was to cache the movement requests per frame and apply them only once.
Modifying code to not rely on the “ControllerColliderHit” callbacks. It turned out that these callbacks are not handled very quickly.
Replacing the physics cloth with a skinned mesh for the weaker devices. The cloth parameters also have a large effect on performance, and it pays off to spend some time finding the right balance between aesthetics and performance.
Ragdolls were disabled so that they were not part of the physics simulation loop and only enabled when necessary.
“OnInside” callbacks of the triggers need to be assessed carefully and in our case we tried to model the logic without relying on them if possible.
Layers instead of tags! Layers and tags can be assigned to objects easily and used for querying specific objects, however, layers have a definite advantage at least performance wise when it comes to working with collision logic. Quicker physics calculations and less unwanted newly allocated memory are the basic reasons.
Mesh colliders are definitely a no-no.
Minimize collision detection requests like ray casts and sphere checks in general, and try to get as much information as possible from each check.
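A minimal sketch of three of these points (a single character controller move per frame, layer-based queries, and getting the full hit information from one ray cast) might look like this. `PlayerMotor` and its members are hypothetical names, and the sketch assumes a layer called "Ground" exists in the project.

```csharp
using UnityEngine;

// Hypothetical example. Assumes a CharacterController on the same
// GameObject and a layer named "Ground" defined in the project.
[RequireComponent(typeof(CharacterController))]
public class PlayerMotor : MonoBehaviour
{
    private CharacterController controller;
    private Vector3 pendingMotion;
    private int groundMask;

    void Awake()
    {
        controller = GetComponent<CharacterController>();
        groundMask = LayerMask.GetMask("Ground"); // bitmask built once
    }

    // Gameplay code adds movement requests here during the frame.
    public void AddMotion(Vector3 motion)
    {
        pendingMotion += motion;
    }

    // One ray cast limited to a single layer. The RaycastHit carries the
    // point, normal, distance and collider, so callers do not need a
    // second query for the same information.
    public bool TryGetGround(float maxDistance, out RaycastHit hit)
    {
        return Physics.Raycast(transform.position, Vector3.down,
                               out hit, maxDistance, groundMask);
    }

    // A single synchronous Move call per frame, after all requests.
    void LateUpdate()
    {
        if (pendingMotion != Vector3.zero)
        {
            controller.Move(pendingMotion);
            pendingMotion = Vector3.zero;
        }
    }
}
```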
4 – Let’s make the AI code faster!
We use artificial intelligence for the enemies that try to block our main ninja hero and fight with him. The following topics needed to be covered regarding AI performance issues:
A lot of physical queries are generated from AI logic like visibility checks. The AI update loop could be set to something much lower than the graphics update loop to reduce CPU load.
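One way to run AI less often than rendering is a small accumulator that reports when the next AI update is due. The helper below is an engine-agnostic sketch under that assumption; `FixedRateTicker` is a made-up name, and the article does not say how Shadow Blade implemented its own throttling.

```csharp
using System;

// Hypothetical helper; not part of Unity or of the article's code.
public sealed class FixedRateTicker
{
    private readonly float interval;
    private float accumulated;

    public FixedRateTicker(float ticksPerSecond)
    {
        if (ticksPerSecond <= 0f)
        {
            throw new ArgumentOutOfRangeException("ticksPerSecond");
        }
        interval = 1f / ticksPerSecond;
    }

    // Call once per rendered frame with that frame's delta time.
    // Returns true when an AI update is due.
    public bool Advance(float deltaTime)
    {
        accumulated += deltaTime;
        if (accumulated < interval)
        {
            return false;
        }
        accumulated -= interval;
        // After a long stall, drop the backlog so updates do not burst.
        if (accumulated > interval)
        {
            accumulated = 0f;
        }
        return true;
    }
}
```

In a Unity script this could be used from `Update` as `if (ticker.Advance(Time.deltaTime)) { UpdateAI(); }`, where `UpdateAI` stands for whatever visibility checks and decisions the enemy performs.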
5 – Best performance is achieved from no code at ALL!
When nothing happens, performance is good. This was the base philosophy for us to try and turn anything not necessary at the moment off. Our game is a side scroller action game and so a lot of the dynamic level objects can be turned off when they are not visible in the scene.
Enemy AI was turned off when far away using a custom level of detail scheme.
Moving platforms and hazards and their physics colliders were turned off when far away.
Unity's built-in “animation culling” system was used to turn off animations on objects not being rendered.
The same disabling mechanism was used for all in-level particle systems.
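The article does not show its custom level of detail scheme, so the following is only a sketch of one simple way to switch things off by distance. `DistanceToggle`, its fields and the default range are assumptions for illustration; the arrays are expected to be assigned in the Inspector.

```csharp
using UnityEngine;

// Hypothetical example; the article's own scheme is not shown.
// Attach to an object that stays enabled, and do not list this
// component in its own "behaviours" array.
public class DistanceToggle : MonoBehaviour
{
    public Transform viewer;        // assumed: the player or the camera
    public Behaviour[] behaviours;  // e.g. AI scripts to switch off
    public Collider[] colliders;    // physics colliders to switch off
    public float activeRange = 30f; // illustrative value, tune per level

    private bool isActive = true;

    void Update()
    {
        if (viewer == null)
        {
            return;
        }

        // Squared distance avoids a square root every frame.
        float sqrDistance = (viewer.position - transform.position).sqrMagnitude;
        bool shouldBeActive = sqrDistance <= activeRange * activeRange;

        // Touch the components only when the state actually changes.
        if (shouldBeActive == isActive)
        {
            return;
        }
        isActive = shouldBeActive;

        for (int i = 0; i < behaviours.Length; i++)
        {
            behaviours[i].enabled = isActive;
        }
        for (int i = 0; i < colliders.Length; i++)
        {
            colliders[i].enabled = isActive;
        }
    }
}
```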
6 – Callback! How about empty callbacks?
The Unity callbacks needed to be reduced as much as possible. Even the empty callbacks had performance penalties. There is no reason for having empty callbacks, but they sometimes get left in the code base in between a lot of rewriting and refactoring.
7 – The mighty Artists to the rescue.
Artists can always magically help out the hair-pulling programmer trying to go for a few more frames per second.
Sharing materials for game objects and making them static in Unity causes them to be batched together and the resulting reduced draw calls are critical for good mobile performance.
Texture atlases helped a lot especially for the UI elements.
Square, power-of-two textures with proper compression were a must.
Being a side-scroller enabled our artists to remove all far background meshes and convert them to simple 2D planes instead.
Light maps were highly valuable.
Our artists removed extra vertices during a few passes.
Proper texture mip levels were a good decision especially for having a good frame rate on devices with different resolutions.
Combining meshes was another performance friendly action by the artists.
Our animator tried to share animations between different characters if it was possible.
A lot of iterations on the particles were necessary to find the aesthetic/performance balance. Reducing number of emitters and trying to reduce transparency requirements were among the major challenges.
8 – The memory usage needs to be reduced, now!
Using a lot of memory of course has negative effects on performance, but in our case the much more critical problem was a lot of crashes on iPods due to exceeding memory limits. The biggest memory consumers in our game were the textures.
Different texture sizes were used for different devices, especially textures used in UI and large backgrounds. Shadow Blade uses a universal build, but different assets get loaded once the device size and resolution are detected at startup.
We needed to make sure un-used assets were not loaded in memory. We had to find out a little late in the project that any asset that was only referenced by an instance of a prefab and never instantiated was fully loaded in memory.
Stripping out extra polygons from meshes helped.
We needed to re-architect the lifecycle management of some assets a few times, for example by tweaking the load/unload times for the main menu assets, end-of-level assets and game music.
Each level needed to have its specific object pool tailored to its dynamic object requirements and optimized for the least memory needs. Object pools can be flexible and contain a lot of objects during development, however, they need to be specific once the game object requirements are known.
Keeping the sound files compressed in memory was necessary.
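To make the per-level object pool idea concrete, here is a minimal, engine-agnostic sketch of a pool that is filled once with a fixed number of objects and never allocates afterwards. `FixedPool`, `Rent` and `Return` are hypothetical names, not the API used in Shadow Blade.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-capacity pool; names are illustrative only.
public sealed class FixedPool<T> where T : class
{
    private readonly Stack<T> free;

    // All objects are created up front, e.g. while the level loads.
    public FixedPool(int capacity, Func<T> factory)
    {
        free = new Stack<T>(capacity);
        for (int i = 0; i < capacity; i++)
        {
            free.Push(factory());
        }
    }

    public int Available
    {
        get { return free.Count; }
    }

    // Returns null when the pool is exhausted instead of allocating
    // a new object in the middle of a level.
    public T Rent()
    {
        return free.Count > 0 ? free.Pop() : null;
    }

    public void Return(T item)
    {
        if (item == null)
        {
            throw new ArgumentNullException("item");
        }
        free.Push(item);
    }
}
```

In a Unity project the factory would typically instantiate a prefab and deactivate it, and the capacity would be chosen per level once the number of simultaneously live objects is known. Returning null on exhaustion is a deliberate choice here: it surfaces an undersized pool during testing rather than hiding it behind a runtime allocation.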
Game performance enhancement is a long and challenging journey and we had a fun time experiencing a small part of this voyage. The vast amount of knowledge shared by the game development community and very good profiling tools provided by Unity were what made us reach our performance targets for Shadow Blade.(source:gamasutra)