VR Series - Oculus Best Practices: 2. Binocular Vision, Stereoscopic Imaging, and Depth Cues
- The brain uses the differences between your eyes' viewpoints to perceive depth.
- Don't neglect monocular depth cues, such as texture and lighting.
- The most comfortable range of depths for a user to look at in the Rift is between 0.75 and 3.5 meters (in Unity, 1 unit = 1 meter).
- Set the distance between the virtual cameras to the distance between the user's pupils, as provided by the OVR config tool.
- Make sure the images in each eye correspond and fuse properly; effects that appear in only one eye, or that differ significantly between the eyes, look bad.
Basics
Binocular vision describes the way in which we see two views of the world simultaneously: the view from each eye is slightly different, and our brain combines them into a single three-dimensional stereoscopic image, an experience known as stereopsis. The difference between what the left eye sees and what the right eye sees generates binocular disparity. Stereopsis occurs whether we are viewing the physical world from our two eyes' different viewpoints, or looking at two flat pictures with appropriate differences (disparity) between them.
The Oculus Rift presents two images, one to each eye, generated by two virtual cameras separated by a short distance. Some terminology is in order: the distance between our two eyes is called the interpupillary distance (IPD), and the distance between the two rendering cameras that capture the virtual environment is called the inter-camera distance (ICD). Although IPD can vary from about 52 mm to 78 mm, the average IPD (based on a survey of approximately 4,000 U.S. Army soldiers) is about 63.5 mm, the same as the Rift's interaxial distance (IAD), the distance between the centers of the Rift's lenses (as of this revision of this guide).
Monocular depth cues
Stereopsis is just one of many depth cues our brains process. Most of the others are monocular; that is, they convey depth even when viewed by only one eye, or when presented as a flat image seen by both eyes. For VR, motion parallax due to head movement does not require stereopsis to perceive, but it is extremely important for conveying depth and providing a comfortable experience to the user.
Other important depth cues include: curvilinear perspective (straight lines converge as they extend into the distance), relative scale (objects appear smaller the farther away they are), occlusion (closer objects block our view of more distant objects), aerial perspective (distant objects appear fainter than close ones due to the refractive properties of the atmosphere), texture gradients (repeating patterns become more densely packed as they recede), and lighting (highlights and shadows help us perceive the shape and position of objects). Current-generation computer-generated content already leverages many of these depth cues, but we mention them because it is easy to neglect their importance in light of the novelty of stereoscopic 3D.
Comfortable Viewing Distances Inside the Rift
Two issues are of primary importance to understanding eye comfort when the eyes are fixating on (i.e., looking at) an object: accommodative demand and vergence demand. Accommodative demand refers to how the eyes must adjust the shape of their lenses to bring a given depth plane into focus (a process known as accommodation). Vergence demand refers to the degree to which the eyes must rotate inward so their lines of sight intersect at a particular depth plane. In the real world, the two are strongly correlated; so much so that we have what is known as the accommodation-convergence reflex: the degree of convergence of your eyes influences the accommodation of your lenses, and vice versa.
The Rift, like any other stereoscopic 3D technology (e.g., 3D movies), creates an unusual situation that decouples accommodative and vergence demands: accommodative demand is fixed, but vergence demand can change. This is because the actual images that create the stereoscopic 3D are always presented on a screen that remains at the same optical distance, yet the different images presented to each eye still require the eyes to rotate so that their lines of sight converge on objects at a variety of depth planes.
Research has looked into how far the accommodative and vergence demands can differ from each other before the situation becomes uncomfortable for the viewer.[1] The current optics of the DK2 Rift are equivalent to looking at a screen approximately 1.3 meters away. (Manufacturing tolerances and the power of the Rift's lenses mean this number is only a rough approximation.) To prevent eyestrain, objects that you know the user will fixate on for an extended period of time (e.g., a menu, or an object of interest in the environment) should be rendered between approximately 0.75 and 3.5 meters away.
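To make those numbers concrete, here is a small illustrative sketch (not from the guide; only the 63.5 mm average IPD and the approximate 1.3 m DK2 focal plane are taken from the text above). It computes the vergence angle and the vergence-accommodation mismatch, in diopters, for a few fixation distances:

```csharp
using System;

// Illustrative sketch: quantifies the vergence-accommodation mismatch
// described above. The DK2's optics put the accommodation plane at
// roughly 1.3 m, while vergence follows the rendered depth of the
// fixated object.
class VergenceAccommodationDemo
{
    const double IpdMeters = 0.0635;     // average IPD quoted in the guide
    const double FocalPlaneMeters = 1.3; // approximate DK2 focal distance

    // Angle (degrees) the eyes must converge to fixate at `depth` meters.
    static double VergenceDegrees(double depth) =>
        2.0 * Math.Atan(IpdMeters / (2.0 * depth)) * 180.0 / Math.PI;

    // Mismatch between vergence and accommodation, in diopters (1/m).
    static double ConflictDiopters(double depth) =>
        Math.Abs(1.0 / depth - 1.0 / FocalPlaneMeters);

    static void Main()
    {
        foreach (double d in new[] { 0.5, 0.75, 1.3, 2.5, 3.5, 10.0 })
            Console.WriteLine($"{d,5:F2} m: vergence {VergenceDegrees(d),5:F2} deg, " +
                              $"conflict {ConflictDiopters(d):F2} D");
        // Within the recommended 0.75-3.5 m band the conflict stays below
        // roughly 0.6 D; it grows quickly for very near objects.
    }
}
```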
Obviously, a complete virtual environment requires rendering some objects outside this optimally comfortable range. As long as users are not required to fixate on those objects for extended periods, they are of little concern. When programming in Unity, 1 unit corresponds to approximately 1 meter in the real world, so objects of focus should be placed 0.75 to 3.5 distance units away.
As part of our ongoing research and development, future incarnations of the Rift will inevitably improve their optics and widen the range of comfortable viewing distances. No matter how this range changes, however, 2.5 meters should remain a comfortable distance, making it a safe, future-proof choice for fixed items the user will have to focus on for an extended time, such as menus or GUIs.
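As a concrete illustration, the following minimal Unity sketch places a long-fixation item such as a menu at the 2.5 m default and clamps any requested distance into the comfortable band. The component and field names (ComfortableMenuPlacer, centerEye) are hypothetical, and it assumes the 1 unit = 1 meter convention noted above:

```csharp
using UnityEngine;

// Hypothetical helper: keeps a menu or HUD panel at a comfortable,
// future-proof viewing distance in front of the user.
public class ComfortableMenuPlacer : MonoBehaviour
{
    const float MinComfortable = 0.75f;  // meters (= Unity units)
    const float MaxComfortable = 3.5f;
    const float DefaultDistance = 2.5f;  // safe for long fixation (menus, GUIs)

    public Transform centerEye;          // e.g., the rig's center-eye anchor

    public void PlaceInFront(float requestedDistance = DefaultDistance)
    {
        // Clamp the requested depth into the comfortable 0.75-3.5 m band.
        float d = Mathf.Clamp(requestedDistance, MinComfortable, MaxComfortable);
        transform.position = centerEye.position + centerEye.forward * d;
        // Orient the panel so it faces the viewer.
        transform.rotation = Quaternion.LookRotation(transform.position - centerEye.position);
    }
}
```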
Anecdotally, some Rift users have remarked on the unusualness of seeing all objects in the world in focus when the lenses of their eyes are accommodated to the depth plane of the virtual screen. This can potentially lead to frustration or eyestrain in a minority of users, as their eyes may have difficulty focusing appropriately.
Some developers have found that depth-of-field effects can be both immersive and comfortable in situations where you know what the user is looking at. For example, you might artificially blur the background behind a menu the user brings up, or blur objects that fall outside the depth plane of an object held up for examination. This not only simulates the natural functioning of vision in the real world, it can also keep salient objects outside the user's focus from distracting the eyes.
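A sketch of one way to do this in Unity, assuming the optional Post Processing Stack v2 package with a DepthOfField override on a volume; the component and field names are placeholders, and this is one possible approach rather than the guide's prescribed method:

```csharp
using UnityEngine;
using UnityEngine.Rendering.PostProcessing; // Post Processing Stack v2

// Sketch: when a menu opens, pull the depth-of-field focus to the menu's
// depth plane so everything behind it falls out of focus.
public class MenuFocusBlur : MonoBehaviour
{
    public PostProcessVolume volume;  // volume holding a DepthOfField override
    public Transform centerEye;
    public Transform menu;

    public void SetFocusOnMenu()
    {
        if (volume.profile.TryGetSettings(out DepthOfField dof))
        {
            // Focus at the menu's distance; the background blurs naturally.
            dof.focusDistance.value = Vector3.Distance(centerEye.position, menu.position);
            dof.active = true;
        }
    }
}
```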
Unfortunately, we have no control over a user who chooses to behave in an unreasonable, abnormal, or unforeseeable manner; someone in VR might choose to stand with their eyes inches away from an object and stare at it all day. Although we know this can lead to eyestrain, drastic measures to prevent such anomalous cases, such as using collision detection to stop users from getting that close to objects, would only hurt the overall user experience. Your responsibility as a developer is to avoid requiring the user to put themselves into circumstances we know are sub-optimal.
Effects of Inter-Camera Distance (ICD)
Changing the inter-camera distance, the distance between the two rendering cameras, can affect users in important ways. If the ICD is increased, it creates an experience known as hyperstereo, in which depth is exaggerated; if it is decreased, depth flattens, a state known as hypostereo. Changing the ICD has two further effects on the user. First, it changes the degree to which the eyes must converge to look at a given object: as you increase the ICD, users have to converge their eyes more to look at the same object, which can lead to eyestrain. Second, it can alter the user's sense of their own size inside the virtual environment. The latter is discussed further under User and Environment Scale in the Content Creation section.
Set the ICD to the user's actual IPD to achieve veridical scale and depth in the virtual environment. If you apply a scaling effect, make sure it is applied to the entire head model, so that head movements still accurately reflect the user's real-world perceptual experience, and apply the same factor to the distance-related guidelines as well.
[Figure] The inter-camera distance (ICD) between the left and right scene cameras (left in the figure) must be proportional to the user's inter-pupillary distance (IPD; right). Any scaling factor applied to the ICD must also be applied to the entire head model and to the distance-related guidelines provided throughout this guide.
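A minimal sketch of that rule in Unity; the rig wrapper, eye anchors, and offset values below are hypothetical illustrations, not part of any SDK:

```csharp
using UnityEngine;

// Sketch: applies a single world-scale factor to the ICD *and* the rest
// of the head model, per the guideline above. A factor above 1 yields
// hyperstereo (exaggerated depth); below 1, hypostereo (flattened depth).
// Assumed to sit on the eye-anchor parent whose local position encodes
// the neck-to-eye offset of the head model.
public class StereoScale : MonoBehaviour
{
    public float userIpdMeters = 0.0635f; // e.g., from the OVR config tool
    public Transform leftEye, rightEye;   // the rig's eye anchors
    public Vector3 neckToEyeOffset = new Vector3(0f, 0.075f, 0.08f); // example values

    public void ApplyScale(float worldScale)
    {
        float icd = userIpdMeters * worldScale;  // scaled inter-camera distance
        leftEye.localPosition  = new Vector3(-icd / 2f, 0f, 0f);
        rightEye.localPosition = new Vector3(+icd / 2f, 0f, 0f);
        // The head model must scale by the same factor, or head movements
        // will feel inconsistent with the stereo depth.
        transform.localPosition = neckToEyeOffset * worldScale;
    }
}
```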
Potential Issues with Fusing Two Images
We often face situations in the real world where each eye gets a very different viewpoint, and we generally have little problem with it. Peeking around a corner with one eye works in VR just as well as it does in real life. In fact, the eyes' different viewpoints can even be beneficial: say you're a special agent (in real life or in VR) trying to stay hidden in some tall grass. Your eyes' different viewpoints allow you to look "through" the grass to monitor your surroundings, as if the grass weren't there in front of you. Doing the same in a video game on a 2D screen, however, leaves the world behind each blade of grass hidden from view.
Still, VR (like any other stereoscopic imagery) can give rise to some potentially unusual situations that users find annoying. For instance, rendering effects (such as light distortion, particle effects, or light bloom) should appear in both eyes and with correct disparity. Failing to do so can make the effects appear to flicker or shimmer (when something shows up in only one eye) or to float at the wrong depth (if the disparity is off, or if a post-processing effect is not rendered at the contextual depth of the object it should be affecting, for example a specular shading pass). It is important to ensure that the images in the two eyes do not differ beyond the slightly different viewing positions inherent to binocular disparity.
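As a rough sanity check for such effects: with parallel-projection stereo cameras, a point at depth z should appear with a horizontal screen-space disparity of roughly f · ICD / z, where f is the focal length in pixels. An illustrative calculation with assumed values (not from the guide):

```csharp
using System;

// Illustrative check: expected horizontal disparity (in pixels) for a
// point at depth z with parallel stereo cameras. Effects rendered with
// the wrong disparity will appear to float at the wrong depth.
class DisparityCheck
{
    static double DisparityPixels(double focalPx, double icdMeters, double depthMeters) =>
        focalPx * icdMeters / depthMeters;

    static void Main()
    {
        double focalPx = 600.0;  // assumed per-eye focal length in pixels
        double icd = 0.0635;     // inter-camera distance in meters
        foreach (double z in new[] { 0.75, 1.3, 2.5, 3.5 })
            Console.WriteLine($"z = {z:F2} m -> disparity ~ {DisparityPixels(focalPx, icd, z):F1} px");
    }
}
```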
Fusion problems are less likely in a complex 3D environment, but it can still be important to ensure that the user's eyes receive enough information for the brain to know how to fuse and interpret the image properly. The lines and edges that make up a 3D scene are generally sufficient; however, be wary of wide swaths of repeating patterns, which can cause people to fuse the two eyes' images differently than intended. Be aware also that optical illusions of depth (such as the "hollow mask illusion," in which concave surfaces appear convex) can sometimes lead to misperceptions, particularly in situations where monocular depth cues are sparse.