Invention for Systems and Methods for changing the perspective of a virtual reality user based on a position selected by the user

Invented by Jonathan A. Logan, Adam Bates, Hafiza Jameela, Jesse F. Patterson, Mark K. Berner, Eric Dorsey, David W. Chamberlin, Paul Stevens, Herbert A. Waterman, Adeia Guides Inc

The market for systems and methods for changing the perspective of a virtual reality (VR) user based on a position selected by the user is rapidly expanding. As VR technology continues to advance, the ability to customize and enhance the user experience becomes increasingly important. This article explores the growing market for systems and methods that allow users to change their perspective within the virtual world.

Virtual reality has come a long way since its inception, and it is now being used in various industries, including gaming, entertainment, education, and even healthcare. One of the key aspects of VR is the ability to immerse users in a virtual environment, making them feel as if they are physically present in a different world. However, until recently, users were limited to a fixed perspective within the virtual world, which somewhat hindered the overall experience.

The demand for systems and methods that allow users to change their perspective within VR has led to the development of innovative technologies. These technologies enable users to freely move around and explore the virtual environment from different angles and positions. This not only enhances the sense of immersion but also provides a more realistic and interactive experience.

One of the most popular methods for changing the perspective of a VR user is through the use of handheld controllers. These controllers allow users to physically move around in the real world, which is then translated into movement within the virtual world. By simply moving their hands or bodies, users can change their position and viewpoint, giving them a greater sense of control and freedom.

Another method gaining traction in the market is eye-tracking technology. This technology tracks the movement of the user’s eyes and adjusts the perspective accordingly. By simply looking in a specific direction, users can change their viewpoint within the virtual environment. This method offers a more intuitive and natural way of navigating the virtual world, as it mimics the way we naturally explore our surroundings in the real world.

Furthermore, there are systems and methods that utilize motion capture technology to change the perspective of a VR user. These systems track the user’s body movements and translate them into corresponding movements within the virtual world. By physically walking or moving their bodies, users can change their position and viewpoint, creating a more immersive and interactive experience.

The market for systems and methods that allow users to change their perspective within VR is expected to grow significantly in the coming years. As VR technology becomes more mainstream and accessible, the demand for enhanced user experiences will continue to rise. Companies are investing heavily in research and development to create more advanced and user-friendly systems that cater to this demand.

In conclusion, the market for systems and methods that enable users to change their perspective within VR is expanding rapidly. The ability to freely move around and explore the virtual environment from different angles and positions enhances the overall user experience and immersion. With advancements in technologies such as handheld controllers, eye-tracking, and motion capture, users can expect even more realistic and interactive virtual experiences in the future.

The Adeia Guides Inc invention works as follows

The system and method described in this patent are for a media guidance application (implemented, e.g., on a user device) that allows the user to choose any position within a virtual environment from which to view virtual content, and that changes the user's perspective depending on the position selected.

Background for Systems and Methods for changing the perspective of a virtual reality user based on a position selected by the user

Conventional VR media systems do not allow users to share viewing experiences. Typical systems feature headset components that isolate the user from the physical environment, and from their friends, by blocking their view. Users are unable to interact with peers or experience content together. In a virtual environment, users are also often limited to only a few perspectives. Past virtual reality media systems select these perspectives based on where virtual reality cameras are placed in the physical environment. For example, a virtual reality camera can be placed at a corner of a football pitch during a football game to create a virtual reality media asset. The user can then view the virtual media asset only from that corner camera of the football pitch; the user is unable to experience the game from anywhere else on the field or in the stands.

Systems and methods are described that overcome the limitations of conventional systems by allowing the user to choose any arbitrary position in a virtual environment and then generating a perspective for the user from that position. The systems and methods described also include adding virtual versions of the user's friends into a virtual reality media asset. These virtual reality versions can react to the virtual media asset's content based on real reactions captured from the user's friends. Imagine that a user is wearing a Google Daydream-enabled headset to access a virtual reality media asset featuring a football match. The virtual reality asset could have been created using two virtual reality cameras positioned symmetrically at opposite ends of the football stadium. The media guidance application can create a virtual point cloud using the depth data from both virtual cameras; a point cloud is an array of coordinates that represents the surface of a 3D object. This allows the user to choose any position on the virtual field from which to view the game. Imagine that the user chooses to view the match from the middle of the field. The media guidance application can generate a 360-degree user perspective from the center of the field using the point cloud (by adjusting the origin point, for example). The media guidance application can also identify an event (e.g., a touchdown) in the game and retrieve videos of the reactions of the user's friends. The media guidance application can then create a virtual avatar with the captured reactions of a friend and place it near the user, allowing the user to experience the football match alongside friends.
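As a rough illustration of the origin-adjustment idea (not the patented implementation), the sketch below assumes the merged point cloud is an N×3 NumPy array of (x, y, z) coordinates in a shared world frame; the field coordinates and function name are invented for this example. Re-centering the cloud on a user-selected position then amounts to subtracting that position from every point before rendering the 360-degree view.

```python
import numpy as np

def recenter_point_cloud(points: np.ndarray, selected_position: np.ndarray) -> np.ndarray:
    """Shift the origin of a point cloud to a user-selected viewing position.

    points: (N, 3) array of x, y, z coordinates merged from the two virtual cameras.
    selected_position: (3,) array, e.g. the center of the virtual football field.
    Returns a new (N, 3) array expressed relative to the selected position,
    from which a 360-degree perspective can be rendered.
    """
    return points - selected_position.reshape(1, 3)

# Hypothetical usage: view the match from midfield.
cloud = np.random.rand(100_000, 3) * 100.0   # stand-in for the merged depth data
midfield = np.array([50.0, 0.0, 34.0])       # assumed field coordinates
user_view_cloud = recenter_point_cloud(cloud, midfield)
```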

The media guidance application can receive, through a virtual-reality-enabled user device, an initial indication that a user has accessed a virtual reality media asset. The media guidance application can search a database, via the user device, for a record of a second user who accessed the same asset using second user equipment. The media guidance application can then search the database for a recording that captures the reaction of the second user at a certain point within the media asset. Based on the recorded reaction, the media guidance application can generate a video asset for insertion into the media asset at that point. The media guidance application may insert the video asset into the media asset via the user device and generate the combined media asset for display on the device.
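The lookup described above might be sketched as follows; the record fields, in-memory stand-ins for the database, and the tolerance window are assumptions for illustration only, not the patent's data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReactionRecord:
    friend_id: str
    media_asset_id: str
    progression_point: float   # seconds into the asset
    video_path: str

# Hypothetical in-memory stand-ins for the database lookups described above.
VIEWERS = {"vr_football_match": ["friend_a", "friend_b"]}
REACTIONS = [
    ReactionRecord("friend_a", "vr_football_match", 300.0, "friend_a_goal.mp4"),
]

def find_reaction(asset_id: str, point: float, tolerance: float = 5.0) -> Optional[ReactionRecord]:
    """Return a second user's reaction recorded near the given progression point, if any."""
    for record in REACTIONS:
        if record.media_asset_id == asset_id and record.friend_id in VIEWERS.get(asset_id, []):
            if abs(record.progression_point - point) <= tolerance:
                return record
    return None
```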

In some aspects, the media guidance application may access a virtual reality media asset using the virtual-reality-enabling device of a particular user. Virtual-reality-enabling devices include, but are not limited to, the Oculus Rift, Google Daydream, and PlayStation VR. A virtual reality media asset can be a video, photo, or game that simulates the user's presence in a virtual world. Imagine that the user wants to view a soccer match between FC Barcelona and Real Madrid that was captured by multiple virtual reality cameras. The user could be watching the match using a smartphone and a virtual reality headset, such as the Oculus Rift. The media guidance application may display the virtual reality media asset and allow the user to view the match from the perspective of a virtual camera near the field.

The media guidance application can process metadata associated with the virtual reality media asset. Such metadata may include the title, description, transmission times, creator names, length, frame details, genre, and media type of the virtual reality media asset.

The media guidance application can identify, using the metadata, a first type of event at a progression point in the virtual reality media asset. The first type of event could be an event in the content that the user is currently accessing. In the given example, the first type of event could be a goal scored by a player. The progression point can be the time (relative to the virtual reality media asset) at which the event occurs.
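One minimal way to picture this is metadata that carries a list of event markers keyed by type and time, as sketched below; the field names and event list are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EventMarker:
    event_type: str           # e.g. "goal"
    progression_point: float  # seconds relative to the start of the asset

# Hypothetical metadata for the virtual reality media asset described above.
ASSET_METADATA = {
    "title": "FC Barcelona vs. Real Madrid (VR)",
    "media_type": "virtual_reality_video",
    "length": 5400.0,
    "events": [EventMarker("kickoff", 0.0), EventMarker("goal", 300.0)],
}

def first_event_of_type(metadata: dict, event_type: str) -> Optional[EventMarker]:
    """Return the earliest event of the requested type listed in the asset metadata."""
    candidates = [e for e in metadata["events"] if e.event_type == event_type]
    return min(candidates, key=lambda e: e.progression_point) if candidates else None

print(first_event_of_type(ASSET_METADATA, "goal"))  # -> EventMarker("goal", 300.0)
```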

The media guidance application can determine a plurality of friends associated with the user. The media guidance application can identify these friends from various sources. For example, the media guidance application may access the user's profile, which may include a list of the user's friends. The media guidance application may also access the user's list of friends from his or her social media accounts on the Internet, or identify friends whom the user has called or messaged via a wireless communications device (e.g., a smartphone). Imagine that the media guidance application identifies friends A and B.

The media guidance application can search, using the first type of event and the plurality of friends associated with the user, for a video featuring a friend's reaction to an event of the first type, captured while that friend accessed the virtual reality media asset or another media asset. Reactions may be captured by any facial expression capture device that can record video, audio, or photos (e.g., a Microsoft Kinect). The media guidance application may record the time at which a reaction occurred, the event in the media asset that prompted it (e.g., a soccer goal), and the name of that media asset. This identified information can be stored as metadata in a media database (also known as a video database) along with the reactions. The media database can therefore include reactions from different users to events in virtual reality media assets. The media database can be organized according to media assets: a virtual reality media asset for a movie, for example, may include a list of events (e.g., different plot points or scenes), and videos of reactions can be stored for each event. Each video can be accompanied by metadata that lists the person who reacted to the event and that person's friends. The media guidance application can retrieve videos of reactions to the first type of event by navigating to the section of the media database associated with the virtual reality media asset and then to the subsection associated with the first type of event. The media guidance application can then search the metadata of each video to identify videos featuring the user's friends.
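A simple sketch of this asset-then-event-then-friends navigation is shown below; the nested dictionary layout, key names, and sample records are assumptions made for illustration rather than the patent's actual schema.

```python
# Hypothetical nested layout for the media/reaction database described above:
# media asset -> event type -> list of reaction videos, each tagged with who reacted.
MEDIA_DATABASE = {
    "vr_football_match": {
        "goal": [
            {"person": "friend_a", "friends_of": ["user_1"], "video": "a_goal.mp4"},
            {"person": "stranger", "friends_of": [], "video": "s_goal.mp4"},
        ],
    },
}

def reactions_from_friends(asset_id: str, event_type: str, friends: set) -> list:
    """Navigate to the asset section, then to the event-type subsection, and keep
    only reaction videos whose metadata names one of the user's friends."""
    section = MEDIA_DATABASE.get(asset_id, {}).get(event_type, [])
    return [video for video in section if video["person"] in friends]

# Example: retrieve reactions to the goal from friends A and B.
print(reactions_from_friends("vr_football_match", "goal", {"friend_a", "friend_b"}))
```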

In response to identifying a friend's video, the media guidance application can generate a video object animating that friend's reaction to the first type of event. For example, the media guidance application may identify a video in which friend A is reacting to a soccer goal that occurs in the virtual reality media asset the user is viewing. In certain embodiments, the media guidance application may detect videos of additional friends; for example, it may detect a clip of friend B reacting in real time to a goal scored while friend B was playing a video game. The media guidance application can then create a video object that animates the friend's reaction. Computer vision can be used to create a cartoon of the friend that mimics his or her facial features. Animating the reaction may also involve extracting the friend's face from the video and stitching it onto a virtual reality avatar.

In response to identifying the first type of event within the media asset, the media guidance application can insert the video object into the virtual reality media asset at the progression point. The progression point could be, for example, 5 minutes into the virtual reality media asset's playback, and the first type of event can occur either at the progression point or a few moments later. The media guidance application may decide where to place the video object (e.g., the virtual avatar of a friend) in the virtual reality media asset. If the user views the virtual reality media asset from the first camera's perspective, for example, the media guidance application can insert the video object next to the user in the virtual reality media asset. The media guidance application can also insert the video object near the place where the first event takes place; the video object could be placed, for example, near the soccer goal.

The media guidance application can determine the location of the video object within the virtual reality media asset by using depth information extracted from the virtual reality media asset (e.g., a point cloud), applying segmentation to identify the floor (e.g., a flat surface), and superimposing the depth data associated with the video object on the depth data of the media asset. The media guidance application may superimpose the video object's depth data at a point in the point cloud of the virtual reality media asset that corresponds to a flat surface (e.g., a floor) and is close to the user's viewpoint (e.g., the position of the virtual reality camera).
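As a minimal sketch of that placement step (assuming a y-up point cloud and treating the floor crudely as the band of points just above the lowest height, which is a simplification of the segmentation described above), one could pick the floor point nearest the viewer:

```python
import numpy as np

def place_on_floor(asset_cloud: np.ndarray, viewer_position: np.ndarray,
                   floor_tolerance: float = 0.05) -> np.ndarray:
    """Pick a position for the video object: a roughly flat 'floor' point near the viewer.

    asset_cloud: (N, 3) point cloud of the virtual reality media asset, y-up (assumed).
    viewer_position: (3,) position of the virtual camera / user viewpoint.
    """
    floor_height = asset_cloud[:, 1].min()
    # Crude floor segmentation: keep points within a small band above the lowest height.
    floor_points = asset_cloud[asset_cloud[:, 1] <= floor_height + floor_tolerance]
    # Anchor the avatar's depth data at the floor point closest to the viewer.
    distances = np.linalg.norm(floor_points - viewer_position, axis=1)
    return floor_points[np.argmin(distances)]
```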

The media guidance application can generate the virtual reality media asset for display with the video object embedded at the progression point. For example, the media guidance application may display the soccer match along with the virtual avatar showing the friend's reaction at the progression point. In some embodiments, the virtual-reality-enabling apparatus comprises a projector that generates a hologram displaying the video object and the media asset together at the progression point. The hologram can, for example, project the virtual reality media asset while also including the video object, using the depth information.

In some embodiments, the media guidance application can identify a plurality of videos from a subset of friends who are reacting to the first type of event. The media guidance application can retrieve multiple attributes used to classify users' reactions. For example, the media guidance application may refer to an emotions database on a remote server. The emotions database can include reactions from different users to different events and can classify, for example, a variety of facial attributes (e.g., eyebrows, eyes, nose, and mouth). The emotions database can also include a baseline face for each user (e.g., a face showing no emotion). The media guidance application on the facial expression capture device (e.g., a Microsoft Kinect) can determine changes to these face sections when the person reacts and classify the reaction accordingly. For example, widening of the eyes and mouth could be classified as shock. The plurality of attributes can also include audio: a loud scream, for example, can be classified under horror, whereas someone exclaiming "Yes!" could be classified under happiness. The media guidance application can use natural language processing (e.g., speech recognition) to classify the audio.

The media guidance application can process each video and compare the reaction in each video against the plurality of attributes. For example, the media guidance application may pull a video of a friend's reaction from the database and divide the friend's face in the video into sections to determine multiple attributes (e.g., changes in the eyes, nose, and mouth). Imagine that the friend was angered by the soccer goal. The media guidance application can compare these attributes to the friend's baseline face in the emotions database. Using computer vision, the media guidance application may determine that, compared to the baseline face, the eyebrows in the reaction are angled, the eyes are smaller, and the edges of the mouth are lower. The media guidance application may classify these changes by referring to a section of the emotions database that lists different emotions and their attributes. The media guidance application may determine that angled eyebrows, smaller eyes, and lowered mouth edges all fall under the anger section, and therefore classify the friend's reaction as anger.
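The attribute-matching idea can be illustrated with the toy classifier below. The attribute names, sign conventions, and emotion table are invented for this sketch; the patent itself does not specify these values.

```python
# Illustrative attribute deltas relative to the friend's baseline ("basic") face.
# Positive values mean the measurement increased; negative means it decreased.
EMOTION_TABLE = {
    "anger":     {"eyebrow_angle": +1, "eye_openness": -1, "mouth_corner_height": -1},
    "shock":     {"eye_openness": +1, "mouth_openness": +1},
    "happiness": {"mouth_corner_height": +1},
}

def classify_reaction(deltas: dict) -> str:
    """Match the signs of measured facial-attribute changes against the emotion table
    and return the emotion whose expected pattern agrees with the most attributes."""
    def agreement(expected: dict) -> int:
        return sum(1 for attr, sign in expected.items()
                   if attr in deltas and deltas[attr] * sign > 0)
    return max(EMOTION_TABLE, key=lambda emotion: agreement(EMOTION_TABLE[emotion]))

# Friend A: angled brows, narrowed eyes, lowered mouth corners -> classified as anger.
print(classify_reaction({"eyebrow_angle": 0.4, "eye_openness": -0.3, "mouth_corner_height": -0.2}))
```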

The media guidance application can then score each video based on how many attributes appear in the reaction in each video. For example, the media guidance application may identify several friends who reacted to a particular event; imagine that it identifies friends A and B. The media guidance application may determine that the eyebrows, eyes, nose, and mouth of friend A have all changed compared to his baseline face, while friend B's response to the first event was the same as his baseline face (e.g., no change in expression). The media guidance application can also score the amount of change compared to the baseline face. For example, the media guidance application may determine the percentage change in a friend's eyes, mouth, and nose based on the difference in pixels between the friend's baseline face and the reaction, or may use vectors to quantify how much more angled the friend's eyebrows, eyes, nose, and mouth are. The score can be as simple as the amount of change (e.g., a percentage, a fraction, or a 1-10 scale).

The media guidance application can then choose, as the selected video, the one of the plurality of videos associated with the highest score. The video with the highest score represents the most significant deviation from the person's normal face, indicating, for example, that the person was astonished by the event. The media guidance application can therefore select that video of the reaction as the selected video.
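A pixel-difference version of this scoring and selection might look like the sketch below; the grayscale-image comparison, the change threshold, and the sample arrays are assumptions for illustration, not the patent's scoring formula.

```python
import numpy as np

def deviation_score(baseline_face: np.ndarray, reaction_face: np.ndarray) -> float:
    """Score a reaction as the fraction of pixels that differ from the baseline face.

    Both inputs are assumed to be aligned grayscale images of the same shape.
    """
    changed = np.abs(reaction_face.astype(float) - baseline_face.astype(float)) > 10.0
    return float(changed.mean())  # 0.0 = identical to baseline, 1.0 = every pixel changed

def select_strongest_reaction(videos: dict) -> str:
    """Return the friend whose reaction deviates most from their baseline face."""
    return max(videos, key=lambda friend: deviation_score(*videos[friend]))

# Hypothetical usage with stand-in image arrays for friends A and B.
baseline = np.zeros((64, 64), dtype=np.uint8)
videos = {
    "friend_a": (baseline, np.full((64, 64), 80, dtype=np.uint8)),  # large change
    "friend_b": (baseline, baseline.copy()),                        # no change
}
print(select_strongest_reaction(videos))  # -> "friend_a"
```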

In some embodiments, the media guidance application can access a camera associated with each friend. The camera can be any device capable of capturing video, audio, and images; in this example, the purpose of the camera (also referred to as a facial emotion capture device) is to capture the friends' reactions. The media guidance application can refer to the media database and identify the cameras associated with the user's friends. The media guidance application on each camera can access the metadata of the media assets consumed by its user.

The media guidance application can detect a certain event within a media asset while a particular friend is consuming that media asset. For example, the media guidance application can determine, using the media database, that a plurality of friends are consuming media assets (e.g., videos, audio, or games). The media guidance application on each camera uploads the users' reactions to the media database. The media guidance application on the user device can therefore recognize a first event in a first media asset (e.g., a goal in a virtual reality soccer environment). The media guidance application can detect and classify the first event using computer vision; for example, it may classify the point cloud of a ball in a net as a goal. The media guidance application may also extract the audio from the first media asset and use natural language processing (e.g., speech recognition) to identify words such as "goal" in the audio.
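The keyword-spotting part of that audio path could be sketched as below. The transcript is assumed to come from a separate speech-recognition step not shown here, and the keyword-to-event mapping is invented for illustration.

```python
from typing import Optional

# Illustrative mapping from spoken keywords to detected events.
EVENT_KEYWORDS = {"goal": "goal", "touchdown": "touchdown", "injury": "player injury"}

def detect_event_from_audio(transcript: str) -> Optional[str]:
    """Scan a speech-recognition transcript of the media asset's audio for
    keywords that indicate an event, e.g. the word 'goal'."""
    words = transcript.lower().split()
    for keyword, event in EVENT_KEYWORDS.items():
        if keyword in words:
            return event
    return None

# 'transcript' would come from a speech-recognition service in practice.
print(detect_event_from_audio("and that is an incredible goal by Messi"))  # -> "goal"
```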

The media guidance application may determine that the first event falls under a second type of event. For example, the media guidance application may consult an event database that classifies events and determine that the first event, e.g., a soccer goal, is listed under "scoring a sports point." The media guidance application on the camera can capture a short video of the friend consuming the media asset at the time of the event; for example, it may detect the event in the first asset and record the friend's reaction. The media guidance application on the camera can then store the first short video in conjunction with the second event type in the media database. For example, the media guidance application may upload the video of the friend's reaction to the media database and store it in the "scoring a sports point" section.
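A minimal sketch of this classify-and-file step follows; the category table, database layout, and file names are illustrative assumptions rather than the patent's actual event database.

```python
# Illustrative event database mapping detected events to broader categories.
EVENT_CATEGORIES = {
    "goal": "scoring a sports point",
    "touchdown": "scoring a sports point",
    "player injury": "sports injuries",
}

# Media database keyed by category; reaction clips are appended as they are captured.
MEDIA_DATABASE_BY_CATEGORY = {}

def store_reaction(event: str, friend_id: str, clip_path: str) -> None:
    """Look up the event's category and file the captured reaction clip under it."""
    category = EVENT_CATEGORIES.get(event, "uncategorized")
    MEDIA_DATABASE_BY_CATEGORY.setdefault(category, []).append(
        {"friend": friend_id, "event": event, "clip": clip_path}
    )

store_reaction("goal", "friend_a", "friend_a_goal_reaction.mp4")
```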

In some embodiments, the media guidance application may score all of the reactions from the user's friends to determine which reaction is the strongest. For example, the media guidance application may compare the faces of the user's friends against their baseline faces and calculate a numerical deviation (e.g., a difference in pixels, a distance between edge-detected lines, or a difference in vectors). Based on the face that has the largest numerical deviation, the media guidance application can display that reaction in the virtual reality media asset as the "best reaction."

In some embodiments, the media guidance application can detect a second event in a second media asset while a second friend is consuming that media asset. For example, the media guidance application may refer to the media database to determine whether the plurality of friends are consuming media assets (e.g., videos, audio, or games). The media guidance application on each camera uploads the users' reactions to the media database, so the media guidance application on the user device can recognize a second event that occurs within the second media asset. The media guidance application on the device can determine that the second event falls under a particular type of event. For example, the media guidance application may consult an event database that classifies events and determine that the second event, e.g., a player injury, is listed under "sports injuries." The media guidance application on the camera can capture a short second video of the friend consuming the second asset while the second event is taking place; for example, it may detect the event in the second asset and record the friend's reaction. The media guidance application on the camera can then store the second short video in conjunction with the first event type in the media database. For example, the media guidance application may upload the video of the friend's reaction to the media database and store it in the "sports injuries" section.

Click here to view the patent on Google Patents.

