The Real Picture: Big Data in Virtual Reality

Imagine you are walking through a lush tropical garden, with many different kinds of fruits, plants, and animals. All around you, there is a wide range of colors, textures, sizes, and smells. Some things are moving and changing position quickly, some are swaying gently in the breeze, and some are stationary; some are higher up, others are on the ground, and still others are visible only if you change position. The scene is very complex because of the many sizes, textures, positions, and colors it contains. Yet, thanks to the innate human capacity to discern patterns while immersed in our 3D world, your brain is capable of quickly forming a rich understanding of your surroundings.

But what if you now wanted to visually convey a rich representation of this garden to another person who is not there to experience it with you? A 2D sketch of the garden is better than nothing, but will clearly not come close to representing the richness of the scene. A photo will do a better job but will still miss many of the scene’s dimensions. For example, your photo will not convey the fact that behind the large leaf in front of you there is a beautiful, brightly colored bird that is about to take off, whereas you, when immersed in 3D, can just move your body slightly and peek behind the leaf.

The drive to re-create that immersive experience is why virtual reality (VR) has exploded. Revenues from VR products (both hardware and software) are projected to increase from $90 million in 2014 to $5.2 billion by 2018. That same year, the number of active VR users is forecast to reach 171 million.

Using VR to immerse another viewer in your 3D surroundings, so that they arrive at a rich understanding of the many dimensions of the tropical garden, is extraordinary. But VR also has the potential to make sense of something that’s currently a challenge for corporations, organizations, and even governments around the world: Big Data. How does that tropical garden relate to abstract data? Just as the garden can be visually “understood” by the human brain only through an appreciation of the many dimensions that define it, abstract data can also be visually “understood” if our brain can grasp the potentially many dimensions inherent in that data.

For example, a hospital records a vast array of metrics on the many patients who are admitted for a certain disease over a period of a few months. The hospital may record things such as age, gender, the results of blood tests at admission, which medicines were administered, etc. These are all “dimensions” of the data. Now let’s say that the hospital also records the length of stay for each patient, and wants to understand why some patients are cured quickly and others are not. In complex situations such as this, the answer is usually the result of the interplay of many dimensions. For example, it may turn out that the patients with bad outcomes are the ones who 1) are older, 2) had a high score in blood test X, 3) had a high score in blood test Y, 4) were given medication Z, and 5) are women. Knowing this might help the hospital prevent long stays by providing extra care to the patients who belong to the group most susceptible to staying a long time.
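To make the idea of an "interplay of many dimensions" concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the records are hand-made (not real patient data), and the field names, the age cutoff of 65, and the "high score" threshold of 7.0 are hypothetical. It simply checks whether patients who meet all five conditions at once stay longer, on average, than everyone else.

```python
from statistics import mean

# Hypothetical, hand-made records for illustration only; not real patient data.
# Field names and values are assumptions, not taken from any hospital system.
patients = [
    {"age": 72, "sex": "F", "test_x": 9.1, "test_y": 8.4, "med_z": True,  "stay_days": 14},
    {"age": 68, "sex": "F", "test_x": 8.7, "test_y": 7.9, "med_z": True,  "stay_days": 12},
    {"age": 70, "sex": "M", "test_x": 8.9, "test_y": 8.1, "med_z": True,  "stay_days": 6},
    {"age": 45, "sex": "F", "test_x": 9.0, "test_y": 8.0, "med_z": True,  "stay_days": 5},
    {"age": 74, "sex": "F", "test_x": 3.2, "test_y": 8.2, "med_z": True,  "stay_days": 4},
    {"age": 66, "sex": "F", "test_x": 8.8, "test_y": 8.3, "med_z": False, "stay_days": 5},
    {"age": 39, "sex": "M", "test_x": 2.5, "test_y": 3.1, "med_z": False, "stay_days": 3},
]


def in_high_risk_group(p, min_age=65, high_score=7.0):
    """True only when all five conditions from the example hold together.

    The thresholds are assumed values chosen for this sketch.
    """
    return (
        p["age"] >= min_age
        and p["test_x"] >= high_score
        and p["test_y"] >= high_score
        and p["med_z"]
        and p["sex"] == "F"
    )


def mean_stay_by_group(records):
    """Return (mean stay inside the group, mean stay outside it)."""
    inside = [p["stay_days"] for p in records if in_high_risk_group(p)]
    outside = [p["stay_days"] for p in records if not in_high_risk_group(p)]
    return mean(inside), mean(outside)


inside_mean, outside_mean = mean_stay_by_group(patients)
print(f"in group: {inside_mean:.1f} days, outside: {outside_mean:.1f} days")
```

In this toy data, no single condition separates the long stays from the short ones; only the combination of all five does. That is the kind of pattern that is hard to spot one dimension at a time.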

The data is all there, but how would the hospital find out the multi-dimensional answer to this problem? And how would the team that finds the answer convey it to a wider audience so as to make the organization implement the changes needed to solve the problem?

Thanks to growing computing power, Machine Learning can increasingly produce the answer. But that’s only the first step. Next, the hospital’s Machine Learning specialist (the “expert”) will need to visualize the intermediate steps and the results of the algorithm in order to make sure it is appropriate to the situation.

Then, when conveying the result of the analysis to the wider organization, the expert will need to provide a visualization so that the non-experts (administrators, business managers, etc.) will understand the results and buy into the conclusions. Just showing the output of complicated mathematical equations without any visual explanation will make the result look like a black box and will reduce buy-in from the wider organization. Finally, a non-expert often wants to find answers on his or her own without relying on the expert, and one of the few ways to do that is through visualization.

Effective visualization is thus an important evolution of Big Data for both the expert and the non-expert. But how would we be able to visualize the hospital problem in its many dimensions? We are used to working with 2D graphs that can convey at most a few dimensions (e.g., a scatterplot might show position in X and Y, and maybe add color, for three dimensions in total). But just as a 2D sketch of the tropical garden was inadequate in conveying the richness of the garden, those 2D graphs are clearly inadequate in visualizing the complexity of most problems.

And a 3D graph rendered in 2D, just like the photo of the tropical garden, will miss out on a lot of the dimensions. What if I want to quickly walk through the landscape of abstract data points and effortlessly see what is behind a certain group of points? The 3D graph rendered in 2D will not let me move my head and peek behind the cluster of points. What if I want to invite someone else to take a “tour” of my data?

As in the case of visualizing the garden, VR has the potential to solve the problem. We can assign each of the different dimensions to a different attribute of our graph in VR. If we do this properly, we can then be immersed in 3D, see the many dimensions in all their richness, and use our innate pattern recognition capability to see the answer. Going from 2D to being immersed in 3D does not just give you access to one more dimension; it allows you to see many more dimensions. And we can invite other people to tour the data with us.
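As a rough illustration of what "assigning dimensions to attributes" could look like, here is a minimal Python sketch. It is not the API of any particular VR product: the `Glyph` structure, the `to_glyph` function, the field names, and the value ranges used for scaling are all hypothetical. Each of six data dimensions from the hospital example is mapped to one visual attribute of a point in a 3D scene.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One data point as it might be drawn in a 3D scene (hypothetical structure)."""
    x: float
    y: float
    z: float
    size: float
    color: str
    shape: str


def scale(value, lo, hi):
    """Linearly rescale value from [lo, hi] to [0, 1], clamping out-of-range input."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def to_glyph(record):
    """Assign six data dimensions to six visual attributes of one point.

    The ranges (age 0-100, scores 0-10, stay 0-30 days) are assumed for this sketch.
    """
    return Glyph(
        x=scale(record["age"], 0, 100),                    # age -> X position
        y=scale(record["test_x"], 0, 10),                  # blood test X -> Y position
        z=scale(record["test_y"], 0, 10),                  # blood test Y -> Z position
        size=0.5 + scale(record["stay_days"], 0, 30),      # length of stay -> size
        color="red" if record["med_z"] else "blue",        # medication Z -> color
        shape="sphere" if record["sex"] == "F" else "cube",  # sex -> shape
    )


# A single made-up record, for illustration only.
record = {"age": 72, "sex": "F", "test_x": 9.0, "test_y": 8.0, "med_z": True, "stay_days": 15}
glyph = to_glyph(record)
print(glyph)
```

A flat scatterplot would run out of channels after position and color. In an immersive scene, position in three axes, size, color, and shape can all be read at once, and the viewer can walk around points that would otherwise hide one another.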

As we look at the positive ways Big Data can impact the world, we have to recognize that most data are multidimensional because the world is inherently complex. At the same time, the most innovative answers come only when great minds are able to collaborate and see the entire picture by walking together through a complex, lush dataset. VR has opened that new world, and many exciting answers await.

Michael Amori is Co-Founder and CEO at Virtualitics.