Article reprint source: Silicon Star
Image source: Generated by Unbounded AI
Ever since Zuckerberg popularized the concept of the metaverse in 2021, it has felt like a "most familiar stranger" to people: seemingly close at hand one moment and far away the next. To put it bluntly, most people feel that its presence is weak and that it has fallen short of what was imagined.
After all, when the Metaverse comes up, the image in most netizens' minds is still this:
Zuckerberg's famous Horizon Worlds avatar selfie, which was widely ridiculed by the public. Image from Facebook
But just one year later, an hour-long conversation held in the metaverse appeared out of nowhere and set social networks ablaze. This time it was the netizens' turn to be dumbfounded, and they all exclaimed: how could it have evolved this far without anyone noticing?!
Recently, Lex Fridman, an MIT scientist, artificial intelligence expert and well-known podcast host, conducted an in-depth interview with Meta CEO Mark Zuckerberg on augmented reality, AI and large language models.
Unlike in the past, this time the two did not meet in the physical world. Instead, they held the conversation in the Metaverse as ultra-realistic 3D avatars, separated by half of the United States. As of press time, this interview, titled "First Interview in Metaverse", has attracted nearly 13 million views on X (formerly Twitter).
Image from Lex Fridman's YouTube channel
At the beginning of the video, Lex Fridman's full-body high-definition digital avatar appears in a white futuristic space. He said: "Although Mark and I are hundreds of miles apart in the real world, because our images are modeled with photo-accurate 3D models and presented to each other with spatial audio, we are like communicating face to face in the same room. This technology is amazing! I think this will be the way for humans to connect with each other more deeply and meaningfully on the Internet in the future."
Both of them wore Meta Quest Pro VR headsets during the interview. Perhaps because what he saw was so lifelike, Lex was as curious as a child. Besides grinning, he kept exclaiming: "Where am I? Mark, is this really you? This is great! You don't mind me being too close to you, right?"
Of course, the viewing distance can be adjusted. You can also use the controller to move the light source and find the lighting angle that flatters your face the most.
Zuckerberg said that, unlike the cartoon-style avatars in Horizon Worlds or an ordinary video feed, creating these new photorealistic Codec Avatars requires extensively scanning the user's face across many expressions and movements. The scan data is then modeled and compressed into an encoded version.
The headset's real-time eye and facial tracking capabilities then capture the user's expressions, map them onto a 3D virtual avatar, and "send an encoded version of what you should look like" to the people on the other end of the virtual world or conference call, presenting them with a realistic-looking you.
Because the avatar data in the metaverse is transmitted in this encoded form, the result is not only realistic but also uses less bandwidth than sending a complete immersive video.
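To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (the size of the expression code, the update rate, the video bitrate) is an illustrative assumption, not a figure published by Meta or mentioned in the interview, and the function names are hypothetical.

```python
def encoded_stream_kbps(code_floats: int, bytes_per_float: int,
                        updates_per_second: int) -> float:
    """Bitrate (kbps) of a stream that sends one compact expression code per update."""
    bits_per_update = code_floats * bytes_per_float * 8
    return bits_per_update * updates_per_second / 1000


def savings_ratio(video_kbps: float, encoded_kbps: float) -> float:
    """How many times larger the video stream is than the encoded stream."""
    return video_kbps / encoded_kbps


# All values below are assumptions chosen only for illustration.
CODE_FLOATS = 256          # assumed length of the per-frame expression code
BYTES_PER_FLOAT = 4        # 32-bit floats
UPDATES_PER_SECOND = 90    # assumed headset tracking/refresh rate
VIDEO_KBPS = 25_000        # assumed bitrate of a high-quality immersive video

encoded_kbps = encoded_stream_kbps(CODE_FLOATS, BYTES_PER_FLOAT, UPDATES_PER_SECOND)
print(f"encoded avatar stream: {encoded_kbps:.1f} kbps")
print(f"video stream is about {savings_ratio(VIDEO_KBPS, encoded_kbps):.0f}x larger")
```

The sketch only captures the general idea: the receiver already holds the person's scanned model, so each update needs to carry a short code describing the current expression rather than every pixel of a video frame.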
Judging from the interview video, the avatars reproduce real-life detail strikingly well. In Lex's words, "It captures everything, including the flaws on the face. For me, these flaws are the subtleties of human beings, these freckles, wrinkles, asymmetrical cheeks, the expressions at the corners of the eyes when smiling... They make me enjoy it more and realize that perfection is not the key to immersion."
"Eyes are indeed important," Zuckerberg said. "Many studies have shown that people communicate mainly through expressions and body language, rather than language. Meta has been working hard to capture these expressions with its classic virtual system, bringing a special sense of presence through a photo-realistic experience." He believes that this also touches on the core vision of virtual reality and augmented reality, which is to make people feel they are together no matter where they are in the world.
Imagine entering a conference room in the future where some people are physically present while others appear in this realistic virtual form, superimposed on the physical environment through mixed reality technology. You think a colleague is sitting at the table and discussing the project with you, but in fact he or she is thousands of miles away. When you are alone in a foreign place and miss home in the dead of night, your family is right beside you as soon as you put on the headset. And every smile and every subtle facial expression of the person opposite can be reproduced in three dimensions with almost no delay or loss.
During the experience, Lex couldn't help but exclaim that it was wonderful. "My heart was beating fast at the moment. The intimacy of the conversation can be achieved remotely like this. I felt the emotions and felt that you and I really existed. This is one of the most incredible experiences in my life. It's really eye-opening!"
However, it is not yet easy for ordinary people to achieve the effect seen in the video. Before the interview, both of them flew to Pittsburgh and spent several hours being scanned in detail by the Meta Codec project team, using the most advanced software and hardware currently available.
At the Connect conference that just ended a few days ago, Zuckerberg said that his biggest vision is to make those high-end technologies accessible to the people and change the lives of most people. So what is his vision for the future this time?
Zuckerberg said that this is just the beginning. By scanning a small number of people and collecting enough facial expression data, the team can explore how far the whole process can be simplified, so that it runs more smoothly when rolled out to a large number of people. Although the technology is not yet fully ready for market, it will continue to be adjusted and optimized over the next few years so that it can be applied to work scenarios and solve productivity problems as soon as possible.
Meta's goal is a very fast face scan done with a mobile phone: wave the phone in front of your face, say a few words, and make some expressions, and within two or three minutes you get quality similar to that of the current conversation. How to preserve the experience while making the process more efficient remains one of the challenges ahead.
Lex believes that the new Meta Codec Avatars have clearly crossed the "uncanny valley" of the past. The Zuckerberg on screen looks exactly like the real one. Then he smiled and asked tentatively: "So we don't need arms and legs, right?"
"No, no, we will still solve these," Zuckerberg hurriedly explained. "In fact, there is a problem. High-precision full-body scanning requires a lot of computing power, both for the headset's sensors and for its rendering capabilities. So we may consider reproducing the body with lower fidelity, such as retaining large movements, but the face is what needs the most precise analysis. After all, moving the eyebrows by one millimeter can convey completely different emotions. In comparison, moving the arm by one inch is not that important."
The avatars in Horizon Worlds have been described as having the "uncanny valley effect" because of their blank expressions and missing lower bodies. Image from Meta
The two later talked about the newly released Quest 3, augmented reality, artificial intelligence in the metaverse, and the future of mankind. Lex joked that this interview with Zuckerberg was "a meeting in the metaverse between the two most stiff-faced people on the Internet." He felt that in this virtual space his expressions were easier to capture and his emotions came across more realistically: "I really hope more people can come and experience it in person!" Zuckerberg also said that he was looking forward to the reaction of netizens after watching this episode of the podcast. His only worry was whether the audience could really feel the impact through a 2D screen.
Judging from the comments, netizens clearly did feel it, and were astonished as well.
Even though the past few months have brought wave after wave of dramatic updates from Google, Microsoft, and especially OpenAI's ChatGPT, this interview in the metaverse still exceeded what people thought was possible. MrBeast, one of the world's top YouTubers, left a comment under the video saying, "How did we get from pixel virtual people to here? What did I miss!" Others posted, "This is one of the most incredible things I have ever seen."
Some people say that Meta will always have a place among the leaders of technological innovation. After being questioned, ridiculed and even criticized for so long, the Metaverse has finally evolved into such a powerful 3.0 form.
"I saw the future," Lex said.
"I believe the next year is going to be pretty crazy," Zuckerberg said.
Regardless of how Meta develops in the future, this first-ever "real person" conversation across 100 miles in the Metaverse is a milestone. We are lucky to be born in this era and to witness the impossible become possible.
