User-generated content is king – from school concerts and sporting events, to tourist hotspots and art galleries, to citizen journalism and simply sharing our lives.
Consumers are now tied to the device in their pockets. They use it not only to communicate instantly with anyone around the world, but also to capture the moments that mean most to them.
Mobile is massive. In China, smartphones are overtaking TV, with the average adult spending 2 hours and 39 minutes per day on a mobile device outside of work. In 2013, 53% of consumers globally shared videos through mobile, and 82% of all internet traffic is forecast to be audio and video in 2018.
To keep up, manufacturers have innovated in video and image capture and given users a significant amount of control over how they capture content. Users can zoom in, switch cameras and focus on specific objects as they travel around the screen. But the audio remains static, with little control over how it is captured. Take it or leave it.
Dynamic audio in user-generated content
For decades, video capture has allowed users to focus and zoom in on the subject or object of a video to make it the star of the show. It’s intuitive and second nature for many, if not all, consumers today. But in audio capture, everything is given the same priority – background noise is just as powerful as what you want to focus on.
Picture this. A child is about to perform at their first-ever school concert. They have practiced for months, ready for their big moment on stage. The proud parent is in the audience, ready to capture this moment forever. So, they get the best seat, take out their smartphone and zoom in. Their child does brilliantly and receives a standing ovation, and the parent has a memory that will last a lifetime.
Until they play it back. The images are crystal clear, but they can barely hear their child play. The audio is tinny, and the person coughing next to them and the couple behind them, shuffling to their seats after arriving late, overpower their kid’s star performance. This perfect memory is ruined by bad audio.
This is where Nokia OZO Audio steps in: transforming audio capture and bringing the quality of sound up to the best of today’s video technologies. Smartphone and camera devices with OZO Audio give consumers advanced focus and zoom capabilities. These allow them to capture sound from a specific direction, suppress noise without distortion, and dynamically adjust the audio to the zoomed area of the video.
A key feature of the OZO Audio technology portfolio is Audio Focus, which enables users to focus on the sounds that matter most. For smartphones and cameras that include three mics or more, Audio Focus gives people the ability to select and prioritize the sound they want, removing distracting background noise. Audio Focus makes switching from front video recording to selfie recording simple and effective, and it enables the user to adjust audio focus to a specific part of the screen, as well as maintain audio focus on moving people or objects.
Audio Focus does this through directional controls that allow users to adjust azimuth, elevation, sector width and sector height. All parameters can be changed dynamically while recording.
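To make the four directional controls concrete, here is a minimal sketch in Python. It is not the OZO Audio API: the class name FocusSector, its field names, the degree-based angle convention and the contains check are all assumptions made for illustration. It only shows how a focus sector defined by azimuth, elevation, sector width and sector height could decide whether a sound source lies inside the focused region.

```python
from dataclasses import dataclass


def wrap_degrees(angle: float) -> float:
    """Map any angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


@dataclass
class FocusSector:
    """Hypothetical focus region; names and conventions are illustrative only."""
    azimuth: float        # centre of the sector in degrees, 0 = straight ahead
    elevation: float      # centre of the sector in degrees, 0 = horizon
    sector_width: float   # full horizontal extent in degrees
    sector_height: float  # full vertical extent in degrees

    def contains(self, source_azimuth: float, source_elevation: float) -> bool:
        """Return True if a source direction falls inside the sector."""
        # Wrap the azimuth difference so a sector centred near 180 degrees
        # still matches sources just past -180 degrees.
        d_az = abs(wrap_degrees(source_azimuth - self.azimuth))
        d_el = abs(source_elevation - self.elevation)
        return d_az <= self.sector_width / 2 and d_el <= self.sector_height / 2


# A sector pointed at the stage: 60 degrees wide, 40 degrees high.
sector = FocusSector(azimuth=0.0, elevation=0.0,
                     sector_width=60.0, sector_height=40.0)
performer_in_focus = sector.contains(10.0, 5.0)   # slightly right of centre
cougher_in_focus = sector.contains(90.0, 0.0)     # person in the next seat

# Parameters are plain attributes, so they can be changed at any time,
# mirroring the idea of adjusting the focus while recording.
sector.azimuth = 170.0
behind_in_focus = sector.contains(-175.0, 0.0)    # 15 degrees away across the wrap
```

A real implementation would use the sector to steer signal processing rather than to classify sources, but the same four parameters would define the region.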
Audio Focus isn’t just about audio capture. Focus audio playback lets viewers select and adjust the audio focus during playback, directing it to a specific part of the screen. With a simple touch user interface, the viewer can create a personalized visual and audio experience.
This is how Audio Focus is augmenting innovation in audio for user-generated content.
Zooming in on the star of the show
Building on these capabilities, the OZO Audio portfolio also features Audio Zoom, an intelligent audio zooming capability that allows the user to dynamically adjust audio to the area of zoomed video. As the picture zooms in on the subject, so can the audio. Think of the same example of a child performing at a school concert. With Audio Zoom, you can zoom the video on your child and the audio will follow, highlighting them over others.
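One way to picture audio following the video zoom is to narrow the audio pickup angle to match the camera's field of view as the zoom factor increases. The sketch below is an assumption about how such a mapping could work, not a description of how Audio Zoom is implemented. The function name and the 70-degree base field of view are hypothetical. The formula is the standard relation between focal length and field of view, where zooming by a factor z multiplies the focal length by z.

```python
import math


def zoomed_field_of_view(base_fov_deg: float, zoom: float) -> float:
    """Return the horizontal field of view in degrees at a given zoom factor.

    Hypothetical helper: assumes zoom scales the focal length, so the
    tangent of the half-angle shrinks by the zoom factor.
    """
    if not 0.0 < base_fov_deg < 180.0:
        raise ValueError("base_fov_deg must be between 0 and 180 degrees")
    if zoom < 1.0:
        raise ValueError("zoom must be at least 1.0")
    half_angle = math.radians(base_fov_deg) / 2.0
    return math.degrees(2.0 * math.atan(math.tan(half_angle) / zoom))


# Assumed camera with a 70-degree horizontal field of view at 1x.
# The audio sector width is set equal to the visible field of view,
# so the pickup angle narrows as the picture zooms in.
sector_widths = {zoom: zoomed_field_of_view(70.0, zoom) for zoom in (1.0, 2.0, 4.0)}
```

Under these assumptions, the resulting width could be used as the sector width of the audio focus, so that sound from outside the visible frame is given less priority as the user zooms.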
But this content means nothing if it can’t be shared. OZO Audio supports universal playback and sharing: using standard formats such as AAC means that this high-definition, immersive content can be shared on almost any platform or device.
Implementing a new generation of audio capture
But why should phone manufacturers care about Nokia OZO Audio? While other technologies would mean a dramatic re-invention of the device, the OZO Audio portfolio has been designed to work in the current generation of smartphones and cameras.
Two mics can capture spatial audio, but three or more are needed to enable Audio Focus. With three mics, users can capture sounds in a 360-degree environment, and machine learning tracks a subject as it moves across the screen.
OZO Audio is providing solutions that consumers care about, giving them more control over audio than ever before.
Creating a new generation of storytellers
Today, we are all storytellers. That’s why consumers care about the video and images they take, and why so much innovation has been geared to the camera. But they also care about audio, and they are getting tired of the disparity.
The device market is a crowded place. Giving a more complete user experience is an important differentiator, and improving audio is key. Manufacturers that don’t improve audio will see consumers move to those that do.
Learn more about our spatial audio technology here.
Follow us on @NokiaOzo.