Products and services driven by artificial intelligence promise to revolutionize TV production by automating workflows, speeding searches of archives, controlling cameras and even editing video. Above, the EVS Xeebra 2.0 system uses AI to calibrate the field of play and virtually place offside lines for soccer refereeing.
AI At NAB: From Buzzword To Practical
The machines have taken over TV.
Well, maybe not quite yet. But a tour of the exhibit floor at the NAB Show in Las Vegas last week revealed the extent to which computers are being allowed to think for themselves in order to speed the production of news and sports content, better organize it for archiving, and analyze how it is consumed by the audience.
Artificial intelligence (AI), and its subcategory of machine learning (ML), is still a relatively new concept for most broadcasters. But its real-world implementation was being demonstrated in various booths and discussed in multiple panels at NAB, in applications ranging from the mundane, like speeding content ingest through automated logging, to the ambitious, such as producing and directing a sports broadcast without human operators.
“AI is moving from a buzzword into usable workflows,” said Jon Roberts, a business development consultant with Nikon, who was at NAB demonstrating an AI-driven robotic camera system created in partnership with graphics supplier ChyronHego.
For broadcasters, the motivation behind AI — to produce and broadcast more and better content at lower cost — isn’t fundamentally different than what drove them to invest in master control automation systems decades ago, or more recently, news production automation and robotic cameras to produce local newscasts with minimal staff.
But with pretty much every broadcast system today being a computer that generates data and connects to other computers over IP networks, the future possibilities for AI are certainly exciting.
Tom Ohanian, an industry consultant and former Avid, Signiant and Vitec executive, noted that AI tools like image classification and facial recognition are already being used today.
Automatic speech recognition is now at a 5.1% error rate, on par with human transcription, Ohanian said. In an NAB panel, he showed a demonstration of Adobe’s “idiom-based” editing, which analyzes facial expressions in video and aligns takes against lines of dialogue to create an initial edit of a scene.
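An error rate for speech recognition is usually quoted as word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the machine transcript into a reference transcript, divided by the number of words in the reference. A minimal sketch of that calculation follows, assuming simple whitespace tokenization; the function name `word_error_rate` is illustrative and not taken from any vendor's tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


human = "the goal was scored in the second half"
machine = "the goal was scored in a second half"
print(f"WER: {word_error_rate(human, machine):.1%}")
```

One wrong word in an eight-word sentence gives a WER of 12.5%, so a 5.1% rate corresponds to roughly one error in every 20 words.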
Ohanian sees AI being rolled out in three phases:
- First, decreasing human workload.
- Second, using AI for “content insight and workflow steering,” such as IBM mining metadata in Wimbledon tennis footage to automatically create highlights.
- Third, using AI for automatic content production, such as switching between cameras based on image tracking.
“Can you produce content according to rulesets?” mused Ohanian.
One of the big drivers for AI is cloud-based workflows. The algorithms that drive AI generally need a lot of processing power, which is readily supplied through the data centers run by cloud vendors like Amazon Web Services, Microsoft, Google and IBM.
And machine learning — which USC professor and film editor Norman Hollyn described simply as “programming that learns to get better” — is based on analyzing large sets of data, something cloud vendors also have in abundance.
“Machine learning has become incredibly important, the idea of having really fantastic transcription, and adding that to quality-assurance tools for transcoding,” said Jeff Kember, technical director for media and entertainment, office of the CTO at Google Cloud.
Avid, which runs its MediaCentral content management platform through the Microsoft Azure cloud, offers its customers a range of Microsoft’s “Cognitive Services” that automatically index content using machine-learning algorithms such as facial detection, scene recognition and speech-to-text to create searchable metadata.
And cloud giant AWS markets machine learning services to media customers today, following up on its success in other industries. Early adopters include the NFL, which used AWS to create a next-generation statistics system; and C-SPAN, which used Amazon’s image recognition tools to catalogue its archives.
Usman Shakeel, worldwide technology leader for media and entertainment for AWS, said that for media customers the cloud was initially a solution for storing and playing out more content to multiple platforms. But machine learning can take it to a new level by making workflows more efficient.
“Once you have a lot of storage, then it’s how do I manage that content, and the processing of content,” said Shakeel. “All of that is growing at an exponential pace, but the budgets keep getting tighter and tighter for production. It used to be I would get up on stage, and say: ‘The cloud is the answer to all your needs, you can spin up what you need.’
“While all these things still hold [true], the new thing out there is all about workflow efficiency. It’s not just scalability anymore. These workflows that are implemented on the cloud can be much more intelligent.”
Some of the new AI applications are based on simply taking readily available metadata and using it to orchestrate the automatic production of content. Media asset management (MAM) supplier Tedial was demonstrating a new product, Smart LIVE, which is designed to quickly create sports highlights for distribution on social media.
Smart LIVE grabs metadata from sports information suppliers like Opta or Stats to automatically start creating highlights from an event, assembling a placeholder folder with a bunch of keywords such as players’ names. Using the game clock, scoreboard data and speech recognition technology, it then performs “auto clipping” of key plays such as goals in a soccer match to automatically create a highlight, and then pushes it out to social media platforms.
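The general idea of metadata-driven clipping can be sketched in a few lines. The example below is not Tedial's implementation; it assumes a hypothetical event feed of `(game_clock_seconds, event_type, player)` tuples and illustrative pre-roll and post-roll values, and it merges overlapping windows so two goals in quick succession become one clip.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    start: float  # seconds on the game clock
    end: float
    keywords: list = field(default_factory=list)


def auto_clip(events, key_types=("goal",), pre_roll=10.0, post_roll=8.0):
    """Turn key events into clip windows, merging windows that overlap."""
    clips = []
    for clock, kind, player in sorted(events):
        if kind not in key_types:
            continue
        start = max(0.0, clock - pre_roll)
        end = clock + post_roll
        if clips and start <= clips[-1].end:
            # Overlaps the previous clip: extend it instead of cutting a new one.
            clips[-1].end = max(clips[-1].end, end)
            if player not in clips[-1].keywords:
                clips[-1].keywords.append(player)
        else:
            clips.append(Clip(start, end, [player]))
    return clips


# Hypothetical feed: two goals five seconds apart, one foul, one later goal.
feed = [
    (750.0, "goal", "Player A"),
    (300.0, "foul", "Player B"),
    (755.0, "goal", "Player C"),
    (2000.0, "goal", "Player A"),
]
for clip in auto_clip(feed):
    print(clip)
```

The player names stored on each clip play the role of the keywords in the placeholder folder, which is what would let a system route a highlight to a particular player's page.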
“It allows you to drastically increase the amount of highlights,” said Jerome Wauthoz, VP of products for Tedial. “Instead of just five or 10 highlights per game, you can create one highlight per quarter per player and deliver them to each player’s personal Facebook page.”
At NAB, Wauthoz demonstrated the creation of three highlights in one second, and then 2.5 minutes of highlights in just three clicks. The Smart LIVE system, which is sold as a bundled solution, starts at $40,000. He said executives from NBC, Telemundo and OBS (Olympic Broadcasting Services) had visited the Tedial booth to see Smart LIVE in action.
Major sports broadcasters were also visiting the Nikon booth to see the AI-driven robotic camera system that was created in partnership with ChyronHego. The system uses a combination of Chyron’s TRACAB optical tracking system (which supplies the optical player tracking in Major League Baseball’s Statcast system); Nikon D5 digital SLR cameras; and Robotic Pod camera systems from Mark Roberts Motion Control (MRMC), a U.K. firm that was acquired by Nikon in 2016.
According to Nikon’s Roberts, the Chyron optical-tracking system generates real-time data feeds by capturing video at 25 frames per second in a “panoramic stitch” of a playing field and tracking the individual objects within the playing field. Each object has an associated name, which in one application is already being used by soccer clubs to track their players’ movements for training purposes.
The MRMC automated robotic camera extends the use of the Chyron data by relying on it to direct the camera’s movements. Installing robotic cameras like the Robotic Pod can allow a sports production to easily grab stadium views, or eliminate “seat kills” of prime seats, says Roberts. (A seat kill is what happens when a broadcaster places a traditional camera and a camera operator in certain spots in a stadium, like behind home plate in baseball. It can wipe out 10 to 15 high-dollar seats, which teams dislike.) Or it can be used simply to add another game camera without hiring another camera operator.
“The cameraman can be replaced by an algorithm, and the system will make decisions based on data points,” Roberts said. “And those elements can be integrated into a traditional production.”
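How tracking data might steer a camera can be illustrated with basic trigonometry. The sketch below is an assumption-laden illustration, not the Nikon/ChyronHego control software: it supposes the tracking feed delivers a player's position in meters on a flat pitch, that the camera's own position is known in the same coordinates, and that pan is measured counterclockwise from the x-axis. The function name `pan_tilt_to_target` is hypothetical.

```python
import math


def pan_tilt_to_target(camera_xyz, target_xy):
    """Return (pan, tilt) in degrees to point a camera at a spot on the pitch.

    camera_xyz: (x, y, height) of the camera in meters.
    target_xy:  (x, y) of the tracked object on the ground plane.
    Tilt is negative when the camera looks down.
    """
    cam_x, cam_y, cam_height = camera_xyz
    dx = target_xy[0] - cam_x
    dy = target_xy[1] - cam_y
    ground_distance = math.hypot(dx, dy)
    pan = math.degrees(math.atan2(dy, dx))
    tilt = -math.degrees(math.atan2(cam_height, ground_distance))
    return pan, tilt


# One update per tracking frame; at 25 frames per second this would run
# every 40 milliseconds with the newest position for the chosen player.
camera = (0.0, 0.0, 10.0)
for position in [(10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]:
    pan, tilt = pan_tilt_to_target(camera, position)
    print(f"target {position}: pan {pan:.1f} deg, tilt {tilt:.1f} deg")
```

A production system would also need to smooth these angles and respect the motor's speed limits so the shot does not jitter; those steps are left out here.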
Even more AI-based tools for sports production could be seen at the EVS booth. The Belgian company, which is the dominant supplier of replay systems, is making a major investment in AI and has hired its own internal team of data scientists to create applications. Its AI features run on the company’s new “VIA” platform, which will enable a host of future microservices.
An early product from the AI efforts is Xeebra 2.0, a video refereeing system EVS created for soccer which uses machine learning technology to calibrate the field of play and allow users to accurately overlay an offside line for video assistant referee (VAR) operations. The same AI technology can also be used for “automatic framing,” to accurately crop video for repurposing on social media platforms by predicting where the action of a play is going to move.
“There is a lot of interest in AI, and with good reason,” said Johan Vounckx, SVP of innovation and technology for EVS, speaking at an NAB session. “It has quite a lot of potential, particularly for live production.”
EVS’ customers are being asked to produce more and more sports content on tighter and tighter budgets, Vounckx said. Automation would be a simple answer to this productivity problem in other industries, but in sports production traditional approaches don’t work.
“It’s a people business, and operators do the storytelling,” he said. “You cannot really put a human operator in a mathematical rule. That’s where AI comes in. AI allows you to do automation while preserving this human touch, this creativity.”
The trick, said Vounckx, is to allow machines to learn by example, like humans do, through “neural networks.” Computers are fed “training data,” such as game footage, and deliver an output that is then checked against a desired output, which would be a human’s work. The idea is that eventually the machine’s final model mimics the human examples.
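The learn-by-example loop Vounckx describes can be shown at toy scale. The sketch below trains a single artificial neuron, far simpler than the neural networks EVS would use, on a handful of invented examples. The two input features (how close the ball is to goal, how spread out the players are) and the labels (1 if a human operator chose a close-up, 0 for a wide shot) are hypothetical. On every pass the model's output is compared with the human's choice, and the weights are nudged to shrink the gap.

```python
import math


def predict(weights, bias, features):
    """Model output between 0 and 1: how strongly it favors a close-up."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit one neuron by repeatedly comparing its output with the human label."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, human_choice in examples:
            error = predict(weights, bias, features) - human_choice
            # Move each weight a small step in the direction that reduces the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Invented training data: (ball_near_goal, player_spread) -> operator's choice.
training_data = [
    ((1.0, 0.2), 1),
    ((0.9, 0.3), 1),
    ((0.1, 0.8), 0),
    ((0.2, 0.9), 0),
]
weights, bias = train(training_data)
print("close-up score near goal:", round(predict(weights, bias, (0.95, 0.25)), 3))
print("close-up score at midfield:", round(predict(weights, bias, (0.15, 0.85)), 3))
```

After training, the model scores an unseen near-goal situation above 0.5 and an unseen midfield situation below it, which is the sense in which the final model mimics the human examples.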
Besides automatic framing, other AI applications for sports production could be automatic calibration of cameras or even automatic direction, though Vounckx was quick to say that EVS’ objective with AI is “to help people, not replace them.”
In a private demo room at the EVS booth, one could see some of the possibilities that Vounckx described. One video demonstration showed steering of a robotic camera through AI.
Similar to the Nikon/Chyron demo, an image from a wide camera was used to create tracking info that then directed a robotic camera. In another, AI was used to take standard-frame-rate video and interpolate the movement in the images to create a “slo-mo” effect similar to that generated by a high-frame-rate camera, by “inventing” the missing frames.
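To see what “inventing” frames means, consider the crudest possible approach: blending each pair of neighboring frames. The sketch below is only that naive baseline, with frames represented as nested lists of brightness values. It is not the EVS method, and a plain blend of this kind produces ghosting on fast motion, which is the problem a learned motion-aware model is meant to solve.

```python
def blend_frames(frame_a, frame_b, t):
    """Linear mix of two same-sized frames; t=0 gives frame_a, t=1 gives frame_b."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def slow_motion(frames, factor):
    """Stretch a clip by inserting factor-1 blended frames between each pair."""
    if not frames:
        return []
    output = []
    for current, following in zip(frames, frames[1:]):
        output.append(current)
        for step in range(1, factor):
            output.append(blend_frames(current, following, step / factor))
    output.append(frames[-1])
    return output


# Two tiny one-row "frames"; a factor of 4 adds three in-between frames.
clip = [[[0.0, 0.0]], [[4.0, 8.0]]]
for frame in slow_motion(clip, 4):
    print(frame)
```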
In a third, a split-screen demo showed automated direction of a soccer broadcast, with the AI’s switching decisions among multiple cameras playing alongside a human-directed production of the same game. This reporter incorrectly identified the AI version as the human director’s cut.
The automated direction was purely image-based, and used ranking orders of activity within the game, explained EVS SVP of Marketing Nicholas Bourdon. He said the machine learned how to direct after watching “tens” of complete soccer matches.
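Switching by ranked activity can be sketched as a simple rule: take the camera with the highest activity score, but hold each shot for a minimum time so the program feed does not flicker between angles. The code below is a hypothetical illustration, not the EVS concept itself, which learns its behavior from watching matches. The per-camera scores, the `min_shot_frames` hold and the `margin` a challenger must win by are all assumed values, and every camera is assumed to report a score on every frame.

```python
def direct(activity_by_frame, min_shot_frames=50, margin=0.1):
    """Pick one camera per frame from per-camera activity scores.

    activity_by_frame: list of dicts mapping camera name -> activity score.
    A cut happens only when the current shot has been held long enough and
    another camera beats the current one by more than `margin`.
    """
    program = []
    current = None
    frames_held = 0
    for scores in activity_by_frame:
        best = max(scores, key=scores.get)
        if current is None:
            current, frames_held = best, 0
        elif (best != current
              and frames_held >= min_shot_frames
              and scores[best] > scores[current] + margin):
            current, frames_held = best, 0
        program.append(current)
        frames_held += 1
    return program


# Action moves from midfield to the goal after three frames.
frames = [{"wide": 0.9, "goal": 0.1}] * 3 + [{"wide": 0.2, "goal": 0.8}] * 5
print(direct(frames, min_shot_frames=4))
```

With a four-frame minimum, the cut to the goal camera is delayed by one frame even though the action has already shifted, the kind of pacing rule a human director applies by instinct.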
While the automated direction is simply a “concept” now and not yet a product, said Bourdon, it is designed to be “quite harmonious” with traditional productions and could be used with robotic cameras. He said such automated direction could be used as “an assistant” to a traditional director or in lieu of a director to do more and/or cheaper sports production.
“This is really focused on our core business,” Bourdon said.
Several broadcasters at NAB said they were interested in using AI and data science not only to help them produce content but also to make better decisions about what content to invest in. By using computers to analyze reams of data about their content’s popularity versus its relative production cost, they hope to have a better idea of which horse to back.
Fox has brought “a lot of data scientists on board” to help it make programming decisions and fine-tune its schedules, said Richard Friedel, EVP-GM of Fox Networks engineering and operations.
“It’s pretty interesting to mine the data to reposition shows for different dayparts, or make the decision a show is not going to make it a little sooner than before,” Friedel said.
Discovery Networks is taking a similar approach, said CTO John Honeycutt, “looking horizontally” at the consumption data from all of its businesses and trying to spot new opportunities.
“There are a lot of levers in the model,” said Honeycutt. “When you spend several billions [of dollars] a year on content, a difference that might shape $200 million or $300 million of your investment is a big deal. It might enable you to go to a place you wouldn’t have gone based on gut instinct.”
AWS’ Shakeel agreed, and said that media customers are increasingly relying on big data to make their content investment decisions.
“What are the data points I have, who is the celebrity I want to use for a particular piece of content, what is the series I want to invest in next — Season Two of what?” explained Shakeel. “So, it’s not a select set of people looking at pilots, but instead a very large data set.”
Read all of TVNewsCheck’s NAB 2018 news here.