Artificial intelligence is making inroads into broadcast TV newsrooms, helping to enhance metadata and provide closed captioning. But it’s also starting to learn the roles of an editor and may soon help suss out fake news. IBM Watson’s Ethan Dreilinger lays out AI’s current newsroom roles and where it could conceivably be deployed to play a bigger part in the editing room.
Artificial intelligence has long left the realm of science fiction and started to wend its way into the broadcast newsroom.
Ethan Dreilinger, a solutions engineer for IBM Watson, said one of its main early beachheads is diving into a broadcaster’s extensive archives and cleaning up its metadata to improve retrievability, along with real-time closed captioning services. But it’s also moving into more sophisticated functions like generating highlight clips, as Watson did at the recent Masters golf tournament.
Dreilinger spoke with TVNewsCheck Special Projects Editor Michael Depp about AI’s current capacities and its potential to elbow out newsroom personnel if executives are willing to cede that control.
An edited transcript:
AI is a big industry buzzword right now and has a lot of moving parts, so let’s break down some of its implications for broadcast television. As I understand it, so far a large focus is on metadata. Can you explain how that’s working?
Our platform goes in, delivers intelligent insight into video and unlocks the dark metadata within it, exposing it and making it actionable for the end user. When we say “actionable,” there’s a lot of different things we can do around search and discovery, and a lot we can do on the closed captioning side as well.
Do broadcasters generally have a large inventory of material that’s not very well archived and therefore irretrievable in many cases?
I come out of the broadcast space. An editor or producer will sit down and write out some sort of title for a piece and give it a little slug line. They’ll write six, seven or eight keywords, and some of those keywords will get spelled correctly. That’s the way that piece of content sits in an asset management system in a broadcast facility today. There’s very little of an intelligence layer added on top of that to make that content more findable. If you’re able to enrich the metadata and make [it] more actionable, your time to air goes down and you can do more with that video, including monetization.
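A minimal sketch of the findability problem Dreilinger describes: a clip archived with a slug and a few hand-typed (sometimes misspelled) keywords, versus the same clip after an enrichment pass adds machine-generated tags. All field names and values here are invented for illustration; this is not IBM Watson’s actual API or data model.

```python
# A clip as it might sit in a MAM today: thin, typo-ridden metadata.
original = {
    "title": "council mtg pkg",
    "keywords": ["counsil", "budget", "mayor"],  # note the misspelling
}

# The same clip after a hypothetical AI enrichment pass.
enriched = {
    **original,
    "ai_keywords": ["city council", "budget vote", "property tax",
                    "mayor", "public hearing"],
    "transcript_excerpt": "...the council voted 5-2 to raise property taxes...",
}

def findable(asset, query):
    """Naive search: does the query string appear in any text field?"""
    text = " ".join(
        v if isinstance(v, str) else " ".join(v)
        for v in asset.values()
    ).lower()
    return query.lower() in text

print(findable(original, "property tax"))  # False: the sparse keywords miss it
print(findable(enriched, "property tax"))  # True: enrichment surfaces the clip
```

The point is not the search algorithm, which is deliberately naive, but that the enriched record answers queries the original never could, which is what shortens time to air.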
Is it your experience that a lot of the metadata is presently riddled with spelling errors?
I have yet to meet or speak with a broadcaster who does not admit to that problem.
What about other applications right now in terms of master control, playout or editing? Where else is AI impacting broadcast right now?
AI is a tool in the box to be deployed at the right time. You can use AI in search and discovery or closed captioning, but there are other applications, like compliance. This isn’t necessarily in the local space, but if you have to distribute content to multiple countries and there are objectionable scenes in some things, you can actually find those scenes and do edit replaces using AI algorithms. You can do some things in quality control and master control around the quality of the signal going out and what you’re looking at on the return, and make sure they’re synced accurately.
On the closed captioning side, what’s the value proposition?
Stations do closed captioning in a lot of different ways. Some of them will generate closed captioning through their news control systems, some will do it by sending content out and literally having someone manually type it in. Really the return on investment is about automating those processes and getting the closed captioning to a point where it’s less of a cost center. It’s a matter of do you spend $500 an episode or $250 an episode, and by episode I could be referring to a news program as well.
Who among broadcasters is using AI for their metadata and their closed captioning?
We have several deployments in market today that are either at a POC level or a production level. [Editor’s note: IBM Watson Media does not publicly disclose its closed captioning customers.]
What kind of costs does a media company incur for these services?
The way we charge is a scalable system based on volume commitment. We charge per minute of content enhanced or closed captioned, and based on that commitment it’s a sliding scale. The more content coming in, the lower the per-minute costs.
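The sliding scale Dreilinger describes can be sketched as a simple tier lookup: the larger the committed volume, the lower the per-minute rate. The tier boundaries and rates below are invented purely for illustration; IBM Watson Media’s actual pricing is not public.

```python
# Hypothetical volume tiers: (minimum minutes committed, price per minute, USD).
TIERS = [
    (0, 1.00),
    (10_000, 0.75),
    (50_000, 0.50),
    (250_000, 0.30),
]

def per_minute_rate(committed_minutes):
    """Return the rate for the highest tier the commitment reaches."""
    rate = TIERS[0][1]
    for threshold, tier_rate in TIERS:
        if committed_minutes >= threshold:
            rate = tier_rate
    return rate

def total_cost(committed_minutes):
    """Cost of the commitment at its tier's per-minute rate."""
    return committed_minutes * per_minute_rate(committed_minutes)

print(per_minute_rate(5_000))    # 1.00 -- small commitment, top rate
print(per_minute_rate(100_000))  # 0.50 -- larger commitment, lower rate
```

Real volume pricing may apply rates marginally per tier rather than to the whole commitment; this sketch uses the simpler whole-commitment model.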
Is this something that’s affordable to a broadcast station group right now or is it still reserved for the largest of companies?
A lot of that relies on the infrastructure that particular broadcaster or broadcast group has in place. If you have an updated, recent MAM installed that supports an API-driven UI and can bring in external data sources, it’s very affordable.
The Associated Press adopted AI a couple of years ago when it began using Automated Insights to produce stories on quarterly earnings reports and some sports coverage. Is there an AI video production equivalent to this that’s either here now or forthcoming?
Our video enrichment products are pretty close to that. We generate insights into content — keywords, code analysis and all of these pieces — but it’s how the output from Watson is implemented at the station that really differentiates what happens. We have a lot of really good examples around search and discovery, highlight clipping and closed captioning leveraging that Watson output in ways that help stations quickly recognize the ROI on it.
Can you explain how the clipping works?
For the Masters, we covered each shot of the tournament and then we gave each shot an excitement level based on crowd reaction, golfer reaction and the announcer track associated with that clip. Those three elements together generated an excitement factor for that specific clip. Then the Masters was able to leverage those clips in a lot of different ways between social networks, their website and mobile apps in order to generate interest in the event, draw people into the event and draw interest to the live streams of the event.
How does the machine recognize that excitement?
There’s a level of training that goes on across an AI platform like Watson to teach it what to look for. When a golfer raises his arms up with excitement in his face, Watson’s going to say, “Oh, that’s a 100% excitable moment.” When a golfer throws his putter down because he just missed a two-foot gimme, it’s still a high excitement moment, but for a different reason.
So, you have to teach Watson to differentiate those moments. And then there’s the crowd noise associated too, and the trickier one to do in golf is really the announce track. If you think about the way a golf match is announced, it’s very low key. In that case, you have to teach Watson some of the words the announcers use, and Watson has to understand the context of what the announce track is saying in order to generate the rating.
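The three-signal scoring described above can be sketched as a weighted blend: golfer reaction, crowd noise and announcer energy each yield a confidence in [0, 1], and the clip’s excitement is their combination. The weights and signal values here are invented for illustration; the actual Watson pipeline is trained on examples rather than hand-weighted like this.

```python
def excitement(golfer_reaction, crowd_noise, announcer_energy,
               weights=(0.4, 0.35, 0.25)):
    """Blend three per-clip signals (each in [0, 1]) into one score.

    Hypothetical weights: golfer reaction counts most, then crowd,
    then the announcer track, which in golf is deliberately low-key.
    """
    signals = (golfer_reaction, crowd_noise, announcer_energy)
    if not all(0.0 <= s <= 1.0 for s in signals):
        raise ValueError("each signal must be in [0, 1]")
    return sum(w * s for w, s in zip(weights, signals))

# Arms raised after a long putt drops: every signal fires.
print(round(excitement(1.0, 0.9, 0.7), 3))  # 0.89

# Putter thrown after a missed two-foot gimme: strong golfer reaction,
# subdued crowd, quiet announcer -- still notable, for a different reason.
print(round(excitement(0.9, 0.3, 0.2), 3))  # 0.515
```

The second case illustrates the training problem Dreilinger raises: a raw blend scores the thrown putter lower than the celebration, so the model has to learn that both reactions mark a highlight-worthy moment.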
It sounds like Watson is learning how to be an editor.
In the specific case of highlight clipping, yes.
So how far can you take that idea out? Can Watson quickly evolve to replacing that function where the editing of clips and other things can ultimately be automated?
I don’t think broadcasters want to give up that editorial control of their on-air product or their web product. From a completely technical point of view, yes. But from an in-practice point of view, I don’t think we’re there yet.
When is the “yet” conceivably? What’s a realistic time frame for that kind of viability?
It’s technically feasible today. These are early days of AI. What’s technically and practically feasible today is one thing. What’s technically feasible five years from now is very different. I don’t know that we could put a timeline to it, per se, but we can say the technology is progressing down a path where the enhancements to what humans are doing are recognizable.
In the AP’s case, AI took on stories that used a lot of labor time and freed up the staff to focus their reporting energies in other places. Is it reasonable to think that this technology could do the same for broadcast newsrooms?
It’s reasonable to think that. But coming out of broadcast, I have a hard time seeing broadcast newsrooms giving up the editorial control.
Not to get too sci-fi about this, but is there a potential threat to jobs because this technology can get very smart and adapt very quickly?
We all work for corporations, so there are always threats to jobs. I can’t point to any one thing and say that’s a threat to a job.
What about AI’s role in potentially suppressing fake news, validating content and exposing content that’s fake? Is it playing a role there now or is it going to be?
On the first half of your question, I don’t know. I know it’s something our research labs are looking at and working on. I don’t know that anyone has deployed in that area. The nature of a cognitive platform — where it’s able to learn as it goes along — lends itself to a level of detection.
But it’s sort of like old school antivirus software: you’re only as good as the last virus outbreak. The algorithms on the fake news side constantly churn in order to keep that engine going. There’s a role to be played, but it will enhance what other parts of that ecosystem are doing. It’s not going to take over that ecosystem.