
Webinar Replay: Metadata Roadshow Session 6 | Metadata and Artificial Intelligence

October 18, 2021

Missed out on the Metadata Roadshow? Check out the October 14, 2021 session, "Metadata is the foundation for your Artificial Intelligence (AI) Strategy" above to catch up. The session, led by John Horodyski and Misti Vogt, focused on:

  • Providing the data needed for effective machine learning
  • Powering tagging with artificial intelligence
  • Using AI tools like facial recognition and OCR to power your metadata

Metadata Roadshow Webinar Transcript:


Jeanette Camping:  Hello everybody and welcome to the last session of the Metadata Roadshow series. Glad to welcome everyone back. This session is sponsored by Orange Logic. Now I'd like to hand it over to John Horodyski. Thank you, John. Actually, for the last time.

 

John Horodyski:  Thank you. Good morning, good afternoon, good day everyone, and welcome to session six: metadata and artificial intelligence. This is going to be great. Our sponsor today is Orange Logic; we're going to be hearing from Misti a little bit later. A few housekeeping items to kick us off today. As a brief reminder, I'm the moderator for this series. For those of you who might be joining for the first time today, a quick re-introduction or a new introduction: my name is John Horodyski. I'm a managing director with Salt Flats for the insights and analytics practice, where we do amazing, great, fun things with DAM, information management, martech, metadata, taxonomy, and analytics. All those great things that we're all thinking about with our content. I've been doing this for over 20 years now and have had the great fortune to work with Fortune 10, Fortune 50 and larger companies: consumer packaged goods, media and entertainment, the pharmaceutical industry, insurance. I'm also adjunct faculty at San Jose State University in California, where I've been teaching a graduate course in digital asset management for over 15 years. I do a lot of public speaking around the world, and I'm a board member and editor at the Journal of Digital Media Management. If you are not familiar with that great journal, please contact myself or Jeanette afterwards; I'd love to let you know a little bit more about it. I also contribute to CMS Wire, wrote a book a few years ago on DAM, and have a new book coming out on metadata in a month. So, a little bit about me.

Wow, here we are, I can't believe it: session six of this great series. Some of us have been here right from the beginning. We've done metadata management, taxonomy, workflow and collaboration, and metadata UX and UI. And here we are today with Orange Logic, doing what we've been doing all along: we explore DAMs, we have good conversations about the objectives of metadata, we talk about how technology is helping us manage all of the things we need to do with our digital assets, we have heard about new approaches, best practices, case studies, and examples of real-world practice, and of course there has been wonderful, healthy, and rigorous Q&A in each session, which we will have again today.

So why did we do this series? Why now? Well, I think it can be agreed upon that metadata is kind of everything, and without it you kind of don't have anything at all; you have nothing. Metadata, as we know, is the key to unlocking the power of your content. It's the foundation of the digital strategy that delivers a fully engaging consumer experience. DAM is nothing without metadata, so whether you're designing a metadata model, creating an authoritative controlled vocabulary, charting your workflow, establishing governance, or doing A.I., metadata is critical to the business, and technology must make these goals achievable. That's why we're doing it, and we hope to do this again with you next year as well. But before we go on, let's have some fun. We love polls, you know.

We love polls here at Henry Stewart. We love to do these things. This one should get everyone going; it's pretty easy. How many of you are using A.I. with your DAM? It's either going to be a yes, a no, or a would like to. Pretty simple. How many are using A.I. with your DAM? I think I know what the answer is going to be; I think it's going to skew in a certain direction. Love the feedback. I love doing a live poll, you get to see the true results, so… some drum roll please. Ah, this is skewing where I thought. I love this, OK: 15% of us are doing it. This is awesome, well done. 34% of you are not there yet, and 50% would like to. This is why we're here today, and this is why we're here to talk about A.I. This is great. Thank you so much for participating in the poll.

We're going to have a few more a little bit later on. This is great; thank you for doing that. So let's all ask ourselves the question with metadata and A.I.: how is our metadata doing? I always like to believe that 'in metadata we trust' is the best foot forward for your A.I. work. Leveraging meaningful metadata in contextual ways, using, categorizing, and accounting for data, provides the best chance for your return on investment with A.I. And let's be honest, data is the foundation upon which these types of computer intelligence can be built.

We want the machines, the robots, to learn and to do more, but we must provide those robots with good quality data in order for them to do that. We need to get our metadata house in order to support this A.I.-based technology. So quality data equals smart data equals happy A.I. and ultimately good robots, right? We want the good robots. In order to prepare for the A.I. work, consider which metadata fields and relationships are most appropriate.

For A.I. integration, use factors such as public domain data, like color, shape, and sentiment, and potentially business-centric data; for example, product hierarchy or the ingredients in the product. Some of those might be needed. In addition, the controlled vocabulary terms for A.I. usage need to be reviewed for quality and authority, to ensure that the accuracy and quality are truly there. Lastly, we would need to advise on the checks and balances: the ongoing maintenance, or governance, as performed by the human factor required in the testing and quality assurance of A.I. development. The robots should not be working in splendid isolation but instead with great human interaction. People are good the entire way through the process. This includes inclusivity, so as to strive for and do everything possible to remove as much race and gender bias as we can in this important foundation work. If data describes the world, then we must do better to shape it, to bring change, and to provide visible race and gender equality. Without intention, data has no real power. Let's think about that. So, the robots are waiting. The robots are ready to go, so let's give them what they need to do their job, and that is good quality data and a good quality data foundation. Data integrity is critical to A.I. and machine learning, and trust and certainty that the data is accurate and usable are critical.
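
To make that preparation concrete, here is a minimal sketch, with entirely illustrative field names (not any particular DAM's schema), of how a team might record which metadata fields an A.I. is allowed to populate and which stay human-governed:

```python
# Illustrative only: field names and structure are assumptions for this
# example, not a specific DAM's schema.

METADATA_MODEL = {
    # public domain signals the A.I. can supply on its own
    "color":        {"source": "ai"},
    "shape":        {"source": "ai"},
    "sentiment":    {"source": "ai", "review": "human"},  # checked by a person
    # business-centric data that stays with people and governance
    "product_line": {"source": "human", "controlled_vocabulary": True},
    "ingredients":  {"source": "human", "controlled_vocabulary": True},
}

def fields_for_ai():
    """Fields the A.I. integration is allowed to write."""
    return [name for name, rules in METADATA_MODEL.items() if rules["source"] == "ai"]

print(fields_for_ai())  # ['color', 'shape', 'sentiment']
```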

Be mindful of the people, processes, and technologies that may influence data and learning within the business. Content is critical to business operations, no argument there, but it needs to be managed at all points of the digital lifecycle. The digital experience for users will be defined by their ability to identify, discover, and experience the organization's brand and content just as intended. Value is not found, it's made, so make the data meaningful and manage it well. Start with the foundation and the data. Embrace that transformation and discover the value in your content.

Well, there is our most famous A.I. right there. Everyone knows HAL. "Hello, HAL." A.I. and metadata: are you mature enough to do it? Many organizations move too fast; they attempt something as grand as A.I. or machine learning without the data foundation and the mature metadata to support it. To start such a project requires good quality data, and it requires determining which metadata fields and relationships are most appropriate for any A.I. work. Metadata matters. Why? Well, because it's the tactical application of the data to your content, and the management of that content to enable creation and discovery for the distribution and consumption of all the things you're trying to manage. Metadata demands attention for effective business solutions.

So keep your metadata relevant; keep it usable, clean, consistent, and governed well so that it serves the needs of the business. Now with that, as I do every week, I always end with a metadata minute: something fun, bite-sized, useful, something memorable about metadata. Today: the best way for A.I. to learn is by doing, working with good data. As we know, technology has the capacity to produce satisfaction when used to perform particular tasks. Understanding the needs of the users and providing those touch points will increase the perception of personalization, improve the overall experience, and allow those machines to learn. The struggle in managing content within the digital world is as complex as the workflows underpinning those efforts. Metadata provides that link, allowing the processes and technology to be optimized, and that's hopefully where the learning and the intelligence will really begin.

So let's work with good quality data. And with that, I would now like to pass this over to Misti Vogt from Orange Logic, who is going to run us through a great presentation on metadata and artificial intelligence. Over to you, Misti.

Misti Vogt:  Thank you, thank you so much John. That was a perfect segue. I am so excited to be here today. Good morning, good afternoon, or good evening, wherever you happen to be joining us from. I'm Misti Vogt with Orange Logic, and we're here to dive into some practical applications of A.I. in metadata. I actually have some history with artificial intelligence and developing the algorithms so machines can learn and analyze the data provided to them. Some of these methods are actually still in use today in varying areas, from predicting injuries on the basketball court to understanding adverse reactions to vaccines.

I have witnessed A.I. give users the opportunity to let go of repeatable tasks and reinvest that time creatively. More recently, for the last three years, I've been in the DAM space, so naturally I'm always trying to find ways to leverage machine learning technology here. I am excited at the opportunities A.I. brings to the table in workflow efficiencies, data integrity initiatives, and ultimately augmenting human intelligence to curate an experience of discovery for our users.

Over the course of this session, we're going to dive into some real life practical examples of leveraging A.I. in our metadata strategies. But first we're going to set the stage with a brief introduction to A.I. as it relates to metadata and the application of it in the DAM space. Alright, it's time again for our second poll.

So how many of you wish you were using A.I. more with your DAM? 

This is great. I see the numbers ticking in. Fantastic, I love the participation; a little engagement makes it a lot easier on me. Alright, the poll just closed. We are looking at 43% saying yes, we do wish we were using more A.I., which goes right along with John's first poll, where we understood that some of you are using A.I. currently. Then we have 52% of you who would like to use more A.I., so I'm hoping to give you a little bit of insight, peel back some of those layers into the experience of implementing more A.I. with your DAM, so hopefully at the end of this you'll feel a little bit more confident jumping in. And 5% said no, you don't want to use more A.I., but hey, I feel like maybe I can change your mind by the end of this presentation. This is a very positive poll.

 

JH: Misti, I love the numbers. The numbers are great to see.

MV:  Alright, let's jump in. OK, for those of you that said yes or would like to, and even those of you that said no, you've come to the right place today. Most applications of artificial intelligence in DAM relate to metadata, although there are some really cool new things I'm starting to see in action; we'll leave those for another session. A really important note is to acknowledge that the A.I. we're focused on today is designed to augment human intelligence, not replace it. Most, if not all, DAMs use A.I. created and managed by third party systems, which is a great thing. You might initially think, wouldn't it be better if it was custom built for my DAM? But consider, as John mentioned earlier, that A.I. learns and refines its results by repeated exposure to data. So by practicing, these third party solutions are consuming massive volumes of data, and in turn that's making their predictions more reliable.

Alright, a little background on A.I. At its most basic level, building an A.I. system is a process of reverse engineering human traits and capabilities in a machine. There are three types of artificial intelligence: general, narrow, and, new to the game, superintelligence. Currently most artificial intelligence is narrow, or specialized to achieve one goal. "Hey Siri, play Bohemian Rhapsody," and I know my phone is gonna come up in just a second because it heard me. But nowadays, for me, it's more often "Alexa, play the Frozen soundtrack."

Even self driving cars are narrow intelligence. It sounds crazy, right? But self driving cars are made up of several A.I.s working together, like lane detection, sensing obstacles, and making sure that passengers and pedestrians are safe. When A.I. doesn't do what you expect, it's often because narrow A.I. doesn't think like a person. There's no common sense or real world experience. This type of technology has its imperfections, but with a little creativity we can work with it.

Artificial general intelligence is intelligence on par with human intelligence. Currently it exists more in theory than in practice. To give you an example, and some insight into how incredibly magnificent human intelligence is: Fujitsu engineered one of the fastest supercomputers of all time, and it took 40 minutes to simulate a single second of human neural activity. It's the stuff of movies; think T-1000 or WALL-E. And super A.I., that's A.I. that's smarter than humans, and like artificial general intelligence it's largely hypothetical, so no need to worry about a dystopian future or transcendence just yet.

Alright, so today we're going to be diving into two targeted sections of artificial narrow intelligence: computer vision and natural language processing. Computer vision is just what it sounds like, an effort to replicate human vision with A.I. We want to be able to recognize and categorize images and videos, both objects and people. It's actually pretty neat how A.I. peels back the layers of an image, takes a look at each element individually, and then puts them all together to make predictions about what story an image is telling. Natural language processing is where A.I. is trained to listen to speech and all the intricacies of speech, like dialect and intonation. Natural language processing has become part of our everyday lives through tools like smart assistants, email spam filters, and autocomplete. Oh, autocomplete, that one still might need a little bit more data to nail it.

Alright, enough with the theory. I figured the best way to start learning how you can leverage A.I. in your business is to look at some real life applications of it. Over the course of the next several slides we're going to look at a few scenarios where we were able to use artificial intelligence as part of the solution. We're going to pull back the curtain and give you some insight into the process of helping users improve their metadata through A.I. We're going to focus on metadata-related A.I. applications because this is the Metadata Roadshow. The three areas we'll discuss are auto tagging, facial recognition, and captioning with speech to text. We'll then go through an expedited discovery summary so we can identify some of the challenges you might experience in implementing these technologies in your own workflows. So let's go.

The first case we're going to explore is a large international organization that licenses their assets. They came to us with their large archive, over 17 million image assets, many of which were poorly tagged or not tagged at all. The organization wanted to see if they could leverage A.I. to help their team of taxonomists with the monumental task of tagging the entire archive. Their goals were consistency and accuracy, so they could guide their users to easily discover images that met or related to their search criteria. This was a great project to sink our teeth into, so let's take a look at some of the challenges and how we solved them.

As is always the case with A.I., we had to set a minimum threshold. We ran samples to establish the preferred level of confidence; in this case it was 85% confidence. Then we tested again. It was better, but we took it one step further. We built in the ability to triangulate results from different providers. What this means is we used arbitrage to compare the results from two or three different A.I. providers and find out where they agreed. Then we played with these two variables for a while, both the confidence and the arbitrage, until we felt confident in the quality of the keywords being applied.

So we landed on a formula that worked perfectly for this organization: use an 85% minimum confidence and a match of at least two services. Let's take a deeper look at some results.
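
That threshold-plus-arbitrage formula is simple enough to sketch in a few lines of Python. The provider names and result shapes below are illustrative, not any vendor's actual API:

```python
MIN_CONFIDENCE = 0.85  # keep tags the provider is at least 85% sure about
MIN_PROVIDERS = 2      # and that at least two providers agree on

def select_keywords(provider_results):
    """provider_results: {provider_name: {keyword: confidence}}"""
    votes = {}
    for provider, tags in provider_results.items():
        for keyword, confidence in tags.items():
            if confidence >= MIN_CONFIDENCE:
                votes.setdefault(keyword.lower(), set()).add(provider)
    return sorted(k for k, providers in votes.items() if len(providers) >= MIN_PROVIDERS)

results = {
    "google": {"Vehicle": 0.97, "Sky": 0.92, "Umbrella": 0.88, "Lake": 0.70},
    "azure":  {"Vehicle": 0.95, "Sky": 0.90, "Train": 0.89, "Umbrella": 0.91},
    "aws":    {"Vehicle": 0.99, "Train": 0.93, "Building": 0.86},
}
print(select_keywords(results))  # ['sky', 'train', 'umbrella', 'vehicle']
```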

OK, we have this beautiful picture of a woman in Portugal. We know that tagging any image has a few elements to consider, straight out of the MLS playbook: procedural, product, demographic, and experience. As you can see, this A.I. is primarily product classification, the form of computer vision where we can fairly accurately identify objects in an image. But we still need a human to come in and augment that automation with terms that require more abstract judgment. If I look at this image, I get a sense she's pondering or dreamy. Of course we will be able to extract some procedural details from the embedded metadata, but it's the combination of these different systems, your DAM, A.I., and you, that gives you that robust, well rounded keyword framework for your assets.

So we ran this image against three well known A.I. providers: Google, Azure, and AWS. The results you see highlighted in green show matches across all three providers, so all three providers came back to us and said, we're pretty confident there's a vehicle in this image. The yellow shows matches across two providers, and the red shows what only one provider came back with, along with their confidence percentages. Now let's take a look at which of these words would be applied based on our formula: 85% confidence, and the word must match at least two providers. We grayed out the ones that didn't quite make it. The ones that made it: vehicle, sky, train, and umbrella. If I were manually keywording this, I feel like those are pretty good. I might just add streetcar and overcast. OK, now let's have a fun exercise, and John's going to come back and help me with this really quick. Go ahead and open up a notepad or Word document. Or even better, take over the comment section.

I just want to see the comments flooded. In just a moment I'm going to show you an image, and I'm going to give you the opportunity to keyword it. Alright, I'm going to join you. Your minute starts now.

JH:  Go.

 

MV:  I'm trying not to look. I have my words written down. Oh wait, I need to show you the image; that would be helpful here.

 

JH: So again, navigate to your right hand screen, go to the comment section, start adding.

 

MV: I love this.

 

JH:  Oh my goodness.

 

MV:  Oh, I love this. You guys are great. Oh wow.

 

JH:  There's some creative ones coming. This is all well done. Well, oh wow…

 

MV:  Oh, clearly we have a group of experts here. Oh Shiba, oh. I was excited about that one.

 

JH:  Lots of interest I'm seeing, Misti.

 

JH:  I see a lot of sentiment coming in too, like beyond just what is in the photo, but feeling. This is great.

 

MV:  I love that. Funny fun. Oh my goodness, you guys really took over the comments.

 

JH:  I think we're going to break GoToWebinar. A good one, huh? OK, looking. Can you share the comment screen now?

 

JH:  But we can put this out afterwards. Whoa.

 

MV:  Oh, they can't see the comments. Oh, so it's top secret. You guys don't know what other people are writing. Oh, so we've got the inside scoop here. All right, OK? I still see some kind of trickling in. I actually added a few; I'll tell you in a second. I try to have fun and be creative with it, but let's take a look at what the A.I. providers came back with.

So here again is just a subset; we took a subset of the results. The green is what matched all three providers, yellow matched on at least two, and red only one provider. So again, back to our threshold: 85% and must match on two. These are the keywords that would be applied.

Dog, sand, beach, and ocean, and I'm positive I saw all four of those come up here. I also love 'vacation' and 'happy' and 'daytime.' Now what we're going to do is pause for another quick poll to see how many of you had some of these keywords on your list.

 

JH:  100% well…

 

MV:  I hope it's 100%, actually. OK, this is great, I love that participation. Fantastic, 59% voted. Come on, put your votes in.

 

JH:  Put down your coffee, put down your tea.

 

MV:  I love it. Oh, speaking of coffee and tea, I wanna do more of these, they're so fun. Alright, thank you. Wow.

 

JH:  Look at that. Look at that.

 

MV:  This proved you could be as smart as a computer, smarter than a computer. It's amazing. So although not perfect, I had 'happy,' 'Hawaiian shirt,' and 'sombrero.' I really feel like that tiny little dog has a sombrero, and I had to go there. But the point here is, A.I. in tandem with your DAM can do a pretty good job of setting up those foundational keywords.

 

JH:  What was great, though, Misti: some people got Hawaiian shirt, some people got shirt, others waves, looking at camera, Shiba. Ah, I can't speak, Shiba Inu, which is amazing. We also got sentiment: happy.

 

MV:  I love that. I love the sentiments, the happiness. Sentiment, I find, is one of the things that A.I. isn't quite great at yet. Sometimes it doesn't really pick up on the sentiment, especially for animals. Sometimes with people it can see them smiling and pick that up, but this just proves the fact that A.I. is great for that foundation. People are still really important, to augment that intelligence.

 

JH:  Yeah, and as with machine learning, we gotta tell the machine: that's not a lake. It looks like a lake, but it's not a lake. It's a beach, it's an ocean. So some of those things. And even with the sand versus pebbles, it was amazing that some of those were coming up. The fact that the Hawaiian shirt came up a few times, I love that, and there are so many other things it could go for. I saw holidays, people identified the house, this dog looks quite happy, could be funny. An awful lot of those, which were great. Well done.

 

MV:  Yeah, thank you so much for your participation. That exceeded my expectations. You guys are doing great. Alright, we're ready to move on a little bit. OK, we're going on to a new challenge that we discovered in our testing: the A.I. actually mislabeled an image of an internationally known female leader as male. We can't have this, so we knew we needed a tool to prevent certain keywords from being auto-applied: categories like race, gender, or other sensitive labels that even we as humans sometimes get wrong or shouldn't guess. And that's exactly what we did. We built a block list so we could accommodate all the different use cases for labels that should be systematically ignored.
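
A block list like this can be sketched very simply. The categories below are examples only; a real deployment would maintain its own list:

```python
BLOCKED_LABELS = {"male", "female", "man", "woman", "race", "ethnicity", "darkness"}

def apply_auto_tags(candidate_tags):
    """Split A.I. suggestions so sensitive labels are never auto-applied."""
    applied, ignored = [], []
    for tag in candidate_tags:
        (ignored if tag.lower() in BLOCKED_LABELS else applied).append(tag)
    return applied, ignored

applied, ignored = apply_auto_tags(["Podium", "Female", "Flag", "Suit"])
print(applied)  # ['Podium', 'Flag', 'Suit']
print(ignored)  # ['Female']
```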

Another obstacle during this exercise: now we had an influx of new keywords, and we needed a way to manage all of them. We understand taxonomy, and you just proved that you're great at it. The integrity of your thesaurus is pretty much as important as the air you breathe. So we knew early on that it was really important to prevent the A.I. from running amok and generating its own unvetted terms. So we added some boundaries. The administrators could set their own rules regarding new keywords that met our initial requirements of 85% confidence and a match on at least two providers. This way the organization could set the system to behave in one of three ways: add all the new words, we love them; ignore all the words that are offered if they don't already exist in our well vetted, well maintained, perfectly groomed thesaurus; or, a third option, flag all the new words as to-be-vetted and route them into a space where an administrator can quickly fly through them and make decisions for inclusion or exclusion. That's what you see in the screenshot here. OK, again, we love reinforcing these concepts with examples, so this example was run through three services and we extracted the new keywords that didn't already have a place in our thesaurus.
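
Those three behaviors amount to a small routing policy. Here is one hedged sketch of how they might be modeled; the names and data shapes are illustrative, not the product's actual configuration:

```python
from enum import Enum

class NewTermPolicy(Enum):
    ADD_ALL = "add"    # accept every new keyword into the thesaurus
    IGNORE = "ignore"  # keep only keywords already in the thesaurus
    FLAG = "flag"      # queue new keywords for an administrator to vet

def route_keywords(keywords, thesaurus, policy):
    accepted, review_queue = [], []
    for kw in keywords:
        if kw in thesaurus:
            accepted.append(kw)
        elif policy is NewTermPolicy.ADD_ALL:
            accepted.append(kw)
        elif policy is NewTermPolicy.FLAG:
            review_queue.append(kw)  # IGNORE silently drops new terms
    return accepted, review_queue

thesaurus = {"city", "skyline", "bridge"}
print(route_keywords(["city", "metropolis", "bridge"], thesaurus, NewTermPolicy.FLAG))
# (['city', 'bridge'], ['metropolis'])
```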

So using the DAM interface, I can easily review these keywords that meet our original criteria. And actually, looking at this, it kind of gives me an idea: I see an opportunity to improve the quality of my synonyms. I can identify my lead term, let's say city, and the synonyms: metropolis, metropolitan area, and urban area. OK, I would be remiss if I didn't include some cost analysis on each of the scenarios we're reviewing. So after we understood there was value here and this could work, it was time to take a look at the numbers. Suppose it takes one person one minute to keyword an image, this same exercise that we just did with the cute dog on the beach. If you leverage some tools like batch editing, we figured we're talking about a minute per image. That's still $4.25 million if you're paying someone about $15 an hour, and it would take that one person 133 years working full time on this project.

OK, let's suppose a team of 20 people tagging full time; it would still take them six and a half years to finish. So as we just proved, human tagging is superior, but it's 170 times more expensive and more involved. So let's share the task with A.I.: set that foundation of keywords in days, not years, and then use those keywords to batch and tackle the rest of the archive from newest and most relevant to oldest. That way we'll be able to effectively position the assets in the desired way, so that the organization's DAM can guide their users through their journey to find the perfect image.
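
As a back-of-the-envelope check on those figures, assuming $15 per hour, one minute per image, and a standard 40-hour week (the 133-year figure quoted above presumably uses slightly different working-year assumptions):

```python
IMAGES = 17_000_000
HOURLY_RATE = 15

hours = IMAGES * 1 / 60             # one minute per image -> ~283,333 hours
cost = hours * HOURLY_RATE          # ~$4.25 million
years_solo = hours / (40 * 52)      # ~136 full-time years for one person
years_team_of_20 = years_solo / 20  # ~6.8 years, close to "six and a half"

print(f"${cost:,.0f}, {years_solo:.0f} years solo, {years_team_of_20:.1f} years for 20")
```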

Alright, ready for the next one? They get better, I think. OK, changing gears: case study number 2. This performance organization had about 50,000 images just from events, including shots from fundraisers, concerts, and galas. Their goal was to quickly identify all the images that included donors, so the organization could easily thank them, share event recaps, feature them in presentations, and just keep track of their donors and their donor families over time. After understanding their problem, we felt the situation was a perfect fit for facial recognition. We started with a strategy to run the A.I. and allow it to tag the recognized faces with an unknown ID. It's pretty cool how this works: the A.I. system creates a map of the features on any person's face and turns that map into a signature. Then we keep track of the signatures and the names of the people they belong to. This helps to protect your data; if we don't know who the signature belongs to, we ask you. The problem we then ran into: how do we tackle this initial naming and identifying most efficiently? So here's what we did.

We created a filter for all the people in an asset or group of assets, and included the unnamed people. Then that filter sorted from the faces that were most often identified to the least. Their taggers could easily go in, click one of the unknowns, look at the 143 photos where that person was recognized, and then name them in batches. This also allowed us to use the merge feature when the A.I. wasn't quite sure if two people were the same. We see this sometimes happen if you have a profile shot and a front shot, or, in their case, with donors over time. Sometimes they have donors that are with them for 20 or 30 years, so it's the same person, and you have to teach the A.I. that this is a picture of the person from 20 or 30 years ago and this is a picture of the person today; still the same person. I wish I could show you this in action because it was incredibly satisfying.
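
In code, the sort-and-batch idea might look roughly like this sketch; the data shapes are assumptions for illustration, not a specific vendor's API:

```python
from collections import defaultdict

class FaceIndex:
    def __init__(self):
        self.assets_by_signature = defaultdict(set)  # signature id -> asset ids
        self.name_by_signature = {}                  # signature id -> person name

    def unknowns_by_frequency(self):
        """Unnamed signatures, most-photographed first, so taggers get the
        biggest win from each batch-naming pass."""
        unnamed = (s for s in self.assets_by_signature if s not in self.name_by_signature)
        return sorted(unnamed, key=lambda s: -len(self.assets_by_signature[s]))

    def name_batch(self, signature, person):
        """Name every photo of one signature in a single action."""
        self.name_by_signature[signature] = person

    def merge(self, keep, duplicate):
        """Same donor under two signatures, e.g. a photo from 20 years ago."""
        self.assets_by_signature[keep] |= self.assets_by_signature.pop(duplicate)
        self.name_by_signature.pop(duplicate, None)
```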

Alright, an interesting challenge that came up was that this company had donors in Europe, and I'm guessing some of you are in Europe right now. So we had to take into account, you know it before I even say it, GDPR and data privacy laws as they pertain to biometric data. We looked at this both as the data processor, us, and for the data controller. We did the research to understand what we needed to do to make sure we were both compliant. With Azure's services, all images are automatically deleted right after they're processed, so we're good there. However, we still needed a flag to remind users about the terms of use before they ran facial recognition. This way they could confirm whether or not they had received consent from the people in those images. Finally, we built an automation to be able to easily upload model releases and relate them to the associated images.

Pretty cool. OK, back to some cost analysis, because of course we need to justify this cost to leadership. This one was a no-brainer. Manually tagging faces we estimated at about 3 minutes per image, so $37,500 for a person to manually tag these images. To run facial recognition, the cost would be $20. That's right, $20. Then we factored in a cost for the manual part, because we still need a person to go in and teach the DAM who's who. So let's suppose, conservatively, that becomes one minute per image; that's $12,500. So it's three times more affordable for all the historical assets, but the savings will continue to grow as you grow your database. You've already done the legwork to teach the system how to identify people; now new people will easily be added and built into your workflow.
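
The same cost model, applied to the 50,000-image case (again assuming $15 per hour):

```python
IMAGES = 50_000
manual = IMAGES * 3 / 60 * 15         # 3 min/image, fully manual  -> $37,500
assisted = 20 + IMAGES * 1 / 60 * 15  # $20 of A.I. + 1 min/image  -> $12,520
print(manual, assisted, round(manual / assisted, 1))  # 37500.0 12520.0 3.0
```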

I know I say this about all of them, but really, this one was super cool to be a part of. I'm a big problem solver by nature, and I just think it's so cool when we're able to pull all the resources together and they work so beautifully. A large organization came to us and needed to caption their entire video database. They had been using a contracted team of people to do this, but it seemed like there was a more scalable solution. They needed these captions to be quickly translated into many different languages as well, and we needed to be able to plug into their omnichannel video distribution network. OK, so with this scenario we had two tasks: processing the existing assets and processing the future assets. So we relied on our platform to batch process assets that already existed and to set automations for all newly ingested assets.

Alright. It was important to us that these captions be easily searchable, not just to find content but to open exact sections of the content. This method was twofold. First, of course, we made sure we indexed all the captions so they'd be searchable. Then we connected all of the captions and facial recognition to the exact sections of video, so you could click on any face or any line in the transcript and it would jump you to that exact section in the video, so you could watch it. Finally, although natural language processing is quite impressive, we acknowledge that things can still be lost in translation or misheard by the technology.
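
Conceptually, the index pairs each caption with a video and a start time, so a search can return jump-to points. A minimal sketch with made-up data:

```python
captions = [
    # (video_id, start_seconds, caption text)
    ("gala_2021", 12.0, "welcome to our annual fundraiser"),
    ("gala_2021", 95.5, "thank you to our generous donors"),
    ("recital_07", 41.2, "our donors make this program possible"),
]

def search_captions(query):
    """Return (video, start_time) hits a player could jump straight to."""
    query = query.lower()
    return [(video, start) for video, start, text in captions if query in text.lower()]

print(search_captions("donors"))  # [('gala_2021', 95.5), ('recital_07', 41.2)]
```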

The solution for this one was pretty obvious: we needed to be able to click directly into any of the captions to make changes, if you had the permission to make those changes, of course. And then we have this little GIF, where you can see her clicking in and making those changes. All right, I bet you're picking up on a pattern by this point: how do we justify the cost? This solution actually required two A.I. platforms: Azure speech to text and Google Translate. If the organization had approximately 10,000 videos totaling 200,000 minutes, that would cost $8,000. To give you an order of magnitude for Google Translate, it's $20 for a million characters, and that's roughly two books of 175 or so pages. Alright, ultimately artificial intelligence is an incredibly valuable tool for organizations to take advantage of. However, you need to have the right system and people in place to support it. And that's it. I'm so excited to see some of your questions. I know John's gonna hop back on here, and I'm sure you've seen them coming in.

 

Start of Q&A

 

JH:  There is just an abundance of great questions coming in. Thank you so much, that was an excellent presentation. A.I., you know, years ago it was very science fiction, and I put HAL up there because people were like, I don't know. But it's real now, it's happening. It's not science fiction, it's actually something very real, and for many of you on the call today there was that interest to do more of it. It takes some time, it takes some planning, it takes people, and the robots take a lot of work. Misti provided an excellent outline of what that is. But Misti, wow, great examples, and the dog, that was great, it worked.

 

MV:  The dog was great, really. Thank you again. I love the participation, I love the engagement.

  

JH:  Welcome to a world where we do things virtually. Obviously if this was done in person we could have had live conversations, but everyone, well done for participating and playing along with Misti; that was super awesome. Alright Misti, ready? Because there are some questions coming your way.

 

MV:  I'll do my best to answer, but I will make sure everybody gets answers. So if we don't get to everything, or if I don't have an answer, we will park it and I'll make sure you get the answer you're looking for.

 

Audio Use Case for Artificial Intelligence

 

JH:  Alright, so this was interesting. An individual, Paul, and this goes to something you said earlier, voted no for A.I. because he works in audio 99% of the time. Is there an audio use case for A.I.? I know I have some feedback as well, but Misti, is there an audio use case? Have you been doing any audio work at Orange Logic?

 

MV:  I mean, natural language processing is audio, right? So I think there definitely are some applications using and leveraging natural language. I'm curious to know what kind of audio we're talking about. Are there people talking on the audio? Is it music?

 

JH:  This individual might be able to respond… actually, this person does have a secondary question about that as well, so I might give my feedback quickly.

 

MV:  Can I ask questions back to the person who asked a question?

 

JH:  If this person… there it is: a voice for radio.

 

MV:  I'm curious to understand what their application in audio is.

 

JH:  Uh, this individual wrote back VO, which I'm assuming is voice over.

 

MV:  Oh yeah, for sure then we could definitely use some natural language processing on that.

 

JH:  Yeah, and actually this individual wrote, again, this is great, this is live everyone: how accurate do you think this is for animated content, such as video games? Have you seen any video game cases?

 

MV:  That's a great question. I know that video games are actually really quite good at this, because they need to be accessible, and part of that capability is being able to translate into captions. That said, I haven't personally seen speech to text on video games, but I'd be happy to explore that with you.

 

JH:  Yeah, and I have seen that work with video game companies where they are doing A.I. recognition. It's also interesting to note there is now A.I.-created music. We had a discussion about this about two weeks ago; in fact, Google A.I. created music, and there is a new Nirvana song that has been created using artificial intelligence. Listening to how the song runs, it is incredible what they were able to do, to create a brand new Nirvana track. I don't know.


Time Based Metadata in Video Content

 

JH:  Right, question number 2, from Jacob. Have you applied this same approach to time-based metadata in video content, and did the results come back as relevant or overwhelming? So, time-based metadata in video content. Good question.

 

MV:  Yes, that's a good question… I'm trying to think of an example. I mean, I feel like all videos are time based, right? Do you understand the question? I don't quite understand it; I might need to read it.

 

JH:  Uh, I'm just going to try and find the question again. There's so many questions coming through.

 

MV:  We can park that one and come back to it.

 

JH:  Yeah, I wonder. From the research and work that we have done with some of our clients, we do know that with time-based content there is a lot of metadata that comes out of it, not only voice but also actions, and the products and the scenes that are within the timeline, so at a certain clip you can see them.

 

MV: I feel like I might be connecting the dots now… So if we, for example, auto tag a video, we can time-base those tags, so that might be what he's asking. So if you're watching a movie… oh sorry, go ahead.

 

JH:  He wrote back. This is great, I love this live. So Misti, you're getting live feedback. Thank you. Jacob was specifically talking about the object or keyword tagging.

 

MV:  Oh yes, OK, yes, absolutely. So if we run auto tag on a feature-length film, for example, it's gonna pick up the objects in the different sections. And the coolest part, I think, and of course I can only speak for my platform, is that you can click on the keyword and it will tell you the exact places in the video where that object was spotted. Take a piano, for example: in a full movie, that piano might be seen 35 different times during the duration of that movie, and you'll have little pinpricks so you can see exactly where and jump right to those places.
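
A sketch of that time-coded tagging idea, with invented timestamps, might look like this:

```python
from collections import defaultdict

tag_timeline = defaultdict(list)  # keyword -> seconds where it was detected

def record_detection(keyword, timestamp):
    tag_timeline[keyword].append(timestamp)

for t in (312.0, 1105.4, 4188.9):  # piano spotted at these points in the film
    record_detection("piano", t)

def pinpricks(keyword):
    """Jump-to points to render on the video scrubber for one keyword."""
    return sorted(tag_timeline[keyword])

print(pinpricks("piano"))  # [312.0, 1105.4, 4188.9]
```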

 

OCR: Optical Character Recognition in DAM

 

JH: Had one great question from Jennifer. Does A.I. work for things other than photos, like technical documents, marketing materials, etc.? So forget the images and video; does it work for other things, like PDFs? Yeah. So obviously OCR, optical character recognition.

 

MV:  Yeah, you can definitely use OCR.

 

JH:  Jennifer, if you are live with us and there's more to that, let us know. I'm going to give it a few seconds 'cause everyone seems to be responding quite well in a live format right now. But yes, OCR works well. Obviously, recognition of certain text copy can be trained, so if it's a financial report, it knows it's a financial report, etc. That has worked well. There are so many good questions here.

 

MV:  Summaries are actually a good consideration; there are a couple of A.I. programs out there that are able to summarize large bodies of text.

How to start using A.I. in your DAM

JH:  Tons of positive comments, excellent. Yes, the deck will be shared out afterwards, so if you are participating today, you will receive the live recording and the deck. Here is a great question for you from David, Misti, and this is something people are probably thinking about too: where would you recommend we start if we were looking to introduce A.I. tagging into our DAM? So where would you start? Would you start by looking at the platform? Would you look at what A.I. offers where? Where is a good place to start, Misti?

 

MV:  That is a great question, and I think one that a lot of other people are probably thinking about. A good place to start is, first of all, make sure that you're speaking to people who have done it before. That's going to really help you, because they've already experienced a lot of the challenges that you might experience and have answers already in place. Another important thing to do is identify the areas where there might be opportunities by writing down the objectives and the problems. Similar to what we did in my presentation, point out the problem: customers came to us and said, I have 17 million assets, I need to get them tagged, how do I do this efficiently? Or, I've got 50,000 image assets, I need to somehow identify the people in them. So if you know the problem and you speak to somebody who's done it, that's definitely a recipe for success.

 

JH:  And obviously those examples of the volumes you gave are real; people do have that much content to look at. There might be two stages to it as well. Look at the dog photo: the consistency of dog and beach was there, but then the sentiment, happy, joy… maybe that's a second phase, so maybe there's a first scrub and then a secondary phase of it. I think there's much to consider. Before I get to the next one: Jennifer did write back, saying it helps only if you could fill in specific fields with specific locations. Does that help a little bit, Misti, with that original question from Jennifer?

 

MV:  Only if you could… can you remind me?

 

JH:  Yeah, so the question was, does A.I. work for things other than photos, like technical documents and marketing materials, and then she followed up with: it is helpful, but only if you could fill in specific fields with specific locations. Might need a little bit more there too…

 

MV:  I might. Yeah, things are kind of churning. I feel like this might be a larger conversation, because I've seen some other really cool things along with accessibility where we're able to use in-text script, and that could be a direction or something that we could leverage. I'm sure there's something; I don't have a really great answer right now, but I'll make sure you get one.

 

JH:  Yeah, and Misti, your email is there at the bottom of the screen so that Jennifer can reach out. Here's an interesting point, this is really interesting, from Chris: is there an A.I. engine that can leverage IPTC metadata? For example, can an A.I. read human-generated descriptions to create more accurate keywords?

 

MV:  Yes, I mean, the quick answer is always going to be yes. A.I. evolves really rapidly, so that's another thing I think we should dig into a little bit more, and it might be something where we just have to apply it, run some testing, and learn from it.

 

JH:  Yeah, great question. A question from Loretta: what about pulling data from rolling credits at the end of a television show? Is that possible? OCR? Yeah, that would be able to be picked up, and certainly after a while the director's name and the producer's name would be recognized, 'cause they would be there all the time, and the actors' names, the talent; that could be picked up correctly. Right, question from Steven. I think this is a theme, I think that's what's happening here, Misti: can image recognition identify all visible text on packaging shots?

 

MV:  Yeah. I feel like I should have added that, because we do have OCR elements; I thought the other ones were going to be more interesting. So I think we might have to do another one of these and talk specifically about OCR.

 

JH:  Well, here's another one on that same theme, ish, from Jason. Great question. Do marathon races use A.I. to find clips of individual competitors based on their jersey numbers?

 

MV:  I actually have run a marathon and thought this exact thing. Not a marathon, let me retrace, I ran a half marathon, so anybody who's actually run a full marathon, props to you. But it was interesting: they put an RFID tag inside your bib. I don't think they specifically look at numbers, though they might scrape through the videos and find your number, but there's an RFID tag in your bib, so when you cross certain points, that's how you get your exact time. I can look to get you the exact answer, but I'm thinking it could be a combination of the two: knowing your RFID tag when you cross that last point, and potentially using A.I. to take the video and find your number.

 

JH:  Yeah, and I do know that the use case, the business case story, would have to be created for that, because it is a lot of money to spend. But I do know that the same A.I. technology is used for movie premieres and lineups: as the people enter, it recognizes, I know that's Tom Cruise, I know that's Hugh Jackman, I know that's Renee Zellweger. So it does work. The use case for a marathon might not be as strong, but yes, it could. The Olympics would be a great example of that, because there might be some well known athletes where the A.I. is like, I know that athlete, I know that athlete.

 

MV:  I think the race data might be held closely by the race organizers, so if you're holding just the content of that race and you don't actually have the RFID data, then, I mean, you're right. Yeah, that could be a great use case for it.

 

How does A.I. handle issues of race and skin color?

 

JH:  So here is one. This could be an entire other discussion for us, Misti, like a part 2 or part 3; there are so many other parts to this. One from Anna: I was evaluating a project once and the A.I. was tagging black people as darkness. How do these types of programs address issues of race and similar issues? And that is one of the bigger discussions about A.I. I have some opinions.

  

MV:  That's a whole conversation in itself.

 

JH:  There is research going on at both the academic level and the proprietary technological level on addressing those issues. There was a well publicized paper when Black Panther came out, and there were some definite discrepancies where the A.I. engines were not recognizing male and female characters. And as we said earlier, here we are in 2021: we want to be respectful and honest, act with integrity, and show respect for diversity and inclusion. Again, we're not perfect and technology is not perfect, and this is why both myself and Misti said A.I. is a technological and a human endeavor. You gotta put the two together, 'cause we do need to look at some of the things and say: in fact, that's this product, not that product; that's this actor, not that one, etc. Over time it will get better.

 

MV:  The more data we continue to feed it, the better the A.I. is going to get at recognizing these different races and all the hot topics. This is part of the reason why even the A.I. services now, if you read them, acknowledge the fact that they're sometimes not great at things like race or gender, or will give you conflicting results. It'll say, we're 50% sure this is a male and 50% sure this is a female. So it's not great yet, but that's why we built the block list: so you can say, for now we know A.I. isn't great at these things yet, let's just go ahead and ignore them, and then we'll address them as A.I. evolves in the future.

  

JH:  And I do think seriously, Misti, that this could be an entire presentation and panel, maybe at the next Henry Stewart, because…

 

MV:  I do think it would be a fun, hot topic. Something that we need to talk about, you know…

 

JH:  We need to talk about how technology is not perfect. We are humans, we're not perfect, we make mistakes all the time. But how do we do A.I. in such a way that we are respectful and show that diversity and inclusion, in such a way that mistakes may not be made? I'm now flagging this for a panel for the next Henry Stewart in New York.

 

MV:  Write it down.

 

JH:  There are so many great questions. Here's one from Charles. OK, so this one is about corporate policy, so security and corporate IP policy: we would have to run A.I. image recognition software on premises; cloud is not an option. What are your thoughts or suggestions on the challenge of cloud versus on-premise? Interesting, good question Charles. Misti?

 

MV:  That's such a good question, and we've actually had this exact challenge come up a few times. So I mean, there are lots of different answers. First, it's important to look at your infrastructure. Does it mean everything has to be completely within your own firewalls and can't leave? If that is what it is, there are libraries out there that we can take advantage of and move the entire library into your stack. So then you're leveraging A.I. You're not getting the benefit of everybody using the same system, but you're getting a library that's already learned from a lot of data, and you're able to take that entire library and put it into your system.

 

How Artificial Intelligence identifies different species

 

JH:  Absolutely, absolutely. Oh my goodness, the questions keep coming. Adam: how good is A.I. at identifying specific types? Will it know a rose is a rose, versus a tulip, versus a daisy? Or would it just tag it as a flower?

 

MV: Good question. I think I've seen a little bit of both. Sometimes it's really good at classifying things like the type of flower; I mean, in our example earlier, did the A.I. come up with the type of dog it was, or did it just know that it was a dog? In some cases it's going to be able to recognize that a rose is a rose or a daisy is a daisy, and it might likely come up with both: it might say this is a rose and a flower, and then tag both. But if you start getting into flowers that are less popular, just as we said earlier, A.I. needs to learn from repeated exposure to data. So the more flowers we send the A.I., the better it's going to get at classifying them correctly.

 

JH:  And my answer to that would be, what's the use case? Sometimes, if you're processing thousands and thousands of images, you just want to know: I'm looking for flowers, I'm looking for trees, that's it. You don't need to know the specific type of flower or tree, 'cause the use case is that they just need to see some things. However, some need to know: I need to see the flowers, and is it a rose, is it a tulip, or a daisy? So it depends on the use case.

 

MV:  And that would be the person element that we were talking about earlier. So if the A.I. went through and correctly classified something as a flower, which I'm confident it could, you could then apply the filter for flower, and then you have your entire list of flowers. It just organizes the ability of the human to go in there and quickly classify what the A.I. couldn't.

 

JH:  Excellent. Good question from Aaron: can A.I. identify hand gestures? For example, someone shows the peace sign with their two fingers.

 

MV:  Oh, I don't know. Do you know?

 

JH:  No, and there are things being done right now with American and British Sign Language, which are, by the way, two very different sign language systems, to do recognition of that. And I think, again, A.I. is so new as a discipline, both professionally and academically. All the things that we've been doing for the last century, we're now trying to say, well, can A.I. do this as well? So that's a great question.

 

MV:  There are tools that are targeted specifically for sign… sorry, let me turn that off… but I bet there are tools that were specifically built to take sign language and convert it into text.

Artificial Intelligence and Voice Recognition/Music Recognition

JH:  Excellent. "Please bring Misti back for more sessions," from Steven. Great, awesome, Misti; you're hired, you're coming back. I appreciate it. Then another question about A.I. used with sound files: can it do voice recognition or identify musical genres, for example? Interesting, musical genres.

 

MV:  Now these are so good. I'm gonna have to get back to him on that one; I want to test it to see.

 

JH:  So there is a whole world of music genres. Obviously, we do have the Shazam-type apps that can find the song, but there was a great article in the New York Times about a year ago where the whole idea of genre in itself is up for intellectual debate. How will you ever know what is rock versus hip hop versus urban? The genres do change. What was very interesting was the identification of using A.I. to find classical music. If you said, I'm looking for Clair de Lune by Debussy, which version of Clair de Lune are you looking for? In fact, they went to Siri and to Alexa, and even they were not sure; you would have to say, I'm looking for Clair de Lune as performed by this symphony at this… It's that specific with the music and with certain genres, and apparently classical music is the most difficult genre to get. I know, I want to go back to school. I love all these interesting situations.

 

MV:  I love this; that is such a relevant question right now. We're going to have to circle back on that one after I do a little bit of testing.

 

JH:  This is interesting, from Nat. Misti, do you ever get "I'm 80% confident this is a woman and 80% confident this is a man," for example, greater than 100% in total where there are finite options, huh?

 

MV:  Yes. Ah, I have seen that before. Mostly it's with two different competing providers, though. So it's not like one provider is coming back with conflicting answers; it's oftentimes two different ones. And if it comes back with something conflicting like that, then we just, you know, park that somewhere else. It's not something that we can use.

 

JH:  Great question here from Manuel; it's a comment and a question. Race description is not the same in all languages, and what is not correct in one language may be correct in another. How do you deal with such things in the A.I. if tagging is required to be multilingual? And I think this is a very good example, as many of your clients, and many of us on the call today, are working in global corporations where it's not just the United States or Canada. It is actually different countries, where there are different cultural norms and different language terms for identifying things. Do you have examples of global clients, Misti, that have had these types of unique A.I. instances?

 

MV:  I mean, naturally I start thinking about Google Translate and how things are lost in translation. I'm not quite sure if that exactly answers the question, though.

 

JH:  I do know, with some of our clients, about A.I. recognition in terms of cultural sensitivity and the identification of terms. Obviously, as with machine learning, we need to teach the machines the terms and the context. Obviously in America you say potato chip; if you're in the UK you say crisp. Different thing, ish. So there are things where you have to ask: who is my audience, and what are they looking for? The layer that has to go on top of that is respect: let's show respect with the words we use. There are some words we just don't use anymore, and for very good reason. The cultural sensitivity and the identification of the different languages, that is a layer we need to bring into the A.I., and again this goes back to: it's robots and it's people, like we said, right?

  

MV:  Yeah. And then again, we just have to teach it what's OK and what's not OK. Ultimately, I have seen some A.I.s lean towards different cultures based on how much data you're giving them, and it could come up with both, like crisp, right? You used a great example of potato chip versus crisps, or French fries versus chips.

 

JH:  Cookie versus biscuit. It's the same thing, ish, but what's the right word? It depends on the audience and who it's targeted for.

 

MV:  Again, people get that wrong too. I spent some time in Australia, and they call trash rubbish, and I just think it's so much prettier than trash. So now my kids call it rubbish, and people wonder what this mother is talking about… People get it wrong too, and with A.I., cultures are just kind of spanning now across the entire globe.

  

JH:  Got it. Roberta has confirmed; Roberta is an orchestra DAM manager and can confirm, so I guess that answers the question about the audio and the genre. That's great that we have people on this. Oh, I can't wait to do this live, Misti, because I think the discussion of the different types, with you talking, would be such a more organic experience.

 

JH:  Very good. We're running out of time, Misti, so to close it all out: there are a few other questions here about, where do I start, what do I do? Because, as you saw, the interest is there and people want to do more. People always ask for a best practice or a top tip, so what would you give to the audience today? What would you say to the audience if they were interested in, excited by, what was discussed today, and A.I. seems like something they want to do and should be doing? What's that one tip? What's that "what do we do"?

 

MV:  I have to go back to the original, which is: call someone who's done it before and has experience with it. Just as we saw today, there were some new use cases that I haven't personally experienced, and now I'm excited to go dive in. So you want someone that is a problem solver, and that is continuing to evolve as an organization as A.I. evolves. So, I mean, obviously give us a call, and we'll give you some demos; we'd love to just dive into your objectives with you and explore opportunities. Because sometimes it's not as black and white as it seems, but it's the journey getting there and the evolution of it which makes it really fun.

 

JH:  Perfect, Misti. On behalf of everyone here at Henry Stewart, I want to say thank you very much to you and Orange Logic for this. I had a great time today concluding the sixth session of this Metadata Roadshow. What a great, patient, super interactive audience. Thank you very much to everyone there.

The comments that we received, the questions; again, if we didn't get to everything, we have Misti's email, so if you want to ask more questions, do so. Tons more comments. Definitely, I think we're gonna have to do this in New York, Misti. A lot of good questions about how we solve race issues and gender issues in terms of A.I.; I think that is actually very important to discuss, and I think we as leaders in this industry can do that. But Misti, I want to just say thank you so much; excellent work. That was so much fun to do today, and I think everyone has learned a lot. And yes, the slide deck will be shared, so if you attended and you signed up, you will be getting the recorded version of this as well as the slide deck.

 

JH:  And again, Misti's information is there for you to use if you want to reach out. I just want to say, on behalf of Henry Stewart, I enjoyed doing this series. I do think we should do this again: one series, six sessions, each on a different topic, so many good things we've learned. With over 1,000 people attending, people are really seeing the power of what metadata can do for your DAM and any other content management system you are using. Metadata is important; it's necessary for what we do. That was demonstrated today with Misti and A.I., and in the other five sessions. So thank you very much. 2022 is going to be a different year; I think we're going to be back in person, in London, New York, and Los Angeles. So everyone, it's been great to have seen you over the last six weeks, and we will see you in 2022 in person. Thank you again, Misti, and everyone have a great day and a great rest of your week. Thank you very much.

 

MV:  Thank you so much, John. Thank you, Henry Stewart. Have a great day, everybody.


