What Does Artificial Intelligence Say About Human Creativity?

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In this post (part 4), I talk about creativity and how it relates to AI.

In the previous posts, I’ve been talking about the computer Watson and how it helped create a trailer for the movie Morgan. Is this “cognitive movie trailer” evidence of AI creativity or the potential to mimic human creativity? In other words, can a human be replaced by a machine—in this case a trailer editor who uses skill and imagination to create something new?

Let’s first consider what creativity is. The dictionary defines creativity as the ability to make new things or think of new ideas. But is it a trait only exhibited by humans? Is it an attribute that some people have and others don’t? Is it an occasional mental state that we enter? Can one learn to be more creative? I’m not sure of the answers to all these questions, but perhaps it’s more helpful to ask what creativity is not. It’s not problem solving, which is a process whereby a “rule” or “algorithm” is applied to solve a problem. Being able to understand and apply a rule is different from discovering the rule.

In the case of the computer Watson, we can see that understanding what a movie trailer is and identifying the best scenes from the movie Morgan fall into the realm of problem solving, not creativity. A human stepped in to do the actual film editing, which seems to suggest that the “creative” aspect of putting together the trailer could only be done by a person with the requisite editing skills and imagination to sequence the clips and add other components such as music and text. However, I don’t think a human was essential to the editing once the scenes were selected.

A movie trailer template could have provided a guide with placeholders for media and text, much the way iMovie trailers are created. In this screenshot, you can see an iMovie trailer template, which guides the choice of video clips and text. Scenes are suggested, as are text titles that form a story. Such a template could have been used along with the ten selected scenes from Morgan to produce a finished trailer. However, such an ability in an AI could not be called creative. Although some decision-making would be involved in selecting which scene goes into each placeholder, those steps would be guided by a set of rules—in other words, problem solving, not creativity. Also, templates would produce an assembly line of movie trailers that all follow the same format, rather than a unique trailer with sequences, pacing, music, and other features individually selected by an editor using his or her knowledge, skill, and imagination.
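To make the distinction concrete, here is a minimal sketch of the kind of rule-driven slotting such a template implies. The scene tags, durations, and placeholder rules are all hypothetical (iMovie exposes no such programming interface); the point is simply that every step follows a fixed rule.

```python
# A minimal sketch of rule-based trailer assembly from a template.
# Emotion tags, durations, and slot rules are hypothetical.
from dataclasses import dataclass

@dataclass
class Scene:
    clip_id: str
    emotion: str     # e.g., "tenderness" or "suspense"
    duration: float  # seconds

# The template: an ordered list of (required emotion, max duration) slots.
TEMPLATE = [
    ("tenderness", 4.0),  # open gently, introduce the character
    ("suspense", 3.0),    # hint at the conflict
    ("suspense", 2.5),    # raise the tension
    ("tenderness", 3.5),  # emotional beat before the close
    ("suspense", 2.0),    # final jolt
]

def fill_template(scenes):
    """Slot scenes into placeholders by matching emotion and length.

    Every decision follows mechanically from the template's rules,
    which is why this is problem solving, not creativity.
    """
    remaining = list(scenes)
    trailer = []
    for emotion, max_duration in TEMPLATE:
        for scene in remaining:
            if scene.emotion == emotion and scene.duration <= max_duration:
                trailer.append(scene)
                remaining.remove(scene)
                break
    return trailer
```

Two different editors running this template over the same ten scenes would get essentially the same trailer, which is exactly the assembly-line sameness described above.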

I think we are a long way from machines that think and create like humans. However, we are at a point where AI can be used to enhance human skills and help us perform tasks involving vast amounts of information. Artificial intelligence systems are already at work aiding, for example, the analysis of medical images, the detection of suspicious credit card charges, and automated telephone customer service. The real question is not whether AI can replicate human thinking or creativity but how AI can help humans create new things or think of new ideas faster and more efficiently.

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In the next and final post (part 5), I’ll discuss how AI might help scientists be better communicators.

How Did Artificial Intelligence (AI) Help Create a Movie Trailer?

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In this third post, I describe how the computer, Watson, helped create a movie trailer.

Before we get to the Watson movie trailer, let’s first think about how movie trailers are made. Movie trailers are designed to convince people to go see a particular movie. Superficially, trailers appear to be a condensed version of the film, but good trailers are carefully designed to raise expectations and to appeal to the viewer’s emotions. Most trailers follow a typical formula, modified for the genre, such as Action/Adventure, Comedy, Drama/Thriller, or Horror. Many trailers begin by introducing the characters and the setting of the film. Next to appear are the obstacles that change that world and set the characters on a new course. This may be followed by increasingly exciting, funny, or tension-filled scenes to ramp up the viewer’s desire to find out what happens. The specifics—selection of clips, the way they are cut (rapid-fire or slow-reveal), the fonts used for text titles, narration, music, and other choices—differ among movie genres.

All, however, are built more or less the same way by the trailer editor. The original movie is first watched carefully and deconstructed to reveal its basic components, visual and audio. The editor then slices the movie’s audio and video into segments that can be rearranged to build the trailer. Next comes the choice of the best elements to use. Is the acting superb? The cinematography? The story? Editors often select those elements that highlight the merits of the film or the ones that have the most emotional impact on a viewer.

Not surprisingly, the AI-enhanced trailer of the movie Morgan was created in much the same way as a regular trailer. The first step, however, was to train Watson to understand what a movie trailer is and what features of a movie are used in movie trailers. The IBM team did this through machine learning and Watson APIs (Application Programming Interfaces, i.e., programming instructions). Basically, each of 100 movie trailers was dissected into component scenes, which were then subjected to the following analysis: (1) Visual (identification of people, objects, and environment), (2) Audio (narrator and character voices, music), and (3) Composition (scene location, framing, lighting). Each scene was tagged with one of 24 emotions (based on visual and audio analysis) and further categorized as to type of shot and other features.
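IBM hasn’t published its training data in this form, but the description above implies a per-scene annotation record roughly like the following sketch. All field names and example values here are hypothetical.

```python
# A sketch of the per-scene training annotation implied by the article's
# description. Field names and values are hypothetical, not IBM's schema.
from dataclasses import dataclass, field

@dataclass
class SceneAnnotation:
    trailer_id: str
    scene_index: int
    # (1) Visual analysis
    people: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    environment: str = ""
    # (2) Audio analysis
    has_narration: bool = False
    music_mood: str = ""
    # (3) Composition
    location: str = ""
    framing: str = ""   # e.g., "close-up" or "wide shot"
    lighting: str = ""  # e.g., "high-key" or "low-key"
    # Derived labels
    emotion: str = ""   # one of the 24 emotion tags
    shot_type: str = ""

# One annotated scene from a hypothetical training trailer:
example = SceneAnnotation(
    trailer_id="trailer_042",
    scene_index=7,
    people=["protagonist"],
    environment="laboratory",
    music_mood="ominous",
    framing="close-up",
    lighting="low-key",
    emotion="suspense",
    shot_type="slow push-in",
)
```

A hundred trailers’ worth of such records would be the training set from which the system learns which combinations of visuals, audio, and composition mark a scene as trailer-worthy.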

Once Watson was trained, it was fed the full-length movie, Morgan. Based on its knowledge of what makes up a movie trailer (particularly a suspenseful one), Watson then selected ten segments as the best candidates for a trailer. These ten turned out to be scenes belonging to two broad categories of emotion: tenderness or suspense. Because the system was not taught to be a movie editor, a human editor was brought in to finish the trailer. The human editor ordered the segments suggested by Watson and also added titles and music. [see reference below for additional details]
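The selection step can be pictured as a simple ranking: score every scene of the full movie by how strongly it expresses the emotions the system associates with suspense trailers, then keep the top ten. Here is a minimal sketch under that assumption; the weights and the emotion_scores field are placeholders, not IBM’s actual model.

```python
# A hedged sketch of scene selection: rank scenes by trailer-worthy emotion
# and keep the top candidates. Weights and fields are assumed, not IBM's.
from dataclasses import dataclass

@dataclass
class MovieScene:
    scene_id: int
    emotion_scores: dict  # emotion -> confidence from visual/audio analysis

def select_trailer_scenes(movie_scenes, top_n=10):
    """Rank scenes by how strongly they express trailer-worthy emotions."""
    # Hypothetical weights for a suspense trailer.
    emotion_weights = {"suspense": 1.0, "tenderness": 0.8, "fear": 0.6}

    def score(scene):
        return sum(
            emotion_weights.get(emotion, 0.0) * confidence
            for emotion, confidence in scene.emotion_scores.items()
        )

    return sorted(movie_scenes, key=score, reverse=True)[:top_n]

# Example: the more suspenseful of two scenes ranks first.
scenes = [
    MovieScene(1, {"tenderness": 0.9}),
    MovieScene(2, {"suspense": 0.95, "fear": 0.4}),
]
print([s.scene_id for s in select_trailer_scenes(scenes, top_n=2)])  # [2, 1]
```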

Here’s the trailer that resulted, along with some explanations of how it was done (direct link to video):

As you saw, the end result looks and sounds like a typical movie trailer. The big question is whether this cognitive movie trailer does what a good trailer should: make us want to see the movie.

If you like science fiction films that explore questions about human engineering or artificial intelligence, then this trailer might appeal. The trailer does convey through the ten selected scenes that Morgan is an engineered creation that goes rogue—a story we’ve heard before. However, we are left in the dark about what exactly Morgan’s problem is (other than being locked up) and how the humans will deal with it. Many trailers fail by showing too much of the story. For example, the official Morgan trailer shows a lot more of the movie, which made the story sound similar to another movie, Ex Machina (an engineered human-like entity is confined in a futuristic laboratory, tested for flaws, runs amok, kills or maims one or more people, and escapes into the world). But by limiting what’s revealed, the Watson-enhanced trailer makes us think that maybe this story will differ from previous movies and be worth seeing.

I thought the computer-selected segments were interesting in that they not only conveyed a range of emotions (happiness, tenderness, suspense, fear), but many did so in a subtle way (a smile, a hand gesture, a slight gasp, a head turn). No scenes seemed to be selected from the latter part of the movie, which would have given too much of the story away. I don’t know if this was a result of the Watson system ranking scenes near the end lower than those from the beginning and middle.

In the end, I think the Watson-enhanced trailer is pretty good and perhaps better in some ways than the official trailer created entirely by a human.

For more information about the making of the Morgan movie trailer, see this article: Smith, J.R. 2016. IBM research takes Watson to Hollywood with first “cognitive movie trailer”. Think <https://www.ibm.com/blogs/think/2016/08/31/cognitive-movie-trailer/>

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In the next post (part 4), I’ll talk a bit about what AI means for human creativity.

What is Watson and What Does It Have to Do with Videos?

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In this second post (part 2), I describe Watson, a computer that was trained to assist in the making of a movie trailer.

In the previous post (part 1), I explained that IBM’s computer system, Watson, was used to help a Hollywood film studio make a trailer for the movie, Morgan. But what is Watson? According to the IBM website, Watson is “a technology platform that uses natural language processing and machine learning to reveal insights from large amounts of unstructured data”. Translating that into everyday language: Watson is a computer that can answer tricky questions like the ones posed on the game show Jeopardy!. In 2011, Watson beat two reigning champions, providing answers to Jeopardy! clues—example: even a broken one of these on your wall is right twice a day; correct reply: what is a clock?—and winning $1,000,000 (which was donated to two charities).

Actually, Watson is a cluster of computers (90 servers and 2,880 processor cores) running something called DeepQA software. Despite its performance on Jeopardy!, Watson does not “think” like a human and arrives at an answer to a question differently. Tons of information from various sources have been fed in, providing Watson with an enormous information base to analyze. For the game show, Watson used more than 100 algorithms to come up with a set of reasonable answers to a question. It then ranked those answers and searched its information database for any evidence in support of each answer. The answer with the most evidence was given the highest confidence. When its confidence was not high enough during the Jeopardy! game, though, Watson declined to answer rather than risk losing money.
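As a rough illustration of that answer-and-confidence loop, here is a simplified sketch. The candidate generators, evidence scorer, and threshold are placeholders; DeepQA’s real pipeline used more than 100 algorithms and machine-learned ranking.

```python
# A simplified sketch of the loop described above: many algorithms propose
# candidate answers, each is scored by supporting evidence, and the system
# abstains when confidence is too low. All components are placeholders.

def answer_question(question, generators, score_evidence, threshold=0.5):
    """Return the best-supported answer, or None to abstain."""
    # Each generator proposes candidate answers from the knowledge base.
    candidates = set()
    for generate in generators:
        candidates.update(generate(question))
    if not candidates:
        return None

    # Score each candidate by the evidence found in support of it.
    best_answer = max(candidates, key=lambda c: score_evidence(question, c))
    best_score = score_evidence(question, best_answer)

    # Like Watson on Jeopardy!, decline to answer rather than risk being wrong.
    return best_answer if best_score >= threshold else None

# Toy usage echoing the clue from the previous paragraph:
generators = [lambda q: {"what is a clock?", "what is a watch?"}]
evidence = lambda q, c: 0.9 if "clock" in c else 0.2
print(answer_question("Even a broken one of these on your wall is right twice a day",
                      generators, evidence))  # what is a clock?
```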

Despite fears that AI will eliminate jobs or go rogue and destroy humankind, as depicted in the Terminator series, Watson is viewed by its developers as a way to augment human intelligence and to reduce the time spent on tasks involving large amounts of information. IBM prefers the term Augmented Intelligence (systems that enhance and scale human intelligence) to Artificial Intelligence (systems that replicate human intelligence). There are many ways in which AI can augment information-intensive fields such as medicine, telecommunications, weather forecasting, and financial services. Since the Jeopardy! match, Watson has been used to create cognitive apps and computing tools for businesses and healthcare professionals.

It’s not difficult, then, to imagine AI systems aiding scientific research and especially the communication of those findings in a more efficient way. More and more people are getting their information, particularly about science, in the form of video, but many science professionals have little time or incentive to devote to learning and using new communication tools. A system that can reduce the time involved in making a video and simultaneously enhance the quality could greatly improve communication of science and its importance to society. The first cognitive movie trailer, aided by the computer, Watson, is a “proof of concept” in this regard.

For more information about Watson and its preparation for the Jeopardy! game show, see this article: Ferrucci, D., et al. 2010. Building Watson: An overview of the DeepQA project. AI Magazine 31(3): 59–79.

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In the next post (part 3), I’ll describe how Watson helped create a movie trailer.

Science Communication, Artificial Intelligence, and Hollywood

This is the first post in a series about Artificial Intelligence (AI) and how it might help scientists be better communicators. In this post, I introduce the topic.

Consider this futuristic scenario:

A scientist is working on a grant proposal and must create a three-minute video synopsis of what she plans to do with the funding and how her research will benefit society. This video synopsis is one of the required components of proposals submitted to government funding agencies. She logs onto a platform in the Cloud and uploads video clips showing her and her team working in the laboratory and talking on camera about the potential applications of the proposed research. An AI (Artificial Intelligence) system analyzes all of the uploaded information, as well as millions of images, animations, and video clips in the public domain. Within minutes, the AI system has identified the key components necessary to address the stated goals of the funding opportunity and has produced a draft video of the required length that is both intellectually and emotionally stimulating. The scientist takes the draft video file and makes a few edits based on her knowledge of the field and potential reviewers. She renders the final video and attaches it to her application package, which she submits to the funding agency. Her proposal is funded, and the funding agency uses her video synopsis on their website to inform the public about the research they are supporting and how it may affect them.

Far-fetched? Perhaps not. Recently, I was watching an episode of GPS in which Fareed Zakaria interviewed the CEO of IBM, Ginni Rometty, and my ears perked up when they talked about an AI helping a film editor cut a movie trailer, reducing the time required from weeks to a day. The movie studio, 20th Century Fox, recently collaborated with IBM Research and its computer Watson to produce the first computer-generated movie trailer for the science fiction film Morgan, which is about, appropriately enough, an artificially enhanced human.

Watson was trained to “understand” what movie trailers are and then to select key scenes from the full-length movie to create a trailer that would appeal to movie-goers. A similar approach could be applied to scientific information to produce a video proposal that resonates with peer reviewers and panelists, as in the hypothetical example above, or a video abstract to inform the scientific community about a recent journal article. The idea here is that a busy scientist may one day be able to use AI to rapidly scan a vast storehouse of data (much faster and more thoroughly than a human could) and suggest the best material and design for an information product such as a video.

AI is being considered as a way to enhance many activities involving the analysis of large amounts of data—such as in the medical or legal fields. Using AI to create movie trailers or science videos may seem to be a trivial goal compared to making a more accurate medical diagnosis; however, when you consider how important it is for science professionals to be good communicators, the idea seems worthwhile. In the coming posts, I’ll explore this topic further and provide a bit more detail about how IBM’s Watson was used to create a movie trailer.

This post is part of a series about Artificial Intelligence (AI) and its potential role in science communication. In the next post (part 2), I’ll provide more information about Watson, the computer.

Video Interviews: The Good, The Bad, and The Ugly

Unless you’re a member of the most isolated tribe on Earth, you probably know that we’ve all become potential reporters, capable of shooting video of unfolding events with our phones and instantly sharing it with the world through the Internet. New technologies have given the average person the means and the inspiration to chronicle and share their observations with a global audience. Citizen journalists have documented street demonstrations, natural catastrophes, political uprisings, wars, police shootings, and terrorist attacks. No longer bystanders, people are getting involved by capturing video that becomes key evidence in investigations, that informs search and rescue operations, and that provides spontaneous, person-on-the-street viewpoints. The massive contribution of these amateurs can be seen at CNN iReport, where more than 100,000 people posted their stories in 2012.

An increasing number of science professionals are also interested in reporting on their experiences conducting field research as well as at conferences and other scientific gatherings. Some people tweet about talks they heard or about a workshop they attended at a meeting. Conference attendees can become reporters through blogging and vlogging, which is blogging through the medium of video. Vloggers capture footage of various conference activities, such as poster sessions, provide commentary about some aspect of the conference, or interview other attendees about their research. Despite some reservations about premature dissemination of unpublished research through live tweeting and blogging, many conference organizers welcome these new reporting methods because they raise the visibility of the conference and generate excitement in attendees. Small conferences in particular can benefit from these activities.

In this post, I would like to focus on one of the most difficult tasks for the scientist videographer: interviewing other people. Conducting interviews on camera is always difficult, but trying to interview someone at a conference is particularly challenging because of all the noise and distractions. I recently attended a small conference (~300 people) and conducted a series of video interviews with the conference organizers, sponsors, and attendees. My overall goal was to produce a short video that explained what the conference was about, why the topic of the conference was important, and who some of the attendees were. I wanted to see if I could accomplish this by myself using a simple recording setup: my iPhone 6 and an inexpensive lapel microphone. The end result was a bit longer than I intended, but it pleased the conference organizers, who posted it on the conference website. Check it out (direct link) and then I’ll talk about some of the pros and cons below.

The following are some tips that I gleaned from the experience:

  1. First, decide on the objective and length of the video and stick to it. This tip may seem obvious, but often videographers reporting on an event such as a conference will not have a clear objective in mind. The result is a meandering video that fails to send a clear message. In my video, I had been asked by the conference organizers to shoot a video that basically explained the purpose of the meeting and that featured some of the organizers, sponsors, and attendees. In other words, I was somewhat restricted in the “story” I could tell. I also needed to keep the video brief. My target length was under five minutes, which I overshot. However, the organizers liked everything I included, so the final length turned out to be fine. I shot a lot of extra footage (answers to some spontaneous questions) that I would have loved to include but couldn’t without making the video drag on too long. If I had set out to do a video about mangrove researchers and what challenges they face, I would have used that extra footage. However, I was committed in this case to making a video about this particular conference. If you find yourself struggling for a topic, consider asking a single question of a particular segment of conference-goers such as, “Is this your first scientific conference? If so, what are you finding most surprising or interesting about the experience?” or “What one piece of advice would you give to students and early-career scientists about giving their first oral presentation?”
  2. Select interview subjects carefully. When it comes to interviewing, you will likely have to deal with a variety of people: some who shine on camera and others who ramble or have distracting mannerisms. Also, most people become a little nervous and stiff when on camera.
    1. One way to deal with this problem is to carefully select your interview subjects—if possible. I tried to select people to interview who seemed to be articulate and able to answer my questions without too much rambling. In some cases, I knew the person and was confident they would perform well on camera. In other cases, I watched people deliver their conference talk and, based on their delivery, decided whether they would be good interview subjects. In a few cases, I spoke with people beforehand to get an impression of how they would be on camera. I also had a secondary objective in selecting subjects: I wanted people who would be good interview subjects, but I also wanted to be challenged by interviewing people who had no prior experience on camera. I wanted to see if I could still get usable footage from people who were extremely nervous or had other on-camera issues. I found that I could get decent footage from everyone I interviewed if I just kept filming and asking questions until I got something good.
    2. Sometimes, the scientist videographer is restricted with respect to choice of interviewee. If you are making a video of a small workshop, for example, you are limited by the people who are in attendance. They all may have varying levels of difficulty speaking on camera and so you must work with what you have. The best way to deal with this is to try to put the interview subjects at ease by asking them easy questions first, ones that they should have no trouble answering quickly and concisely. Also, you can begin by just having a conversation with them and then turn on the camera after they have relaxed.
    3. At an international conference, you may need to interview people whose native language is not English or who have strong accents. One solution is to prepare and upload a word-for-word transcript along with the video, which can be used for closed captioning (see the sample caption file after this list). Viewers who have difficulty understanding an interview subject can turn on closed captioning and read the transcript.
    4. In general, if you are covering a large gathering like a conference, it’s a good idea to interview as many different types of people as possible. For this particular video, I wanted to have a good cross-section of people: conference organizers, sponsors, and attendees; established scientists, early career scientists, and students; male and female; people from different countries, not just the U.S.; and people working in different subfields.
  3. Ensure quality audio. Dealing with ambient noise at a conference is probably the biggest challenge for the scientist videographer. On the one hand, you want your interview subject to be clearly heard without distracting noises. On the other, shooting the interview in a crowd of people helps convey the reality and excitement of the conference. I tried a couple of approaches: interviewing people in a noisy poster session as well as outside the venue (either outdoors or in a quiet foyer). I found it easier to interview people in the quieter settings. They had less trouble hearing my questions, and there were fewer distractions for both me and my subject. But these quieter interviews did not have the same energy as the ones captured in the thick of things. In this case, the lapel microphone did a great job of recording the subject’s voice, which is heard clearly above the background noise.
  4. Choose an appropriate backdrop. In general, you want to avoid interviewing people against a blank wall or in front of a window or bright lamp. Also, you want to avoid a situation in which people can walk behind your subject—because the viewer’s attention can be distracted by what is happening in the background. In my interviews, I tried out a variety of backdrops, including conference or institutional posters and blank walls. As you can see in my video, the footage shot in front of a poster or other colorful background worked best. Getting the right combination of backdrop and good audio can be challenging, however.
  5. Avoid the “talking heads” syndrome. The best way to bore a viewer is to show a series of interviews in which the frame never deviates from the head and shoulders of the subjects. Even though the subject may be talking about something really interesting, the viewer’s eyes tell them nothing is happening. Instead, use cutaways to show what the interview subject is talking about. By frequently changing the view, you will add interest to your video. In my video, I used footage and images of mangroves and the conference from my personal library to augment the video interviews.
  6. Prepare interview questions beforehand. Think carefully about what questions you want to ask and have them on hand during the interview. As you saw, I started with a question about what the conference was all about. Next, I asked why the viewer should care about the conference topic: mangroves. I posed that question to someone I knew had extensive experience in many different countries and got a great answer. I next asked why this particular conference was important. That question elicited information from organizers and sponsors about the level of global interest in mangrove science. I then asked attendees to describe their particular topic of research that they were presenting at the conference. Here, I wanted to show how varied the research topics were as well as how varied the researchers themselves were. For example, I interviewed one of the plenary speakers, people who gave regular talks, and students presenting posters. Their answers provided a broad picture of research topics being reported at the meeting and also showed people at various stages in their career. Finally, I asked all of my interview subjects how they first became interested in mangroves, which prompted a variety of interesting, personal responses that told the viewer something about what motivated these scientists to study mangroves. Don’t restrict yourself to prepared questions, though. If you think of an off-the-cuff question during the interview, ask it. Such spontaneous queries often elicit the most interesting answers.
  7. Use camera equipment that is easy to carry, set up, and use. Filming at a conference is really difficult, especially if you also wish to attend the sessions. Using a setup that can be carried in a purse or backpack really simplifies the process. As I said above, I used my iPhone and an inexpensive lapel microphone to conduct the interviews. Having been interviewed by news reporters using only their cell phones to record, I knew that this was an approach used by professionals. This approach made it really easy for me to attend the sessions and then quickly set up during the breaks for the interviews. Basically, all I had to do was plug the mic into my phone and clip it to the subject’s shirt…and I was ready to film. In some instances, I attached my phone to a selfie stick, which helped me stabilize it and also position it to frame my subject correctly.
  8. Review footage (both video and audio) immediately. It’s good practice to do a brief check of your equipment before starting each interview. I usually do this by myself–I simply clip the mic to my shirt and turn the camera on myself. If I’m going to interview in a noisy poster session, for example, I’ll record a brief clip of my voice to make sure it’s audible above the background noise. When you finish an interview, it’s a good idea to review your footage to ensure there are no technical problems. I always take a quick look and listen while I’m still with the interview subject. In one case, I discovered that I had somehow tapped the record button twice, so that I failed to record anything at all. I was able to quickly redo the interview.
  9. Use movie-editing software to edit the interview footage. In interviews, you will capture a lot of footage that is unusable. Editing is essential to remove or minimize bloopers, shaky clips, loud noises, and other problematic footage. Subjects who are nervous tend to ramble and may also string together sentences without a break between, making it difficult to cut and splice footage. Sometimes, it’s necessary during the interview to ask the subject to pause a few seconds between sentences. These pauses will let you more easily extract short statements without cutting off the speaker mid-word. Once you have removed unusable parts, you then need to cut further. Resist the temptation to include everything you filmed. Also, avoid long sequences of one person talking. Edit the footage so that the scene changes frequently. I partially accomplished this by asking a question (in a text title) and then showing a series of clips of different subjects answering each question. I’ve already mentioned the use of cutaways to augment an interview—these cutaways will really help the viewer stay engaged and interested in what the interview subject is saying.
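A note on the captioning tip above: most video platforms accept caption files in the simple SubRip (.srt) format. Here is a short, made-up example of what a word-for-word transcript prepared for closed captioning looks like; the timecodes and wording are hypothetical.

```
1
00:00:00,500 --> 00:00:04,000
My name is Dr. Example, and I study mangrove ecology.

2
00:00:04,200 --> 00:00:08,700
This is my first international conference, and I'm presenting a poster.
```

Each numbered cue pairs a start and end timecode (hours:minutes:seconds,milliseconds) with the exact words spoken, so viewers can read along when an accent or background noise makes the audio hard to follow.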