[House Hearing, 116 Congress]
[From the U.S. Government Publishing Office]
ONLINE IMPOSTERS
AND DISINFORMATION
=======================================================================
HEARING
BEFORE THE
SUBCOMMITTEE ON INVESTIGATIONS
AND OVERSIGHT
OF THE
COMMITTEE ON SCIENCE, SPACE,
AND TECHNOLOGY
HOUSE OF REPRESENTATIVES
ONE HUNDRED SIXTEENTH CONGRESS
FIRST SESSION
__________
SEPTEMBER 26, 2019
__________
Serial No. 116-47
__________
Printed for the use of the Committee on Science, Space, and Technology
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
Available via the World Wide Web: http://science.house.gov
__________
U.S. GOVERNMENT PUBLISHING OFFICE
37-739PDF WASHINGTON : 2020
--------------------------------------------------------------------------------------
COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY
HON. EDDIE BERNICE JOHNSON, Texas, Chairwoman
ZOE LOFGREN, California
DANIEL LIPINSKI, Illinois
SUZANNE BONAMICI, Oregon
AMI BERA, California, Vice Chair
CONOR LAMB, Pennsylvania
LIZZIE FLETCHER, Texas
HALEY STEVENS, Michigan
KENDRA HORN, Oklahoma
MIKIE SHERRILL, New Jersey
BRAD SHERMAN, California
STEVE COHEN, Tennessee
JERRY McNERNEY, California
ED PERLMUTTER, Colorado
PAUL TONKO, New York
BILL FOSTER, Illinois
DON BEYER, Virginia
CHARLIE CRIST, Florida
SEAN CASTEN, Illinois
KATIE HILL, California
BEN McADAMS, Utah
JENNIFER WEXTON, Virginia

FRANK D. LUCAS, Oklahoma, Ranking Member
MO BROOKS, Alabama
BILL POSEY, Florida
RANDY WEBER, Texas
BRIAN BABIN, Texas
ANDY BIGGS, Arizona
ROGER MARSHALL, Kansas
RALPH NORMAN, South Carolina
MICHAEL CLOUD, Texas
TROY BALDERSON, Ohio
PETE OLSON, Texas
ANTHONY GONZALEZ, Ohio
MICHAEL WALTZ, Florida
JIM BAIRD, Indiana
JAIME HERRERA BEUTLER, Washington
JENNIFFER GONZALEZ-COLON, Puerto Rico
VACANCY
------
Subcommittee on Investigations and Oversight
HON. MIKIE SHERRILL, New Jersey, Chairwoman
SUZANNE BONAMICI, Oregon
STEVE COHEN, Tennessee
DON BEYER, Virginia
JENNIFER WEXTON, Virginia

RALPH NORMAN, South Carolina, Ranking Member
ANDY BIGGS, Arizona
MICHAEL WALTZ, Florida
C O N T E N T S
September 26, 2019
Page
Hearing Charter.................................................. 2
Opening Statements
Statement by Representative Mikie Sherrill, Chairwoman,
Subcommittee on Investigations and Oversight, Committee on
Science, Space, and Technology, U.S. House of Representatives.. 10
Written Statement............................................ 11
Statement by Representative Frank Lucas, Ranking Member,
Committee on Science, Space, and Technology, U.S. House of
Representatives................................................ 12
Written statement............................................ 12
Statement by Representative Don Beyer, Subcommittee on
Investigations and Oversight, Committee on Science, Space, and
Technology, U.S. House of Representatives...................... 13
Statement by Representative Michael Waltz, Subcommittee on
Investigations and Oversight, Committee on Science, Space, and
Technology, U.S. House of Representatives...................... 14
Written statement............................................ 14
Written statement by Representative Eddie Bernice Johnson,
Chairwoman, Committee on Science, Space, and Technology, U.S.
House of Representatives....................................... 15
Written statement by Representative Ralph Norman, Ranking Member,
Subcommittee on Investigations and Oversight, Committee on
Science, Space, and Technology, U.S. House of Representatives.. 16
Witnesses:
Dr. Siwei Lyu, Director, Computer Vision and Machine Learning Lab,
  SUNY - Albany
Oral Statement............................................... 17
Written Statement............................................ 19
Dr. Hany Farid, Professor of Electrical Engineering and Computer
Science and the School of Information, UC, Berkeley
Oral Statement............................................... 24
Written Statement............................................ 26
Ms. Camille Francois, Chief Innovation Officer, Graphika
Oral Statement............................................... 31
Written Statement............................................ 33
Discussion....................................................... 38
Appendix: Additional Material for the Record
Report submitted by Ms. Camille Francois, Chief Innovation
Officer, Graphika.............................................. 58
ONLINE IMPOSTERS
AND DISINFORMATION
----------
THURSDAY, SEPTEMBER 26, 2019
House of Representatives,
Subcommittee on Investigations and Oversight,
Committee on Science, Space, and Technology,
Washington, D.C.
The Subcommittee met, pursuant to notice, at 2:01 p.m., in
room 2318 of the Rayburn House Office Building, Hon. Mikie
Sherrill [Chairwoman of the Subcommittee] presiding.
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
Chairwoman Sherrill. The hearing will now come to order.
Good afternoon, and welcome to a hearing of the Investigations
and Oversight Subcommittee. We're here today to discuss online
impostors and disinformation. Researchers generally define
misinformation as information that is false, but promulgated
with sincerity by a person who believes it is true.
Disinformation, on the other hand, is shared with the
deliberate intent to deceive. It turns out that these days the
concepts of disinformation and online impostors are almost one and
the same. We all remember the classic scams and hoaxes from
the early days of e-mail--a foreign prince needs help getting
money out of the country. But today the more common brand of
disinformation is not simply content that is plainly
counterfactual, but that is being delivered by someone who is
not who they say they are. We are seeing a surge in coordinated
disinformation efforts, particularly around politicians, hot-
button political issues, and democratic elections.
The 2016 cycle saw Russian troll farms interfering in the
American discourse across Facebook, Twitter, and Instagram,
trying to sway public opinion for their preferred candidate.
But at the same time they were after something else much
simpler--to create chaos. By driving a wedge into the social
fissures in our society, sowing seeds of mistrust about our
friends and neighbors, exploiting social discord, they think
they might destabilize our democracy, and allow the oligarchy
to look a little more attractive by comparison.
When I was a Russian Policy Officer in the Navy, I learned
how central information warfare is in Russia's quest to
dominate Western nations. And, unfortunately, modern technology
makes information warfare a far easier proposition for
antagonists--foreign or domestic. In fact, it's perhaps too
easy today to proliferate convincing, harmful disinformation,
build realistic renderings of people in videos, and impersonate
others online. That's why the incidence of harmful episodes has
exploded in the last few years. They range from fake
reviewers misleading consumers on Amazon, to impersonating real
political candidates, to fake pornography being created with
the likenesses of real people. Earlier this year an alleged
deep fake of the President of Gabon helped trigger an
unsuccessful coup of the incumbent government. Deep fakes are
particularly prone to being weaponized, as our very biology
tells us that we can trust our eyes and our ears.
There are social science reasons why disinformation and
online impostors are such a confounding challenge. Research has
shown that online hoaxes spread 6 times as fast as true
stories, for example. Maybe human nature just likes a good
scandal, and confirmation bias shapes how we receive
information every time we log on, or open an app. If we
encounter a story, a video, or an influence campaign that seems
a little less than authentic, we may still be inclined to
believe it if the content supports the political narrative
already playing in our own heads. Our digital antagonists,
whether the intelligence service of a foreign adversary, or a
lone wolf propagandist working from a laptop, know how to
exploit all of this.
Our meeting today is the start of a conversation. Before
we, as policymakers, can address the threat of fake news and
online frauds, we have to understand how they operate, the
tools we have today to address them, and where the next
generation of bad actors is headed. We need to know where to
commit more resources, in the way of innovation and education.
Our distinguished witnesses in today's panel are experts in the
technologies that can be used to detect deep fakes and
disinformation, and I'm glad they're here to help us explore
these important issues. We're especially thankful that all
three of you were able to roll with the punches when we had to
move the hearing due to a change in the congressional schedule, so
thank you all. I'd also like to thank my Republican
counterparts who have been such great partners in this matter.
He will be here shortly, but Mr. Gonzalez of Ohio is joining us
today to inform his work on deep fakes, and I'm proud to be a
co-sponsor of his bill, H.R. 4355--here he is--and I thank you
for being here, Mr. Gonzalez.
[The prepared statement of Chairwoman Sherrill follows:]
Good morning and welcome to a hearing of the Investigations
and Oversight Subcommittee.
We're here today to discuss online imposters and
disinformation. Researchers generally define misinformation as
information that is false but promulgated with sincerity by a
person who believes it is true. Disinformation, on the other
hand, is shared with the deliberate intent to deceive.
It turns out that these days, the concepts of
disinformation and online imposters are almost one and the
same. We all remember the classic scams and hoaxes from the
early days of email - a Nigerian Prince needs help getting
money out of the country! But today, the more common brand of
disinformation is not simply content that is plainly
counterfactual, but that it is being delivered by someone who
is not who they say they are.
We are seeing a surge in coordinated disinformation efforts
particularly around politicians, hotbutton political issues,
and democratic elections. The 2016 election cycle saw Russian
troll farms interfering in the American discourse across
Facebook, Twitter, Instagram, YouTube and beyond, trying to
sway public opinion for their preferred candidate. But at the
same time, they were after something else much simpler: to
create chaos. By driving a wedge into the social fissures in
our society, sowing seeds of mistrust about our friends and
neighbors, exploiting social discord, they think they might
destabilize our democracy and allow the oligarchy to look a
little more attractive by comparison. When I was a Russian
policy officer in the Navy, I learned how central information
warfare is in Russia's quest to dominate western nations. And
unfortunately, modern technology makes information warfare a
far easier proposition for our antagonists, foreign or
domestic.
In fact, it's perhaps too easy today to proliferate
convincing, harmful disinformation, build realistic renderings
of people in videos, and impersonate others online. That's why
the incidence of harmful episodes has exploded in the last few
years. They range from fake reviewers misleading consumers on
Amazon, to impersonating real political candidates, to fake
pornography being created with the likenesses of real people.
Earlier this year, an alleged deepfake of the President of
Gabon helped trigger an unsuccessful coup of the incumbent
government. Deep fakes are particularly prone to being
weaponized, as our very biology tells us that we can trust our
eyes and ears.
There are social science reasons why disinformation and
online imposters are such a confounding challenge: research has
shown that online hoaxes spread six times as fast as true
stories, for example. Maybe human nature just likes a good
scandal. And confirmation bias shapes how we receive
information every time we log on or open an app. If we
encounter a story, a video or an influence campaign that seems
a little less than authentic, we may still be inclined to
believe it if the content supports the political narrative
already playing in our own heads. Our digital antagonists,
whether the intelligence service of a foreign adversary or a
lone wolf propagandist working from a laptop, know how to
exploit all of this.
Our meeting today is the start of a conversation. Before we
as policymakers can address the threat of fake news and online
frauds, we have to understand how they operate, the tools we
have today to address them, and where the next generation of
bad actors is headed. We need to know where to commit more
resources in the way of innovation and education.
Our distinguished witnesses on today's panel are experts in
the technologies that can be used to detect deep fakes and
disinformation, and I'm glad they are here to help us explore
these important issues. We are especially thankful that all
three of you were able to roll with the punches when we had to
move the hearing due to a change in the Congressional schedule.
I'd also like to thank my Republican counterparts who have
been such great partners on this matter. Mr. Gonzalez of Ohio
is joining us today to inform his work on deep fakes. I'm proud
to be a cosponsor of his bill H.R. 4355, and I thank you for
being here, Mr. Gonzalez.
Chairwoman Sherrill. Unfortunately Ranking Member Norman
could not be with us today, but we are happy to have the full
Committee Ranking Member in his place, so the Chair now
recognizes Mr. Lucas for an opening statement. Thank you, Mr.
Lucas.
Mr. Lucas. Thank you, Chairwoman Sherrill, for holding this
hearing on the growing problem of disinformation on social
media. We all know that photos these days can be digitally
altered so easily that it's almost impossible to tell what's
real and what's not. Now there's a growing problem where audio
and video can be altered so convincingly that it can appear
that someone has said or done something that never happened.
These deep fakes can be produced more and more easily.
You know, there was once a rumor that I myself was a deep
fake, just impersonating the real Frank Lucas. The good news,
or, depending on your perspective, perhaps the bad news, is the
technology hasn't come quite that far, and I'm the real deal.
But once it's on the Internet, it never goes away. But deep
fake technology is getting more and more sophisticated, and
it's also getting easier to produce. As our witnesses will
discuss today, the technology for generating deep fakes is
improving at a rapid clip. Soon anyone with a decent computer,
and access to training data, will be able to create
increasingly convincing deep fakes that are difficult to detect
and debunk. False and misleading content like this undermines
public trust, and disrupts civil society. Unfortunately, the
technology for generating deep fakes is developing at a speed
and a scale that dwarfs the technology needed to detect and
debunk deep fakes. We must help level the playing field.
This Committee took the first steps to do this yesterday by
passing bipartisan legislation aimed at improving research
into the technology to detect deep fakes. I want to commend
Representative Anthony Gonzalez for introducing this bill, and
his leadership on the issue of technology and security. I often
say that one of our most important jobs on the Science
Committee is communicating to the American people the value of
scientific research and development. Legislation and hearings
like this are a great example of how the work we do here can
directly benefit people across the country, and I look forward
to hearing from our witnesses, and I yield back my time, Madam
Chair.
[The prepared statement of Mr. Lucas follows:]
Thank you, Chairwoman Sherrill, for holding this hearing on
the growing problem of disinformation on social media.
We all know that photos these days can be digitally altered
so easily that it's all but impossible to tell what's real and
what's not.
There's now a growing problem where audio and video can be
altered so convincingly that it can appear that someone has
said or done something that never happened. These deepfakes can
be produced more and more easily.
You know, there was once a rumor that I MYSELF was a
deepfake, just impersonating the real Frank Lucas. The good
news-or maybe the bad news-is that technology hasn't come quite
that far and I am the real deal.
But deepfake technology IS getting more sophisticated. And
it's also getting easier to produce. As our witnesses will
discuss today, the technology for generating deepfakes is
improving at a rapid clip. Soon, anyone with a decent computer
and access to training data will be able to create increasingly
convincing deepfakes that are difficult to detect and debunk.
False and misleading content like this undermines public
trust and disrupts civil society.
Unfortunately, the technology for generating deepfakes is
developing at a speed and scale that dwarfs the technology
needed to detect and debunk deepfakes. We must help level the
playing field.
This Committee took the first step to do that yesterday by
passing bipartisan legislation aimed at improving research into
the technology to detect deepfakes.
I want to commend Representative Anthony Gonzalez for
introducing this bill and for his leadership on the issue of
technology and security.
I often say that one of our most important jobs on the
Science Committee is communicating to the American people the
value of scientific research and development. Legislation and
hearings like this are a great example of how the work we do
here can directly benefit people across the country.
I look forward to hearing from our witnesses, and I yield
back my time.
Chairwoman Sherrill. Well, thank you, Ranking Member Lucas.
And we have an additional opening statement today from my
colleague across the aisle, Representative Waltz of Florida.
Unfortunately, Mr. Waltz could not make it to the hearing
today, but considering his great interest in the issue, I
allowed him to submit a video of his opening statement, so
we'll now hear from Mr. Waltz.
Mr. Waltz. Hello, everyone. I'm sorry I can't be in town
for the hearing today, but I wanted to make sure to share my
concerns about digital impostors. Everyone in this room relies
on social media, video messages, and other digital technology
to connect with our constituents. We listen to their concerns,
we share information about our work in Congress. But deep fake
technology, which can literally put words in our mouths,
undermines public trust in any digital communication. Today's
witnesses will paint a picture of just how sophisticated the
technology has become for creating realistic images, videos,
and personalities online.
Before I conclude my statement, I want to say a few words
about our distinguished Subcommittee Chairwoman, Mikie
Sherrill. I think we can all agree that Mikie is one of the
most intelligent, accomplished, and persuasive Members of
Congress. In fact, she's so persuasive that she convinced me, a
Green Beret, to cheer on Navy football in this year's rivalry
game. Thanks, Chairwoman Sherrill, for bringing attention to
the problems of deep fake technology, and go Navy, beat Army.
Chairwoman Sherrill. What a pleasure. As you all saw that--
thank you so much for your work. That was obviously a deep
fake. That is what we're looking at, and that is what we're
discussing today. Thank you so--right? How nice is that? And,
sadly, knowing how deep the commitment to our respective
services' football is, I do know that that was not actually
your sentiment, although it should be. So thank you, Mr. Waltz
and Mr. Beyer, for your willingness to participate in our deep
fake demonstration, and thank you to our distinguished
witnesses, Dr. Lyu, for creating this video.
I'll now recognize Mr. Beyer and Mr. Waltz for a few
remarks. Mr. Beyer?
Mr. Beyer. Yes. Thank you, Madam Chair, very much.
Congressman Waltz and I really had fun making the deep fake
video. You can see that it clearly was in jest. As an Army
brat, I would never throw a Green Beret under the bus. But you
also see how dangerous and misleading it could be. I'm sure we
fooled a couple of people. For instance, what if I had said,
instead of go Navy, go beat Army, I had said, it's time to
impeach the President? Well, that would go viral everywhere. I
mean, the phones would be ringing off the hook, and the social
media----
Mr. Waltz. Please do not do that to my staff.
Mr. Beyer. No. And Mr. Waltz would be the first to know, so
my friends might appreciate it, but I don't think he would at
all, so obviously the potential for serious harm with these
deep fakes is quite great--on elections, on the international
stage for diplomatic purposes, and even in our private lives. That's why
we, as a country, need to take swift action and invest in the
research and the tools for identifying and combating deep
fakes, and create a national strategy immediately, especially
for election integrity, and ahead of the 2020 presidential
election.
The stakes are high. We've got to act now. We already know
of Russia's intentional campaign to spread disinformation
throughout the last one, and I don't even want to imagine what
havoc Russia, or China, or just private players could
wreak on our elections and on our personal lives. So thank you
very much to Mikie Sherrill and Frank Lucas for leading this
effort. I yield back.
Chairwoman Sherrill. Thank you very much. Mr. Waltz?
Mr. Waltz. Thank you, Madam Chairwoman. And while I do
certainly hold you in the highest regard, that was not me. But,
just to add to my colleagues, that's just an example, and a
small example, of what a deep fake synthetic video can do. And
we've seen this insidious capability. We're seeing, I think,
the birth of it. But I certainly support my colleagues in how
we can get our arms around this as a country. I think it's
important to note that Mr. Beyer and I both consented to that
video, but, you know, putting words in the mouth of a U.S.
Army Green Beret and having him cheer for Navy is not the worst
application of this technology, and it's certainly not
difficult to imagine how our enemies or criminal groups can
wreak havoc on governments, on elections, on businesses, on
competitors, and the privacy of all Americans. So these videos,
and this technology, have the potential to truly be a weapon
for our adversaries.
We know that advanced deep fake technology exists within
China and Russia. We know that they have the capability, and
that both countries have demonstrated a willingness to use
asymmetric warfare capabilities. So, as the technology for
generating deep fakes improves, we do risk falling behind on
the detection front. That's why this hearing is so important,
and I certainly commend you for calling it. It will help us
examine solutions for both detecting and debunking the deep
fakes of the future. And, you know, at the end of the day, I
just have to say go Army, beat Navy. I yield back.
[The prepared statement of Mr. Waltz follows:]
What you just saw was an example of a ``deepfake,'' or
synthetic video that can be generated thanks to advancements in
artificial intelligence and machine learning.
As we have just seen, deepfakes have the ability to make
people-myself included-appear as though they have said or done
things that they have never said or done. And advancements in
the underlying technology, as we will hear today, are making it
much more difficult to distinguish an authentic recording from
synthetic, deepfake impersonations.
Importantly, Mr. Beyer and I both consented to and
participated in the creation of this deepfake. But a Green
Beret cheering for Navy is not the worst application of the
technology.
It's not difficult to imagine how deepfakes of
nonconsenting individuals could be used to wreak havoc on
governments, elections, business, and the privacy of
individuals.
Deepfakes have the potential to be a weapon for our
adversaries and we know that advanced deepfake technology
exists in China and Russia and that both countries have
asymmetric warfare capabilities.
As the technology for generating deepfakes improves, we
risk falling behind on the detection front. That's why today's
hearing is so important. It will help us examine solutions for
detecting and debunking deepfakes of the future.
Thank you Chairwoman Sherrill and Ranking Member Norman for
convening this important hearing.
Yield back.
Chairwoman Sherrill. I don't know why I let you testify in
my--no, thank you very much. Those were really sobering
comments, and I appreciate you both for showing us a little bit
of what we're contending with.
[The prepared statement of Chairwoman Johnson follows:]
Thank you Madam Chair, and I would like to join you in
welcoming our witnesses this morning.
I'm glad we're holding this hearing today. It's worth
acknowledging just how deeply the phenomenon of online
disinformation affects most of our lives these days. As long as
there's been news, there's been fake news. But the American
people are far more connected than they used to be. And the new
tools that enable fake images, misleading public discourse,
even long passages of text are alarming in their
sophistication. Maybe we all should have seen this coming, the
explosion of disinformation that would accompany the
information age.
I suspect my colleagues here in the House are already
taking this matter seriously, because in a way, online
imposters and twisted facts on the internet present a real and
active threat to the way we do our own jobs. We all use social
media to connect with our constituents and to hear about their
concerns. My staff want to read the comments and the posts from
the people in Dallas and hear what they have to say. If I am to
believe that a large percentage of the comments on Twitter are
coming from ``bots'' or some other source of disinformation,
the waters get muddy very quickly.
We have to acknowledge the serious legacy of disinformation
in this country. In the late 1970s, I was working under
President Carter as a Regional Director for the Department of
Health. Around that time, the Soviet Union's KGB kicked off a
campaign to plant the idea that the United States government
invented HIV and AIDS at Fort Detrick. The KGB wrote bogus
pamphlets and fake scientific research and distributed them at
global conferences. It sold a complex narrative in which the
United States military deliberately infected prisoners to
create a public health crisis -- biological warfare against our
own people. The KGB's efforts were so pervasive that by 1992,
15% of Americans considered it ``definitely or probably true''
that the AIDS virus was created deliberately in a government
laboratory. Decades later, a 2005 study found that a
substantial percentage of the African American community
believed that AIDS was developed as a form of genocide against
black people.
How absolutely devastating such disinformation can be. It
is clear that information warfare can have such profound,
destructive effects. I think it is long past time to recognize
how vulnerable we are to the next generation of hostile actors.
As Chairwoman Sherrill said, the first step in addressing a
big problem is understanding it. Not every Member of this
Committee, myself included, is well-versed in what a ``deep
neural network'' is or how a ``GAN'' works. However, we have a
sense already that the federal government is likely to need to
create new tools that address this issue.
We also need to have a serious conversation about what we
expect from the social media platforms that so many of us use
every day. These companies have enjoyed a level of growth and
success that is only possible in the United States. They were
created in garages and dorm rooms, but they stand on the
shoulders of giants like DARPA, which created the internet, and
the National Science Foundation, which developed the backbone
of computer networks that allowed the internet to blossom. The
American consumer has been overwhelmingly faithful to social
media over the past decade. We will need those companies to
help combat disinformation. It can no longer be ignored.
I am pleased to welcome our witnesses today, and I'm also
pleased that we had bipartisan agreement in yesterday's markup
on a bill that would enable more research on deep fakes. These
issues require a bold bipartisan response. I thank my
colleagues on both sides of the aisle for working together to
address these important issues. With that, I yield back.
[The prepared statement of Mr. Norman follows:]
Good afternoon and thank you, Chairwoman Sherrill, for
convening this important hearing.
We are here today to explore technologies that enable
online disinformation. We'll look at trends and emerging
technology in this field, and consider research strategies that
can help to detect and combat sophisticated deceptions and so-
called ``deepfakes.''
Disinformation is not new. It has been used throughout
history to influence and mislead people.
What is new, however, is how modern technology can create
more and more realistic deceptions. Not only that, but modern
disinformation can be spread more widely and targeted to
intended audiences.
Although media manipulation is nothing new, it has long
been limited to altering photos. Altering video footage was
traditionally reserved for Hollywood studios and those with
access to advanced technological capabilities and financial
resources.
But today, progress in artificial intelligence and machine
learning has reduced these barriers and made it easier than
ever to create digital forgeries.
In 1994, it cost $55 million to create convincing footage
of Forrest Gump meeting JFK. Today, that technology is more
sophisticated and widely available.
What's more, these fakes are growing more convincing and
therefore more difficult to detect. A major concern is this: as
deepfake technology becomes more accessible, the ability to
generate deepfakes may outpace our ability to detect them.
Adding to the problem of sophisticated fakes is how easily
they can spread. Global interconnectivity and social networking
have democratized access to communication.
This means that almost anyone can publish almost anything
and can distribute it at lightspeed across the globe.
As the internet and social media have expanded our access
to information, technological advancements have also made it
easier to push information to specific audiences.
Algorithms used by social media platforms are designed to
engage users with content that is most likely to interest them.
Bad actors can use this to better target disinformation.
For example, it is difficult to distinguish the techniques
used in modern disinformation campaigns from those used in
ordinary online marketing and advertising campaigns.
Deepfakes alone are making online disinformation more
problematic. But when combined with novel means for
distributing disinformation to ever more targeted audiences,
the threat is even greater.
Fortunately, we are here today to discuss these new twists
to an old problem and to consider how science and technology
can combat these challenges.
I look forward to an engaging discussion with our
distinguished panel of witnesses on how we can better address
online disinformation.
Thank you again, Chairwoman Sherrill, for holding this
important hearing, and thanks to our witnesses for being here
today to help us develop solutions to this challenge. I look
forward to hearing your testimony.
I yield back.
Chairwoman Sherrill. At this time I would like to introduce
our three witnesses.
First we have Dr. Siwei Lyu. Dr. Lyu is a Professor at the
University at Albany's College of Engineering and Applied
Sciences. He is an expert in machine learning, and media
forensics. Next is Dr. Hany Farid. Dr. Farid is a Professor at
the University of California Berkeley School of Electrical
Engineering and Computer Science and the School of Information.
Dr. Farid's research focuses on digital forensics, image
analysis, and human perception. Last we have Ms. Camille
Francois. Ms. Francois is the Chief Innovation Officer at
Graphika, a company that uses artificial intelligence to
analyze online communities and social networks.
As our witnesses should know, you will each have 5 minutes
for your spoken testimony. Your written testimony will be
included in the record for the hearing. When you all have
completed your spoken testimony, we will begin with questions.
Each Member will have 5 minutes to question the panel. And
we'll start with you, Dr. Lyu.
TESTIMONY OF DR. SIWEI LYU,
PROFESSOR, DEPARTMENT OF COMPUTER SCIENCE,
DIRECTOR, COMPUTER VISION AND MACHINE LEARNING LAB,
UNIVERSITY AT ALBANY, STATE UNIVERSITY OF NEW YORK
Dr. Lyu. Good afternoon, Chairwoman Sherrill, Ranking
Member Lucas, and Members of the Committee. Thank you for
inviting me today to discuss the emerging issue of deep fakes.
You have just seen a deep fake video we created for this
hearing, so let me first briefly describe how this video, and
similar fake videos, are made.
Making a deep fake video requires a source and a target. In
this case, the source was Representative Beyer, and the target
was Representative Waltz. Mr. Beyer's staff was kind enough to
prepare a video of the Congressman for this project. While Mr.
Waltz's office consented to this video demonstration, it is
important to know that we didn't use any video from his office.
Instead, we conducted an Internet search for about 30 minutes,
and found one suitable minute-long YouTube video of Mr. Waltz,
and that's our target video. The next step involves a software
tool we developed, which used deep neural networks to create
the fake video. It is important to note that our tool does not
use a generative adversarial network, or GAN.
It first trains the deep neural network models using the
source and the target video. It then uses the models to extract
facial expressions from the source video of Mr. Beyer, and
generate a video of Mr. Waltz with the same facial expressions.
The audio track is from the original video of Mr. Beyer, and
was not modified. The training and the production are performed
on a computer equipped with a graphical processing unit, or
GPU. The computer and the GPU can be purchased from Amazon for
about $3,000. The overall training and production took about 8
hours, and were completely automated, after setting a few
initial parameters.
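    To make the architecture Dr. Lyu describes a little more concrete,
the following is a minimal sketch, in Python with the PyTorch library,
of the common shared-encoder, two-decoder face-swap design. It is an
illustrative assumption rather than the witness's actual tool: face
detection, alignment, and data loading are omitted, the network sizes
are arbitrary, and random tensors stand in for cropped face images.

    # Minimal sketch of a shared-encoder / two-decoder face-swap model.
    # Illustrative only; not the tool described in the testimony.
    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(64 * 16 * 16, 256),
            )
        def forward(self, x):
            return self.net(x)

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(256, 64 * 16 * 16)
            self.net = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
            )
        def forward(self, z):
            x = self.fc(z).view(-1, 64, 16, 16)
            return self.net(x)

    encoder = Encoder()       # shared between both identities
    decoder_src = Decoder()   # reconstructs the source face (the Beyer role)
    decoder_tgt = Decoder()   # reconstructs the target face (the Waltz role)
    opt = torch.optim.Adam(
        list(encoder.parameters())
        + list(decoder_src.parameters())
        + list(decoder_tgt.parameters()),
        lr=1e-4,
    )
    loss_fn = nn.L1Loss()

    for step in range(100):   # a real run trains for hours on a GPU
        src_faces = torch.rand(8, 3, 64, 64)  # stand-ins for aligned source crops
        tgt_faces = torch.rand(8, 3, 64, 64)  # stand-ins for aligned target crops
        loss = loss_fn(decoder_src(encoder(src_faces)), src_faces) + \
               loss_fn(decoder_tgt(encoder(tgt_faces)), tgt_faces)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # At inference time the swap is decoder_tgt(encoder(src_face)):
    # render the target's face with the source's facial expression.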
So a similar process was also used to generate the fake
videos that are being displayed on the screen right now.
Although we do not distribute this particular software, similar
software for making deep fakes can be found on code-sharing
platforms like GitHub and is free for anyone to download and
use. With the abundance of online media we share, anyone is
a potential target of a deep fake attack.
Currently there are active research developments to
identify, contain, and obstruct deep fakes before they can
inflict damage. The majority of such research is currently
sponsored by DARPA (Defense Advanced Research Projects Agency),
most notably the MediFor (Media Forensics) program. But it is
also important that the Federal Government fund more research,
through NSF (National Science Foundation), to combat deep
fakes. As an emerging research area that does not fall squarely
into existing AI (artificial intelligence) or cybersecurity
programs, it may be wise to establish a new functional program
at NSF dedicated to similar emerging technologies. It can serve
as an initial catch-all for similar high-risk and high-impact
research until either an existing program's mission is
expanded, or a new dedicated program is established.
We should also examine the ways in which we share software
code and tools, especially those with potential negative
impacts like deep fakes. Therefore, it may be wise to consider
requiring NSF to conduct reviews of sponsored AI research and
to enforce controls on the release of software code or tools
with a dual-use nature. This will help to reduce the potential
misuse of such technologies.
Last, but not least, education on responsible research
should be an intrinsic part of AI research. Investigators
should be fully aware of the potential impact of the sponsored
research, and provide corresponding training to the graduate
students and post-docs working on the project. Again, NSF could
enforce such ethics training and best practices through a
mandatory requirement for sponsored research projects. The
creation of new cross-functional NSF programs for emerging
technologies, the introduction of controls on the release of
NSF-funded AI research with potential dual use, and required
ethics training for NSF-funded AI research will go far in
defending against the emerging threat posed by deep fakes.
Thank you for having this hearing today, and giving me the
opportunity to testify. I'm happy to answer any questions you
may have. Thank you.
[The prepared statement of Dr. Lyu follows:]
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
Chairwoman Sherrill. Thank you very much. Dr. Farid?
TESTIMONY OF DR. HANY FARID,
PROFESSOR, ELECTRICAL ENGINEERING AND
COMPUTER SCIENCE AND THE SCHOOL OF INFORMATION,
UNIVERSITY OF CALIFORNIA, BERKELEY
Dr. Farid. Chairwoman Sherrill, Ranking Member Lucas, and
Members of the Committee, thanks for the opportunity to talk
with you today on this important topic. Although disinformation
is not new, what is new in the digital age is the
sophistication with which fake content can be created, the
democratization of access to sophisticated tools for
manipulating content, and access to the Internet and social
media, allowing for the delivery of disinformation with an
unprecedented speed and reach.
The latest incarnation in creating fake audio, image, and
video, so-called deep fakes, is being fueled by rapid advances
in machine learning, and access to large amounts of data.
Although there are several variations, the core machinery
behind this technology is based on a combination of traditional
techniques in computer vision and computer graphics, and more
modern techniques from machine learning, namely deep neural
networks. These technologies can, for example, from just
hundreds of images of the Chairwoman, splice her likeness into
a video sequence of someone else. Similar technologies can also
be used to alter a video of the Chairwoman to make her mouth
consistent with a new audio recording of her saying something
that she never said. And, when paired with highly realistic
voice synthesis technologies that can synthesize speech in a
particular person's voice, these deep fakes can make, for
example, a CEO announce that their profits are down, leading to
global stock manipulation; a world leader announcing military
action, leading to global conflict; or a Presidential candidate
confessing complicity to a crime, leading to the disruption of
an election.
The past 2 years have seen a remarkable increase in the
quality and sophistication of these deep fakes. These
technologies are not, however, just relegated to academic
circles or Hollywood studios, but are freely available online,
and have already been incorporated into commercial
applications. The field of digital forensics is focused on
developing technologies for detecting manipulated or
synthesized audio, images, and video, and within this field
there are two broad categories: Proactive and reactive
techniques.
Proactive techniques work by using a specialized camera
software to extract a digital signature from a recorded image
or video. This digital signature can then be used in the future
to determine if the content was manipulated from the time of
recording. The benefit of this approach is that the technology
is well-understood and developed. It's effective, and it is
able to work at the scale of analyzing billions of uploads a
day. The drawback is that it requires all of us to use
specialized camera software, as opposed to the default camera
app that we are all used to using, and it requires the
collaboration of social media giants to incorporate these
signatures and corresponding labels into their systems.
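    As an illustration of the proactive, sign-at-capture idea Dr. Farid
describes, here is a minimal Python sketch using only the standard
library. It is an assumption for illustration, not any vendor's actual
scheme; a real deployment would use an asymmetric signature held in
camera hardware so that verifiers never need the secret key.

    # Minimal sketch: sign content at capture time, verify it later.
    import hashlib, hmac, os

    DEVICE_KEY = os.urandom(32)  # in practice, protected inside the camera

    def sign_at_capture(media_bytes: bytes) -> bytes:
        # Keyed digest of the recorded pixels/audio at capture time.
        digest = hashlib.sha256(media_bytes).digest()
        return hmac.new(DEVICE_KEY, digest, hashlib.sha256).digest()

    def verify_later(media_bytes: bytes, signature: bytes) -> bool:
        # Any alteration of the content after capture breaks the match.
        digest = hashlib.sha256(media_bytes).digest()
        expected = hmac.new(DEVICE_KEY, digest, hashlib.sha256).digest()
        return hmac.compare_digest(expected, signature)

    original = b"...raw image or video bytes..."
    sig = sign_at_capture(original)
    print(verify_later(original, sig))          # True: untouched since recording
    print(verify_later(original + b"x", sig))   # False: content was modified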
Notice that these proactive techniques tell us what is
real, not what is fake. In contrast, reactive techniques are
focused on telling us what is fake. These techniques work on
the assumption that digital manipulation leaves behind certain
statistical, geometric, or physical traces that, although not
necessarily visually obvious, can be modeled and
algorithmically detected. The benefit of these techniques is
that they don't require any specialized hardware or software.
The drawback is that, even despite advances in the field, there
are no universal forensic techniques that can operate at the
scale and speed needed to analyze billions of uploads a day.
So, where do we go from here? Four points. One, funding
agencies should invest at least as much financial support in
programs in digital forensics as they do in programs that are
fueling advances that are leading to the creation of, for
example, deep fakes. Two, researchers who are developing
technologies that can be weaponized should give more thought to
how they can put proper safeguards in place so that their
technologies are not misused. Three, no matter how quickly
forensic technology advances, it will be useless without the
collaboration of the giants of the technology sector. The major
technology companies, including Facebook, Google, YouTube, and
Twitter, must more aggressively and proactively develop and
deploy technologies to combat disinformation campaigns. And
four, we should not ignore the non-technical component of the
issue of disinformation, us--the users. We need to better
educate the public on how to consume trusted information, and
not spread disinformation.
I'll close with two final points. First, although there are
serious issues of online privacy, moves by some of the
technology giants to transform their platform to an end-to-end
encrypted system will make it even more difficult to slow or
stop the spread of disinformation. We should find a balance
between privacy and security, and not sacrifice one for the
other. And, last, I'd like to re-emphasize that disinformation
is not new, and deep fakes are only the latest incarnation. We
should not lose sight of the fact that more traditional human-
generated disinformation campaigns are still highly effective,
and we will undoubtedly be contending with yet another
technological innovation a few years from now. In responding to
deep fakes, therefore, we should consider the past, the
present, and the future as we try to navigate the complex
interplay of technology, policy, and regulation, and I'm sorry
I'm 15 seconds over.
[The prepared statement of Dr. Farid follows:]
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
Chairwoman Sherrill. Thank you very much. Ms. Francois?
TESTIMONY OF MS. CAMILLE FRANCOIS,
CHIEF INNOVATION OFFICER, GRAPHIKA
Ms. Francois. Chairwoman Sherrill, and Ranking Member
Lucas, Members of the Committee, thank you for having me here
today. We're here to discuss the growing issue of online
imposters and disinformation. As you know, this problem is
nuanced and complex. I've been looking at disinformation
campaigns for many years, and I have seen great diversity in
the types of actors, techniques, and impacts that those
disinformation campaigns can have. I want to highlight that,
while we tend to focus on fake content, the most sophisticated
actors I have seen operate online actually tend to use
authentic content weaponized against their targets. This is
what I want to talk about a little bit more.
It's really hard to give a sense of the growing and global
scale of the issue, but here are a few recent examples. Today a
report by my colleagues over at the Oxford Internet Institute
highlighted that more than 70 countries currently use
computational propaganda techniques to manipulate public
opinion online. Since October 2018, Twitter has disclosed
information around more than 25,000 accounts associated with
information operations in 10 different countries.
Twitter is one thing. On Facebook, over 40 million users
have followed pages that Facebook has taken down for being
involved in what they call coordinated inauthentic behavior.
Those may seem like huge numbers, but, in fact, they represent
a needle in a haystack, and the danger of this particular
needle is its sharpness. Targeting specific communities at the
right time, and with the right tactics, can have a catastrophic
impact on society, or on an election. That impact remains very
difficult to rigorously quantify. For instance, if you take a
fake account, what matters is not just the number of followers
it has, but who those followers are, how they have engaged with
the campaign, and how they have engaged both online and
offline. Similarly, for a piece of content, it's not often the
payload that matters, but really the delivery system, and the
targeted system.
We are finding more and more state and non-state actors
producing disinformation. What keeps me awake at night on this
issue is also the booming market of disinformation for hire.
That means troll farms that one can rent, bot networks that one
can purchase, for instance. These tools are increasingly
attractive to domestic political actors, who also use them to
manipulate American audiences online. I see that you discovered
how easy it was to make a deep fake, and I encourage you to
also discover how easy it is to buy a set of fake accounts
online, or, frankly, to purchase a full blown disinformation
campaign.
The good news here, if there is any, is that, as a society,
and as a professional field, we've come a long way since 2016.
These problems began long before 2016, but it really took the
major Russian interference in the U.S. election to force us
toward a collective reckoning. In 2016 the top platforms, law
enforcement, and democratic institutions sleepwalked through the
Russian assault on American democratic processes. Those who
raised the alarm were, at best, ignored. Today we're in a
better place. We have rules, definitions, and emerging processes
to tackle these campaigns. Coordination between researchers,
platforms, and public agencies has proven successful, for
instance, in protecting the U.S. 2018 midterms from Russian
disinformation efforts. Then, those actors worked hand in hand
to detect, take down, and, to a certain extent, document the
Russian attempts to deceive and manipulate voters.
We still have a long way to go, but the scale of the
problem is staggering. Sophisticated state actors, and, again,
a growing army of hired guns, are manipulating vast networks of
interactions among billions of people on dozens of platforms,
and in hundreds of countries. This manipulation is
discoverable, but only in the way that a submarine is
discoverable under the ocean. What you really need is
sophisticated sensors that must evolve as rapidly as the
methods of evasion. That requires a serious investment in the
development of analytic models, computational tools, and domain
expertise on adversary tradecraft. We need better technology,
but also more people able to develop and adopt rapidly evolving
methods.
Accomplishing this also requires access to data, and that
is currently the hardest conversation on this topic. The task
at hand is to design a system that guarantees user security and
privacy, while ensuring that the corps of scientists,
researchers, and analysts can access the data they need to
unlock the understanding of the threats, and harness innovative
ways to tackle the issue. Today we're very far from having such
a system in place. We critically need not just the data, but
the community of scholars and practitioners to make sense of
it. That emerging field of people dedicated to ensuring the
integrity of online conversation needs support, funding, and a
shared infrastructure.
[The prepared statement of Ms. Francois follows:]
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
Chairwoman Sherrill. Thank you, Ms. Francois. We'll have to
get to the rest of it as we go through the questions, but thank
you very much. At this point we'll begin our first round of
questions, and I'm going to recognize myself for 5 minutes.
I'd just like to start with Dr. Farid and Dr. Lyu, because
we read a lot about the potential for deep fakes to be used on
political candidates, and we watched Dr. Lyu's very compelling
example here in this room, so thank you for that brilliant
demonstration. I hope my fellow Members of Congress who aren't
in the room today will actually get a chance to see for
themselves, and hear just how limitless the potential impacts
of deep fakes can be.
Let's talk about some hard truths. On a scale of 1 to 10,
what do you think are the chances of a convincing video deep
fake of a political candidate, someone running for Congress, or
President, or Governor, emerging during the 2020 election
cycle, and why do you think that?
Dr. Farid. I'm going to say five, to minimize my chances
of being wrong. I am--and for another reason too, that I think
we shouldn't--despite the sophistication of deep fakes, we
shouldn't overlook that traditional disinformation works really
well, and it's easy, right? Teenagers in Macedonia were
responsible for a lot of the disinformation campaigns we saw in
2016. So I think it's coming. I don't know whether it'll be in
2020, or 2022, or 2024, but largely because the cheap stuff
still works, and it's going to work for a while. I think we'll
eventually get out ahead of that, and then this will be the new
front.
But I think it is just a matter of time. We've already seen
nefarious uses of deep fakes for cases of fraud, and I think
the bigger threat here is not going to be--the first threat I
predict is not going to be an actual deep fake, but the
plausible deniability argument, that a real video will come
out, and somebody will be able to say, that's a deep fake. And
that, in some ways, is the larger threat that I see coming down
the road, is once anything can be faked, nothing is real
anymore. And I think that's probably more likely to happen
before the first real deep fake comes out.
Chairwoman Sherrill. That's interesting. Dr. Lyu?
Dr. Lyu. Yes. Thank you for the question. As, actually, I
mentioned in the opening remarks, the technical capability of
making high-quality deep fakes is already at the disposal of
whoever wants to make it. As I mentioned, for the deep fake
videos we made, we used specially made software, but anybody
can potentially develop similar software based on the
open-source software on GitHub, and then they can just buy
a computer for about, you know, a couple thousand dollars, and
run it for a couple of hours. Everything is automatic. So
this is really the reality: whoever wants to make these kinds
of videos has that capacity.
However, the question whether we will see such a video in a
coming election really--as Professor Farid mentioned, depends
on a lot of other factors, especially, you know, deep fake is
not the only approach for disinformation. So it is kind of
difficult to come up with a precise number there, but the
possibility is certainly substantial. Thank you.
Chairwoman Sherrill. Thank you. And then, Ms. Francois, you
have a lot of experience observing how trolls and bots behave
when they identify a hoax they might want to spread. If a
convincing deep fake of a politico emerges next year, what do
you expect the bot and troll participation to look like in
proliferating the video? In other words, will we see this sort
of erupt all at once, or does it percolate in dark areas of the
Internet for a short period of time before it emerges? How does
that work?
Ms. Francois. All of the above are possible. I will say
that, if we are facing a sophisticated actor able to do a
convincing deep fake, they will be able to do a convincing
false amplification campaign, too.
Chairwoman Sherrill. Thank you very much. And then, Dr.
Farid, you said in your testimony that researchers working on
technologies to detect disinformation should give more thought
to proper safeguards so their work cannot be misused or
weaponized. What kind of safeguards do you believe could be
adopted voluntarily by the research community to protect
against the spread of disinformation?
Dr. Farid. Good. So I think there's two things that can be
done. So, first, you have to understand in computer science we
have an open source culture, which means we publish work, and
we put it out there. That's been the culture, and it's
wonderful. It's a wonderful culture. But when that technology
can be weaponized, maybe we should think about not putting the
data and the code in a GitHub repository where anybody
can download it, as Professor Lyu was saying. So that's number
one, is just think about how you disseminate. We can still
publish and not put the details of it out so that anybody can
grab it, No. 1.
No. 2 is, there are mechanisms by which we can incorporate,
into synthetic media, watermarks that will make it easier for
us to identify that. That can become a standard. We can say
academic publishers who are going to post code should
incorporate into the result of their technology a distinct
watermark. That is not bulletproof, it's not that it can't be
attacked, but it's at least a first line of defense. So those
are the two obvious things that I can see.
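    For illustration, a minimal Python/NumPy sketch of the kind of
watermarking standard Dr. Farid mentions is shown below. The bit
pattern, the least-significant-bit embedding, and the detection
threshold are all assumptions chosen for simplicity; as he notes, such
a scheme is a first line of defense, not bulletproof, and real robust
watermarks must survive compression and editing.

    # Minimal sketch: embed a fixed bit pattern in synthetic-image LSBs.
    import numpy as np

    WATERMARK = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)  # hypothetical lab ID

    def embed(image: np.ndarray) -> np.ndarray:
        flat = image.flatten()
        bits = np.resize(WATERMARK, flat.size)        # repeat pattern across pixels
        return ((flat & 0xFE) | bits).reshape(image.shape)

    def detect(image: np.ndarray) -> bool:
        flat = image.flatten()
        bits = np.resize(WATERMARK, flat.size)
        return np.mean((flat & 1) == bits) > 0.99     # nearly all LSBs match

    synthetic = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
    print(detect(embed(synthetic)))   # True: watermark present
    print(detect(synthetic))          # almost certainly False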
Chairwoman Sherrill. That was perfect timing. Thank you
very much, I appreciate it. I would now like to recognize Mr.
Lucas for 5 minutes.
Mr. Lucas. Thank you, Madam Chair. Dr. Farid, following up
on what the Chair was discussing, in your written statement you
say that no matter how quickly forensic technology for
detecting deep fakes develops, it'll be useless without the
cooperation of the technology giants like Google and Facebook.
How do we bring those people to the table to begin this
collaboration?
Dr. Farid. Yes. So the bad news is they have been slow to
respond, for decades, really. It's not just disinformation.
This is the latest, from child sexual abuse, to terrorism, to
conspiracy theories, to illegal drugs, illegal weapons. The
technology sector has been very slow to respond. That's the bad
news. The good news is I think a combination of pressure from
here on Capitol Hill, from Brussels, from the UK, and from the
public, and from advertisers, there is now an acknowledgement
that we have a problem, step number one.
Step number two is, what are we going to do about it? And I
still think we are very slow here, and what you should
understand is we are fighting against business interests,
right? The business model of Facebook, Google, YouTube, Twitter
is data, it's content. Taking down content is bad for business.
And so we have to find mechanisms, whether through regulatory
pressure, advertising pressure, or public pressure, to bring them to
the table. I will say the good news is, in the last 6 months,
at least the language coming out of the technology sector is
encouraging. I don't know that there's a lot of action yet.
So I will give you an example. We all saw a few months ago
an altered video of Speaker Pelosi. This was not a confusing
video, we all knew it was fake, and yet Facebook gleefully left
it on their platform. In fact, they defended the decision to leave
it on their platform, saying, we are not the arbiters of truth,
OK? So we have two problems now. We have a policy problem, and
we have a technology problem. I can help with the technology
problem. I don't know what I can do about the policy problem,
when you say, we are not the arbiters of truth. So I think we
have to have a serious look at how to put more pressure on the
technology sector, whether that's regulatory, or legislative,
or advertising, or public pressure, and they have to start
getting serious as to how their platforms are being weaponized
to great effect in disrupting elections, and inciting violence,
and sowing civil unrest. I don't think they've quite come to
grips with that reality.
Mr. Lucas. Well, when that moment comes, and inevitably it
will, in your opinion, what will that collaboration look like?
There's a government element, there's an academic element,
there's a public-private partnership element.
Dr. Farid. Yes.
Mr. Lucas. Can you just----
Dr. Farid. Sure.
Mr. Lucas [continuing]. Daydream for a moment here with me?
Dr. Farid. So I think the good news is the Facebooks and
the Googles of the world have started to reach out to
academics, myself included, Professor Lyu included. We now
receive research funding to help them develop technology.
That's good. I think the role of the government is to coax them
along with regulatory pressure. I think what we've noticed over
the last 20 years is that self-regulation is not working. I'd like
it to work, but it doesn't work in this particular space.
So I think the role of the government can be through
oversight, it can be regulatory, it can be through a cyber
ethics panel that is convened to talk about the serious issues
of how technology is being weaponized in society. But very much
I think the academic/industry model has to work, because most
of the research that we are talking about is happening at the
academic side of things, and obviously the industry has
different incentives than we do in the academy, so I think
there is room for everybody.
I'll also mention this is not bounded by U.S. borders. This
is very much an international problem, so we should be looking
across the pond to our friends in the UK, in the EU, and New
Zealand, and Australia, and Canada, and bringing everybody on
board because this is a problem for not just us, but for the
whole world.
Mr. Lucas. One last question. In your written testimony you
suggest there's a non-technological component to solving the
problem related to deep fakes and disinformation. Specifically,
you wrote that we need to educate the public on how to consume
trusted information, and how to be better digital citizens.
What should this public education initiative----
Dr. Farid. Yes.
Mr. Lucas [continuing]. Look like?
Dr. Farid. I'm always reluctant to say this, because I know
how taxed our schools are in this country, but at some point
this is an educational issue, starting from grade school on the
way up. And, as an educator, I think this is our role. We have
to have digital citizenry classes. Some of the European
countries have done this. France is starting to do this, the UK
is starting to do it. Public service announcements (PSAs)
explaining to people how information can be trusted, what
disinformation is, but we've got to start taking more seriously
how we educate the next generation, and the current generation.
And whether that's through the schools, through PSAs, through
industry sponsored PSAs, you know, I think all of those are
going to be needed.
Mr. Lucas. And you would agree that our technology giant
friends have a role in that education process?
Dr. Farid. They absolutely have a role. They made this
mess, they need to help fix it.
Mr. Lucas. Very concise. Thank you, Doctor. I yield back,
Madam Chair.
Chairwoman Sherrill. Thank you, Mr. Lucas. And now, Ms.
Wexton, I recognize you for 5 minutes.
Ms. Wexton. Thank you, Madam Chair, and thank you to the
panelists for appearing today. I want to speak a little bit
about the explosive growth that the major social platforms have
experienced over the past few years, because I'm worried that
these companies are more focused on growth, and getting more
users, than they are about essential oversight and user support
functions. And, in fact, as has been noted, they disclaim
responsibility for any information that goes out onto the web
by the users. And, in fact, it seems to me that they have a
disincentive to purge suspicious, or fake, or bot accounts.
You know, I have here an article from July of last year,
where Twitter's stock price went down by about eight and a half
percent after they purged, over the course of two months, 70
million suspicious accounts. Now, don't feel too bad for
Twitter, because their stock price went up 75 percent over that
six month period, but, you know, by being socially responsible,
or by being responsible, it hurt their bottom line.
Now, the platforms are incredibly powerful. We have already
seen the power that they have here in the Capitol, not just
because of the lobbyists and everything, but because we all use
them. We all have those platforms on our phones, and on our
various devices. And, Dr. Farid, you spoke a little bit about
how the basic features of the technology and the business model
at social media companies kind of help exacerbate the
proliferation of disinformation. Can you explain, from a
business perspective, what benefit a bot account or a fake
account might represent for a social media company?
Dr. Farid. Sure. So, first of all, I think you're
absolutely right that growth has been priority No. 1. And
the metrics of Silicon Valley are number of users and number of
minutes online, because that's what eventually leads to
advertising dollars. What we have to understand is
that Silicon Valley, for better or worse, today is driven by ad
revenue, and ad revenue is optimized by having more engagement,
OK? So fake account, real account, don't care. Fake like, real
like, fake tweet, doesn't matter, right, because at the end of
the day, you get to report big numbers to the advertisers who
are going to pay more money. Whether 50 percent of those
accounts are fake or not, who's to know?
So that's the underlying poison, if you will, of Silicon
Valley, I think, and is the reason why the system is entirely
frictionless, by design. There's no friction to creating an
account on Twitter, or on Facebook, or on YouTube, because they
want that to be easy. They want bots to be able to create these
things because that is what elevates the numbers. And I think
this is sort of our core problem that we have here.
Ms. Wexton. So, related to that, why would social media
companies allow, or even encourage, their recommendation
algorithms to----
Dr. Farid. Good.
Ms. Wexton [continuing]. Put people, you know, to direct
users to----
Dr. Farid. Good.
Ms. Wexton [continuing]. Suggested videos, or things like
that, that are sensational, or even false? Why would they do
that?
Dr. Farid. The metric on YouTube is engagement, how long do
you stay on the platform? And so what the algorithms learn is
that, if I show you a video that is conspiratorial, or
outrageous, you are more likely to click on it and watch it. If
you are more likely to click or watch, you're going to stay on
the platform longer, right? So the algorithms are not trying to
radicalize you. What they are trying to do is to keep you on
the platform for as long as possible. And it turns out, in the
same way that people will eat candy all day long instead of
broccoli, people will watch crazy videos all day long instead
of PBS. I don't think this is surprising. And so the underlying
algorithms, what they are being optimized for, in part, is
exactly this.
And we have been studying the nature of these conspiracy
videos for over a year now, and I will tell you that, despite
claims to the contrary, there is a rabbit-hole effect, that
once you start watching the slightly crazy conspiratorial
videos, you will get more and more and more of that because you
are more likely to click, you are more likely to view, they're
going to get more data, and they're going to sell more
advertising. That's the underlying business model, is how long
do you stay on my platform? And, in that regard, the quality of
the information is utterly unimportant to the platforms. It is
what keeps you there.
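What Dr. Farid describes reduces to a ranking objective: order
candidate videos by predicted engagement, with no term for accuracy
or quality. The sketch below is only an illustration of that
objective; the predicted_watch_time model and the field names are
hypothetical stand-ins, not any platform's actual system.

    # Toy illustration of an engagement-only recommendation objective.
    # Nothing here models truthfulness or quality; the score is purely
    # expected time on platform.

    def predicted_watch_time(user, video):
        # Hypothetical stand-in for a learned engagement model.
        affinity = user["affinity"].get(video["topic"], 1.0)
        return video["avg_watch_seconds"] * affinity

    def rank_recommendations(user, candidates):
        # Sort purely by expected engagement, highest first.
        return sorted(candidates,
                      key=lambda v: predicted_watch_time(user, v),
                      reverse=True)

    user = {"affinity": {"conspiracy": 2.5, "news": 0.8}}
    candidates = [
        {"title": "Nightly news recap", "topic": "news",
         "avg_watch_seconds": 120},
        {"title": "What THEY don't want you to know",
         "topic": "conspiracy", "avg_watch_seconds": 300},
    ]
    for video in rank_recommendations(user, candidates):
        print(video["title"])  # the conspiratorial video ranks first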
Ms. Wexton. So maybe we should all have more cats and
kittens, and less conspiracy?
Dr. Farid. I'm all for cat videos.
Ms. Wexton. So, switching gears a little bit, yesterday
this Committee--we marked up a bill, it was Congressman
Gonzalez's bill, that would expand research into technologies
to help us better identify deep fake videos. And I had an
amendment which was made in order, and approved by the
Committee, to help increase education to help people identify
deep fake videos, and so I was encouraged to hear you talk
about that. So I would inquire of the panel, do you have any
advice on what the most important elements of a public
education campaign on deep fake videos should be?
Dr. Farid. Again, you know, I am reluctant to put this on
our public schools. I think they are overtaxed, and overworked,
and underfunded. But at the end of the day, this is sort of
where it belongs. And I think if we can do this, not as an
unfunded mandate, but actually give them the resources to
create courses of digital citizenry, of how you are a better
digital citizen, how you can trust information and not trust
information.
I'll point out too, though, by the way, it's not just the
young people. The senior citizens among us are more likely to
share fake news than the young people, so this is across the
spectrum. So I'm more--this--for me, the education level is
more about the next 20, 30, 40 years than necessarily today. So
I think a combination of PSAs, about returning to trusted
sources, and about educating kids not just, by the way, about
trusted information, but how to be a better digital citizen,
how to interact with each other. The vitriol that we see online
is absolutely horrific, and the things that we accept online we
would never accept in a room like this, and I think we have to
start teaching the next generation that this is not a way that
we interact with each other. We need a more civil discourse.
Chairwoman Sherrill. Thank you, Dr. Farid. And I'd now like
to recognize Mr. Biggs for 5 minutes.
Mr. Biggs. Thank you, Madam Chair, and I appreciate each of
the witnesses for being here. It's a very, very interesting
hearing, and appreciate the Chair for convening this hearing.
So one of the main things I'm worried about is the de facto
gray area between misinformation and disinformation, despite
the seemingly clear definitional difference between these
concepts. While disinformation may be defined in terms of the
malicious intent on the part of the sender, such intent, as
we've seen today, can at times be very difficult to identify.
And then, on top of that, we need to make sure the gatekeepers,
themselves trying to police content, are objective. Objective
enough to identify potential misinformation, and able to do so
as expeditiously as possible.
It seems to me that, even if we have the technological
anti-disinformation tools that we've learned about in this
discussion, and that we anticipate seeing developed over time,
human judgment will always be a key component of any anti-deep
fakes effort, and human judgment can, of course, be fallible.
In short, the difficulties and nuances of the battle pile up
the deeper we delve into this topic. Maybe that's why I find it
so interesting to hear what you all have to say today.
But I want to just get back to something, and I would say I
feel like we've been doing what I would call an endogenous
look, and that is what's the technology here? And you mentioned
it, Dr. Farid, in item four on page four of your
recommendations in your written report, but it really gets to
what I think is a real-world problem I'd like all of you to
respond to, and the last questioner just kind of touched on it
a bit as well. What do you tell a 13- or 14-year-old that
you're trying to warn of potential disinformation,
misinformation? How do you do it as a parent, as a grandparent,
as someone who cares for, loves, an individual? I mean, that
really becomes a part of the equation as much as anything else
on the technological side.
Dr. Lyu. Well, thank you for asking the question. Because of
the nature of my work, I usually show a lot of fake videos to my
12-year-old daughter, and she has actually grown a habit of
distrusting any video I show to her. So I think this may be a
very effective way: showing them that fake videos exist will make
them aware that these are something they should be careful about.
Ms. Francois. I can take the question on, you know, what
goes beyond technology, and I want to talk about one specific
example. I think, when you look at the most sophisticated
campaigns that have leveraged disinformation, and we're talking
about actors who are willingly doing this, there's still a lot
that we don't know. So, back to the Russian example, for
instance, which is largely seen as the best-documented
campaign, right, on which the platforms have shared a lot of
data. I have myself worked with the Senate Select Intelligence
Committee to document what happened. There are still essential
pieces of that campaign that we know nothing about, and on
which there's no data, in the eyes of the public, to really
understand how that technology was leveraged to manipulate
audiences through direct messages, and how the Russians
deliberately targeted specific journalists to feed them sources. We
don't know anything about the scale of how much of that was
going on.
Similarly, what the GRU was doing, alongside the IRA, is
something that there's zero available data on. So I would go
back to those important and large-scale campaigns that we know
have really disrupted society and interrogate, where are our
blind spots? How can we do better? How can we produce this data
so that we actually are able to fully understand those tactics?
And then, of course, to build the tools to detect it, but also
to train people to understand it, and to build defense.
Mr. Biggs. Thank you. Dr. Farid? What are you going to tell
your kid?
Dr. Farid. I, fortunately, don't have kids, so I don't have
to struggle with this problem.
Mr. Biggs. They're a blessing and a curse.
Dr. Farid. I think this is difficult, because the fact is
this generation is growing up on social media----
Mr. Biggs. Yes.
Dr. Farid [continuing]. And they are not reading The
Washington Post, and The New York Times, and MSNBC, and Fox
News. They think about information very differently. And I can
tell you what I tell my students, which is, do not confuse
information with knowledge. Those are very different things.
And I think there is this tendency that it's online, therefore
it must be true. And so my job as an educator is to make you
critically think about what you are reading. And I don't know
how to do that on a sort of day-to-day basis, but I do that
every day with my students, which is critical reasoning. And
with critical reasoning, I think everything comes.
And, if I may, I wanted to touch one--because I think you
made a good point about the--sort of the nuance between mis-
and disinformation, and we should acknowledge that there are
going to be difficult calls. There is going to be content
online that falls into this gray area that it's not clear what
it is, but there is black and white things out there, and we
should start dealing with that right now, and then we'll deal
with that gray area when we need to, but let's not get
confounded with that gray area, and not deal with the real
clear cut harmful content.
Mr. Biggs. Right. So information's not knowledge. I'd like
to tell people in Congress, activity is not progress either,
so, I mean, we----
Dr. Farid. We agree on that.
Chairwoman Sherrill. Thank you, Mr. Biggs. And next I would
like to recognize Mr. Beyer for 5 minutes.
Mr. Beyer. Madam Chair, thank you very much. Dr.--Ms.
Francois--so Dr. Lyu talked about funding more civilian
research through the National Science Foundation, and setting
up an emerging technologies directorate, and you spoke about
this emerging field of interdisciplinary scholars,
practitioners, that needed support, funding, and shared
infrastructure. How best do you see us making that happen? Do
we need congressional legislation? How big a budget does it
have to be? Is it only NSF, or NIST (National Institute of
Standards and Technology), or----
Ms. Francois. That's a great question, thank you. I think
it can be a whole of government effort, and I do think that a
series of institutions have to get involved, because indeed, as
I say, it's very interdisciplinary. I do think that regulation
has to play a role too, not only to address those critical and
complex questions, like the one of data access that I
discussed.
I want to build on a point that Dr. Farid made about the
algorithmic reinforcement, as an example. This is something
that we know is impacting society. People watch one video, and
seem to end up in a filter bubble of conspiratorial videos. But,
unfortunately, we have very little serious research on the
matter. We are making those observations on a purely empirical
basis out of, you know, people who let their computers run. We
can't afford to be in the dark on the impact of technology on
society like this. And in order to do serious scientific
research on those impacts at scale, we need data, and we need
the infrastructure to systematically measure and assess how
this technology is impacting our society.
Mr. Beyer. Thank you very much. Dr. Farid, I was fascinated
you talked about determining what's real, rather than what's
fake, and specifically talking about the control capture
technologies. We've had a number of Science Committee hearings
on blockchain technology, which inevitably lead into quantum
computing (QC) technology. Is blockchain, and ultimately QC,
the right way to deal with this?
Dr. Farid. I think blockchain can play a role here. So the
basic idea, for those who don't know, blockchain--basically all
you have to know is that it's an immutable distributed ledger.
So immutable, when you put information on there, it doesn't
change. Distributed as it's not stored on one central server,
but on millions of computers, so you don't have to rely on
trust of one individual.
So one version of control capture is, at the point of
capture, you extract that unique signature, cryptographically
sign it, and you put that signature on the blockchain for
public viewing of it, and public access to it. It's a very nice
application of blockchain. I don't think it's critical to the
solution. If you have a trusted central server, I think that
would work well, but the reason why people like the blockchain
is that I don't have to trust a Facebook, or an Apple, or a
Microsoft, I can trust the globe. So I do see that as being
part of the control capture environment, and being part of the
solution of a universal standard that says, if you want your
content to be trusted, take it with this control capture, and
then we can trust that going down the line. I think we're
eventually going to get there. I think it's just a matter of
time.
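A minimal sketch of the controlled-capture idea Dr. Farid outlines:
at the moment of recording, hash the image bytes and sign the hash,
so the record can be published (to a ledger or a trusted server)
and any altered copy will fail verification. The HMAC below is only
a stand-in for a real asymmetric, hardware-backed device signature.

    import hashlib
    import hmac

    DEVICE_KEY = b"per-device secret"  # stand-in for a hardware-backed signing key

    def sign_at_capture(image_bytes):
        # Digest of the pixels as captured; any later edit changes it.
        digest = hashlib.sha256(image_bytes).hexdigest()
        # A real system would use an asymmetric signature (e.g. Ed25519)
        # so anyone can verify without holding the device's secret.
        signature = hmac.new(DEVICE_KEY, digest.encode(),
                             hashlib.sha256).hexdigest()
        return {"digest": digest, "signature": signature}

    def verify(image_bytes, record):
        digest = hashlib.sha256(image_bytes).hexdigest()
        expected = hmac.new(DEVICE_KEY, digest.encode(),
                            hashlib.sha256).hexdigest()
        return (digest == record["digest"]
                and hmac.compare_digest(expected, record["signature"]))

    original = b"raw sensor bytes of the photo"
    record = sign_at_capture(original)  # published to a ledger or trusted server
    print(verify(original, record))            # True
    print(verify(original + b"edit", record))  # False: content was altered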
Mr. Beyer. And, Dr. Lyu, how would you contrast
watermarking technology with the blockchain, with the control
capture? And is one better than the other, or do you need both,
or----
Dr. Lyu. I think these technologies are somewhat
complementary. So a watermark is content you actually embed into
the image, and blockchains are ways to authenticate whether the
watermark is consistent with the original content we embedded
into the signal. So they can work together. You can imagine the
watermark also being part of the blockchain, uploaded to the
remote distributed server. So they can work hand in hand in this
case. But watermarks can also work independently of a capture
control mechanism for the authenticity of digital visual media.
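To make Dr. Lyu's distinction concrete, here is a toy
least-significant-bit watermark; production watermarks are designed
to survive compression, cropping, and re-encoding, which this
sketch is not. The point is only that the mark travels inside the
pixels, while a ledger entry can separately attest that the marked
image (or its hash) is the original.

    # Toy least-significant-bit watermark over a list of 8-bit pixels.
    # Illustrative only; not a robust watermarking scheme.

    def embed_watermark(pixels, bits):
        # Overwrite the lowest bit of each pixel with one watermark bit.
        return [(p & ~1) | b for p, b in zip(pixels, bits)]

    def extract_watermark(pixels, length):
        # Read the lowest bit back out of the first `length` pixels.
        return [p & 1 for p in pixels[:length]]

    pixels = [154, 201, 87, 33, 240, 96, 17, 128]
    mark = [1, 0, 1, 1, 0, 0, 1, 0]

    marked = embed_watermark(pixels, mark)
    print(extract_watermark(marked, len(mark)) == mark)  # True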
Mr. Beyer. Thank you. And Ms.----
Dr. Lyu. Thank you.
Mr. Beyer. Ms. Francois, again, you talked about how the
big data players, the Facebooks and Twitters, obviously are a
huge part of the potential problem--source material, and have
to be part of the solution, and you mentioned regulation as one
of the pieces of the NSF/NIST piece. Not that you can do it in
45 seconds, but anything that you guys can prepare to help our
Energy and Commerce Committee, the committees in both houses,
looking at how we manage the social media giants would be very,
very appreciated. Because understanding how they've gone from
basically unregulated unicorn game changers in our society, to
how they can properly play within the rules, is going to be a
really, really big challenge for us.
Ms. Francois. I think it's going to be a lot of moving
pieces. It's a complex problem, as I said, and I do believe
that there's a lot of different bodies of regulation that can
be applied and brought to bear to tackle it. One that is often
left out of the conversation that I just want to highlight here
is consumer protection. Dr. Farid talked about how the
advertisers are getting the fake clicks. This can be a consumer
protection issue. So different bodies of regulation, from
cybersecurity to consumer protection, to address the whole of the
disinformation problem, plus serious pressure to ensure that
the data that the field needs is being shared in a way that
makes it--for people.
Mr. Beyer. Yes. Thank you very much, and I yield back.
Chairwoman Sherrill. Thank you. Next I'd recognize Mr.
Waltz for 5 minutes.
Mr. Waltz. Thank you, Madam Chairwoman. Ms. Francois, going
back to the disinformation campaigns that the Russians, the
Iranians, and others have ongoing, the FBI and Department of
Homeland Security have briefed us that they're confident, at
least at this point in time, that active hacking into our
election infrastructure has diminished, at least for now.
Although I, and other colleagues, have worked to ensure
that critical infrastructure is secured going forward, and this
Committee has done work on that as well, but I'm interested in
the disinformation piece of it, are you seeing increasing
evidence of our adversaries conducting disinformation against
individuals, whether they're thought leaders, journalists,
politicians? For example, I could foresee hawks on Iran policy,
or Russia, or others being specifically targeted during an
election in order to change that election outcome, and
therefore change our policy and voices. Are you seeing an
increase there? What types of techniques are you seeing, and
where are you seeing it, aside from the United States?
One of the things that I've pushed is for us to share what
we're gathering. For example, the Taiwanese elections, or other
elections, for us to create a collaborative approach with our
allies as well. This is a problem for the West, and I think for
free speech and free thought, as much as it is with, you know,
the 2020 elections. And I'd welcome your thoughts.
And then second, sorry, what would you think the response
would be if we took more of a deterrence measure? For example,
sending the signal that the Iranians, the Russians, and other
bad actors, they have their own processes, and they have their
own concerns, and often these regimes are more concerned with
their own survival than they are with anything else, and at
least demonstrating that we have that capability to interfere
as well. I know that may present a lot of moral and ethical
questions of whether we should have that capability, and
whether we should demonstrate we should use it, but we've
certainly taken that approach with nuclear weapons. And so I'd
welcome your thoughts there.
Ms. Francois. Thank you. I want to start by saying that
part of it--yes, I am seeing an increase. Part of it is an
increase, the other part is simply just a reckoning, as I said.
Iran is a good example. We see a lot of disinformation
campaigns originating from the Iranian state, who's a very
prolific actor in that space.
Now, people often ask me, is Iran following the Russian
model? In reality, the first Iranian campaigns to use social
media to target U.S. audiences date back to 2013, when we were
asleep at the wheel and not looking for them. So, despite
our reckoning with sort of the diversity of actors who have
been engaged with these techniques to target us, there is also
an increase in both their scale and their sophistication. This
is a cat-and-mouse game, and so what we also see is, as we
detect actors and their techniques, they increase the
sophistication. They make it harder for us to do the forensics
that we need in order to catch those campaigns as they unfold.
Thank you for raising the question of deterrence. I do
think that this ultimately is a cyber policy issue too, and
therefore the government has a role to play. In the case of the
U.S. midterms in 2018, we saw U.S. Cyber Command target the
Internet Research Agency in St. Petersburg in an act of this
attempted cyber deterrence. So I do think that there is a
governmental response too, by putting this problem in the
broader context of cyber issues and cyber conflict.
Mr. Waltz. Thank you for raising that. I think it's
important for my colleagues to note that was a policy change
under this Administration that then allowed Cyber Command to
take those kind of, what they call active defensive measures,
and taking election security very seriously. I want to
distinguish, though, between active defense and the potential,
at least, of sending the signal that we have the potential for
offense. And your thoughts there on the United States also
participating in disinformation, or at least a deterrent
capability?
At the end of the day I think we can only do so much in
playing defense here. We can only counter so much of this cat-
and-mouse game. We have to fundamentally change our
adversaries' behavior, and put them at risk, and their regimes
at risk, in my own view. But I'd welcome your thoughts in my
remaining time.
Ms. Francois. Yes, I think the--8 minutes to answer this
complex question on the dynamics of deterrence and resilience
in cyberspace. I will say what immediately comes to mind is, of
course, a question of escalation. How much of these dynamics
contribute to escalation is something that is an unknown in
this space.
So far I think that the approach of being much more
aggressive in both catching these campaigns, deactivating them,
and publicly claiming that we have found them, and this is what
they look like, seems to be a welcome move in this area. I
think by exposing what actors are doing, we are also
contributing to raising the cost for them of engaging in these
techniques.
Chairwoman Sherrill. Well, that was well done----
Mr. Waltz. Thank you.
Chairwoman Sherrill [continuing]. Ms. Francois. Thank you.
Next I recognize Mr. Gonzalez for 5 minutes.
Mr. Gonzalez. Thank you, Madam Chair, and thank you for
being here, to our witnesses, and your work on this topic. A
very important topic, and one that's a little bit new to
Congress, but one that, alongside of Madam Chair, and others on
this Committee, we've been excited to lead on, and I think
we're making progress, unlike some other areas of Congress that
I'm a part of.
So, that being said, Dr. Lyu, I want to start with you, and
I really just want to understand kind of where we are in the
technology, from the standpoint of cost. So, call it 2 decades
ago--I used the Forrest Gump example yesterday. You know, Forrest
Gump, if you've seen the movie, makes it look like he's shaking
hands with Presidents, and all kinds of things, and you can't
tell the difference, except you just know that there's no way
that happened. A Hollywood studio could've produced that, but it
was costly back then, right, however much it cost. Today I think
some numbers came out that
you were citing that as, you know, roughly a couple thousand
dollars. How quickly is the cost going down, to the point that
this will be a weapon, if you will, that, you know, a 16-year-
old sitting behind his computer could pull off?
Dr. Lyu. I think this is basically, you know, what we used to
call Moore's Law, where the computational power just got doubled
every 18 months, and I think Moore's Law has already been broken
with the coming of GPUs. The computational power at our hands is
far greater than we had imagined before, and this trend is
growing. So I will predict that in the coming years it will
become cheaper, easier, and also better to produce these kinds of
videos, and the computer hardware and algorithms will all see
rapid improvements.
Mr. Gonzalez. Yes.
Dr. Lyu. So that's coming. I think it's a coming event.
Thank you.
Mr. Gonzalez. Thanks. And I actually think, you know, we
talk a lot about great power competition in Iran, and China,
and Russia, and I think that makes sense. I'm also maybe
equally concerned about just a random person somewhere in
society who has access to this, and can produce these videos
without any problem, and the damage that that can cause. And I
don't know that we've talked enough about that, frankly.
But switching to Ms. Francois, you talked about how you
found 70 countries use computational propaganda techniques in
your analysis. And obviously a lot of this is spread through
the platforms, and I think you talked really well about just
how you can go down rabbit holes in the engagement metrics, and
things like that. What do you think, and Dr. Farid, I'd welcome
your comments as well, what do the platforms themselves need to
be doing differently? Because it strikes me that they're being
somewhat, or I would say, I would argue grossly irresponsible
with how they manage some of the content on their systems
today.
Ms. Francois. That's a great question. I just want to be
precise that the 70 countries figure comes from the Oxford
Internet Institute report that was published today.
Mr. Gonzalez. OK. Thank you.
Ms. Francois. For me, the platforms' play here is actually
quite simple, and I would say clearer rules, more aggressive
action, more transparency.
Mr. Gonzalez. Yes.
Ms. Francois. Let's start with clearer rules. Some platforms
still don't have a rule that governments are not allowed to
leverage their services in order to manipulate and deceive. And
they will say they have rules that kind of go to this point, you
know, tangentially, but there are still a lot of clearer rules
that need to be established. To the second
point, aggressive enforcement. There's still a lot of these
campaigns that go under the radar, and that go undetected. They
need to put the means on the table to make sure that they
actually are able to catch, and detect, and take down as much
of this activity as possible. My team, this week, published a
large report on a spam campaign that was targeting Hong Kong
protestors from Chinese accounts, and then they----
Mr. Gonzalez. Yes.
Ms. Francois [continuing]. Had to take it down. There's
more that they can do. Finally, transparency. It's very
important that the platforms continue, and increase, their
degree of transparency in saying what they're seeing on their
services, what they're taking down, and share the data back to
the field.
Mr. Gonzalez. Yes. I think that makes a lot of sense. My
fear is, you know, we're going to do the best we can, but, one,
this is intellectually difficult for Congress to figure out, and,
two, it's also politically difficult, which, to me, puts it in
that, like, Never Never Land, where it's going to take a while.
So my hope is that the social media platforms
understand their responsibility, and come to the forefront with
exactly what you said, because if not, I don't know that we're
going to get it right, frankly.
But with my final question, I'll throw just the word mental
health, and the platforms themselves, and misinformation. Any
studies that you're aware of that are showing the impacts on
mental health, in particular teenagers, with respect to what's
going on on the platforms today? Anybody can answer that.
Ms. Francois. Again, I want to say that in this field we
direly lack the data, infrastructure, and access to be able to
do robust at-scale studies. So there is a variety of wonderful
studies that are doing their best with small and more
qualitative approaches, which are really, really important, but
we're still direly lacking an important piece of doing rigorous
research in this area.
Mr. Gonzalez. Thank you. And I'll follow up with additional
questions on how we can get that data, and be smarter about
that in Congress. So, thank you, I yield back.
Mr. Beyer [presiding]. Thank you very much, sir. Dr. Farid,
I understand you developed a seminal tool for Microsoft called
PhotoDNA that detects and weeds out child pornography as it's
posted online. Can you talk about how this tool works? Could
this be used to address harmful memes and doctored images? And
how do the social media companies respond to this?
Dr. Farid. So PhotoDNA was a technology that I developed in
2008-2009 in collaboration with Microsoft and the National
Center for Missing and Exploited Children (NCMEC). Its goal was
to find and remove the most horrific child sexual abuse
material (CSAM) online. The basic idea is that the technology
reaches into an image, extracts a robust digital signature that
will allow us to identify that same piece of material when it
is reuploaded. NCMEC is currently home to 80 million known
pieces of child sexual abuse material, and so we can stop the
proliferation and redistribution of that content.
Last year alone, in one year, the National Center for
Missing and Exploited Children's CyberTipline received 18
million reports of CSAM being distributed online. That's 2,000
an hour. 97, 98 percent of that material was found with
PhotoDNA. It has been used for over a decade, and has been
highly effective. Two more things. That same core technology
can be used, for example, to find the Christchurch video, the
Speaker Pelosi video, the memes that are known to be viral and
dangerous. Once content is detected, the signature can be
extracted, and we can stop the redistribution.
And to your question of how the technology companies
respond, I think the answer is not well. They were asked in
2003 to do something about the global distribution of child
sexual abuse material, and for 5 years they stalled, they did
absolutely nothing. We're not talking about complicated issues
here, gray areas. We are talking about 4-year-olds, 6-year-
olds, 8-year-olds being violently raped, and the images and the
videos of them, through these horrific acts, being distributed
online. And the moral compass of Silicon Valley for the last
decade has been so fundamentally broken they couldn't wrap
their heads around their responsibility to do something about
that.
That doesn't bode well, by the way, for going forward, so I
think that history is really important, and we have to remember
that they come begrudgingly to these issues, and so we have to
coax them along the way.
Mr. Beyer. Thank you very much. So there--these images have
digital signatures, even before we talk about the capture
control technology----
Dr. Farid. Yes.
Mr. Beyer [continuing]. Or the watermark----
Dr. Farid. That's exactly right. These don't have to be
captured with specific hardware. So what we do is, after the
point of recording, we reach in and we find a distinct
signature that will allow us to identify, with extremely high
reliability, that same piece of content. And that can be child
abuse material, it can be a bomb-making video, it can be a
conspiracy video, it can be copyright infringement material. It
can be anything.
Mr. Beyer. But it has to show up first----
Dr. Farid. That's right.
Mr. Beyer [continuing]. In the public space----
Dr. Farid. Yes.
Mr. Beyer [continuing]. At least once, and we have to know
that it's there in order to capture this----
Dr. Farid. That's the drawback. But the good news is that
technology works at scale. It works at the scale of a billion
uploads to Facebook a day, and 500 hours of YouTube videos a
minute. And that's a really hard engineering problem to tackle,
but this technology actually works, unlike many of the other
algorithms that have extremely high error rates, and would
simply have too many mistakes.
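PhotoDNA itself is proprietary, so the sketch below is only a
generic illustration of the recipe Dr. Farid describes: derive a
compact signature that is stable under small changes (here, a toy
average hash over an 8x8 grid of brightness values), then compare
each upload against a database of known signatures by Hamming
distance. The grid values and threshold are arbitrary examples.

    # Toy perceptual hash and database match. NOT PhotoDNA's algorithm;
    # just the general idea: robust signature plus nearest-match lookup.

    def average_hash(grid):
        # grid: 8x8 brightness values (0-255), assumed already
        # downscaled and converted to grayscale by earlier processing.
        flat = [v for row in grid for v in row]
        mean = sum(flat) / len(flat)
        # One bit per cell: brighter than the image average or not.
        return [1 if v > mean else 0 for v in flat]

    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    def matches_known(upload_hash, known_hashes, threshold=5):
        # Flag the upload if it is within a few bits of any known item.
        return any(hamming(upload_hash, k) <= threshold
                   for k in known_hashes)

    known_grid = [[10 * (r + c) for c in range(8)] for r in range(8)]
    known_hashes = [average_hash(known_grid)]

    upload = [row[:] for row in known_grid]
    upload[0][0] += 3  # a small re-encoding change should not break the match
    print(matches_known(average_hash(upload), known_hashes))  # True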
Mr. Beyer. Thank you very much. Dr. Lyu, you talked about
using AI to find AI, and that more deep neural networks are
used to detect the fakes, but there's the sense that the good
guys are always trying to catch up with the bad guys, you know,
the cat-and-mouse. Is there any way around the cat-and-mouse
nature of the problem? Which, by the way, we just saw before,
it's got to be out there before you can tag it and chase it
down.
Dr. Lyu. That's a very good question. Actually, I think on
this point, I'm more pessimistic because I don't think there's
a way we can escape that, because that's the very nature of
this kind of problem. Unlike other research areas, where the
problem's fixed, we're basically dealing with a moving target.
Whenever we have new detection or deterrent algorithms, the
adversaries will always try to improve their algorithm to beat
us. So I think, in the long run, this will be the situation
that will keep going.
But that also emphasizes Dr. Farid's point that we need more
investment on the side of detection and protection, because, you
know, a lot more resources are being put into making deep fakes,
for all kinds of reasons, but the investment in detection has not
been catching up with that level. So that's part of my testimony,
encouraging the
Federal Government to put more investment into this important
area. Thank you.
Mr. Beyer. Ms. Francois?
Ms. Francois. Yes, if I may add a very simple metaphor
here, I think we also have a leveling of the playing field
issue. We're currently in a situation where there are a lot of
cats, and very few mice. We need to bring the resources to
the table that correspond to the actual scale and gravity of
the problem.
Mr. Beyer. OK. Great. Thank you very much. I now recognize
the gentleman from Ohio, Mr. Gonzalez.
Mr. Gonzalez. Thanks. Didn't know I was going to get a few
extra seconds. So I just want to drill down on that data-
sharing component. So you mentioned that we just need a better
data-sharing infrastructure. Can you just take me as deep as
you can on that? What do we need specifically? Just help me
understand that.
Ms. Francois. Yes. There are many different aspects to what
we need, and I think that the--both the infrastructure, people
involved, and type of data depend on the type of usage. So, for
instance, facilitating academic access to at-scale data on the
effects of technology on society is ultimately a different
issue than ensuring that cybersecurity professionals have
access to the types of forensics that correspond to a high-
scale manipulation campaign that enables them to build better
detection tools. And so I think the first step in tackling this
problem is recognizing the different aspects of it.
Mr. Gonzalez. Got it.
Ms. Francois. Of course, the key component here is security
and privacy, which here go hand in hand. What you don't want is
to enable scenarios like Cambridge Analytica, where data abuses
lead to more manipulation. Similarly, when we see
disinformation campaigns, we often see a lot of real citizens
who are caught into these nets, and they deserve the protection
of their privacy.
If you go down sort of the first rabbit hole of ensuring
that cybersecurity professionals have access to the type of
data and associated forensics that they need in order to do
this type of detection at scale, and to build the forensics
tools we need at scale, there's still, as I said, a lot we can
do. The platforms right now are sharing some of the data that
they have on these types of campaigns, but in a completely
haphazard way. So they're free to decide when they want to
share, what they want to share, and in which format. Often the
formats they're sharing them in are very inaccessible, so my
team has worked to create a database that makes that accessible
to researchers. That's one step we can take.
And, again, and I'll wrap on that, because this can be a
deep rabbit hole----
Mr. Gonzalez. Yes.
Ms. Francois [continuing]. You pushed me down this way.
Again, if we take the Russia example, for instance when we
scope a collection around something that we consider to be of
national security importance, we need to make sure we have the
means to ensure that the picture we're looking at is
comprehensive.
Mr. Gonzalez. Right.
Ms. Francois. Our own false sense of security, in looking at
the data and thinking that it represents the comprehensive
picture of what happened and what was directed at us, is a problem
in our preparations for election security.
Mr. Gonzalez. Thank you. Dr. Farid, any additional thoughts
on that?
Dr. Farid. Yes. I just wanted to mention, and I think Ms.
Francois mentioned this, there is this tension between privacy
and security, and you're seeing this particularly with
Cambridge Analytica. And I will mention too that this is not,
again, just a U.S. issue, this is a global issue. And with
things like GDPR (General Data Protection Regulation), it has
made data sharing much more complex for the technology
sector.
Mr. Gonzalez. Yes.
Dr. Farid. So, for example, we've been trying to work with
the sector to build tools to find child predators online, and
the thing we keep running up against is we can't share this
stuff because of GDPR, we can't share it because of privacy. I
think that's a little bit of a false choice, but there is a
sensitivity there that we should be aware of.
Mr. Gonzalez. Yes. That's fair. I agree with you.
Certainly, I think what you highlight, which I agree with, is
there are gray areas----
Dr. Farid. Yes.
Mr. Gonzalez [continuing]. OK, but there also, like, big
bright lines. Child pornography, let's get that off our
platforms.
Dr. Farid. Yes, I agree. And it feels to me like, if you share
child pornography, you have lost the right to privacy. I don't
think you have a right to privacy anymore once you've done
that, I should have access to your account. So I think there's
a little bit of a false narrative coming out here, but I still
want to recognize that there are some sensitivities,
particularly with the international standards. The Germans have
very specific rules----
Mr. Gonzalez. Yes.
Dr. Farid [continuing]. The Brits, the EU, et cetera.
Mr. Gonzalez. So the last question, and this is maybe a bit
of an oddball, so with the HN site that was ultimately brought
down, I believe Cloudflare was their host, is that----
Dr. Farid. Yes.
Mr. Gonzalez. So we talk a lot about the platforms
themselves, right, but we don't always talk about the
underlying infrastructure----
Dr. Farid. Yes.
Mr. Gonzalez [continuing]. And maybe what responsibilities
they have.
Dr. Farid. Yes.
Mr. Gonzalez. Any thoughts on that? Should we be looking
there as well?
Dr. Farid. You should. And it is complicated, because----
Mr. Gonzalez. Yes.
Dr. Farid [continuing]. When you go to a Cloudflare--as the
CEO came out and said, I woke up 1 day, and I thought, I don't
like these guys, and I'm going to kick them off my platform.
That is dangerous.
Mr. Gonzalez. That's very----
Dr. Farid. Yes. But Ms. Francois said it very well. Clear
rules, enforce the rules, transparency. We have to have due
process. So define the rules, enforce them consistently, and
tell me what you're doing. I can fix this problem for the CEO
of Cloudflare. Just tell me what the rules are. So--but I don't
think they get a bye just because they're the underlying
hardware of the Internet. I think they should be held to
exactly the same standards, and they should be held to exactly
the same standards of defining, enforcing, and transparency.
And, by the way, I'll also add that cloud services are
going to be extremely difficult. So, for example, we've made
progress with YouTube on eliminating terror content, but now
they're just moving to Google Drive, and Google is saying,
well, Google Drive is a cloud service, so it's outside of this
platform. So I do think we have to start looking at those core
infrastructures.
Mr. Gonzalez. OK. I appreciate your perspective. Frankly, I
don't know what I net out on it, I just know it's something
that I think we should be looking at----
Dr. Farid. I agree.
Mr. Gonzalez [continuing]. And weighing, so thank you.
Mr. Beyer. Thank you. Dr. Lyu, you know, Ms. Francois just
talked about a level playing field, you know, that, the bad
guys have a lot more tools and resources than the good guys.
Dr. Lyu. Right.
Mr. Beyer. We talked a lot about the perils of deep fakes,
but are there any constructive applications?
Dr. Lyu. Actually----
Mr. Beyer [continuing]. Where we want to use deep fakes in
a good way?
Dr. Lyu. Yes, indeed. Actually, the technology behind deep
fakes, as I mentioned in my opening remarks, is of dual use. So
there's a beneficial side of using this technology. For instance,
the movie industry can use it to reduce their costs. There are
also ways to actually make sure a message can be broadcast to
multilingual groups without, you know, regenerating the media in
different languages. It is also possible to use this technology
to protect privacy. For instance, for people like whistleblowers,
or, you know, victims of violent crime, if they don't want to
expose their identity, it's possible to use this technology,
replacing the face but leaving the facial expression intact.
The negative effects of deep fakes get a lot of the spotlight,
but there's also this dual use that we should be aware of. Thank
you very much.
Mr. Beyer. Thank you. Ms. Francois, are there any good
bots?
Ms. Francois. Yes. They're really fun. One of them
systematically tweets out every edit to Wikipedia that is made
from the Congress Internet infrastructure. In general what I'm
trying to say is there are good bots. Some of them are fun and
creative, and I think they do serve the public interest. I do
not think that there are good reasons to use an army of bots in
order to do coordinated amplification of content. I think when
you are trying to manipulate behavior to make it look like a
larger number of people are in support of your content than
actually is the case, I do not see any particularly good use of
that.
Mr. Beyer. I want to send you one of my daughter's bots.
She has a perfectly normal Twitter account, and then she has
the Twitter bot account, where she leverages off of her
linguistics background, and I cannot make heads nor tails of
what it does. But perhaps----
Ms. Francois [continuing]. Can look at it.
Mr. Beyer [continuing]. You can. Yes, it's----
Ms. Francois. OK.
Mr. Beyer. She says it's OK. Dr. Farid, you talked--it
would be a mistake for the tech giants to transform their
systems to end-to-end encrypted systems, that it would make the
problem only worse. Can you walk us through that?
Dr. Farid. Sure, and I'm glad you asked the question. So
let's talk about what end-to-end encryption is. So the idea is
I type a message on my phone, it gets encrypted, and sent over
the wire. Even if it's a Facebook service, Facebook cannot read
the message. Under a lawful warrant, you cannot read the
message. Nobody can read the message until the receiver
receives it, and then they decrypt. So that's called an end-to-
end encryption. Everything in the middle is completely
invisible. WhatsApp, for example, owned by Facebook, is end-to-
end encrypted, and it is why, by the way, WhatsApp has been
implicated in horrific violence in Sri Lanka, in the
Philippines, in Myanmar, in India. It has been linked with
election tampering in Brazil, in India, and other parts of the
world, because nobody knows what's going on on the platform.
So last year, you heard me say, 18 million reports to the
National Center for Missing and Exploited Children of child
sexual abuse material, more than half
of those came from Facebook Messenger, currently unencrypted.
If they encrypt, guess what happens? Ten million images of
child sexual abuse material, I can no longer see. This is a
false pitting of privacy over security, and it's completely
unnecessary. We can run PhotoDNA, the technology that I
described earlier, on the client so that, when you type the
message and attach an image, we can extract that signature.
That signature is privacy preserving, so even if I hand it to
you, you won't be able to reconstruct the image, and I can send
that hash, that signature, along with the encrypted message,
over wire, pull the hash off, compare it to a database, and
then stop the transmission.
And I will mention, by the way, that when Facebook tells you
this is all about privacy, consider that on WhatsApp, their
service, if somebody sends you a link, and that link is
malware, it's dangerous to you, it will be highlighted in the
message. How are they doing that? They are reading your
message. Why? For security purposes. Can we please agree that
protecting you from malware is at least as important as
protecting 4-year-olds and 6-year-olds and 8-year-olds from
physical sexual abuse?
We have the technology to do this, and the rush to end-to-
end encryption, which, by the way, I think is a head fake.
They're using Cambridge Analytica to give them plausible
deniability on all the other issues that we have been trying to
get them--progress on, from child sexual abuse, to terrorism,
to conspiracies, to disinformation. If they end-to-end encrypt,
we will lose the ability to know what's going on on their
platforms, and you have heard very eloquently from my colleague
that this will be a disaster. You should not let them do this
without putting the right safeguards in place.
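A minimal sketch of the client-side matching Dr. Farid just
described: the client hashes the attachment before encryption,
sends the hash alongside the ciphertext, and the relay checks the
hash against a blocklist without ever decrypting the message. The
XOR cipher is a deliberately trivial stand-in for a real end-to-end
protocol, and a real deployment would send a privacy-preserving
perceptual signature rather than a plain SHA-256.

    import hashlib

    # Hashes of known harmful content, held by the relay.
    BLOCKLIST = {hashlib.sha256(b"known harmful image bytes").hexdigest()}

    def xor_cipher(data, key):
        # Trivial stand-in for real end-to-end encryption.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def client_send(message, attachment, key):
        # The signature is computed on the client, before encryption.
        attachment_hash = hashlib.sha256(attachment).hexdigest()
        ciphertext = xor_cipher(message + b"||" + attachment, key)
        return {"ciphertext": ciphertext, "attachment_hash": attachment_hash}

    def relay(envelope):
        # The relay never sees the plaintext; it only checks the hash.
        return "blocked" if envelope["attachment_hash"] in BLOCKLIST else "delivered"

    key = b"shared secret between sender and receiver"
    print(relay(client_send(b"hello", b"vacation photo bytes", key)))       # delivered
    print(relay(client_send(b"hello", b"known harmful image bytes", key)))  # blocked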
Mr. Beyer. So you were just making a powerful argument now
for national and international level banning end-to-end
encryption?
Dr. Farid. I wouldn't go that far. We want end-to-end
encryption for banking, for finance. There are places where it
is the right thing to do, but there are other places where we
have to simply think about the balance. So, for example, in my
solution I didn't say don't do end-to-end encryption. I said
put the safeguards in place so that if somebody's transmitting
harmful content, I can know about it.
I have mixed feelings about the end-to-end encryption, but
I think, if you want to do it, and we should think seriously
about that, you can still put the safeguards in place.
Mr. Beyer. And blockchain is not end-to-end encryption?
Dr. Farid. No, it is not.
Mr. Beyer. But it gets close?
Dr. Farid. These are sort of somewhat orthogonal separate
issues, right? What we are talking about is a controlled
platform saying that--everything that comes through us, we will
no longer be able to see. That is super convenient for the
Facebooks of the world, who don't want to be held accountable
for the horrible things happening on their platforms, and I
think that's the core issue here.
Mr. Beyer. Great, thanks. Anything else? All right. I think
Mr. Gonzalez and I are done, and thank you very much. It's a
very, very interesting mission, and don't be discouraged that
there weren't more Members here, because everyone's in their
office watching this and has their own questions. So thank you
very much for being here, and thanks for your testimony. And the
record will remain open for 2 weeks for
additional statements from the Members, and, additionally, we
may have questions of you to answer in writing. So thank you
very much.
Dr. Farid. OK.
Mr. Beyer. You're excused, and the hearing is adjourned.
Dr. Farid. Thank you.
[Whereupon, at 3:26 p.m., the Subcommittee was adjourned.]
Appendix
----------
Additional Material for the Record
[GRAPHICS NOT AVAILABLE IN TIFF FORMAT]
[all]