Google pushes “text fragment links” with new Chrome extension

New feature can deep-link to specific text on a Web page, with highlighting.

Google has been cooking up an extension to the URL standard called “Text Fragments.” The new link style will allow you to link not just to a page but to specific text on a page, which will get scrolled to and highlighted automatically once the page loads. It’s like an anchor link, but with highlighting and creatable by anyone.

The feature has actually been supported in Chrome since version 80, which hit the stable channel in February. Now a new extension from Google makes it easy to create this new link type, which will work for anyone else using Chrome on desktop OSes and Android. Google has proposed the idea to the W3C and hopes other browsers will adopt it, but even if they don’t, the links are backward-compatible.

The syntax for this URL is pretty strange looking. After the URL, the magic is in the string “#:~:text=” and then whatever text you want to match. So a full link would look like this:

https://en.wikipedia.org/wiki/Cat#:~:text=Most breeds of cat have a noted fondness for sitting in high places

If you copy and paste this into Chrome, the browser will open Wikipedia’s cat page, scroll to the first text that matches “Most breeds of cat have a noted fondness for sitting in high places,” and will highlight it. If the text doesn’t match anything, the page will still load. Backward-compatibility works because browsers currently support the number sign (#) as a URI fragment, which usually gets used for anchor links that are made by the page creator. If you paste this into a browser that doesn’t support it, the page will still load, and everything after the number sign will just be ignored as a bad anchor link. So far, so good.
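To make the backward-compatibility point concrete, here is a minimal Python sketch of how such a link splits apart. It is only an illustration, not how Chrome itself parses the URL; the function name `split_text_fragment` is hypothetical, and the only assumption taken from the article is the `:~:text=` marker that follows the number sign.

```python
from urllib.parse import unquote

# Marker described in the article; everything after it is the text to match.
TEXT_MARKER = ":~:text="


def split_text_fragment(url):
    """Return (page_url, text_to_highlight_or_None) for a link.

    A browser without text fragment support effectively sees only the part
    before '#' plus an anchor name that matches nothing on the page.
    """
    page_url, _, fragment = url.partition("#")
    if TEXT_MARKER not in fragment:
        return page_url, None
    _, _, encoded_text = fragment.partition(TEXT_MARKER)
    # Undo percent-encoding such as %20 so the text reads normally.
    return page_url, unquote(encoded_text)


if __name__ == "__main__":
    link = ("https://en.wikipedia.org/wiki/Cat#:~:text="
            "Most%20breeds%20of%20cat%20have%20a%20noted%20fondness")
    print(split_text_fragment(link))
```

The page address survives intact either way, which is why an unsupported browser can still load the page.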

One problem is that this means you can have spaces in a URL. On a webpage or forum, you can hand-code the link with a href tag (or whatever the non-HTML equivalent is) and everything will work. For instant messengers and social media though, which don’t allow code and use automatic URL parsers, things get a bit more complicated. Every URL parser treats a space as the end of a URL, so you’ll need to use percent-encoding to replace all the spaces with the equivalent “%20.” URL parsers now have a shot at linkifying this correctly, but it looks like a mess:

https://en.wikipedia.org/wiki/Cat#:~:text=Most%20breeds%20of%20cat%20have%20a%20noted%20fondness%20for%20sitting%20in%20high%20places.

Spaces aren’t the only characters that can cause problems. RFC 3986 defines several “reserved” characters that have a special meaning in a URL, so they shouldn’t appear unencoded as ordinary text. Web-page-authoring tools tend to handle these characters automatically, but now that you’re embedding arbitrary sentences in a URL for highlighting, there’s a higher chance you’ll run into one of these reserved characters: ! * ' ( ) ; : @ & = + $ , / ? # [ ]. They all need to be percent-encoded in order for the URL to work, and Google’s extension takes care of that for you.
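For anyone building these links by hand, the percent-encoding step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the extension's actual code: the helper name `make_text_fragment_link` is hypothetical, and it relies on the standard library's `urllib.parse.quote` with `safe=""` so that spaces and all of the reserved characters listed above are encoded. The real extension may handle additional cases.

```python
from urllib.parse import quote


def make_text_fragment_link(page_url, text):
    """Append a text fragment to page_url, percent-encoding the text.

    safe="" tells quote() to encode every reserved character, including
    '/', which it would otherwise leave alone. Spaces become %20.
    """
    encoded = quote(text, safe="")
    return f"{page_url}#:~:text={encoded}"


if __name__ == "__main__":
    print(make_text_fragment_link(
        "https://en.wikipedia.org/wiki/Cat",
        "Most breeds of cat have a noted fondness for sitting in high places",
    ))
```

The result contains no raw spaces, so an automatic URL parser in a chat app has a chance of linkifying the whole thing.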

Google’s new Chrome extension, called “Link to Text Fragment” (it’s also on GitHub), will put a new entry in Chrome’s right-click menu. You just highlight text on a page, right-click it, and hit “Copy link to selected text.” Like magic, a text fragment link will end up on your clipboard. All the text encoding is done automatically, so the link should work with most websites and messengers.

Google seems like it is going to start pushing out support for text fragments across its Web ecosystem, even without the W3C. The links have already started to show up in some Google search results, which allow Chrome users to zip right to the relevant text. It’s probably only a matter of time before link creation moves from an extension to a normal Chrome feature.

The Great Google Revolt

Some of its employees tried to stop their company from doing work they saw as unethical. It blew up in their faces.

Rana Foroohar, “Don’t Be Evil”

00:05
very excited to introduce Rana Rana
00:08
is the global business columnist and
00:11
Financial Times and CNN’s global economic
00:13
analyst previously she’s been the
00:15
assistant managing editor in charge of
00:16
Business and Economics at Time as well
00:18
as the magazine’s economic columnist and
00:20
spent 13 years at Newsweek as an
00:22
economic foreign affairs editor and
00:24
correspondent and in her new book Don’t
00:26
Be Evil which I think is a great title
00:29
Rana chronicles how far big tech has
00:31
fallen from its original vision of free
00:33
information and digital democracy
00:35
drawing on nearly 30 years of experience
00:36
reporting on the technology sector
00:39
Rana traces the evolution of companies
00:40
such as Google Facebook Apple Amazon
00:46
into behemoths that monetize people’s
00:49
data spread misinformation and hate
00:51
speech and threaten citizens’ privacy she
00:53
also shows how we can fight back by
00:55
creating a framework that both fosters
00:57
innovation and protects us from threats
00:59
posed by digital technology her book is
01:02
already garnering widespread praise with
01:04
the Guardian calling it a masterly
01:05
critique of the internet pioneers who
01:07
now dominate our world so without
01:08
further ado please help me in welcoming
01:10
Rana Foroohar to Politics and Prose
01:16
thank you I am so honored to be here
01:19
it’s really a pleasure this is one of my
01:22
favorite bookstores probably my favorite
01:24
bookstore in Washington and so it’s just
01:27
a huge pleasure I thought I would start
01:30
by just talking a little bit about how I
01:32
got the idea to write this book it’s
01:33
actually my second book my first book
01:35
Makers and Takers was a look at the
01:38
financial sector and how it no longer
01:40
serves business so I like to kind of
01:42
take on these big industry-wide maybe
01:45
takedown is the word but kind of
01:49
look at an ecosystem an economic
01:50
ecosystem see how it’s working or not
01:52
working I got the idea for this book
01:56
probably two months into my new job at
02:00
the Financial Times
02:01
I was hired in 2017 to be the chief
02:05
business commentary writer so my my job
02:08
was to sort of look at the top world’s
02:11
business stories economic stories and
02:13
try to make sense of them in commentary
02:14
and when I do that I tend to try and
02:17
follow the money in order to narrow the
02:18
funnel of where to put my focus and I
02:20
had come across a really really
02:22
interesting statistic that 80%
02:25
of the world’s wealth corporate wealth
02:27
was living in 10% of companies and these
02:30
were the companies that had the most
02:31
data personal data and intellectual
02:34
property and so the biggest of those
02:36
were the big tech platforms that my my
02:38
book kind of tries to make icons of
02:41
we’re using all the candy colors here
02:43
the FAANGs Facebook Amazon Apple Netflix
02:47
Google so that was a pretty stunning
02:50
statistic and it was interesting because
02:51
I was thinking about how wealth since
02:54
2008 had transferred from the financial
02:56
sector into the big tech sector and that
02:59
had happened really quietly without a
03:02
whole lot of commentary in the press now
03:05
at the same time I was starting to kind
03:06
of dig into this story something else
03:08
happened a much more personal episode I
03:11
came home one day and I there was a
03:14
credit card bill waiting for me and I
03:16
opened it up and I started looking
03:17
through and there were all these tiny
03:19
charges in the amount of a dollar
03:21
ninety-nine three dollars five dollars
03:22
whatever and I noticed that they were
03:25
all from the app store and I thought oh
03:28
my gosh I must have been hacked and then
03:30
I thought who else has my password my
03:33
ten-year-old son Alex I see nods from
03:38
parents and others so I go downstairs
03:42
and I find Alex on the couch with his
03:44
phone which is his usual after-school
03:46
position and I say you know what what’s
03:50
up do you know anything about this and
03:51
he’s sort of stunned and oh yes oh that
03:55
yeah and turns out Alex has gotten very
03:58
fond of a game called FIFA Mobile which
04:01
is an online soccer game and it’s one of
04:03
these games that you can
04:04
download it for free but once you get
04:07
into the game and start playing you have
04:10
to buy stuff
04:11
in-app purchases it’s called or loot
04:13
boxes is another another name so if you
04:16
want to move up the rankings and do well
04:18
in the game
04:19
you have to buy virtual Ronaldo or some
04:22
new shoes for your player and nine
04:24
hundred dollars and one month later Alex
04:27
was at the top of the rankings but I was
04:32
horrified I was actually horrified and
04:34
fascinated in fact I mean as a
04:36
mother I was horrified his phone was
04:39
immediately confiscated passwords were
04:41
changed limitations were put into place
04:44
by the way he now officially is allowed
04:47
only one hour a day on his phone he’s 13
04:51
years old the average for that age is 7
04:54
hours a day national average now he
04:57
sneaks in an extra I think he probably
04:58
gets about 90 minutes because I can’t
05:00
police him all the time on the way to
05:01
the on the way to school but it’s I mean
05:04
to me that is a stunning fact that the
05:06
average American 13 year old spends 7
05:09
hours a day on their phone anyway so I was
05:12
horrified as a parent but I was
05:13
fascinated as a business writer because
05:15
I thought this is the most amazing
05:17
business model I have ever seen and I
05:20
have to learn everything about it and
05:22
right about that time someone had come
05:26
to see me a man named Tristan Harris
05:28
who’s one of the characters in my book
05:29
and Tristan is a really interesting guy
05:32
he was formerly the chief ethics officer
05:35
at Google and he was trying to bring
05:39
goodness and not evil to the company and
05:42
make sure that all the all the products
05:45
and services were functioning sort of in the
05:47
human interest and then he realized he
05:48
was not having any luck doing that
05:49
within the company so he decided to go
05:52
outside and start something called the
05:54
Center for Humane Technology and Tristan
05:57
had become really really worried about
05:59
the core business model that is it’s
06:02
particularly relevant for Google and
06:05
Facebook but is also a big part of
06:07
Amazon’s model and and it’s really the
06:08
model that another author Shoshana
06:10
Zuboff who recently wrote a wonderful
06:12
book on this topic would call
06:14
surveillance capitalism and so it’s the
06:16
idea of companies coming in and tracking
06:20
everything you are doing online and
06:22
increasingly offline you know if you
06:24
have your if you have an Android phone
06:25
it might know where you are in the
06:27
grocery store if you’re in a car with
06:29
smart technology your your location
06:32
coordinates can be tracked so all of
06:35
this is serving to build a picture of
06:37
you that is then used to be sold to
06:41
advertisers and then you can be targeted
06:44
with what’s called hyper targeted
06:46
advertising which is essentially why for
06:49
example
06:50
if I go online to look for a hotel in
06:53
California I might get a certain price
06:55
but someone else might get a different
06:57
price so this is a really important
06:59
thing we are looking at different
07:01
internets right there are subtle
07:04
differences but they’re there and this
07:06
data profile that is being built up is
07:08
splitting us as individual consumers but
07:12
I would argue that it’s also splitting
07:14
us as citizens and I’ll when I get to
07:16
the readings I’ll kind of flesh that out
07:18
a bit more but Tristan
07:20
kind of turned me on to this business
07:23
model and he also helped me connect the
07:25
dots between this business model and
07:27
what had happened to my son because it
07:29
turns out that the technologies these
07:31
sorts of nudges that take you down a
07:34
game or that bring you to certain places
07:36
on Amazon or that give you a certain
07:39
kind of search result or purchasing
07:41
option on Google are part of an entire
07:45
field called captology which is kind
07:49
of an Orwellian word and these these
07:52
technologies actually come largely out
07:54
of something called the Stanford
07:55
Persuasive Technology Lab so there is an
07:58
entire industry that is designed to
08:01
track your behavior and pull in things
08:03
like behavioral psychology casino gaming
08:06
techniques and then layer those on to
08:09
apps that will push you towards making
08:13
purchasing decisions or perhaps even
08:15
other kinds of decisions political
08:16
decisions that might be good for certain
08:19
actors and it’s interesting because when
08:22
I started to think about all this one of
08:24
the things I really wanted to do in this
08:26
book was to try and create a single
08:28
narrative arc to take folks through this
08:31
20 year evolution of this industry from
08:34
the mid-1990s which is really when the
08:36
consumer internet was born till now and
08:39
at the time I was writing and and still
08:41
probably today you could argue that
08:43
Facebook was the company that was
08:45
getting the most negative attention for
08:48
a lot of the economic and political
08:49
ramifications of its business model but
08:51
if you go back to the very beginning
08:53
Google is the most interesting way to
08:56
track this because Google really
08:59
invented the targeted advertising
09:01
business model they really invented
09:03
surveillance capitalism and one of the
09:05
things that is fascinating and and
09:06
sometimes I’m asked what’s the most
09:08
surprising thing that you found when
09:10
writing this book and really the most
09:11
surprising thing is it was all hiding in
09:14
plain sight so if you go back to the
09:17
original paper that Larry Page and Sergey
09:19
Brin who were the founders of Google did
09:21
in 1998 while at Stanford as graduate
09:25
students they actually lay out they lay
09:28
out what a giant search engine would
09:30
look like how it would function but then
09:31
how you might pay for it and if you go
09:34
down to page 33 there is a section in
09:36
the appendix called advertising and its
09:38
discontents and it essentially says that
09:42
if you monetize a search engine in this
way with hyper targeted advertising
the interests of the users and the
interests of the advertisers be they
companies or who knows what public
entities are eventually going to come
into conflict and so they actually
recommend that there be some kind of
academic search engine an open search
engine in the public interest so this to
10:05
me first of all is fascinating that it
10:07
was just there all along and fascinating
10:11
that very few people have read that
10:13
entire paper even those that
10:16
write about it which in some ways kind
10:18
of goes to the point that in the last 20
10:20
years we all do a lot less reading not
10:22
folks here but but in general we do less
10:25
reading there was actually a fascinating
10:26
study that came out recently from Common
10:28
Sense Media which is Jim Steyer’s group
10:30
in California that tracks children’s
10:33
behaviors online teenagers only
10:36
one-third of them read for pleasure more
10:39
than once a month
10:41
long-form articles doesn’t matter if
10:43
you’re reading on an e-book or device
10:44
but long-form articles books only once a
10:47
month for pleasure so all our entire
10:50
world has been changed economically
10:52
these companies have huge monopoly power
10:54
politically we’re all kind of living
10:56
with the ramifications of this new world
10:58
of social media disinformation fake news
11:01
and cognitively our brains are changing
11:05
our behaviors are changing so connecting
11:07
all of those things was really what I
11:10
was trying to get at in this book and so
11:13
I’m gonna read two or three maybe short
11:16
excerpts
11:17
and then we can leave a lot of time for
11:19
questions so that people can kind of
11:20
dive into as much of this as they want
11:23
and I’ll start perhaps with my very
11:28
first meeting with the Googlers Larry
11:33
Page and Sergey Brin who I met not in
11:36
Silicon Valley but in Davos the Swiss
11:39
gathering spot of the global power elite
11:42
where they had taken over a small Chalet
11:44
to meet with a select group of media the
11:47
year was 2007 the company had just
11:50
purchased YouTube a few months back and
11:52
it seemed eager to convince skeptical
11:54
journalists that this acquisition wasn’t
11:56
yet another death blow to copyright paid
11:58
content creation and the viability of
12:00
the news publications for which we
12:02
worked
12:02
unlike the buttoned-up consulting types
12:05
or the suited executives from the old
12:07
guard multinational corporations that
12:09
roamed the promenades of davos their
12:11
tasseled loafers slipping on the icy
12:13
paths the Googlers were a cool bunch
12:15
they wore fashionable sneakers and their
12:17
chalet was sleek white and stark with
12:19
giant cubes masquerading as chairs in a
12:21
space that looked as though it had been
12:23
repurposed that morning by designers
12:25
flown in from the valley in fact it may
12:27
have been and if so Google would not
12:29
have been alone in such excess I
12:30
remember attending a party once in Davos
12:32
hosted by Napster founder and former
12:34
Facebook president Sean Parker that
12:37
featured giant taxidermy bears and a
12:39
musical performance by John Legend back
12:42
in the Google chalet Brin and Page
12:44
projected a youthful earnestness as they
12:46
explained the company’s involvement in
12:48
authoritarian China and insisted
12:50
they’d never be like Microsoft which was
12:52
considered the corporate bully and
12:53
monopolist at the time what about the
12:55
future of news we wanted to know after
12:57
admitting that Page read only free news
12:59
online whereas Brin often bought the
13:01
Sunday New York Times in print it’s nice
13:03
he said cheerfully
13:04
the duo affirmed exactly what we
13:07
journalists wanted to hear Google they
13:09
assured us would never threaten our
13:10
livelihoods
13:11
yes advertisers were indeed migrating
13:14
en masse from our publications to the
13:15
web where they could target consumers
13:17
with a level of precision that the print
13:19
world could barely imagine but not to
13:21
worry Google would generously retool our
13:22
business models so we too could thrive
13:24
in the new digital world I was much
13:27
younger then and not the admittedly
13:29
cynical business journalist that I have
13:30
since become
13:31
and yet I listened skeptically
13:32
to that happy future of news
13:35
like lecture whether Google actually
13:37
intended to develop some brilliant new
13:40
revenue model or not what alarmed me was
13:42
that none of us were asking a far more
13:44
important question sitting towards the
13:46
back of the room somewhat conscious of
13:48
my relatively junior status I hesitated
13:50
waiting until the final moments of the
13:52
meeting before raising my hand excuse me
13:55
I said we’re talking about all this like
13:57
journalism is the only thing that
13:58
matters but isn’t this really about
13:59
democracy if newspapers and magazines
14:02
are all driven out of business by Google
14:04
or companies like it I asked how are
14:06
people gonna find out what’s going on
14:07
Larry Page looked at me with an odd
14:10
expression as if he were surprised that
14:11
someone should be asking such a naive
14:13
question oh yes we’ve got a lot of
14:16
people thinking about that
14:17
not to worry his tone seemed to say
14:19
Google had the engineers working on that
14:22
little democracy problem next question I
14:26
read that because I’m kind of amazed
14:30
there is still a real lack of
14:34
understanding I think in the valley
14:36
about some of the real negative
14:39
externalities of what have been let’s
14:41
face it amazing technologies I mean we
14:43
you know where would we be without
14:44
search and our smartphones we’re all
14:46
carrying around the power of a mainframe
14:47
in our pockets but as a journalist I
14:51
think there’s really been a an inability
14:54
of these companies to kind of own up to
14:56
you know some of the bad stuff that they
14:59
have wrought and I think that that still
15:00
considers oh sorry still continues to be
15:03
to be the case one of the other points
15:06
that I try and make in the book is that
15:09
the problems I’m talking about have
15:12
actually moved outside of just the big
15:14
four platform firms that that we’re
15:16
moving into a world in which
15:17
surveillance capitalism is going to be
15:19
part of the healthcare system and the
15:21
financial system and really every kind
15:24
of business is now using this as its
15:26
model so for example if you buy coffee
15:29
at Starbucks Starbucks knows a lot about
15:30
you Johnson & Johnson knows a lot about
15:33
you there there are firms watching you
15:35
all the time and so we’re really at a
15:37
pivot point I think where we have to ask
15:40
as a society what are the deeper
15:43
implications of this and are we
15:44
okay with them and so I would like to
15:47
read another excerpt where I look at how
15:50
this model is is moving into the
15:52
insurance sector and what that means so
15:58
far data has been obtained via computers
16:01
and mobile devices but now with the rise
16:03
of personal digital assistants like
16:05
Amazon’s Alexa Google’s Home Mini and
16:07
Apple’s Siri now in a third
16:10
of American homes with triple digit
16:12
sales growth a year the human voice is
16:14
the new gold while reports of Alexa
16:16
and Siri listening in on
16:18
conversations and phone calls are
16:19
disputed there’s no question that they
16:21
can hear every word you say and from
16:23
there it’s a short step to them using
16:24
that knowledge to direct your purchasing
16:26
decisions it isn’t much of a longer step
16:28
to see the political implications
16:30
already some researchers worry that
16:32
digital assistants will become even more
16:33
powerful tools than social media for
16:36
election manipulation certainly none of
16:38
us will be unaffected consider consider
16:41
that homeowner oops sorry
16:43
I’m reading from a reading from the
16:44
wrong part I think apologies somehow
16:54
picked the wrong section here anyway I’m
16:57
going to talk you through this example
16:58
because it’s it’s something that is
17:01
already out there I had a conversation a
17:03
couple of years ago with an executive
17:04
from Zurich Financial which is a big
17:07
financial company they do insurance many
17:10
parts of the world they will now if
17:12
you’d like them to put sensors in your
17:14
home or in your car and if you have for
17:18
example as I do you live in a 1901
17:20
townhouse let’s say you’re upgrading
17:22
your pipes you get a check you get a you
17:24
know a positive mark and you may see
17:26
your insurance premium go down but let’s
17:30
say your kid is smoking a joint in their
17:32
bedroom and the sensor picks up on that
17:34
you then get a black mark here and your
17:36
premium may go up same again in your car
17:39
if you’re speeding your insurance
17:42
company will know and so on and so forth
17:43
now you can either like this or not
17:46
depending on where you sit in the
17:48
socio-economic spectrum but what’s very
17:50
very interesting is that entire business
17:53
model a pooled risk business model
17:55
that’s what insurance is it’s now been
17:57
completely disrupted
17:58
so you can be targeted and split so this
18:02
is no longer about society pulling risk
18:04
sorry pooling risk this is about
18:06
individuals having to own the risk so if
18:09
you take that to its natural conclusion
18:12
you can imagine an elite up here that
18:17
has access to special pricing and all
18:19
kinds of great products but you can also
18:21
imagine an uninsurable group of people
18:25
at the bottom and then who is going to
18:28
pick up that risk now the public sector
18:30
maybe maybe there’ll be a junk bond
18:33
market for insurance either way you have
18:36
a split in society that didn’t exist
18:39
before and that was always the business
18:42
model here you know you go back and read
18:44
some of the early work of someone like
18:46
Hal Varian for example who was the chief
18:48
economist at Google splitting pricing
18:51
down to the individual was always the
18:53
point of platform technology firms like
18:56
Google or Facebook or Amazon splitting
18:58
individuals out so they could be
18:59
targeted in different ways but that not
19:01
only splits pricing it splits Society
19:05
and so that’s kind of really the the
19:07
core issue I want to get at here
19:10
I think I’ll maybe read just just one
19:13
more excerpt and then we can do we have
19:15
we have time yeah and then we’ll open it
19:17
up for questions after that my first
19:22
book just to mention again was about the
19:25
financial industry and one of the things
19:26
that strikes me is that big tech
19:28
companies have in some way become the
19:30
new too big to fail entities not only
19:33
are they holding more wealth and power
19:35
than the largest banks but in some ways
19:36
they function like banks they have a
19:39
tremendous amount of money they use it
19:41
to buy up corporate debt if that debt
19:44
were to go bad that could actually be
19:46
the beginnings of another financial
19:47
crisis and so that’s kind of a part of
19:49
this story that really hasn’t gotten out
19:51
there so let me let me read just two or
19:54
three more pages for you on that topic
19:57
the late great management guru Peter
20:00
Drucker once said in every major
20:01
economic downturn in US history the
20:03
villains have been the heroes during the
20:05
preceding boom I can’t help but wonder
20:08
if that might be the case over the next
20:10
few years as the United
20:11
States and possibly the world head
20:13
towards its next big slowdown downturns
20:16
historically come about once every
20:18
decade and it’s been more than that
20:19
since the 2008 financial crisis back
20:22
then banks were the too-big-to-fail
20:24
institutions responsible for our falling
20:26
stock portfolios home prices and
20:28
salaries technology companies by
20:30
contrast have led the market upswing
20:32
over the past decade but this time
20:34
around it’s the big tech firms that
20:36
could play the spoiler role you wouldn’t
20:39
think that it could be so when you look
20:40
at the biggest and richest tech firms
20:42
today take Apple for example Warren
20:44
buffett says he wished he owned even
20:45
more Apple stock Goldman Sachs is
20:47
launching a new credit card with the
20:48
tech Titan which became the world’s
20:50
first trillion-dollar market cap company
20:52
in 2018 but hidden within these bullish
20:55
headlines are a number of disturbing
20:57
economic trends of which Apple is
20:59
already exemplar study this one company
21:02
and you begin to understand how big tech
21:04
companies the new too-big-to-fail
21:05
institutions could indeed sow the seeds
21:08
of the next financial crisis the first
21:11
thing to consider is the financial
21:12
engineering done by such firms like most
21:15
of the largest and most profitable
21:17
multinational companies Apple has loads
21:19
of cash about 300 billion as well as
21:22
plenty of debt close to 122 billion
21:24
that’s because like nearly every other
21:27
large rich company it has parked most of
21:30
its spare cash in offshore bond
21:32
portfolios over the last ten years at
21:34
the same time since the 2008 crisis is
21:37
that it is issued cheap debt at rates to
21:41
do sorry it is issued cheap rate sorry
21:44
cheap debt at low rates in order to do
21:48
record amounts of share buybacks and
21:50
dividends Apple’s responsible for about a
21:53
quarter of the 407 billion in buybacks
21:55
announced since the Trump tax bill was
21:57
passed in December of 2017 but buybacks
22:00
have bolstered mainly the top 10% of the
22:03
US population that owns 84% of all stock
22:06
the fact that share buybacks have become
22:08
the biggest single use of corporate cash
22:10
for over a decade now has buoyed markets
22:13
but it’s also increased the wealth
22:15
divide which many economists
22:17
believe is not only the single
22:19
biggest factor in slower than historic
22:21
trend growth but is also driving
22:22
political populism which threatens the
22:25
market system itself that phenomenon has
22:28
been put on steroids by the rise of yet
22:30
another trend epitomized by Apple
22:33
intangibles such as intellectual
22:35
property and brands now make up a much
22:37
larger share of wealth in the global
22:39
economy the digital economy has a
22:41
tendency to create super stars since
22:43
software and internet services are so
22:45
scalable and they enjoy network effects
22:50
let’s see do but as these as software
22:56
and internet services become a bigger
22:58
part of the economy they reduce
23:00
investment across the economy as a whole
23:02
and that’s not only because banks are
23:03
reluctant to lend to businesses whose
23:06
intangible assets may simply disappear
23:08
if they go belly-up but because of the
23:10
winner-take-all effect that a handful of
23:12
companies including Apple Amazon and
23:14
Google enjoy so to sum this up in plain
23:17
English as this handful of companies has
23:20
gotten bigger and more powerful
23:21
investment in the overall
23:23
economy has declined the number of jobs
23:26
that they’re creating relative to their
23:28
market size is much lower than that in
23:30
the past so you have the superstar
23:32
economy that has become kind of a
23:33
winner-take-all game I think that we’re
23:37
going to probably see some kind of a
23:39
market correction in the next couple of
23:41
years it’s going to be very interesting
23:43
at that point to see whether tech leads
23:45
the markets down and whether you might
23:47
then see a kind of an Occupy Silicon
23:49
Valley sentiment as you did in 2008 with
23:53
Occupy Wall Street I think that that’s
23:54
really quite possible we can delve more
23:57
into that if you’d like but I think I
23:59
want to stop here and be respectful of
24:01
question time and there are parts that
24:04
you guys want to hear more about or
24:06
particular areas that I could read more
24:08
from you can let me know go ahead
24:15
because sure we don’t get to speak very
24:18
often you and I one is you’ve doubtless
24:23
read about Bloomberg’s decision recently
24:26
to forbid its reporters from covering
24:28
Michael Bloomberg yeah yet The
24:31
Washington Post has no problem
24:34
investigating Bezos do you see is that
24:38
a problem for you have you thought about
24:40
that is that an
24:43
inconsistency that should bother
24:45
financial journalists and the second
24:46
question is how important for any
24:51
solution to the problems you you raise
24:53
would antitrust or the revival of
24:56
antitrust be as we see on the continent
24:59
where it’s more aggressive and among
25:01
some of the the Democratic candidates
25:04
for the president well so let me take
25:06
the antitrust question first that’s
25:08
actually important part of the book
25:10
there’s an entire chapter on antitrust
25:12
and I think we probably are gonna see
25:15
some shifts as folks may know since the
25:19
1980s onward antitrust in America has
25:23
basically been predicated on price so as
25:25
long as consumer prices were falling it
25:28
was perceived that companies could be as
25:30
big as they wanted that it wasn’t a
25:31
problem but one of the things I look at
25:34
in the book is this this shift to a
25:36
world in which transactions are being
25:39
done not in dollars but in data so
25:42
that’s a that’s a barter transaction
25:43
really and one of the things that’s so
25:45
interesting and this is actually a way
25:47
in another way in which Silicon Valley
25:49
is similar to Wall Street the
25:50
transaction is really opaque so you
25:53
don’t know essentially how much you’re
25:55
paying for the supposedly free service
25:58
that you’re receiving that is a very
26:02
difficult market to create fairness
26:04
within and it probably makes the Chicago
26:07
School notion of consumer prices going
26:10
down no problem I think probably
26:13
irrelevant and so there’s two ways in
26:15
which that’s being dealt with you have
26:17
the rise of this new Brandeis school of
26:19
thinking in which you know maybe this is
26:22
really about power maybe maybe we should
26:25
think about the big tech firms
26:26
the way we do the nineteenth-century railroads
26:28
right you know you had at one
26:30
point railroad Titans that would come in
26:33
and build tracks and then own the cars
26:35
and then own the things that were in the
26:37
cars and eventually that became a
26:39
zero-sum game and it’s you know it’s as
26:42
folks probably know we’re in a period in
26:44
which there’s as much concentration of
26:46
wealth and power as there was in the
26:47
Gilded Age so I could imagine very
26:50
easily a scenario in which you could
26:51
justify Amazon say being the platform
26:55
for e-commerce but not being able to
26:57
compete in the specific areas of fashion
27:01
or you know whatever else they’re
27:03
selling against other customers and in
27:05
fact that’s already the case in the
27:06
financial sector that big companies that
27:09
trade let’s say aluminum you know as
27:12
Goldman Sachs did this is what it ran
27:14
into a suit a few few years ago that it
27:16
was both owning all the aluminum and
27:18
trading it and that’s that’s
27:20
anti-competitive and so that became an
27:22
issue for the Fed so I think we probably
27:24
are going to see that kind of ruling as
27:26
for the post and journalism you know
27:29
it’s funny I have some friends that are
27:30
they’re quite influential at the Post and
27:34
they say that Bezos is pretty hands-off
27:37
I mean I can’t I can’t vouch for that
27:38
one thing I will say is that Amazon did
27:41
put this book on the top 20 nonfiction
27:43
books of the month so you know I don’t
27:46
know if that’s a ploy to make me think
27:48
that they’re they’re being really fair
27:49
but from probably Jeff Bezos I don’t
27:52
know he’s probably not thinking that
27:53
much about this book or me but anyway
27:56
next question go ahead so it seems like
27:59
some of the major decisions that these
28:01
big tech companies are making are in
28:04
regard to fake news and how they’re
28:06
moderating fake news or the lack of it
28:08
so have you seen maybe an approach by
28:11
any current social media platform or any
28:13
proposed plans in place that you think
28:15
would be best for moderating fake news
28:17
that’s such a good question so just to
28:20
kind of pull back the the two points of
28:22
view on that are hey look you know the
28:26
platform tech companies are essentially
28:27
giant media and advertising firms right
28:30
I mean if you look at the business model
28:31
of a Google or a Facebook it’s
28:34
essentially just like the Financial
28:35
Times or CNN it’s just much more
28:37
effective and it can be targeted to the
28:39
individual
28:40
that means that these firms have taken
28:42
you know 85 90 percent of the new
28:45
digital advertising pie in the last few
28:47
years now given that they function as
28:49
media companies should they not be
28:51
liable for disinformation in the way
28:55
that a media company would be so if I
28:57
print something incorrect at the FT
28:59
that’s you know the the paper and also
29:02
my hide on the line there I think that
29:05
we should actually think about rolling
29:08
back some of those loopholes that these
29:09
firms enjoy since the mid-1990s onwards
29:12
I think that they are going to have to
29:14
take some responsibility now the
29:16
question is do we want Mark Zuckerberg
29:18
being the minister of truth and that’s
29:20
that’s that’s a really tough question
29:23
what I would prefer is for the
29:26
government to actually you know for
29:28
democratically elected governments to
29:29
come up with rules about what is and
29:32
isn’t appropriate and to not have
29:34
individual companies making those
29:36
choices I think we’re in a period right
29:38
now where you know you’ve got Twitter
29:40
you’ve got Google to a certain extent
29:41
coming out saying okay we recognize we
29:43
need to do things differently that’s
29:44
putting pressure on Facebook but at the
29:46
end of the day we’re gonna have to have
29:47
I think an entirely new framework not
29:51
just in this area but also in taxation
29:53
in you know in antitrust which we’ve
29:56
already talked about this is the shift
29:58
that we’re going through is I think the
30:00
new Industrial Revolution it’s a 70 year
30:03
transition and it’s going to require a
30:04
lot of different frameworks relative to
30:07
what we already have so the answer is no
30:10
I don’t see any particular company that
30:12
has come up with the right framework yet
30:14
any other questions
30:16
oh yeah I’d like to go back to antitrust
30:18
for a minute the Washington Post put up
30:20
an article just this afternoon about how
30:23
Apple is changing its business model and
30:25
it’s different as you know it’s
30:27
differentiated itself in the market by
30:29
saying they care about privacy well now
30:32
they are moving from a a device company
30:38
to a services company according to the
30:40
article and they are used and they are
30:43
using privacy as a lever to provide
30:46
services that their that other smaller
30:51
companies like Tile which is the example
30:54
the article has used to create a market
30:59
for itself right and so it says in the
31:03
article that the feds are considering
31:04
looking at antitrust measures against
31:06
Apple but I think it raises a bigger
31:09
question that you pointed to which is
31:13
that the models of antitrust don’t work
31:16
anymore so in terms of privacy lots of
31:22
people have talked about monetizing
31:24
privacy getting paid for your data but how
31:27
do you think from an economic point of
31:30
view we as a society need to look at the
31:33
role of privacy and the role of
31:35
antitrust together to somehow change the
31:38
way we think about these companies
31:41
because in addition we’ve got
31:43
consolidation in the marketplace so yeah
31:45
no longer fair competition you can’t
31:48
become another Amazon right easily
31:51
because there are so many big so mate
31:53
because the players are big and there
31:55
are so few of them in each part of the
31:58
economy yeah a right so there’s a lot in
32:00
what you’ve just said for starters I
32:03
think you’re hitting on something really
32:04
important which I get at in my solutions
32:06
chapter that this is such a huge shift
32:09
and it’s touching so many different
32:11
areas and we’ve talked about privacy
32:13
we’ve talked about antitrust we haven’t
32:15
even gotten into national security you
32:17
know civil liberties I mean there there
32:19
are so many different areas and when you
32:21
one of the things I noticed when I sat
32:23
down to write the solution sections you
32:25
know when you do a think book you always
32:26
have to have the solutions section and
32:28
you know the publisher wants like that
32:29
Silver Bullet thing and you look at this
32:32
and you notice that when you pull a
32:33
lever here it affects something in this
32:35
other areas so I think that’s one reason
32:38
why we should have a national committee
32:41
to actually look at what are all the
32:43
questions it’s when I speak to folks
32:45
particularly in DC policymakers there’s
32:47
you know the antitrust camp here the
32:49
privacy camp here the security folks
32:50
there that conversation needs to be
32:52
happening in a 360 way and it is
32:54
happening much more so that way in
32:57
Europe I will say I just came off of two
32:59
weeks of book touring in Europe and the
33:02
conversation there I think is much more
33:04
developed and they seem to be to go to
33:06
your point about the ecosystem and how
33:08
share it one of the things that seems to
33:11
be folks seem to be headed towards is a
33:13
public digital Commons a kind of a
33:16
database let’s say alright if you decide
33:19
as you know the cat seems to be out of
33:21
the bag that we’re gonna allow
33:22
surveillance capitalism I mean there
33:24
there are certain folks like Shoshana
33:26
would love to see the dial turned back
33:28
I’m not sure if that’s possible let’s
33:30
have a public database in which not just
33:33
one corporation or a handful of
33:35
corporations but multiple sized players
33:37
as well as the public sector as well as
33:40
individual citizens who’s you know after
33:43
all it’s our data being harvested
33:45
everybody gets access and then you can
33:47
figure out how you want to share the pie
33:49
and one interesting example recently is
33:51
the Google Sidewalk project in Toronto
33:54
it sounds like you’re up on these issues
33:56
so you’re probably aware but Google had
33:59
taken over sort of twelve acres on the
34:01
Toronto Waterfront and put sensors
34:04
everywhere and the idea was to create a
34:06
smart city in which you’d be able to
34:08
manage traffic patterns and energy usage
34:10
and things like that but until recently
34:12
Google was going to own all that data
34:14
and have access to and finally the
34:16
Toronto government got a clue and said
34:18
well actually you know what let’s put
34:19
this in a public database so other
34:22
smaller or midsize local firms can come
34:25
in and be part of that economic
34:26
ecosystem but also as a public sector we
34:30
can decide well maybe we want to share
34:32
data for energy issues or for health
34:36
issues but maybe we don’t want to share
34:37
it for certain other kinds of things and
34:40
perhaps there would be some way in which
34:42
individuals could take back some of that
34:44
value so California is thinking about a
34:47
digital dividend payment from the big
34:49
tech companies there’s also been talk of
34:51
a digital sovereign wealth fund if you
34:53
think about kind of data as the new oil
34:56
whatever the value is judged to be it
34:59
would be put in the sovereign wealth
35:01
fund in the same way that Alaska or
35:02
Wyoming give back payments or use that
35:05
for the the public sector that could be
35:08
done with data too so I think something
35:10
like that is probably going to be the
35:12
best solution I’ll tell you I have many
35:14
examples in the book of ways in which
35:16
the bigger players have been able to
35:18
squash small and mid-sized firms and
35:20
that’s
35:21
a major issue and a lot of venture
35:23
capitalists that I speak to are actually
35:26
becoming concerned about that because
35:27
they say that there’s sort of black
35:29
zones of innovation where if Amazon is
35:33
there or Google is there you really
35:34
can’t start a business there’s just been
35:36
too much that’s been
35:38
ring-fenced question over here
35:40
yes while your book may be the the best
35:43
one on the subject there’ve certainly
35:44
been other books before talking about
35:46
individuals’ privacy and their data
35:49
and everything about them why is it that
35:52
you think people are so unconcerned
35:56
about handing over all of their data to
35:58
these companies when they are perhaps
36:00
very concerned about handing it over to
36:02
the government why why do they feel
36:04
these guys are the good guys and the
36:06
government is necessarily the bad guys
36:08
yeah it’s such an interesting question
36:11
and that really varies from country to
36:13
country I find that that’s sort of an
36:15
interesting cultural dynamic that can
36:17
shift depending on what market you’re in
36:19
I have really been puzzled as to why
36:23
people are so first of all why everybody
36:25
just clicks the box and says no problem
36:27
I think part of that is is the opacity
36:29
of the market I mean if you kind of go
36:31
back to Adam Smith basic economics you
36:35
need three things to make a market
36:36
function properly that would be
36:38
equal access to data transparency in the
36:41
transaction and a shared moral framework
36:44
and you could argue that none of those
36:46
things are in place so when we’re making
36:48
these transactions I think as that’s
36:51
that very fact becomes better explained
36:56
and people begin to kind of understand
36:58
that narrative like the insurance
36:59
example I just gave that all right
37:02
you’re getting something but you’re
37:03
giving up a lot I’m beginning to see
37:07
pushback already and I suspect in recent
37:10
weeks as some of the big players have
37:11
moved into healthcare you know into into
37:14
the commercial banking business I just
37:17
think that we are going to begin to see
37:18
more people being reluctant to give up
37:23
that much value for what they’re getting
37:25
you’re also interestingly seeing when
37:28
there are other options people will go
37:31
elsewhere so Jimmy Wales who started
37:33
Wikipedia just I think
37:34
the weeks ago came up with a new social
37:36
networking site he’s already got 300,000
37:38
users there and it’s an odd
37:41
they don’t do targeted advertising it’s
37:42
run on the wiki model where you can
37:44
donate if you want I think once the
37:47
antitrust piece is in place and you
37:49
actually have space for new competitors
37:52
to come in and to offer up different
37:54
kinds of services that perhaps are more
37:56
respectful of privacy that you you know
37:58
you could see a shift there but I’m
38:00
curious actually can I poll the audience
38:02
for a minute because I want to ask how
38:04
many people think that in the next five
38:07
years individuals are going to become
38:09
more worried about giving up information
38:12
that’s going to change their behavior
38:13
online so like two-thirds but not yeah
38:19
that’s interesting okay oh go ahead
38:23
sorry we’re sheep we’re cheap oh my god
38:26
that was a different book curious if you
38:30
see the administration’s
38:32
suggestion that California can’t
38:35
set its own rules for gas mileage and so
38:41
on and emissions as having a parallel in
38:44
this area you know I hadn’t thought
38:48
about that question before I always
38:50
think about California as really being
38:53
very leading what is eventually going to
38:55
become the national standard and I think
38:58
in data I feel like that is gonna happen
39:01
you know even the Europeans in fact are
39:04
saying that the California model is
39:06
probably the better model for data data
39:08
protection and privacy and sharing of
39:11
value so the Europeans have GDPR you
39:13
know which was kind of the first step in
39:15
the privacy direction but it doesn’t
39:17
take into account that economic
39:18
ecosystem so perversely you have the big
39:21
companies maybe being able to do better
39:23
with the GDPR model and smaller ones
39:26
getting cut out of the loop because they
39:27
don’t have the legal muscle to kind of
39:29
deal with all the rules so I do think
39:31
the California model is going to become
39:32
a de facto standard we also haven’t
39:34
talked about China which is of course
39:36
going its own way and I have it I have a
39:38
chapter in the book where I look at that
39:40
I look at the current trade war tech war
39:43
kind of through the lens of surveillance
39:44
capitalism and
39:46
that’s gonna be very interesting I think
39:48
one of the big probably the biggest mid
39:52
to long-term economic question for me is
39:54
are we going to see a transatlantic
39:56
alliance around digital trade and coming
39:59
up with some standards because China is
40:01
going its own direction it’s going to
40:02
develop its own ecosystem it has its own
40:04
big players the U.S. is in another
40:07
category but where is Europe gonna be is
40:09
it going to be a tripolar world is it
40:11
going to be a bipolar world in terms of
40:13
how all this works that that’s a major
40:15
economic and actually foreign
40:16
policy question I think hey thanks for
40:21
coming and thinking um I’m wondering we
40:25
have like a Department of Agriculture we
40:27
have a Department of Energy will there
40:29
be a Department of Technology ever in
40:30
the US and which other countries already
40:33
have that kind of thing going yeah
40:35
England is talking about that actually I
40:37
think kind of an FDA of Technology is
40:40
probably a very good idea you know I see
40:44
going back to the example about my son
40:46
there there
40:46
the research is nascent and causality is
40:49
is difficult to prove but there there’s
40:51
you know a new body of research since
40:54
2011 2012 when smartphones really became
40:57
ubiquitous showing that levels of
40:59
anxiety and depression in younger
40:59
people are rising you know there there
41:01
there are issues of self harm sometimes
41:07
when people you know use these
41:08
technologies addictively so I think that
41:11
that’s that’s a big issue to me it’s
41:12
very similar to cigarettes you know
41:14
those were regulated there was a
41:16
different narrative and then behaviors
41:17
changed and I think I think that that’s
41:20
one area to consider policy wise there
41:25
may be time for one or two more
41:26
questions
41:27
okay sorry over here and then over here
41:29
hi
41:30
I’m kind of curious what you think about
41:33
the fact that most of these
41:36
conversations around technology or even
41:38
democracy tends to focus on institutions
41:41
and systems and structures which is
41:44
great because they are so powerful and
41:47
ubiquitous my background is in teaching
41:51
critical thinking and in conflict
41:54
management and I what I worry
41:58
that so little attention is being paid
42:01
to the intelligence and maturity of the
42:05
citizenry I’m from India after 70 years
42:10
of democracy we’ve lost it I think it’s
42:15
simplistic to blame the right-wing
42:18
leaders and the government I believe we
42:22
as a people have not developed the
42:24
maturity to be effective intelligent
42:30
citizens we don’t have the values we are
42:34
still feudal we are still extremely
42:37
hierarchical we don’t have the
42:40
democratic values in India and we didn’t
42:43
cultivate it over 70 years I see a
42:46
parallel to being susceptible to the
42:52
seductions of Technology whether it be
42:56
free news or the click baiting or
43:00
anything that the big companies seduce
43:04
us with that even as we need as you said
43:08
they an FDA kind of for technology we
43:13
seem to be absolving ourselves of the
43:17
responsibility of being you know of
43:20
waking up and no se pians I hear I hear
43:24
what you’re saying and it’s interesting
43:25
two things come to mind first of all as
43:28
I say I just got back from Europe the
43:29
debate is much more nuanced there and
43:33
and further along and I think that’s in
43:36
part because there was not quite as much
43:39
pendulum shift in the last 40 or 50
43:41
years from the public sector to the
43:43
private sector as there was here I think
43:46
I’m not quite sure if I agree entirely
43:49
with your point about institutions I
43:50
think in some ways part of the problem
43:53
one of the reasons why we have
43:54
concentration levels that are the same as
43:57
they were in the 19th century is that
44:00
you know we have a generation of
44:03
business leaders that grew up in the 80s
44:04
thinking that the government was only
44:06
good for cutting taxes and there’s hyper
44:09
individualism that’s that’s
44:12
the entire economic model and in some
44:14
ways I think that you know Facebook is
44:16
maybe the apex of the neoliberal
44:19
economic model if you think about the
44:22
problems of globalization were that cap
44:25
but you know it was supposed to be
44:27
globalization was supposed to be about
44:28
capital goods and people crossing
44:30
borders well it turned out the capital
44:31
could cross a lot faster than either
44:33
goods or people if you take that into
44:36
the world of data that’s even more true
44:38
and so I think that you have a group of
44:41
companies now that have really
44:44
turbocharged a lot of the problems that
44:46
have given us the politics that we have
44:49
now and and a company like Facebook I
44:51
mean I think it every time Zuckerberg is
44:52
on the hill it’s like there’s this
44:54
attitude that they are supranational you
44:56
know and kind of flying 35,000 feet
44:59
above national concerns and I think that
45:02
that’s part of a larger shift and
45:04
probably going to be a big part of the
45:05
2020 debate right are we gonna now have
45:08
a pendulum shift back away from private
45:12
power to some public power some
45:14
different sharing of that which is a
45:16
values question which I think gets at
45:18
some of what you’re talking about
45:20
long-winded answer anyway I think we
45:22
have time for maybe one more question
45:23
yeah two quick questions okay one is some
45:27
of the tech companies especially the
45:29
platform companies have you know why
45:32
should we not consider looking at them
45:35
as utility companies yeah I mean we’ve
45:39
had phone companies and as far as I know
45:41
they don’t data mine our conversations
45:43
and maybe mistaken a bit right I mean
45:46
right that’s they could easily right
45:49
right yes it’s different different
45:50
business model yeah yeah so so that was
45:52
one the other thing is you mentioned
45:54
that eventually we need tech policy
45:56
around this and the issue at least my
45:59
issue is that the people who make these
46:01
decisions the the policy makers they
46:05
just most of them don’t have the
46:07
technical background right to properly
46:10
assess the different choices and make
46:12
those decisions I mean I think one of
46:15
them Zuckerberg or someone testified the
46:17
questioning was just awful I mean they
46:20
just ignore our tech support was
46:22
terrible
46:23
yeah exactly so I know anyway whatever
46:27
thoughts you have no that’s a great and
46:29
that’s like maybe a great place to sort
46:31
of wrap up I think the utility model is
46:34
completely viable and it’s interesting
46:36
one of the bits of pushback that you’ll
46:38
sometimes get from folks in the valley
46:40
about that is well if we’re if we’re
46:42
split in this way or if the the capacity
46:46
to innovate is sort of you know
46:47
compressed as the profit margins would
46:49
be compressed in a utility model that’ll
46:52
be bad for innovation not really I mean
46:54
there’s the statistics show for starters
46:56
that companies innovate more when
46:58
they’re smaller they tend to innovate
47:00
more when they’re private and breakups
47:03
in the past you can argue have actually
47:05
created more innovation so a lot of
47:07
academics would say that even the the
47:10
the the antitrust just the threat of
47:13
antitrust action against Microsoft was
47:15
one of the reasons that Google was
47:16
allowed to to blossom as it did you can
47:19
go back to the breakup of the bells and
47:22
say maybe that created space for the
47:25
cellphone industry to to move ahead so I
47:28
think there’s a lot of examples that a
47:31
more decentralized model is actually a
47:34
good thing and I think that that is
47:36
actually going to be a really important
47:37
thing because right now there’s this I
47:40
think very perverse debate in the U.S.
47:42
that is bringing together parts of the
47:45
far right and parts of the far left that
47:47
all right we need these companies to
47:48
stay big because they’re the national
47:50
champions in the coming war with
47:52
China that is complete bunk that is
47:56
not borne out first of all I mean these
47:58
companies would love to be in China if
47:59
they could get into China you know I
48:02
think decentralized is the advantage in
48:06
all respects in the US economically so
48:09
yeah I have no problems with a utility
48:12
model anyway I think my time is up but
48:15
I’d be happy to sign books and answer
48:17
any other questions here at the table
48:18
and thanks so much for your attention
48:19
[Applause]

How Google Interferes With Its Search Algorithms and Changes Your Results

The internet giant uses blacklists, algorithm tweaks and an army of contractors to shape what you see

Every minute, an estimated 3.8 million queries are typed into Google, prompting its algorithms to spit out results for hotel rates or breast-cancer treatments or the latest news about President Trump.

They are arguably the most powerful lines of computer code in the global economy, controlling how much of the world accesses information found on the internet, and the starting point for billions of dollars of commerce.

Twenty years ago, Google founders began building a goliath on the premise that its search algorithms could do a better job combing the web for useful information than humans. Google executives have said repeatedly—in private meetings with outside groups and in congressional testimony—that the algorithms are objective and essentially autonomous, unsullied by human biases or business considerations.

The company states in a Google blog, “We do not use human curation to collect or arrange the results on a page.” It says it can’t divulge details about how the algorithms work because the company is involved in a long-running and high-stakes battle with those who want to profit by gaming the system.

But that message often clashes with what happens behind the scenes. Over time, Google has increasingly re-engineered and interfered with search results to a far greater degree than the company and its executives have acknowledged, a Wall Street Journal investigation has found.

Those actions often come in response to pressure from businesses, outside interest groups and governments around the world. They have increased sharply since the 2016 election and the rise of online misinformation, the Journal found.

Google’s evolving approach marks a shift from its founding philosophy of “organizing the world’s information,” to one that is far more active in deciding how that information should appear.

More than 100 interviews and the Journal’s own testing of Google’s search results reveal:

• Google made algorithmic changes to its search results that favor big businesses over smaller ones, and in at least one case made changes on behalf of a major advertiser, eBay Inc., contrary to its public position that it never takes that type of action. The company also boosts some major websites, such as Amazon.com Inc. and Facebook Inc., according to people familiar with the matter.

• Google engineers regularly make behind-the-scenes adjustments to other information the company is increasingly layering on top of its basic search results. These features include auto-complete suggestions, boxes called “knowledge panels” and “featured snippets,” and news results, which aren’t subject to the same company policies limiting what engineers can remove or change.

• Despite publicly denying doing so, Google keeps blacklists to remove certain sites or prevent others from surfacing in certain types of results. These moves are separate from those that block sites as required by U.S. or foreign law, such as those featuring child abuse or with copyright infringement, and from changes designed to demote spam sites, which attempt to game the system to appear higher in results.

• In auto-complete, the feature that predicts search terms as the user types a query, Google’s engineers have created algorithms and blacklists to weed out more-incendiary suggestions for controversial subjects, such as abortion or immigration, in effect filtering out inflammatory results on high-profile topics.

• Google employees and executives, including co-founders Larry Page and Sergey Brin, have disagreed on how much to intervene on search results and to what extent. Employees can push for revisions in specific search results, including on topics such as vaccinations and autism.

• To evaluate its search results, Google employs thousands of low-paid contractors whose purpose the company says is to assess the quality of the algorithms’ rankings. Even so, contractors said Google gave feedback to these workers to convey what it considered to be the correct ranking of results, and they revised their assessments accordingly, according to contractors interviewed by the Journal. The contractors’ collective evaluations are then used to adjust algorithms.

THE JOURNAL’S FINDINGS undercut one of Google’s core defenses against global regulators worried about how it wields its immense power—that the company doesn’t exert editorial control over what it shows users. Regulators’ areas of concern include anticompetitive practices, political bias and online misinformation.

Far from being autonomous computer programs oblivious to outside pressure, Google’s algorithms are subject to regular tinkering from executives and engineers who are trying to deliver relevant search results, while also pleasing a wide variety of powerful interests and driving its parent company’s more than $30 billion in annual profit. Google is now the most highly trafficked website in the world, surpassing 90% of the market share for all search engines. The market capitalization of its parent, Alphabet Inc., is more than $900 billion.

Google made more than 3,200 changes to its algorithms in 2018, up from more than 2,400 in 2017 and from about 500 in 2010, according to Google and a person familiar with the matter. Google said 15% of queries today are for words, or combinations of words, that the company has never seen before, putting more demands on engineers to make sure the algorithms deliver useful results.

A Google spokeswoman disputed the Journal’s conclusions, saying, “We do today what we have done all along, provide relevant results from the most reliable sources available.”

Lara Levin, the spokeswoman, said the company is transparent in its guidelines for evaluators and in what it designs the algorithms to do.

AS PART OF ITS EXAMINATION, the Journal tested Google’s search results over several weeks this summer and compared them with results from two competing search engines, Microsoft Corp.’s Bing and DuckDuckGo, a privacy-focused company that builds its results from syndicated feeds from other companies, including Verizon Communications Inc.’s Yahoo search engine.

The testing showed wide discrepancies in how Google handled auto-complete queries and some of what Google calls organic search results—the list of websites that Google says are algorithmically sorted by relevance in response to a user’s query.

Ms. Levin, the Google spokeswoman, declined to comment on specific results of the Journal’s testing. In general, she said, “Our systems aim to provide relevant results from authoritative sources,” adding that organic search results alone “are not representative of the information made accessible via search.”

The Journal tested the auto-complete feature, which Google says draws from its vast database of search information to predict what a user intends to type, as well as data such as a user’s location and search history. The testing showed the extent to which Google doesn’t offer certain suggestions compared with other search engines.

When the Journal typed “Joe Biden is” or “Donald Trump is” into the search box, Google’s auto-complete offered predicted language that was more innocuous than that of the other search engines. Similar differences were shown for other presidential candidates tested by the Journal.

The Journal also tested several search terms in auto-complete such as “immigrants are” and “abortion is.” Google’s predicted searches were less inflammatory than those of the other engines.

See the results of the Journal’s auto-complete tests
Percentages indicate how many times each suggestion appeared during the WSJ’s testing.
GOOGLE
  • done (100%)
  • how old (100%)
  • from (99%)
  • running for president (79%)
  • he democrat (78%)
  • he running for president (76%)
  • toast (71%)
  • a democrat (70%)
DUCKDUCKGO
  • an idiot (100%)
  • creepy (100%)
  • from what state (100%)
  • too old to run for president (100%)
  • a moron (94%)
  • a liar (84%)
  • a joke (78%)
  • done (22%)
  • a creep (22%)

Gabriel Weinberg, DuckDuckGo’s chief executive, said that for certain words or phrases entered into the search box, such as ones that might be offensive, DuckDuckGo has decided to block all of its auto-complete suggestions, which it licenses from Yahoo. He said that type of block wasn’t triggered in the Journal’s searches for Donald Trump or Joe Biden.

A spokeswoman for Yahoo operator Verizon Media said, “We are committed to delivering a safe and trustworthy search experience to our users and partners, and we work diligently to ensure that search suggestions within Yahoo Search reflect that commitment.”

Said a Microsoft spokeswoman: “We work to ensure that our search results are as relevant, balanced, and trustworthy as possible, and in general, our rule is to minimize interference with the normal algorithmic operation.”

In other areas of the Journal analysis, Google’s results in organic search and news for a number of hot-button terms and politicians’ names showed prominent representation of both conservative and liberal news outlets.

ALGORITHMS ARE effectively recipes in code form, providing step-by-step instructions for how computers should solve certain problems. They drive not just the internet, but the apps that populate phones and tablets.

Algorithms determine which friends show up in a Facebook user’s news feed, which Twitter posts are most likely to go viral and how much an Uber ride should cost during rush hour as opposed to the middle of the night. They are used by banks to screen loan applications, businesses to look for the best job applicants and insurers to determine a person’s expected lifespan.

In the beginning, their power was rarely questioned. At Google in particular, its innovative algorithms ranked web content in a way that was groundbreaking, and hugely lucrative. The company aimed to make the web useful while relying on the assumption that code alone could do the heavy lifting of figuring out how to rank information.

But bad actors are increasingly trying to manipulate search results, businesses are trying to game the system and misinformation is rampant across tech platforms. Google found itself facing a version of the pressures on Facebook, which long said it was just connecting people but has been forced to more aggressively police content on its platform.

A 2016 internal investigation at Google showed between a 10th of a percent and a quarter of a percent of search queries were returning misinformation of some kind, according to one Google executive who works on search. It was a small number percentage-wise, but given the huge volume of Google searches it would amount to nearly two billion searches a year.
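The scale implied by those figures can be checked with simple arithmetic. The short sketch below works backward from the Journal’s estimate of nearly two billion affected searches a year to the total annual search volume that each end of the reported range would imply. Google doesn’t disclose search volumes, so the totals are implied values rather than reported ones, and the function name is only illustrative.

```python
def implied_annual_searches(affected_per_year: float, affected_share: float) -> float:
    """Total yearly searches implied by an affected count and the share it represents."""
    return affected_per_year / affected_share


AFFECTED = 2_000_000_000  # "nearly two billion searches a year" (the Journal's estimate)
LOW_SHARE = 0.001         # a 10th of a percent
HIGH_SHARE = 0.0025       # a quarter of a percent

for share in (LOW_SHARE, HIGH_SHARE):
    total = implied_annual_searches(AFFECTED, share)
    print(f"{share:.2%} share implies about {total / 1e12:.1f} trillion searches a year")
```

At the low end of the range, two billion affected searches would correspond to roughly two trillion searches a year; at the high end, to roughly 0.8 trillion.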

By comparison, Facebook faced congressional scrutiny for Russian misinformation that was viewed by 126 million users.

Google’s Ms. Levin said the number includes not just misinformation but also a “wide range of other content defined as lowest quality.” She disputed the Journal’s estimate of the number of searches that were affected. The company doesn’t disclose metrics on Google searches.

Google assembled a small SWAT team to work on the problem that became known internally as “Project Owl.” Borrowing from the strategy used earlier to fight spam, engineers worked to emphasize factors on a page that are proxies for “authoritativeness,” effectively pushing down pages that don’t display those attributes.

Other tech platforms, including Facebook, have taken a more aggressive approach, manually removing problem content and devising rules around what they define as misinformation. Google, for its part, said its role of “indexing” content, rather than “hosting” it as Facebook does, means it shouldn’t take a more active role.

One Google search executive described the problem of defining misinformation as incredibly hard, and said the company didn’t want to go down the path of figuring it out.

Around the time Google started addressing issues such as misinformation, it started fielding even more complaints, to the point where human interference became more routine, according to people familiar with the matter, putting it in the position of arbitrating some of society’s most complicated issues. Some changes to search results might be considered reasonable—boosting trusted websites like the National Suicide Prevention Lifeline, for example—but Google has made little disclosure about when changes are made, or why.

Businesses, lawmakers and advertisers are worried about fairness and competition within the markets where Google is a leading player, and as a result its operations are coming under heavy scrutiny.

The U.S. Justice Department earlier this year opened an antitrust probe, in which Google’s search policies and practices are expected to be areas of focus. Google executives have twice been called to testify before Congress in the past year over concerns about political bias. In the European Union, Google has been fined more than $9 billion in the past three years for anticompetitive practices, including allegedly using its search engine to favor its own products.

In response, Google has said it faces tough competition in a dynamic tech sector, and that its behavior is aimed at helping create choice for consumers, not hurting rivals. The company is currently appealing the decisions against it in the EU, and it has denied claims of political bias.

GOOGLE RARELY RELEASES detailed information on algorithm changes, and its moves have bedeviled companies and interest groups, who feel they are operating at the tech giant’s whim.

In one change hotly contested within Google, engineers opted to tilt results to favor prominent businesses over smaller ones, based on the argument that customers were more likely to get what they wanted at larger outlets. One effect of the change was a boost to Amazon’s products, even if the items had been discontinued, according to people familiar with the matter.

The issue came up repeatedly over the years at meetings in which Google search executives discuss algorithm changes. Each time, they chose not to reverse the change, according to a person familiar with the matter.

Google engineers said it is widely acknowledged within the company that search is a zero-sum game: A change that helps lift one result inevitably pushes down another, often with considerable impact on the businesses involved.

Ms. Levin said there is no guidance in Google’s rater guidelines that suggests big sites are inherently more authoritative than small sites. “It’s inaccurate to suggest we did not address issues like discontinued products appearing high up in results,” she added.

Many of the changes within Google have coincided with its gradual evolution from a company with an engineering-focused, almost academic culture, into an advertising behemoth and one of the most profitable companies in the world. Advertising revenue—which includes ads on search as well as on other products such as maps and YouTube—was $116.3 billion last year.

Some very big advertisers received direct advice on how to improve their organic search results, a perk not available to businesses with no contacts at Google, according to people familiar with the matter. In some cases, that help included sending in search engineers to explain a problem, they said.

“If they have an [algorithm] update, our teams may get on the phone with them and they will go through it,” said Jeremy Cornfeldt, chief executive for the Americas at Dentsu Inc.’s iProspect, which Mr. Cornfeldt said is one of Google’s largest advertising agency clients. He said the agency doesn’t get information Google wouldn’t share publicly. Among the clients it can disclose, iProspect represents Levi Strauss & Co., Alcon Inc. and Wolverine World Wide Inc.

One former executive at a Fortune 500 company that received such advice said Google frequently adjusts how it crawls the web and ranks pages to deal with specific big websites.

Google updates its index of some sites such as Facebook and Amazon more frequently, a move that helps them appear more often in search results, according to a person familiar with the matter.

“There’s this idea that the search algorithm is all neutral and goes out and combs the web and comes back and shows what it found, and that’s total BS,” the former executive said. “Google deals with special cases all the time.”

Ms. Levin, the Google spokeswoman, said the search team’s practice is to not provide specialized guidance to website owners. She also said that faster indexing of a site isn’t a guarantee that it will rank higher. “We prioritize issues based on impact, not any commercial relationships,” she said.

Alphabet’s net income, in billions of dollars (chart: vertical axis from $0 to $30 billion; horizontal axis by year, starting in 2005)

Note: 2017 figure reflects a one-time charge of $9.9 billion related to new U.S. tax law. Alphabet was created through a corporate restructuring of Google in 2015. Figures for prior years are for Google Inc.

Source: FactSet

Online marketplace eBay had long relied on Google for as much as a third of its internet traffic. In 2014, traffic suddenly plummeted—contributing to a $200 million hit in its revenue guidance for that year.

Google told the company it had made a decision to lower the ranking of a large number of eBay pages that were a big source of traffic.

EBay executives debated pulling their quarterly advertising spending of around $30 million from Google to protest, but ultimately decided to step up lobbying pressure on Google, with employees and executives calling and meeting with search engineers, according to people familiar with the matter. A similar episode had hit traffic several years earlier, and eBay had marshaled its lobbying might to persuade Google to give it advice about how to fix the problem, even relying on a former Google staffer who was then employed at eBay to work his contacts, according to one of those people.

This time, Google ultimately agreed to improve the ranking of a number of pages it had demoted while eBay completed a broader revision of its website to make the pages more “useful and relevant,” the people said. The revision was arduous and costly to complete, one of the people said, adding that eBay was later hit by other downrankings that Google didn’t help with.

“We’ve experienced significant and consistent drops in Google SEO for many years, which has been disproportionally detrimental to those small businesses that we support,” an eBay spokesman said. SEO, or search-engine optimization, is the practice of trying to generate more search-engine traffic for a website.

Google’s Ms. Levin declined to comment on eBay.

Companies without eBay’s clout had different experiences.

Dan Baxter can remember the exact moment his website, DealCatcher, was caught in a Google algorithm change. It was 6 p.m. on Sunday, Feb. 18. Mr. Baxter, who founded the Wilmington, Del., coupon website 20 years ago, got a call from one of his 12 employees the next morning.

“Have you looked at our traffic?” the worker asked, frantically, Mr. Baxter recalled. It was suddenly down 93% for no apparent reason. That Saturday, DealCatcher saw about 31,000 visitors from Google. Now it was posting about 2,400. It had disappeared almost entirely on Google search.

Mr. Baxter said he didn’t know whom to contact at Google, so he hired a consultant to help him identify what might have happened. The expert reached out directly to a contact at Google but never heard back. Mr. Baxter tried posting to a YouTube forum hosted by a Google “webmaster” to ask if it might have been a technical problem, but the webmaster seemed to shoot down that idea.

One month to the day after his traffic disappeared, it inexplicably came back, and he still doesn’t know why.

“You’re kind of just left in the dark, and that’s the scary part of the whole thing,” said Mr. Baxter.

Google’s Ms. Levin declined to comment on DealCatcher.

(The Wall Street Journal is owned by News Corp, which has complained publicly about Google’s moves to play down news sites that charge for subscriptions. Google ended the policy that year after intensive lobbying by News Corp and other paywalled publishers. More recently, News Corp has called for an “algorithm review board” to oversee Google, Facebook and other tech giants. News Corp has a commercial agreement to supply news through Facebook, and Dow Jones & Co., publisher of The Wall Street Journal, has a commercial agreement to supply news through Apple services. Google’s Ms. Levin and News Corp declined to comment.)

GOOGLE IN RECENT months has made additional efforts to clarify how its services operate by updating general information on its site. At the end of October it posted a new video titled “How Google Search Works.”

Jonathan Zittrain, a Harvard Law School professor and faculty director of the Berkman Klein Center for Internet & Society, said Google has poorly defined how often or when it intervenes on search results. The company’s argument that it can’t reveal those details because it is fighting spam “seems nuts,” said Mr. Zittrain.

“That argument may have made sense 10 or 15 years ago but not anymore,” he said. “That’s called ‘security through obscurity,’ ” a reference to the now-unfashionable engineering idea that systems can be made more secure by restricting information about how they operate.

Google’s Ms. Levin said “extreme transparency has historically proven to empower bad actors in a way that hurts our users and website owners who play by the rules.”

“Building a service like this means making tens of thousands of really, really complicated human decisions, and that’s not what people think,” said John Bowers, a research associate at the Berkman Klein Center.

On one extreme, those decisions at Google are made by the world’s most accomplished and highest-paid engineers, whose job is to turn the dials within millions of lines of complex code. On the other is an army of more than 10,000 contract workers, who work from home and get paid by the hour to evaluate search results.

The rankings supplied by the contractors, who work from a Google manual that runs to hundreds of pages, can indirectly move a site higher or lower in results, according to people familiar with the matter. And their collective responses are measured by Google executives and used to affect the search algorithms.

Google’s results page has become a complex mix of search results, advertisements and featured content, not always distinguishable by the user. While these features are all driven by algorithms, Google has different policies and attitudes toward changing the results shown in each of the additional features. Featured snippets and knowledge panels are two common features.

Featured snippet: Highlights web pages that Google thinks will contain content a user is looking for. Google says it will remove content from the feature if it violates policies around harmful and hateful content.

Knowledge panel: Information Google has compiled from various sources on the web, such as Wikipedia, that provides basic facts about the subject of your query. Google is willing to adjust this material.

Organic search results: Links to results that Google’s algorithms have determined are relevant to your query. Google says it doesn’t curate these results.

One of those evaluators was Zack Langley, now a 27-year-old logistics manager at a tour company in New Orleans. Mr. Langley got a one-year contract in the spring of 2016 evaluating Google’s search results through Lionbridge Technologies Inc., one of several companies Google and other tech platforms use for contract work.

During his time as a contractor, Mr. Langley said he never had any contact with anyone at Google, nor was he told what his results would be used for. Like all of Google’s evaluators, he signed a nondisclosure agreement. He made $13.50 an hour and worked up to 20 hours a week from home.

Sometimes working in his pajamas, Mr. Langley was given hundreds of real search results and told to use his judgment to rate them according to quality, reputation and usefulness, among other factors.

At one point, Mr. Langley said he was unhappy with the search results for “best way to kill myself,” which were turning up links that were like “how-to” manuals. He said he down-ranked all the other results for suicide until the National Suicide Prevention Lifeline was the No. 1 result.

Soon after, Mr. Langley said, Google sent a note through Lionbridge saying the hotline should be ranked as the top result across all searches related to suicide, so that the collective rankings of the evaluators would adjust the algorithms to deliver that result. He said he never learned if his actions had anything to do with the change.

Mr. Langley said it seemed like Google wanted him to change content on search so Google would have what he called plausible deniability about making those decisions. He said contractors would get notes from Lionbridge that he believed came from Google telling them the “correct” results on other searches.

He said that in late 2016, as the election approached, Google officials got more involved in dictating the best results, although not necessarily on issues related to the campaign. “They used to have a hands-off approach, and then it seemed to change,” he said.

Ms. Levin, the Google spokeswoman, said the company “long ago evolved our approach to collecting feedback on these types of queries, which help us develop algorithmic solutions and features in this area.” She added that, “we provide updates to our rater guidelines to ensure all raters are following the same general framework.”

Lionbridge didn’t reply to requests for comment.

AT GOOGLE, EMPLOYEES routinely use the company’s internal message boards as well as a form called “go/bad” to push for changes in specific search results. (Go/bad is a reporting system meant to allow Google staff to point out problematic search results.)

One of the first hot-button issues surfaced in 2015, according to people familiar with the matter, when some employees complained that a search for “how do vaccines cause autism” delivered misinformation through sites that oppose vaccinations.

At least one employee defended the result, writing that Google should “let the algorithms decide” what shows up, according to one person familiar with the matter. Instead, the people said, Google made a change so that the first result is a site called howdovaccinescauseautism.com—which states on its home page in large black letters, “They f—ing don’t.” (The phrase has become a meme within Google.)

Google’s Ms. Levin declined to comment.

In the fall of 2018, the conservative news site Breitbart News Network posted a leaked video of Google executives, including Mr. Brin and Google CEO Sundar Pichai, upset and addressing staffers following President Trump’s election two years earlier. A group of Google employees noticed the video was appearing on the 12th page of search results when Googling “leaked Google video Trump,” which made it seem like Google was burying it. They complained on one of the company’s internal message boards, according to people familiar with the matter. Shortly after, the leaked video began appearing higher in search results.

“When we receive reports of our product not behaving as people might expect, we investigate to see if there’s any useful insight to inform future improvements,” said Ms. Levin.

FROM GOOGLE’S FOUNDING, Messrs. Page and Brin knew that ranking webpages was a matter of opinion. “The importance of a Web page is an inherently subjective matter, which depends on the [readers’] interests, knowledge and attitudes,” they wrote in their 1998 paper introducing the PageRank algorithm, the founding system that launched the search engine.

PageRank, they wrote, would measure the level of human interest and attention, but it would do so “objectively and mechanically.” They contended that the system would mathematically measure the relevance of a site by the number of times other relevant sites linked to it on the web.
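The link-counting idea described above can be illustrated with a minimal sketch of a simplified PageRank calculation. This is a textbook-style approximation, not Google’s production system: the toy web of four pages is invented for illustration, and the damping factor of 0.85 and the fixed 50 iterations are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score across all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to c.example, which links to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy example the page that attracts the most links ends up with the highest score, and the page it links to comes second, because a link from a highly ranked page counts for more than a link from an obscure one.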

Today, PageRank has been updated and subsumed into more than 200 different algorithms, attuned to hundreds of signals, now used by Google. (The company replaced PageRank in 2005 with a newer version that could better keep up with the vast traffic that the site was attracting. Internally, it was called “PageRankNG,” ostensibly named for “next generation,” according to people familiar with the matter. In public, the company still points to PageRank—and on its website links to the original algorithm published by Messrs. Page and Brin—in explaining how search works. “The original insight and notion of using link patterns is something that we still use in our systems,” said Ms. Levin.)

By the early 2000s, spammers were overwhelming Google’s algorithms with tactics that made their sites appear more popular than they were, skewing search results. Messrs. Page and Brin disagreed over how to tackle the problem.

Mr. Brin argued against human intervention, contending that Google should deliver the most accurate results as delivered by the algorithms, and that the algorithms should only be tweaked in the most extreme cases. Mr. Page countered that the user experience was getting damaged when users encountered spam rather than useful results, according to people familiar with the matter.

Google already had been taking what the company calls “manual actions” against specific websites that were abusing the algorithm. In that process, Google engineers demote a website’s ranking by changing its specific “weighting.” For example, if a website is artificially boosted by paying other websites to link to it, a behavior that Google frowns upon, Google engineers could turn down the dial on that specific weighting. The company could also blacklist a website, or remove it altogether.

Mr. Brin still opposed making large-scale efforts to fight spam, because it involved more human intervention. Mr. Brin, whose parents were Jewish émigrés from the former Soviet Union, even personally decided to allow anti-Semitic sites that were in the results for the query “Jew,” according to people familiar with the decision. Google posted a disclaimer with results for that query saying, “Our search results are generated completely objectively and are independent of the beliefs and preferences of those who work at Google.”

Finally, in 2004, in the bathroom one day at Google’s headquarters in Mountain View, Calif., Mr. Page approached Ben Gomes, one of Google’s early search executives, to express support for his efforts fighting spam. “Just do what you need to do,” said Mr. Page, according to a person familiar with the conversation. “Sergey is going to ruin this f—ing company.”

Ms. Levin, the Google spokeswoman, said Messrs. Page, Brin and Gomes declined to comment.

After that, the company revised its algorithms to fight spam and loosened rules for manual interventions, according to people familiar with the matter.

Google has guidelines for changing its ranking algorithms, a grueling process called the “launch committee.” Google executives have pointed to this process in a general way in congressional testimony when asked about algorithm changes.

The process is like defending a thesis, and the meetings can be contentious, according to people familiar with them.

In part because the process is laborious, some engineers aim to avoid it if they can, one of these people said, and small changes can sometimes get pushed through without the committee’s approval. Mr. Gomes is on the committee that decides whether to approve the changes, and other senior officials sometimes attend as well.

Google’s Ms. Levin said not every algorithm change is discussed in a meeting, but “there are other processes for reviewing more straightforward launches at different levels of the organization,” such as an email review. Those reviews still involve members of the launch committee, she said.

Today, Google discloses only a few of the factors being measured by its algorithms. Known ones include “freshness,” which gives preference to recently created content for searches relating to things such as breaking news or a sports event. Another is where a user is located—if a user searches for “zoo,” Google engineers want the algorithms to provide the best zoo in the user’s area. Language signals—how meanings change when words are used together, such as April and fools—are among the most important, as they help determine what a user is actually asking for.

Other important signals have included the length of time users would stay on pages they clicked on before clicking back to Google, according to a former Google employee. Long stays would boost a page’s ranking. Quick bounce backs, indicating a site wasn’t relevant, would severely hurt a ranking, the former employee said.

Over the years, Google’s database recording this user activity has become a competitive advantage, helping cement its position in the search market. Other search engines don’t have the vast quantity of data that is available to Google, the search market leader.

That makes the impact of its operating decisions immense. When Pinterest Inc. filed to go public earlier this year, it said that “search engines, such as Google, may modify their algorithms and policies or enforce those policies in ways that are detrimental to us.” It added: “Our ability to appeal these actions is limited.” A spokeswoman for Pinterest declined to comment.

Search-engine optimization consultants have proliferated to try to decipher Google’s signals on behalf of large and small businesses. But even those experts said the algorithms remain borderline indecipherable. “It’s black magic,” said Glenn Gabe, an SEO expert who has spent years analyzing Google’s algorithms and tried to help DealCatcher find a solution to its drop in traffic earlier this year.

ALONG WITH ADVERTISEMENTS, Google’s own features now take up large amounts of space on the first page of results—with few obvious distinctions for users. These include news headlines and videos across the top, information panels along the side and “People also ask” boxes highlighting related questions.

Google engineers view the features as separate products from Google search, and there is less resistance to manually changing their content in response to outside requests, according to people familiar with the matter.

These features have become more prominent as Google attempts to keep users on its results page, where ads are placed, instead of losing the users as they click through to other sites. In September, about 55% of Google searches on mobile were “no-click” searches, according to research firm Jumpshot, meaning users never left the results page.

Two typical features on the results page—knowledge panels, which are collections of relevant information about people, events or other things; and featured snippets, which are highlighted results that Google thinks will contain content a user is looking for—are areas where Google engineers make changes to fix results, the Journal found.

Google has looser policies about making adjustments to these features than to organic search results. The features include Google News and People also ask.

Top stories: News articles surfaced as being particularly relevant. Google blocks some sites that don’t meet its policies.

People also ask: A predictive feature that suggests related questions, providing short answers with links. Google says it weeds out and blocks some phrases in this feature as it does in its auto-complete feature.

In April, the conservative Heritage Foundation called Google to complain that a coming movie called “Unplanned” had been labeled in a knowledge panel as “propaganda,” according to a person familiar with the matter. The film is about a former Planned Parenthood director who had a change of heart and became pro-life.

After the Heritage Foundation complained to a contact at Google, the company apologized and removed “propaganda” from the description, that person said.

Google’s Ms. Levin said the change “was not the result of pressure from an outside group, it was a violation of the feature’s policy.”

On the auto-complete feature, Google reached a confidential settlement in France in 2012 with several outside groups that had complained it was anti-Semitic that Google was suggesting the French word for “Jew” when searchers typed in the names of several prominent politicians. Google agreed to “algorithmically mitigate” such suggestions as part of a pact that barred the parties from disclosing its terms, according to people familiar with the matter.

In recent years, Google changed its auto-complete algorithms to remove “sensitive and disparaging remarks.” The policy, now detailed on its website, says that Google doesn’t allow predictions that may be related to “harassment, bullying, threats, inappropriate sexualization, or predictions that expose private or sensitive information.”

GOOGLE HAS BECOME more open about its moderation of auto-complete but still doesn’t disclose its use of blacklists. Kevin Gibbs, who created auto-complete in 2004 when he was a Google engineer, originally developed the list of terms that wouldn’t be suggested, even if they were the most popular queries that independent algorithms would normally supply.

For example, if a user searched “Britney Spears”—a popular search on Google at the time—Mr. Gibbs didn’t want a piece of human anatomy or the description of a sex act to appear when someone started typing the singer’s name. The unfiltered results were “kind of horrible,” Mr. Gibbs said in an interview.
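The general mechanism, a list of terms that suppresses suggestions even when they are the most popular, can be sketched in a few lines. This is a hypothetical illustration, not Google’s implementation: the function name, the query counts and the placeholder blocked term are all invented.

```python
def suggest(prefix, query_counts, blocked_terms, limit=3):
    """Return the most popular completions of `prefix`, skipping any that contain a blocked term."""
    prefix = prefix.lower()
    candidates = [
        (count, query)
        for query, count in query_counts.items()
        if query.startswith(prefix)
        and not any(term in query for term in blocked_terms)
    ]
    # Most popular first; ties broken alphabetically so the output is stable.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [query for _, query in candidates[:limit]]


# Invented popularity counts; "blockedword" stands in for a term on the list.
query_counts = {
    "britney spears songs": 900,
    "britney spears blockedword": 1500,
    "britney spears tour": 700,
    "britney spears age": 400,
}
blocked = {"blockedword"}

print(suggest("britney s", query_counts, blocked))
```

Without the list, the most popular query would be the first suggestion; with it, that query never appears, regardless of how often it is searched.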

He said deciding what should and shouldn’t be on the list was challenging. “It was uncomfortable, and I felt a lot of pressure,” said Mr. Gibbs, who worked on auto-complete for about a year, and left the company in 2012. “I wanted to make sure it represented the world fairly and didn’t leave out any groups.”

Google still maintains lists of phrases and terms that are manually blacklisted from auto-complete, according to people familiar with the matter.

The company internally has a “clearly articulated set of policies” about what terms or phrases might be blacklisted in auto-complete, and it follows those rules, according to a person familiar with the matter.

Blacklists also affect the results in organic search and Google News, as well as other search products, such as Web answers and knowledge panels, according to people familiar with the matter.

Google has said in congressional testimony it doesn’t use blacklists. Asked in a 2018 hearing whether Google had ever blacklisted a “company, group, individual or outlet…for political reasons,” Karan Bhatia, Google’s vice president of public policy, responded: “No, ma’am, we don’t use blacklists/whitelists to influence our search results,” according to the transcript.

Ms. Levin said those statements were related to blacklists targeting political groups, which she said the company doesn’t keep.

Google’s first blacklists date to the early 2000s, when the company made a list of spam sites that it removed from its index, one of those people said. This means the sites wouldn’t appear in search results.

Engineers known as “maintainers” are authorized to make and approve changes to blacklists. It takes at least two people to do this; one person makes the change, while a second approves it, according to the person familiar with the matter.

The Journal reviewed a draft policy document from August 2018 that outlines how Google employees should implement an anti-misinformation blacklist aimed at blocking certain publishers from appearing in Google News and other search products. The document says engineers should focus on “a publisher misrepresenting their ownership or web properties” and having “deceptive content”—that is, sites that actively aim to mislead—as opposed to those that have inaccurate content.

“The purpose of the blacklist will be to bar the sites from surfacing in any Search feature or news product sites,” the document states.

Ms. Levin said Google does “not manually determine the order of any search result.” She said sites that don’t adhere to Google News “inclusion policies” are “not eligible to appear on news surfaces or in information boxes in Search.”

SOME INDIVIDUALS and companies said changes made by the company seem ad hoc, or inconsistent. People familiar with the matter said Google increasingly will make manual or algorithmic changes that aren’t acknowledged publicly in order to maintain that it isn’t affected by outside pressure.

“It’s very convenient for us to say that the algorithms make all the decisions,” said one former Google executive.

In March 2017, Google updated the guidelines it gives contractors who evaluate search results, instructing them for the first time to give low-quality ratings to sites “created with the sole purpose of promoting hate or violence against a group of people”—something that would help adjust Google algorithms to lower those sites in search.

The next year, the company broadened the guidance to any pages that promote such hate or violence, even if it isn’t the page’s sole purpose and even if it is “expressed in polite or even academic-sounding language.”

Google has resisted entirely removing some content that outsiders complained should be blocked. In May 2018, Ignacio Wenley Palacios, a Spain-based lawyer working for the Lawfare Project, a nonprofit that funds litigation to protect Jewish people, asked Google to remove an anti-Semitic article lauding a German Holocaust denier posted on a Spanish-language neo-Nazi blog.

The company declined. In an email to Mr. Wenley Palacios, lawyers for Google contended that “while such content is detestable” it isn’t “manifestly illegal” in Spain.

Mr. Wenley Palacios then filed a lawsuit, but in the spring of this year, before the suit could be heard, he said, Google lawyers told him the company was changing its policy on such removals in Spain.

According to Mr. Wenley Palacios, the lawyers said the firm would now remove from searches conducted in Spain any links to Holocaust denial and other content that could hurt vulnerable minorities, once they are pointed out to the company. The results would still be accessible outside of Spain. He said both sides agreed to dismiss the case.

Google’s Ms. Levin described the action as a “legal removal” in accordance with local law. Holocaust denial isn’t illegal in Spain, but if it is coupled with an intent to spread hate, it can fall under Spanish criminal law banning certain forms of hate speech.

“Google used to say, ‘We don’t approve of the content, but that’s what it is,’ ” Mr. Wenley Palacios said. “That has changed dramatically.”

[Graphic: Google’s search results page has changed over the years, becoming much more ad-heavy. Ads in recent years claim more space at the top of the results page. Vertical search results are various features that present specialized results for specific topics, like hotels or places, often with photos or maps; the results in some of these features are paid advertisements. As Google has placed more ads and verticals at the top of the page, organic search results have shrunk.]

Health policy consultant Greg Williams said he helped lead a campaign to push Google to make changes that would stifle misleading results for queries such as “rehab.”

At the time, in 2017, addiction centers with spotty records were constantly showing up in search results, typically the first place family members and addicts go in search of help.

Google routed Diane Hentges several times over the last year to call centers as she desperately researched drug addiction treatment centers for her 22-year-old son, she said.

Each time she called one of the facilities listed on Google, a customer-service representative would ask for her financial information, but the representatives didn’t seem to be attached to any legitimate company.

“If you look at a place on Google, it sends you straight to a call center,” Ms. Hentges said, adding that parents who are struggling with a child with addiction “will do anything to get our child healthy. We’ll believe anything.”

After intense lobbying by Mr. Williams and others, Google changed its ad policy around such queries. But addiction industry officials also noticed a significant change to Google search results. Many searches for “rehab” or related terms began returning the website for the Substance Abuse and Mental Health Services Administration, the national help hotline run by the U.S. Department of Health and Human Services, as the top result.

Google never acknowledged the change. Ms. Levin said that “resources are not listed because of any type of partnership” and that “we have algorithmic solutions designed to prioritize authoritative resources (including official hotlines) in our results for queries like these as well as for suicide and self-harm queries.”

A spokesman for SAMHSA said the agency had a partnership with Google.

Google’s search algorithms have been a major focus of Hollywood in its effort to fight pirated TV shows and movies.

[Chart: Alphabet’s revenue, by type (advertising and other), in billions of dollars, by year from 2005. Note: Alphabet was created through a corporate restructuring of Google in 2015. Figures for prior years are for Google Inc. Source: the company]

Studios “saw this as the potential death knell of their business,” said Dan Glickman, chairman and chief executive of the Motion Picture Association of America from 2004 to 2010. The association has been a public critic of Google. “A hundred million dollars to market a major movie could be thrown away if someone could stream it illegally online.”

Google received a record 1.6 million requests to remove web pages for copyright issues last year, according to the company’s published Transparency Report and a Journal analysis. Those requests pertained to more than 740 million pages, about 12 times the number of web pages it was asked to take down in 2012.

A decade ago, in concession to the industry, Google removed “download” from its auto-complete suggestions after the name of a movie or TV show, so that at least it wouldn’t be encouraging searches for pirated content.

In 2012, it applied a filter to search results that would lower the ranking of sites that received a large number of piracy complaints under U.S. copyright law. That effectively pushed many pirate sites off the front page of results for general searches for movies or music, although it still showed them when a user specifically typed in the pirate site names.

In recent months the industry has gotten more cooperation from Google on piracy in search results than at any point in the organization’s history, according to people familiar with the matter.

“Google is under great cosmic pressure, as is Facebook,” Mr. Glickman said. “These are companies that are in danger of being federally regulated to an extent that they never anticipated.”

Mr. Pichai, who became CEO of Google in 2015, is more willing to entertain complaints about the search results from outside parties than Messrs. Page and Brin, the co-founders, according to people familiar with his leadership.

Google’s Ms. Levin said Mr. Pichai’s “style of engaging and listening to feedback has not shifted. He has always been very open to feedback.”

CRITICISM ALLEGING political bias in Google’s search results has sharpened since the 2016 election.

Interest groups from the right and left have besieged Google with questions about content displayed in search results and about why the company’s algorithms returned certain information over others.

Google appointed an executive in Washington, Max Pappas, to handle complaints from conservative groups, according to people familiar with the matter. Mr. Pappas works with Google engineers on changes to search when conservative viewpoints aren’t being represented fairly, according to interest groups interviewed by the Journal, although that is just one part of his job.

“Conservatives need people they can go to at these companies,” said Dan Gainor, an executive at the conservative Media Research Center, which has complained about various issues to Google.

Google also appointed at least one other executive in Washington, Chanelle Hardy, to work with outside liberal groups, according to people familiar with the matter.

Ms. Levin said both positions have existed for many years. She said in general Google believes it’s “the responsible thing to do” to understand feedback from the groups and said Google’s algorithms and policies don’t attempt to make any judgment based on the political leanings of a website.

Mr. Pappas declined to comment, and Ms. Hardy didn’t reply to a request for comment.

Over the past year, abortion-rights groups have complained about search results that turned up the websites of what are known as “crisis pregnancy centers,” organizations that counsel women against having abortions, according to people familiar with the matter.

One of the complaining organizations was Naral Pro-Choice America, which tracks the activities of anti-abortion groups through its opposition research department, said spokeswoman Kristin Ford.

Naral complained to Google and other tech platforms that some of the ads, posts and search results from crisis pregnancy centers are misleading and deceptive, she said. Some of the organizations claimed to offer abortions and then counseled women against it. “They do not disclose what their agenda is,” Ms. Ford said.

In June, Google updated its advertising policies related to abortion, saying that advertisers must state whether they provide abortions or not, according to its website. Ms. Ford said Naral wasn’t told in advance of the policy change.

Ms. Levin said Google didn’t implement any changes with regard to how crisis pregnancy centers rank for abortion queries.

The Journal tested the term “abortion” in organic search results over 17 days in July and August. Thirty-nine percent of all results on the first page had the hostname www.plannedparenthood.org, the site of Planned Parenthood Federation of America, the nonprofit, abortion-rights organization.

By comparison, 14% of Bing’s first page of search results and 16% of DuckDuckGo’s first page of results were from Planned Parenthood.
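
A hostname share like the ones above can be computed with a few lines of code. The following is only a minimal sketch, not the Journal’s actual method: the function name `hostname_shares` and the four-URL sample are hypothetical, chosen to show the arithmetic of counting results by hostname and dividing by the total.

```python
from collections import Counter
from urllib.parse import urlsplit


def hostname_shares(urls):
    """Return each hostname's share of the given result URLs, as a fraction."""
    hosts = [urlsplit(u).hostname for u in urls]
    counts = Counter(hosts)
    total = len(hosts)
    return {host: n / total for host, n in counts.items()}


# Hypothetical sample of first-page result URLs (illustrative only,
# not the data set the Journal collected).
sample_urls = [
    "https://en.wikipedia.org/wiki/Abortion",
    "https://www.plannedparenthood.org/learn/abortion",
    "https://www.guttmacher.org/state-policy/explore/overview-abortion-laws",
    "https://www.plannedparenthood.org/learn/abortion/in-clinic-abortion-procedures",
]

shares = hostname_shares(sample_urls)
print(f"{shares['www.plannedparenthood.org']:.0%}")
```

Grouping by the full hostname, rather than by page, is what lets several different pages from the same site count toward one figure.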

Ms. Levin said Google doesn’t have any particular ranking implementations aimed at promoting Planned Parenthood.

See the results of the Journal’s search tests
Percentages indicate how many times each web page appeared during the WSJ’s testing.
GOOGLE
  • Abortion – Wikipedia (https://en.wikipedia.org/wiki/Abortion): 100%
  • Abortion Information | Information About Your Options (https://www.plannedparenthood.org/learn/abortion): 100%
  • An Overview of Abortion Laws | Guttmacher Institute (https://www.guttmacher.org/state-policy/explore/overview-abortion-laws): 100%
  • What facts about abortion do I need to know? – Planned Parenthood (https://www.plannedparenthood.org/learn/abortion/considering-abortion/what-facts-about-abortion-do-i-need-know): 67%
  • In-Clinic Abortion Procedure | Abortion Methods – Planned Parenthood (https://www.plannedparenthood.org/learn/abortion/in-clinic-abortion-procedures): 52%
  • National Abortion Federation: Home (https://prochoice.org/): 44%
  • Abortion | Center for Reproductive Rights (https://reproductiverights.org/our-issues/abortion): 38%
  • What Happens During an In-Clinic Abortion? – Planned Parenthood (https://www.plannedparenthood.org/learn/abortion/in-clinic-abortion-procedures/what-happens-during-an-in-clinic-abortion): 38%
DUCKDUCKGO
  • Abortion – Pros & Cons – ProCon.org (https://abortion.procon.org/): 100%
  • Abortion – The New York Times (https://www.nytimes.com/topic/subject/abortion): 100%
  • Abortion Information | Information About Your Options (https://www.plannedparenthood.org/learn/abortion): 100%
  • Abortion: Get Facts About the Procedure and Statistics (https://www.emedicinehealth.com/abortion/article_em.htm): 100%
  • AbortionFacts.com – Information on Abortion You Can Use (https://www.abortionfacts.com/): 100%
  • Abortion – Wikipedia (https://en.wikipedia.org/wiki/Abortion): 99%
  • Abortion | Medical Abortion | MedlinePlus (https://medlineplus.gov/abortion.html): 98%
  • Abortion Procedures During First, Second and Third Trimester (https://americanpregnancy.org/unplanned-pregnancy/abortion-procedures/): 76%

The practice of creating blacklists for certain types of sites or searches has fueled cries of political bias from some Google engineers and right-wing publications that said they have viewed portions of the blacklists. Some of the websites Google appears to have targeted in Google News were conservative sites and blogs, according to documents reviewed by the Journal. In one partial blacklist reviewed by the Journal, some conservative and right-wing websites, including The Gateway Pundit and The United West, were included on a list of hundreds of websites that wouldn’t appear in news or featured products, although they could appear in organic search results.

Google has said repeatedly it doesn’t make decisions based on politics, and current and former employees told the Journal they haven’t seen evidence of political bias. And yet, they said, Google’s shifting policies on interference—and its lack of transparency about them—inevitably force employees to become arbiters of what is acceptable, a dilemma that opens the door to charges of bias or favoritism.

Google’s Ms. Levin declined to comment.

DEMANDS FROM GOVERNMENTS for changes have grown rapidly since 2016.

From 2010 to 2018, Google fielded such requests from countries including the U.S. to remove 685,000 links from what Google calls web search. The requests came from courts or other authorities that said the links broke local laws or should be removed for other reasons.

Nearly 78% of those removal requests have been since the beginning of 2016, according to reports that Google publishes on its website. Google’s ultimate actions on those requests weren’t disclosed.

Russia has been by far the most prolific, demanding the removal of about 255,000 links from search last year, three-quarters of all government requests for removal from Google search in that period, the data show. Nearly all of the country’s requests came under an information-security law Russia put into effect in late 2017, according to a Journal examination of disclosures in a database run by the Berkman Klein Center.

Google said the Russian law doesn’t allow it to disclose which URLs were requested to be removed. A person familiar with the matter said the removal demands are for content ruled illegal in Russia for a variety of reasons, such as for promoting drug use or encouraging suicide.

Requests can include demands to remove links to information the government defines as extremist, which can be used to target political opposition, the person said.

Google, whose staff reviews the requests, at times declines those that appear focused on political opposition, the person said, adding that in those cases, it tries not to draw attention to its decisions to avoid provoking Russian regulators.

The approach has led to stiff internal debate. On one side, some Google employees say that the company shouldn’t cooperate at all with takedown requests from countries such as Russia or Turkey. Others say it is important to follow the laws of countries where they are based.

“There is a real question internally about whether a private company should be making these calls,” the person said.

Google’s Ms. Levin said, “Maximizing access to information has always been a core principle of Search, and that hasn’t changed.”

Google’s culture of publicly resisting demands to change results has diminished, current and former employees said. A few years ago, the company dismantled a global team focused on free-speech issues that, among other things, publicized the company’s legal battles to fight changes to search results, in part because Google had lost several of those battles in court, according to a person familiar with the change.

“Free expression was no longer a winner,” the person said.

How Google Edged Out Rivals and Built the World’s Dominant Ad Machine: A Visual Guide

The U.S. is investigating whether the tech giant has abused its power, including as the biggest broker of digital ad sales across the web

Nexstar Media Group Inc., the largest local news company in the U.S., recently tested what would happen if it stopped using Google’s technology to place ads on its websites.

Over several days, the company’s video ad sales plummeted. “That’s a huge revenue hit,” said Tony Katsur, senior vice president at Nexstar. After its brief test, Nexstar switched back to Google.

Alphabet Inc.’s Google is under fire for its dominance in digital advertising, in part because of issues like this. The U.S. Justice Department and state attorneys general are investigating whether Google is abusing its power, including as the dominant broker of digital ad sales across the web. Most of the nearly 130 questions the states asked in a September subpoena were about the inner workings of Google’s ad products and how they interact.

We dug into Google’s vast, opaque ad machine, and in a series of graphics below, show you how it all works—and why publishers and rivals have had so many complaints about it.

Much of Google’s power as an ad broker stems from acquisitions of ad-technology companies, especially its 2008 purchase of DoubleClick. Regulators who approved that $3.1 billion deal warned they would step in if the company tied together its offerings in anticompetitive ways.

In interviews, dozens of publishing and advertising executives said Google is doing just that with an array of interwoven products. Google operates the leading selling and buying tools, and the biggest marketplace where online ad deals happen.

When Nexstar didn’t use Google’s selling tool, it missed out on a huge amount of demand that comes through its buying tools, Mr. Katsur said: “They want you locked in.”

What Ever Happened to Google Books?

It was the most ambitious library project of our time—a plan to scan all of the world’s books and make them available to the public online. “We think that we can do it all inside of ten years,” Marissa Mayer, who was then a vice-president at Google, said to this magazine in 2007, when Google Books was in its beta stage. “It’s mind-boggling to me, how close it is.”

Today, the project sits in a kind of limbo. On one hand, Google has scanned an impressive thirty million volumes, putting it in a league with the world’s larger libraries (the Library of Congress has around thirty-seven million books). That is a serious accomplishment. But while the corpus is impressive, most of it remains inaccessible. Searches of out-of-print books often yield mere snippets of the text—there is no way to gain access to the whole book. The thrilling thing about Google Books, it seemed to me, was not just the opportunity to read a line here or there; it was the possibility of exploring the full text of millions of out-of-print books and periodicals that had no real commercial value but nonetheless represented a treasure trove for the public. In other words, it would be the world’s first online library worthy of that name. And yet the attainment of that goal has been stymied, despite Google having at its disposal an unusual combination of technological means, the agreement of many authors and publishers, and enough money to compensate just about everyone who needs it.

The problems began with a classic culture clash when, in 2002, Google began just scanning books, either hoping that the idealism of the project would win everyone over or following the mantra that it is always easier to get forgiveness than permission. That approach didn’t go over well with authors and publishers, who sued for copyright infringement. Two years of insults, ill will, and litigation ensued. Nonetheless, by 2008, representatives of authors, publishers, and Google did manage to reach a settlement to make the full library available to the public, for pay, and to institutions. In the settlement agreement, they also agreed to put terminals in libraries, but never got around to doing that. But that agreement then came under further attacks from a whole new set of critics, including the author Ursula Le Guin, who called it a “deal with the devil.” Others argued that the settlement could create a monopoly in online, out-of-print books.

Four years ago, a federal judge sided with the critics and threw out the 2008 settlement, adding that aspects of the copyright issue would be more appropriately decided by the legislature. “Sounds like a job for Congress,” James Grimmelmann, a law professor at the University of Maryland and one of the settlement’s more vocal antagonists, said at the time. But, of course, leaving things to Congress has become a synonym for doing nothing, and, predictably, a full seven years after the settlement was first announced, we’re still waiting.

There are plenty of ways to attribute blame in this situation. If Google was, in truth, motivated by the highest ideals of service to the public, then it should have declared the project a non-profit from the beginning, thereby extinguishing any fears that the company wanted to somehow make a profit from other people’s work. Unfortunately, Google made the mistake it often makes, which is to assume that people will trust it just because it’s Google. For their part, authors and publishers, even if they did eventually settle, were difficult and conspiracy-minded, particularly when it came to weighing abstract and mainly worthless rights against the public’s interest in gaining access to obscure works. Finally, the outside critics and the courts were entirely too sanguine about killing, as opposed to improving, a settlement that took so many years to put together, effectively setting the project back a decade if not longer.

In the past few years, the Authors Guild has usefully proposed a solution known as an “extended collective licensing” system. Using a complex mechanism, it would allow the owners of scanned, out-of-print libraries, such as Google or actual non-profits like the HathiTrust library, to make a limited set of them available with payouts to authors. The United States Copyright Office supports this plan. I have a simpler suggestion, nicknamed the Big Bang license. Congress should allow anyone with a scanned library to pay some price—say, a hundred and twenty-five million dollars—to gain a license, subject to any opt-outs, allowing them to make those scanned prints available to institutional or individual subscribers. That money would be divided equally among all the rights holders who came forward to claim it in a three-year window—split fifty-fifty between authors and publishers. It is, admittedly, a crude, one-time solution to the problem, but it would do the job, and it might just mean that the world would gain access to the first real online library within this lifetime.

To Break Google’s Monopoly on Search, Make Its Index Public

Ex-Google-Search engineer here, having also done some projects since leaving that involve data-mining publicly-available web documents.

This proposal won’t do very much. Indexing is the (relatively) easy part of building a search engine. CommonCrawl already indexes the top 3B+ pages on the web and makes it freely available on AWS. It costs about $50 to grep over it, $800 or so to run a moderately complex Hadoop job.

(For comparison, when I was at Google nearly all research & new features were done on the top 4B pages, and the remaining 150B+ pages were only consulted if no results in the top 4B turned up. Running a MapReduce over that corpus was actually a little harder than running a Hadoop job over CommonCrawl, because there’s less documentation available.)

The comments here that PageRank is Google’s secret sauce also aren’t really true – Google hasn’t used PageRank since 2006. The ones about the search & clickthrough data being important are closer, but I suspect that if you made those public you still wouldn’t have an effective Google competitor.

The real reason Google’s still on top is that consumer habits are hard to change, and once people have 20 years of practice solving a problem one way, most of them are not going to switch unless the alternative isn’t just better, it’s way, way better. Same reason I still buy Quilted Northern toilet paper despite knowing that it supports the Koch brothers and their abhorrent political views, or drink Coca-Cola despite knowing how unhealthy it is.

If you really want to open the search-engine space to competition, you’d have to break Google up and then forbid any of the baby-Googles from using the Google brand or google.com domain name. (Needless to say, you’d also need to get rid of Chrome & Toolbar integration.) Same with all the other monopolies that plague the American business landscape. Once you get to a certain age, the majority of the business value is in the brand, and so the only way to keep the monopoly from dominating its industry again is to take away the brand and distribute the productive capacity to successor companies on relatively even footing.

Sure, it costs $50 to grep it, but how much does it cost to host an in-memory index with all the data?

This is not a proposal to just share the crawl data, but the actual searchable index, presumably at arms length cost both internally & externally.

The same ideas could be extended to the Knowledge Graph, etc.

IMO the goal here should not be to kill Google, but to keep Google on their toes by removing barriers to competition.

This ^ times a 1000.

Google simply has the best search product. They invest in it like crazy.

I’ve tried Bing multiple times. It’s slow, it spams msn ads in your face on the homepage. Microsoft just doesn’t get the value of a clean UX.

DuckDuckGo results were pretty irrelevant the last time I tried them. There is nothing that comes close to Google’s usability. To make the switchover, it has to be much much better than Google. Chances are that if something is, Google will buy them.

One thing to keep in mind when comparing DuckDuckGo to Google is that people do not use Google with an alternative backup in mind. When you DDG something and it fails, you can always switch to google.

But what about when Google fails? Unlike DDG, there is no culture of switching between search engines when googling. Typically, you’ll just rewrite the query for google. And as rewriting the query is an entrenched part of googling, you are less likely to notice this as a failure. It is this training that’s the core advantage nostrademons points out.

Webspam is a really big problem, yes. It’s very unlikely that you’d be able to catch up or keep up in that regard without Google’s resources.

Building the index itself is relatively easy. There are some subtleties that most people don’t think about (eg. dupe detection and redirects are surprisingly complicated, and CJK segmentation is a pre-req for tokenizing), but things like tokenizing, building posting lists, and finding backlinks are trivial – a competent programmer could get basic English-only implementations of all three running in a day.
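
To give a sense of what "basic English-only implementations" of those three pieces might look like, here is a rough sketch. Everything in it is a toy assumption (the function names, the regex-based tokenizer, the two-document corpus, and the `example` hostnames), and it ignores the hard parts mentioned above such as dupe detection, redirects, and CJK segmentation.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_posting_lists(docs):
    """Map each token to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def find_backlinks(links):
    """Invert a page -> outgoing-links map into a page -> incoming-links map."""
    backlinks = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            backlinks[target].add(source)
    return {page: sorted(sources) for page, sources in backlinks.items()}


# Hypothetical toy corpus and link graph, for illustration only.
docs = {
    "a.example/1": "Cats sit in high places",
    "b.example/2": "High places attract cats and birds",
}
links = {
    "a.example/1": ["b.example/2"],
    "b.example/2": ["a.example/1", "c.example/3"],
}

postings = build_posting_lists(docs)
backlinks = find_backlinks(links)
print(postings["cats"])          # ['a.example/1', 'b.example/2']
print(backlinks["c.example/3"])  # ['b.example/2']
```

A real index would also store positions and term frequencies in each posting and would not fit in a Python dict, but the shape of the data is the same.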
> 1) a record of searches and user clicks for the past 20 years

From what I can tell, Google cares a lot more about recency.

When I switch over to a new framework or language, search results are pretty bad for the first week, horrible actually as Google thinks I am still using /other language/. I have to keep appending the language / framework name to my queries.

After a week or so? The results are pure magic. I can search for something sort of describing what I want and Google returns the correct answer. If I search for ‘array length’ Google is going to tell me how to find the length of an array in whatever language I am currently immersed in!

As much as I try to use Duck Duck Go, Google is just too magic.

But I don’t think it is because they have my complete search history.

Also people forget that the creepy stuff Google does is super useful.

For example, whatever framework I am using, Google will start pushing news updates to my Google Now (or whatever it is called on my phone) about new releases to that framework. I get a constant stream of learning resources, valuable blog posts, and best practices delivered to me every morning!

It really is impressive.