Wikipedia co-founder: I no longer trust the website I created

Freddie Sayers meets Larry Sanger.

Listen to the podcast version: https://shows.acast.com/lockdowntv-wi…

Read the full article here: https://unherd.com/thepost/wikipedia-…

Chances are, if you’ve ever been on the internet, you’ve visited Wikipedia. It is the world’s fifth largest website, pulling in an estimated 6.1 billion followers per month and serving as a cheat sheet for almost any topic in the world. So great is the online encyclopaedia’s influence that it is the biggest and “most read reference work in history”, with as many as 56 million editions.

But the truth about this supposedly neutral purveyor of information is a little more complex. Historically, Wikipedia has been written and monitored by a community of volunteers who collaborated and contested competing claims with one another. In the words of Wikipedia’s co-founder, Larry Sanger, who spoke to Freddie Sayers on LockdownTV, these volunteers would “battle it out”.

This battle of ideas on Wikipedia’s platform formed a crucial part of the encyclopaedia’s commitment to neutrality, which, according to Sanger, was abandoned after 2009. In the years since, on issues ranging from Covid to Joe Biden, it has become increasingly partisan, primarily espousing an establishment viewpoint that more and more represents “propaganda”. This, says Sanger, is why he left the site in 2007, describing it as “broken beyond repair”.

Jimmy Wales (Wikipedia): How I Built This With Guy Raz

You must have known at this point, in 2001 or 2003, that Wikipedia was growing really fast.

You decide, I guess around 2003, to make it a non-profit. What was the thinking behind that? Why did you do that?

The community of volunteers very much wanted it to be a non-profit.

Finally, for me, it just made sense. Aesthetically, my ambitions for Wikipedia really make a nonprofit option more sensible. I think if we had gone a different route it would be very different today.

Imagine a world in which every person on the planet were given access to the sum of all human knowledge.

(23:10 min)

But I wonder why you could not have done that same thing and still have put ads on Wikipedia, like banner ads and stuff.

So here’s the thing — think about the DNA of an organization.

It is very difficult to keep an organization from following the money. So Wikipedia is a non-profit, we could run ads. There is no prohibition of non-profits running ads.

Suddenly, people would start to care a lot more about our traffic in highly developed advertising markets.  We would begin to care more about which pages you’re reading.

If you’re reading about Queen Victoria.

If you’re reading about Tesla cars or vacations in Las Vegas, we would have an incentive to

We’re an encyclopedia. We don’t think about adding page views.

We just think about how we make the encyclopedia better and how do we reach more people in the developing world.  That’s just fundamental to what this is all about.

How do you even fund that? How do you even get the money to fund the servers?

The main reason why we started the non-profit is exactly thinking about that for the future, but I had no idea whether it was going to be possible. So we set up the non-profit in June.

Then we had this disaster on Christmas day and I had to scramble to get the site running on 1 server and it was painfully slow.  And it was painfully obvious because the traffic was doubling.

That was the first time I decided to do a fundraising campaign.

These days we call that crowd funding.

I remember very clearly that I had hoped to raise $20,000 in a month’s time. But in about 2 weeks’ time we had raised $30,000.

A lot of small donors. And that is today the model for Wikipedia.  People who believe in Wikipedia, who think it is useful for their lives.

Hey I should chip in.

 

(35:45 min)

When you think about this thing that you built and your role in the history of the internet, how much of the success of Wikipedia do you think was because of your brilliance and your hard work and how much was luck?

A huge amount of luck.

I do think a component of the success of Wikipedia is that I’m a very friendly and nice person and I’m very laid back and so therefore I was able to work in a community environment where people basically yell at you and just have to kind of roll with it and you’re in some sense a leader but you can’t tell anyone what to do. They’re volunteers, so you have to work with love and reason and move people on in a useful way.

So I do think that I’m not irrelevant to the process, but I also think that the community is amazing and the luck of the timing of really hitting that moment when it was possible to build Wikipedia.

Jimmy, you’ve seen the estimates that if Wikipedia were a for-profit, it could be worth at least $5 billion dollars, maybe more.

Yeah.

Does that mean anything to you?

Not really. I mean, it’s, you know.

People, they love to write about how Jimmy Wales is not a billionaire.

I think that there are actually articles with the headline “Jimmy Wales is not an internet billionaire.”

Exactly. And for me that’s a bit odd. My life is unbelievably interesting, amazing. I have the ability to meet almost anyone in the world. And usually I introduce myself and say, “I’m Jimmy Wales, founder of Wikipedia.” And usually they say, “Oh, wow.” And if I say, “I’m Jimmy Wales. I own the largest group of car dealers across the southern part of America,” that’s not that interesting.

At least in that regard, no one will remember me in 500 years, but they will definitely remember Wikipedia.

That’s something that you can hardly get your head around.

There have been comparisons to the Gutenberg Press.  This is the biggest dissemination of human knowledge in modern world history.

But it’s a bit embarrassing to talk about it that way. I just try to have fun.

WT:Social is a New Social Network From WikiTribune

WT:Social is a new social network from Jimmy Wales, the founder of Wikipedia. He promises it will never sell user data and will rely on donors rather than ads (via BBC).

When you first sign up you’ll be put on a waiting list and asked to invite others, or you can sign up for a subscription. It costs US$13/mo or US$100/year.

We will empower you to make your own choices about what content you are served, and to directly edit misleading headlines, or flag problem posts. We will foster an environment where bad actors are removed because it is right, not because it suddenly affects our bottom-line.

WT:Social will be focused on news, and members will be asked to edit misleading headlines. Articles will be shared in a timeline that presents the newest content first, rather than being algorithmically sorted as on Facebook and Twitter.

The Internet Archive Is Making Wikipedia More Reliable

The operator of the Wayback Machine allows Wikipedia’s users to check citations from books as well as the web.

Wikipedia is the arbiter of truth on the internet. It’s what settles arguments at bars. It supplies answers for the information snippets you see on your Google or Bing search results. It’s the first stop for nearly everyone doing online research.

The reason people rely on Wikipedia, despite its imperfections, is that every claim is supposed to have citations. Any sentence that isn’t backed up with a credible source risks being slapped with the dreaded “citation needed” label. Anyone can check out those citations to learn more about a subject, or verify that those sources actually say what a particular Wikipedia entry claims they do—that is, if you can find those sources.

It’s easy enough when the sources are online. But many Wikipedia articles rely on good old-fashioned books. The entry on Martin Luther King Jr., for example, cites 66 different books. Until recently, if you wanted to verify that those books say what the article says they say, or if you just wanted to read the cited material, you’d need to track down a copy of the book.

Now, thanks to a new initiative by the Internet Archive, you can click the name of the book and see a two-page preview of the cited work, so long as the citation specifies a page number. You can also borrow a digital copy of the book for two weeks, so long as no one else has checked it out, much the same way you’d borrow a book from your local library. (Some groups of authors and publishers have challenged the archive’s practice of allowing users to borrow unauthorized scanned books. The Internet Archive says it seeks to widen access to books in “balanced and respectful ways.”)

So far the Internet Archive has turned 130,000 references in Wikipedia entries in various languages into direct links to 50,000 books that the organization has scanned and made available to the public. The organization eventually hopes to allow users to view and borrow every book cited by Wikipedia, with the ultimate goal being to digitize every book ever published.

“Our goal is to be a library that’s useful and reachable by more people,” says Mark Graham, director of the Internet Archive’s Wayback Machine service.

If successful, the Internet Archive’s project would be a boon to students, journalists, or anyone who wants to check the references of a Wikipedia entry. Google Books also has a massive collection of digitized print books, but it tends to only show small snippets of a text.

“I’ve tried to verify Wikipedia pages by searching blurbs in Google Books but it’s an unpredictable link, and you often don’t have enough surrounding context to evaluate the use,” says Mike Caulfield, a digital literacy expert and director of blended and networked learning at Washington State University Vancouver. “The ability to read a page or two of context around a quote is crucial to both editors trying to protect the integrity of articles, and to readers who need to get to that next step of verification.”

You could, of course, verify the information the traditional way by tracking down a physical copy of a book. But students working late into the night on term papers, or reporters on tight deadlines, might not have time to order a book on Amazon or wait for a library book to become available. In other cases, books might be hard to come by. The Wikipedia entry on the internment of Japanese-Americans during World War II, for example, cites hard-to-find titles, says Internet Archive director of partnerships Wendy Hanamura. But thanks to the Internet Archive’s Digital Library of Japanese-American Incarceration, created with the Seattle-based organization Densho, many of those rare books are now available online.

The Internet Archive embarked on its effort to weave digital books into Wikipedia after the 2016 election. “No matter who you wanted to be president, I would say almost everyone would agree the whole process was a train wreck,” Internet Archive founder Brewster Kahle said in a speech in San Francisco last week. From fake news and inauthentic social media campaigns waged by foreign nations to concerns about voting systems themselves being rigged, there were plenty of ways that technology and information systems failed the public. So Kahle convened a group of people to discuss how to improve the information ecosystem. One issue that came up was the fragility of Wikipedia citations. Books and academic journals supply some of the best, most reliable information for Wikipedia editors, but those sources frequently are either unavailable online or are behind paywalls. And even freely available internet content often disappears.

The Internet Archive was in a unique position to help solve this problem. The organization’s Wayback Machine service has archived 387 billion webpages since 2001. It’s also been digitizing physical books and other analog media, and has now scanned 3.8 million books. It has millions more books warehoused.

Graham and company created the InternetArchiveBot, a tool that scans Wikipedia for broken links and automatically adds links to versions archived in the Wayback Machine. Because automatic editing tools require special permission to use, Graham has to work with the Wikipedia communities that manage versions of the encyclopedia in different languages. “All told, we’ve edited 14 million links; more than 11 million point to Internet Archive,” he says.
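The link-rescue idea described above can be sketched in a few lines of Python. This is only an illustration, not InternetArchiveBot's actual code: the `Citation` class, `rescue_links`, and the `is_dead` and `find_snapshot` callbacks are hypothetical names, and the only real-world detail assumed is the public Wayback Machine URL pattern `https://web.archive.org/web/<timestamp>/<url>`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a Wayback Machine URL for a snapshot taken at `timestamp`
    (a 14-digit YYYYMMDDhhmmss string)."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


@dataclass
class Citation:
    """Hypothetical stand-in for one external link in an article."""
    url: str
    archive_url: Optional[str] = None


def rescue_links(
    citations: List[Citation],
    is_dead: Callable[[str], bool],
    find_snapshot: Callable[[str], Optional[str]],
) -> int:
    """Attach an archive link to every dead citation that has a snapshot.

    `is_dead` and `find_snapshot` are injected so the sketch needs no
    network access; a real tool would check the live web and the archive.
    Returns the number of citations that gained an archive link.
    """
    fixed = 0
    for citation in citations:
        if citation.archive_url is None and is_dead(citation.url):
            timestamp = find_snapshot(citation.url)
            if timestamp is not None:
                citation.archive_url = wayback_url(citation.url, timestamp)
                fixed += 1
    return fixed
```

Links that are still alive, or dead links with no snapshot, are left untouched, so running the function twice does not create duplicate edits.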

Adding links to books is similar but more challenging. “If a book has an ISBN number and an entry has a traditional citation format, it’s pretty easy,” Graham explains. But not all books have ISBN numbers, and many Wikipedia citations aren’t properly formatted. For instance, some only cite the book and not a specific page number. There can also be differences between different editions of a book.
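To make the "pretty easy" case concrete, here is a rough Python sketch of how a tool might decide whether a citation can be turned into a page-level book link. The dictionary layout and the `linkable` helper are assumptions for illustration, not the Internet Archive's method; the ISBN-10 and ISBN-13 check-digit rules are the standard ones.

```python
import re


def normalize_isbn(raw: str) -> str:
    """Strip spaces and hyphens and upper-case a trailing 'x'."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10: digits weighted 10 down to 1 must sum to a multiple of 11.
    The last character may be 'X', which stands for the value 10."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
    )
    return total % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13: digits weighted 1, 3, 1, 3, ... must sum to a multiple of 10."""
    if not re.fullmatch(r"\d{13}", isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
    return total % 10 == 0


def linkable(citation: dict) -> bool:
    """A citation (hypothetical dict with 'isbn' and 'page' keys) is linkable
    only if it carries a valid ISBN and a page number."""
    isbn = normalize_isbn(str(citation.get("isbn", "")))
    has_page = bool(str(citation.get("page", "")).strip())
    return has_page and (is_valid_isbn10(isbn) or is_valid_isbn13(isbn))
```

A citation with a valid ISBN but no page number fails the check, which mirrors the case above where an entry cites only the book. Matching different editions of the same title is a harder problem that this sketch does not attempt.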

Of course, the Internet Archive hasn’t scanned all the books cited by Wikipedia yet. It’s working hard to digitize collections from libraries around the world, along with donations from companies like Better World Books. Graham says the organization scans more than 1,000 books per day. But it has plenty more work to do.

MariaDB Vs MySQL In 2019: Compatibility, Performance, And Syntax

Who Uses These Databases?

MySQL: MySQL has generated a strong following since it was started in 1995. Some organizations that use MySQL include GitHub, US Navy, NASA, Tesla, Netflix, WeChat, Facebook, Zendesk, Twitter, Zappos, YouTube, Spotify. You can check the full list here: https://www.mysql.com/customers/.

MariaDB: MariaDB is used by many large corporations, Linux distributions, and more. Some organizations that use MariaDB include Google, Craigslist, Wikipedia, Arch Linux, Red Hat, CentOS, and Fedora.

A Look Inside Wikipedia’s Infrastructure (2008)

Despite being one of the world’s busiest sites, Wikipedia runs on fewer than 300 servers from a single data center in Tampa.

As a non-profit running one of the world’s busiest web destinations, Wikipedia provides an unusual case study of a high-performance site. In an era when Google and Microsoft can spend a half-billion dollars on one of their global data center projects, Wikipedia runs on fewer than 300 servers from a single data center in Tampa, Fla. It also has servers in Amsterdam at the AMS-IX peering exchange.

The engineers on the Wikipedia team may not take themselves too seriously, but they are serious about performance. That’s in keeping with Wikipedia’s guiding principles, which emphasize community over commerce (the site runs no ads) and getting excellent mileage out of its donations. Wikipedia maintains high 99-percent availability, and the usage data for Wikipedia includes some mind-boggling numbers.

  •  50,000 http requests per second
  •  80,000 SQL queries per second
  •  7 million registered users
  •  18 million page objects in the English version
  •  250 million page links
  •  220 million revisions
  •  1.5 terabytes of compressed data

Wikipedia is powered by the MediaWiki software, which was originally written to run Wikipedia and is now an open source project. MediaWiki uses PHP running on a MySQL database. Mituzas said MySQL instances range from 200 to 300 gigabytes. In addition to Squid, Wikipedia uses Memcached and the Linux Virtual Server load balancer. Wikipedia also uses database sharding to set up master-slave relationships between databases.

Additional technical details on Wikipedia’s infrastructure are available in 2007 presentations by Mituzas and Wikimedia’s Mark Bergsma.

Mituzas summed up his view of Wikipedia’s operations in a blog post about his Velocity presentation: “As I see it, in such context Wikipedia is more interesting as a case of operations underdog – non-profit lean budgets, brave approaches in infrastructure, conservative feature development, and lots of cheating and cheap tricks (caching! caching! caching!).”
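The "caching! caching! caching!" point can be illustrated with a small cache-aside sketch in Python. It is not Wikipedia's code: the `PageCache` class and its `render` callback are hypothetical, and a plain dictionary stands in for a cache server such as Memcached. The idea is simply that an expensive page render (and the SQL queries behind it) runs only on a cache miss.

```python
from typing import Callable, Dict


class PageCache:
    """Cache-aside: look in the cache first, render only on a miss."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render             # expensive: would hit the database
        self._store: Dict[str, str] = {}  # stand-in for a Memcached cluster
        self.hits = 0
        self.misses = 0

    def get(self, title: str) -> str:
        """Return the rendered page, using the cached copy when present."""
        if title in self._store:
            self.hits += 1
            return self._store[title]
        self.misses += 1
        html = self._render(title)
        self._store[title] = html
        return html

    def invalidate(self, title: str) -> None:
        """Drop a page after an edit so the next read re-renders it."""
        self._store.pop(title, None)
```

For a site where reads vastly outnumber edits, most requests would be served from the cache, which is one plausible way a small server fleet could keep up with heavy traffic.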