Wikipedia & YouTube

Why Wikipedia and YouTube will never be integrated.
Wikipedia, Google, technology, law
2009-01-15 to 2013-11-08 · finished · certainty: certain · importance: 3

Criticisms of Wikipedia tend to fall into 2 categories: the clueless and the cliche. Either they are based on lack of knowledge¹, or they are obvious if you think about how wikis work² and have been written about and covered with mind-numbing frequency.

The Wikimedia Commons project tends to attract the former³. For example, people criticize it for not supporting Flash, for only supporting Ogg & Theora media files, for using a Java applet for playback, for supporting only Firefox’s <video></video> tags…

Commons and videos

But the stupidest I’ve seen in a long time has to be the criticism that the Foundation is wasting the hard-earned money of donors by not hosting videos on YouTube⁴.

The lev­els of fail­ure in this crit­i­cism are many and con­sid­er­able. Wikipedia will nev­er, ever host things on YouTube un­less YouTube makes some con­sid­er­able changes. (Heck, the Wikipedian com­mu­nity views merely link­ing to YouTube con­tent with dis­taste.)

YouTube’s software sucks

From a tech­ni­cal stand­point, YouTube is a ter­ri­ble part­ner:

  1. Its in­ter­face is clut­tered and poor.

  2. It is not localized into the more than 200 languages the MediaWiki engine supports⁵.

  3. It does not sup­port ba­sic fea­tures im­por­tant to Wikipedi­ans, like the triv­ial abil­ity to down­load the files.

    More specifically, one can download the file, but this file is not the original file! The metadata is gone, a vast amount of resolution (audio & visual) is destroyed in the encoding, and YouTube reportedly applies still other lossy processing.

  4. It does not sup­port ex­tremely large files. Wiki­me­dia Com­mons sup­ports files up to 100 megabytes and more; file sizes this large are ab­solutely nec­es­sary in or­der to han­dle video files of any rea­son­able length.

    One goal of Commons is to be archival, in the sense of providing originals and faithful reproductions of Free content; otherwise the project simply cannot meet its goals as an educational resource. Grainy 320² pixel 5-minute videos are simply unacceptable.

  5. It does not sup­port meta­da­ta.

    1. The existing YouTube support for metadata is pitiful: the uploader, the date, a few words of summary, and perhaps some other items like whether it’s a ‘video response’ to something. The overall impression is really sad. YouTube comments are legendarily moronic, but they don’t have to be: there isn’t even a simple ranking; new comments just displace older ones, so there is no way for good comments to persist. Even Slashdot in the 1990s had a better comment system than YouTube. That YouTube, owned by a company famous for having many of the best programmers in the world, cannot or will not implement a better system speaks volumes.
    2. Me­di­aWiki, on the other hand, sup­ports meta­data very well (to an ar­bi­trary de­gree, in fact, be­cause of the tem­plate sys­tem). Com­mons has a won­der­ful li­brary-like cul­ture of check­ing im­ages, adding meta­data, and or­ga­niz­ing me­dia. If files were to be put on YouTube, it would lead to a meta­data holo­caust. There’s sim­ply nowhere to put it in YouTube.
  6. Its soft­ware is not free. Flash is not free. YouTube’s sys­tems are not free. The source of nei­ther is avail­able in any way. This alone would rule out YouTube as a pos­si­ble part­ner.

    • Why? One of the fun­da­men­tal tenets of Wikipedia is that it will be ‘user-friendly’ in the best sense. There won’t be any busi­ness or profit-driven bull­shit that gets in the way of pro­duc­ing stuff. This means as lit­tle bu­reau­cratic or fi­nan­cial over­head as pos­si­ble.
    • This view has major consequences. For example, the content will be available and usable by everyone: commercial users⁶, bankrupt users, everyone. Flash and the audio/video codecs are not available to everyone (including commercial users). They are encumbered with licensing restrictions and royalty payments, and are illegal to modify or change in any way. Suppose someone wanted to distribute DVDs of Commons videos and music files; if they used MP3s and other such encumbered formats, who would pay the owners⁷ their pound of flesh?
  7. Com­mer­cial hosts are un­re­li­able.

    YouTube may be the most successful video site online, but that’s like setting a record at the Olympics or burning a CD: it’ll last forever or 5 years, whichever comes first. YouTube is not even a decade old. Google does not have a long-term track record, and regularly shuts down services which do not pan out or whose salad days are behind them. Naturally, YouTube hasn’t been deleting files yet (other than the files deleted as part of the usual high error rate of a legally-fearful service), but it’d be very stupid to wait a few decades to see how YouTube pans out. We should look at the outside view and see how other online services have been doing over the span of less than 2 decades; the obvious candidate is Yahoo!, a mammoth Internet company that was once as successful and dominant as Google is now.

    The perspective from Yahoo! is not good. It stagnated for years, and recent management has been aggressively shedding services. First to go were the terabytes of user content at GeoCities. Next on the block was the Delicious database, with its hundreds of millions of tags for URLs. Most recently, Yahoo! Video was scheduled to go down the memory hole on 2011-03-15 (too bad about that roughly 25 terabytes of video). See also my .

  8. YouTube likely does not respect WMF’s stringent privacy policy; after all, it makes its money from advertising (which might itself be a problem, as advertising can entail conflicts of interest). Especially problematic from the privacy perspective have been Google’s unification of user accounts across all its properties, its moves towards real-name accounts, and its encouragement of YouTube viewers to have Google accounts.

  9. The YouTube/Google back­end does not use all-FLOSS com­po­nents as WMF prefers.
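To put rough numbers on the file-size point in item 4, here is a back-of-the-envelope sketch of how much video fits under a 100 megabyte cap. The bitrates are assumptions picked purely for illustration (they are not measurements of Commons or YouTube encodes), and the function name is my own.

```python
# Rough arithmetic: how many minutes of video fit under a file-size cap?
# The bitrates below are illustrative assumptions, not measured values.

def minutes_that_fit(size_megabytes: float, bitrate_kbit_per_s: float) -> float:
    """Playing time in minutes that fits in size_megabytes at a constant
    combined audio+video bitrate (1 MB is taken as 10^6 bytes)."""
    total_bits = size_megabytes * 1_000_000 * 8
    seconds = total_bits / (bitrate_kbit_per_s * 1_000)
    return seconds / 60


if __name__ == "__main__":
    for kbps in (500, 1_000, 4_000):  # assumed low / medium / high quality
        print(f"{kbps:>5} kbit/s: {minutes_that_fit(100, kbps):.1f} min in 100 MB")
```

Under these assumptions, 100 megabytes holds about 13 minutes at 1,000 kbit/s and only a few minutes at higher quality, which is why large uploads matter for anything resembling an archival copy.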

Further reasons could be adduced. I hope it’s clear after 9 points that in order to become an acceptable host, YouTube would have to cease to be YouTube.

YouTube’s terms of service suck

But let’s say that we’ve given up on the idea of YouTube as a ‘primary’ host. Couldn’t we perhaps still put up secondary backups, for people who like using YouTube instead of Commons, and don’t mind the (very) limited selection and quality? It wouldn’t even have to be any official liaison between the Wikimedia Foundation and YouTube, just a few users selecting videos, compressing, and uploading.

Sad­ly, even this is com­pletely im­pos­si­ble. At least, if you care about things like copy­right laws.

You see, YouTube claims legal rights over uploaded content that are incompatible with the GFDL and the CC-BY-SA. They are, in fact, incompatible with every copyleft license out there. You can’t comply with their Terms of Service (TOS) and also upload a GFDL’d video⁸.

Why not? Well, let’s look at para­graph 10 of the YouTube TOS (em­pha­sis added):

When you up­load or post a User Sub­mis­sion to YouTube, you grant: 1. to YouTube, a world­wide, non-ex­clu­sive, roy­al­ty-free, trans­fer­able li­cence (with right to sub­-li­cence) to use, re­pro­duce, dis­trib­ute, pre­pare de­riv­a­tive works of, dis­play, and per­form that User Sub­mis­sion in con­nec­tion with the pro­vi­sion of the Ser­vices and oth­er­wise in con­nec­tion with the pro­vi­sion of the Web­site and YouTube’s busi­ness, in­clud­ing with­out lim­i­ta­tion for pro­mot­ing and re­dis­trib­ut­ing part or all of the Web­site (and de­riv­a­tive works there­of) in any me­dia for­mats and through any me­dia chan­nels;

Now, re­mem­ber the le­gal judo that makes copy­left pos­si­ble: every de­riv­a­tive work must be li­censed un­der at least the same li­cense as the orig­i­nal, and be at least as un­re­stricted as it. So if I have a GFDL video, every de­riv­a­tive of it—and it­self—­must al­ways be un­der the GFDL. The wa­ters can get muddy here with mul­ti­-li­cens­ing changes, but this is the part that mat­ters here.

Now, if I give you a GFDL video, you don’t own the copyright. You merely have a license from me which lets you do a lot of things with the video only so long as you give the video away as a GFDL’d video. So, how is it possible for you to grant YouTube any license which is not the GFDL? You can’t do it! You aren’t allowed: if you try to, you break your GFDL contract, and now it’s illegal for you to give YouTube the video in any form. The issue isn’t that YouTube is making you give them the right to do just about anything⁹ with your uploads, but rather that they won’t accept the GFDL.
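The logic of the last two paragraphs can be written down as a toy check. This is only a sketch of the argument as I have stated it, not legal advice and not a model of any real licensing tool; the class, the function, and the set of licenses are all hypothetical names made up for the illustration.

```python
from dataclasses import dataclass

# Toy model of the argument above. All names here are hypothetical.

# Copyleft licenses: a licensee may pass the work on only under the same terms.
SAME_TERMS_ONLY = {"GFDL", "CC-BY-SA"}


@dataclass(frozen=True)
class Video:
    license: str                   # license the uploader received the video under
    uploader_owns_copyright: bool  # True only if the uploader is the copyright holder


def can_accept_tos(video: Video) -> bool:
    """Can the uploader grant the broad, sublicensable license the TOS asks for?"""
    if video.uploader_owns_copyright:
        # The owner may license the work however they like (the exception in footnote 8).
        return True
    if video.license in SAME_TERMS_ONLY:
        # A mere licensee cannot grant terms other than the copyleft license itself.
        return False
    raise ValueError(f"license {video.license!r} is outside this toy model")


if __name__ == "__main__":
    print(can_accept_tos(Video("GFDL", uploader_owns_copyright=False)))  # licensee
    print(can_accept_tos(Video("GFDL", uploader_owns_copyright=True)))   # owner
```

The only branch that returns `True` is the one where the uploader holds the copyright outright, which is exactly the case a volunteer re-uploading someone else's Commons video is not in.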

YouTube will change this para­graph of their Terms of Ser­vice when hell freezes over.

Summing up

So, YouTube is grossly inappropriate technically. It is impossible legally. And it would be of dubious utility anyway, as spending on servers and bandwidth is a fairly small fraction of the Foundation’s budget¹⁰. Hosting Commons videos on YouTube is an impressive stinker of an idea, the kind of idea which displays fractal cruddiness: the more you look at specific details, the worse it gets. Which is not to say there aren’t liaisons out there that make sense for Wikimedia Commons (for example, the Internet Archive has a similar ideology, with a good archival approach, and the media collections overlap), but it’s safe to say that commercial services are unlikely to make good partners. If I had to generalize from YouTube, the lesson to be drawn is:

A profitable site is not a user-friendly site.

  1. ‘Wikipedia steals your copy­right’, ‘Wikipedia pages can’t be re­li­ably cited’ etc.↩︎

  2. ‘I saw some van­dal­ism the other day’, ‘this ar­ti­cle is­n’t very good’, ‘I think this topic should­n’t be cov­ered at all’↩︎

  3. Which is better than attracting the latter, because you can correct someone who is thinking but simply lacks knowledge about all the relevant constraints Wikimedia Commons operates under; but what do you say to the millionth iteration of “Wikipedia shouldn’t be used because anyone can edit it and it’s unreliable”?↩︎

  4. This charge was made by a real per­son; to pro­tect the guilty, they will not be linked.↩︎

  5. For details on i18n & MediaWiki in general, see Multilingual MediaWiki; for hard numbers, see Translation Statistics.↩︎

  6. One prob­lem with con­sid­er­ing com­mer­cial users is that every­one in­stantly thinks of large abu­sive cor­po­ra­tions and feels the urge to use only NC li­cens­es. The prob­lem is that while NC li­censes would in­deed pun­ish large abu­sive cor­po­ra­tions (as­sum­ing they care), there are many ‘com­mer­cial’ users who don’t seem com­mer­cial at all but would still be banned by such a li­cense. There are other rea­sons of course; see “The Case for Free Use: Rea­sons Not to Use a Cre­ative Com­mons -NC Li­cense” for more.↩︎

  7. Thom­son Con­sumer Elec­tron­ics & Fraun­hofer In­sti­tute & Au­dio MPEG, Inc & Al­catel-Lu­cent↩︎

  8. There is one way you could get around it—if you own the en­tire copy­right on the up­loaded video. Then the law would in­ter­pret your up­load as an up­load not of a copy­left­-li­censed-video, but of an al­l-right­s-re­served video, since that’s the only way the TOS would be sat­is­fi­able. The differ­ence is im­plied in your con­sent.↩︎

  9. They get the rights to use, re­pro­duce, dis­trib­ute, and make de­riv­a­tives of the video. That’s just about every­thing!↩︎

  10. As of the 2007 and 2008 bud­get­s—half and falling! (More in­for­ma­tion.)↩︎