Difference between revisions of "User:Econterms/Report from WikiSym/OpenSym 2013"


Revision as of 06:11, 11 August 2013

Draft blog about WikiSym/OpenSym 2013

WikiSym is an annual conference on academic research about wikis and other kinds of open collaboration. As in past years, some of the research is fascinating. This time I happily identified myself as a member of Wikimedia DC on my name badge. Here are some topics and findings I found interesting. Most of the full papers are linked from the conference proceedings (online at http://opensym.org/wsos2013/program/proceedings).

Sources used for Wikipedias and suchlike
  • Han-Teng Liao compared a big official Chinese online encyclopedia, Baidu Baike (BB), with Wikipedia. Liao analyzed the sources they cited. BB copies from the Chinese Wikipedia, and there is a spectrum of differences, e.g. that BB is . The work is underway; here is an abstract: http://opensym.org/wsos2013/proceedings/p0601-liao.pdf
  • We saw an analysis of sources cited in footnotes on English Wikipedia. Scholarly publications are cited less than in a traditional encyclopedia. Large fractions of references are to primary sources, and to sources from "alternative" publishers, governments, and nonprofits. The authors also commented on global South geography. Heather Ford, David R. Musicant, Shilad Sen, Nathaniel Miller: http://opensym.org/wsos2013/proceedings/p0203-ford.pdf

Data versioning and open access

  • Sowe and Zettsu (http://opensym.org/wsos2013/proceedings/p0301-sowe.pdf) discuss "curating" data sets with a wiki. A version control system for data sets differs from one for code, partly because data sets may change so much from version to version that they are too hard to compare realistically, and partly because the data sets are large. They have a "data curation model" implemented on a MediaWiki in which a description of the data ("metadata") is on the wiki and links to the data itself, and the individuals doing this have wiki histories and reputations.
  • Computational biologist Philip Bourne spoke on the challenges of open science. He described an experiment in making a PLOS publication that also went right to Wikipedia. He discussed how a scientific paper could or should be associated with easy access to its data and to executable versions of its statistical analysis and graphs. This subject came up at other times during the conference. It implies a set of steps beyond open data, toward open and reusable data and analysis. We're not close to making this easy to implement; it's a bit like making a movie for each scientific paper, which also includes its footnotes. He is the co-founder and founding Editor-in-Chief of the open access journal PLOS Computational Biology which, he said, is publishing 30,000 articles this year and is by this measure the largest academic journal in the world. (Bourne's slides: http://www.slideshare.net/mobile/pebourne/wiki-symopensym2013)
Reverts on Wikipedia
These are edits on Wikipedia that undo a string of previous edits.
  • Geiger and Halfaker analyze the sources of "reverts" on Wikipedia. Most reverts are designed to maintain quality against vandalism and errors. The authors show that ClueBot NG is the quickest and most active mechanism (usually acting against vandalism within 20 seconds if it will act at all) and discuss the spectrum of other bots, tools, and human behaviors that cause reverts. ClueBot NG was down several times for days in 2011, and they analyze how many reverts occurred in those periods. They conclude in essence that the same quality control was exercised in those periods, but more slowly, and they discuss how slowly. http://opensym.org/wsos2013/proceedings/p0200-geiger.pdf
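
To make the idea of a revert concrete, here is a minimal sketch of one common way to spot one: a revision whose text is identical to an earlier revision of the same page, with at least one different revision in between. This is only an illustration and is not necessarily the method used in the paper. The function name `find_identity_reverts`, the `radius` parameter, and the plain-list input are all assumptions made up for this example.

```python
import hashlib


def find_identity_reverts(revisions, radius=15):
    """Find revisions that restore the exact text of an earlier revision.

    revisions: page texts in chronological order (hypothetical input format).
    radius: how many revisions back to search (an assumed cutoff).
    Returns a list of (reverting_index, restored_index, undone_indices).
    """
    digests = [hashlib.sha1(text.encode("utf-8")).hexdigest() for text in revisions]
    reverts = []
    for i, digest in enumerate(digests):
        oldest = max(0, i - radius)
        # Walk backwards to the most recent earlier revision with the same text.
        for j in range(i - 1, oldest - 1, -1):
            if digests[j] == digest:
                # A match right before i is a null edit; nothing was undone.
                if j < i - 1:
                    reverts.append((i, j, list(range(j + 1, i))))
                break
    return reverts


history = ["a", "a b", "vandalism", "a b", "a b c"]
print(find_identity_reverts(history))
```

In this toy history, revision 3 restores the text of revision 1, so revision 2 counts as undone. Comparing hashes rather than full texts keeps the check cheap on long pages, but it only catches exact restorations, not partial ones.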