I’m the principal research scientist at the nonprofit behind Wikipedia. I study AIs and community dynamics. AMA!
Hello, I’m Aaron Halfaker, and I’m a principal research scientist at the Wikimedia Foundation, the nonprofit that supports and operates Wikipedia and several other free knowledge projects.
I’m here today to talk about the work I do with wiki knowledge communities. I study the work patterns of the volunteers who build Wikipedia and I build artificial intelligences and other tools to support them. I’ve been studying how crowds of volunteers build massive, high-quality information resources like Wikipedia for over ten years.
A little background about me: I have a PhD in computer science from the GroupLens Research lab at the University of Minnesota. I research the design of technologies that make it easier to spot vandalism and support good-faith newcomers in Wikipedia. I think a lot about the dynamics between communities and new users—and ways to make communities inviting.
I’m very excited to be doing an AMA today and sharing more details about how Wikipedia functions, how the community scaled its processes to handle an internet’s worth of contributions, and how we can use artificial intelligence to support open knowledge production work.
I’ll try to answer any questions you have about community dynamics, the ethics of AI and how we think about artificial intelligence on Wikipedia, and ways we’re working to counteract vandalism on the world’s largest crowdsourced source of knowledge. One of the nice things about working at Wikipedia is that we make almost all of our work public, so if you’re interested in learning more about this stuff, you can read about the team I run or the activities of the larger research team.
My Proof: https://twitter.com/halfak/status/870374941597868032
Edit 1: I confirm that I work with this /u/Ladsgroup_Wiki guy. He's awesome. :)
Edit 2: Alright folks, it's been a great time. Thanks for asking some great questions and engaging some awesome discussion. I've got to go do some other things with my evening now, but I'll be back tomorrow morning (16 hours or so from now) to answer any more questions that come in. o/
Edit 3: I've been coming back periodically to answer questions, so feel free to keep asking and I'll get back to you when I can :)
halfak14 karma
Good Q. So, all of the vandal fighting systems in Wikipedia rely on a machine learning model that predicts which edits are likely problematic. There's ClueBot NG that automatically reverts very very bad edits and tools like Huggle/STiki that prioritize likely bad edits for human review. Before ORES, each of these tools used their own machine learning model. This would have been fine, but it's actually quite a lot of work to stand up one of those models and maintain it so that it runs in real time. I think if it weren't so difficult, we'd see a lot more vandal fighting tools that use an AI. That's where ORES comes in.
ORES centralizes the problem of machine prediction so that tool/bot developers can think about the problem space and the user interactions they want to support rather than having to do the heavy lifting of the machine learning modeling stuff. Instead, developers only need to figure out how to use a simple API in order to get predictions in their tools. Currently, Huggle has switched over to using ORES, but I don't think ClueBot NG has. The developer of STiki was one of our key collaborators during the development of ORES. There are now many new tools that have come out in the past few years that use ORES.
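To give a concrete sense of what "a simple API" means here, this is roughly what a tool developer's request looks like in Python (the revision ID is a placeholder, and the exact response layout can vary a bit between ORES versions):

```python
import requests

# Ask ORES to score one English Wikipedia revision with the
# "damaging" and "goodfaith" models (the revision ID is illustrative).
response = requests.get(
    "https://ores.wikimedia.org/v3/scores/enwiki/",
    params={"models": "damaging|goodfaith", "revids": "123456789"},
)
scores = response.json()["enwiki"]["scores"]["123456789"]

# Each model returns a prediction plus class probabilities.
damaging = scores["damaging"]["score"]
print(damaging["prediction"], damaging["probability"]["true"])
```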
Meta_Bot31 karma
That's very interesting to find out, thank you. Do you see a possibility where ClueBot NG would integrate your machine prediction in their operational model? Have you discussed this option with the bot makers?
BTW I would also just like to add that I appreciate this project and the team behind it very much, it's a highly interesting approach to machine learning and prediction.
halfak1 karma
The people who maintain ClueBot NG are notoriously difficult to get a hold of, and I'm notoriously busy thinking about other stuff. ^_^ So somehow we still haven't connected yet. I'd like to share notes with them at least. As it stands, it seems ClueBot NG and its maintainers are doing a good job, so I'm not too concerned that we haven't connected yet.
One note is that there are several ClueBot NG-like bots that run in other language wikis based on ORES. I forget the name of the bot in Persian Wikipedia, but it roughly does the same thing -- reverting edits that are very likely to be vandalism. There's a bot that people are working on for Finnish Wikipedia that will use ORES to "auto-patrol" edits that ORES thinks are very likely to be good.
Thanks! And thanks for your questions :D
halfak13 karma
It does, but we have to be cautious here. Predictions can affect people's judgement. If we have an AI with a little bit of bias, that can direct people to perpetuate that bias. Then if we re-learn on their behavior, we'll learn the bias even stronger! So we're very cautious about training on past behavior. Instead, we ask people to use a custom "labeling interface" that removes them from the context of Wikipedia and asks them to make a single independent judgement based on the edits/articles/whatever we're modeling. A cool thing about this is that we can ask users to give us more nuanced and specific judgements. E.g. rather than just predicting "Will this edit be reverted", we can predict "Was this edit damaging" independent of "Was the damage probably intentional".
Edit: Here's some docs about our labeling system: https://meta.wikimedia.org/wiki/Wiki_labels
Mr_Math1238 karma
How much vandalism needs to happen before an article gets protected? Are there any other factors that might get taken into consideration?
Ladsgroup_Wiki7 karma
Hey, I'm Amir and I work with Aaron in the scoring platform team. This mostly depends on each wiki's policies and the topic of the article. For example, for English Wikipedia there is a page explaining when a page needs to be protected by volunteers (to clarify, protection is done by volunteers and staff won't do this directly except in extreme cases). As a rule of thumb, when I have my volunteer hat on, I usually protect after three vandalizing edits.
halfak10 karma
Ladsgroup_Wiki's answer is great, but I just wanted to take the opportunity to share my favorite example of a protected page: Elephant
This page has been protected since 2006 when Colbert vandalized it on-air. Check out this awesome Wikipedia article: https://en.wikipedia.org/wiki/Cultural_impact_of_The_Colbert_Report#Wikipedia_references (Because of course there's a Wikipedia article about that)
MjolnirPants7 karma
What language do you tend to write AIs in? I've got a background in LISP and I'm curious if the language that was once almost synonymous with AI and expert systems is still in use by AI researchers.
halfak8 karma
I breathe Python these days. The scikit-learn library is an amazingly useful set of tools for doing machine learning. We end up using a few other libraries that are based on C utilities in order to get the kind of speed we need out of Python, but generally, I don't find Python lacking in that department either.
One of the great things about working with Python is that it's becoming so dominant for data scientists. A lot of the volunteers I am working with are data-sciency PhD students who are interested in modeling stuff around Wikipedia, and they're all working in Python. Python + Jupyter notebooks seems to be gaining steam quickly.
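For a sense of how little code that workflow takes, here's a toy scikit-learn sketch in the spirit of what our damage-detection models do (the feature vectors and labels below are made up for illustration; the real models live in the wiki-ai/editquality project and use far richer features):

```python
from sklearn.ensemble import GradientBoostingClassifier

# Toy feature vectors for edits:
# [chars added, chars removed, "badwords" added, editor is anonymous]
X = [
    [12, 0, 0, 0],
    [0, 450, 3, 1],
    [85, 10, 0, 0],
    [5, 300, 2, 1],
    [40, 2, 0, 0],
    [1, 600, 4, 1],
]
y = [False, True, False, True, False, True]  # True = damaging

model = GradientBoostingClassifier(n_estimators=100)
model.fit(X, y)

# Probability that a new, unseen edit is damaging
print(model.predict_proba([[0, 500, 5, 1]])[0][1])
```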
MjolnirPants1 karma
<sighs and walks away while the Incredible Hulk theme plays> Nobody loves the parentheses anymore...
But seriously, thanks for the links. I will check those out.
halfak7 karma
Still got some love for LISP. I wrote my first genetic algorithm in LISP. I built a solver for the Optimal Movie Seating Problem https://xkcd.com/173/ :)
raveseer7 karma
Hello Aaron, could you go into more detail on how you use AI to counteract vandalism? This seems like a very cool concept, does it try to learn patterns of vandalism, such as common parts of an article to tinker with or alter, or is it more of a depth-based pattern, looking at specific users and how they are posting to see if the user has a history of vandalism?
halfak9 karma
We specifically target patterns within the edit to the article itself. There are a lot of problems we can run into by profiling users, but there are also some great things we can do by building a rap sheet based on the predictions that the individual edit model makes for a user historically. I.e. vandals tend to make edits that look like vandalism and non-vandals tend to make edits that don't look like vandalism. I've written a paper about how we can use this insight to target newcomer support to people who are trying to contribute productively. See http://www-users.cs.umn.edu/~halfak/publications/Snuggle/halfaker14snuggle-personal.pdf
If you want to dig into the details of the specific features we use for predicting which edits are damaging, check out this repo: https://github.com/wiki-ai/editquality
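To make "patterns within the edit" a little more concrete, here's a purely illustrative sketch of the kind of hand-rolled features a model might look at (these are not the actual editquality features, which are curated per language and far more extensive):

```python
import re

# A tiny, made-up "badwords" list; the real models use curated,
# per-language lists and many more signals.
BADWORDS = {"poop", "stupid", "lol"}

def edit_features(old_text, new_text):
    """Extract a few toy features describing the change itself."""
    # Naive "added text" extraction for the simple append case.
    added = new_text[len(old_text):] if new_text.startswith(old_text) else new_text
    added_words = re.findall(r"\w+", added.lower())
    return {
        "chars_added": max(len(new_text) - len(old_text), 0),
        "chars_removed": max(len(old_text) - len(new_text), 0),
        "badwords_added": sum(w in BADWORDS for w in added_words),
        "upper_ratio": sum(c.isupper() for c in added) / (len(added) or 1),
    }

print(edit_features("An elephant is a large mammal.",
                    "An elephant is a large mammal. LOL poop"))
```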
beppebo7 karma
Hi Aaron, sometimes part of the community could be "a bit reluctant" to embrace new technologies (e.g. edit interfaces, discussions, etc.). In your experience, is there a similar behavior about studies? Have you ever faced a "no, that's not true!" comment from part of the community as a reaction to your findings? Thanks!
halfak9 karma
Yeah! Good question. In my most seminal study, The Rise and Decline, my collaborators and I highlight how the strong, negative reaction to newcomers in Wikipedia is causing a serious crisis for the editing community. I've been doing a lot of outreach to get the word out over the last 5 years, but I still come across people who are hard to convince. I find that a healthy dose of empiricism (see the paper) is useful when talking to anyone I want to convince. Still, I often end up learning new things through conversations. Sometimes a study misses the point and researchers aren't Right(TM). I like to do trace ethnography and interviews in combination with my analyses to make sure I'm not getting things totally wrong.
Edit: I should mention that the strong negative reaction towards newcomers wasn't malicious but rather a response to the huge quality control needs that Wikipedians faced. In a followup paper I talk about how it was that Wikipedia's quality control processes got so aggressive. A big part of my work today is trying to figure out how to have both efficient quality control and good newcomer socialization.
PubliusTheYounger6 karma
Have you ever seen AI-automated clients maliciously editing Wikipedia? If yes, how did you help defend against the edits? If not, do you think it might be a problem in the future?
halfak11 karma
I don't think that I have. Most of the vandalism we deal with is not very clever. Fun story, there's a cycle that is really obvious in the data that corresponds to the beginning and end of the school year in the West. It looks like quite a lot of vandalism in Wikipedia comes from kids editing from school computers!
Searching around, my colleagues found https://meta.wikimedia.org/wiki/Vandalbot which seems to be the kind of thing that you're asking about.
halfak11 karma
I think that, with better models and the better interfaces that people will develop on top of them, we'll have fewer people who are focusing on counter-vandalism interacting with good-faith newcomers. Instead, we'll be able to route good-faith newcomers to people who are focusing on supporting and training people when they show up at Wikipedia. I'm working with product teams at the Wikimedia Foundation now to start imagining what such a future routing system will look like. But I look forward to what our volunteer developer community comes up with. A big part of my job is making sure that they have the tools that they need to experiment with better technologies. I'm betting that, by providing easier to use machine learning tools, our tool developer community will be able to more easily dream of better ways to route and support good-faith newcomers.
pab3l5 karma
Hi Aaron, it's nice to find you here! From your point of view, what is missing in the Natural Language Processing or Artificial Intelligence area? I mean, are there any particular needs or tasks at the Wikimedia Foundation that you have not been able to solve due to the unreliable, or missing, state of the art in these science areas?
halfak4 karma
One of my main frustrations with the AI literature around Wikipedia/Wikidata/etc. is that the models that people build are not intended to work in realtime. There are a lot of interesting and difficult engineering problems involved in detecting vandalism in real time that disappear when your only goal is to make a higher fitness model that can take any finite amount of time to train and test. I often review research papers about an exciting new strategy <foo>, but there's either no discussion of performance or I find out at the end that scoring a single edit takes several minutes. :S
I guess one thing that I'd like is a nice way to process natural language into parse trees for more than just a couple of languages. E.g. spaCy only works for English and German. I need to be able to support Tamil and Korean too! It's hard to invest in a technology that's only going to help a small subset of our volunteers.
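To illustrate the kind of parse-tree processing I mean, here's a minimal spaCy sketch for English (the model name assumes you've downloaded spaCy's small English pipeline; the frustration is that nothing comparable exists for most of the languages our volunteers work in):

```python
import spacy

# Assumes spaCy's English model has been downloaded, e.g.:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Wikipedia is a free online encyclopedia that anyone can edit.")

# Walk the dependency parse: each token points at its syntactic head.
for token in doc:
    print(f"{token.text:<15} {token.dep_:<10} -> {token.head.text}")
```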
PerplexedLabrat5 karma
Hi Aaron, hope you're doing well! Thanks for the AMA btw!
I am not a computer man myself, but what do you have to say to the people (uni teachers, I am looking at you) who say not to cite Wikipedia?
Thanks again!
halfak7 karma
Hey PerplexedLabrat, this is a good question and I get it often. I answered it in another thread. See https://www.reddit.com/r/IAmA/comments/6epiid/im_the_principal_research_scientist_at_the/dic3rew/
Gist is (1) you should never cite an encyclopedia -- Wikipedia or a traditional paper one -- and (2) people don't realize how awesome Wikipedia's quality process is, but the fact remains that Wikipedia compares favorably to traditional encyclopedias. See the linked post for citations and links :)
iwas99x4 karma
Mr. Halfaker, how often are you on Reddit and what are your favorite subreddits?
halfak4 karma
Hey! So I've been on reddit since ... 2006ish. I'm a daily user. Obviously I'm using this account so I don't mix my personal stuff with work stuff. ^_^ My favorite subreddit is /r/youtubehaiku. I think /r/HumansBeingBros is pretty awesome too.
natematias4 karma
Hi Aaron, thanks for taking the time to speak with redditors!
I'm curious about how you, Wikimedia, and Wikimedians evaluate ORES. Based on your work with Huggle/Snuggle, it seems like there are two kinds of evaluation for any human-facing machine learning system: (a) recognition and (b) action.
The first way involves the classic machine learning questions of precision/recall: how well can the system detect what the Wikimedians consider to be abuse, and does it have unfair biases? As I understand it, you've designed ORES in such a way that community members can contest and reshape this part of the system.
The second way to evaluate a system has less to do with what it can recognize and much more to do with what people and computers do with that knowledge: as you wrote in your Snuggle paper, one could deploy all sorts of interventions at the end of an AI system that recognizes vandalism or abuse: one could ban people, remove their comments, or offer them mentorship. These interventions also need to be evaluated, but this evaluation requires stepping back from the question "did this make the right decision here" to ask "what should we do when we recognize a situation of a certain kind, and do those interventions achieve what we hope?"
As AI ethics becomes more of a conversation, it seems to me that almost all of the focus is on the first kind of evaluation: the inner workings and recognition of AI systems rather than the use that then follows that recognition. Is that also true for ORES? When you evaluate Wikimedia's work, do you think of evaluation in these two ways, and if so, what are your thoughts on evaluating the ethics of the outcomes of AI-steered interventions?
halfak3 karma
Hi Nate! Great to see you here and awesome question as expected.
So, re. the first way, we're investing a lot here, but I think we're very different from the rest of the field. Right now, I'm focused on improving auditing strategies that our users will be able to use to discover and highlight what predictions are working and what predictions are not working at all (e.g. false-positives in vandal fighting). Experience has shown that Wikipedians will essentially do grounded theory to figure out what types of things ORES does wrong. This is extremely valuable for repairing our training and testing process WRT what people discover in the field -- when they are using predictions within their tools.
For the second way -- I'd like to aim to be a bit less patriarchal. How should people use an AI? Well, the thing I'm worried about is that I'm not the right person to answer that. I don't know if any individual or group is. So my next thought is, what kinds of social infrastructures should we have to manage the way people use an AI? This is where I like to draw from standpoint epistemology and ask myself who gets access to the "conversation" and who doesn't. We do have conversations about the kinds of technologies we do and do not want to operate around Wikipedia, but I'd like to extend the notion of "conversation" to technology development as well. Who isn't involved in technology development but should be? Wikipedia is an incredible space to be exploring how this works out because it's open and there's no (or little) money to be made. We can openly discuss what techs we want, which ones we don't want, and we can experiment with new things. My bet is that when you lower barriers to both the human-language and technology conversations, we'll all become more articulate in discussing the technologies we want and don't want. And that out of this will come ethical systems. As a scientist, if I'm wrong, that'll be very interesting and I look forward to learning from that.
evonfriedland4 karma
Hi Aaron. Earlier today I saw a tweet about Robert West's work growing Wikipedia across languages. His article highlights missing articles, but I was curious about disparities in content for different language versions of the "same" article.
I chose a topic: a country leader with a long history of corruption. The English language version covers it, but the Wikipedia for the country does not (despite being otherwise thorough). I checked out the page's history and, sure enough, contributions mentioning corruption allegations were being reverted as "biased".
How might the tools you work on support relatively small/new Wikipedia communities and in countries where the contributor ecosystem may not be as robust?
How can they help ensure that Wikipedia sites in all languages represent high quality information?
halfak7 karma
This is a great question. I'm going to respond to the bit that's easier for me first and then I'll address the hard part second.
How might the tools you work on support relatively small/new Wikipedia communities and in countries where the contributor ecosystem may not be as robust?
Right, so we're targeting small/new communities that are growing quickly for support with machine learning models and tools, because these help the current communities scale up their processes (vandal fighting, work routing, etc.). It's really hard when your small community suddenly gets very popular. Within the Wikimedia Foundation, we've adopted the language "Emerging Communities" when discussing these folks and their needs.
How can they help ensure that Wikipedia sites in all languages represent high quality information?
OK now the hard part. This is a great question. All of the individual wiki communities have cultural and policy differences, so it's not clear that an article on one wiki should cover the same content as an article on another wiki. That said, the problem you describe is still deeply concerning and likely an example of bad behavior we like to call ownership of content. One way that an algorithm might help is by highlighting what's missing in one language's article but present in another's. One of my collaborators at Northwestern (Dr. Hecht) has been studying these types of content discrepancies for years. See https://pdfs.semanticscholar.org/cf39/78c8ed9d907c44207ca410fdfc8e345b15f0.pdf for some of his work. We don't have tools that people can use yet, but we're working on developing better strategies (content translation, gap analyses, etc.), so we'll likely be able to start highlighting these discrepancies soon. Ultimately, it'll be up to our volunteers to decide what belongs. But hopefully, having clear indications of what's missing will make it more difficult to police content contributions from an ownership point of view.
ConvenientChristian1 karma
Which wikis are currently "Emerging Communities"? What kind of statistics are used to decide which wikis have that status?
halfak1 karma
Here's a list that's been curated by the Community Engagement team at the Wikimedia Foundation: https://meta.wikimedia.org/wiki/Community_Engagement/Defining_Emerging_Communities
It seems the Principles section at the top of the lists of included countries provides the criteria for inclusion.
mhashemi3 karma
Hi Aaron! I've wondered a lot lately, how big do you think Wikipedia will get? We're currently at 5.4 million articles on English Wikipedia, do you see that going to 50, or 500 million? If so, how do you think Wikipedia can get there?
Relatedly, do you think that the old camps of deletionist/inclusionist are still prevalent and relevant today? Do you tend toward one side or the other?
halfak7 karma
Fun question. Here's an essay I wrote on the subject. https://meta.wikimedia.org/wiki/User:EpochFail/When_will_wikipedia_be_done
Based on my estimates of labor hours into Wikipedia and how many articles we have yet to write, I'm guessing it'll take us about 5000 years to finish Wikipedia assuming we don't have a big population increase in the meantime.
See also this great essay by Emijrp: https://en.wikipedia.org/wiki/User:Emijrp/All_human_knowledge
Generally, I'm an inclusionist. A big part of my push towards building article quality prediction models is to make a clear differentiation between new articles that are clearly problematic (spam, vandalism, personal attacks) and new articles that are just of borderline notability. I think we do better when we focus on trimming the bad and less on trimming the not-notable-enough.
TimeIsPower3 karma
What do you think about the concept of AI-written pages in the long-term? Perhaps there could be AI-based tools that could be used to help human editors create articles / find information more easily? Just throwing ideas out there. Interesting to think what kind of work you deal with.
halfak9 karma
I'm skeptical that we'll have those soon. When we do, they'll probably draw heavily from https://www.wikidata.org/, our structured data store. In that wiki, there's a concept of an "item" that roughly corresponds to an article. Check out the Wikidata item for Douglas Adams: https://www.wikidata.org/wiki/Q42
In the short term, I think that we could probably build a deep net for generating a useful human-language-ish summary of an item by training it on wikidata items as an input and Wikipedia articles as an output.
There's a lot of nuance that goes into what and how to write a Wikipedia article that would be hard for a machine with only structured data to understand. E.g. check out https://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view#Undue_weight I love this policy. For example, it helps Wikipedia editors figure out where and how to report about a political scandal depending on how central the scandal is to the politician's notability.
TimeIsPower2 karma
I'm actually a Wikipedia editor, hence some of my interest. I wonder if maybe an AI could create a decent summary (even if awkwardly written), and then said summary could be rewritten to create stubs. If bias is an issue, then perhaps editors could also work those details out. While that does mean that a human element would still be required, perhaps the writing of some articles could at least be sped up a bit. Since new topics come into existence every day, the idea of having such tools to help expand is enticing.
halfak5 karma
Yeah! +1 for that. So there's something kind of like this that people are working on called "Article Placeholder" that builds placeholder articles based on information from Wikidata. See some research towards building those and this (kind of minimal) documentation about the system in development: https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder
There's also the content translation project that uses machine translation + a nice human interface to help build up articles from other wikis. See https://blog.wikimedia.org/2015/01/20/try-content-translation/ and this video for details on how that works.
jor1io3 karma
As a student, why do you believe there is such ill will toward Wikipedia among teachers?
halfak6 karma
OK so my thoughts on this center around process -- because a good process is where quality comes from. Wikipedia has a novel process for generating an information resource. It doesn't look like there's a review process in Wikipedia, but in my (relatively extensive) experience, Wikipedia's review process is too robust if anything! Check out http://stuartgeiger.com/bots-cyborgs-halfaker.pdf for a summary of how part of Wikipedia's review process works. When I started learning about how traditional encyclopedias review content (hint: not much), it gave me a lot more faith that Wikipedia's process compares favorably.
But one thing that I always stick on is that paper after paper looking at Wikipedia's quality generally concludes that it's awesome. I think people just don't realize this. Here's a good summary of the research as of about a year ago. http://orbit.dtu.dk/fedora/objects/orbit:135926/datastreams/file_33d9ed29-a577-42d5-8bb9-145c50c8a12b/content
OK. One other quick note. There's a few organizations working in this field. My employer, the Wikimedia Foundation, runs the Wikipedia Education Program. The separate Wiki Ed Foundation works in the US and Canada. These organizations support instructors who want to include Wikipedia in their classroom so that their students can get the most out of the experience.
halfak7 karma
OK, for the people who don't come to get beer with me once a week (month, these days), KtK stands for "Kill the Keg" and it's a weekly event at the local tavern where I meet up with many of my old buddies from the University of Minnesota. Now to answer your question, my favorite part of KtK is talking about good SciFi novels. I've been working on Daemon by Daniel Suarez and I like it a lot.
edit: small clarification
halfak2 karma
I ended up putting it down. It just ended up being a little too ... slow? Space opera-ish? I might still pick it up again.
ReverseSolipsist2 karma
Why hasn't Wikipedia been able to keep blatant ideological bias out of many articles, even when they have existed for years, and not just articles that deal with people/events/etc., but many articles on scientific subjects?
TimReineke3 karma
Because people are people. Once you get beyond a bare-bones "stub" article, the editors' biases are going to come out, whether they want them to or not. And everyone has a bias. One of the reasons WP:NPOV is such a big deal is that it can be hard!
ReverseSolipsist1 karma
Traditional encyclopedias didn't have this problem to nearly the extent Wikipedia does, even though they were also written by people.
halfak2 karma
I'm skeptical of this claim. What evidence are you basing it on? What process does a traditional encyclopedia go through to deal with bias that Wikipedia does not? Is it a different type of bias or a different overall level?
Andrew_Davidson2 karma
At the Wikimania in London, you said that you were looking at algorithms for apportioning the contributions to Wikipedia articles -- a good way of establishing blame/credit. This sounded like a promising way of improving the incentive to do good work. How's that going, please?
halfak4 karma
WikiCredit! yeah, I'd still really like to push on that idea. I think it's a good one. I've been slowly building up the capacity to have a live system like this deployed. I've even done some analyses of productivity in Wikipedia using the proposed methods. Check out this talk where I show that 15% of all productive contributions in Wikipedia come from anons. I also show off some of the WikiCredit-style contribution graphs for a few editors.
A big problem we have with bringing this to production is that it's very computationally intense to generate the measurements we need. The Analytics team at the Wikimedia Foundation is slowly building up the infrastructure that will make WikiCredit easier to bring to life.
rci222 karma
Why does clicking on the first word on every page eventually always lead to "philosophy"?
halfak3 karma
So in Wikidata (a structured data thing like Wikipedia), we have a couple of properties called "instance-of" and "subclass-of". By their nature, they point to something at a higher level of abstraction. These properties should be present in every item in Wikidata for a good reason. Every thing is a type of thing!
So back to Wikipedia, I think this works roughly the same way. But the insight shared between Wikidata and Wikipedia is that the "instance-of" and "subclass-of" relationship is so important that it should be stated in the first sentence of an article. "Minneapolis is a county seat...", "A county seat is an administrative center", "An administrative centre is ... local government", ... "Public policy is a principled guide ...", ... "An academic discipline ... is a branch of knowledge", "Knowledge is a familiarity ...", and so on.
So I guess my answer is really that, as we click the first link, we tend to follow "instance-of" relationships until we get to "knowledge" or the study of knowledge ("philosophy") because everything in Wikipedia is knowledge.
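If you want to poke at that idea yourself, here's a rough sketch that walks up the "instance-of"/"subclass-of" chain (properties P31 and P279) using Wikidata's public per-item JSON (loop detection and handling for items without those claims are omitted):

```python
import requests

INSTANCE_OF, SUBCLASS_OF = "P31", "P279"

def parent_item(qid):
    """Return the first 'instance of' or 'subclass of' target for a Wikidata item."""
    data = requests.get(
        f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    ).json()
    claims = data["entities"][qid]["claims"]
    for prop in (INSTANCE_OF, SUBCLASS_OF):
        if prop in claims:
            return claims[prop][0]["mainsnak"]["datavalue"]["value"]["id"]
    return None

# Follow the chain upward from Douglas Adams (Q42) for a few hops.
qid = "Q42"
for _ in range(5):
    print(qid)
    qid = parent_item(qid)
    if qid is None:
        break
```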
ConvenientChristian2 karma
How useful is it for your algorithm if a user registers an account compared to the user doing an anonymous edit through his IP address? Would it help your algorithm do much better work if posting were restricted to registered accounts?
halfak3 karma
First, I would certainly not want to restrict IP editors. Most are highly productive, and when we've experimented with strongly suggesting that they register an account, it caused a serious drop in overall productivity. See https://meta.wikimedia.org/wiki/Research:Asking_anonymous_editors_to_register
For vandalism prediction, it might be possible to get a better sense for a user's experience level if they register an account, but all else held constant, the algorithm should make its prediction based on the quality of the edit itself. We still do include reputation measures in our predictions because they are useful, but ultimately, they manifest as a bias against newcomers and anons. So we're working on ways to get more signal from the actual edit rather than the editor.
halfak7 karma
Hmm. I haven't seen it, but there are some auto-summarizing AIs that might be able to help me get caught up? That'd be an interesting experiment :D
Rogocraft1 karma
What is the weirdest troll edit to Wikipedia that you have seen?
halfak3 karma
OK so I have an example and I had the whole thing written up but then I realized that I should probably not give anyone any ideas.
Here's an essay about not giving vandals ideas: https://en.wikipedia.org/wiki/Wikipedia:Don%27t_stuff_beans_up_your_nose
That's one of my favorite Wikipedia essays. :) Along with https://en.wikipedia.org/wiki/Wikipedia:No_climbing_the_Reichstag_dressed_as_Spider-Man
halfak3 karma
There are a lot of hard problems that I work on. I love hard problems. How are we going to host a prediction model for every wiki and have it run in real time without spending our whole budget on server hardware? Fun problem!
But there are some problems that just make me feel exhausted. One that comes up time after time is that a very small, but vocal population in our editing community is very hostile towards newcomers and new projects. I work with newcomers and I start a lot of new projects. Sometimes, you'll fly under the radar and most people who show up to work with you just honestly want to contribute. But sometimes you'll get one of these hostile folks who wants to tear the project to shreds or who will rake a newcomer through the mud. I've learned that the best thing I can do in those types of situations is to take a wikibreak and work on something else. Working openly within a volunteer context brings with it a lot of benefits (I meet lots of cool people, I get to speak openly about my work, usually I have lots of volunteers helping me do my work, etc.), but this is one of the big drawbacks. It takes a lot of emotional labor to make it through one of those experiences. It takes courage to continue to operate out in the open.
hzotaqu1 karma
If artificial intelligence can learn to think and interpret, can it be neutral? And what do you think about the Wikipedia ban in Turkey? (Thanks for the AMA :D)
halfak2 karma
"think", "interpret", and "neutral" are kind of poorly defined with regards to AI, so that makes this question hard to answer directly. However, I do think there's something to be said for how AI can be consistent and predictable. In cases where we are looking at disparate impact of false-positives on, say, newcomers in Wikipedia -- it's easier to refine the model and re-check the false-positive rate when working with an AI than with a human. With an AI, I know exactly where it's biases originated (the training data) and that makes things much easier to inspect. Currently, a big part of my research is trying to figure out effective means for my users to inspect, interrogate, and audit an AI that they deal with.
Personally, I think that access to information should be a human right. I'm strongly anti-censorship even for speech that we don't like. I'd rather see bad ideas compete with good ideas than to have some individual or organization deciding which ideas are "good" and which are problematic.
If you haven't seen it, here's the Wikimedia Foundation's official response https://blog.wikimedia.org/2017/04/30/turkish-authorities-block-wikipedia/
Yoshyoka1 karma
Is it possible to use AI to fact-check articles to reduce author-based bias and better represent scientific consensus? For instance, if the author states that there is controversy in the scientific community on the factual accuracy of X, yet a meta-analysis of the subject states otherwise, can an automatic edit follow?
halfak2 karma
I don't think we'll have automation directly dealing with the issues you bring up. These issues require a much more complex and nuanced understanding of the state of the art in a given field than I suspect a machine will be able to muster in the near future. However, we can still do a lot with machines. There are types of AIs that essentially manage huge indexes (e.g. recommender systems like Netflix, search like Google, etc.) and they have great potential to shed light on issues of coverage bias. E.g. if we can get good at indexing statements from scientific papers, we might be able to highlight "controversies" that don't appear to play out in the literature. Ultimately, I think that this will only be a signal for human editors to work with, but it would make it much more difficult for someone to camp out on an article falsely claiming a controversy when there is none.
I look forward to the future where editors have more direct access to and better indexes of reference material. Right now, there's a big push to make structured data about reference material freely available. Check out https://meta.wikimedia.org/wiki/WikiCite_2017. This is only a first step, but I think it's going to make a huge difference on the scale of ~10 years. I've been investing lots of time and energy into building data processing utilities for extracting citations (and cited statements) from Wikipedia while my collaborators are figuring out ways to link together old data formatting standards into our structured database, Wikidata.
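As a flavor of what that extraction work involves, here's a toy sketch using the mwparserfromhell wikitext parser (the wikitext and citation values are made up; real pipelines handle many citation template variants across languages):

```python
import mwparserfromhell

# Made-up wikitext with one citation template and one maintenance tag.
wikitext = (
    "Example Person was an author."
    "{{cite book |title=Example Book |last=Author |first=A. |year=2003}} "
    "They wrote several novels.{{citation needed|date=May 2017}}"
)

code = mwparserfromhell.parse(wikitext)

# Pull out every {{cite ...}} template and its parameters.
for template in code.filter_templates():
    if str(template.name).strip().lower().startswith("cite"):
        params = {str(p.name).strip(): str(p.value).strip() for p in template.params}
        print(str(template.name).strip(), params)
```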
foodfighter1 karma
Hi - thanks for doing the AMA.
Wikipedia is an amazing resource. I was quite surprised when I read this article with the unfortunate click-baity title voicing concerns about Wikimedia's apparent growth in both revenue and spending in recent years.
It implied that Wikimedia might have lost its core focus and is no longer the lean, mean machine it was in years past.
Any comments from the inside on this article?
halfak6 karma
Hey! So I'm not in Finance, so it's hard for me to comment directly on the financial stuff, but I can talk to you about how I view my work.
So, I like to think about myself as "infrastructure" for volunteers. I do my best as a Wikimedia Foundation employee when I can enable many more volunteers to do something that they could not otherwise do. The main volunteers who I focus on are the community of tool developers who build a whole ecosystem of advanced technologies that directly support editors. By building ORES and making it available for tool developers, I'm acting as the infrastructure that they need to get their work done. And in return, I'm betting that they'll take advantage of what I've built so that we can solve some larger issues.
I also work with the volunteer research community around Wikimedia stuff. (We have an awesome community. See our mailing list (archives), @WikiResearch on Twitter and our monthly newsletter). I've been studying Wikipedia and doing large scale data analyses for years. I can bring my experience with the literature and with quirks in the data to newcomers to the research area to make sure they do the best work that they can do. Check out https://www.mediawiki.org/wiki/Mediawiki-utilities for a collection of python data processing utilities that I build and maintain to help researchers work with our data. I also do outreach at conferences, hackathons, and by visiting universities to socialize research opportunities that will benefit Wikimedia and our volunteers.
So I guess what I'm trying to say is that I view both my work and the Wikimedia Foundation as operating in an infrastructural role -- to allow our volunteers to do more than they could do otherwise. It's hard to reconcile that with the framing of that article.
Here are some additional links that might also be helpful:
Article from Quartz about the op-ed you mention: https://qz.com/978416/reddit-is-going-nuts-over-a-post-named-wikipedia-has-cancer/
Response from Wikimedia Foundation to Op-ed questions: Wikipedia_talk:Wikipedia_Signpost/2017-02-27/Op-ed#Sharing_some_additional_information
Edit: Permalink the signpost discussion
Edit2: More links for research community stuff.
coryrenton1 karma
is there anything interesting you've come across in terms of seeing the effects of financial incentives in stabilizing or destabilizing volunteer communities?
halfak3 karma
I'm familiar with some psychology here. See https://en.wikipedia.org/wiki/Overjustification_effect
Otherwise, I don't specifically study financial incentives with regards to volunteer work.
halfak3 karma
Minnesota is awesome. I like weather. It's hot in the summer, cold in the winter, and it rains & snows. :) Also, I get to live near the best national park in the world.
https://en.wikipedia.org/wiki/Boundary_Waters_Canoe_Area_Wilderness
How do I even get trophies? This isn't my main account anyway. It'll go dormant until I have something research-related to post on reddit again.
Edit: Oh! And I have never heard of the future of life institute. Sorry :S
Edit 2: I'm not sure why a mod deleted your question. Although it is slightly off topic "Why do you still live in MN?" there were parts that were relevant "Have you heard of the future of life inst?" and I'd already responded. Mods, please calm down and don't delete questions that I've responded to. It's OK to be slightly off topic in this thread. How else would you know that I'm totally not a robot? :)
Navi1511 karma
Hi, Aaron! When will our steel masters arise and wipe out the human race? Seriously, do you believe in strong AI capable of independent human-or-higher-level thinking?
halfak2 karma
I don't see strong AI coming very soon. But I do see AIs interacting with us in more facets of our lives. There's not really much you need to worry about with the types of AIs I build though. It turns out there's a large set of problems around Wikipedia that basic machine learning strategies can help with. For example, vandalism detection lends itself well to a Boolean classifier. Similarly, we can do quite well by using a multiclass classifier to predict which articles fall into which quality class. These AIs are simple and narrow. They may even be boring from a futurist perspective, but they have great potential for making Wikipedia better.
halfak1 karma
That's a really good question. I have a great job. If I was independently wealthy, I think I'd be doing the exact same thing. I have a compulsion to study social phenomena like Wikipedia, to build tools, and to talk about it. If you hear me talk about this stuff in person, you'll know for sure that I'm stoked. Check out https://www.youtube.com/results?search_query=Aaron+Halfaker When I'm excited, I get talking very fast. :)
Usually, when it comes to figuring out what to do next, my curiosity is a good guide. But so are the things I learn by talking with and working with Wikipedians. These days, I'm getting to the level of seniority where I'm taking on grad students and other volunteers who are really interested in doing behavioral research and building tools for Wikipedians, so I spend a lot of time and energy making sure they have what they need in order to be successful. I'm finding that this mentoring/advising work is something that gets me excited too.
TimReineke1 karma
False positives: How should anti-vandalism tools react to suspected vandalism in a way that avoids false positives? Even human editors make mistakes here from time to time, and these mistakes can depress new editor retention.
halfak4 karma
This is a good question. So, counter-vandalism tools that are in use today have a key assumption built in:
Wikipedia is a firehose of new edits. We need to filter out the bad edits.
But the problem here is that this sets up a false dichotomy. The "good edits" are fine, but "bad edits" is really overloaded with a lot of different things. But they all get the same response -- which can be roughly paraphrased as "Get out of my Wikipedia with your vandalism."
So one of the things that I've learned in my research is that there's a lot of good faith in "bad edits". Most of the edits that need to be reverted come from someone who is making a mistake -- or just not fully understanding what's expected of a contribution in Wikipedia. E.g. the Biography of living persons policy states that contributions without clear citation can be reverted outright, while contributions without clear citation on other articles should probably be tagged with [citation needed]. If you weren't aware of this policy and you made a "bad" contribution to a biography article, you'd be likely to be told "Get out of my Wikipedia with your vandalism."!
In my analysis, it looks like about 40% of new editors who register an account and start editing fall into the good-faith/accidentally "bad edit" group. These editors dominate the group of "bad" edits, so why do we react to them so strongly?
OK that brings me back to your question about the cost of false positives. In this case, even a true positive has cost, but a false positive is still pretty bad. Can you imagine making an edit as a newcomer that improves an article and being told "Get out of my Wikipedia with your vandalism."?
The way I'd like to address this is by bringing nuance to our language about "bad" stuff and integrating that nuance into how we think about tools and the routing of newcomers' contributions. Let's say an edit is "bad": was it saved in "good faith"? Is there anything redeemable about it? If it was good-faith but had problems, is there a WikiProject that they could be directed to where someone might like to help them learn the ropes? Or maybe we could send them to the Teahouse -- a Q&A forum for newbies. Further, maybe we could stop ignoring the newcomers who are making good edits and instead respond positively to them.
So here's how I'm hoping to push people towards adopting this type of nuance. I've split the notion of "good" and "bad" in the prediction models we have deployed in ORES. We have a model called "damaging" which predicts if an edit caused harm to an article and a totally separate model that predicts if an edit was saved in "goodfaith". The cool thing with these models is that you can intersect them to find accidental damage (damaging & goodfaith), vandalism (damaging & badfaith), or good edits (not damaging & goodfaith). We already have people experimenting with new work practices that would route newcomers who accidentally do damage to help spaces, and I even saw a proposal for a "thank-athon" where experienced Wikipedians would use the models to find a list of productive newcomers to thank for their contributions.
In this second vision of the boundary of Wikipedia, false positives are less problematic because in the worst case scenario, you might get an invite to a newcomer help space when you were already contributing productively. We have a long way to go, but I can see the progress already well underway.
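Mechanically, that intersection is cheap to build on: a tool just combines the two probabilities it gets back from ORES. Here's a rough sketch of what that routing logic could look like (the thresholds here are made up for illustration; real tools tune them per wiki):

```python
def triage(damaging_prob, goodfaith_prob,
           damaging_threshold=0.8, goodfaith_threshold=0.5):
    """Route an edit based on two independent ORES predictions.

    Thresholds are illustrative, not the values any deployed tool uses.
    """
    damaging = damaging_prob >= damaging_threshold
    goodfaith = goodfaith_prob >= goodfaith_threshold

    if damaging and not goodfaith:
        return "likely vandalism: queue for counter-vandalism review"
    elif damaging and goodfaith:
        return "good-faith mistake: route to a help space like the Teahouse"
    else:
        return "fine: maybe send a thank-you"

# A good-faith edit that still damaged the article gets routed to help, not reverted with hostility.
print(triage(damaging_prob=0.92, goodfaith_prob=0.71))
```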
Meta_Bot313 karma
Hi Aaron, thanks for doing the AMA. How does the ORES quality assurance service currently fit in with other vandalism detection methods, such as CluebotNG on English Wikipedia?