An automated fact check flag, intended to debunk inaccurate information about a popular image, shows up on a manipulated version of the image, resulting in a misogynist meme.
Summary: Many social media sites include fact checkers in order to either block or at least highlight information that is determined to be false or misleading. However, the fact check flags themselves can create a content moderation challenge.
Alan Kyle, a privacy and policy analyst, noticed this in late 2019 when he came across an image on Instagram, posted by a meme account called “memealpyro,” showing what appeared to be a great white shark leaping majestically out of the ocean. When he spotted the image, it had been blurred, with a notice that it had been deemed “false information” after being “reviewed by independent fact checkers.” When he clicked through to unblur the image, a small line of text next to it read “women are funny.” Beneath that was the fact checking flag: “See why fact checkers say this is false.”
Someone coming across this image with this fact check would likely assume that the fact check applies to the statement, leading to the ridiculous and misogynistic conclusion that women are not funny and that an independent fact checking organization had to flag a meme image suggesting otherwise.
As Kyle discusses, however, this seemed to be a deliberate attempt to exploit the fact that fact checkers had checked only one part of the content, in order to create the misogynistic meme. Others had been posting the same image (which was computer generated and not an actual photo) and claiming that it was National Geographic’s “Picture of the Year.” This belief was so widespread that National Geographic had to debunk the claim (though it did so by releasing other, quite real, images of sharks to appease those looking for cool shark images).
The issue, then, was that fact checkers had been trained to debunk the use of the image, on the assumption it was being posted with the false claim that it was National Geographic’s “Picture of the Year,” and Instagram’s system didn’t seem to anticipate that other, different claims might be attached to the same image. When Kyle clicked through to see the explanation, it addressed only the “Picture of the Year” claim (which was not made in this post), and (obviously) not the statement about women.
Kyle’s hypothesis is that Instagram’s algorithms were trained to flag the picture as false, and then possibly send the flagged image to a human reviewer -- who may have just missed that the text associated with this image was unrelated to the text for the fact check.
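To make that hypothesis concrete, here is a minimal sketch of the difference between flagging on the image alone and flagging on the image together with the claim attached to it. This is not Instagram’s actual system, which is not described in the source. Every name here (FactCheck, Post, flag_image_only, flag_image_and_claim, the "shark-cgi" identifier, and the keyword list) is hypothetical, and simple keyword matching stands in for whatever claim detection a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactCheck:
    image_id: str           # hypothetical stand-in for a fingerprint of the debunked image
    claim_keywords: tuple   # phrases suggesting the debunked claim is being made
    explanation: str


@dataclass(frozen=True)
class Post:
    image_id: str
    caption: str


# Hypothetical record for the debunked "Picture of the Year" claim.
SHARK_CHECK = FactCheck(
    image_id="shark-cgi",
    claim_keywords=("national geographic", "picture of the year"),
    explanation="This image is computer generated and did not win a National Geographic award.",
)


def flag_image_only(post: Post, check: FactCheck) -> bool:
    """Flag any post that reuses the image, whatever its caption says."""
    return post.image_id == check.image_id


def flag_image_and_claim(post: Post, check: FactCheck) -> str:
    """Flag only when the debunked claim appears alongside the image.

    A matching image with an unrelated caption goes to a human reviewer
    instead of receiving the "false information" label automatically.
    """
    if post.image_id != check.image_id:
        return "no_flag"
    caption = post.caption.lower()
    if any(keyword in caption for keyword in check.claim_keywords):
        return "flag"
    return "human_review"


meme = Post(image_id="shark-cgi", caption="women are funny")
hoax = Post(image_id="shark-cgi", caption="National Geographic's Picture of the Year!")

print(flag_image_only(meme, SHARK_CHECK))       # the image-only rule flags the meme
print(flag_image_and_claim(meme, SHARK_CHECK))  # the claim-aware rule defers to a reviewer
print(flag_image_and_claim(hoax, SHARK_CHECK))  # the original hoax is still flagged
```

Under these assumptions, the image-only rule reproduces the behavior Kyle observed, while the claim-aware rule still depends on a reviewer noticing that the caption and the fact check are unrelated.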
Decisions to be made by Instagram:
- If a caption and a picture must be combined to be designated as false information, how should Instagram’s fact checkers handle cases where the two appear separately?
- How should fact checkers handle mixed media content, in which text and graphics or video may be deliberately unrelated?
- Should automated tools be used to flag viral false information in a way that might be gamed?
- How much human review should be used for algorithmically flagged “false” information?
Questions and policy implications to consider:
- When an algorithm automatically applies fact check flags, how will users with malicious intent try to game the system, as in the above example?
- Is fact checking the right approach to “meme’d” information that is misleading, but not in a meaningful way?
- Would requiring fact checking across social media lead to more “gaming” of the system as in the case above?
Resolution: As Kyle himself concludes, situations like this are somewhat inevitable, as the way content moderation is set up works against those trying to deal accurately with content like the post described above:
There are many factors working against the moderator making the right decision. Facebook (Instagram’s parent company) outsources several thousand workers to sift through flagged content, much of it horrific. Workers, who moderate hundreds of posts per day, have little time to decide a post’s fate in light of frequently changing internal policies. On top of that, much of these outsourced workers are based in places like the Philippines and India, where they are less aware of the cultural context of what they are moderating.
The Instagram moderator may not have understood that it’s the image of the shark in connection to the claim that it won a NatGeo award that deserves the false information label.
The challenges of content moderation at scale are well documented, and this shark tale joins countless others in a sea of content moderation mishaps. Indeed, this case study reflects Instagram’s own challenged content moderation model: to move fast and moderate things. Even if it means moderating the wrong things.