[cf. concept creep, Levari et al 2018] Recent years have seen
debate about whether depictions of inherently evil monster races such as orcs
in role playing games or literature/movies such as Lord of
the Rings could be considered racist. Although such decisions may be subjective, little data has been produced to inform the debate regarding how critical
an issue this is. In particular, does consuming such material relate to racism in the real world, or do a majority of individuals, particularly people of color,
consider such depictions racist?
The current study sought to address these issues in a sample of 308 adults (38.2% non-White), a subset of whom (17%) were players of the role-playing game
Dungeons and Dragons.
Playing Dungeons and Dragons (D&D) was not associated with greater ethnocentric attitudes (one facet
of racism). Only 10.2% found a depiction of orc monsters as inherently evil to be offensive. However, when later asked the blunter question of
whether the same depiction was racist, the number jumped to 34.0%, with women particularly inclined to endorse this position.
This suggests asking people about racism may prime
them to see racism in material they hadn’t previously found to be offensive. Neither participant race nor history playing the D&D game was associated with perceptions of offensiveness or racism.
One of John Nash’s first papers deliberately used every Greek letter.
For the film A Beautiful Mind I used this paper for writing on
his dorm room window. As luck would have it, a widely circulated publicity still showed Russell Crowe intent behind “0 < π < 1” taken straight from that paper.
Suffice to say this was divisive within the math community. Half of us can’t imagine Pi meaning anything besides, um, Pi. The other half didn’t even blink.
Someone shared with me a hilarious email exchange within the Berkeley math department, wondering if the math consultant was deliberately trying to make Russell
Crowe look bad.
I got the chance to edit an interview with John Nash for the DVD extras, where he bragged to Ron Howard about using every Greek letter. I left that in.
Reality shifting (RS) is a trendy mental activity that emerged abruptly following the flare-up of the COVID-19 pandemic in 2020 and seems to be practiced
mainly by members of the post-millennial generation. RS, described as the experience
of being able to transcend one’s physical confines and visit alternate, mostly fictional, universes, is discussed by many on Internet platforms. One RS forum boasts over 40,000 members and RS clips on some social media platforms have been viewed over 1.7
billion times…The pertinent hashtag #realityshifting on TikTok has accumulated over 706 million views (TikTok, #realityshifting, n.d.-a)
while #shiftingrealities has accrued over 1.8 billion views (TikTok #shiftingrealities, n.d.-b). The term has also received mainstream
media coverage, as exemplified by a recent story in the Washington Post (Andrews 2021). Here is
how a member described her practice online:
Shifting is a very strange experience. It’s like an extremely vivid dream, yet it’s more real than any dream I’ve ever had. Before I plan on shifting, I write
myself a script in the notes app on my phone, in which I plan exactly what happens in the desired reality. This makes it easier to visualize exactly what I want
to happen—so I might script that I want to go to Hogwarts and for Draco to be my boyfriend, or that he will flirt with me.
The experience of shifting is reportedly facilitated by specific induction methods involving relaxation, concentration of attention, and autosuggestion. Some
practitioners report a strong sense of presence in their desired realities, reified by some who believe in the concrete reality of the alternate world they shift
to. One of the most popular alternate universes involves environments adopted from the Harry Potter book and film series.
…We describe the phenomenology of RS as reported online and then compare it to related phenomena such as hypnosis, tulpamancy, dissociation, immersive and maladaptive daydreaming, and lucid dreaming. We propose a theoretical model of interactive factors giving rise to RS, conclude that it is an important, uninvestigated emerging phenomenon, and propose future research directions.
…Respawning: Respawning is a radical manifestation of the escapist psychological role RS can play. Some RS practitioners are motivated to
permanently sever their ties with the current reality (CR) and live in an alternate desired reality (DR) of choice, opting to leave their “clones” (ie. someone who
will continue interacting in the CR) behind and leaving the CR forever (eg. Madame Lovi 2003).
Can visual artworks created using generative visual algorithms inspire human creativity in storytelling? We asked writers to write creative stories from a
starting prompt, and provided them with visuals created by generative AI models from the same prompt. Compared to a control group, writers who used the visuals as
a story-writing aid wrote statistically-significantly more creative, original, complete, and visualizable stories, and found the task more fun. Of the generative
algorithms used (BigGAN, VQGAN, DALL-E, CLIPDraw), VQGAN was the most preferred. The control group that did not view the visuals did
significantly better in integrating the starting prompts. Findings indicate that cross-modality inputs by AI can benefit divergent aspects of creativity in
human-AI co-creation, but hinder convergent thinking.
Data annotation is a time-consuming and labor-intensive process for many NLP tasks. Although there exist various
methods to produce pseudo data labels, they are often task-specific and require a decent amount of labeled data to start with. Recently, the immense
language model GPT-3 with 175 billion parameters has achieved tremendous improvement across many few-shot learning
tasks. In this paper, we explore ways to leverage GPT-3 as a low-cost data labeler to train other models. We find that, to make the
downstream model achieve the same performance on a variety of NLU and NLG tasks, it
costs 50% to 96% less to use labels from GPT-3 than using labels from humans. Furthermore, we propose
a novel framework of combining pseudo labels from GPT-3 with human labels, which leads to even better
performance with a limited labeling budget. These results present a cost-effective data labeling methodology that is generalizable to many practical applications.
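A minimal sketch of the labeling loop being described, assuming the legacy OpenAI completions SDK; the few-shot prompt, the sentiment task, and the budget-mixing helper are illustrative, not the paper's framework:

```python
# Sketch: pseudo-labeling with GPT-3, then mixing with human labels.
# Assumes the legacy OpenAI completions SDK; the few-shot prompt, the
# sentiment task, and the mixing helper are illustrative.
import openai

FEW_SHOT = ("Review: A wonderful, moving film.\nLabel: positive\n\n"
            "Review: Dull and far too long.\nLabel: negative\n\n")

def gpt3_label(text: str) -> str:
    """Ask the model for a label for one unlabeled example."""
    resp = openai.Completion.create(
        engine="davinci",
        prompt=FEW_SHOT + f"Review: {text}\nLabel:",
        max_tokens=1,
        temperature=0.0,          # deterministic labeling
    )
    return resp.choices[0].text.strip()

def build_training_set(unlabeled, human_labeled, budget_fraction=0.1):
    """Spend the limited human budget on a slice of the data and
    pseudo-label the rest: the hybrid strategy the paper motivates."""
    n_human = int(len(unlabeled) * budget_fraction)
    mixed = list(human_labeled[:n_human])
    mixed += [(x, gpt3_label(x)) for x in unlabeled[n_human:]]
    return mixed
```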
This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages. We evaluate a
collection of such models (with between 244M and 137B parameters) on two new benchmarks, MBPP and MathQA-Python,
in both the few-shot and fine-tuning regimes. Our benchmarks are designed to measure the ability of these models to synthesize short Python programs from
natural language descriptions. The Mostly Basic Programming Problems (MBPP) dataset contains 974 programming
tasks, designed to be solvable by entry-level programmers. The MathQA-Python dataset, a Python version of the MathQA benchmark, contains 23,914 problems
that evaluate the ability of the models to synthesize code from more complex text. On both datasets, we find that synthesis performance scales log-linearly with
model size. Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59.6 percent of the problems from MBPP using few-shot learning with a well-designed prompt. Fine-tuning on a held-out portion of the dataset improves performance by
about 10 percentage points across most model sizes. On the MathQA-Python dataset, the largest fine-tuned model achieves 83.8 percent accuracy. Going further, we
study the model’s ability to engage in dialog about code, incorporating human feedback to improve its solutions. We find that natural language feedback from a
human halves the error rate compared to the model’s initial prediction. Additionally, we conduct an error analysis to shed light on where these models fall short
and what types of programs are most difficult to generate. Finally, we explore the semantic grounding of these models by fine-tuning them to predict the results of
program execution. We find that even our best models are generally unable to predict the output of a program given a specific input.
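The functional-correctness style of evaluation used here is easy to picture: run a candidate program against a task's assert-based test cases. A minimal sketch (the task shown is illustrative, not an actual MBPP item; real harnesses sandbox the execution):

```python
# Functional-correctness check in the MBPP style: execute a candidate
# program, then its assert-based tests, in a fresh namespace. (Real
# harnesses sandbox this; exec-ing model output directly is unsafe.)

def passes_tests(candidate_code: str, test_asserts: list) -> bool:
    env = {}
    try:
        exec(candidate_code, env)      # define the candidate function
        for t in test_asserts:
            exec(t, env)               # each test is an assert statement
        return True
    except Exception:
        return False

candidate = """
def first_repeated_char(s):
    seen = set()
    for c in s:
        if c in seen:
            return c
        seen.add(c)
    return None
"""
tests = ["assert first_repeated_char('abba') == 'b'",
         "assert first_repeated_char('abc') is None"]
print(passes_tests(candidate, tests))  # True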
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct
production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from
docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing
working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals
its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the
potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.
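The repeated-sampling result is scored with the paper's unbiased pass@k estimator, which has a numerically stable closed form:

```python
# Unbiased pass@k estimator from the paper: draw n samples per problem,
# count the c that pass the tests, and estimate the chance that at least
# one of a random size-k subset passes.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(100, 12, 10))  # e.g. 100 samples per problem, 12 passing
```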
When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few
examples. Here, we present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language). Using
aligned image and caption data, we train a vision encoder to represent each image as a sequence of continuous embeddings, such that a pre-trained, frozen language
model prompted with this prefix generates the appropriate caption. The resulting system is a multimodal few-shot learner, with the surprising ability to learn a
variety of new tasks when conditioned on examples, represented as a sequence of multiple interleaved image and text embeddings. We demonstrate that it can rapidly
learn words for new objects and novel visual categories, do visual question-answering with only a handful of examples, and make use of outside knowledge, by
measuring a single model on a variety of established and new benchmarks.
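A sketch of this prefix idea, with illustrative module names and shapes rather than the paper's exact architecture: a trainable vision encoder emits a few pseudo-token embeddings that are simply prepended to the frozen LM's input sequence.

```python
# Sketch of the prefix-conditioning setup: a trainable vision encoder
# maps image features to a short sequence of embeddings prepended to the
# frozen LM's token embeddings. Shapes and names are illustrative.
import torch
import torch.nn as nn

class VisualPrefix(nn.Module):
    def __init__(self, img_feat_dim=2048, lm_dim=768, prefix_len=2):
        super().__init__()
        self.proj = nn.Linear(img_feat_dim, lm_dim * prefix_len)
        self.prefix_len, self.lm_dim = prefix_len, lm_dim

    def forward(self, img_features):                  # (B, img_feat_dim)
        out = self.proj(img_features)
        return out.view(-1, self.prefix_len, self.lm_dim)

encoder = VisualPrefix()
img = torch.randn(4, 2048)     # stand-in for CNN image features
txt = torch.randn(4, 10, 768)  # stand-in for frozen-LM token embeddings
lm_input = torch.cat([encoder(img), txt], dim=1)
print(lm_input.shape)          # torch.Size([4, 12, 768]); only the
                               # encoder receives gradients in training
```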
State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and
adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. To this end,
we introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters in a
data-driven fashion. Concretely, GBST enumerates candidate subword blocks and learns to score them in a position-wise
fashion using a block scoring network. We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte
level. Via extensive experiments on English GLUE, multilingual, and noisy text datasets, we show that Charformer outperforms a
series of competitive byte-level baselines while generally performing on par and sometimes outperforming subword-based models. Additionally, Charformer is fast,
improving the speed of both vanilla byte-level and subword-level Transformers by 28%-100% while maintaining competitive
quality. We believe this work paves the way for highly performant token-free models that are trained completely end-to-end.
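A simplified sketch of the GBST idea (offsets, strides, and the pre-GBST convolution are omitted, and each block size is assumed to divide the sequence length): pool candidate blocks of several sizes, score them position-wise, and mix them with a softmax.

```python
# Simplified sketch of soft subword blocks: pool candidate blocks of
# several sizes, score each position-wise with a block scoring network,
# and mix with a softmax. Illustrative only, not the exact GBST module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftBlockMixer(nn.Module):
    def __init__(self, dim=64, block_sizes=(1, 2, 4)):
        super().__init__()
        self.block_sizes = block_sizes
        self.scorer = nn.Linear(dim, 1)      # block scoring network

    def forward(self, x):                    # x: (B, L, D) char embeddings
        cands, scores = [], []
        for b in self.block_sizes:
            pooled = F.avg_pool1d(x.transpose(1, 2), b, stride=b)
            up = pooled.repeat_interleave(b, dim=2).transpose(1, 2)
            cands.append(up)                 # (B, L, D), block-constant
            scores.append(self.scorer(up))   # (B, L, 1)
        w = torch.softmax(torch.stack(scores), dim=0)
        return (torch.stack(cands) * w).sum(0)

m = SoftBlockMixer()
print(m(torch.randn(2, 16, 64)).shape)       # torch.Size([2, 16, 64])
```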
Automatic evaluations for natural language generation (NLG) conventionally rely on token-level or
embedding-level comparisons with the text references. This is different from human language processing, for which visual imaginations often improve comprehension.
In this work, we propose ImaginE, an imagination-based automatic evaluation metric for natural language generation. With the help of CLIP and
DALL-E, two cross-modal models pre-trained on large-scale image-text pairs, we automatically generate an image as
the embodied imagination for the text snippet and compute the imagination similarity using contextual embeddings. Experiments spanning several text generation
tasks demonstrate that adding imagination with our ImaginE displays great potential in introducing multi-modal information into NLG evaluation, and improves existing automatic metrics’ correlations with human similarity judgments in many circumstances.
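The metric's skeleton is simple; a sketch with stand-in encoders (the real pipeline generates an image with a DALL-E-style model and embeds it with CLIP):

```python
# Sketch of the metric's skeleton with stand-in encoders: render an
# "imagination" for each text, embed it, and score cosine similarity.
# embed_imagination is a placeholder for DALL-E-style generation followed
# by CLIP image encoding.
import numpy as np

def embed_imagination(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % 2**32)  # placeholder
    return rng.standard_normal(512)

def imagine_score(candidate: str, reference: str) -> float:
    a, b = embed_imagination(candidate), embed_imagination(reference)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(imagine_score("a dog runs on the beach", "a dog running on sand"))
```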
The Classics “Shelf”: Genre, Hashtag, Advertising Keyword: This essay understands Goodreads users to be readers as well as “amateur critics”.
The Goodreads Algorithmic Echo Chamber: …The first key insight is that Goodreads purposely conceals and obfuscates its data from the
public. The company does not provide programmatic (API) access to the full text of its reviews, as some websites
and social media platforms do. To collect reviews, we thus needed to use a technique called “web scraping”, where one extracts data from the web, specifically from
the part of a web page that users can see, as opposed to retrieving it from an internal source. The Goodreads web interface makes it difficult to scrape large amounts of review data, however. It’s not
just difficult for researchers to collect Goodreads reviews. It’s difficult for anyone to interact with Goodreads reviews. Though more than 90 million
reviews have been published on Goodreads in the site’s history, one can only view 300 reviews for any given book in any given sort setting, a restriction that was
implemented in 2016. Previously, Goodreads users could read through thousands of reviews for any given book. Because there are a handful of ways to sort Goodreads
reviews (eg. by publication date or by language), it is technically possible to read through 300 reviews in each of these sort settings. But even when accounting
for all possible sort setting permutations, the number of visible and accessible Goodreads reviews is still only a tiny fraction of total Goodreads reviews. This
throttling has been a source of frustration both for Goodreads users and for researchers.
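A sketch of the collection strategy this implies: harvest the ~300 visible reviews under each sort setting and deduplicate. The URL parameters and CSS selector below are hypothetical placeholders, not Goodreads' actual markup:

```python
# Sketch of the collection strategy: harvest the ~300 visible reviews
# under each sort setting, deduplicating across settings. URL parameters
# and the CSS selector are hypothetical placeholders, not Goodreads'
# actual markup.
import requests
from bs4 import BeautifulSoup

SORT_SETTINGS = ["default", "newest", "oldest"]

def collect_reviews(book_url: str) -> dict:
    reviews = {}
    for sort in SORT_SETTINGS:
        page = requests.get(book_url, params={"sort": sort}, timeout=30)
        soup = BeautifulSoup(page.text, "html.parser")
        for node in soup.select(".review-text"):  # hypothetical selector
            reviews[node.get("id")] = node.get_text(strip=True)
    return reviews          # keyed by review id, so duplicates collapse
```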
Working within these constraints, we collected approximately 900 unique reviews for each classic book—300 default sorted reviews, 300 newest reviews, and 300
oldest reviews—for a total of 127,855 Goodreads reviews. We collected these reviews regardless of whether the user explicitly shelved the book as a “classic” or
not. We also explicitly filtered for English language reviews. Despite this filtering, a small number of non-English and multi-language reviews are included in the
dataset, and they show up as outliers in some of our later results. Compared to the archives of most readership and reception studies, this dataset is large and
presents exciting possibilities for studying reception at scale. But it is important to note that this dataset is not large or random enough to be a statistically
representative sample of the “true” distribution of classics reviews on Goodreads. We believe our results provide valuable insight into Goodreads and the classics.
Though the constraints of the Goodreads platform distort our dataset in certain ways, we tried to use this distortion to better scrutinize the influence of the
web interface on Goodreads users. For example, the company never makes clear how it sorts reviews by default, but we found that reviews with a combination of more
likes and more comments almost always appear above those with fewer—except in certain cases when there is, perhaps, another invisible social engagement metric such
as the number of clicks, views, or shares that a review has received. Since we collected data in multiple sort settings, we are able to go further than this basic
observation and investigate how exactly this default sorting algorithm shapes Goodreads users’ behavior, social interactions, and perceptions of the classics.
Based on our analysis, we found that the first 300 default visible reviews for any given book develop into an echo chamber. Once a Goodreads review appears in the
default sorting, in other words, it is more likely to be liked and commented on, and more likely to stay there (Figure 6). Meanwhile the majority
of reviews quickly age beyond “newest” status and become hidden from public view. These liking patterns reveal that Goodreads users reinforce certain kinds of
reviews, such as longer reviews (Figure 7), reviews that include a “spoiler alert” (Figure 9), and reviews written by a small set
of Goodreads users who likely have many followers (Table 2). If a review is prominently displayed by the default sorting algorithm, its
author may be more likely to go back and modify this review. More default-sorted reviews included the words “update” or “updated” than oldest or newest reviews
(Figure 8). In one especially interesting updated review, a Goodreads user raised her rating of Toni Morrison’s The Bluest Eye and
apologized for the way that her original, more negative review offended others and reflected her white privilege, which other Goodreads users had pointed out.
Dialogue systems in the form of chatbots and personal assistants are being increasingly integrated into people’s lives. These dialogue systems often have the
ability to adopt an anthropomorphic persona, mimicking a societal demographic to appear more approachable and trustworthy to users. However, the adoption of a
persona can result in the adoption of biases. We define persona biases as harmful differences in text (eg. varying levels of offensiveness or affirmations of
biased statements) generated from adopting different demographic personas. In this paper, we present the first large-scale study on persona biases in dialogue
systems and conduct analyses on personas of different social classes, sexual orientations, races, and genders. Furthermore, we introduce an open-source framework,
UnitPersonaBias, a tool to explore and aggregate subtle persona biases in dialogue systems. In our studies of the Blender and DialoGPT dialogue systems, we show that the choice of personas can affect the degree of harms in generated responses. Additionally,
adopting personas of more diverse, historically marginalized demographics appears to decrease harmful responses the most.
When fine-tuning pretrained models for classification, researchers either use a generic model head or a task-specific prompt for prediction. Proponents of
prompting have argued that prompts provide a method for injecting task-specific guidance, which is beneficial in low-data regimes. We aim to quantify this benefit
through rigorous testing of prompts in a fair setting: comparing prompted and head-based fine-tuning in equal conditions across many tasks and data sizes. By
controlling for many sources of advantage, we find that prompting does indeed provide a benefit, and that this benefit can be quantified per task. Results show
that prompting is often worth 100s of data points on average across classification tasks.
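The two prediction styles being compared can be sketched concretely; the pattern and verbalizers below are illustrative choices, not the paper's setup:

```python
# Sketch of prompt-based prediction: a pattern turns classification into
# masked-token prediction, with verbalizer tokens ("great"/"terrible")
# standing in for the labels. Pattern and verbalizers are illustrative.
# (Head-based fine-tuning would instead train a fresh classifier layer,
# e.g. AutoModelForSequenceClassification, on top of [CLS].)
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def prompt_scores(review: str) -> dict:
    text = f"{review} It was {tok.mask_token}."
    inputs = tok(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_pos]
    ids = tok.convert_tokens_to_ids(["great", "terrible"])
    return dict(zip(["positive", "negative"], logits[ids].tolist()))

print(prompt_scores("A taut, satisfying thriller."))
```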
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly-used
models still require an explicit tokenization step. While recent tokenization approaches based on data-derived subword lexicons are less brittle than manually
engineered tokenizers, these techniques are not equally suited to all languages, and the use of any fixed vocabulary may limit a model’s ability to adapt. In
this paper, we present CANINE, a neural encoder that operates directly on character sequences, without explicit
tokenization or vocabulary, and a pre-training strategy that operates either directly on characters or optionally uses subwords as a soft inductive bias. To use
its finer-grained input effectively and efficiently, CANINE combines downsampling, which reduces the input
sequence length, with a deep transformer stack, which encodes context. CANINE outperforms a comparable mBERT model by 2.8 F1 on TyDi QA, a challenging multilingual benchmark, despite having 28% fewer model parameters.
Limerick generation exemplifies some of the most difficult challenges faced in poetry generation, as the poems must tell a story in only five lines, with
constraints on rhyme, stress, and meter. To address these challenges, we introduce LimGen, a novel and fully automated system for limerick generation that
outperforms state-of-the-art neural network-based poetry models, as well as prior rule-based poetry models. LimGen consists of three important pieces: the Adaptive
Multi-Templated Constraint algorithm that constrains our search to the space of realistic poems, the Multi-Templated Beam Search algorithm which searches
efficiently through the space, and the probabilistic Storyline algorithm that provides coherent storylines related to a user-provided prompt word. The resulting
limericks satisfy poetic constraints and have thematically coherent storylines, which are sometimes even funny (when we are lucky).
GPT-3 can perform numerous tasks when provided a natural language prompt that contains a few training examples.
We show that this type of few-shot learning can be unstable: the choice of prompt format, training examples, and even the order of the training examples can cause
accuracy to vary from near chance to near state-of-the-art. We demonstrate that this instability arises from the bias of language models towards predicting certain
answers, eg. those that are placed near the end of the prompt or are common in the pre-training data. To mitigate this, we first estimate the model’s bias towards
each answer by asking for its prediction when given the training prompt and a content-free test input such as “N/A”. We then fit calibration parameters that
cause the prediction for this input to be uniform across answers. On a diverse set of tasks, this contextual calibration procedure substantially improves
GPT-3 and GPT-2’s average accuracy (up to 30.0% absolute)
and reduces variance across different choices of the prompt.
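The calibration step itself is a few lines; a sketch following the described procedure:

```python
# The calibration step, as described: measure the answer distribution on
# a content-free input and rescale so that it becomes uniform.
import numpy as np

def calibrate(p_content_free: np.ndarray):
    W = np.diag(1.0 / p_content_free)   # undo the contextual bias
    def apply(p: np.ndarray) -> np.ndarray:
        q = W @ p
        return q / q.sum()
    return apply

# e.g. the model answers "positive" 70% of the time even for "N/A":
cal = calibrate(np.array([0.7, 0.3]))
print(cal(np.array([0.7, 0.3])))  # [0.5 0.5]: bias removed
print(cal(np.array([0.9, 0.1])))  # genuinely confident inputs survive
```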
Prevailing methods for mapping large generative language models to supervised tasks may fail to sufficiently probe models’ novel capabilities. Using
GPT-3 as a case study, we show that 0-shot prompts can significantly outperform few-shot prompts. We suggest
that the function of few-shot examples in these cases is better described as locating an already learned task rather than meta-learning. This analysis motivates
rethinking the role of prompts in controlling and evaluating powerful language models. In this work, we discuss methods of prompt programming, emphasizing the usefulness of considering prompts
through the lens of natural language. We explore techniques for exploiting the capacity of narratives and cultural anchors to encode nuanced intentions and
techniques for encouraging deconstruction of a problem into components before producing a verdict. Informed by this more encompassing theory of prompt programming,
we also introduce the idea of a metaprompt that seeds the model to generate its own natural language prompts for a range of tasks. Finally, we discuss how these
more general methods of interacting with language models can be incorporated into existing and future benchmarks and practical applications.
While we instantaneously recognize a face as attractive, it is much harder to explain what exactly defines personal attraction. This suggests that attraction
depends on implicit processing of complex, culturally and individually defined features. Generative adversarial neural networks (GANs), which
learn to mimic complex data distributions, can potentially model subjective preferences unconstrained by pre-defined model parameterization.
Here, we present generative brain-computer interfaces (GBCI), coupling GANs with brain-computer interfaces. GBCI first presents a selection of images and captures
personalized attractiveness reactions toward the images via electroencephalography. These reactions are then used to control a ProGAN model, finding a representation that matches the features constituting an attractive image for an individual. We conducted
an experiment (N = 30) to validate GBCI using a face-generating GAN and producing images that are hypothesized to be individually attractive. In double-blind evaluation of the
GBCI-produced images against matched controls, we found GBCI yielded highly accurate results.
Thus, the use of EEG responses to control a GAN presents
a valid tool for interactive information-generation. Furthermore, the GBCI-derived images visually replicated
known effects from social neuroscience, suggesting that the individually responsive, generative nature of GBCI
provides a powerful, new tool in mapping individual differences and visualizing cognitive-affective processing.
…Thus, negative generated images were evaluated as highly attractive for other people, but not for the participant themselves. Taken together, the results
suggest that the GBCI was highly accurate in generating personally attractive images (83.33%). They also show that while
both negative and positive generated images were evaluated as highly attractive for the general population (respectively M = 4.43 and 4.90 on a scale of 1–5), only
the positive generated images (M = 4.57) were evaluated as highly personally attractive.
Qualitative results: In semi-structured post-test interviews, participants were shown the generated images that were expected to be found
attractive/ unattractive. Thematic analysis found predictions of positive attractiveness were experienced as accurate: There were no false positives
(generated unattractive found personally attractive). The participants also expressed being pleased with results (eg. “Quite an ideal beauty for a male!”; “I would
be really attracted to this!”; “Can I have a copy of this? It looks just like my girlfriend!”).
In news that surprises nobody, Goodreads last week quietly announced the
deprecation of their public APIs. And I mean really quietly—the only people who were told about this were those
unfortunate enough to have their existing API keys disabled without warning. Other than a small banner at
the top of the API docs which mentions vague “plans to retire these tools”, nobody else appears to have heard
anything from Goodreads, including those whose API keys remain active…So this is an “announcement” much in the way a
windshield announces its presence to bugs on a highway, and with the same consequences: dead bugs. Some developers have taken to the API discussion boards and blogs, but the overall impression I’m getting is grim acceptance. Really the surprising thing is how long
it took them: Amazon has been in charge at Goodreads for almost 8 years now, and I think we’ve all been expecting this to come at some point.
So why now? What’s changed? Well, the fact is the market’s changing—and Goodreads isn’t. Alternative options are starting to emerge, and since Goodreads has
forgotten how to innovate, it wants to use its market position to stifle innovation instead.
Based on a corpus including 150 novels by 40 authors, a stylometric survey was conducted to assess which modern authors were similar to Elena Ferrante, the pen
name used for eight novels, including “My Brilliant Friend” (Tuzzi & Cortelazzo 2018a and 2018b). The survey proved that Elena Ferrante’s writing style is remarkably different from that of the other main
contemporary Italian novelists with the notable exception of Domenico Starnone. Follow-up studies (Cortelazzo, Mikros & Tuzzi 2018 and another under way) show
that non-fiction works signed by Elena Ferrante may be attributed to different authors, ie., Anita Raja, Starnone again, and a collective author including the
staff of the E/O publishing house. This study complements the results obtained by previous research by assessing Elena Ferrante’s role in modern Italian fiction
following the publication of her latest novel, “The Lying Life of Adults”. In addition, the analysis of her similarities to Domenico Starnone was enhanced by means
of a larger corpus of his novels, thus corroborating the outcome of previous research.
[Keywords: Elena Ferrante, contemporary Italian literature, authorship attribution, similarity measure, text clustering]
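Stylometric comparisons of this kind typically rest on distances over frequent-word usage; as one concrete example (not necessarily these authors' exact method), Burrows' Delta:

```python
# Burrows' Delta: z-score the relative frequencies of the most frequent
# words across the corpus, then take mean absolute differences between
# texts. Small Delta = stylistic proximity (as with Ferrante/Starnone).
import numpy as np

def burrows_delta(freqs: np.ndarray) -> np.ndarray:
    """freqs: (n_texts, n_words) relative word frequencies."""
    z = (freqs - freqs.mean(0)) / freqs.std(0)
    n = len(freqs)
    return np.array([[np.abs(z[i] - z[j]).mean() for j in range(n)]
                     for i in range(n)])

freqs = np.random.default_rng(0).random((4, 50))
freqs /= freqs.sum(1, keepdims=True)      # toy stand-in for real counts
print(np.round(burrows_delta(freqs), 2))  # 4x4 pairwise distance matrix
```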
This work studies the widely adopted ancestral sampling algorithms for auto-regressive language models, which have received surprisingly little systematic study in the literature. We use
the quality-diversity (Q-D) trade-off to investigate three popular sampling algorithms (top-k, nucleus and tempered sampling). We focus on the task of open-ended
language generation. We first show that the existing sampling algorithms have similar performance. After carefully inspecting the transformations defined by
different sampling algorithms, we identify three key properties that are shared among them: entropy reduction, order preservation, and slope preservation. To validate the importance of the identified
properties, we design two sets of new sampling algorithms: one set in which each algorithm satisfies all three properties, and one set in which each algorithm
violates at least one of the properties. We compare their performance with existing sampling algorithms, and find that violating the identified properties could
lead to drastic performance degradation, as measured by the Q-D trade-off. On the other hand, we find that the set of sampling algorithms that satisfies these
properties performs on par with the existing sampling algorithms. Our data and code are available at
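A minimal sketch of the three transformations under study, applied to a single next-token distribution (real decoders operate on logits; the nucleus variant below is one common implementation):

```python
# The three transformations, applied to one next-token distribution p.
# Real decoders work on logits; this sketch works on probabilities.
import numpy as np

def top_k(p, k):
    q = np.where(p >= np.sort(p)[-k], p, 0.0)
    return q / q.sum()

def nucleus(p, top_p):
    order = np.argsort(p)[::-1]
    keep = np.cumsum(p[order]) <= top_p
    keep[0] = True                        # always keep the top token
    q = np.zeros_like(p)
    q[order[keep]] = p[order[keep]]
    return q / q.sum()

def tempered(p, t):
    q = p ** (1.0 / t)                    # t < 1 sharpens the distribution
    return q / q.sum()

p = np.array([0.5, 0.3, 0.15, 0.05])
for name, q in [("top-k(2)", top_k(p, 2)),
                ("nucleus(0.9)", nucleus(p, 0.9)),
                ("tempered(0.7)", tempered(p, 0.7))]:
    print(name, np.round(q, 3))   # each reduces entropy, preserves order
```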
Large generative language models such as GPT-2 are well-known for their ability to generate text as well as
their utility in supervised downstream tasks via fine-tuning. Our work is twofold: firstly we demonstrate via human evaluation that classifiers trained to
discriminate between human and machine-generated text emerge as unsupervised predictors of “page quality”, able to detect low quality content without any training.
This enables fast bootstrapping of quality indicators in a low-resource setting. Secondly, curious to understand the
prevalence and nature of low quality pages in the wild, we conduct extensive qualitative and quantitative analysis over 500 million web articles, making this the
largest-scale study ever conducted on the topic.
This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. We also
provide a dataset of more than 1.39 million instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag and
generate pipeline that identifies stylistic attributes and subsequently generates a sentence in the target style while preserving most of the source content. For
politeness as well as five other transfer tasks, our model outperforms the state-of-the-art methods on automatic metrics for content preservation, with a
comparable or better performance on style transfer accuracy. Additionally, our model surpasses existing methods on human evaluations for grammaticality, meaning
preservation and transfer accuracy across all six style transfer tasks. The data and code are located at https://github.com/tag-and-generate.
In the real world, RL agents should be rewarded for fulfilling human preferences. We show that RL agents implicitly learn the preferences of humans in their
environment. Training a classifier to predict if a simulated human’s preferences are fulfilled based on the activations of a RL agent’s neural network gets
0.93 AUC. Training a classifier on the raw environment state gets only 0.8 AUC.
Training the classifier off of the RL agent’s activations also does much better than training off of activations from an autoencoder. The human preference
classifier can be used as the reward function of an RL agent to make RL agents more beneficial for humans.
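The comparison is easy to sketch with synthetic data standing in for the real environment; here the activations carry more preference signal than the raw state, so their classifier scores a higher AUC:

```python
# Synthetic stand-in for the comparison: predict "preference fulfilled"
# from agent activations vs. from the raw environment state. The data
# generation is illustrative; only the evaluation recipe mirrors the text.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 1000)                                  # fulfilled?
activations = y[:, None] * 0.8 + rng.standard_normal((1000, 64))
raw_state = y[:, None] * 0.2 + rng.standard_normal((1000, 64))

for name, X in [("activations", activations), ("raw state", raw_state)]:
    clf = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
    auc = roc_auc_score(y[800:], clf.predict_proba(X[800:])[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```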
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize
perplexity of the next token. We also propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of a human-like multi-turn conversation. Our experiments show strong correlation between
perplexity and SSA. The fact that the best perplexity end-to-end trained Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within
reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher in absolute SSA than the existing chatbots we evaluated.
When I first finished Metal Gear Solid V: The Phantom
Pain, like so many other players, I was disappointed. MGSV was supposed to be the “Missing Link” in the
Metal Gear canon. It was that game that would reveal the bridge between the heroic
Big Boss of MGS 3, Portable Ops, and Peace Walker, and the grand historical villain of Metal Gear 1 and 2. As expressed by numerous launch trailers and Hideo Kojima tweets, MGSV was going to be a
tale of Big Boss’s fall into darkness, driven by an insatiable lust for revenge, a consummate anger lit by his enemies which would scorch his soul until nothing
was left but a power-hungry mad man who would threaten the world with nuclear war for the sake of his deluded ambitions. Instead we got an incredibly weird twist
which did little more than retcon patch a largely ignored plot hole in one of the least-played Metal Gear games. We found out that the final boss of Metal
Gear 1 was not Big Boss, but a body double,
who through surgery and hypnotherapy was made into almost an exact copy of the legendary soldier. Again, like most other players, when I first finished the game I
thought this was a neat trick, a typically crazy, convoluted, but seductively entertaining twist from one of my favorite storytellers of all time. But of course…
it was also a major let down.
…It wasn’t until I had put over 200 hours into my save file and replayed the entire game for a second time that the impact of Metal Gear Solid
V’s story really hit me. Not only does MGSV do exactly what it was advertised to do—reveal the descent of Big
Boss from hero to villain—but it does so in a subtle and narratively ambitious manner at a depth not seen in any video game since Metal Gear Solid 2: Sons of Liberty.
MGSV is the story of Big Boss’s fall from grace, but it’s also so much more than that. MGSV may very well be Kojima’s magnum opus. The game distills all of the Metal Gear series’ most important thematic elements into a
relatively simple story with a deceptively small scale. The reason the vast majority of players didn’t realize this is because, well, Kojima can be too
subtle for his own good…MGSV really is about Big Boss becoming a horrible monster worthy of every conceivable
condemnation. But that story is the bedrock layer hidden beneath a million other narrative layers designed to confuse and manipulate the player, in exactly the
same way Big Boss and Zero’s whole Phantom Snake project was designed to confuse and manipulate Venom Snake.
Recent work has presented intriguing results examining the knowledge contained in language models (LM) by having the LM fill in the blanks of prompts such as
“Obama is a _ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a _” may result
in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and
thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge
contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods
to automatically generate high-quality and diverse prompts, as well as ensemble methods to combine answers from different prompts. Extensive experiments on
the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from
31.1% to 39.6%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
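The ensembling step can be sketched in a few lines; the toy answer distributions below are illustrative:

```python
# Sketch of the ensembling step: mix the answer distributions the LM
# assigns under several paraphrased prompts for the same query. The toy
# numbers are illustrative.
import numpy as np

def ensemble(answer_probs_per_prompt, weights=None):
    P = np.stack(answer_probs_per_prompt)   # (n_prompts, n_answers)
    w = (np.ones(len(P)) / len(P)) if weights is None else np.asarray(weights)
    return w @ P

# three mined/paraphrased prompts for the same relation:
p1 = np.array([0.4, 0.5, 0.1])
p2 = np.array([0.7, 0.2, 0.1])
p3 = np.array([0.6, 0.3, 0.1])
print(ensemble([p1, p2, p3]))   # tighter estimate than any single prompt
```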
We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original
text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to
the bidirectional encoder), GPT (with the left-to-right decoder), and many other more recent pretraining schemes.
We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel
in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when
fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with
comparable training resources on GLUE and SQuAD, achieves new state-of-the-art results on a range of
abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also report ablation
experiments that replicate other pretraining schemes within the BART framework, to better measure which factors
most influence end-task performance.
Subword segmentation is widely used to address the open vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte
Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens.
While multiple segmentations are possible even with the same vocabulary, BPE splits words into unique sequences; this may
prevent a model from better learning the compositionality of words and being robust to segmentation errors. So far, the only way to overcome this
BPE imperfection, its deterministic nature, was to create another subword segmentation algorithm (Kudo, 2018). In
contrast, we show that BPE itself incorporates the ability to produce multiple segmentations of the same
word. We introduce BPE-dropout—a simple and effective subword regularization method based on and compatible with
conventional BPE. It stochastically corrupts the segmentation procedure of BPE, which leads to producing multiple segmentations within the same fixed BPE framework.
Using BPE-dropout during training and the standard BPE during inference
improves translation quality up to 3 BLEU compared to BPE
and up to 0.9 BLEU compared to the previous subword regularization.
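A sketch of the mechanism, assuming a toy merge table (the reference implementation differs in details, such as how exhausted merges are handled):

```python
# Sketch of BPE-dropout: run ordinary greedy BPE merges, but drop each
# applicable merge with probability `dropout`, so the same word receives
# different segmentations across training epochs.
import random

def bpe_segment(word, merge_ranks, dropout=0.0):
    symbols = list(word)
    while True:
        # applicable adjacent pairs that survive dropout, by merge priority
        pairs = [(merge_ranks[(a, b)], i)
                 for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
                 if (a, b) in merge_ranks and random.random() >= dropout]
        if not pairs:          # nothing mergeable (or everything dropped)
            return symbols
        _, i = min(pairs)      # apply the highest-priority surviving merge
        symbols[i:i + 2] = ["".join(symbols[i:i + 2])]

ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2, ("low", "er"): 3}
print(bpe_segment("lower", ranks))               # ['lower']: standard BPE
print(bpe_segment("lower", ranks, dropout=0.5))  # e.g. ['low', 'e', 'r']
```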
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful
technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of
approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by
introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training
objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from
our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question
answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data
set, pre-trained models, and code.
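The text-to-text convention is concrete: every task is serialized to an input string with a task prefix and a target string. Two of the running examples from the paper's overview figure:

```python
# Every task is serialized as "prefix + input" -> "target"; two of the
# running examples from the paper's overview figure:
examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
]
for source, target in examples:
    print(f"{source!r} -> {target!r}")
```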
Pretraining deep language models has led to large performance gains in NLP. Despite this success, Schick and Schütze
(2020) recently showed that these models struggle to understand rare words. For static word embeddings, this problem has been addressed by separately
learning representations for rare words. In this work, we transfer this idea to pretrained language models: We introduce BERTRAM, a powerful architecture based on BERT that is capable of inferring
high-quality embeddings for rare words that are suitable as input representations for deep language models. This is achieved by enabling the surface form and
contexts of a word to interact with each other in a deep architecture. Integrating BERTRAM into BERT leads to large performance increases due to improved representations of rare and medium frequency words on both a
rare word probing task and three downstream tasks.
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point
further model increases become harder due to GPU/TPU memory limitations and longer
training times. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of
BERT. Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better
compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence
coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the
GLUE, RACE, and SQuAD benchmarks while having fewer parameters
compared to BERT-large. The code and the pretrained models are available at
This vignette provides an introduction on how to fit distributional regression models with brms. We use the term distributional model to refer to a
model, in which we can specify predictor terms for all parameters of the assumed response distribution.
In the vast majority of regression model implementations, only the location parameter (usually the mean) of the response distribution depends on the predictors
and corresponding regression parameters. Other parameters (eg. scale or shape parameters) are estimated as auxiliary parameters assuming them to be constant across
observations. This assumption is so common that most researchers applying regression models are often (in my experience) not aware of the possibility of relaxing
it. This is understandable insofar as relaxing this assumption drastically increases model complexity and thus makes models hard to fit. Fortunately,
brms uses Stan on the backend, which is an incredibly flexible and powerful tool for estimating Bayesian models so that model complexity is much less
of an issue.
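In symbols, for a normally distributed response, a distributional model lets every parameter have its own linear predictor (a minimal sketch; the log link keeps the scale positive, and dropping the second predictor recovers ordinary constant-variance regression):

```latex
% Distributional regression for a normal response: both parameters get
% their own (possibly different) linear predictors.
y_i \sim \mathcal{N}(\mu_i, \sigma_i), \qquad
\mu_i = \mathbf{x}_i^{\top}\beta, \qquad
\log \sigma_i = \mathbf{z}_i^{\top}\gamma
```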
…In the examples so far, we did not have multilevel data and thus did not fully use the capabilities of the distributional regression framework of
brms. In the example presented below, we will not only show how to deal with multilevel data in distributional models, but also how to incorporate
smooth terms (ie., splines) into the model. In many applications, we have no idea, or
only a very vague one, of what the relationship between a predictor and the response looks like. A very flexible approach to tackling this problem is to use splines and
let them figure out the form of the relationship.
I read Disaster Artist on a whim when the movie came out. I’ve since gone through the audiobook 3.5 times and can
confidently say it’s one of my favorite books of all time. I expected just to hear funny anecdotes about the making of a famously awful movie and the man behind it, but I found so much more depth. In my eyes, Disaster Artist is an examination of insanity (which I am defining as
“the inability to perceive reality to the degree of low or non-functionality in regular life”). The book is a pushback against a subtle cultural norm that sees
crazy people as having some sort of gift or potential or insight that everyone else doesn’t.
This message hit me especially hard because I had my first real experience with a crazy person only a few months before I read Disaster Artist…We fired
our employee. We offered a small severance, about ¼ of his monthly salary, just to smooth things over. The employee demanded a full month’s salary, which he said
he needed to provide for his wife and child. Then he (an ex-marine) threatened to personally kill me if we didn’t pay him.
That 30 minute phone call was terrifying. I wasn’t actually scared of being murdered, and we never gave in to his demands, but it wasn’t until that call that I
understood what it meant to be crazy. It unnerved me in a sort of staring into the abyss way. This man was truly detached from reality. He either
didn’t know or could not understand the facts before him. When presented with reality, he would lash out in pain and anguish and fury at phantom targets. I would
make calm, reasonable arguments about how he had violated his work contract, hurt our business, hurt our clients, and lied to us, and he would respond with
nonsensical excuses, random tangents, blaming his personal life, and never ever coming close to acknowledging his own culpability.
I came away from the conversation with a mixture of pity, revulsion, and dread. I don’t know if this guy was bipolar, drug-addled, schizophrenic, or what, but I
was 100% sure that this man lived in a nightmare. Everything was confusing and nonsensical to him. The world was dark, malevolent, and couldn’t stop hurting him
even as he tried his best. I had an image of him sitting alone in his tiny apartment listening to that one student’s song over-and-over again on repeat while his
mind blurred between random scientific and historical topics until he could no longer fight the urge to pick up the phone and call me or someone like me who took
enough pity on him to politely listen for a few minutes until we made excuses and left him back alone in silence.
I see Tommy Wiseau, the creator of The Room and the subject of Disaster Artist, in the same category as my ex-employee. The form of their
insanity is somewhat different, but both men live tortured, miserable lives, and constantly lash out at bystanders because of it. However, unlike my ex-employee,
Wiseau is beloved by the masses precisely for his insanity. This is a dangerous, inaccurate, unfair reality, and in my opinion, is precisely what the
Disaster Artist book argues against.
Mathematical reasoning—a core ability within human intelligence—presents some unique challenges as a domain: we do not come to understand and solve mathematical
problems primarily on the back of experience and evidence, but on the basis of inferring, learning, and exploiting laws, axioms, and symbol manipulation rules. In
this paper, we present a new challenge for the evaluation (and eventually the design) of neural architectures and similar systems, developing a task suite of
mathematics problems involving sequential questions and answers in a free-form textual input/output format. The structured nature of the mathematics domain,
covering arithmetic, algebra, probability and calculus, enables the construction of training and test splits designed to clearly illuminate the capabilities and
failure-modes of different architectures, as well as evaluate their ability to compose and relate knowledge and learned processes. Having described the data
generation process and its potential future expansions, we conduct a comprehensive analysis of models from two broad classes of the most powerful
sequence-to-sequence architectures and find notable differences in their ability to resolve mathematical problems and generalize their knowledge.
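A hypothetical item in the suite's free-form question/answer style (illustrative only, not an actual dataset entry):

```
Question: Let f(x) = 2*x + 3. Calculate f(7).
Answer: 17
```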
The oft-repeated elevator pitch on Black Leopard Red Wolf, the buzzy new novel from Man Booker Prize winner Marlon James, is that it’s the African
Game of Thrones. (“I said that as a joke”, James protested in an interview this week.) To a certain extent, the comparison holds. Black Leopard Red
Wolf is a lush epic fantasy set in an enchanted and mythical Africa, filled with quests and magical beasts and vicious battles to the death. But it’s also a
much weirder, twistier book than the Game of Thrones parallels would suggest. Most notably, it is not driven by story. Black Leopard Red Wolf
actively resists any attempts on the reader’s part to sink inside the world of the book and lose themselves. It is deliberately opaque, on the level of sentence as
well as plot.
On the sentence level, James likes to withhold proper nouns until the last possible moment and then waits to reveal them just a little bit longer than you’d
think he should be able to get away with. That means his sentences are generally carried by verbs, and you don’t know who is doing what or why for long stretches
at a time: You just get an impression of anonymous limbs tangled together in sex or battle for some reason that is not immediately clear.
On the plot level, the quest for a missing boy that ostensibly powers the action of the book is so confusing, and has so little to do with the main character’s
motivations, that the rest of the characters are constantly complaining about it. “This child carries no stakes for you”, one says toward the end of the novel to
Tracker, our protagonist, and she’s correct. So is the poor sad giant who has the premise of the quest he is on explained to him multiple times and can only
conclude, “Confusing, this is.”
…In other words, we know that the quest will be futile and the child will die. We also know that the protagonist is not particularly interested in the quest. It
is nearly impossible for a reader to hook into the narrative. Yet Black Leopard Red Wolf spends hundreds and hundreds of pages tracking its many twists
and permutations. The opacity here is clearly a deliberate choice on James’s part. He is not interested in easy reads or straightforward stories. “The African
folktale is not your refuge from skepticism”, he told the New Yorker earlier this year. “It is not here to make things easy for you, to give you faith so you don’t
have to think.” And James plans to keep things challenging through the rest of the Dark Star trilogy, of which Black Leopard is only the first volume.
He’s modeling it on Showtime’s Rashomon-like series The Affair, he says, so that each volume will present the same events to the reader through a
different point of view. “The series is three different versions of the same story, and I’m not going to tell people which they should believe”, James says.
Angus Trumble on Dante Gabriel Rossetti and company’s curious but longstanding fixation with the furry oddity that is the wombat—that “most beautiful of God’s creatures”—which found its way into their poems, their
art, and even, for a brief while, their homes…the Pre-Raphaelites were not the first English to become enamoured by the unusual creature. Wombats captured the
attention of English naturalists as soon as they found out about them from early settlers, explorers, and naturalists at the time of first contact. The Aboriginal
word wombat was first recorded near Port Jackson, and though variants such as wombach, womback, the wom-bat and womat were noted, the present form of the name
stuck very early, from at least 1797. Beautiful drawings survive from the 1802 voyages of the Investigator and Le Géographe. Ferdinand Bauer, who
sailed with Matthew Flinders, and Charles-Alexandre Lesueur, who was in the rival French expedition of Nicolas Baudin, both drew the creature. These were engraved
and carefully studied at home. Wombats were admired for their stumpy strength, their patience, their placid, not to say congenial manners, and also a kind of stoic
determination. Occasionally they were thought clumsy, insensible or even stupid, but these isolated observations are out of step with the majority of accounts.
…The movie is noteworthy for having a rather unusual genesis. Originally, director Yoshiyuki Tomino was going to wrap up Amuro and Char’s storyline in Gundam ZZ, but mid-way through production he was given the go-ahead
to make a movie, forcing the plot of ZZ to be rewritten (details on its trope
page). In the meantime Tomino wrote the novel Hi-Streamer, but when Sunrise gave him the green light, he went back and wrote a second novel,
Beltorchika’s Children, which he specifically wrote to be adapted into a movie. However, Sunrise instead chose to use Hi-Streamer, with the final
film being a pretty straightforward adaptation of its second half.
Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks. Maximum-Likelihood (MLE) models trained with teacher forcing have
consistently been reported as weak baselines, where poor performance is attributed to exposure bias (Bengio et al 2015;
Ranzato et al 2015); at inference time, the model is fed its own prediction instead of a ground-truth token, which can lead to accumulating errors
and poor samples. This line of reasoning has led to an outbreak of adversarial-based approaches for NLG, on the account
that GANs do not suffer from exposure bias. In this work, we make several surprising observations
which contradict common beliefs. First, we revisit the canonical evaluation framework for NLG, and point out
fundamental flaws with quality-only evaluation: we show that one can outperform such metrics using a simple, well-known temperature parameter to
artificially reduce the entropy of the model’s conditional distributions. Second, we leverage the control over the quality /
diversity trade-off given by this parameter to evaluate models over the whole quality-diversity spectrum and find MLE
models constantly outperform the proposed GAN variants over the whole quality-diversity space. Our
results have several implications: (1) The impact of exposure bias on sample quality is less severe than previously thought, (2) temperature tuning provides a
better quality / diversity trade-off than adversarial training while being easier to train, easier to cross-validate, and less computationally expensive. Code to
reproduce the experiments is available at github.com/pclucas14/GansFallingShort
Adversarial Reprogramming has demonstrated success in utilizing pre-trained neural network classifiers for alternative classification tasks without modification
to the original network. An adversary in such an attack scenario trains an additive contribution to the inputs to repurpose the neural network for the new
classification task. While this reprogramming approach works for neural networks with a continuous input space such as that of images, it is not directly
applicable to neural networks trained for tasks such as text classification, where the input space is discrete. Repurposing such classification networks would
require the attacker to learn an adversarial program that maps inputs from one discrete space to the other. In this work, we introduce a context-based vocabulary
remapping model to reprogram neural networks trained on a specific sequence classification task, for a new sequence classification task desired by the adversary.
We propose training procedures for this adversarial program in both white-box and black-box settings. We demonstrate the application of our model by adversarially
repurposing various text-classification models including LSTM, bi-directional LSTM, and CNN for alternate classification tasks.
On June 4th, a group of lawyers shuffled into a federal court in Manhattan to argue over two trademark registrations. The day’s hearing was the
culmination of months of internet drama—furious blog posts, Twitter hashtags, YouTube videos, claims of doxxing, and death threats…They were gathered there that
day because one self-published romance author was suing another for using the word “cocky” in her titles. And as absurd as this courtroom scene was—with a federal
judge soberly examining the shirtless doctors on the cover of an “MFM Menage Romance”—it didn’t even begin to scratch the surface.
The fight over #Cockygate, as it was branded online, emerged from the strange universe of Amazon Kindle Unlimited, where authors collaborate and
compete to game Amazon’s algorithm. Trademark trolling is just the beginning: There are private chat groups, ebook exploits, conspiracies to seed hyper-specific
trends like “Navy SEALs” and “mountain men”, and even a controversial sweepstakes in which a popular
self-published author offered his readers a chance to win diamonds from Tiffany’s if they reviewed his new book…A genre that mostly features shiny, shirtless men
on its covers and sells ebooks for 99¢ a pop might seem unserious. But at stake are revenues sometimes amounting to a million dollars a year, with some authors
easily netting six figures a month. The top authors can drop $50,000 on a single ad campaign that will keep them in the charts—and see a worthwhile return on that investment.
…According to Willink, over the course of RWA, Valderrama told her about certain marketing and sales
strategies, which she claimed to handle for other authors. Valderrama allegedly said that she organized newsletter swaps, in which authors would promote each
other’s books to their respective mailing lists. She also claimed to manage review teams—groups of assigned readers who were expected to leave reviews for books
online. According to Willink, Valderrama’s authors often bought each other’s books to improve their ranking on the charts—something that she arranged, coordinating
payments through her own PayPal account. Valderrama also told her
that she used multiple email addresses to buy authors’ books on iBooks when they were trying to hit the USA Today list.
When Valderrama invited Willink to a private chat group of romance authors, Willink learned that practices like chart gaming and newsletter placement selling—and
much more—were surprisingly common.
…In yet more screencaps, members discuss the mechanics of “book stuffing.” Book stuffing is a term that encompasses a wide range of methods for taking advantage
of the Kindle Unlimited revenue structure. In Kindle Unlimited, readers pay $9.99 a month to read as many books as they want that are available through the KU
program. This includes both popular mainstream titles like the Harry Potter series and self-published romances put out by authors like Crescent and
Hopkins. Authors are paid according to pages read, creating incentives to produce massively inflated and strangely structured books. The more pages Amazon thinks
have been read, the more money an author receives.
…Book stuffing is particularly controversial because Amazon pays authors from a single communal pot. In other words, Kindle Unlimited is a zero-sum game. The
more one author gets from Kindle Unlimited, the less the other authors get. The romance authors Willink was discovering didn’t go in for clumsy stuffings of
automatic translations or HTML cruft; rather, they stuffed their books with ghostwritten content or repackaged,
previously published material. In the latter case, the author will bait readers with promises of fresh content, like a new novella, at the end of the book. Every
time a reader reads to the end of a 3,000-page book, the author earns almost 14 dollars. For titles that break into the top of the Kindle Unlimited charts, this
trick can generate a fortune.
In this paper, we propose a joint architecture that captures language, rhyme and meter for sonnet modelling. We assess the quality of generated poems using
crowd and expert judgements. The stress and rhyme models perform very well, as generated poems are largely indistinguishable from human-written poems. Expert
evaluation, however, reveals that a vanilla language model captures meter implicitly, and that machine-generated poems still underperform in terms of readability
and emotion. Our research shows the importance of expert evaluation for poetry generation, and that future research should look beyond rhyme/meter and focus on poetic language.
Deep neural networks are susceptible to adversarial attacks. In computer vision, well-crafted perturbations to images can cause neural networks to make
mistakes such as confusing a cat with a computer. Previous
adversarial attacks have been designed to degrade performance of models or cause machine learning models to produce specific outputs chosen ahead of time by the
attacker. We introduce attacks that instead reprogram the target model to perform a task chosen by the attacker—without the attacker needing to specify or
compute the desired output for each test-time input. This attack finds a single adversarial perturbation that can be added to all test-time inputs to a machine
learning model in order to cause the model to perform a task chosen by the adversary—even if the model was not trained to do this task. These perturbations can
thus be considered a program for the new task. We demonstrate adversarial reprogramming on six ImageNet classification models, repurposing these models to perform
a counting task, as well as classification tasks: classification of MNIST and CIFAR-10 examples presented as
inputs to the ImageNet model.
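The structure of the attack is simple enough to sketch: one shared, trainable perturbation wrapped around the small task input, fed through the frozen classifier whose first ten labels are re-read as digits. All sizes and names below are illustrative, not the paper’s code:

```python
import torch

IMAGENET_SIZE, MNIST_SIZE = 224, 28

# A single trainable perturbation (the "program") shared across all inputs.
program = torch.nn.Parameter(torch.zeros(3, IMAGENET_SIZE, IMAGENET_SIZE))
mask = torch.ones(3, IMAGENET_SIZE, IMAGENET_SIZE)
c = (IMAGENET_SIZE - MNIST_SIZE) // 2
mask[:, c:c+MNIST_SIZE, c:c+MNIST_SIZE] = 0   # leave a hole for the digit

def reprogram_input(digit):
    """Embed a 28x28 digit in the centre of an ImageNet-sized canvas and
    add the learned program everywhere else (tanh keeps pixels bounded)."""
    canvas = torch.zeros(3, IMAGENET_SIZE, IMAGENET_SIZE)
    canvas[:, c:c+MNIST_SIZE, c:c+MNIST_SIZE] = digit
    return canvas + torch.tanh(program) * mask

# Training (sketch): map the victim's first 10 ImageNet classes to the 10
# MNIST digits, and optimize `program` by cross-entropy through the frozen
# classifier -- the network itself is never modified.
```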
…a distasteful practice called “book stuffing” by some Kindle Unlimited authors. Kindle Unlimited is an Amazon program that works like Netflix for books: You
can read as much as you want for a flat monthly fee. For various reasons, Kindle Unlimited is filled with books written and self-published by independent authors,
many of them in the romance genre.
How do authors get compensated when readers pay a flat fee for the service? Amazon has created a pool of funds that authors are paid from, currently around
$22.5 million. Up until 2015, authors earned a flat fee for each download of their books. But the company noticed that many of these Kindle Unlimited books were
very, very short. So instead, Amazon began paying a bit less than half a cent for each page that was actually read. That’s how book stuffing was born.
It works like this. An Amazon author publishes a new book that’s, say, 300 pages long. At about half a cent per page, the author would earn about $1.50 every time that book
was read to the end. To beef up their earnings, book stuffers add several other already-published books, or a long series of newsletters, to the end of the book as
“bonus material.” Most stuffed books run near 3,000 pages, the maximum that Amazon will pay for. In the current system, an author could earn about $13.50 per book
this way, which is more than most authors earn from traditional publishers when their books are sold as hardcovers.
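The payout arithmetic is worth checking: the article’s own figures ($1.50 for a 300-page read, $13.50 for 3,000 pages) bracket a per-page rate between 0.45¢ and 0.5¢. A two-line sanity check:

```python
for rate in (0.0045, 0.005):   # dollars per page read
    print(f"rate ${rate}/page: 300pp -> ${300*rate:.2f}, 3,000pp -> ${3000*rate:.2f}")
```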
$1.2 million a year?
Serious book stuffers acquire email lists that they sometimes share with each other. They boost their sales by sending out promotional email to hundreds of
thousands of email addresses. They also spend a lot of money on Amazon Marketing Services, promoting their books as “sponsored” to Kindle Unlimited subscribers and
other Kindle shoppers. These tactics, in combination with artificially producing positive reviews (against Amazon’s rules), help them rank high in Amazon’s romance
category, crowding out authors who take a more traditional approach. Some book stuffers publish a new book every couple of weeks (they may use ghostwriters to
actually write the books), doing a new promotion for each one. In this way, observers report, they can earn as much as $100,000 per month.
…Why would anyone read through 2,700 pages of uninteresting bonus material? They usually don’t, but many authors do something that gets people to turn to the
last page of the book, such as promising a contest or giveaway (forbidden by Amazon rules), or putting some new and perhaps particularly racy content right at the
end of the book. On some devices, Amazon may simply be using the last page opened as a measure of how much of a book was “read.” Thus, the author gets full credit
for the book, even though the customer didn’t read all of it.
…Carter openly invited other authors to pay for the use of his “platform” to send out promotional emails to their own mailing lists and also share mailing lists
and cross-promote with other authors/book stuffers. In fact, he was so proud of his book stuffing talents that he posted his credo for the world to see in a
Kindle publishing forum:
Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of
understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the
final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a
single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is
captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that
models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some
shortcomings of current models. As part of this study, we release multiple human reference translations for two popular benchmarks.
This article looks at the case of Elena Ferrante, the (presumed) pseudonym of an
internationally successful Italian novelist, and has two objectives: first, to observe how her novels are positioned in the panorama of modern Italian literature
(represented by an ad hoc reference corpus—composed of 150 novels by forty different authors) and, second, to attempt to understand whether, amongst the authors in
the corpus, there are any that can be considered candidates for involvement in the writing of the novels signed Ferrante.
Consistent with these two objectives, the analyses also use two methods: correspondence analysis for the content mapping of the novels and Labbé’s intertextual
distances to establish a measure of similarity between the novels. In the results, we do not see the expected similarities with writers from the Naples area as
Elena Ferrante distinguishes herself with original literary products that, both in terms of theme and style, show her strong individuality.
Amongst the authors included, Domenico Starnone, who has been previously
identified by other investigations as the possible hand behind this pen name, is the author whose novels are most similar to Ferrante’s and which,
over time, have become progressively more similar.
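Labbé’s intertextual distance has a standard definition: rescale the longer text’s word frequencies down to the shorter text’s length, then sum the absolute frequency differences over the combined vocabulary. A minimal sketch (not the authors’ implementation):

```python
from collections import Counter

def labbe_distance(tokens_a, tokens_b):
    """Labbé's intertextual distance between two tokenized texts.

    The longer text's frequencies are scaled to the shorter text's length
    so the two are comparable; 0 = identical vocabulary profiles,
    1 = no overlap at all.
    """
    if len(tokens_a) > len(tokens_b):
        tokens_a, tokens_b = tokens_b, tokens_a
    na, nb = len(tokens_a), len(tokens_b)
    fa, fb = Counter(tokens_a), Counter(tokens_b)
    scale = na / nb
    vocab = set(fa) | set(fb)
    return sum(abs(fa[w] - fb[w] * scale) for w in vocab) / (2 * na)

print(labbe_distance("the cat sat on the mat".split(),
                     "the dog sat on the log".split()))
```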
There have been a few recommendation datasets for movies (Netflix, MovieLens) and music (Million Songs), but not for books. That is, until now. The dataset
contains six million ratings for the ten thousand most popular books (those with the most ratings). There are also:
books marked to read by the users
book metadata (author, year, etc.)
There are a few types of data here:
explicit feedback (ratings)
implicit feedback indicators (books marked to read)
tabular data (book info)
…All files are available on GitHub. Some of them are quite large, so GitHub won’t show their contents online. See samples for smaller CSV snippets. You can download
individual zipped files from releases.
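A minimal pandas sketch for getting started; file and column names follow the goodbooks-10k release as I understand it, so verify them against the repository:

```python
import pandas as pd

# File names as in the goodbooks-10k release (check the repo's README).
ratings = pd.read_csv("ratings.csv")   # user_id, book_id, rating (1-5)
books   = pd.read_csv("books.csv")     # book metadata: authors, year, title, ...
to_read = pd.read_csv("to_read.csv")   # implicit feedback: books marked to-read

# Most-rated books, joined with their titles.
top = (ratings.groupby("book_id").size()
              .sort_values(ascending=False).head(10)
              .rename("n_ratings").reset_index()
              .merge(books[["book_id", "title"]], on="book_id"))
print(top)
```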
Gidding hints that the house itself is doing the haunting, implying that the architectural environment is responsible for reflecting back the fears of
those within, teasing out their vulnerabilities, feeding upon them, and making them manifest. The house becomes a monster, a maleficent presence that resents its
human tenants. If the house can be read as a metaphor for the body, as is often the case in Gothic mansions and castles, then the occupants become its
consciousness, the archetypes inhabiting its ego and id. Then the house inevitably suffers from a mental schism, a multiple personality disorder. The characters
become those internal voices of nagging doubt and paranoia for the house… and it eventually suffers a mental breakdown.
Despite filming in England, the setting remained New England. Ettington Park in Stratford-upon-Avon was the spooky mansion that Robert Wise chose for Hill
House’s exteriors, reputedly selected from a list he sourced from the British Psychical Research Society of buildings considered to be genuinely haunted.
This is the first ‘character’ to appear in the film, emerging out of darkness and looking very eerie indeed, due to the inventive use of infra-red film stock.
It’s been argued that the house is the true star of the film, and I have to admit it turns in a memorable ‘performance’. This, though, has more to do with
marvellous production design by Elliot Scott and the huge labyrinthine sets built at Borehamwood. Corridors were made to converge or open out, creating a subtly
expressionistic feel and rooms were constructed slightly askew, sometimes with walls that angled inward. Scott went on to design Labyrinth (1986) and the
first two Indiana Jones sequels.
…The Haunting is regularly included in Top 10 lists of the scariest films ever made. But the special effects are limited to only a few ingenious
mechanical effects, as the terror is mostly the result of brilliant sound-design, clever use of shadows, and inventive camerawork.
Wise chose to shoot the film in Panavision’s wide format and every shot makes full use of it, with beautiful compositions and plenty of visual interest across
every inch of the screen. The otherworldly atmosphere and ominous tracking shots, enhanced by special lenses, work in tandem with the subtly distorted sets.
Wise had some problems sourcing the wide-angle lenses he needed, mainly because they didn’t exist at that time. He wanted the interior to look deep, dark, and
foreboding, seeming to move as if we were within a living thing. The available lenses just weren’t cutting it for him. He badgered Bob Gottschalk, president of
Panavision, until he let slip that wider lenses were in development at their optics labs. Gottschalk explained that they were early prototypes and the lenses
caused unacceptable distortions. This was exactly what Wise wanted! After signing a disclaimer to waive any legal repercussions, he became the first
director to use such wide angles, imbuing Hill House with its unique and disquieting visual personality.
The unique look of the film goes a long way to creating the brooding atmosphere, but the sound design was the real breakthrough. The slightest creak of
floorboard or sigh of draught makes audiences hold their breath to better listen, and then cacophonous groans and thuds really get the heart racing.
…Of course, our emotional involvement hinges on the performances of the actors. It seems that the personal circumstances and attitudes of the actors already
reflected the characters they were to play. Harris admits that she was suffering from a bout of depression during filming, and this inadvertently helped her play
the central role of the sensitive Eleanor, who feels isolated and shunned by her colleagues, and so becomes victim to the seductively malign atmosphere of the
house. Her performance is by turns fragile and disturbingly unhinged. The voice-over she provides, to share her character’s paranoia, might have looked corny
on paper to those American studio executives, but Harris delivers it so perfectly that it draws the sympathies of the audience. We feel for her, even as she seems
to succumb to madness and becomes the willing victim.
The Haunting stands alongside Night of the Demon (1957) and The Innocents (1961) as a defining classic in the cinema of the
supernatural. It has never been surpassed and its ‘presence’ is palpable in most intelligent psychological horror films to this day. If special effects had been
used more extensively, then it surely would have dated, but keeping the focus on mood and the psychological aspects of the narrative has ensured it remains as
effective as ever.
The chapters of this volume report the results of this endeavour that were first presented during the international workshop Drawing Elena Ferrante’s Profile in Padua on 7 September 2017 as part of the
3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The fascinating research
findings suggest that Elena Ferrante’s work definitely deserves “many hands” as well as an extensive effort to understand her distinct writing style and the
reasons for her worldwide success.
…In 2016, an Italian research team embarked on a study suitable for submitting to the international scientific community for debate. It collected a corpus
of 150 novels published in the last 30 years, written by 40 different Italian authors, and chosen according to precise parameters that took into account the main
hypotheses emerging over the years concerning the real identity of Elena Ferrante, and the general scenario of contemporary Italian literature. To submit their
findings to a broader scientific community for discussion, the authors adopted the well-established practice of presenting the results at specialist conferences
and as peer-reviewed journal articles. They also went a step further: in the conviction that any worthwhile research is—by its very nature—transparent and
available for debating, continuing, and confuting, as the case may be, they circulated their data to international experts of authorship attribution, profiling and
analysis of textual data, inviting them to apply their own analytical methods to the material made available.
This volume is a collection of the contributions of various researchers who used various scientific methods to identify the author behind the novels by Elena
Ferrante—a nom de plume that has become one of the most remarkable and often-discussed successes in the publishing world in recent years. The list of the academics
involved, in addition to the curators of this volume, Arjuna Tuzzi and Michele Cortelazzo (University of Padova), includes (in alphabetical order): Maciej Eder
(Pedagogical University of Kraków—Polish Academy of Sciences, Poland), Patrick Juola (Duquesne University of Pittsburgh, PA
USA), Vittorio Loreto and his research team, Margherita Lalli and Francesca Tria (University of Roma “La Sapienza”, Italy), George Mikros (National
and Kapodistrian University of Athens, Greece), Pierre Ratinaud (University of Toulouse II “Jean Jaurès” France), Jan Rybicki (Jagiellonian University of Kraków,
Poland), and Jacques Savoy (University of Neuchâtel, Switzerland). The results of the research conducted by this international group of experts were presented for
the first time during the workshop Drawing Elena Ferrante’s profile, held in Padua on 7 September 2017, as part of the 3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The Summer School, directed by Arjuna Tuzzi and run by Padova
University’s Dipartimento di Filosofia, Sociologia, Pedagogia e Psicologia Applicata [Department of Philosophy, Sociology, Education and Applied Psychology], is an
interdisciplinary program financed by the University of Padova. The exchange of ideas among the experts at the workshop, with the addition of contributions from 20
participants (from 11 different countries) attending the Summer School, provided the basis for the present publication.
Reading this volume, it is very interesting to see how the various contributions succeed in producing a genuinely interdisciplinary study on a concrete object
of study. Not only were the authors of the contributions from all sorts of disciplines (linguists, social scientists, computer scientists, mathematicians,
statisticians, physicists), they also conversed with one another from different analytical approaches. In addition, the vast majority of them do not speak Italian,
so they worked on the corpus of novels completely blinded to the meaning of the words, trusting entirely to their methods for quantitatively analyzing textual
data. Though they moved from different perspectives, their results supported and strengthened each other’s like the different voices in a choir, leading to
a remarkably coherent and integrated conclusion.
We explore six challenges for neural machine translation: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search. We show both deficiencies and improvements over the
quality of phrase-based statistical machine translation.
One cold Friday in 1660, Samuel Pepys encountered two unpleasant surprises. “At home found all well”, he wrote in his diary, “but the monkey loose, which did
anger me, and so I did strike her.” Later that night, a candlemaker named Will Joyce (the good-for-nothing husband of one of Pepys’s cousins) stumbled in on Pepys
and his aunt while “drunk, and in a talking vapouring humour of his state, and I know not what, which did vex me cruelly.” Presumably, Pepys didn’t resort to blows
this time around.
The two objects of Pepys’s scorn that day, his disobedient pet monkey and his drunken cousin-in-law, were not as distant as one might think. Monkeys stood in for
intoxicated humans on a surprisingly frequent basis in 17th century culture. In early modern paintings, tippling primates can frequently be seen in
human clothing, smoking tobacco, playing cards, rolling dice, and just plain getting wasted.
…So what is going on with these images showing drunken and drug-selling monkeys? I think that what we’re missing when we simply see these as a form of social
satire is that these are also paintings about addiction. Desire is a dominant theme in these works: monkeys are shown jealously squabbling over piles of
tobacco, or even, in the example below, hoarding tulip flowers during the height of the Dutch tulipmania (they appear to be using the profits to get drunk, in the
upper left)…But there’s an alternative narrative running through these paintings as well. It epitomizes the ambivalence that has long surrounded intoxicating
substances, in many cultures and in many times: These monkeys seem to be having fun.
[Discussion with screenshots of the classic Ridley Scott SF movie Blade Runner, which employs typography to disconcert the viewer, with unexpected
choices, random capitalization and small caps, corporate branding/advertising, and the mashed-up creole multilingual landscape of noir cyberpunk LA (plus
discussion of the buildings and sets, and details such as call costs being correctly inflation-adjusted).]
Summary: Medieval peasants Jean and Jeanne are idyllic newlyweds. Their happiness vanishes, however, when Jeanne is raped by the local lord in
a legally sanctioned deflowering ritual. Afterwards, while the couple tries to resume their life together, Jeanne starts receiving visions from a demon. It
comforts her in her sadness, but it also encourages her to act out against the lord. Jeanne resists at first, but as her fortunes continue to wane, she’s thrown
further into the demon’s embrace. As time goes on, Jeanne is drawn into an experience that radically reconfigures her sense of herself, the world, and the course
of history itself.
Review: An X-rated anime classic newly remastered for the screen, Belladonna of Sadness is one of animation’s premier psychedelic
experiences, brought over to North America for nearly the first time ever in 2016. Its history has already been covered by us before, but here’s a quick refresher:
Belladonna of Sadness is a legendarily low-budget, sexual, and psychedelic anime film from the 1970s. Poorly received at the time of its release, it
accrued a cult audience over the next few decades. Recently, its reputation has been rehabilitated to the point where it’s considered an overlooked classic. Still,
wider appreciation of the film was hampered by the lack of an English release and poor quality of existing prints. That changed in 2014, when the high-end
distribution company Cinelicious chose it as their first candidate for an in-house 4k restoration and re-release. This May, the completed film began screening in
theaters across the United States and Canada, and will continue to do so until September. I attended one of these screenings at International House theater in
Philadelphia. This was my first time seeing the film, and I left very much impressed by both its artistry and storytelling.
…Fair warning, though—it’s not an exaggeration that this film is touted as ultra-sexual. I’d say most of the film’s runtime is made up of sex scenes, some of
them violent and disturbing. It literally opens with a rape. These scenes are appropriate to the story and gorgeous in their artistry, but they are
unpleasant. Otherwise, the sexual imagery is largely abstract. Flowers become vaginas, figures in cloaks become disembodied penises, and Jeanne’s rape is depicted
as her being bisected from the groin upwards. Some psychedelic sequences also contain intense strobe lighting, so epileptics be warned. As for the visuals
themselves, expect watercolors, morphing lineart, and little in terms of actual animation. There are no lush Kyoto Animation frame counts here. Much of the film’s
motion consists of pans and zooms across static illustrations. There aren’t even any lip flaps. The studio went under while making this film, so this was a method
of cutting costs. However, the results are memorable and even contribute to the film’s power. (There’s a great analysis to be written about its use of vertical
versus horizontal space.) Despite these limitations, Belladonna of Sadness is, on a purely aesthetic level, almost unbelievably beautiful. I’d hang any
given frame of it up on my wall. Even if you don’t care about its message, this film is still worth watching as a work of altered-state eroticism.
Overall, viewers who can handle the content will probably be entertained by this gorgeous and trippy movie. However, I especially recommend Belladonna of
Sadness to anyone interested in the history of anime.
…Belladonna of Sadness is the culmination of a rare attempt to make blatantly un-commercial, artistically challenging anime. At the cost of bankruptcy,
Mushi Productions made a masterpiece that wouldn’t be fully appreciated for 40 years. Now hindsight allows us to see the breadth of its influence and depth of its
daring. Get in on this experience while you have the chance.
[JPL-sponsored Art Deco/WPA poster series with the concept of advertising
travel in the Solar System & to exoplanets; public domain & free to download/print.]
A creative team of visual strategists at JPL, known as “The Studio”, created the poster series, which is titled
“Visions of the Future.” Nine artists, designers, and illustrators were involved in designing the 14 posters, which are the result of many brainstorming sessions
with JPL scientists, engineers, and expert communicators. Each poster went through a number of concepts and
revisions, and each was made better with feedback from the JPL experts.
David Delgado, creative strategy: “The posters began as a series about exoplanets—planets orbiting other stars—to celebrate NASA’s study of them. (The NASA program that focuses on finding and studying exoplanets is managed by JPL.) Later, the director of JPL was on vacation at
the Grand Canyon with his wife, and they saw a similarly styled poster that reminded them of the exoplanet posters. They suggested it might be wonderful to give a
similar treatment to the amazing destinations in our solar system that JPL is currently exploring as part of
NASA. And they were right! The point was to share a sense of things on the edge of possibility that are closely tied to
the work our people are doing today. The JPL director has called our people “architects of the future”. As for the
style, we gravitated to the style of the old posters the WPA created for the national parks. There’s a nostalgia
for that era that just feels good.”
Joby Harris, illustrator: “The old WPA posters did a really great job delivering a feeling about a
far-off destination. They were created at a time when color photography was not very advanced, in order to capture the beauty of the national parks from a human
perspective. These posters show places in our solar system (and beyond) that likewise haven’t been photographed on a human scale yet—or in the case of the
exoplanets might never be, at least not for a long time. It seemed a perfect way to help people imagine these strange, new worlds.”
David Delgado: “The WPA poster style is beloved, and other artists have embraced it before us. Our
unique take was to take one specific thing about the place and focus on the science of it. We chose exoplanets that had really interesting, strange qualities, and
everything about the poster was designed to amplify the concept. The same model guided us for the posters that focus on destinations in the solar system.”
Lois Kim, typography: “We worked hard to get the typography right, since that was a very distinctive element in creating the character of those old
posters. We wanted to create a retro-future feel, so we didn’t adhere exactly to the period styles, but they definitely informed the design. The Venus poster has a
very curvy, flowy font, for example, to evoke a sense of the clouds.”
Sadly overlooked is that other crucial literary category: the summer non-read, the book that you pick up, all full of ambition, at the beginning of
June and put away, the bookmark now and forever halfway through chapter 1, on Labor Day. The classic of this genre is Stephen Hawking’s A Brief History of
Time, widely called “the most unread book of all time.”…How can we find today’s greatest non-reads? Amazon’s “Popular Highlights” feature provides one quick
and dirty measure. Every book’s Kindle page lists the five passages most highlighted by readers. If every reader is getting to the end, those highlights could be
scattered throughout the length of the book. If nobody has made it past the introduction, the popular highlights will be clustered at the beginning.
Thus, the Hawking Index (HI): Take the page numbers of a book’s five top highlights, average them, and divide by the number of pages in the whole book. The
higher the number, the more of the book we’re guessing most people are likely to have read. (Disclaimer: This is not remotely scientific and is for entertainment
purposes only!) Here’s how some current best sellers and classics weigh in, from highest HI to lowest:
Thinking, Fast and Slow by Daniel Kahneman: 6.8%
Apparently the reading was more slow than fast. To be fair, Prof. Kahneman’s book, the summation of a life’s work at the forefront of cognitive psychology,
is more than twice as long as Lean In, so his score probably represents just as much total reading as Ms. Sandberg’s does.
A Brief History of Time by Stephen Hawking: 6.6%
The original avatar backs up its reputation pretty well. But it’s outpaced by one more recent entrant—which brings us to our champion, the most unread book
of this year (and perhaps any other). Ladies and gentlemen, I present:
Capital in the Twenty-First Century by Thomas Piketty: 2.4%
Yes, it came out just three months ago. But the contest isn’t even close. Mr. Piketty’s book is almost 700 pages long, and the last of the top five
popular highlights appears on page 26. Stephen Hawking is off the hook; from now on, this measure should be known as the Piketty Index.
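The index is trivial to compute. A sketch, with illustrative (not actual) highlight pages for the Piketty example other than the reported page 26:

```python
def hawking_index(highlight_pages, total_pages):
    """Mean page of a book's five most-highlighted passages, as a
    fraction of its length: higher = readers got further in."""
    return sum(highlight_pages) / len(highlight_pages) / total_pages

# Capital in the Twenty-First Century: ~700 pages; the last of the top-5
# highlights falls on p.26, so even generous placement of the other four
# yields a tiny index. (The first four page numbers below are made up.)
print(f"{hawking_index([4, 9, 14, 20, 26], 700):.1%}")
```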
So, what are the movies that people loved, but critics hated? And what about those movies that got rave reviews but just didn’t click with audiences?
To try and answer these questions I’ve analysed 10,000 movies from 1970 to 2013 in the Rotten Tomatoes database, and determined the difference between audience score
and critic score by subtracting the latter from the former. This gives us an index of audience-critic agreement, which I’ve named the Tisdale-Carano
index. From this, we can see which movies the audience loved, but the critics hated—which will be more positive, and movies the critics loved but the audience
hated—more negative. We can also find out what types of movies fall into these categories—like which actors, directors and genres are most common to each.
…I used this IMDb list of 10,000 US-released movies from 1970–2013 (though I did notice a film from 1967) to
get ID numbers for a large number of movies. I then wrote a program that accesses the Rotten Tomatoes database via their API and grabbed the title, first two actors listed, genres, first director listed, studio, year of release, and Motion Picture
Association of America (MPAA) rating of each movie, based on the IMDb number.
From this, I removed 2,828 films without a user or critic rating. This produced the dataset for analysis. I created the Tisdale-Carano index by simply
subtracting the critic score from the user score, then ranking the entire dataset by this number.
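The index is a one-line column operation over the scraped data; a sketch, with a hypothetical movies.csv standing in for the Rotten Tomatoes pull:

```python
import pandas as pd

# movies.csv is a stand-in for the scraped Rotten Tomatoes data.
movies = pd.read_csv("movies.csv")   # columns: title, user_score, critic_score

movies = movies.dropna(subset=["user_score", "critic_score"])
movies["tc_index"] = movies["user_score"] - movies["critic_score"]

print(movies.sort_values("tc_index", ascending=False).head(10))  # audience loved, critics hated
print(movies.sort_values("tc_index").head(10))                   # critics loved, audience hated
```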
This is a line-by-line analysis of the second verse of “99 Problems” by Jay-Z, from the perspective of a criminal procedure professor.
It’s intended as a resource for law students and teachers, and for anyone who’s interested in what pop culture gets right about criminal justice, and what it gets wrong.
[WP: “In 2011 Southwestern Law School Professor Caleb Mason
wrote an article with a line-by-line analysis of the second verse of the song from a legal perspective referencing the Fourth Amendment to the United States Constitution, citing it as a useful tool for teaching law
students search and seizure law involving search warrants, Terry stops, racial profiling, the
exclusionary rule, and the motor vehicle exception. Mason writes that some of Jay-Z’s lyrics are
legally accurate and describe prudent behavior (eg. identifying when police ask for consent to search, specifically asking if one is under arrest, and complying
with the police order to stop rather than fleeing which would certainly result in a search of the car and might authorize police to use lethal force to stop a high
speed chase). However, Mason also notes the song lyrics are legally incorrect in indicating that a driver can refuse an order to exit the Arand that police would
need a warrant to search a locked glove compartment or trunk—in fact, police would only need probable cause to search a car.”]
…The year is ‘94, in my trunk is raw
In my rearview mirror is the motherfuckin’ law
Got 2 choices, y’all: pull over the car or
Bounce on the devil, put the pedal to the floor
And I ain’t tryin’ to see no highway chase with Jake
Plus I got a few dollars, I can fight the case
So I pull over to the side of the road
I heard, “Son, do you know why I’m stopping you for?”
“‘Cause I’m young and I’m black and my hat’s real low?
Do I look like a mind reader, sir? I don’t know
Am I under arrest or should I guess some more?“
”Well, you was doing 55 in a 54
License and registration and step out of the car
Are you carrying a weapon on you? I know a lot of you are“
”I ain’t stepping out of shit, all my paper’s legit“
”Well, do you mind if I look around the car a little bit?“
”Well, my glove compartment is locked
So is the trunk in the back
And I know my rights, so you gon’ need a warrant for that”
“Aren’t you sharp as a tack? You some type of lawyer or something? Somebody important or something?”
“Well, I ain’t passed the bar, but I know a little bit
Enough that you won’t illegally search my shit”
“Well, we’ll see how smart you are when the K9 come”
I got 99 problems, but a bitch [female dog] ain’t one; hit me!
The poet Christopher Smart—also known as “Kit Smart”, “Kitty Smart”, “Jack
Smart” and, on occasion, “Mrs Mary Midnight”—was a well known figure in 18th-century London. Nowadays he is perhaps best known for considering his cat Jeoffry. Writer and broadcaster Frank Key looks at Smart’s weird
and wonderful Jubilate Agno…
It was not until 1939 that his masterpiece, written during his confinement in St Luke’s, was first published.
Jubilate Agno is one of the most extraordinary poems in the English language, and almost certainly the reason we remember Christopher Smart today. It
has been described as a vast hymn of praise to God and all His works, and also as the ravings of a madman. Indeed, that first edition was published under the title
Rejoice In The Lamb: A Song From Bedlam, clearly marking
it as a curio from the history of mental illness. It was W. H. Bond’s revised edition of 1954 which gave order to Smart’s surviving manuscript, restoring the Latin
title Jubilate Agno, bringing us the poem in the form we know it today.
Christopher Smart never completed the work, which consists of four fragments making a total of over 1,200 lines, each beginning with the words “Let” or “For”.
For example, Fragment A is all “Let”s, whereas in Fragment B the “Let”s and “For”s are paired, which may have been the intention for the entire work, modelled on
antiphonal Hebrew poetry. References and allusions abound to Biblical (especially Old Testament) figures, plants and animals, gems, contemporary politics
and science, the poet’s family and friends, even obituary lists in current periodicals. The language is full of puns, archaisms, coinages, and unfamiliar usages.
Dr. Johnson famously said “Nothing odd will do long; Tristram Shandy did not last”. Jubilate Agno is, if anything,
“odder” than Sterne’s novel, and perhaps we are readier to appreciate it in the twenty-first century than when it was written…one of the great joys of Jubilate
Agno is in its sudden dislocations and unexpected diversions. The “my cat Jeoffry” passage is justly famous, but the
poem is cram-packed with similar wonders, and must be read in full to appreciate its inimitable genius.
This post investigates female attractiveness, but without the usual photo analysis stuff. Instead, we look past a woman’s picture, into the reaction
she creates in the reptile mind of the human male. Among the remarkable things we’ll show:
that the more men as a group disagree about a woman’s looks, the more they end up liking her
guys tend to ignore girls who are merely cute
and, in fact, having some men think she’s ugly actually works in a woman’s favor
…Now let’s look back at the two real users from before, this time with their own graphs. OkCupid uses a 1 to 5 star system for rating people, so the
rest of our discussion will be in those terms. All the users pictured were generous and confident enough to allow us to dissect their experience on our site, and
we appreciate it. Okay, so we have: […] As you can see, though the average attractiveness for the two women above is very close, their vote patterns differ. On the
left you have consensus, and on the right you have split opinion.
To put a fine point on it:
Ms. Left is, in an absolute sense, considered slightly more attractive
Ms. Right was also given the lowest rating 142% more often
yet Ms. Right gets 3× as many messages
When we began pairing other people of similar looks and profiles, but different message outcomes, this pattern presented itself again and again. The
less-messaged woman was usually considered consistently attractive, while the more-messaged woman often created variation in male opinion…Our
first result was to compare the standard deviation of a woman’s votes to the messages she gets. The more men disagree about a woman’s looks, the more they like
her. I’ve plotted the deviation vs. messages curve below, again including some examples…
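The analysis is easy to restate in code: per woman, compute the mean and the standard deviation of her star votes, then relate the spread to message volume while controlling for the mean. A sketch on synthetic stand-in data, since the underlying OkCupid data isn’t public:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 1,000 women, each with 500 votes on a 1-5 star scale,
# plus a message count. Real data would come from the site's logs.
votes = rng.integers(1, 6, size=(1000, 500))
messages = rng.poisson(20, size=1000)

means, spreads = votes.mean(axis=1), votes.std(axis=1)

# The post's claim: holding the mean fixed, messages rise with the spread.
# On real data, inspect the coefficient on `spreads` in this regression.
X = np.column_stack([np.ones(len(means)), means, spreads])
coef, *_ = np.linalg.lstsq(X, messages, rcond=None)
print(dict(zip(["intercept", "mean", "spread"], coef.round(3))))
```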
Social scientists generally presume that a good reputation has advantages.
Yet the Walt Disney Corporation, a firm that has long benefited from a
reputation for producing wholesome popular culture, attracts more than its share of efforts to link it to various social problems. In particular, conservative
moralists argue that Disney in fact produces morally questionable products, progressive critics claim that Disney’s messages help preserve social inequities, and
social scientists criticize Disney for fostering inauthentic and alienating entertainment.
These claims are a form of blowback—negative reactions to the firm’s positive reputation. While blowback makes it easier to construct social problems claims, a
good reputation remains an important resource in deflecting these criticisms.
This study focuses on effects of knowledge and experience on both mean and variance measures of individual and team
innovations. We propose that multiple knowledge domains produce novel combinations that increase the variance of product
performance and that extensive experience produces outputs with high average performance.
We analyzed innovations in the comic book industry [1972–1996], finding that innovations with extreme success and failure [collectible prices] were affected by
factors similar to those affecting high-performing innovations.
Multimember teams and teams with experience working together produced innovations with greater variation in value, but individuals were able to combine
knowledge diversity more effectively than teams.
An open systems strategy enables a sponsor to diffuse its technology and promotes standardization in an industry. However, this strategy has mainly been studied in
high-tech settings. We hypothesize that, in a non-high-tech industry, a sponsor giving access to its technical knowledge may impact industry structure. Based on a
survey of the U.S. tabletop role-playing game (RPG) industry, our results highlight that the introduction of an open system in a sector creates an entry induction phenomenon and
that these new entrants adopt the open system more readily than incumbents. Moreover, the average size of the firms in the industry decreases due to vertical
…Sample and Data: For the purpose of this study we have compared the structure of the RPG sector
before and after the introduction of the d20 open license. Our comparison is between the 2-year periods of 1998–99 (before the
introduction of the d20 license) and 2000–01 (after the introduction of the d20 license). These periods can legitimately be compared, as the U.S. market
segment encompassing RPG products did not witness a drastic evolution over these 4 years. After collecting qualitative
data on the industry from RPG publications (Comics and Games Retailer, D20 Magazine, Dragon
Magazine) and Internet websites (D20 Reviews, Game Manufacturers Association, Game Publishers Association, GameSpy, Gaming Report, RPGA Network, RPGNow, RPG Planet, Wizard’s Attic), we
established an exhaustive list of the 193 active U.S. companies publishing RPGs and compiled a database comprising
3 firm variables: age, size (number of employees), and technological system adopted (the open system vs. proprietary systems). These data were
collected from company websites. We collected information
…Results: We hypothesized that the introduction of an open system in an industry would favor the arrival of new entrants
(Table 1). Hypothesis 1 was strongly supported by our chi-square analysis. The 2000–01 period saw 78 new entrants into the
RPG sector, with only 20 new entrants in the 1998–99 period (χ² = 12.35, statistically-significant
at the 0.01 level). Of the 78 new entrants in the 2000–01 period, 51 adopted the d20 license (Table 2). This proportion was markedly greater
than for incumbents, strongly supporting Hypothesis 2 (χ² = 17.89, statistically-significant at the 0.01 level). New entrants were found to adopt the new open system more readily than
incumbents. These new entrants were essentially players and former freelancers operating within the sector who saw the d20 as an opportunity to avoid the
prevailing development costs and switching costs for players, and so decided to launch their own company.
It should be noted that some firms, both new entrants and incumbents, coupled the open system with development of their own proprietary game’s rules of play.
Moreover, 27 new entrants did not adopt the d20 license. This figure corresponds roughly to the number of new entrants during the 1998–99 period (ie. 20). This
confirms that the 2 periods (1998–99 and 2000–01) are comparable and that no exogenous variable has drastically modified the economic context of the industry. We
can then attribute the new entries in the RPG industry in 2000–01 to the introduction of the d20 license per se.
We hypothesized that the diffusion of an open system into an industry should lead to a decrease in the average size of companies in that industry. Our
ANOVA result strongly supports this hypothesis (F = 8.739, statistically-significant at the 0.01 level). Indeed, even though RPG companies have
traditionally been very small, their average size became even smaller after the diffusion of the d20 system (reducing from an average of 5.02 down to 2.76 employees).
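The excerpt reports test statistics but not the full contingency tables, so the exact values can’t be reproduced; a sketch of the shape of the Hypothesis 2 test, with the incumbents’ row purely hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Row 1 is from the excerpt: of 78 new entrants in 2000-01, 51 adopted d20.
# Row 2 (the incumbents' adoption split) is NOT given above; it is a
# placeholder purely to show the shape of the test.
table = np.array([[51, 27],     # new entrants: d20 vs. proprietary
                  [30, 85]])    # incumbents:  d20 vs. proprietary (hypothetical)

chi2, p, dof, expected = chi2_contingency(table)
print(f"χ² = {chi2:.2f}, p = {p:.4f}")
```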
The mind, that rambling bear, ransacks the sky
In search of honey,
Fish, berries, carrion. It minds no laws…
As if the heavens were some canvas tent,
It slashes through the firmament
To prise up the sealed stores with its big paws.
The mind, that sovereign camel, sees the sky
For what it is:
Each star a grain of sand along the vast
Passage to that oasis where, below
The pillared palms, the portico
Of fronds, the soul may drink its fill at last.
The mind, that gorgeous spider, webs the sky
With lines so sheer
They all but vanish, and yet star to star
(Thread by considered thread) slowly entwines
The universe in its designs—
Un-earthing patterns where no patterns are.
The mind, that termite, seems to shun the sky.
It burrows down,
Tunneling in upon that moment when,
In Time—its element—will come a day
The longest-shadowed tower sway,
Unbroken sunlight fall to earth again.
…DNA was unspooled in the year
I was born, and the test-tube births
Of cloned mammals emerged in a mere
Half-century; it seems the earth’s
Future’s now in the hands of a few
Techies on a caffeinated all-nighter who
Sift the gene-alphabet like Scrabble tiles
And our computer geeks are revealed, at last,
As those quick-handed, sidelined little mammals
In the dinosaurs’ long shadows—those least-
Likely-to-succeed successors whose kingdom come
Was the globe itself (an image best written down,
Perhaps, beneath a streetlamp, late, in some
Star-riddled Midwestern town).
He wrote boys’ books and intuitively
Recognized that the real
Realist isn’t the one who details
Lowdown heartland factories and farms
As if they would last, but the one who affirms,
From the other end of the galaxy,
Ours is the age of perilous miracles.
In conclusion, fencing tempo is a vital element of swordsmanship, but clearly for the duelist hitting before being hit is not at all the same thing as hitting
without being hit. Exsanguination is the principal mechanism of death caused by stabbing and incising wounds and death by this means is seldom instantaneous.
Although stab wounds to the heart are generally imagined to be instantly incapacitating, numerous modern medical case histories indicate that while victims of such
wounds may immediately collapse upon being wounded, rapid disability from this type of wound is by no means certain. Many present-day victims of penetrating wounds
involving the lungs and the great vessels of the thorax have also demonstrated a remarkable ability to remain physically active minutes to hours after their wounds
were inflicted. These cases are consistent with reports of duelists who, subsequent to having been grievously or even mortally wounded through the chest, neck, or
abdomen, nevertheless remained actively engaged upon the terrain and fully able to continue long enough to dispatch those who had wounded them.
…Early American motion pictures have frequently misrepresented virtually every aspect of authentic swordplay. This seems to have been especially true of the
industry’s depiction of the manner in which swordsmen fell before the blades of their opponents. While anecdotes of duels may have been biased by politics or
personal vanity, modern forensic medicine provides ample evidence to support historical accounts of gravely wounded duelists continuing in combats for surprising
lengths of time, sometimes killing those who had killed them.
In the first installment of this essay modern forensic evidence indicated that exsanguination is the principal mechanism of death caused by stabbing and
incising wounds, but that death by this means is seldom instantaneous, with victims frequently capable of continued physical activity, even after being stabbed in the
heart. Similarly, victims of sharp force injuries to the lungs are not infrequently able to carry on for protracted periods of time. Wounds which result in the
introduction of blood into the upper airway, on the other hand, are likely to incapacitate and kill an adversary quite rapidly.
Duels featuring penetrating wounds to the muscles of the sword arm appear in some cases to have left duelists fully capable of manipulating their weapons.
Thrusts to the thigh and leg may have been even less efficacious. Strokes with the cutting edges of swords to the limbs may result in more serious wounds to the
musculature than the penetrating variety, but historical accounts of duels demonstrate that immediate incapacitation of an adversary stricken with such wounds was
by no means guaranteed. Incising wounds which sever tendons, however, can be expected to immediately incapacitate the muscles from which they arise. Recent medical
reports of sharp force injuries to the brain suggest that even a sword-thrust penetrating the skull ought not to have been expected always to disable an opponent
instantaneously. While severe pain is usually incapacitating, the stress of combat may mask the pain of gravely serious wounds, enabling the determined duelist to
remain on the ground for a considerable length of time.
The immediate consequences to a duelist of wounds inflicted by thrusts or cuts from the rapier, dueling sabre or smallsword were unpredictable. While historical
anecdotes of affairs of honor and twentieth century medical reports show that many stabbing victims collapsed immediately upon being wounded, others did not. While
a swordsman certainly gained no advantage for having been wounded, it cannot be said that an unscathed adversary, after having delivered a fatal thrust or cut, had
no further concern for his safety. Duelists receiving serious and even mortal wounds were sometimes able to continue effectively in the combat long enough to take
the lives of those who had taken theirs…For the duelist, however, another form of tempo had to be considered. In the early history of affairs of honor, this
“dueling tempo” spanned the period extending from the moment that a wound was inflicted until the instant that the adversary was no longer able to continue
effectively. This span of time was unpredictable in length and could be expressed in terms ranging from a fraction of a second to minutes. Considering the number
and severity of wounds that were sustained by combatants in the early days of the duel, it would not be surprising to find that many duelists of latter days
secretly breathed a sigh of relief when interrupted by seconds rushing in to terminate affairs of honor immediately upon the delivery of a well placed cut or thrust.
[Early webcomic by British artist Sam Chivers. Notable for being a
Flash webcomic, originally hosted at
www.realitytax.com, but apparently removed when published in the 2004 comics anthology Prophecies: Volume 1 (Sequent Media).
“Headcase” is a wordless narrative in a Moebius-like style set in a grim dystopian
cyberpunk future where an initially hopeful working-class robot (reminiscent of a crash test dummy) struggles to get to his job, survive his shift, and endure the indignities of the day (such as stepping on poop), becoming
progressively ground down and broken; he adopts the spirit of a monkey, whose prank precipitates his beating by a local thug; the robot then commits suicide,
smashing his head open, revealing it was piloted by a small dying animal. The monkey spirit resurrects the corpse, becoming a busker dancing in a costume for
donations; at the very end, the monkey-robot steps on a piece of poop and becomes annoyed.]
The continuing controversy over online file sharing sparks me to offer a few thoughts as an author and publisher. To be sure, I write and publish neither movies
nor music, but books. But I think that some of the lessons of my experience still apply.
Lesson 1: Obscurity is a far greater threat to authors and creative artists than piracy.
…More than 100,000 books are published each year, with several million books in print, yet fewer than 10,000 of those new books have any substantial sales,
and only a hundred thousand or so of all the books in print are carried in even the largest stores…The web has been a boon for readers, since it makes it
easier to spread book recommendations and to purchase the books once you hear about them. But even then, few books survive their first year or two in print.
Empty the warehouses and you couldn’t give many of them away…
Lesson 2: Piracy is progressive taxation
For all of these creative artists, most laboring in obscurity, being well-enough known to be pirated would be a crowning achievement. Piracy is a kind of
progressive taxation, which may shave a few percentage points off the sales of well-known artists (and I say “may” because even that point is not proven), in
exchange for massive benefits to the far greater number for whom exposure may lead to increased revenues…
Lesson 3: Customers want to do the right thing, if they can.
…We’ve found little or no abatement of sales of printed books that are also available for sale online…The simplest way to get customers to stop trading
illicit digital copies of music and movies is to give those customers a legitimate alternative, at a fair price.
Lesson 4: Shoplifting is a bigger threat than piracy.
…What we have is a problem that is analogous, at best, to shoplifting, an annoying cost of doing business. And overall, as a book publisher who also makes
many of our books available in electronic form, we rate the piracy problem as somewhere below shoplifting as a tax on our revenues. Consistent with my
observation that obscurity is a greater danger than piracy, shoplifting of a single copy can lead to lost sales of many more. If a bookstore has only one copy
of your book, or a music store one copy of your CD, a shoplifted copy essentially makes it disappear from the next potential buyer’s field of possibility.
Because the store’s inventory control system says the product hasn’t been sold, it may not be reordered for weeks or months, perhaps not at all. I have many
times asked a bookstore why they didn’t have copies of one of my books, only to be told, after a quick look at the inventory control system: “But we do. It
says we still have one copy in stock, and it hasn’t sold in months, so we see no need to reorder.” It takes some prodding to force the point that perhaps it
hasn’t sold because it is no longer on the shelf…
Lesson 5: File sharing networks don’t threaten book, music, or film publishing. They threaten existing publishers.
…The question before us is not whether technologies such as peer-to-peer file sharing will undermine the role of the creative artist or the publisher, but
how creative artists can leverage new technologies to increase the visibility of their work. For publishers, the question is whether they will understand how
to perform their role in the new medium before someone else does. Publishing is an ecological niche; new publishers will rush in to fill it if the old ones
fail to do so…Over time, it may be that online music publishing services will replace CDs and other physical distribution media, much as recorded music
relegated sheet music publishers to a niche and, for many, made household pianos a nostalgic affectation rather than the home entertainment center. But the
role of the artist and the music publisher will remain. The question then, is not the death of book publishing, music publishing, or film production, but
rather one of who will be the publishers.
Lesson 6: “Free” is eventually replaced by a higher-quality paid service
A question for my readers: How many of you still get your email via peer-to-peer UUCP dialups or the old
“free” Internet, and how many of you pay $32.84 (2002: $19.95) a month or more to an ISP? How many of you watch
“free” television over the airwaves, and how many of you pay $33–$99 (2002: $20–$60) a month for cable or satellite television? (Not to mention continue to rent movies on videotape and
DVD, and purchasing physical copies of your favorites.) Services like Kazaa flourish in the absence of competitive
alternatives. I confidently predict that once the music industry provides a service that provides access to all the same songs, freedom from onerous
copy-restriction, more accurate metadata and other added value, there will be hundreds of millions of paying subscribers…Another lesson from television is that
people prefer subscriptions to pay-per-view, except for very special events. What’s more, they prefer subscriptions to larger collections of content, rather
than single channels. So, people subscribe to “the movie package”, “the sports package” and so on. The recording industry’s “per song” trial balloons may work,
but I predict that in the long term, an “all-you-can-eat” monthly subscription service (perhaps segmented by musical genre) will prevail in the marketplace.
Lesson 7: There’s more than one way to do it.
A study of other media marketplaces shows, though, that there is no single silver-bullet solution. A smart company maximizes revenue through all its
channels, realizing that its real opportunity comes when it serves the customer who ultimately pays its bills…Interestingly, some of our most successful
print/online hybrids have come about where we present the same material in different ways for the print and online contexts. For example, much of the content
of our bestselling book Programming Perl (more than 600,000 copies in print) is available online as part of the standard Perl documentation. But the entire
package—not to mention the convenience of a paper copy, and the aesthetic pleasure of the strongly branded packaging—is only available in print. Multiple ways
to present the same information and the same product increase the overall size and richness of the market. And that’s the ultimate lesson. “Give the Wookiee
what he wants!” as Han Solo said so memorably in the first Star Wars movie. Give it to him in as many ways as you can find, at a fair price, and let
him choose which works best for him.
Summoned to Kubrick’s secluded mansion and offered an enormous sum of money, Watson began collaborating on a film idea with Kubrick, a perfectionist who
demanded endless marathon revisions of possible stories and ideas, only to throw them out and hare off down an entirely different avenue; he would spend
extravagantly on travel or books on a topic, or demand photos of a particular place or a specific item like a bag on sale, only to discard them without a second
look, perennially trying his assistants’ patience. (This attitude extended to his films, where he thought nothing of ordering in an entire plastic replica
garden, only to decide it was inadequate, discard it, and have real palm trees flown in.) He was a lover of animals like cats, dogs, and birds, requiring a
servant to mow grass daily & deliver it to a cat kept upstairs, although his affection was often as harmful as helpful (his generosity in feeding the birds
made them obese). Careless with rough drafts, he’d lose printouts or erase disks; yet he was even more paranoid than careless, and would be infuriated when the
local hacker who assisted with computer problems restored files from backups the hacker had prudently kept. The paranoia also kept him terrified about global
geopolitics, such as whether Saddam Hussein would trigger nuclear war in the Middle East.
For all the surreal comedy, when Kubrick dies—A.I. still being nowhere near filming, of course—and Watson writes up his memoirs, he finds that he misses
Kubrick and “I remain sad that he’s gone.”]
[Chapter 6 of the first book of The Book of the New Sun is famous for being an extended homage to Jorge Luis Borges in the figure of the blind librarian Ultan,
who was gifted blindness just as he became librarian; it also contains some of the most beautiful writing in the series.]
…“You are in close contact, then, with your opposite numbers in the city”, I said. The old man stroked his beard. “The closest, for we are they. This library is
the city library, and the library of the House Absolute too, for that matter. And many others.” “Do you mean that the rabble of the city is permitted to enter the
Citadel to use your library?” “No”, said Ultan. “I mean that the library itself extends beyond the walls of the Citadel. Nor, I think, is it the only institution
here that does so. It is thus that the contents of our fortress are so much larger than their container.”
…His grip on my shoulder tightened. “We have books here bound in the hides of echidnes, krakens, and beasts so long extinct that those whose studies they are,
are for the most part of the opinion that no trace of them survives unfossilized. We have books bound wholly in metals of unknown alloy, and books whose bindings
are covered with thickset gems. We have books cased in perfumed woods shipped across the inconceivable gulf between creations—books doubly precious because no one
on Urth can read them.”
“We have books whose papers are matted of plants from which spring curious alkaloids, so that the reader, in turning their pages, is taken unaware by bizarre
fantasies and chimeric dreams. Books whose pages are not paper at all, but delicate wafers of white jade, ivory, and shell; books too whose leaves are the
desiccated leaves of unknown plants. Books we have also that are not books at all to the eye: scrolls and tablets and recordings on a hundred different substances.
There is a cube of crystal here—though I can no longer tell you where—no larger than the ball of your thumb that contains more books than the library itself does.
Though a harlot might dangle it from one ear for an ornament, there are not volumes enough in the world to counterweight the other. All these I came to know and
made safeguarding them my life’s devotion. For seven years I busied myself with that; and then, just when the pressing and superficial problems of preservation
were disposed of, and we were on the point of beginning the first general survey of the library since its foundation, my eyes began to gutter in their sockets. He
who had given all books into my keeping made me blind so that I should know in whose keeping the keepers stand.”
…“In every library, by ancient precept, is a room reserved for children. In it are kept bright picture books such as children delight in, and a few simple tales
of wonder and adventure. Many children come to these rooms, and so long as they remain within their confines, no interest is taken in them.” He hesitated, and
though I could discern no expression on his face, I received the impression that he feared what he was about to say might cause Cyby pain.
“From time to time, however, a librarian remarks a solitary child, still of tender years, who wanders from the children’s room and at last deserts it entirely.
Such a child eventually discovers, on some low but obscure shelf, The Book of Gold. You have never seen this book, and you will never see it, being past
the age at which it is met.”
“It must be very beautiful”, I said. “It is indeed. Unless my memory betrays me, the cover is of black buckram, considerably faded at the spine. Several of the
signatures are coming out, and certain of the plates have been taken. But it is a remarkably lovely book. I wish that I might find it again, though all books are
shut to me now. The child, as I said, in time discovers The Book of Gold. Then the librarians come—like vampires, some say, but others say like the fairy
godparents at a christening. They speak to the child, and the child joins them. Henceforth he is in the library wherever he may be, and soon his parents know him
no more.”
The limit of the Willing Suspension of Disbelief for a given element is directly proportional to its awesomeness.
Stated another way, all but the most pedantic of viewers will forgive liberties with reality as long as the result is wicked sweet or awesome. This applies to
the audience in general; there will naturally be a different threshold for each individual. Also known in some circles as a “rad herring”, in which something
doesn’t make sense within the guidelines of the story’s reality, but it’s too cool not to include it…Since it’s subjective, it doesn’t have to be cool in
the sense of “Grim reaper on a mountain playing an electric guitar”. The protagonist might not use guns because it’s cooler to have them fight vampires with knives
and stakes. You might have Missing Parent Syndrome because it would be weird to have parents with you on a road trip across the country. Basically, Rule of Cool
works differently for whichever genre you’re writing for.
When a character does something evil, cruel or very mean for no apparent gain, because the author wants to demonstrate that he’s not a nice guy and shift
audience sympathy away from him.
Why this trope works could be expressed in the words of William Cowper: “I would not enter on my list of friends (though graced with polished manners and fine
sense, yet wanting sensibility) the man who needlessly sets foot upon a worm.” In other words, a cruel act, no matter how trivial, establishes someone as a cruel
person. Conversely, the creator may show a character being kind for no apparent gain, to demonstrate that the character is a nice person and someone the audience
is meant to cheer for. Both devices are used to help the audience become emotionally invested in the story.
What separates this trope from a character’s other evil or cruel acts is that this bit of evil is gratuitous. It doesn’t get the character anything or even
advance the plot. The sole reason for this story beat existing is to place one or more characters squarely on the wrong side of the Rule of Empathy.
The brms package provides an interface to fit Bayesian generalized (non-)linear multivariate multilevel models using Stan, which is a C++ package for performing full Bayesian inference. The formula syntax is very similar to that of the package lme4 to provide a familiar and simple interface for performing regression analyses. A wide range of
response distributions are supported, allowing users to fit—among others—linear, robust linear, count data, survival, response times, ordinal, zero-inflated, and
even self-defined mixture models all in a multilevel
context. Further modeling options include non-linear and smooth terms, auto-correlation structures, censored data, missing value imputation, and quite a few more.
In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Multivariate models (i.e., models with
multiple response variables) can be fit, as well. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their beliefs. Model
fit can easily be assessed and compared with posterior predictive checks, cross-validation, and Bayes factors.
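[As a concrete illustration of the lme4-like formula syntax described above (a minimal sketch, using the `epilepsy` example dataset bundled with brms; the particular model and priors are illustrative, not taken from the quoted text):

```r
# Multilevel Poisson regression in brms: seizure counts per patient,
# with an lme4-style varying intercept "(1 | patient)" for each patient.
library(brms)

fit <- brm(
  count ~ zAge + zBase * Trt + (1 | patient),
  data   = epilepsy,
  family = poisson(),
  prior  = prior(normal(0, 1), class = b)  # weakly-informative slope priors
)

summary(fit)   # posterior summaries of all parameters
pp_check(fit)  # graphical posterior predictive check
loo(fit)       # approximate leave-one-out cross-validation
```

Here `pp_check` and `loo` correspond to the posterior predictive checks and cross-validation the abstract mentions for assessing and comparing model fit.]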