While we hypothesized that the frequency of a word's usage would be positively correlated with time, we found almost no correlation. Regression analyses in Microsoft Excel showed that a word's popularity is only very weakly correlated with time: the R-squared value was ~0.019 for "Mephistophelian" and ~0.157 for "schlumpy." Nor did we observe a spike in popularity after the Oxford Dictionary announced the words' induction in 2013.
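For readers who want to reproduce this kind of check outside Excel, the following is a minimal sketch of how an R-squared value can be computed for a straight-line fit of usage against time. The function name `r_squared` and the sample numbers are illustrative assumptions, not our actual code or data.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fit of ys on xs.

    For a simple linear regression this equals the squared Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined when either variable is constant")
    return (sxy ** 2) / (sxx * syy)


# Hypothetical data: months since collection began, and tweets per month.
months = [0, 1, 2, 3, 4, 5]
tweet_counts = [4, 7, 3, 8, 5, 6]
print(round(r_squared(months, tweet_counts), 3))
```

A value near 0 means that time explains almost none of the variation in usage, which is how we read our own results; a value near 1 would indicate a nearly linear trend.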

However, we did find a strong correlation between the passage of time and a word's interactivity (retweet rate and favorite rate). This correlation suggests that the words' share in the popular consciousness did in fact grow over time.

Finally, our gender analysis revealed that women use "schlumpy" at a significantly higher relative rate than they do "Mephistophelian." We will not here attempt an explanation for this discrepancy.

Looking forward

If we were to continue our investigation, we would focus on fixing the scraper so that we could collect datasets for the other words we hoped to study. Our present corpus is rather small and includes only very uncommon words, so we probably do not have a representative sample of neologisms; this might explain why we were not able to find strong correlations in the data. Gathering data for more common words would also permit additional types of research that we were not able to conduct with our limited corpus. For example, we had hoped to do a geographic analysis by plotting neologism usage in four dimensions (the three dimensions of space and the one of time), but since so few users include their location on their profiles, our corpus did not offer enough geographic data points.

Additionally, we would like to add a control to our research. The best solution we came up with was to scrape tweets containing a specific function word, such as “the.” This would give us a baseline for the frequency of tweets produced over time, the likelihood of their being liked or retweeted, and the gender of the person tweeting. At the moment, we cannot say with any confidence what causes the correlations we observed. Interaction with tweets in our corpus increased, but this might reflect a change in Twitter etiquette rather than an increase in interaction with the neologisms themselves.
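One way such a control could be applied is to divide each period's rate for a neologism by the rate for the control word in the same period, so that platform-wide changes cancel out. The sketch below assumes this approach; the function name `relative_rate` and all figures are hypothetical, since we have not collected control data.

```python
def relative_rate(target_by_period, control_by_period):
    """Divide each period's target rate by the control rate for that period.

    Periods with a missing or zero control rate are skipped, because the
    ratio is undefined there.
    """
    relative = {}
    for period, target_rate in target_by_period.items():
        control_rate = control_by_period.get(period)
        if not control_rate:  # None (missing) or 0
            continue
        relative[period] = target_rate / control_rate
    return relative


# Hypothetical retweet rates per year for a neologism and for the control word.
neologism_retweet_rate = {2011: 0.10, 2012: 0.20, 2013: 0.40}
control_retweet_rate = {2011: 0.05, 2012: 0.10, 2013: 0.20}
print(relative_rate(neologism_retweet_rate, control_retweet_rate))
```

If the ratio stays flat over time, as it does in this made-up example, the rise in interaction would be attributable to Twitter as a whole rather than to the neologism.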

Creative Commons License
Twitter Neologisms by JR, Tom McIntyre, Shawn Kurta is licensed under a Creative Commons Attribution 4.0 International License.