Saturday, June 07, 2014

Fun With Ngrams: Grateful Dead and Jerry Garcia, 1965-1995

Here's the bigram relevant to evaluating some hypotheses I ventured in my "Garcia and Marley" post.
Figure xxx. Google Bigram: "Grateful Dead" and "Jerry Garcia", 1965-1995

In that post, I hypothesized that "Jerry Garcia" and "Grateful Dead" would show an uptick in 1986. On visual inspection, the GD claim looks rightish: the 1986-1995 slope should be significantly larger, in the statistical sense, than the 1965-1985 slope. I am sure there's a simple test for that kind of structural break.
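One candidate for that simple test is the Chow test, which compares a single pooled trend line against two separate lines fitted on either side of a proposed break year. What follows is only a sketch under stated assumptions: the function names (`ols_rss`, `chow_test`) are hypothetical, and the series is synthetic placeholder data built to have a kink at 1986, not the actual Ngram values. Note also that the Chow test checks for a change in intercept and slope jointly, so it is a bit broader than the slope-only claim, and it assumes independent errors, which yearly frequency series may well violate.

```python
def ols_rss(xs, ys):
    """Fit y = a + b*x by least squares; return (slope, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, rss


def chow_test(years, freqs, break_year):
    """Chow F statistic for a break at break_year (first year of the later period)."""
    early = [(x, y) for x, y in zip(years, freqs) if x < break_year]
    late = [(x, y) for x, y in zip(years, freqs) if x >= break_year]
    k = 2  # parameters per regression: intercept and slope
    _, rss_pooled = ols_rss(years, freqs)
    slope_early, rss_early = ols_rss(*zip(*early))
    slope_late, rss_late = ols_rss(*zip(*late))
    rss_split = rss_early + rss_late
    df_den = len(years) - 2 * k
    f_stat = ((rss_pooled - rss_split) / k) / (rss_split / df_den)
    return f_stat, (k, df_den), slope_early, slope_late


# SYNTHETIC placeholder series (NOT real Ngram data): slope 1 through 1985,
# slope 3 from 1986 on, plus a small deterministic wobble.
years = list(range(1965, 1996))
freqs = []
for i, year in enumerate(years):
    trend = (year - 1965) if year < 1986 else 20 + 3 * (year - 1985)
    wobble = 0.1 * ((i * 7) % 5 - 2)
    freqs.append(trend + wobble)

f_stat, dfs, slope_early, slope_late = chow_test(years, freqs, 1986)
print("F =", round(f_stat, 1), "with df", dfs)
print("early slope:", round(slope_early, 2), "late slope:", round(slope_late, 2))
```

The F statistic would then be compared against an F distribution with the printed degrees of freedom. To run this on the real thing, `years` and `freqs` would be replaced with the yearly values exported from the Ngram viewer for each phrase.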

I think visual inspection suggests an uptick for "Jerry Garcia" as well, but there I'd really like to see the statistic. There's a substantive point that becomes an inferential point: The Grateful Dead was massively more popular (as measured by appearances in printed materials, and probably on any other sensible measure) than Jerry Garcia, per se. We know this. But, in books, it's about a 10x difference by the end, with the spread increasing especially from 1986 forward (i.e., post-"Touch of Grey"). I'd hypothesize (call it H20140607c) that the two phrases co-appear more robustly in the later period, i.e., that Garcia is partly subsumed under the GD behemoth. That'd be appropriate.

The inferential point is that we can't really tell about the Garcia curve through visual inspection, because there's no way to set a separate vertical axis, which we need (per the substantive point I just made). So, here's just "Jerry Garcia":
Figure xxx. Google Unigram: "Jerry Garcia", 1965-1995

I think the answer is that, yes, there's an uptick from 1986 on, but, again, I'd like to run the statistic. In the meantime, pictures are fun.

