Shannon information in Roman coin legends

I’ve received funding for a Research Associate for the spring semester (S19) from the J.D. Power Center for Liberal Arts in the World. I’ll be working with Thomas Posillico ’20 to analyze Roman imperial coin legends (that is, the text on Roman imperial coins; numismatists have their own language for everything).

We plan to look at the Shannon information (or entropy) of the openly available corpus of roughly 100,000 tweet-length texts from the Online Coins of the Roman Empire (available for download from http://nomisma.org/). Our hypothesis is that we may gain insight into ancient practices of writing by thinking about texts adapted to extreme constraints (like the size of a coin) as a compressed encoding of a message.
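
To give a sense of the kind of measurement we have in mind, here is a minimal Python sketch that estimates the per-character Shannon entropy of a set of legends from their empirical character frequencies. The legends below are illustrative placeholders, not drawn from the OCRE download, and none of the project code is settled yet.

```python
import math
from collections import Counter

def char_entropy_bits(texts):
    """Per-character Shannon entropy (in bits) of the pooled texts,
    using empirical character frequencies as probability estimates."""
    counts = Counter(ch for text in texts for ch in text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Illustrative legends (a hypothetical sample, not from the OCRE corpus)
legends = ["IMP CAESAR DIVI F AVGVSTVS", "TI CAESAR DIVI AVG F AVGVSTVS"]
print(f"{char_entropy_bits(legends):.3f} bits per character")
```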

I’ve recently set up this Jekyll-based site using the Hydejack theme, so we’ll be able to include LaTeX-formatted math in posts. If our hypothesis is borne out, we’ll have the option of explaining exactly what we mean by “entropy.”

For a sample of $$n$$ outcomes $$\mathbf{x} = (x_1, x_2, \ldots, x_n)$$ chosen from some larger number of different possible outcomes, the average Shannon information (entropy) of a single outcome $$x_j$$ is given by

$$
\overline{h}(\mathbf{x}) = \frac{1}{n}\sum_{j=1}^{n} h(x_j) \text{ bits}
$$

or, equivalently,

$$
\overline{h}(\mathbf{x}) = \frac{1}{n}\sum_{j=1}^{n} \log_2 \frac{1}{p(x_j)} \text{ bits,}
$$

where $$p(x_j)$$ is the probability of the outcome $$x_j$$.
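
Here’s a quick Python sketch of what that computation looks like, estimating each $$p(x_j)$$ from the sample’s own relative frequencies; treating the letters of a single (hypothetical) legend as the outcomes is purely for illustration. With empirical probabilities, the average information coincides with the plug-in entropy of the sample.

```python
import math
from collections import Counter

def average_information_bits(sample):
    """Average Shannon information of a sample, in bits:
    hbar(x) = (1/n) * sum_j log2(1 / p(x_j)),
    with p estimated from the sample's own relative frequencies."""
    n = len(sample)
    p = {x: c / n for x, c in Counter(sample).items()}
    return sum(math.log2(1.0 / p[x]) for x in sample) / n

# Treat the letters of one legend as the outcomes x_1, ..., x_n
print(f"{average_information_bits(list('AVGVSTVS')):.3f} bits per outcome")
```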