About this project

Origin

Roman Letters grows out of a simple observation in Patrick Wyman's 2016 USC dissertation, Letters, Mobility, and the Fall of the Roman Empire: the late Roman world left behind an extraordinary volume of surviving correspondence. Senators, bishops, monks, and imperial officials all relied on letters to maintain relationships across vast distances, and many of those letters still exist, scattered across digital archives and critical editions.

This project collects that scattered corpus into a single, structured database and provides tools for exploring the communication networks it reveals.

The dataset

The database currently contains:

  • 7,049 letters
  • 43 collections
  • 1,516 people identified
  • 4,410 first English translations
  • 4,465 letters with distance data
  • 2,084 topic-tagged letters
  • 478 carrier mentions
  • 81 author bios

Letters span from roughly 97 to 800 AD, covering the transition from the unified Roman Empire to the early medieval kingdoms of western Europe. Major collections include the letters of Augustine, Gregory the Great, Symmachus, Basil of Caesarea, Jerome, Cassiodorus, and Sidonius Apollinaris, among others.

Methodology

Text collection

Texts were collected by scraping and parsing freely available digital sources. Each source required a custom parser to handle its markup, encoding, and structure. Latin and Greek originals came primarily from The Latin Library, Tertullian.org, Perseus Digital Library, and OpenGreekAndLatin's First1KGreek project (CSEL XML editions). English translations came primarily from New Advent (the Nicene and Post-Nicene Fathers series) and Tertullian.org. Additional volumes were drawn from Internet Archive scans of MGH and CSEL print editions, Latin Wikisource, the Fordham Medieval Sourcebook, Livius.org, and Demonax.info. Where OCR-sourced text was used (particularly Patrologia Graeca volumes from Internet Archive), the raw text was cleaned to remove scanning artifacts before import.

Translations

Every letter has a modern English translation. For letters where a public-domain English translation already existed (primarily the NPNF series), the modern translation was produced by modernizing that 19th-century text using Claude (Anthropic). For the 4,410 letters with no prior English translation, the modern translation was produced directly from the Latin or Greek original using Claude, guided by a detailed internal translation guide covering late antique epistolary conventions, register, and rhetorical style. All AI-assisted translations are labeled in the interface. They are provided for accessibility and research convenience, not as authoritative scholarly translations. The original Latin or Greek text is preserved alongside every translation.

A note on quality: bulk translations of Greek collections (particularly Isidore of Pelusium and Libanius) are thematic renderings rather than precise philological translations. Although OCR-sourced text was cleaned before import, the Latin and Greek originals taken from Patrologia Graeca scans may still contain residual scanning artifacts. Corrections from domain experts are welcome.

Recipient and sender identification

Sender and recipient names were extracted by automated parsing of letter headers (e.g., "To Eusebius", "Augustine to Paulinus") and then reconciled against a shared people table. Common variants, nicknames, and titles were normalized during a manual review pass. Where a letter's addressee is unknown or disputed, the recipient is recorded as "Unknown" or left blank.
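The header-parsing step can be illustrated with a minimal sketch. It assumes only the two header shapes quoted above; the function names, the alias table, and the fallback behavior are hypothetical and are not taken from the project's actual parsers.

```python
import re

# Hypothetical patterns for the two header shapes mentioned in the text.
TO_RECIPIENT = re.compile(r"^to\s+(?P<recipient>.+)$", re.IGNORECASE)
SENDER_TO_RECIPIENT = re.compile(
    r"^(?P<sender>.+?)\s+to\s+(?P<recipient>.+)$", re.IGNORECASE
)

# Hypothetical alias table standing in for the shared people table.
ALIASES = {
    "augustine of hippo": "Augustine",
    "paulinus of nola": "Paulinus",
}


def parse_header(header, default_sender=None):
    """Return (sender, recipient) parsed from a letter header.

    default_sender is used when the header names only the recipient,
    e.g. the author of the collection the letter belongs to.
    """
    text = header.strip().rstrip(".")
    match = TO_RECIPIENT.match(text)
    if match:
        return default_sender, match.group("recipient").strip()
    match = SENDER_TO_RECIPIENT.match(text)
    if match:
        return match.group("sender").strip(), match.group("recipient").strip()
    # Unrecognized header: leave the recipient unset for manual review.
    return default_sender, None


def canonical_name(name):
    """Map a parsed name onto its canonical form, or "Unknown" if missing."""
    if name is None:
        return "Unknown"
    return ALIASES.get(name.lower(), name)
```

In this sketch, `parse_header("To Eusebius", default_sender="Basil")` yields the collection author as sender, while a header that matches neither pattern is left with no recipient, so `canonical_name` records it as "Unknown".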

Location assignment and confidence levels

Geographic coordinates for letter origins and destinations were each assigned one of three confidence levels, following Wyman's methodology:

  • Strong - historically established location, confirmed by prosopographic data or explicit mention in the letter
  • Approximate - inferred from collection context, known residence periods, or regional information
  • Unknown - no reliable location data; coordinates not used in distance calculations

Confidence levels are displayed on individual letter pages and used to filter the map and network visualizations.

Distance calculation

Straight-line distances between sender and recipient locations are computed using the haversine formula (great-circle distance between two latitude/longitude coordinates). Only letters with "strong" or "approximate" confidence on both endpoints are included in distance calculations.
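As an illustration, the calculation and the confidence filter might look like the following sketch. The Earth radius constant, the dictionary shape of an endpoint, and the function names are assumptions made for the example, not details taken from the project's code.

```python
import math

# Assumption: mean Earth radius in kilometres; the project's constant is not stated.
EARTH_RADIUS_KM = 6371.0088

# Confidence levels that qualify an endpoint for distance calculations.
USABLE_CONFIDENCE = {"strong", "approximate"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def letter_distance_km(origin, destination):
    """Return the distance for a letter, or None if either endpoint is unusable.

    Each endpoint is a hypothetical dict with "lat", "lon", and "confidence".
    """
    if origin["confidence"] not in USABLE_CONFIDENCE:
        return None
    if destination["confidence"] not in USABLE_CONFIDENCE:
        return None
    return haversine_km(
        origin["lat"], origin["lon"], destination["lat"], destination["lon"]
    )
```

A letter with an "unknown" endpoint returns `None` here rather than zero, so it can be excluded from averages instead of skewing them.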

Road routing

The map timelapse can optionally display routed paths along the ancient Roman road network rather than straight-line arcs. Road data comes from the Ancient World Mapping Center (AWMC) Barrington Atlas road network, provided as GeoJSON. Each letter's endpoints are snapped to the nearest road node, and a path between them is computed using breadth-first search (BFS) over the road graph. Where no road path can be found, the display falls back to a straight-line arc.
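The routing step can be sketched as follows, assuming the road network has already been loaded into an adjacency mapping. The graph and node structures, the function names, and the simple squared-degree snapping metric are all assumptions for illustration. Note that plain BFS returns the path with the fewest road segments, which is not necessarily the shortest path in kilometres.

```python
from collections import deque


def nearest_node(point, nodes):
    """Return the id of the road node closest to point.

    point is (lat, lon); nodes maps node id -> (lat, lon). Squared
    difference in degrees is a rough stand-in for real distance.
    """
    lat, lon = point
    return min(
        nodes,
        key=lambda n: (nodes[n][0] - lat) ** 2 + (nodes[n][1] - lon) ** 2,
    )


def bfs_path(graph, start, goal):
    """Return a fewest-segments path from start to goal, or None if unreachable.

    graph maps node id -> iterable of neighbouring node ids.
    """
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None
```

When `bfs_path` returns `None`, the caller would draw the straight-line arc instead, matching the fallback described above.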

Sources

Digital text sources

Map and geographic data

Scholarly references

  • Patrick Wyman, Letters, Mobility, and the Fall of the Roman Empire (PhD dissertation, University of Southern California, 2016) - the primary scholarly framework for network analysis and geographic scope
  • Cristiana Sogno, Bradley K. Storin, and Edward J. Watts (eds.), Late Antique Letter Collections: A Critical Introduction and Reference Guide (University of California Press, 2017) - reference for collection scope and authorship context

AI translation transparency

Modern English translations were produced using Claude (Anthropic), working from either the Latin/Greek original or an existing 19th-century English version. Translation work was guided by two internal documents: a translation guide covering late antique epistolary conventions, rhetorical register, and how to handle common formulaic phrases; and a modern voice guide specifying tone, vocabulary level, and how to avoid archaism while remaining faithful to the original.

AI-generated translations are clearly marked in the interface. They are provided for accessibility and research convenience, not as authoritative scholarly translations. The original Latin or Greek is preserved alongside every translation, and 19th-century English versions are shown where available. Corrections from domain experts are welcome.

License

  • Code - MIT License. Source available on GitHub.
  • Data and translations - CC BY 4.0. Attribution: Roman Letters / romanletters.org.
  • Map tiles - DARE tiles are CC BY 4.0. Credit: Johan Ahlfeldt, Digital Atlas of the Roman Empire.
  • Road data - AWMC Barrington Atlas road network, available under ODbL.

Credits

Open source

The full dataset, scraping scripts, and this website are open source and available on GitHub.