Phylogenetic reconstruction of Anglo-Saxon month names in De temporum ratione

   The cladogram above (Figure 1) shows the relative similarity of Old English words across the oldest digitized manuscripts of De temporum ratione, chapter 15. The manuscripts were written between 780 and 836 CE and are the earliest record of Anglo-Saxon month names and calendar terms (Table 1). Descending from a 725 CE original, they contain many spelling variations, with less than 50% consensus in 13 of 30 phrases (Table 2). Because consensus-based reconstruction can be biased by error propagation, a phylogeny was computed to determine the most likely initial spelling of each phrase (Table 3):

Phrase | Association | Index | Appearance[1]
mona | moon | B1 |
monath | month | A2 |
giuli | December/January | A3 |
solmonath | February | A4 |
rhedmonath | March | A5 |
eosturmonath | April | C6 |
trimilci | May | A7 |
lida | June/July | B8 |
ueodmonath | August | B10 |
halegmonath | September | B11 |
uintirfyllith | October | B12 |
blotmonath | November | B13 |
modranect | midwinter night | B15 |
trilidi | intercalary year | D17 |
rheda | March goddess | A22 |
eostre | April goddess | D24 |

   Excluding capitalization, nearly half are spelled differently from the 1943 reconstruction[2] and the 1999 translation[3] of De temporum ratione. Also of interest is the consistent spelling of lida and trilidi as lfda and trilfdi in only the oldest manuscript (A). This was either a systematic error or the sole surviving instance of the correct spelling. Both options remain possible in one model, but the latter requires every other mention to descend from copy errors. If lfda was correct, it may be related to the Old English word ælfe[4] for the river Elbe. This reading is supported by the associated Latin translation of navigable[5] and by the different initial appearance of i and f, but it is weakened by the lack of a preceding vowel in lfda.
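
   The consensus figure quoted in the opening paragraph can be illustrated with a short Python sketch (the analysis itself was coded in R; Python is used here only for illustration). It takes the 13 readings of phrase 7 from Table 2 and reports the share of manuscripts that agree on the most common spelling. The function name consensus_share is hypothetical, and the source does not state how gaps or ties were treated, so this is only one plausible definition of consensus.

```python
from collections import Counter


def consensus_share(variants):
    """Return the most common variant and the share of readings that use it."""
    counts = Counter(variants)
    variant, n = counts.most_common(1)[0]
    return variant, n / len(variants)


# Phrase 7 readings for manuscripts A to M, taken from Table 2.
phrase_7 = [
    "trimilci", "drymilci", "trimilchi", "trimilci", "trimilchi",
    "drimylci", "trimilchi", "thrimilci", "drimilci", "drimylci",
    "thrimilci", "drimylci", "drymylci",
]

variant, share = consensus_share(phrase_7)
print(f"{variant}: {share:.0%} of manuscripts")
```

   Under this definition no spelling of phrase 7 is shared by even a quarter of the manuscripts, which is the kind of case where a simple majority vote gives little guidance.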


Methods
Data and R code

   Input data were derived from base transcriptions in plain-text format, excluding whitespace, punctuation, capitalization, and separable Latin words. The 30 phrases were considered individually, and all variants present in at least two manuscripts were scored numerically as 0, 1 or NA (Table 4). A neighbor-joining phylogeny was coded in R, with nodes defined as the collective mean of every descendant and similarity scores as the negative mean absolute difference of each trait. Collective means were used instead of progressive means to limit the bias of individual manuscripts on the reconstructed nodes in situations of close similarity (for example, FLJM). Mean absolute differences were used instead of Euclidean distances to limit the influence of missing data (NA).
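
   The scoring and joining steps described above can be sketched as follows. This is a minimal Python illustration, not the R code used for the analysis: the function names are hypothetical, the joining loop is a plain greedy merge of the most similar pair (it does not use the Q-matrix correction of textbook neighbor-joining), and the data are only traits 1 to 5 of Table 4 for five manuscripts, so the resulting order is not a finding of the study.

```python
from itertools import combinations

NA = None


def similarity(x, y):
    """Negative mean absolute difference over traits scored in both vectors."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return float("-inf")  # nothing comparable
    return -sum(abs(a - b) for a, b in pairs) / len(pairs)


def collective_mean(leaves):
    """Per-trait mean over every descendant manuscript, skipping NA."""
    node = []
    for values in zip(*leaves):
        known = [v for v in values if v is not None]
        node.append(sum(known) / len(known) if known else None)
    return node


def join_all(manuscripts):
    """Greedily merge the most similar pair of clusters until one remains."""
    clusters = {name: [vec] for name, vec in manuscripts.items()}
    order = []
    while len(clusters) > 1:
        a, b = max(
            combinations(sorted(clusters), 2),
            key=lambda p: similarity(collective_mean(clusters[p[0]]),
                                     collective_mean(clusters[p[1]])),
        )
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
        order.append(a + b)
    return order


# Traits 1 to 5 from Table 4 for manuscripts F, J, L, M and H.
traits_1_to_5 = {
    "F": [1, 1, 0, 0, 1],
    "J": [1, 1, 0, 0, 1],
    "L": [1, 0, 0, 0, 1],
    "M": [NA, 1, 0, 0, 1],
    "H": [0, 0, 1, 0, 0],
}

print(join_all(traits_1_to_5))
```

   Because each new node is recomputed from all of its leaves, a manuscript that joins late carries the same weight as one that joined early.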

   Leave-one-out cross-validation was performed for each manuscript (n=13) and each trait (n=41) to account for individual biases, and repeated after removing 8 redundant traits (1,9,10,11,13,21,27,35) to allow for systematic changes (for example, uu for u). Every run produced the same overall structure of the 4 color-coded groups shown in the cladogram, which are loosely associated with geography and share unique variants. The red Gallo-Romance group (FLJMBI) used hr in hredmonath and hreda, o in ueodmonath, d in the first drimilci, and replaced e with æ and o with u in the first æustur. The green Franconian group (CGE) added u to uueudmonath and uuintirfyllith, and a distal h to trimilchi. The purple Franconian group (HK) added a proximal h to thrimilci. Both Franconian groups used rh in rhedmonath and rheda, and u in ueudmonath. The brown Irish group (AD) shared an absence of h and a near-absence of i to y substitutions in trimilci, trilidi and uintirfillit, and used rh in rheda.
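
   The resampling scheme above can be sketched in Python as a pair of generators plus a helper that removes the 8 redundant traits. The function names are hypothetical, and the matrix below is an all-zero placeholder with the same shape as Table 4 (13 manuscripts by 41 traits), not the real scores; in practice each reduced dataset would be passed to the phylogeny routine and the resulting groups compared.

```python
REDUNDANT_TRAITS = {1, 9, 10, 11, 13, 21, 27, 35}  # trait numbers listed in the text


def drop_traits(matrix, traits):
    """Remove the given 1-based trait numbers from every manuscript vector."""
    return {
        ms: [v for i, v in enumerate(vec, start=1) if i not in traits]
        for ms, vec in matrix.items()
    }


def leave_one_manuscript_out(matrix):
    """Yield (left-out manuscript, dataset without it) for every manuscript."""
    for left_out in matrix:
        yield left_out, {ms: vec for ms, vec in matrix.items() if ms != left_out}


def leave_one_trait_out(matrix):
    """Yield (left-out trait number, dataset without it) for every trait."""
    n_traits = len(next(iter(matrix.values())))
    for trait in range(1, n_traits + 1):
        yield trait, drop_traits(matrix, {trait})


# Placeholder scores only: 13 manuscripts, 41 traits, all set to 0.
placeholder = {ms: [0] * 41 for ms in "ABCDEFGHIJKLM"}

reduced = drop_traits(placeholder, REDUNDANT_TRAITS)
print(len(list(leave_one_manuscript_out(placeholder))),
      len(list(leave_one_trait_out(placeholder))),
      len(reduced["A"]))
```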

   The first two principal components of the numeric dataset excluding NA-containing traits were then graphed to check if the 4 groups showed continuous or discrete variation (Figure 2). A uniform manifold approximation and projection[6] was used to verify same-color clustering across all dimensions (Figure 3).
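
   A dependency-free Python sketch of the first step is given below: drop every trait that contains an NA, then project the manuscripts onto the first two principal components. It uses power iteration with deflation on the covariance matrix, which is an assumption made here to avoid external libraries; the source only states that the analysis was done in R. The data are traits 1 to 5 of Table 4 for seven manuscripts, so the coordinates are illustrative only.

```python
def drop_na_traits(rows):
    """Keep only the trait columns with no missing value (None)."""
    keep = [j for j, col in enumerate(zip(*rows)) if None not in col]
    return [[row[j] for j in keep] for row in rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    cols = list(zip(*centered))
    n = len(centered)
    return [[dot(ci, cj) / (n - 1) for cj in cols] for ci in cols]


def leading_eigenpair(matrix, steps=1000):
    """Power iteration from a fixed, non-uniform start vector."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(steps):
        nxt = [dot(row, vec) for row in matrix]
        norm = dot(nxt, nxt) ** 0.5
        if norm < 1e-12:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    return dot(vec, [dot(row, vec) for row in matrix]), vec


def first_two_components(rows):
    """Return per-row (PC1, PC2) scores and the two component variances."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    cov = covariance(centered)
    val1, vec1 = leading_eigenpair(cov)
    # Deflation: remove the first component before searching for the second.
    deflated = [[c - val1 * a * b for c, b in zip(row, vec1)]
                for row, a in zip(cov, vec1)]
    val2, vec2 = leading_eigenpair(deflated)
    scores = [(dot(r, vec1), dot(r, vec2)) for r in centered]
    return scores, (val1, val2)


# Traits 1 to 5 from Table 4; trait 1 is NA for M and is therefore dropped.
subset = {
    "A": [0, 0, 1, 0, 0], "F": [1, 1, 0, 0, 1], "H": [0, 0, 1, 0, 0],
    "J": [1, 1, 0, 0, 1], "K": [0, 0, 1, 0, 0], "L": [1, 0, 0, 0, 1],
    "M": [None, 1, 0, 0, 1],
}

rows = drop_na_traits(list(subset.values()))
scores, (val1, val2) = first_two_components(rows)
for name, (pc1, pc2) in zip(subset, scores):
    print(name, round(pc1, 3), round(pc2, 3))
```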

   A second neighbor-joining phylogeny was also coded in R, following the principle of inclusive uncertainty. Each node began as a matrix of all unique non-NA variants of its constituents. When forming a secondary node, it either maintained or expanded uncertainty where variants did not match, or collapsed into the strictly matching variants where they did. Similarity scores were defined as the percentage of matching traits. Unlike the collective-mean phylogeny, this method allows for anachronistic reconstruction, but it requires a higher trait-to-sample ratio to resolve nodes that bifurcate close in time, since it is not clearly defined for simultaneously matching clusters. Although it produced the same four color-coded groups, it then encountered this problem when matching the green, purple and brown groups. Leave-one-out cross-validation of manuscripts produced inconsistent results, and removing the 8 redundant traits split the red group, suggesting the model did not fit the data.
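
   One possible reading of the inclusive-uncertainty rule is sketched below in Python. Each node holds a set of possible variants per trait; when two nodes are joined, a trait collapses to the shared variants if there are any and otherwise keeps every variant from both sides. This interpretation, the function names, and the use of traits 1 to 5 of Table 4 for manuscripts F, L and H are all assumptions made for illustration.

```python
def leaf(values):
    """Turn a scored manuscript into one set of possible variants per trait."""
    return [set() if v is None else {v} for v in values]


def merge(x, y):
    """Collapse to the shared variants if any exist, otherwise keep them all."""
    return [(a & b) or (a | b) for a, b in zip(x, y)]


def percent_matching(x, y):
    """Percentage of comparable traits in which the two nodes share a variant."""
    comparable = [(a, b) for a, b in zip(x, y) if a and b]
    if not comparable:
        return 0.0
    return 100.0 * sum(1 for a, b in comparable if a & b) / len(comparable)


F = leaf([1, 1, 0, 0, 1])
L = leaf([1, 0, 0, 0, 1])
H = leaf([0, 0, 1, 0, 0])

FL = merge(F, L)    # trait 2 stays uncertain: {0, 1}
FLH = merge(FL, H)  # trait 2 collapses to {0}; traits 1, 3 and 5 expand to {0, 1}

print(percent_matching(F, L), FL)
print(percent_matching(FL, H), FLH)
```

   The second join shows why this approach can reconstruct a variant that the first node did not commit to, and also why closely timed splits leave many traits unresolved.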

   Next, the numbers of internal inconsistencies and unique variants in each manuscript were counted as measures of reliability (Table 5). Although the first measure can be biased by the inconsistency of the source manuscript and the second by a high copy number, both are proportional to the mutation rate and transmission number. The most reliable manuscript from each group (FGHA) was then used in an inclusive-uncertainty phylogeny, which produced the same branch order as the collective-mean approach.
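
   The first reliability measure can be illustrated with a small Python sketch. It assumes that α counts every additional spelling of a word element beyond the first within one manuscript; the source does not state the rule explicitly, but this reading reproduces the value of 3 given for manuscript H in Table 5. The function name and the dictionary layout are hypothetical.

```python
def count_inconsistencies(cells):
    """Alpha: one count for every variant beyond the first in each element."""
    return sum(len(variants) - 1 for variants in cells.values())


# Word elements of manuscript H as listed in Table 5.
manuscript_h = {
    "mona": ["mona", "mano"],
    "(mona)_th": ["tis", "th"],
    "giuli": ["giuli"],
    "sol": ["sol"],
    "rhed": ["rhed"],
    "eost": ["eost"],
    "(eost)_ur": ["ur", "r"],
    "tri": ["thri"],
    "milci": ["milci"],
    "lid": ["lid"],
    "(lid)_a": ["a"],
    "ueod": ["ueud"],
    "haleg": ["haleg"],
    "uintirfyllith": ["uintirfyllith"],
    "blot": ["blot"],
    "modranect": ["modranect"],
    "(trilid)_i": ["i"],
    "(rhed)_a": ["a"],
    "(eost)_rae": ["rae"],
}

alpha_h = count_inconsistencies(manuscript_h)
print(alpha_h)
```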

   A third neighbor-joining phylogeny using progressive means and absolute differences was also performed. This phylogeny maintained the four color-coded groups, but joined brown to green and purple before red. Leave-one-out cross-validation of manuscripts reversed this finding in 5 of 13 cases. Removing redundant traits also reversed the initial finding, and continued to produce the same configuration as the collective-mean phylogeny in 12 of 13 leave-one-out cross-validations. The collective-mean approach was probably the most reliable because some manuscripts may have been copied from the same source, making their precise chronological order less significant than their overall average trait values, and because it considers the rarity of variants rather than treating all possibilities equally.
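
   The difference between the two averaging schemes can be seen in a three-manuscript Python example. A progressive mean averages the two nodes being joined, so a manuscript that joins late counts as much as everything already in the node; a collective mean averages all descendant manuscripts equally. The function names are hypothetical and the vectors are traits 1 to 5 of Table 4 for F, J and L.

```python
def progressive_mean(node_a, node_b):
    """Mean of the two nodes being joined, ignoring how many leaves each holds."""
    return [(a + b) / 2 for a, b in zip(node_a, node_b)]


def collective_mean(leaves):
    """Mean over every descendant manuscript, each weighted equally."""
    return [sum(values) / len(values) for values in zip(*leaves)]


F = [1, 1, 0, 0, 1]
J = [1, 1, 0, 0, 1]
L = [1, 0, 0, 0, 1]

progressive = progressive_mean(progressive_mean(F, J), L)
collective = collective_mean([F, J, L])

print(progressive[1], collective[1])
```

   For trait 2 the progressive node sits halfway between the two codes, while the collective node still reflects that two of the three manuscripts agree.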

   Three approaches were used to determine the most parsimonious variants of the root node (Table 6), all following the phylogeny produced by collective means. The first used all manuscripts individually (n=13), the second used all variants in each manuscript group (n=4), and the third used only the most reliable manuscript from each group (n=4). Each result shared at least one variant at every trait; however, their intersection remained ambiguous in 6 traits. Two traits (20 & 24) were resolved by comparison to similar traits (6,32 & 14,39), one (30) by considering a previously unclassified unique variant (u after t before r in phrase 6), and the remaining 3 (15,25,40) by considering terminal d a substitute for th in manuscript A (see phrases 11,28). In the third approach, where A was the only representative of its group, all consistent unique variants were also possible root variants: lfda (n=5), uintirfyllid (n=3), modronecht (n=1), eostree (n=1).
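
   The intersection step can be checked with a short Python sketch over the R1, R2, R3, R0 and R entries of Table 6. It assumes that an entry listing both 0 and 1 means both codes remain possible at the root, which is an interpretation rather than something the source states; under that reading the intersection of R1 to R3 reproduces the R0 column and leaves exactly the six ambiguous traits named above. Variable and function names are hypothetical.

```python
import re

# One string per trait (1 to 41): the R1, R2, R3, R0 and R entries of Table 6,
# written without separators. "01" marks an entry where both codes are possible.
TABLE_6 = """
00000 00000 11111 00000 00000 00000 00000 00000 00000 00000
0101111 11111 00000 11111 010101011 00000 000100 11111 000100 010101010
00000 00000 00000 010101011 010101011 00000 00000 00000 110111 010101010
001000 000100 00000 00000 00000 0101111 00000 00000 0101111 010101011 1010111
""".split()


def parse(row):
    """Split one row into five entries, each a set of possible codes."""
    return [set(map(int, cell)) for cell in re.findall(r"01|[01]", row)]


rows = [parse(row) for row in TABLE_6]

# Intersect the three reconstructions trait by trait.
intersections = [r1 & r2 & r3 for r1, r2, r3, _r0, _r in rows]
ambiguous = [trait for trait, cell in enumerate(intersections, start=1)
             if len(cell) > 1]

print(ambiguous)
```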

   Overall, the main limitations of these methods are that reconstruction is limited to the most recent common ancestor, which may not be the initial 725 CE draft, and that classification of traits involves some degree of interpretation. Next steps would be to include the 3 undigitized same-period manuscripts when available, and to expand the date range to include other 9th century manuscripts.[7] To compensate for the higher sample number, the entirety of chapter 15 should be transcribed and converted into variant format, not just the Old English words. Edit transcriptions could also be used as potentially independent samples, and models expanded to allow for hybridization scenarios. Additionally, transition likelihoods between variants could be applied to place them in n-dimensional space (for example, Δ hred-hered < Δ rhed-hered).
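
   As a rough stand-in for such transition likelihoods, the Python sketch below uses plain Levenshtein edit distance, in which every insertion, deletion and substitution costs 1. This is a swapped-in proxy chosen for illustration, not the weighting proposed above; it does, however, reproduce the ordering in the example, since hred is one edit from hered while rhed is three.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit cost for insert, delete and substitute."""
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,             # delete s
                current[j - 1] + 1,          # insert t
                previous[j - 1] + (s != t),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(edit_distance("hred", "hered"), edit_distance("rhed", "hered"))
```

   A likelihood-based version would replace the unit costs with costs estimated from how often each change is actually observed between related manuscripts.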


Materials

Table 1. ^ Manuscripts in phylogeny
MsLibraryShelfmarkFoliosDate (bold link)ReferenceBODT
AÖNB WienCod. 152983v780-820 CECLA 1551-
BUB WürzburgM.p.th.f.4648r-48v792-807 CECLA 1413103
CEDDB KölnCod. 10380r-81r801-810 CE-2
DEDDB KölnCod. 83-II104r-104v805 CESCHH 632
EBA VaticanaPal. lat. 144849r810 CEDQH 144890
FSB St. GallenCod. Sang. 25169-70810-820 CEBStK 21010
GBA VaticanaPal. lat. 144952v-53v812 CEDQH 144991
HBSB MünchenClm 1472563v-65r1st quarter 9th centuryVKHBS 2457
IUB LeidenSCA 2858v-59r~816 CE-42
JBNF ParisLatin 1301371r-71v1st third 9th century-70
KUB Kassel2° Ms. astron. 224r-24v1st third & mid 9th centuryBStK 3253
LB GenèveMs. lat. 5060v-61r~825 CE-12
MVBA MilanoD 30 inf.47v-48r~836 CE-14

Table 2. ^ Old English transcripts
Phrase | A | B | C | D | E | F | G | H | I | J | K | L | M
1 | _a | mona | mona | maon | mona | mona | mona | mona | mona | moena | mona | mona | mona
2 | monath | monath | monath | monath | monath | monath | monath | monatis | monath | monath | monath | monath | monath
3 | giuli | giuli | giuli | giuli | guli | giuli | giuli | giuli | giul_ | giuli | giuli | giuli | giuli
4 | solmonath | solmonath | solmonath | solmanath | solmonath | solmonath | solmonath | solmonath | solmonath | solmonath | solmonath | solmonath | solmonath
5 | rhed_nath | hredmonath | rehdmonath | remmonath | redmonath | hredmonath | rethmonath | rhedmonath | hredmonath | heredmonath | rhedmonath | hredmonath | redmonath
6 | eustormonath | aesturmunath | eosturmonath | eosturmanath | eosturmonath | aeusturmonath | eosturmonath | eosturmonath | esturmunath | aeusturmonath | eosturmonath | eusturmonath | aeusturmonath
7 | trimilci | drymilci | trimilchi | trimilci | trimilchi | drimylci | trimilchi | thrimilci | drimilci | drimylci | thrimilci | drimylci | drymylci
8 | lfda | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida
9 | lfda | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida
10 | ueodmonad | ueodmonath | uueudmonath | ueutmonath | uueudmonath | ueodmonath | uueudmonath | ueudmonath | ueodmonath | ueodmonath | ueudmonath | ueodmonath | uuaeodmonath
11 | halegmonath | halegmonath | alegmonath | alegmonath | halegmonath | alegmonath | halegmonath | halegmonath | halegmonath | halegmonath | haleggemonath | alegmonath | halegmonath
12 | _ntirfyllid | uintirfyllith | uuintirfillith | uintirfilit | uuintyrfyllith | uintirfyllith | uuintirfillith | uintirfyllith | uintirfillit | uintirfyllith | uintirfyllith | uintirfyllit | uintirfyllit
13 | blothmonad | blotmonath | blatmonath | blotmonath | blodmonath | blodmonath | blotmonath | blotmonath | blotmanoth | blotmonath | blotmonath | lotmonath | blothmonath
14 | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giuli | giliu
15 | modronecht | modranect | modranect | modranect | moderanea | modranect | modranect | modranect | modranect | modranect | modraneht | modranect | modranect
16 | _fda | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida
17 | trilf_ | thrilidi | thrilidi | trilidi | thrilidi | thrilidi | thrilidi | thrilidi | trhilidi | thrilidi | thrilidi | trilidi | thrylidi
18 | _rfillid | uinthyrfyllth | uuintirfillith | uintirfillit | uuintyrfyllit | uintirfyllith | uuintyrfyllith | uintirfyllith | uinthyrfyllit | uintirfullith | uintirfyllith | uintirfillith | uinturfyllith
19 | gi_li | giuli | giuli | giuli | singuili | giuli | giuli | giuli | giuli | giuli | giuli | gili | giuli
20 | s_nath | solmanoth | solmonath | solmonath | solmonath | solmonath | solmonath | solmanoth | solmonath | solmonath | solmanoth | solmonath | solmonath
21 | rhedmonad | hredmonath | rethmonath | redmonath | redmonath | hredmonath | rhedmonath | rhedmonath | hredmonath | hredmonath | ruedmonath | hredmonath | hredmonath
22 | rheda | hreda | rheda | rheda | rheda | hreda | rehda | rheda | hreda | hreda | rheda | hreda | hreda
23 | eustormo_ath | eosturmonath | eosturmonath | eostormonath | eosturmonath | eosturmonath | eosturmonath | eostrmonath | eosturmunat | eusturmonath | eosturmonath | eosturmonath | eosturmonath
24 | eostree | eostrae | eostrae | eostre | eostrae | eostre | eostrae | eostrae | eostre | euster | eostrae | eostrae | eostre
25 | trimilci | trimylci | trimilchi | trimilci | trimilchi | thrimylci | trimilchr | thrimilci | trimilci | trimilchi | theimilci | thrimilci | thrymilci
26 | lfda | lidaru | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida | lida
27 | ueodmonad | ueothmonath | uueudmonath | neutmonath | ueuudmonath | ueodmonath | uueudmonath | ueudmonath | veodmonath | ueodmonath | ueudmonath | ueodmonath | ueodmanath
28 | halegmonad | halegmonath | halegmonath | halegmonath | halegmonath | halegmonath | halegmonath | halegmonath | halegmo_ath | halegmonath | aligmonath | halegmonath | halegmonath
29 | uintirf_llid | uintirfyllith | uuintyrsyllith | uintirfillit | uuintyrfyllit | uintirfyllith | uuintyrfyllith | uintirfyllith | uinthyrfyllith | uintirfillit | uintirfyllith | uintirfillith | uintirfyllith
30 | blothmonad | blothmonath | blotmonath | blothmonath | blodmonath | blotmonath | blotmonath | blotmonath | blothmonath | blothmonath | blotmonath | blotmonath | hlothmonath

Table 4. ^ Variant assignments
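Trait | Phrase | Code 1 | Code 0 | A | B | C | D | E | F | G | H | I | J | K | L | M
1 | 5 | h before r | h after r before m | 0 | 1 | 0 | NA | NA | 1 | 0 | 0 | 1 | 1 | 0 | 1 | NA
2 | 6 | a before e | no a before e | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1
3 | 6 | o before r | no o before r | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0
4 | 6 | u after m | no u after m | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
5 | 7 | starts with d | starts with t | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1
6 | 7 | h before r | no h before r | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
7 | 7 | y before m | no y before m | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
8 | 7 | y after m | no y after m | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1
9 | 7 | h after c | no h after c | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
10 | 10 | starts with uu | does not start with uu | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
11 | 10 | o before m | no o before m | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1
12 | 11 | starts with h | starts with a | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1
13 | 12 | starts with uu | does not start with uu | NA | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
14 | 12 | y after f | no y after f | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1
15 | 12 | ends with h | does not end with h | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0
16 | 13 | d before m | no d before m | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
17 | 13 | h before m | no h before m | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
18 | 15 | c after e | no c after e | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1
19 | 15 | h before t | no h before t | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
20 | 17 | h before l | no h before l | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1
21 | 18 | starts with uu | does not start with uu | NA | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
22 | 18 | h before r | no h before r | NA | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0
23 | 18 | y before r | no y before r | NA | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
24 | 18 | y after f | no y after f | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1
25 | 18 | ends with h | does not end with h | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1
26 | 20 | a before n | no a before n | NA | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
27 | 21 | h before r | h after r before m | 0 | 1 | 0 | NA | NA | 1 | 0 | 0 | 1 | 1 | NA | 1 | 1
28 | 22 | h before r | h after r | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1
29 | 23 | o before s | no o before s | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1
30 | 23 | o after t before r | no o after t before r | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
31 | 24 | a after r | no a after r | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0
32 | 25 | h before m | no h before m | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1
33 | 25 | y after m | no y after m | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
34 | 25 | h after c | no h after c | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0
35 | 27 | uu before m | no uu before m | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
36 | 27 | o before m | no o before m | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1
37 | 29 | starts with uu | does not start with uu | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
38 | 29 | y before r | no y before r | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0
39 | 29 | y after f | no y after f | NA | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1
40 | 29 | ends with h | does not end with h | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1
41 | 30 | h before m | no h before m | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1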

Table 5. ^ Internal inconsistencies (α) and unique variants (β)
Ms | α | β | mona | (mona)_th | giuli | sol | rhed | eost | (eost)_ur | tri | milci | lid | (lid)_a | ueod | haleg | uintirfyllith | blot | modranect | (trilid)_i | (rhed)_a | (eost)_rae
A | 3 | 5 | mona | d/th | giuli | sol | rhed | eust/eost | or | tri | milci | lfd | a | ueod | haleg | uintirfyllid/uintirfillid | bloth | modronecht | NA | a | ree
B | 10 | 4 | mona/muna/mano | th | giuli | sol | hred | aest/eost | ur | dry/thri/tri | milci/mylci | lid | a/aru | ueod/ueoth | haleg | uintirfyllith/uinthyrfyllth | blot/bloth | modranect | i | a | rae
C | 6 | 2 | mona | th | giuli | sol | rehd/reth/rhed | eost | ur | tri/thri | milchi | lid | a | uueud | aleg/haleg | uuintirfillith/uuintyrsyllith | blat/blot | modranect | i | a | rae
D | 9 | 5 | maon/mona/mana | th | giuli | sol | rem/red/rhed | eost | ur/or | tri | milci | lid | a | ueut/neut | aleg/haleg | uintirfilit/uintirfillit | blot/bloth | modranect | i | a | re
E | 6 | 5 | mona | th | guli/giuli/singuili | sol | red/rhed | eost | ur | tri/thri | milchi | lid | a | uueud/ueuud | haleg | uuintyrfyllith/uuintyrfyllit | blod | moderanea | i | a | rae
F | 4 | 1 | mona | th | giuli | sol | hred | aeust/eost | ur | dri/thri | mylci | lid | a | ueod | aleg/haleg | uintirfyllith | blod/blot | modranect | i | a | re
G | 5 | 1 | mona | th | giuli | sol | reth/rhed/rehd | eost | ur | tri/thri | milchi/milchr | lid | a | uueud | haleg | uuintirfillith/uuintyrfyllith | blot | modranect | i | a | rae
H | 3 | 2 | mona/mano | tis/th | giuli | sol | rhed | eost | ur/r | thri | milci | lid | a | ueud | haleg | uintirfyllith | blot | modranect | i | a | rae
I | 10 | 5 | mona/muna/mano | th/t | giuli | sol | hred | est/eost | ur | dri/trhi/tri | milci | lid | a | ueod/veod | haleg | uintirfillit/uinthyrfyllit/uinthyrfyllith | blot/bloth | modranect | i | a | re
J | 9 | 3 | moena/mona | th | giuli | sol | hered/hred | aeust/eust | ur | dri/thri/tri | mylci/milchi | lid | a | ueod | haleg | uintirfyllith/uintirfullith/uintirfillit | blot/bloth | modranect | i | a | er
K | 5 | 4 | mona/mano | th | giuli | sol | rhed/rued | eost | ur | thri/thei | milci | lid | a | ueud | halegge/alig | uintirfyllith | blot | modraneht | i | a | rae
L | 8 | 3 | mona | th | giuli/gili | sol | hred | eust/eost | ur | dri/tri/thri | mylci/milci | lid | a | ueod | aleg/haleg | uintirfyllit/uintirfillith | lot/blot | modranect | i | a | rae
M | 10 | 6 | mona/mana | th | giuli/giliu | sol | red/hred | aeust/eost | ur | dry/thry | mylci/milci | lid | a | uuaeod/ueod | haleg | uintirfyllit/uinturfyllith/uintirfyllith | bloth/hloth | modranect | i | a | re

Table 6. ^ Root variant reconstructions (R1-3), intersection (R0) and resolution (R)
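Trait | Phrase | Code 1 | Code 0 | R1 | R2 | R3 | R0 | R
1 | 5 | h before r | h after r before m | 0 | 0 | 0 | 0 | 0
2 | 6 | a before e | no a before e | 0 | 0 | 0 | 0 | 0
3 | 6 | o before r | no o before r | 1 | 1 | 1 | 1 | 1
4 | 6 | u after m | no u after m | 0 | 0 | 0 | 0 | 0
5 | 7 | starts with d | starts with t | 0 | 0 | 0 | 0 | 0
6 | 7 | h before r | no h before r | 0 | 0 | 0 | 0 | 0
7 | 7 | y before m | no y before m | 0 | 0 | 0 | 0 | 0
8 | 7 | y after m | no y after m | 0 | 0 | 0 | 0 | 0
9 | 7 | h after c | no h after c | 0 | 0 | 0 | 0 | 0
10 | 10 | starts with uu | does not start with uu | 0 | 0 | 0 | 0 | 0
11 | 10 | o before m | no o before m | 0/1 | 0/1 | 1 | 1 | 1
12 | 11 | starts with h | starts with a | 1 | 1 | 1 | 1 | 1
13 | 12 | starts with uu | does not start with uu | 0 | 0 | 0 | 0 | 0
14 | 12 | y after f | no y after f | 1 | 1 | 1 | 1 | 1
15 | 12 | ends with h | does not end with h | 0/1 | 0/1 | 0/1 | 0/1 | 1
16 | 13 | d before m | no d before m | 0 | 0 | 0 | 0 | 0
17 | 13 | h before m | no h before m | 0 | 0 | 0/1 | 0 | 0
18 | 15 | c after e | no c after e | 1 | 1 | 1 | 1 | 1
19 | 15 | h before t | no h before t | 0 | 0 | 0/1 | 0 | 0
20 | 17 | h before l | no h before l | 0/1 | 0/1 | 0/1 | 0/1 | 0
21 | 18 | starts with uu | does not start with uu | 0 | 0 | 0 | 0 | 0
22 | 18 | h before r | no h before r | 0 | 0 | 0 | 0 | 0
23 | 18 | y before r | no y before r | 0 | 0 | 0 | 0 | 0
24 | 18 | y after f | no y after f | 0/1 | 0/1 | 0/1 | 0/1 | 1
25 | 18 | ends with h | does not end with h | 0/1 | 0/1 | 0/1 | 0/1 | 1
26 | 20 | a before n | no a before n | 0 | 0 | 0 | 0 | 0
27 | 21 | h before r | h after r before m | 0 | 0 | 0 | 0 | 0
28 | 22 | h before r | h after r | 0 | 0 | 0 | 0 | 0
29 | 23 | o before s | no o before s | 1 | 1 | 0/1 | 1 | 1
30 | 23 | o after t before r | no o after t before r | 0/1 | 0/1 | 0/1 | 0/1 | 0
31 | 24 | a after r | no a after r | 0 | 0/1 | 0 | 0 | 0
32 | 25 | h before m | no h before m | 0 | 0 | 0/1 | 0 | 0
33 | 25 | y after m | no y after m | 0 | 0 | 0 | 0 | 0
34 | 25 | h after c | no h after c | 0 | 0 | 0 | 0 | 0
35 | 27 | uu before m | no uu before m | 0 | 0 | 0 | 0 | 0
36 | 27 | o before m | no o before m | 0/1 | 0/1 | 1 | 1 | 1
37 | 29 | starts with uu | does not start with uu | 0 | 0 | 0 | 0 | 0
38 | 29 | y before r | no y before r | 0 | 0 | 0 | 0 | 0
39 | 29 | y after f | no y after f | 0/1 | 0/1 | 1 | 1 | 1
40 | 29 | ends with h | does not end with h | 0/1 | 0/1 | 0/1 | 0/1 | 1
41 | 30 | h before m | no h before m | 1 | 0/1 | 0/1 | 1 | 1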


Notes

1. ^ The initial appearance is based on the alphabet used in the earliest surviving fragment of De temporum ratione (725 CE), shelfmark Hs-4262 in the Universitäts- und Landesbibliothek Darmstadt.

2. ^ In Bedae Opera de temporibus (Jones 1943), pages 211-212: mona, monath, giuli, solmonath, hredmonath, eosturmonath, thrimilchi, lida, vveodmonath, halegmonath, vvinterfilleth, blodmonath, modranect, thrilidi, Hreda, Eostre. Digitized copy.

3. ^ In Bede The Reckoning of Time (Wallis 1999), pages 53-54: mona, monath, Giuli, Solmonath, Hrethmonath, Eosturmonath, Thrimilchi, Litha, Weodmonath, Halegmonath, Winterfilleth, Blodmonath, Modranecht, Thrilithi, Hretha, Eostre. Digitized copy.

4. ^ First attested in an Old English adaptation of Historia adversus paganos (early 9th century) on folio 7v, shelfmark Add MS 47967 in the British Library.

be ƿestan eald seaxum is ælfe muþa
to the west of old saxony is the elbe's mouth

Also translated to modern English in The Anglo-Saxon Version (Barrington 1773), page 8. Digitized copy.

5. ^ In De temporum ratione (780-820 CE) on folio 3v, shelfmark Cod. 15298 in the Österreichische Nationalbibliothek.

lfda dr̄ blandus sū nauigabilis
lfda is called bland or navigable


6. ^ R package umap based on UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction (McInnes 2018).

7. ^ De temporum ratione manuscripts reviewed here.

Archived on 2020-07-24.