% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
\PassOptionsToPackage{dvipsnames,svgnames,x11names}{xcolor}
%
\documentclass[
number,
preprint]{elsarticle}
\usepackage{amsmath,amssymb}
\usepackage{iftex}
\ifPDFTeX
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
\usepackage{unicode-math}
\defaultfontfeatures{Scale=MatchLowercase}
\defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
\usepackage{lmodern}
\ifPDFTeX\else
% xetex/luatex font selection
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
\KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\setlength{\emergencystretch}{3em} % prevent overfull lines
\setcounter{secnumdepth}{5}
% Make \paragraph and \subparagraph free-standing
\makeatletter
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}{
\@ifstar
\xxxParagraphStar
\xxxParagraphNoStar
}
\newcommand{\xxxParagraphStar}[1]{\oldparagraph*{#1}\mbox{}}
\newcommand{\xxxParagraphNoStar}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}{
\@ifstar
\xxxSubParagraphStar
\xxxSubParagraphNoStar
}
\newcommand{\xxxSubParagraphStar}[1]{\oldsubparagraph*{#1}\mbox{}}
\newcommand{\xxxSubParagraphNoStar}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
\makeatother
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\usepackage{longtable,booktabs,array}
\usepackage{calc} % for calculating minipage widths
% Correct order of tables after \paragraph or \subparagraph
\usepackage{etoolbox}
\makeatletter
\patchcmd\longtable{\par}{\if@noskipsec\mbox{}\fi\par}{}{}
\makeatother
% Allow footnotes in longtable head/foot
\IfFileExists{footnotehyper.sty}{\usepackage{footnotehyper}}{\usepackage{footnote}}
\makesavenoteenv{longtable}
\usepackage{graphicx}
\makeatletter
\newsavebox\pandoc@box
\newcommand*\pandocbounded[1]{% scales image to fit in text height/width
\sbox\pandoc@box{#1}%
\Gscale@div\@tempa{\textheight}{\dimexpr\ht\pandoc@box+\dp\pandoc@box\relax}%
\Gscale@div\@tempb{\linewidth}{\wd\pandoc@box}%
\ifdim\@tempb\p@<\@tempa\p@\let\@tempa\@tempb\fi% select the smaller of both
\ifdim\@tempa\p@<\p@\scalebox{\@tempa}{\usebox\pandoc@box}%
\else\usebox{\pandoc@box}%
\fi%
}
% Set default figure placement to htbp
\def\fps@figure{htbp}
\makeatother
% definitions for citeproc citations
\NewDocumentCommand\citeproctext{}{}
\NewDocumentCommand\citeproc{mm}{%
\begingroup\def\citeproctext{#2}\cite{#1}\endgroup}
\makeatletter
% allow citations to break across lines
\let\@cite@ofmt\@firstofone
% avoid brackets around text for \cite:
\def\@biblabel#1{}
\def\@cite#1#2{{#1\if@tempswa , #2\fi}}
\makeatother
\newlength{\cslhangindent}
\setlength{\cslhangindent}{1.5em}
\newlength{\csllabelwidth}
\setlength{\csllabelwidth}{3em}
\newenvironment{CSLReferences}[2] % #1 hanging-indent, #2 entry-spacing
{\begin{list}{}{%
\setlength{\itemindent}{0pt}
\setlength{\leftmargin}{0pt}
\setlength{\parsep}{0pt}
% turn on hanging indent if param 1 is 1
\ifodd #1
\setlength{\leftmargin}{\cslhangindent}
\setlength{\itemindent}{-1\cslhangindent}
\fi
% set entry spacing
\setlength{\itemsep}{#2\baselineskip}}}
{\end{list}}
\usepackage{calc}
\newcommand{\CSLBlock}[1]{\hfill\break\parbox[t]{\linewidth}{\strut\ignorespaces#1\strut}}
\newcommand{\CSLLeftMargin}[1]{\parbox[t]{\csllabelwidth}{\strut#1\strut}}
\newcommand{\CSLRightInline}[1]{\parbox[t]{\linewidth - \csllabelwidth}{\strut#1\strut}}
\newcommand{\CSLIndent}[1]{\hspace{\cslhangindent}#1}
\usepackage{booktabs}
\usepackage{caption}
\usepackage{longtable}
\usepackage{colortbl}
\usepackage{array}
\usepackage{anyfontsize}
\usepackage{multirow}
\makeatletter
\@ifpackageloaded{caption}{}{\usepackage{caption}}
\AtBeginDocument{%
\ifdefined\contentsname
\renewcommand*\contentsname{Table of contents}
\else
\newcommand\contentsname{Table of contents}
\fi
\ifdefined\listfigurename
\renewcommand*\listfigurename{List of Figures}
\else
\newcommand\listfigurename{List of Figures}
\fi
\ifdefined\listtablename
\renewcommand*\listtablename{List of Tables}
\else
\newcommand\listtablename{List of Tables}
\fi
\ifdefined\figurename
\renewcommand*\figurename{Figure}
\else
\newcommand\figurename{Figure}
\fi
\ifdefined\tablename
\renewcommand*\tablename{Table}
\else
\newcommand\tablename{Table}
\fi
}
\@ifpackageloaded{float}{}{\usepackage{float}}
\floatstyle{ruled}
\@ifundefined{c@chapter}{\newfloat{codelisting}{h}{lop}}{\newfloat{codelisting}{h}{lop}[chapter]}
\floatname{codelisting}{Listing}
\newcommand*\listoflistings{\listof{codelisting}{List of Listings}}
\makeatother
\makeatletter
\makeatother
\makeatletter
\@ifpackageloaded{caption}{}{\usepackage{caption}}
\@ifpackageloaded{subcaption}{}{\usepackage{subcaption}}
\makeatother
\journal{Journal of Computer Applications in Archaeology}
\usepackage{bookmark}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\urlstyle{same} % disable monospaced font for URLs
\hypersetup{
pdftitle={XRONOS: An Open Data Infrastructure for Archaeological Chronology},
pdfauthor={Joe Roe; Clemens Schmid; Setareh Ebrahimiabareghi; Caroline Heitz; Martin Hinz},
pdfkeywords={open data, chronology, chronometry, radiocarbon
dating, dendrochronology, typological dating},
colorlinks=true,
linkcolor={blue},
filecolor={Maroon},
citecolor={Blue},
urlcolor={Blue},
pdfcreator={LaTeX via pandoc}}
\setlength{\parindent}{6pt}
\begin{document}
\begin{frontmatter}
\title{XRONOS: An Open Data Infrastructure for Archaeological
Chronology}
\author[1]{Joe Roe%
\corref{cor1}%
}
\ead{joeroe@hey.com}
\author[2]{Clemens Schmid%
%
}
\author[1]{Setareh Ebrahimiabareghi%
%
}
\author[1]{Caroline Heitz%
%
}
\author[1]{Martin Hinz%
\corref{cor1}%
}
\ead{martin.hinz@unibe.ch}
\affiliation[1]{organization={University of Bern, Institute of
Archaeological Sciences},,postcodesep={}}
\affiliation[2]{organization={Max Planck Institute for
Geoanthropology},,postcodesep={}}
\cortext[cor1]{Corresponding author}
\begin{abstract}
XRONOS (\url{https://xronos.ch}) is an open data infrastructure for the
backbone of the archaeological record -- chronology. It provides open
access to published radiocarbon dates and other chronometric data from
any period, anywhere in the world. By collating a large number of
existing regional and global compilations of dates, XRONOS offers the
most comprehensive radiocarbon database yet published, with over 350,000
radiocarbon and 75,000 site records. It also provides a foundation for
expanding the systematic collection of chronometric information beyond
radiocarbon, with support for typological and dendrochronological dates
and a generalisable data model that can be adapted to other methods of
absolute dating. Automated and semi-automated quality control processes
ensure that data from diverse sources is continuously integrated and
standardised, making it easier to find information of interest and
reducing the need for manual data cleaning by end users. In this paper
we describe the concept and implementation of XRONOS in relation to the
state of the art in chronometric data-sharing, and evaluate its
potential as a general-purpose open repository and curation platform for
archaeological chronology.
\end{abstract}
\begin{keyword}
open data \sep chronology \sep chronometry \sep radiocarbon
dating \sep dendrochronology \sep
typological dating
\end{keyword}
\end{frontmatter}
\section{Introduction}\label{introduction}
Chronology is the backbone of the archaeological record. As a necessary
prerequisite to understanding the context of any past event or process
(Lucas, 2004), it has unsurprisingly been at the forefront of
methodological development in archaeology for as long as the discipline
has existed: from putting finds and events in sequence (Ford, 1962;
Harris, 1979; Petrie, 1899; Thomsen, 1836; Worsaae, 1843) to an
increasingly wide array of scientific methods that place them on an
absolute timescale (Bada and Helfman, 1975; Daniels et al., 1953;
Douglass, 1929; Evernden et al., 1965; Libby, 1955) and an increasingly
sophisticated set of statistical tools to build them into chronologies
(Buck et al., 1991; Crema, 2024; Levy et al., 2021; Mischka, 2004;
Suess, 1967). If archaeology is to be an open science (Lake, 2012), it
is therefore critical that effective open access to chronological
information be placed front and centre.
Over the last two decades, archaeologists have answered this call by
publishing an increasing number of compilations of dates from
archaeological contexts as open data. These efforts have facilitated
re-evaluations of chronologies themselves (e.g. Higham et al., 2014;
Katsianis et al., 2020; Loftus et al., 2019; Prates et al., 2020) but
also the development of novel ways of using chronological data (e.g. Crema
et al., 2024; Crema, 2022; Grove, 2011; Marom and Wolkowski, 2024;
Riris et al., 2024; Silva and Steele, 2014). The focus has been
overwhelmingly on radiocarbon dating and most compilations focus on a
single region and/or period. The profusion of open radiocarbon data in
particular has prompted several initiatives towards a global synthesis
(e.g. Bird et al., 2022; Bronk Ramsey et al., 2019; Schmid et al.,
2019).
At the same time, the broad range of other types of chronological
information used in archaeology---from other radiometric methods to
dendrochronology to typological dating and epigraphy---remains
relatively difficult to access as open data. Even when it comes to
radiocarbon data, the coverage of available compilations is patchy both
geographically and in time and of variable quality (see
Section~\ref{sec-c14-compilation}). The publication of many overlapping,
non-standardised and mostly static open data resources means that it is
still difficult to obtain reliable and up-to-date chronological
datasets, especially for applications that crosscut conventional
geographic and temporal domains of research. Initiatives towards
synthesis have improved this situation, but the goal of a global dataset
that is both comprehensive and up-to-date remains elusive.
XRONOS is a new open data infrastructure that aims to provide access to
published radiocarbon dates and other chronometric data from any period,
anywhere in the world. It is our attempt to move the state of the art in
open archaeological chronology beyond the publication of static, one-off
resources (`uploading CSVs,' Batist, 2023, pp. 188--189), and towards a
living digital infrastructure (Kintigh, 2006) embedded in a transparent
and sustainable collaborative network. The core of XRONOS is a server
application that ingests chronological data from diverse sources,
facilitates semi-automated and manual curation of this data, and makes
it available via both a web-based graphical user interface (GUI) and
a machine-readable application programming interface (API). The web
frontend can be accessed via \url{https://xronos.ch} and all components
of the software are developed as free and open source software with
source code available at \url{https://github.com/xronos-ch}.
In the remainder of this paper, we describe the concept and
implementation of XRONOS in relation to the state of the art in open
chronometric data in archaeology, and evaluate our progress towards
these goals to date. Since we envisage both XRONOS as a dataset
and XRONOS as software to be continually developing resources, the
description here should be read as a `snapshot' of the project as of
writing rather than its final state.
\section{State of the Art}\label{state-of-the-art}
\subsection{Compilations of radiocarbon
dates}\label{sec-c14-compilation}
Though an \emph{explicit} emphasis on `open data' is a relatively recent
phenomenon in archaeology (Lake, 2012), the open publication of compiled
radiocarbon dates has a substantial prehistory. Arnold and Libby (1951)
initiated the tradition of regularly publishing `date lists', a practice
that was subsequently continued by radiocarbon
laboratories as supplements to journals such as \emph{Radiocarbon} and
\emph{Archaeometry}. However, as the number of labs and volume of
radiocarbon dates being produced grew, this paper-based format became
impractical and mostly disappeared (Bronk Ramsey et al., 2019; cf. e.g.
Ndeye et al., 2022), without being replaced by another form of
systematic data-sharing or dissemination. Additionally, because date
lists were sourced from radiocarbon laboratories directly---not from
those who collected the sample---they typically included only very
limited contextual information. On the eve of the AMS revolution there
was an effort to create a computerised `International Radiocarbon
Database' (Kra, 1988)---already by 1989 described as a ``much needed,
long overdue enterprise'' (Kra, 1989, p. 1067)---but it never came to
fruition.
Thus, even though radiocarbon data comes from a relatively limited
number of sources (some 172 active labs, Radiocarbon, 2024) and has
relatively standardised reporting conventions (Bayliss, 2015; Millard,
2014), in practice the only way to produce aggregated datasets in recent
decades has been to manually search through relevant literature for
dates reported secondarily by the submitter of the sample. This already
laborious process is further hampered by significant inconsistency in
how closely authors adhere to reporting conventions for measurements and
sample metadata, a lack of conventions on the reporting of
\emph{contextual} information, weak or nonexistent disciplinary norms
regarding the responsibility to publish results openly in a timely
fashion, and a range of other issues affecting data reuse (Moody et al.,
2021).
\begin{table}
\caption{\label{tbl-c14-datasets}Summary of published compilations of
radiocarbon dates. For full data, see supplementary materials.}
\centering{
\fontsize{9.8pt}{11.7pt}\selectfont
\begin{tabular*}{\linewidth}{@{\extracolsep{\fill}}>{\raggedright\arraybackslash}p{\dimexpr 0.40\linewidth -2\tabcolsep-1.5\arrayrulewidth}>{\raggedleft\arraybackslash}p{\dimexpr 0.15\linewidth -2\tabcolsep-1.5\arrayrulewidth}>{\raggedleft\arraybackslash}p{\dimexpr 0.15\linewidth -2\tabcolsep-1.5\arrayrulewidth}>{\raggedright\arraybackslash}p{\dimexpr 0.30\linewidth -2\tabcolsep-1.5\arrayrulewidth}}
\toprule
Database & Published & Dates & References \\
\midrule\addlinespace[2.5pt]
\href{https://bda.huma-num.fr/}{Base de Données Archéologique} & 1994 & 7,000 & @Perrin2019 \\
\href{https://andesc14.pl}{ANDES 14C} & 1994 & 5,800 & @MichczynskiEtAl1995 \\
\href{http://www.tayproject.org/enghome.html}{Archaeological Settlements of Turkey} & 1998 & 1,600 & @TanindiErdogu2005 \\
\href{https://www.canadianarchaeology.ca/}{Canadian Archaeological Radiocarbon Database} & 1999 & 171,500 & @GajewskiEtAl2011; @KellyEtAl2022 \\
\href{https://radonb.ufg.uni-kiel.de/}{RADO.NB (incl. RADON and RADON-B)} & 1999 & 34,200 & @Raetzel-Fabian1999; @RADON; @RADONB; @RADO.NB \\
\href{https://www.waikato.ac.nz/nzcd}{New Zealand Radiocarbon Database} & 2000 & 2,000 & @McFadgenEtAl2000 \\
\href{http://web.archive.org/web/20080509082232/http://www.canew.org/index.html}{International Central Anatolian Neolithic e-Workshop databases} & 2001 & 1,000 & @ReingruberThissen2005; @ReingruberThissen2009 \\
\href{https://ees.kuleuven.be/geography/projects/14c-palaeolithic/}{Radiocarbon Palaeolithic Europe Database} & 2002 & 17,900 & @Vermeersch2024 \\
\href{https://zenodo.org/doi/10.5281/zenodo.7215741}{CalPal database} & 2002 & 49,800 & @Weninger2022 \\
\href{http://context-database.uni-koeln.de}{CONTEXT} & 2002 & 2,900 & @BohnerSchyle2004 \\
\href{http://pidba.utk.edu/dating.htm}{Paleoindian Database of the Americas} & 2003 & 1,300 & @AndersonEtAl2010 \\
\href{https://archaeologydataservice.ac.uk/archives/view/austarch_na_2014/}{AustArch (1, 2, and 3)} & 2008 & 5,000 & @WilliamsEtAl2008; @WilliamsSmith2012; @WilliamsSmith2013 \\
\href{https://sites.google.com/site/chapplearchaeology/irish-radiocarbon-dendrochronological-dates}{Irish Radiocarbon \& Dendrochronological Dates} & 2010 & 10,700 & @IRDD \\
\href{https://www.exoriente.org/associated_projects/ppnd.php}{Platform for Neolithic Radiocarbon Dates (PPND)} & 2010 & 800 & @Benz2010 \\
\href{http://www.paleoanthro.org/media/journal/content/PA20110001.pdf}{PACEA Geo-Referenced Radiocarbon Database} & 2011 & 6,000 & @dErricoEtAl2011 \\
\href{https://doi.org/10.1016/j.quaint.2012.08.2052}{Peru archaeological radiocarbon database} & 2013 & 300 & @RademakerEtAl2013 \\
\href{https://telearchaeology.org///EUBAR/}{EUBAR C14 database} & 2014 & 1,700 & @CapuzzoEtAl2014 \\
\href{https://doi.org/10.1016/j.quascirev.2014.05.015}{Wang 2014} & 2014 & 4,700 & @WangEtAl2014 \\
\href{http://www.14sea.org/}{14SEA} & 2015 & 3,000 & @ReingruberThissen2017 \\
\href{https://discovery.ucl.ac.uk/id/eprint/1469811/}{EUROEVOL} & 2015 & 14,100 & @ManningEtAl2016 \\
\href{https://doi.org/10.1016/j.quascirev.2015.06.022}{Flohr et al. 2015} & 2015 & 3,000 & @FlohrEtAl2016 \\
\href{https://doi.org/10.1016/j.quaint.2014.09.076}{South Central Andes Radiocarbon} & 2015 & 1,700 & @GayoEtAl2015 \\
\href{https://github.com/dirkseidensticker/aDRAC}{Archive des datations radiocarbones d'Afrique centrale} & 2016 & 1,900 & @SeidenstickerSchmid2021 \\
\href{https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/NJLNRJ}{KITE East Africa} & 2016 & 800 & @CourtneyMustaphi2016 \\
\href{https://doi.org/10.14324/000.ds.1570274}{Palmisano et al. 2017} & 2017 & 1,900 & @PalmisanoEtAl2017 \\
\href{http://vmtropicar-proto.ird.fr/archeologie/}{Plateforme des datations archéologiques intertropicales} & 2017 & 1,300 & @deSaulieuEtAl2017 \\
\href{http://www.idearqueologia.org/c14/}{Archivo de Dataciones Radiocarbónicas de la Prehistoria Reciente (IDEARQ)} & 2017 & — & @UriarteGonzalezEtAl2017 \\
\href{https://www.rekihaku.ac.jp/up-cgi/login.pl?p=param/esrd/db_param}{Database of Radiocarbon Dates Published in Japanese Archaeological Reports} & 2018 & 44,000 & @Kudo2018; @KudoEtAl2023 \\
\href{https://c14.arch.ox.ac.uk/sadb/db}{Southern African Radiocarbon Database} & 2019 & 2,700 & @LoftusEtAl2019 \\
\href{https://doi.org/10.1016/j.quascirev.2019.105878}{Douglass et al. 2019} & 2019 & 200 & @DouglassEtAl2019 \\
\href{https://theia.arch.cam.ac.uk/MedAfriCarbon/}{MedAfriCarbon} & 2020 & 1,600 & @LucariniEtAl2020 \\
\href{https://github.com/clipo/rapanui-radiocarbon}{A repository of radiocarbon data for Rapa Nui} & 2020 & 800 & @Lipo2020 \\
\href{https://github.com/jgregoriods/rxpand}{rxpand} & 2020 & 2,800 & @GregoriodeSouza2020 \\
\href{https://core.tdar.org/project/455305/mesoamerican-radiocarbon-database-mesorad}{Mesoamerican Radiocarbon Database (MesoRAD)} & 2020 & 1,800 & @HoggarthEtAl2021 \\
\href{https://zenodo.org/records/4541470}{AgriChange} & 2020 & 3,700 & @Martinez-GrauEtAl2021 \\
\href{https://github.com/ercrema/jomonPhasesAndPopulation}{Crema \& Kobayashi 2020} & 2020 & 2,100 & @CremaEtAl2016 \\
\href{https://rdr.ucl.ac.uk/articles/dataset/Dataset_for_An_Aegean_history_and_archaeology_written_through_radiocarbon_dates/12489137/1}{Katsianis et al. 2020} & 2020 & 3,200 & @KatsianisEtAl2020 \\
\href{https://www.arqueodata.com/}{ArqueoData} & 2021 & 800 & @Alcantara2021 \\
\href{https://github.com/philriris/caribbean-14C}{caribbean-14C} & 2021 & 2,100 & @Riris2021 \\
\href{https://github.com/apalmisano82/NERD}{Near East Radiocarbon Dates (NERD)} & 2021 & 11,000 & @PalmisanoEtAl2022a \\
\href{https://doi.org/10.13131/archelogicadata-yb11-yb66}{NeoNet} & 2021 & 2,500 & @HuetEtAl2022; @HuetEtAl2024 \\
\href{https://github.com/ercrema/NeolithicKoreaDemography}{Kim et al. 2021} & 2021 & 900 & @KimEtAl2021 \\
\href{https://doi.org/10.1371/journal.pone.0251407}{Cochrane et al. 2021} & 2021 & 100 & @CochraneEtAl2021 \\
\href{https://www.waikato.ac.nz/research/research-services-facilities/radiocarbon-dating/research//nz-radiocarbon-database}{Aotearoa New Zealand Radiocarbon Database} & 2022 & 4,100 & @PetcheyEtAl2022 \\
\href{https://github.com/apalmisano82/AIDA}{Archive of Italian Radiocarbon Dates (AIDA)} & 2022 & 4,000 & @PalmisanoEtAl2022 \\
\href{https://www.p3k14c.org/}{p3k14c} & 2022 & 179,700 & @BirdEtAl2022 \\
\href{https://research.jcu.edu.au/data/published/7c74f590a2ba11edb22c156e754c4bda/}{Torres Strait Radiocarbon Database} & 2022 & 300 & @LinnenluckeEtAl2023 \\
\href{http://holoceno.iaas.ull.es/14Canarias_ULL/}{14Canarias} & 2023 & 700 & @Pardo-GordoEtAl2023 \\
\href{https://doi.org/10.1016/j.jasrep.2023.103944}{Hoebe et al. 2023} & 2023 & 6,500 & @HoebeEtAl2023 \\
\href{https://core.tdar.org/collection/71234}{Bolivian Radiocarbon Database} & 2023 & 3,000 & @Capriles2023 \\
\href{https://doi.org/10.5281/zenodo.8334722}{An Annotated Compilation of Chronometric Dates for the Middle-Upper Palaeolithic Transition (45-30 ka BP) in Northern Iberia (Spain)} & 2023 & 200 & @Diaz-RodriguezEtAl2023 \\
\href{https://doi.org/10.1371/journal.pone.0291956}{Großmann et al. 2023} & 2023 & 3,400 & @GrossmannEtAl2023 \\
\href{https://doi.org/10.1080/0067270X.2023.2215649}{Datations absolues, Inventaires archéologiques et Bibliographies en Afrique Centrale} & 2023 & 1,800 & @ClistEtAl2023 \\
\href{https://doi.org/10.1016/j.quaint.2024.01.012}{Updated Peru archaeological radiocarbon database} & 2024 & 500 & @Rademaker2024 \\
\href{https://banadora.mom.fr/}{Banque Nationale de Données Radiocarbone pour l'Europe et le Proche Orient} & — & — & @BANADORA \\
\href{https://c14.arch.ox.ac.uk/database}{Oxford Radiocarbon Accelerator Unit database} & — & 8,500 & @GillespieEtAl1984; @BronkRamseyEtAl2009 \\
\href{https://c14.arch.ox.ac.uk/egyptdb/db.php}{Egyptian Radiocarbon Database} & — & 1,600 & @RamseyEtAl2010; @DeeEtAl2012; @DeeEtAl2013 \\
\href{https://c14.arch.ox.ac.uk/database}{NERC Radiocarbon Facility} & — & 6,100 & @GarnettEtAl2023 \\
\href{http://www.adias-uae.com/radiocarbon.html}{Abu Dhabi Islands Archaeological Survey Radiocarbon Database} & — & 100 & @Al-Abyadh-Balghelam-Dalma-JebelEtAl \\
\href{http://c14.kikirpa.be/}{Royal Institute for Cultural Heritage web based Radiocarbon database} & — & 7,300 & @VanStrydonckDeRoock \\
\href{https://telearchaeology.org///c14/}{La base de dades radiocarbòniques de Catalunya} & — & — & @BarceloAlvarezEtAl2013 \\
\bottomrule
\end{tabular*}
}
\end{table}%
Despite these inefficiencies, there has been a profusion of published
radiocarbon compilations since the decline of the date list. Our review
of the literature identified 61 published since 1994
(Table~\ref{tbl-c14-datasets} and supplementary materials). This is
almost certainly an undercount, because our firsthand knowledge of
regional literature is limited to Europe and West Asia and many
resources only ever existed in `grey' formats (e.g.~websites that were
not indexed and no longer exist). We also restricted ourselves to
structured datasets disseminated primarily in a digital format; `date
lists' in printed periodicals and gazetteers were excluded.
\begin{figure}
\centering{
\pandocbounded{\includegraphics[keepaspectratio]{figures/fig-c14-datasets-time-1.pdf}}
}
\caption{\label{fig-c14-datasets-time}Cumulative number of radiocarbon
compilations published since 1995}
\end{figure}%
The number of available compilations has increased exponentially since
around 1995 (Figure~\ref{fig-c14-datasets-time}). The first generation
came around the turn of the century and consists mostly of online
databases with a web frontend. These include some databases operated by
radiocarbon labs, for example the Oxford Radiocarbon Lab (ORAU) and the
Belgian Royal Institute for Cultural Heritage (KIK-IRPA), and
essentially represent a continuation of their date lists in a digital
format. The majority, however, were compiled from the literature by
individual researchers interested in a particular region and/or period.
Notable early examples include ANDES 14C in 1994 (Central Andes,
Michczyński et al., 1995), CARD (Canada, Gajewski et al., 2011) and
RADON (Europe, Raetzel-Fabian, 1999) in 1999, and CANEW in 2001 (Near
East, Reingruber and Thissen, 2005). From 2010, coinciding with broader
shifts in scientific publishing (Tenopir et al., 2011), it became more
common to publish standalone `open data' products in the form of journal
supplements, archives in repositories and/or data papers; the
\emph{\href{https://openarchaeologydata.metajnl.com}{Journal of Open
Archaeology Data}}, launched in 2012, has been a prominent venue for
this latter category. Most recently there has been a trend towards
providing version-controlled plain text data via platforms such as
\href{https://github.com}{GitHub}, reflecting the broader adoption of
these tools amongst computational archaeologists over the last decade
(Batist and Roe, 2024). The shift from online databases towards more
static but more preservable open data products is welcome, given how
many databases from the first generation have subsequently ceased to be
accessible. Version-controlled repositories are particularly well-suited
to data compilation projects because they allow for continued updates
whilst still providing snapshot `releases' that are citeable and can be
archived in long-term repositories.
\begin{figure}
\centering{
\pandocbounded{\includegraphics[keepaspectratio]{figures/fig-c14-datasets-map-1.pdf}}
}
\caption{\label{fig-c14-datasets-map}Geographic coverage of published
regional radiocarbon compilations according to our survey (see
Supplementary Material).}
\end{figure}%
Although this body of work has greatly improved the accessibility of
radiocarbon dates and supported significant methodological advances
(Crema et al., 2024; Crema, 2022), some limitations are apparent.
The geographic coverage of regional radiocarbon compilations is markedly
uneven (Figure~\ref{fig-c14-datasets-map}). Europe and, especially,
North America are over-represented (Alcántara and Pedroza, in press;
Chaput and Gajewski, 2016). South America, West Asia, and East Asia are
reasonably well-covered, but there are practically no systematically
compiled dates from East or West Africa, Central or South Asia, or
Mainland Southeast Asia. This is probably explained in part by a lower
volume of archaeological research and access to radiocarbon dating in
these regions, but a lack of attention in compilation work must also be
a factor. For example, radiocarbon dating has been an established part
of Indian archaeology since at least 1961 (Kusumgar et al., 1963), but
we have not been able to locate a single systematic compilation of dates from
South Asia.\footnote{We would be very happy to be corrected on this
point.}
Datasets based on literature review also become out of date almost
immediately upon publication, due to the constant production of new
dates. Unfortunately this also applies to many databases that are in theory
continuously updated, as it is common to see them become unmaintained
and/or unexpectedly unavailable. Of the 61 published datasets we
identified, 33 were intended to be continuously updated, but only 13
have received updates in the last two years. The average `lifespan' of a
dataset from its publication to its last update is around 4 years. Most
radiocarbon datasets we reviewed were compiled with a specific goal in
mind (e.g.~a particular analysis) and, even where there is the intention
to keep them updated afterwards, the exigencies of scientific production
combined with the labour-intensive nature of the process make that
difficult to achieve in practice.
Laboratory databases solve the problem of currency, but tend to have
more arbitrary coverage, since the inclusion of data is determined by who
submits dates to that lab, not any form of principled curation. There
are also comparatively few of them -- most active labs no longer
directly publish dates that they produce (if they ever did).
Other outstanding problems with existing compilations include various
systematic biases in data collection (Clist et al., 2023) and a large
degree of overlap and duplication between individual databases. For
example, we identified 9 different resources covering Western Europe but
none covering South Asia. The quality and accessibility of published
compilations is also variable. 50 of the 61 resources we reviewed are
not `open' according to the Open Knowledge Foundation's definition of
data openness (``Open data and content can be freely used, modified, and
shared by anyone for any purpose,'' Open Knowledge Foundation, n.d.),
which limits both access to these datasets and their potential for reuse.
And even of these, many are not currently available in readily
machine-readable formats (e.g.~plain text or database files rather than
PDFs or hypertext).
The fragmentation of the radiocarbon record into regional datasets also
hinders analysis at larger scales. Although the core elements of a
radiocarbon date---laboratory identifier, radiocarbon age, measurement
error---are more or less standardised, there is no such consistency in
contextual information on the sample or site. Such contextual
information is important not just for the interpretation of dates, but
for filtering out unreliable dates based on sample information
(`chronometric hygiene' sensu Pettitt et al., 2003) and for correcting
for known systematic errors such as the marine reservoir effect (Alves
et al., 2018). Most published datasets incorporate all or part of
earlier compilations, meaning duplicate records are also very common,
but deduplicating them is not a trivial problem due to format variations
(see Section~\ref{sec-implementation-data}). These issues are by no
means impossible to overcome, but add a significant amount of
data-cleaning effort to a process that would otherwise be very amenable
to standardisation.
\subsection{Global radiocarbon
compilations}\label{sec-global-compilations}
\begin{figure}
\centering{
\pandocbounded{\includegraphics[keepaspectratio]{figures/fig-c14-global-1.pdf}}
}
\caption{\label{fig-c14-global}Geographic and temporal (sum calibration)
distribution of georeferenced dates in XRONOS and other global radiocarbon
compilations}
\end{figure}%
The profusion of radiocarbon compilations over the last decade has
naturally prompted many to think globally. Three existing initiatives in
particular share similar aims to XRONOS (at least as far as radiocarbon
is concerned): c14bazAAR, IntChron, and p3k14c.
The first available synthetic radiocarbon database was c14bazAAR (Schmid
et al., 2019), an R package that provides an index of openly published
radiocarbon databases and a common interface for retrieving them and
performing basic data cleaning. Because c14bazAAR downloads data from
its original source repositories, rather than mirroring it, it only
includes resources that have been published in a fully open and
machine-accessible format. Despite this limitation, it has global
coverage and a large number of dates (Figure~\ref{fig-c14-global}), and
was therefore our starting point for data collection for XRONOS.
Another indexical approach is taken by the IntChron project (Bronk
Ramsey et al., 2019), which indexes data from multiple sources and
exposes them through a common JSON-based web interface. The IntChron
specification is open, meaning that radiocarbon labs or compilation
projects can implement it independently and thereby allow end users to
access their data through a common interface (though to our knowledge it
has so far only been adopted by databases associated with the Oxford
Radiocarbon Lab). The JSON format also lends itself to the
implementation of wrapper libraries, for example the rIntChron package
gives direct access to IntChron-indexed databases in R (Roe, 2024).
p3k14c (Bird et al., 2022) instead compiles multiple source databases
into a single flat file dataset, with a similar level of coverage to
c14bazAAR. The major advantage of this approach is that the data is made
internally consistent and has been manually cleaned to an extent, which
makes it particularly well-suited to global analyses. The downside is
that without the continuous link to the source databases present in
c14bazAAR and IntChron, it can only be kept up to date manually with
periodic re-releases. An accompanying package (Bird et al., 2024)
provides direct access to the p3k14c dataset in R.
As of December 2024, c14bazAAR had 118,071 radiocarbon dates with unique
laboratory identifiers (excluding those sourced from p3k14c and XRONOS),
IntChron had 12,388 (excluding those from non-archaeological contexts),
and p3k14c had 176,016. The geographic distribution of dates from each
is similar (Figure~\ref{fig-c14-global}), reflecting the large degree of
overlap between the sources of each compilation. IntChron, which in
practice is currently only used to publish dates associated with the
Oxford Radiocarbon Lab, has dates from more diverse contexts, but is an
order of magnitude smaller.
\subsection{Beyond radiocarbon}\label{beyond-radiocarbon}
Radiocarbon has been by far the most active area of open data
compilation, but archaeological chronology incorporates a much more
diverse range of sources of information (Harding, 1999). In periods
beyond the practical limit of radiocarbon dating (c.~55,000 BP), other
types of radiometric (K--Ar, U--Pb, etc.), chemical or luminescence
dating offer an alternative (Aitken, 1999). Conversely, in historic
periods, radiocarbon is often relatively underused compared to
conventional typological dating (based on artefact characteristics),
which in these periods can offer comparable or better temporal
resolution, or direct dating based on epigraphy (Heřmánková et al.,
2021), numismatics (Kemmers and Myrberg, 2011) or historical sources. In
places where it is widely available, dendrochronology (Baillie, 2014)
also produces significantly better resolved chronologies and therefore
tends to be the main source of chronometric data. Other more
application-specific chronological methods include shoreline dating
(Brøgger, 1905; Roalkvam, 2023), lichenometry (Benedict, 2009) and rock
weathering dating (Bednarik, 2020; Whitley, 2012).
Compared to radiocarbon, there are few examples of systematic, open
compilations of any of these other types of data. This is most striking
when it comes to other radiometric/scientific dating methods, as the
data structures and publication modes are very similar to radiocarbon.
The `Radiocarbon Palaeolithic Europe Database' (Vermeersch, 2020),
despite the name, includes a significant number of thermo- and optically
stimulated luminescence, electron spin resonance, uranium--thorium and
amino acid dates. Similarly, the AustArch database (Williams et al.,
2014) includes luminescence dates alongside radiocarbon, but is limited
to Australia and was last updated in 2013. Apart from these and a few
other exceptions where other scientific dates are collected alongside
radiocarbon, we are not aware of any open compilations of them.
\subsubsection{Dendrochronology}\label{dendrochronology}
With regard to tree-ring data, some databases provide valuable resources
for dendrochronological studies in general but are not primarily
intended for archaeological contexts. For instance, Dendro4Art
specializes in dendrochronological data related to wooden art objects,
such as sculptures and panel paintings. While this focus serves art
historians and conservationists well, its utility for studying
prehistoric datasets is minimal. Similarly, the Dendrochronological
Picture Database, maintained by the Swiss Federal Institute for Snow and
Avalanche Research (SLF), offers a visual archive of approximately 1,400
images documenting dendrochronological phenomena. Although valuable as
an educational resource, it does not provide raw data necessary for
chronological or archaeological analyses. Additionally, the OLDLIST and
Eastern OLDLIST databases focus on documenting the maximum ages of trees
worldwide. Their emphasis on biological longevity, while significant for
ecological research, limits their applicability to archaeological or
prehistoric investigations.
Among databases that do provide dendrochronological data, the degree to
which they support prehistoric research varies substantially. The NOAA
International Tree-Ring Data Bank (ITRDB) serves as a global repository
of tree-ring measurements. However, its focus remains predominantly on
North America, with only 34 datasets representing European prehistoric
contexts. This restricts its relevance for studying the European past.
Similarly, the ADS database, maintained by the UK-based Vernacular
Architecture Group, compiles dendrochronological data from the UK but is
limited to medieval and later periods, making it unsuitable for
prehistoric studies.
DendroDB, hosted by the Swiss Federal Institute for Forest, Snow and
Landscape Research (WSL), emphasizes ecological and climate studies over
archaeological wood material. While it claims a broad scope, the
database remains non-functional, rendering it ineffective for research
needs. The CFS-TRenD database, managed by the Canadian Forest Service,
compiles over 4,600 datasets from Canadian forests, primarily focusing
on boreal ecosystems. Despite its extensive coverage for North America,
its geographical specificity and lack of open access restrict its
utility for European prehistoric contexts. Similarly, the QUB
Dendrochronology Database, managed by Queen's University Belfast, offers
valuable datasets for Ireland and the UK but lacks significant
representation of prehistoric material, limiting its application in
broader archaeological investigations. The Building Archaeology Research
Database (BARD) contains over 24,000 records, including
dendrochronological data from more than 2,700 buildings. However, its
focus on medieval and post-medieval timber-frame construction further
narrows its utility for studies involving prehistoric wood samples.
The Digital Collaboratory for Cultural Dendrochronology (DCCD) presents
itself as a potentially valuable international platform for
dendrochronology, particularly through its integration with
archaeological data services such as ARIADNE. However, it remains
heavily biased toward datasets from the Netherlands, which account for
more than two-thirds of its entries, while only 0.08\% of its
\emph{Quercus} data pertain to Switzerland and just 2.5\% represent
prehistoric datasets. Notably, these estimates date back to 2021, and an
updated assessment is currently unattainable following the platform's
migration to DataverseNL, which now charges an annual fee of nearly
€6,000 for access. Furthermore, database activity has declined
significantly, from 3,846 new project records between 2010 and 2014 to
only 83 by the end of 2019. Although 519 additional records have been
reported since 2021, it remains unclear whether this figure includes
revisions to pre-existing entries, potentially inflating the count. The
database's narrow focus and restrictive access model significantly limit
its broader utility for prehistoric research.
Finally, the Strategic Environmental Archaeology Database (SEAD), hosted
by Umeå University, integrates multiple environmental proxies, including
dendrochronological data. However, the dendrochronological component is
largely confined to Swedish data, with limited relevance to prehistoric
contexts. While SEAD aims for broader applications, its dendro component
has limited utility for studies outside Sweden.
The utility of dendrochronological databases for prehistoric research
varies widely. Global resources such as NOAA's ITRDB and DCCD offer
substantial datasets but face significant limitations in practical
geographical and temporal scope. Similarly, platforms like DendroDB and
BARD primarily cater to historical studies, leaving critical gaps in
prehistoric coverage. Specialized resources like OLDLIST, Dendro4Art,
and the Dendrochronological Picture Database provide valuable
contributions but lack the direct relevance necessary for archaeological
tree-ring analysis. Consequently, researchers focused on prehistoric
dendrochronology must navigate a fragmented landscape of databases, each
offering distinct strengths and limitations. Addressing these gaps
remains crucial for advancing the field.
\subsubsection{Typological dating}\label{typological-dating}
Typological dates---i.e.~relative, expertise- or seriation-based dating
based on artefact characteristics---are ubiquitous in archaeological
studies but rarely treated as a form of chronometric data in their own
right. For example, the majority of the radiocarbon datasets we reviewed
(Section~\ref{sec-c14-compilation}) included some form of typological
chronological information in the form of a `period' or `culture' column.
This is also typically present in many other forms of systematic
compilation work in archaeology, for example site gazetteers. Aggregated
typological information from such sources is often used in aoristic
analysis and related methods (Crema, 2024; Mischka, 2004). What is
lacking in this presentation of typological dating is metadata on how
the determination was made and how exactly it is to be understood. Like
any archaeological date, a typological date is derived from a physical
sample -- the object or set of objects from which a chronological
estimate was derived. Typological dates on one class of object may well
clash with other classes of object, or for that matter with scientific
dates -- does one trust the date on pottery, the date on architecture,
or the radiocarbon date? Without additional metadata on e.g.~who made
the typological determination or what the radiocarbon date was obtained
on, such inconsistencies are difficult to resolve. Similarly, the absolute
date range corresponding to a typological determination (e.g.~``Late
Neolithic'') can be interpreted in multiple ways depending on the region
and intentions of the expert making the determination. PeriodO
(Rabinowitz et al., 2016) is a linked open data infrastructure that
includes a shared vocabulary of typological periods and corresponding
calendar age estimates, and an important step towards addressing the
latter problem. However, it remains to be systematically linked to
compilations of typological dates (though there are some efforts in this
direction, e.g.~Hannah et al., 2022).
\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
What is missing to date is a general-purpose infrastructure for
combining all of these types of chronometric information on a global
scale. This is the gap that XRONOS aims to fill, starting with three
methods: radiocarbon, dendrochronology, and typological dating. These
were chosen because they are widely used and relatively advanced in
terms of open data, but an important aim of the project is to develop a
generalisable data model that can easily scale to any and all types of
archaeological chronology (see Section~\ref{sec-data-model}).
\section{Concept}\label{concept}
XRONOS inherits its basic structure from RADON (Hinz et al., 2012;
Kneisel et al., 2014; Raetzel-Fabian, 1999; Rinne et al., 2024), with a
database-backed web application and a data model that separates
radiocarbon dates, contextual information, and sites. Our overall aim
in developing XRONOS is to bring this model, which RADON has operated on
for more than twenty years, up to date, to generalise it to other types
of chronometric information, and to transform it from an online database
to a data infrastructure that supports the continuous ingestion,
curation, and open dissemination of archaeological chronologies from
diverse sources.
\subsection{Design goals}\label{design-goals}
XRONOS is our answer to Kintigh's call (Kintigh, 2006) for digital
infrastructures that do not just provide access to chronological data but
enable researchers to ``archive, access, integrate, and mine disparate
data sets''. It complements several similar open data infrastructures
within and outwith archaeology, such as the Global Biodiversity
Information Facility (GBIF, Canhos et al., 2004), the Strategic
Environmental Archaeology Database (SEAD, Buckland, 2014), IMPACT for
mummified human remains (Nelson and Wade, 2015), Neotoma for
palaeoecological data (Williams et al., 2018), IsoArcH for stable
isotope data (Plomp et al., 2022), the International Soil Radiocarbon
Database (ISRaD, Lawrence et al., 2020), and the `Big Interdisciplinary
Archaeological Database' (BIAD), an ambitious new initiative to combine
many of these individual domains, including chronology (Reiter et al.,
2024). To improve upon existing global syntheses of radiocarbon dates
(see Section~\ref{sec-global-compilations}), we aimed to develop a
living infrastructure that both continually collected data from diverse
sources and presented a seamless single database to the user.
Our principal goals for the software were therefore to:
\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
Combine all available sources of radiocarbon and other chronometric
  data in a single database
\item
Develop robust tools for the continuous ingestion, collation and
curation of this data
\item
Disseminate the collated and curated data as linked open data within a
FAIR framework
\end{enumerate}
Meeting these goals required the development of a) a conceptual data
model, including links to other open data resources, that is flexible
enough for all forms of chronometric data; and b) a software
implementation that supports the main functions of ingesting, curating,
and disseminating this data. The individual components of this work are
described in more depth below but, briefly, consist of a relational data
model implemented in a PostgreSQL database; a Ruby application providing
server-based tools for ingestion, curation and dissemination of data;
and multiple graphical and programmatic interfaces to the resulting
dataset.
\subsection{Data model}\label{sec-data-model}
\begin{figure}
\centering{
\pandocbounded{\includegraphics[keepaspectratio]{figures/fig-data-model-1.pdf}}
}
\caption{\label{fig-data-model}Simplified entity relationship diagram
showing the XRONOS data model}
\end{figure}%
At the base of the XRONOS data model (Figure~\ref{fig-data-model}) are
sets of spatiotemporal coordinates or, as we call them, \emph{chrons}.
In an archaeological context, we conceptualise a chron as an assertion
linking human activity with a particular point in space and time. Our
data model currently encompasses three types of chron: radiocarbon
dates, typological dates (e.g.~`Early Neolithic') and
dendrochronological dates. However, we anticipate that the concept will
accommodate other types of absolute and relative dating techniques, as
the scope of the database expands.
Chrons are conceptually useful because they emphasise that different
types of archaeological `dates', drawn from different sources, have
essentially the same information content: the location of an event in
space and time. We thereby avoid privileging certain sources of
chronological data over others (as might be the case if, for example, we
treated `period' as a fixed attribute of a site) and can accommodate
contradictory information (e.g.~differences of opinion on typological
classification). This is important given that XRONOS aspires to be an
authoritative `backbone' with a global scope, so we cannot realistically
impose a single chronological scheme or resolve conflicting information
provided by specialists. They are useful practically because they expose
a common interface for attributes that all types of chronological
information share, such as a \emph{terminus post quem} (TPQ),
\emph{terminus ante quem} (TAQ), and midpoint estimate. This allows
applications that use XRONOS' data model (including XRONOS itself) to
collate chronological data from multiple sources, without necessarily
having to be aware of the peculiarities of each type of dating.
In order to unify chronological information in the form of a chron, we
need a common chronological `coordinate system'. The natural choice is a
\emph{calendar probability distribution}, which expresses the
probability that an event occurred as a function of time on a calendric
scale. Most archaeologists are familiar with working with this kind of
representation in the form of calibrated radiocarbon dates, but it can
be extended and generalised to essentially any kind of chronological
information. For example, in aoristic analysis (Mischka, 2004), a
periodic time estimate (e.g.~the event occurred in the Neolithic) is
conceptualised as a uniform probability distribution over the timespan
between the known start and end dates of that period. A similar model is
used in OxCal (Bronk Ramsey, 2009, a direct inspiration for our
approach) to integrate prior chronological information from diverse
sources. In practical terms, this model means that the canonical
representation of the time component of any chron in XRONOS, regardless of
source, is a probability distribution over the set of calendar years
(arbitrarily measured in years Before Present) in which it could have
plausibly occurred. Further statistics, e.g.~a midpoint estimate or
TPQ/TAQ range, can be derived from this distribution using well-known
methods. In this way, we can support many different types of date and
much of the implementation of XRONOS can be agnostic to the source of
chronological information.
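
To make this concrete, the following is an illustrative sketch rather than a description of the implementation: the notation ($p$, $F$, $t_{\mathrm{start}}$, $t_{\mathrm{end}}$, $\alpha$), the use of a continuous density, and the choice of quantile-based summaries are all assumptions introduced only for this example. Writing $t$ for calendar age in years BP, so that larger values are older, a typological date assigned to a period running from $t_{\mathrm{start}}$ to $t_{\mathrm{end}}$ (with $t_{\mathrm{start}} > t_{\mathrm{end}}$) corresponds, under the aoristic assumption, to the uniform density
\[
p(t) =
\begin{cases}
\dfrac{1}{t_{\mathrm{start}} - t_{\mathrm{end}}} & \text{if } t_{\mathrm{end}} \le t \le t_{\mathrm{start}},\\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
For any chron with cumulative distribution $F(t) = \int_{-\infty}^{t} p(u)\,\mathrm{d}u$, one possible midpoint estimate is the median $F^{-1}(0.5)$, and one possible TAQ/TPQ range is the central interval
\[
\left[\,F^{-1}(\alpha/2),\; F^{-1}(1 - \alpha/2)\,\right],
\]
where, because $t$ is measured in years BP, the lower bound is the TAQ and the upper bound the TPQ. In the uniform case these reduce to a midpoint of $(t_{\mathrm{start}} + t_{\mathrm{end}})/2$ and bounds of $t_{\mathrm{end}} + \tfrac{\alpha}{2}(t_{\mathrm{start}} - t_{\mathrm{end}})$ and $t_{\mathrm{start}} - \tfrac{\alpha}{2}(t_{\mathrm{start}} - t_{\mathrm{end}})$, which recover the period boundaries when $\alpha = 0$. The same summaries apply unchanged to a calibrated radiocarbon date, with $p$ replaced by the calibrated distribution; for a distribution defined over discrete calendar years the integral becomes a sum.
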
Chrons are located in space through association to a \emph{sample} --
the physical object from which a chronological determination was made.
The location of samples is represented with geographical coordinates and
an associated coordinate reference system (CRS), though since in
practice the precise location of single samples is rarely available,
this property is usually inherited from the site. We also record
relevant metadata on the nature of the sample. For radiocarbon dates,
for example, we follow established conventions (Millard, 2014) in
recording the type (e.g.~charcoal, charred seed) and, where applicable,
taxonomic designation (e.g.~\emph{Quercus}, \emph{Triticum dicoccum}) of
the organic material used for dating. For typological dates, an ideal
scenario would be for the sample to represent the particular object from
which an inference was made (e.g.~`Natufian' might be inferred from
`lunate-type microlith'). In practice, the best we can glean from most
published datasets is the type of material used (e.g.~`pottery',
`lithics'). The same sample can be associated with multiple chrons,
including different types of chron. This is useful, for example, for
representing replicate radiocarbon dates on the same sample, or
radiocarbon dates and dendrochronological dates made on the same section of
wood for wiggle-matching.
Further contextual information is associated with \emph{contexts} and
\emph{sites}. The site is the primary geographic container for
chronological information. As already mentioned, we typically record the
spatial location of chrons using this entity, though it is possible to
modify this by providing specific coordinates at the sample level. Sites
also have attributes describing their conventional name or names in
different languages and are associated with a flexible `site type'
typology that combines information on their form and function.
A context represents the specific find-context of a sample, e.g.~an
architectural feature, stratigraphic unit, or phase. Since the units and
conventions for recording such information vary greatly between
different regions and archaeological traditions---and XRONOS is designed
with global data in mind---we leave the question of what a context
precisely represents open, and only record an unstandardised, free text
label for it. Crucially, however, contexts can have a self-referential
association to other contexts belonging to the same site. This allows the
context entity
to encode arbitrary relational structures between contexts, whether they
be hierarchical (e.g.~phases and sub-phases) or graphical
(e.g.~stratigraphic). In this way, it can serve as a foundation for
chronological modelling.
The series of relations
\texttt{{[}chron{]}\ \textgreater{}\ sample\ \textgreater{}\ context\ \textgreater{}\ site}
links the chronological and contextual sides of the XRONOS data model.
Each step is a many-to-one association, meaning for example that it is
possible to attach multiple chrons to the same sample (e.g.~replicated
radiocarbon dates on the same material), or multiple \emph{types} of chrons
to the same sample (e.g.~radiocarbon dates on tree-rings for
wiggle-matching). Since this kind of information is rarely
systematically recorded in our source databases, there are currently few
actual records that make use of this feature of the data model. However,
we hope it will provide a foundation for more nuanced chronological
modelling in the future.
Metadata is incorporated into XRONOS' data model at the level of the
individual records (e.g.~all records store their date of creation and