R-CNN based Polygonal Wedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets

Stötzner, Ernst; Homburg, Timo; Bullenkamp, Jan Philipp; Mara, Hubert

Date

2023

Abstract

Motivated by the demands of Digital Assyriology and the challenges of detecting cuneiform signs, we propose a new approach using an R-CNN architecture to classify and localize wedges. We utilize the 3D models of 1977 cuneiform tablets from the Frau Professor Hilprecht Collection available as open data. About 500 of these tablets have a transcription available in the Cuneiform Digital Library Initiative (CDLI) database. We annotated 21,000 cuneiform signs as well as 4,700 wedges, resulting in the new open data Mainz Cuneiform Benchmark Dataset (MaiCuBeDa), which includes metadata, cropped signs, and, for part of the data, wedges. The latter is also a good basis for manual paleography. Our inputs are MSII renderings computed using the GigaMesh Software Framework, and photographs to which the annotations were automatically transferred from the renderings. Our approach consists of a pipeline with two components: a sign detector and a wedge detector. The sign detector uses a RepPoints model with a ResNet18 backbone to locate individual cuneiform characters in the tablet segment image. The signs are then cropped based on the sign locations and fed into the wedge detector. The wedge detector is based on the idea of the Point R-CNN approach. It uses a Feature Pyramid Network (FPN) and RoI Align to predict the positions and classes of the wedges. The method is evaluated using different hyperparameters, and post-processing techniques such as Non-Maximum Suppression (NMS) are applied for refinement. The proposed method shows promising results in cuneiform wedge detection. Our detector was evaluated using the Gottstein system and the PaleoCodage encoding. Our results show that the sign detector performs better when trained on 3D renderings than when trained on photographs. Detectors trained on photographs alone are usually less accurate, and accuracy on photographs improves when the training data also includes 3D renderings.
Overall, our pipeline achieves decent results, with some limitations due to the relatively small amount of data. However, even small amounts of high-quality renderings of 3D datasets with expert annotations dramatically improved sign detection.
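
The abstract names Non-Maximum Suppression (NMS) as a post-processing step. As an illustration only, the following is a minimal sketch of generic greedy NMS over axis-aligned boxes. It is not the authors' implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the default threshold of 0.5 are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box does not
    exceed iou_threshold (a hypothetical default, not taken from the paper).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two heavily overlapping detections and one separate detection.
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

In a two-stage pipeline such as the one described above, a step like this would typically run on the detector's raw outputs, so that duplicate detections of the same sign or wedge collapse to the highest-scoring one.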

BibTeX

@inproceedings {10.2312:gch.20231157,
booktitle = {Eurographics Workshop on Graphics and Cultural Heritage},
editor = {Bucciero, Alberto and Fanini, Bruno and Graf, Holger and Pescarin, Sofia and Rizvic, Selma},
title = {{R-CNN based Polygonal Wedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets}},
author = {Stötzner, Ernst and Homburg, Timo and Bullenkamp, Jan Philipp and Mara, Hubert},
year = {2023},
publisher = {The Eurographics Association},
ISSN = {2312-6124},
ISBN = {978-3-03868-217-2},
DOI = {10.2312/gch.20231157}
}

URI

https://doi.org/10.2312/gch.20231157
https://diglib.eg.org:443/handle/10.2312/gch20231157

Collections

GCH 2023 - Eurographics Workshop on Graphics and Cultural Heritage

Except where otherwise noted, this item's license is described as Attribution 4.0 International License