R-CNN based Polygonal Wedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets
Date
2023
Author
Stötzner, Ernst
Homburg, Timo
Bullenkamp, Jan Philipp
Mara, Hubert
Abstract
Motivated by the demands of Digital Assyriology and the challenges of detecting cuneiform signs, we propose a new approach using an R-CNN architecture to classify and localize wedges. We utilize the 3D models of 1977 cuneiform tablets from the Frau Professor Hilprecht Collection, which are available as open data. About 500 of these tablets have a transcription available in the Cuneiform Digital Library Initiative (CDLI) database. We annotated 21,000 cuneiform signs as well as 4,700 wedges, resulting in the new open data Mainz Cuneiform Benchmark Dataset (MaiCuBeDa), which includes metadata, cropped signs, and, in part, wedges. The latter is also a good basis for manual paleography. Our inputs are MSII renderings computed using the GigaMesh Software Framework and photographs to which the annotations were transferred automatically from the renderings. Our approach consists of a pipeline with two components: a sign detector and a wedge detector. The sign detector uses a RepPoints model with a ResNet18 backbone to locate individual cuneiform characters in the tablet segment image. The signs are then cropped based on the sign locations and fed into the wedge detector. The wedge detector is based on the idea of the Point R-CNN approach. It uses a Feature Pyramid Network (FPN) and RoI Align to predict the positions and classes of the wedges. The method is evaluated using different hyperparameters, and post-processing techniques such as Non-Maximum Suppression (NMS) are applied for refinement. The proposed method shows promising results in cuneiform wedge detection. Our detector was evaluated using the Gottstein system and the PaleoCodage encoding. Our results show that the sign detector performs better when trained on 3D renderings than when trained on photographs. We showed that detectors trained on photographs are usually less accurate. The accuracy on photographs improves when the training data also includes 3D renderings.
Overall, our pipeline achieves decent results, with some limitations due to the relatively small amount of data. However, even small amounts of high-quality renderings of 3D datasets with expert annotations dramatically improved sign detection.
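The abstract mentions Non-Maximum Suppression (NMS) as a post-processing step but does not say which variant is used. As an illustration only, the following minimal sketch shows the standard greedy NMS on axis-aligned bounding boxes. The box format, the function names `iou` and `nms`, and the threshold of 0.5 are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every previously kept box
    does not exceed iou_threshold (a hypothetical default here).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two heavily overlapping detections and one separate detection.
    detections = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(detections, confidences))
```

In this toy input the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third detections remain. In a wedge-detection setting, a step of this kind would typically be applied per class to the predicted wedge regions inside each cropped sign.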
BibTeX
@inproceedings {10.2312:gch.20231157,
booktitle = {Eurographics Workshop on Graphics and Cultural Heritage},
editor = {Bucciero, Alberto and Fanini, Bruno and Graf, Holger and Pescarin, Sofia and Rizvic, Selma},
title = {{R-CNN based Polygonal Wedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets}},
author = {Stötzner, Ernst and Homburg, Timo and Bullenkamp, Jan Philipp and Mara, Hubert},
year = {2023},
publisher = {The Eurographics Association},
ISSN = {2312-6124},
ISBN = {978-3-03868-217-2},
DOI = {10.2312/gch.20231157}
}