The great potential of large-scale data in the humanities is still hindered by the time and technical expertise required to make documents digitally intelligible. Within urban studies, historical cadasters have so far remained largely under-explored despite their informative value. Powerful, generic technologies based on neural networks have recently become available for automating the vectorization of historical maps. However, the transfer of these technologies is hampered by the scarcity of interdisciplinary exchange and by a lack of practical literature aimed at humanities scholars, especially on the key step of the pipeline: annotation. In this article, we propose a set of practical recommendations based on empirical findings on document annotation and automatic vectorization, using historical cadasters as an example case. Our recommendations are generic and easily applicable, and they draw on solid experience with concrete and diverse projects.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.