Towards an Ontopedia for Post-Medieval Hebrew Manuscripts

The event-based ontology for Hebrew Manuscripts
The project is funded by the Israel Science Foundation (2015-2018, Grant No. 342/15).

The migration map of the Jewish population in the period of 1300-2000.

Hebrew manuscripts are one of the most important sources of Jewish cultural heritage. Produced over thousands of years, even after the introduction of printing, these manuscripts shed light on the intellectual, religious and everyday life of Jews throughout the ages. An estimated 80,000 volumes of Hebrew manuscripts have survived. The largest collection of Hebrew manuscript metadata is offered by the catalog of the Institute of Microfilmed Hebrew Manuscripts in the National Library of Israel (http://web.nli.org.il/), but most of the manuscripts' data still remains unsearchable and thus undiscovered.

In this research we aim to develop a framework for building a dynamic web-based encyclopedia (ontopedia) for post-medieval Hebrew manuscripts based on a rich ontology.

The event-based ontology model of the Hebrew Manuscripts

Digitization of national cultural heritage is a rapidly expanding field, essential for preserving historical data and supporting future research on it. Historical handwritten Hebrew manuscripts are among the most unique and authentic evidence of Jewish culture and thought to have survived through the centuries. Scholars from various fields increasingly study these manuscripts to reveal historical, linguistic, religious, philosophical, and social aspects of Jewish life in different times and places.

 

Ontology events

Currently, the only available digital representation of these manuscripts' metadata is library catalogs. The largest collection of Hebrew manuscript metadata is offered by the catalog of the Institute of Microfilmed Hebrew Manuscripts in the National Library of Israel. This catalog is accompanied by a search engine that retrieves records by a limited number of parameters, such as author, title, date and subject, while most of the data still remains unsearchable and thus undiscovered. To enable systematic research of the knowledge embedded in the manuscripts, there is a need for a formal conceptual data model with a high level of semantic granularity: an ontology. To the best of our knowledge, there is no formal ontology for the realm of historical handwritten Hebrew manuscripts. Many ontologies focus on different aspects of textual information, but a single ontology representing all these aspects does not exist.

Therefore, in this research we propose to build a dynamic web-based framework that will allow scholars to create, enrich and consult an “ontopedia” (ontology-based encyclopedia) of post-medieval Hebrew manuscripts. We focus on the post-medieval period (16th century and later) because these works are under-explored in the research literature. The framework will be based on an ontology designed and implemented specifically for this domain and these goals. The ontopedia will be represented in RDF, the language recommended by the W3C for the semantic representation of knowledge. Creating the ontology will require identifying the most appropriate existing ontologies for the domain, and classifying and mapping the catalog fields into ontological entities. Semi-automatic techniques will then be developed to convert the content of the catalog of the Institute of Microfilmed Hebrew Manuscripts to an RDF representation and to extend the core ontology with concrete manuscript data. In addition, we will link the entities of our ontology to similar entities in existing ontologies on the semantic web. The resulting ontology will be published on the web and will become part of the linked open data. Furthermore, we will develop a user-friendly interface for searching, browsing and querying the ontology.
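As an illustration of the catalog-to-RDF conversion step, the following is a minimal sketch that maps one catalog record to RDF triples in N-Triples syntax. It is not the project's converter: the base namespace (`http://example.org/ontopedia/`), the property names (`hasTitle`, `hasAuthor`, and so on), the `Manuscript` class name and the sample record are all hypothetical. Only the four catalog fields (author, title, date, subject) are taken from the description of the catalog's search parameters.

```python
# Hypothetical base namespace; the project's actual ontology IRIs are not given in the text.
BASE = "http://example.org/ontopedia/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Assumed mapping from catalog fields to (hypothetical) ontology properties.
FIELD_TO_PROPERTY = {
    "title": "hasTitle",
    "author": "hasAuthor",
    "date": "hasDate",
    "subject": "hasSubject",
}


def iri(local_name: str) -> str:
    """Wrap a local name in the assumed base namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(value: str) -> str:
    """Format a string as a plain N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_triples(record_id: str, record: dict) -> list:
    """Convert one catalog record (field -> value) into a list of N-Triples lines."""
    subject = iri(f"manuscript/{record_id}")
    triples = [f"{subject} {RDF_TYPE} {iri('Manuscript')} ."]
    for field, prop in FIELD_TO_PROPERTY.items():
        value = record.get(field)
        if value:  # skip fields that are missing or empty in this record
            triples.append(f"{subject} {iri(prop)} {literal(value)} .")
    return triples


if __name__ == "__main__":
    # Invented placeholder record, not real catalog data.
    sample = {"title": "Sample title", "author": "Sample author", "date": "1650"}
    for line in record_to_triples("ms-0001", sample):
        print(line)
```

In a fuller pipeline, values such as authors and places would more likely become linked entities than plain literals, which is what allows the linking to other ontologies described above; literals are used here only to keep the sketch short.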

Finally, the ontopedia for the Hebrew historical manuscripts will be constructed as the culmination of this multi-disciplinary study, which integrates research in computer science (automatic conversion of the catalog to an ontology and user interface design), information science (conceptual model development and evaluation), library studies (catalog record examination), and Hebrew codicology and paleography (completing the missing features of the manuscripts and building query sets for the ontopedia construction).

The underlying philosophical approach behind the proposed ontology is to view a manuscript as a “living entity” and design a data model of its narrative. This model will include stages and milestones in its biography (e.g. creation, printing, or acquisition), its influence and interactions with other manuscripts, people, places, historical and cultural events. A sequence of events and places (as Jewish writers were spread over the world) will constitute a timeline of history against which manuscripts, people and their relationships can be placed.
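The event-based view described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names (`Event`, `Manuscript`), their fields and the `timeline` helper are hypothetical and are not the project's ontology; the event kinds in the comments come from the examples in the text, and the sample values are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Event:
    """One milestone in a manuscript's biography (hypothetical structure)."""
    kind: str                     # e.g. "creation", "printing", "acquisition"
    year: int
    place: Optional[str] = None   # where the event happened, if known
    agent: Optional[str] = None   # person or institution involved, if known


@dataclass
class Manuscript:
    """A manuscript viewed as a 'living entity' with a list of events."""
    identifier: str
    events: List[Event] = field(default_factory=list)

    def add_event(self, event: Event) -> None:
        self.events.append(event)

    def biography(self) -> List[Event]:
        """Return the manuscript's events in chronological order."""
        return sorted(self.events, key=lambda e: e.year)


def timeline(manuscripts: List[Manuscript]) -> List[Tuple[int, Optional[str], str, str]]:
    """Merge all events into one chronological list of
    (year, place, manuscript identifier, event kind) rows."""
    rows = [
        (e.year, e.place, m.identifier, e.kind)
        for m in manuscripts
        for e in m.events
    ]
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    # Invented placeholder data, not taken from the catalog.
    ms = Manuscript("ms-A")
    ms.add_event(Event("acquisition", 1820, place="Place Y"))
    ms.add_event(Event("creation", 1610, place="Place X"))
    for event in ms.biography():
        print(event.year, event.kind, event.place)
```

Keeping events as first-class objects, rather than as flat attributes of the manuscript, is what lets people, places and other manuscripts be attached to the same point on the timeline.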

The results of our project will greatly contribute to the study of Hebrew manuscripts and of Jewish cultural heritage. They will enable posing queries and cross-referencing data from various vocabularies in the semantic web. Large-scale automated reasoning will also enable a comparison of the effect of time and place on the qualitative characteristics and quantitative distribution of manuscripts.

In particular, we found that by tracing typical and marginal script types in different regions and their changes over time, it is possible to draw the migration map of the Jewish communities over the centuries. Specifically, the waves of immigration from Western Europe can be seen clearly from the second half of the 13th century; they continued until the 17th century and created the Eastern European Jewish community.
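The kind of tally that underlies such a script-type analysis can be sketched as follows. This is an assumption-laden illustration, not the published methodology: the input format (a `(country, year, script_type)` tuple per copying event), the function names and the sample values are hypothetical. The default threshold of 5 mirrors the project's figure of script-type distribution, which marks cases of 5 copying events or less.

```python
from collections import Counter, defaultdict


def century_of(year: int) -> int:
    """Map a year to its century number, e.g. 1250 -> 13, 1300 -> 13, 1301 -> 14."""
    return (year - 1) // 100 + 1


def script_distribution(copying_events):
    """Count script types per (country, century) cell.

    Each copying event is assumed to be a (country, year, script_type) tuple.
    """
    table = defaultdict(Counter)
    for country, year, script_type in copying_events:
        table[(country, century_of(year))][script_type] += 1
    return table


def sparse_cells(table, threshold=5):
    """Return the (country, century) cells with `threshold` copying events or fewer."""
    return sorted(
        cell for cell, counts in table.items()
        if sum(counts.values()) <= threshold
    )


if __name__ == "__main__":
    # Invented placeholder events, not real catalog data.
    events = [
        ("Country A", 1280, "script-1"),
        ("Country A", 1295, "script-1"),
        ("Country B", 1410, "script-1"),
        ("Country B", 1450, "script-2"),
    ]
    table = script_distribution(events)
    for (country, century), counts in sorted(table.items()):
        print(country, century, dict(counts))
```

A script type that is typical of one country but appears as a minority in another country in a later century is the sort of signal that could then be read as a trace of migration.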

The distribution of script types in 30 different countries along the centuries (empty bars indicate no manuscript written in the country in the given period, light blue color indicates cases of 5 copying events or less).

Publications

1. Prebor, G., Zhitomirsky-Geffet, M., and Miller, Y. (2018). A new analytic framework for script type utilization as predictors of migration patterns over time. Journal of the Digital Scholarship in the Humanities.

2. Zhitomirsky-Geffet, M., Prebor, G., Buchel, O., and Bouhnik, D. (2018, June). A new methodology for error detection and data completion in a large historical catalogue based on an event ontology and network analysis. The annual International conference of the Alliance of Digital Humanities Organizations (ADHO), Mexico City, Mexico.

3. Zhitomirsky-Geffet, M., Prebor, G., and Miller, Y. (2018, November). Ontology-based analysis of the large collection of historical Hebrew manuscripts. Poster in Proceedings of the Annual Meeting of the Association for Information Science, Vancouver, Canada. ASIS&T digital library: Maryland, USA.

4. Zhitomirsky-Geffet, M. and Prebor, G. (2016). Towards an Ontopedia for historical Hebrew manuscripts. Frontiers in Digital Humanities, section of Digital Paleography and Book History.

5. Prebor, G. and Zhitomirsky-Geffet, M. (2015, June). Towards the Ontopedia for Hebrew historical manuscripts. Poster in the Proceedings of the Digital Humanities 2015 conference (DH2015), Sydney, Australia.

6. Zhitomirsky-Geffet, M. and Prebor, G. (2015, September). A new event-based ontology model for the Hebrew historical manuscripts. London, UK.