Multi-perspective ontology for images in Jewish cultural heritage

Research funded by the ISF research grant 2007-2010.

The research was conducted in cooperation with Prof. Judit Bar-Ilan, Prof. Snunith Shoham and Dr. Yitzhak Miller from the Department of Information Science at Bar-Ilan University.
The objective of this work was to develop a general framework that incorporates collaborative social tagging with a novel ontology scheme conveying multiple perspectives. We proposed a framework in which multiple users tagged the same object (an image, in our case), and an ontology was created and extended based on these tags while remaining tolerant of different points of view. Both the tagging and the ontological models were intentionally designed to suit the multi-perspective environment. The proposed framework characterized the underlying processes for the collaborative development of a multi-perspective ontology and its application to improving image annotation, searching and browsing.
To construct the ontology, each user initially tagged the images without seeing the tags provided by other users. Then the users saw the tags assigned by others and were encouraged to interact. The results show that after the social interaction phase the tag sets converged and the popular tags became even more popular. Although the total number of assigned tags increased in all cases after the social interaction phase, the number of distinct tags decreased in most cases. When viewing the images alone, users were in some cases unable to correctly identify what they saw, but they overcame these initial difficulties after interaction. From this experiment we concluded that social interaction may lead to convergence in tagging and that the “wisdom of the crowds” helps overcome difficulties caused by a lack of information.
Once the initial ontology had been constructed from the most popular user tags, we investigated the effectiveness of an ontology-based interface for further image tagging and retrieval. To this end, we defined four evaluation criteria for tag quality and compared three types of user interfaces for image tagging: free-text, ontology-based, and a mixed interface that incorporates both free-text and ontology-based tagging. We found that ontological tags always achieved broader user agreement and the highest average popularity scores, and were more stable during the tag modification stages of the experiment. On the other hand, the free-text interface, when available before or in parallel with the ontology, was perceived as easier to use and therefore produced more tags. The conclusion was that an ontology could be employed very effectively for image tagging when no other interface was available before or at the time of seeing the ontology. The results also revealed the complementary nature of free-text and ontological tags, which created a basis for a dynamic process of collaborative ontology extension.
In another experiment, each participant had to search for images matching certain predefined scenarios using one of four retrieval interfaces: tag search in a search box; faceted tag search in a search box; selecting terms from the tag cloud of all the tags in the database; and selecting concepts from an ontology created from the tags assigned to the images. The results showed that users of the ontology interface achieved the highest average recall in seven out of the ten tasks. However, users were more satisfied with the textbox-based search than with the cloud or the ontology.
The significance of this research was that it focused on exploring effective ways of employing the proposed multi-perspective ontological model both for retrieval and for tagging of visual objects.
An additional contribution of this project was the creation of an annotated image collection of hundreds of pictures in the area of Jewish cultural heritage, which were also indexed by the ontology concepts and thus could be effectively retrieved.

Publications:

  1. Zhitomirsky-Geffet, M., Bar-Ilan, J., Miller, Y. and Shoham, S. 2012. Exploring the effectiveness of folksonomy based tagging vs. free text tagging. Book chapter in Indexing and Retrieval of Non-Text Information. Ed. by Rasmussen Neal, Diane. Series: Knowledge and Information / Studies in Information Science.
  2. Bar-Ilan, J., Zhitomirsky-Geffet, M., Miller, Y. and Shoham, S. 2012. Tag-based Retrieval of Images through Different Interfaces – A User Study. Online Information Review.
  3. Bar-Ilan, J., Zhitomirsky-Geffet, M., Miller, Y. and Shoham, S. “The effects of background information and social interaction on image tagging”. The Journal of the American Society for Information Science and Technology (JASIST), 61(5), 940-951. 2010.
  4. Bar-Ilan, J., Zhitomirsky-Geffet, M., Miller, Y. and Shoham, S. “Tag cloud and ontology based retrieval of images”. In Proceedings of the Third Symposium of Information Interaction in Context (IIiX), 2010, pp. 85-94.
  5. Zhitomirsky-Geffet, M., Bar-Ilan, J., Miller, Y. and Shoham, S. “A Generic Framework for Collaborative Multi-perspective Ontology Acquisition”. Online Information Review. Vol. 1. 2010.
  6. Zhitomirsky-Geffet, M., Bar-Ilan, J., Miller, Y. and Shoham, S. “A Generic Framework for Collaborative Multi-perspective Ontology Acquisition”. The 17th International World Wide Web Conference (WWW2008). Beijing, China. 2008.

Towards a cross-generation social network for Jewish sages

The Mishna networks:

The diagram presents the network which contains only disagreement relationships (edges in the graph) between the sages (nodes in the graph). The most frequent sages in the corpus appear in larger nodes, the edges’ width reflects the number of disagreement relations for a pair of sages, and the different colors represent the cliques of the inter-connected sages.

The diagram presents the network which contains only agreement/support relationships (represented as edges) between the sages (represented as nodes). The most frequent sages in the corpus appear in larger nodes, the edges’ width reflects the number of relations for a pair of sages, and the different colors represent the cliques of the inter-connected sages.

The diagram presents the directed graph of the citation relation network, with nodes representing the sages and the incoming arrows of a node representing the sages who cite the corresponding sage. The nodes’ size shows the frequency of the sages’ appearance in the corpus, the arrows’ width reflects the number of citation relations for a pair of sages, and the different colors represent the cliques of the inter-connected sages.

The frequencies of occurrences of the top-40 popular sages in each of the three versions of the Mishna and on average:

Jewish Biblical and Rabbinic literature is a great source of ancient wisdom and cultural heritage. It includes a large number of people such as prophets, political and religious leaders, sages and other historical figures. Amazingly, although these people were spread over the world and through different time periods, they were united and connected by the same text – the Bible. Therefore, the aim of this research is to propose and implement a methodology for constructing a cross-generation social network for Jewish sages in order to explore their inter-relationships on a large scale, using modern computerized tools for text analysis and graph mining.

In the first stage we defined the corpus of the study and the digital resources for this corpus (we used three different versions of the Mishna from the Sefaria repository). We work with the text of the Mishna (2nd century CE) and Talmud (4th-5th century CE).

Publications:

Zhitomirsky-Geffet, M. and Prebor, G. 2018. SageBook: A cross-generation network of the Jewish sages. Journal of the Digital Scholarship in the Humanities.

Zhitomirsky-Geffet, M. and Prebor, G. “Towards a cross-generation social network for Jewish sages”. Poster in the Proceedings of the Digital Humanities 2016 conference (DH2016), Kraków, Poland, July 2016.

The devised generic pattern of sage’s name representation was:

[title] + private_name_variation + [nickname] + [“son of” + [father_title + father_private_name_variation] + [father_nickname]]

Examples of typical full names of sages (translated into English) are:

[title]   [private_name]            [father_title]   [father_name]   [father_nickname / location]
Rabban    Simeon           son of   Rabban           Gamaliel        of Yavne

[title]   [private_name]   [nickname]            [father_name]
Rabbi     Eliezer          the Great    son of   Hyrcanus
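
The generic name pattern above can be approximated with a regular expression. The following is only a minimal sketch, assuming English transliterations and a short, hypothetical list of titles; the study itself worked on the Hebrew text, and the names `TITLES`, `NAME_RE` and `parse_sage_name` are illustrative rather than taken from the project. As a simplification, the sketch requires a father's name whenever "son of" is present.

```python
import re

# Hypothetical English title list (an assumption for illustration only).
TITLES = r"(?:Rabban|Rabbi|Rav)"

# [title] + private_name + [nickname] + ["son of" + [father_title] + father_name + [father_nickname]]
NAME_RE = re.compile(
    rf"""^
    (?:(?P<title>{TITLES})\s+)?                        # optional title
    (?P<private_name>[A-Z][a-z]+)                      # private name
    (?:\s+(?P<nickname>the\s+[A-Z][a-z]+))?            # optional nickname, e.g. "the Great"
    (?:\s+son\s+of
        (?:\s+(?P<father_title>{TITLES}))?             # optional father's title
        \s+(?P<father_name>[A-Z][a-z]+)                # father's private name
        (?:\s+(?P<father_nickname>of\s+[A-Z][a-z]+))?  # optional nickname / location
    )?
    $""",
    re.VERBOSE,
)


def parse_sage_name(text):
    """Return the name components that are present, or None if the text does not fit the pattern."""
    match = NAME_RE.match(text.strip())
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


if __name__ == "__main__":
    print(parse_sage_name("Rabban Simeon son of Rabban Gamaliel of Yavne"))
    print(parse_sage_name("Rabbi Eliezer the Great son of Hyrcanus"))
```

A real implementation would also need to handle spelling variations of each private name, which this sketch does not attempt.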

The patterns for relationship extraction:

Pattern no. | Pattern expression | No. of correct relationships | No. of incorrect relationships | Relationship type
1 | Say/s/id [to him] Sage A [*.] Say/s/id [to him] Sage B | 38 | 15 | Disagreement
2 | Say/s/id [to him] Sage A [*.] And Sage B says/id | 3 | 2 | Disagreement
3 | Say/s/id [to him] Sage A [*.] Sage B says/id | 26 | 14 | Disagreement
4 | These are the words of Sage A [*.] And Sage B <ruling verb> | 7 | 0 | Disagreement
5 | These are the words of Sage A [*.] And Sage B say/s/id | 15 | 0 | Disagreement
6 | These are the words of Sage A [*.] Say/s/id [to him] Sage B | 27 | 8 | Disagreement
7 | These are the words of Sage A [*.] Sage B <ruling verb> | 9 | 3 | Disagreement
8 | These are the words of Sage A [*.] Sage B say/s/id | 204 | 23 | Disagreement
9 | Sage A <ruling verb> [*.] Say/s/id [to him] Sage B | 16 | 13 | Disagreement
10 | Sage A <ruling verb> [*.] Sage B say/s/id | 44 | 27 | Disagreement
11 | Sage A say/s/id [*.] Say/s/id [to him] Sage B | 97 | 24 | Disagreement
12 | Sage A say/s/id [*.] And Sage B say/s/id | 167 | 12 | Disagreement
13 | Sage A say/s/id [*.] Sage B say/s/id | 571 | 211 | Disagreement
14 | Sage A say/s/id [*.] the words of Sage B | 6 | 3 | Disagreement
15 | Sage A forbid/s [*.] Sage B permit/s | 5 | 2 | Disagreement
16 | Sage A purify/ies [*.] Sage B rule/s impure | 7 | 3 | Disagreement
17 | Sage A rule/s impure [*.] Sage B purify/ies | 9 | 4 | Disagreement
18 | Sage A obligate/s [*.] Sage B exempt/s | 3 | 0 | Disagreement
19 | Sage A obligates/s [*.] Sage B exempt/s | 5 | 0 | Disagreement
20 | Sage A disqualify/ies [*.] Sage B qualify/ies | 5 | 0 | Disagreement
21 | Sage A qualify/ies [*.] Sage B disqualify/ies | 3 | 2 | Disagreement
22 | Sage A in the name of Sage B | 7 | 0 | Citation
23 | Sage A says in the name of Sage B | 19 | 0 | Citation
24 | Sage A that says in the name of Sage B | 1 | 0 | Citation
25 | Sage A received from Sage B | 2 | 0 | Citation
26 | Sage A and Sage B <a verb but not “say”> | 24 | 9 | Agreement / Support
27 | Sage A and Sage B say | 27 | 0 | Agreement / Support
28 | Sage A to rule as Sage B | 2 | 1 | Agreement / Support
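
One way to read the correct/incorrect counts in the table is as a per-pattern precision, i.e. correct / (correct + incorrect). The sketch below is not part of the original study; it simply copies the counts from the table and aggregates them per relationship type. The names `PATTERN_COUNTS`, `precision` and `totals_by_type` are illustrative assumptions.

```python
from collections import defaultdict

# (pattern no., correct, incorrect, relationship type), copied from the table above.
PATTERN_COUNTS = [
    (1, 38, 15, "Disagreement"), (2, 3, 2, "Disagreement"),
    (3, 26, 14, "Disagreement"), (4, 7, 0, "Disagreement"),
    (5, 15, 0, "Disagreement"), (6, 27, 8, "Disagreement"),
    (7, 9, 3, "Disagreement"), (8, 204, 23, "Disagreement"),
    (9, 16, 13, "Disagreement"), (10, 44, 27, "Disagreement"),
    (11, 97, 24, "Disagreement"), (12, 167, 12, "Disagreement"),
    (13, 571, 211, "Disagreement"), (14, 6, 3, "Disagreement"),
    (15, 5, 2, "Disagreement"), (16, 7, 3, "Disagreement"),
    (17, 9, 4, "Disagreement"), (18, 3, 0, "Disagreement"),
    (19, 5, 0, "Disagreement"), (20, 5, 0, "Disagreement"),
    (21, 3, 2, "Disagreement"), (22, 7, 0, "Citation"),
    (23, 19, 0, "Citation"), (24, 1, 0, "Citation"),
    (25, 2, 0, "Citation"), (26, 24, 9, "Agreement / Support"),
    (27, 27, 0, "Agreement / Support"), (28, 2, 1, "Agreement / Support"),
]


def precision(correct, incorrect):
    """Share of extracted relationships judged correct; None if nothing was extracted."""
    total = correct + incorrect
    return correct / total if total else None


def totals_by_type(rows):
    """Sum the correct and incorrect counts for each relationship type."""
    totals = defaultdict(lambda: [0, 0])
    for _, correct, incorrect, rel_type in rows:
        totals[rel_type][0] += correct
        totals[rel_type][1] += incorrect
    return {rel_type: tuple(counts) for rel_type, counts in totals.items()}


if __name__ == "__main__":
    for rel_type, (correct, incorrect) in totals_by_type(PATTERN_COUNTS).items():
        print(f"{rel_type}: {correct} correct, {incorrect} incorrect, "
              f"precision {precision(correct, incorrect):.3f}")
```

Under this reading, the most productive pattern (no. 13) yields 571 correct out of 782 extractions, roughly 73%, while the four citation patterns produce no incorrect extractions at all.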

The figures above visualize the social network of the sages for each relationship type. In total, 347 distinct pairs of sages take part in the constructed network: 304 pairs disagree with each other, 50 agree with or support each other, and 39 cite one another. 142 distinct pairs of sages appeared in a relationship in more than one Mishna item. 38 pairs appeared in two different types of relationships in different items of the Mishna (e.g. at least once in a disagreement and at least once in a support relationship), 4 pairs appeared at least once in all three types of relationships, and the remaining 305 (88%) appeared in only one relationship type (once or several times) in the entire text of the Mishna.
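
The pair counts reported above can be cross-checked against each other: the pairs appearing in one, two or three relationship types should add up to all 347 pairs, and a pair appearing in k types should be counted k times across the per-type totals. The following sketch performs that arithmetic on the reported numbers only; the function and argument names are illustrative assumptions.

```python
def check_pair_counts(total_pairs, pairs_per_type, pairs_by_type_count):
    """Cross-check the reported pair counts.

    pairs_per_type: relationship type -> number of distinct pairs of that type.
    pairs_by_type_count: k -> number of pairs appearing in exactly k relationship types.
    """
    # Every pair falls into exactly one "number of types" bucket.
    partition_ok = sum(pairs_by_type_count.values()) == total_pairs
    # A pair that appears in k types contributes k to the sum of per-type totals.
    memberships = sum(k * n for k, n in pairs_by_type_count.items())
    memberships_ok = memberships == sum(pairs_per_type.values())
    single_type_share = pairs_by_type_count[1] / total_pairs
    return partition_ok, memberships_ok, single_type_share


if __name__ == "__main__":
    result = check_pair_counts(
        total_pairs=347,
        pairs_per_type={"disagreement": 304, "agreement/support": 50, "citation": 39},
        pairs_by_type_count={1: 305, 2: 38, 3: 4},
    )
    print(result)
```

Both checks hold for the reported figures (305 + 38 + 4 = 347, and 305 + 2·38 + 3·4 = 393 = 304 + 50 + 39), and 305 / 347 rounds to the stated 88%.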

For 70% (239) of the pairs, our database contained generation information, extracted from the traditional research literature, for both sages in the pair. We found that the majority of the related pairs (69%) in the network were cross-generational (i.e. comprised sages who lived in different generations). For the agreement/support relationship the proportion of same-generation pairs was somewhat higher (44%) than in the network as a whole, while for the citation relationship 85% of the pairs were cross-generational (as could be expected). In addition, our analysis revealed that in 55 pairs the sages belonged to successive generations. When we recalculated the percentage of cross-generational pairs treating these 55 pairs as same-generation, since they comprise sages with overlapping lifetimes, only 27% of the pairs remained cross-generational.

To put the obtained quantitative results into their historical context, we next shed some light on the development of the Halakha (Jewish law) over time. The first stage of determining the Halakha was when the Sanhedrin (the ancient Jewish high court) sat in the Chamber of Hewn Stone. At that time no disagreements were possible. Rabbi Jose dealt with the issue of disagreements in Jewish law and offered a historical explanation. During the period of the “Five Pairs” (Zugot), there was only one dispute that lasted for several generations: whether one who offers a Chagigah sacrifice on a Festival leans his hands on it or not. Shammai and Hillel (the fifth pair) disagreed concerning three other things. At a later stage, the disagreements increased. These stages of the disputes are detailed in Rabbi Jose’s words: “It is taught in … that Rabbi Jose said: Initially, discord would not proliferate among Israel. Rather, the court of seventy-one judges would sit in the Chamber of Hewn Stone. And there were two additional courts each consisting of twenty-three judges; one would convene at the entrance to the Temple Mount, and one would convene at the entrance to the Temple courtyard. And all the other courts consisting of twenty-three judges would convene in all cities inhabited by the Jewish people. From the time that the disciples of Shammai and Hillel grew in number, and they were disciples who did not attend to their masters to the requisite degree, dispute proliferated among the Jewish people and the Torah became like two Torahs. Two disparate systems of Halakha developed, and there was no longer a Halakhic consensus with regard to every matter…” (Talmud, Sanhedrin Daf 88b, English from The William Davidson digital edition of the Koren Noé Talmud, with commentary by Rabbi Adin Even-Israel Steinsaltz).

The figures above clearly illustrate what the Talmud states. As demonstrated in the figures, the most dominant relationship type in the constructed network (91% of all the relationships) is indeed the disagreement relationship between the sages. Furthermore, we can see the strong relationship between the school of Shammai and the school of Hillel, with a very wide edge connecting them. This reflects the high number of disagreement relations between them.

A more in-depth historical analysis indicates that the strong relationships and the cliques in the network (computed using the algorithm of Blondel et al. (2008)), which are based only on textual connections between sages, are correlated with the generations of the sages and with information from traditional research. Thus, the brown clique represents Period 1, that of the “Five Pairs”. The sages of the “Five Pairs” in the brown clique (a total of 10 sages) have connections only among themselves and are not related to other sages, with one exception, namely Shammai. Period 2, the first generation of Tannaim (generation 6), is the generation of the destruction of the Temple, which is represented by the red clique in the network, with the sages Johanan b. Zakkai, Hanina b. Antigonus, Akabya b. Mehalaleel, Judah b. Bathyra and others. During this period, too, there are few connections between sages. The third period, that of Yavne (generation 7, after the destruction of the Temple), is represented in blue. The most prominent figures are Eliezer b. Hyrcanus and Joshua b. Hananiah, who had many disputes. The fourth period (generation 8) is represented in green. The most prominent figure is Akiba b. Joseph, who had many disagreements with Ishmael b. Elisha, his colleague and contemporary, as well as with his teacher Eliezer b. Hyrcanus of the previous generation. The last period of the Tannaim, after the Bar Kokhba revolt (generations 9-10), is represented in yellow. In this period we found four prominent figures: Judah b. Ilai, Meir, Jose b. Halafta and Simeon b. Yohai. All of them were students of Rabbi Akiba b. Joseph, and, as we can see in the disagreement network (Figure 1), there were many disagreements between them. Judah b. Ilai has the largest node in the figure, which is not surprising because he is the Tanna whose rulings are mentioned in the Mishna more than those of any other sage.
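
The algorithm of Blondel et al. (2008), commonly known as the Louvain method, groups nodes by greedily maximizing a score called modularity, which compares the edge weight inside each group with what would be expected if edges were placed at random. The sketch below is not the full algorithm and not the project's code; it only computes the modularity of a given grouping for a weighted, undirected graph, assuming no self-loops. The node names and edge weights are toy values chosen for illustration, not data from the study.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of a weighted, undirected graph without self-loops.

    edges: iterable of (u, v, weight), each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    degree = defaultdict(float)    # weighted degree of each node
    internal = defaultdict(float)  # total edge weight inside each community
    total_weight = 0.0
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        total_weight += w
        if community[u] == community[v]:
            internal[community[u]] += w

    community_degree = defaultdict(float)
    for node, d in degree.items():
        community_degree[community[node]] += d

    two_m = 2.0 * total_weight
    # Q = sum over communities of (internal weight / m) - (community degree / 2m)^2
    return sum(
        2.0 * internal[c] / two_m - (community_degree[c] / two_m) ** 2
        for c in community_degree
    )


if __name__ == "__main__":
    # Toy graph: two triangles joined by a single edge (hypothetical weights).
    toy_edges = [
        ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
        ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
        ("c", "d", 1),
    ]
    two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
    one_group = {node: 0 for node in two_groups}
    print(modularity(toy_edges, two_groups))
    print(modularity(toy_edges, one_group))
```

In the sages' network, the edge weight would presumably be the number of relations between a pair of sages, so that frequently disputing sages tend to end up in the same colored group.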

Figure 2 above presents the network which contains agreement/support relationships between the sages. It can immediately be seen that the network is thinner and includes only 84 support relationships (5.6%). This does not mean that there was no agreement between the sages; the Mishna is very concise and does not present discussions in full. If two sages agreed with or supported each other, it was not necessary to mention both of them in the text. However, if there was a disagreement and each of them held a different opinion, it was important for the Mishna to present both opinions. We also found that the most prominent figure is Judah b. Ilai, with the largest node; his support relationships are with the same sages with whom he also had many disagreements, his colleagues Meir and Simeon b. Yohai. Simeon b. Yohai also has a connection with his colleague Jose b. Halafta and with his teacher’s teacher Eliezer b. Hyrcanus.

Figure 3 displays the citation relationships network. Only 47 citation relationships were identified in the study (3.1%) and, as mentioned, the vast majority of them are cross-generational. For example, Jose b. Halafta cites his teacher Akiba b. Joseph. The most quoted sages are Meir, Johanan b. Zakkai and Joshua b. Hananiah. This finding matches and strengthens the observations of traditional research on the Mishnaic text.