Explore Harry Potter via a dynamic social network of characters

Description: In Harry Potter, we obtain four different communities. One involves Harry’s family, including his true parents and the Dursleys, as well as their entourage. A second one is formed by Dumbledore’s friends, who are somewhat outside the scope of the book’s main plot and are mentioned during one chapter. The members of the third one are related to Quidditch: Gryffindor’s team, Lee Jordan the commentator, Slytherin and Marcus Flint, among others. Finally, the last one includes all the main characters of the plot: Harry, Ron and Hermione of course, but also Voldemort, Quirrell, Snape and Dumbledore, as well as Harry’s teachers and fellow students.

Although it is difficult to assess how good these communities are, and a few other partitions would be equally defensible, I believe the algorithm did a pretty good job.

We would now like to see whether it is possible to obtain a graph that summarises the main interactions between characters, and thus provides a sort of summary of the book.

Our intuition is to use the k-core of the static entity graph (the largest subgraph in which every node has at least k neighbours) as a summary of the book. To assess the relevance of this approach, we can easily find a good book summary to compare it with. The idea is that the k-core of the full graph should be similar to the graph built from the summary of the book.
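To make the k-core idea concrete, here is a minimal sketch in plain Python. It is not the code used for the post: the edge list is a made-up toy example, and the function name `k_core` and the value of k are assumptions chosen for illustration. It peels away nodes with fewer than k neighbours until none are left to remove.

```python
def k_core(edges, k):
    """Return the set of nodes in the k-core of an undirected graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored.
    """
    # Build an adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Repeatedly peel off nodes that have fewer than k remaining neighbours.
    to_remove = [node for node, nbrs in adj.items() if len(nbrs) < k]
    while to_remove:
        node = to_remove.pop()
        if node not in adj:  # already removed through another path
            continue
        for nbr in adj.pop(node):
            adj[nbr].discard(node)
            if len(adj[nbr]) < k:
                to_remove.append(nbr)
    return set(adj)


# Hypothetical co-occurrence edges, invented for this example only.
toy_edges = [
    ("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione"),
    ("Harry", "Snape"), ("Ron", "Snape"), ("Hermione", "Snape"),
    ("Harry", "Hagrid"), ("Harry", "Dudley"), ("Dudley", "Vernon"),
]
core_nodes = k_core(toy_edges, k=3)
print(sorted(core_nodes))
```

On a real graph, a library routine such as networkx's `k_core` would be the more usual choice. The hand-written version is shown only to expose the peeling step.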

In practice, the two graphs have nearly the same number of nodes, the same connectedness and similar character relations. The entities involved differ only slightly, with an IoU score of 0.5 (the size of the intersection divided by the size of the union of the two sets of characters). The main differences lie with the inclusion of the Dursleys and the Weasleys. In conclusion, the graph obtained from the k-core of the full book can serve as a summary of the book’s interactions.
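The IoU score mentioned above is the Jaccard index of the two character sets. The sketch below shows the computation. The two sets are invented for illustration and are not the actual node sets from the post; the helper name `iou` is likewise an assumption.

```python
def iou(a, b):
    """Intersection over union (Jaccard index) of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical character sets, made up for this example.
core_characters = {"Harry", "Ron", "Hermione", "Snape", "Dudley", "Vernon"}
summary_characters = {"Harry", "Ron", "Hermione", "Snape", "Quirrell", "Hagrid"}

score = iou(core_characters, summary_characters)
print(score)
```

With these toy sets, four characters are shared out of eight distinct ones, so the score is 0.5.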

Comparison with Lord of the Rings. We compare the writing style of Harry Potter and the Philosopher’s Stone with that of The Lord of the Rings: The Fellowship of the Ring. More precisely, we run the exact same analysis on both books and look for similarities and differences between them. For instance, although LOTR is much longer, the two books contain a similar number of entities and interactions. However, unlike Harry Potter, LOTR is centered around several characters and several plots.

Going further, we could compute an embedding of the full dynamic graph (as is done for graph classification in the field of network representation learning) and calculate the distance between the two graphs’ embeddings. This would show which books are most similar. Note that it would be relevant only if more than two books were processed (not the case here), so as to have an idea of the order of magnitude of the metric.
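As a rough illustration of this idea, the sketch below replaces a learned graph embedding with a tiny hand-crafted feature vector (density, share of edge endpoints held by the busiest node, fraction of leaf nodes) and compares graphs by Euclidean distance. This is a deliberate simplification and not a representation-learning method. The three toy graphs and the function names are assumptions made for the example.

```python
import math


def graph_features(edges):
    """Crude stand-in for a graph embedding.

    Assumes an undirected edge list with no duplicates or self-loops.
    """
    nodes = {n for edge in edges for n in edge}
    n, m = len(nodes), len(edges)
    if n < 2 or m == 0:
        return (0.0, 0.0, 0.0)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    density = 2 * m / (n * (n - 1))
    hub_share = max(degree.values()) / (2 * m)  # dominance of one character
    leaf_fraction = sum(1 for d in degree.values() if d == 1) / n
    return (density, hub_share, leaf_fraction)


def graph_distance(edges_a, edges_b):
    """Euclidean distance between the two feature vectors."""
    return math.dist(graph_features(edges_a), graph_features(edges_b))


# Toy graphs: two hub-centred "books" and one with no dominant character.
star_a = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]
star_b = [("P", "Q"), ("P", "R"), ("P", "S"), ("P", "T"), ("P", "U")]
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]

d_similar = graph_distance(star_a, star_b)
d_different = graph_distance(star_a, ring)
print(d_similar, d_different)
```

A single distance says little on its own. It is the comparison between several pairs, as with the three graphs here, that gives the metric a scale.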

In this post, we constructed and analysed a dynamic version of a novel’s social character network. We first processed the book’s textual content, extracting all character occurrences using BERT NER and placing special emphasis on coreference resolution to obtain graphs that are as accurate as possible. We then created a dynamic heterogeneous graph framework that embeds a time dimension into our network, and focused on five different tasks: character importance, structural change, community detection, graph summarisation and book comparison. Conducting them gave us valuable information about the book’s plot, the author’s style, differences with other books’ writing style, and the characters (importance, relations, role, main interactions, etc.).

Our network analysis confirmed some expectations and provided new insights into the richly imagined book Harry Potter and the Philosopher’s Stone. Note that we have considered a fanciful application of network science to give an enticing taste of its capabilities. Other related tasks include clustering of novels, predicting the genre, the author or even new character interactions. More generally, more serious applications abound, and network science promises to be invaluable in understanding our modern networked life.

[1] Structure-based clustering of novels. Mariona Coll Ardanuy and Caroline Sporleder. 2014.

[2] Telling Stories about Dynamic Networks with Graph Comics. Benjamin Bach et al. 2016.

[3] Network of Thrones. Andrew Beveridge and Jie Shan. 2016.

[4] Extracting Social Networks from Literary Fiction. David Elson et al. 2010.

[5] Extraction and Analysis of Fictional Character Networks: A Survey. Vincent Labatut and Xavier Bost. 2019.

[6] How to Tell Stories with Networks: Exploring the Narrative Affordances of Graphs with the Iliad. Tommaso Venturini et al. 2017.

