02 NOV 2023 ISSUE 14
5. What’s New & Upcoming?

 Data Extraction, Visualisation and Storytelling: A Case Study in Headline Analysis of The Hongkong News with Deep Learning
 Photo Album Creation for historical newspapers through computer vision by using ABBYY, VGG model, and Yolov5



Data Extraction, Visualisation and Storytelling: A Case Study in Headline Analysis of The Hongkong News with Deep Learning

Digital scholarship tools now provide scholars with multiple approaches to exploring history. They can study an interactive timeline, browse the massive volume of digitised textual content or investigate major events via newspaper headlines. This project explores temporal and spatial issues of narrative through the concepts of geography and space. Contemporary lives linked to the events and times related to newspaper articles can be discovered through narratives. This represents another approach to applying narrative in storytelling for digital scholarship. The findings of the study were published in the proceedings of the Archiving 2023 International Conference.

This project provides a method of extracting headlines and illustrations from a newspaper to use in storytelling that supports digital scholarship. The Hongkong News was selected from Hong Kong Early Tabloid Newspapers for the case study, due to its unique historical value.


The proposed methods were evaluated using Optical Character Recognition (OCR) with scraping and Deep Learning Object Detection models. Two visualisation products were developed to showcase the feasibility of our proposed methods in terms of effective storytelling.


Product Showcase  
Timeline Visualization
Geodata Visualization



Photo Album Creation for historical newspapers through computer vision by using ABBYY, VGG model, and Yolov5

The Hong Kong Early Tabloid Newspapers collection was launched in 2022 and mainly consists of tabloid newspapers published in Hong Kong during the 20th century. Unlike serious broadsheet newspapers, tabloid newspapers focus on leisure and entertainment content that appeals to the Hong Kong public, covering topics such as politics, genre fiction, operas, drama, comics and pornography.

In this study, Amusement News《娛樂之音》was selected for its entertainment focus, especially its movie- and Cantonese opera-oriented content, and for its many printed illustrations. Our novel approach involved building digital image albums and storytelling with images from historical newspapers using Computer Vision and Machine/Deep Learning. The findings of the study were published in the proceedings of the 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence (PRAI).

We aimed to create a series of online albums featuring images automatically detected, extracted, and categorised from the newspapers. Three methods were evaluated and compared in terms of accuracy: ABBYY, a commercial software package for optical character recognition and object detection; the VGG nls-chapbook model, a visual analysis tool based on the EfficientDet object detection model; and Yolov5, an object detection model from the family of single-stage deep learning-based object detectors. The extracted images were then categorised by actor or actress to construct an individual album for each person.
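For readers curious how detectors like these can be compared, the following is a minimal Python sketch. It is not the project's own evaluation code, and the article does not state which accuracy measure was used. The sketch assumes a common convention: a detected illustration counts as correct when its bounding box overlaps a hand-labelled box with an intersection over union (IoU) of at least 0.5. The names `iou`, `precision_recall` and the threshold value are illustrative assumptions.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in page pixels (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    predicted: List[Box], truth: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each predicted box to at most one unmatched labelled box."""
    unmatched = list(truth)
    true_positives = 0
    for box in predicted:
        # Pick the remaining labelled box that overlaps this prediction most.
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical page: two labelled illustrations, one found, one false alarm.
labelled = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(0, 0, 10, 10), (50, 50, 60, 60)]
print(precision_recall(detected, labelled))
```

Running the same function over the output of each detector on the same labelled pages would give comparable precision and recall figures per method.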



