02 JAN 2024 ISSUE 18
8. Department Summer Internship Programme

  • The Census and Statistics Department, HKSAR




CHAN, Hin Long
BSc in Statistics



I am grateful for the opportunity to work in the Trade Research Analysis Section (2) of the Trade Statistics Branch (2) during my summer internship at the Census and Statistics Department (C&SD). Under the guidance of my mentor, Mr. Tony Ho, my main task was to identify abnormal addresses in manifests and trade declaration forms. To accomplish this, I created a model that predicts the country code of an address and checks whether the prediction agrees with other information in the same document. The model uses a combination of different approaches to perform the task. Initially, it adopts rule-based methods to make a preliminary match. It then performs further matching by tokenisation and classification based on the previous matching result. The model did a commendable job in sorting out abnormal addresses.

During the process of creating the model, I learned to use various Python packages. The most frequently used package was the Regex library, which provides great flexibility in manipulating strings. It was extremely useful in dealing with addresses.
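
For illustration only, a rule-based first pass of the kind described above might be sketched in Python with the standard `re` module. This is not the model built during the internship: the country patterns and the function names `guess_country_code` and `is_abnormal` are hypothetical, and a real system would need far more rules plus the later tokenisation and classification steps.

```python
import re

# Hypothetical rules (assumptions for illustration): each country code maps
# to a pattern that looks for a country name at the end of an address.
COUNTRY_PATTERNS = {
    "HK": re.compile(r"\b(hong\s*kong|hksar)\W*$", re.IGNORECASE),
    "JP": re.compile(r"\bjapan\W*$", re.IGNORECASE),
    "US": re.compile(r"\b(u\.?s\.?a\.?|united\s+states)\W*$", re.IGNORECASE),
}


def guess_country_code(address):
    """Return the first country code whose pattern matches, else None."""
    for code, pattern in COUNTRY_PATTERNS.items():
        if pattern.search(address):
            return code
    return None


def is_abnormal(address, declared_code):
    """Flag an address whose guessed code disagrees with the declared one."""
    guessed = guess_country_code(address)
    # Addresses that no rule recognises are left for later steps, not flagged.
    return guessed is not None and guessed != declared_code
```

In this sketch, an address that no rule recognises is not flagged, on the assumption that it would be passed on to a further matching stage.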

During my time at CUHK, I worked under the supervision of Prof. Yam. My task was to test the performance of the Comonotone-Independence Bayes classifier (CIBer) algorithm on different datasets, compared with the performances of other classical models, such as linear regression models. I also had the opportunity to explore various topics, including principal component analysis, Poisson regression and regression trees. One of the major challenges that I encountered was troubleshooting errors, as I was unfamiliar with the algorithms and locating the errors could be difficult. Additionally, finding suitable datasets for my tasks presented a difficulty. Despite these challenges, the experience was valuable, as I learned about concepts that I had never encountered before, such as principal component analysis and copulas.

In summary, the joint programme provided a highly valuable experience. It expanded my knowledge beyond the scope of regular studies and provided me with a better understanding of a working environment.



FAN, Jingyi
BSc in Statistics



I am truly privileged to have been appointed to the Trade Research and Analytics Branch (Section 1) of the Census and Statistics Department (C&SD), under the stewardship of my supervisor, Mr. Ian Ng.

In traditional statistical work, information is extracted primarily by processing numerical data, and a significant proportion of textual data remains underutilised. Our initial task was primarily centred on the analysis of textual data. Through this work, we gained proficiency in applying the word embedding method to transform words into vectors and identify textual outliers. Additionally, we acquired skills in machine learning, such as the application of supervised and unsupervised models for the classification of textual data.

During my time at CUHK, I was supervised by Dr. Ho Kwok Wah. One of the tasks assigned to me was to transcribe R code of a Bayesian analysis algorithm into Python. This assignment not only enriched my Python coding skills but also deepened my comprehension of the Bayesian analysis domain. Furthermore, I acquired skills for extracting data from various databases, a competence that will be of immense value in my future academic pursuits and career advancement.

I would like to express my sincere gratitude to Mr. Ian Ng, Dr. Ho and all those who have supported and guided me throughout my journey. The knowledge and skills that I have gained during my time at the Trade Research and Analytics Branch and at CUHK are invaluable, and I am truly appreciative of the opportunities provided to me.



MAO, Wenxin
BSc in Statistics

I am deeply grateful for the invaluable opportunity to intern at the Census and Statistics Department. During my internship, I was stationed in the Trade Research and Analytics Section under the guidance of my supervisor, Mr. Ian Ng. My main role involved detecting outliers in commodity descriptions and developing neural network models for classification. Through this, I acquired proficiency in handling textual data using natural language processing (NLP), mastering techniques such as word embeddings and textual similarity calculations. I familiarised myself with both supervised and unsupervised machine learning methods for the classification of extensive textual datasets. Additionally, I enhanced my data visualisation skills using principal component analysis and honed my exploratory data analysis capabilities in Excel, particularly with pivot tables. The internship programme at C&SD was a well-structured and enriching experience. I appreciated the balance between guided learning and opportunities for hands-on application. The mentors were knowledgeable and always approachable, fostering an environment of open communication and continuous feedback.

Simultaneously, I had the privilege of working closely with Dr. Liu at the university, exploring the realm of causal inference. Under his mentorship, I not only grasped various causal inference analysis methods but also evaluated them according to accuracy, precision and efficiency. My research further led me to incorporate simple neural network architectures to improve traditional causal inference models. The culmination of this project involved applying our models to real-world datasets and validating their applicability and effectiveness. Initially, the concept of causal inference was an enigma to me. However, Dr. Liu’s patience and methodical guidance transformed a difficult challenge into an enlightening journey.

I extend my deepest gratitude to my supervisors Mr. Ian Ng, Mr. Benjamin Chan and Dr. Liu for their unwavering guidance and support. Undoubtedly, these experiences have laid a strong foundation for my future studies and career. The hands-on exposure to causal inference, NLP and machine learning provides a robust foundation for any data-driven field that I may pursue, ensuring that I hit the ground running in conducting advanced coursework or research. Whether I delve deeper into data science, transition into sectors like finance, or embark on academic research, this internship has equipped me with a versatile toolkit and practical experience that will undoubtedly be invaluable.



MIN, Weidi
BSc in Statistics



Through my internship, I am honoured to have been given the opportunity to work in the National Income Division (1)3 of the C&SD. The main duty of this division is to conduct data collection and calculate Hong Kong’s gross domestic product by expenditure.

At C&SD, I was responsible for building and testing different estimation models for predicting Hong Kong resident departures by destination. If Hong Kong resident departures by destination can be estimated, the estimates can contribute to the estimation of the import component of the gross domestic product. I first conducted desktop research on estimation models and then adjusted the models according to my research. The project is particularly innovative in its approach to predicting resident departures, given that records in Hong Kong only exist prior to 2005. With kind and patient guidance from my mentor, Louise, I constructed time series models and compared the performances of the different models.

In addition, I analysed 13F files using code under the supervision of Prof. Tony Sit. This was a precious opportunity to receive coding project training as I am planning to pursue a career in data science. Prof. Sit provided me with much instruction to help me develop the project. This large project included data scraping, data cleaning, data storing and profile analysis, and I was sometimes uncertain regarding the next course of action. Nevertheless, Prof. Sit’s unwavering guidance ensured that I remained on track and made consistent progress.

I would like to thank my supervisor Louise, my colleagues and Prof. Sit. Their mentorship not only directed my project endeavours but also offered invaluable insights for my future studies and career aspirations. Their consistent and kind support encouraged me to move forward and explore my future. Through this programme, my comprehension and proficiency in official statistics and academic research have profoundly improved. I am fully confident that this experience will play a pivotal role in shaping my forthcoming studies and career pursuits.



YEUNG, Ching Hin
BSc in Statistics



I am very grateful to have had the opportunity to work in the Census and Statistics Department (C&SD), in the Consumer Price Index Section of the Price Statistics Branch, under the guidance of my supervisor, Mr. Chan Chin-Tang. This section is mainly responsible for investigating price-related data for the calculation of inflation and household expenditure patterns.

Throughout the two months of the summer internship, I was involved in a research project on international practices in calculating prices/price relatives of special items such as food and private rent with a view to overcoming new goods/quality biases and thus enhancing consumer price index compilation. In addition, I attended a familiarisation programme and gained insights into C&SD’s internal operations and compilation of Hong Kong SAR official statistics.

Besides working at C&SD, I worked as a research assistant at CUHK under Dr. Chan Chun-man on a project forecasting sewage flow in Hong Kong. All beginnings are hard, and research work is no exception. I started without a strong foundation or background knowledge, yet I learned several time series techniques by reading the materials provided by Dr. Chan. I value the experience of applying classroom knowledge to real-world scenarios and dipping my toes into the field of research.

All in all, this internship at C&SD has equipped me with a deeper understanding of statistics outside Hong Kong and broadened my horizons. Meanwhile, the work at CUHK gave me hands-on experience while I was still attending school. These opportunities will undeniably assist my further studies and future career.
