UWORCS 2023

About UWORCS

UWORCS stands for University of Western Ontario Research in Computer Science. UWORCS is our annual departmental student conference and provides a great opportunity to develop your presentation skills in front of a friendly audience of your peers and faculty members.

Your participation is needed to make this event a success. Please email Andrew Bloch-Hansen at ablochha@uwo.ca for more details.

Keynote Speaker

Colin Cherry

Colin Cherry is a Research Scientist at Google Montreal, working with Translate. His primary research area is machine translation, recently speech translation and translation with large models, with occasional forays into parsing, morphology, information extraction and dialogue. He has just ended his term as chair of the executive board of the North American Association for Computational Linguistics, and is currently an action editor at the Transactions of the Association for Computational Linguistics. He served as research track chair for the 2018 meeting of the Association for Machine Translation in the Americas.

Talk Title

Making sense of Language Models and Large Language Models

At this point you've probably heard of ChatGPT and how it's going to change everything one way or the other. But have you ever wondered about the technologies behind it? In this talk, I'll give a high-level tour of the history of language modeling, from humble beginnings in information theory to the development of the various techniques that eventually allowed the crucial addition of the word "large," and closing with some of the recent discoveries that have helped create the flexible conversationalist we know today.

Dr. Cherry's keynote speech will take place on Tuesday, April 11th, in Middlesex College room MC110. Please join us to welcome Dr. Cherry to the 30th Annual Conference - UWORCS 2023. This talk will also be available on Zoom.

Meeting ID: 914 2742 2947
Passcode: 299778

Frequently Asked Questions

Here are the FAQs for UWORCS 2023.

Who can attend?

Computer science faculty, graduate students, and undergraduate students are all invited to attend the presentations.

Is there a registration fee?

No, there is no registration fee.

What sort of research can be presented?

The more you care about a subject the better your talk will probably be. Choose something that you've personally worked on during your grad/undergrad thesis studies, or even a course project with a research flavour. You can even present ongoing research. UWORCS is a great opportunity to practice explaining whatever work you are most proud of.

How should I prepare my talk?

Each presentation should be 20 minutes long with an additional 7 minutes for questions.
Presentations should clearly state the research problem and motivation, should include relevant facts/data/analysis to support the research strategy, should be accessible to an audience with an undergraduate-level background in computer science, and should have a smooth flow of ideas.

Does this fulfill my yearly PhD seminar?

Yes. PhD students in their 3rd and 4th years can present their current research and have it count towards their yearly seminar requirement (692).

Will there be session prizes?

Yes, each session will have a cash prize for the best presentation. The number and size of the prizes will be determined once we have made the schedule.

How does session judging/chairing work?

Each presentation will be given a score out of 50 by each of three faculty members (for a total score out of 150), based mainly on presentation quality and clarity. The presenter with the highest total score in each session receives the best presentation award. Feedback from the judges will be forwarded to each presenter after the event is over. Session chairs announce each speaker and ensure that the talks stay on schedule.

Is lunch included?

Yes, lunch will be available to all participants and registered attendees. We have not yet set a menu, but we will ensure that an appropriate variety of lunch options is provided. In addition, coffee and snacks will be provided throughout the day.

Subjects

UWORCS 2023 features talks judged by faculty members and senior students, with prizes awarded to the top presenters in a variety of categories. Topics of interest include, but are not limited to:

Data Science

Distributed Systems and Applications

Games

Human Computer Interaction

Software Engineering

Theoretical Computer Science

Team

The team for UWORCS 2023

Andrew Bloch-Hansen

Conference Chair

Mohammad Javaad Akhtar

Volunteer

Nianqi Chen

Web Master

Presenters

Here are the presenters and their topics.
We look forward to your participation!

A Polynomial-Time Approximation Scheme for Thief Orienteering on Directed Acyclic Graphs

Andrew Bloch-Hansen
Theoretical Computer Science

We consider the scenario of routing an agent called a thief through a weighted graph G = (V, E) from a start vertex s to an end vertex t. A set I of items each with weight w_i and profit p_i is distributed among V \ {s, t}. The thief, who has a knapsack of capacity W, must follow a simple path from s to t within a given time T while packing in the knapsack a set of items, taken from the vertices along the path, of total weight at most W and maximum profit. The travel time across an edge depends on the edge length and current knapsack load. The thief orienteering problem is a generalization of the orienteering problem and the 0-1 knapsack problem. We present a polynomial-time approximation scheme (PTAS) for the thief orienteering problem when G is directed and acyclic, and adapt the PTAS for other classes of graphs and special cases of the problem. In addition, we prove there exists no approximation algorithm for the thief orienteering problem with constant approximation ratio, unless P = NP.

Transparency in Ranking: An Analytic Visualization Tool

Mozghan Salimiparsa
Artificial Intelligence

Machine learning models have the potential to transform healthcare by providing decision support tools. However, a major challenge is the lack of transparency and accountability, as these models often do not provide clear explanations for their output. Explainable Artificial Intelligence (XAI) methods can be used to construct and communicate explanations for how a model operates, making it easier to understand why a particular decision was made. This is particularly important in healthcare, where the output of decision support tools must be easily interpreted by medical professionals.

In this study, we propose a visual analytics tool that helps users understand the relative rankings of XGBoost model predictions. Unlike other settings where explanations are relative to a single prediction, this tool allows users to understand and compare multiple predictions together. The visualization summarizes the most important features involved in each prediction and helps users understand the relationship between an entity's feature values and its position within the ranking defined by the model. To enable this functionality, we introduce a new framework for counterfactual explanation that finds feature value changes resulting in predictions that meet a dynamic threshold defined by other data items, rather than meeting a fixed threshold that would result in a different predicted class label, as is standard practice. We demonstrate the effectiveness of our tool using a health care triage problem.

Developing A Smart Home Surveillance System Using Autonomous Drones

Chongju Mai
Autonomous Drones

Placing a number of home surveillance cameras around the property can enhance home security. However, camera coverage and true effectiveness can be limited by the number of cameras that can be installed, each camera's field of view and fixed position, and associated privacy issues. Unmanned aerial vehicles (UAVs), commonly known as drones, are able to fly independently without any human intervention. There are already a few commercially available options for outdoor drone surveillance, but none for indoor applications. We believe drones can be effectively deployed for home monitoring purposes in a cost-effective and privacy-preserving manner. In this research, we developed a novel autonomous drone prototype that can offer more economically viable and effective smart home monitoring capabilities than the home monitoring solutions currently available in today's smart home industry. While in flight, our drone navigation system can follow any predefined path, dynamically change the path based on user requirements to inspect any place within its range, and adapt to unanticipated situations, such as obstacles and low battery. In addition, the system can utilize machine learning to evaluate the stream from the onboard camera, perform object detection tasks, and notify users accordingly. In our testing, we demonstrated that our prototype successfully performed all the functions mentioned above. Our findings also shed light on some of the important parameters of indoor drone-based monitoring systems, which will contribute to the further advancement of drone-based home monitoring technology.

Combating Computer Vision-based Aim Assist Tools In Competitive Online Games

Mathias Babin
Artificial Intelligence

This work presents a novel approach to the application of adversarial attacks to the domain of video games, specifically, the exploitation of computer vision-based aim assist tools in the first-person shooter genre. As one of the greatest issues plaguing modern shooters, aim assist tools (also referred to as aimbots) greatly increase the speed and accuracy of cheating players, giving them an unfair advantage over their competitors. The latest versions of these aim assist tools make use of object detection models such as YOLO (You Only Look Once); fortunately, these models are vulnerable to attack via small perturbations to their input space which result in the misclassification of objects. The purpose of this work is to formulate an attack on a black-box object detection model which can be feasibly implemented in a commercial game environment. What makes our solution unique is the generation of attack images in the form of in-game objects rendered by the game engine itself, instead of a set of screenshots or from a generic differentiable renderer. Results show that our approach is capable of generating adversarial examples which can fool an object detection model in a black-box environment, as well as recreating the game's original textures such that these perturbations go unnoticed by players.

Reinforcement Learning for Multi-Level Decision Support Within Skilled Nursing Facilities

Caro Strickland
Artificial Intelligence

The concept of optimal decision-making is critical within the health care domain. Clinical care practitioners are often required to make accurate and appropriate decisions in fast-paced, high-stress environments in which there exists a significant level of uncertainty. In recent years, researchers have used stochastic health care data to successfully apply Reinforcement Learning (RL) to numerous sequential decision-making problems existing within health care (for example, learning dynamic treatment regimes and developing strategies for resource scheduling and task allocation).

A research area that remains largely unexplored, however, is the effective use of RL algorithms to provide multi-level decision support for Skilled Nursing Facilities (SNFs). These SNFs are multidisciplinary settings in which patients stable enough to be discharged from the hospital receive short-term skilled services from a team of therapists and nursing staff. Decision makers within these SNFs have a multitude of patient-level and facility-level decisions to make on a daily basis, with two of the most impactful decisions being 1) the acceptance or denial of incoming referrals (a patient-level decision), and 2) the daily scheduling of Certified Nurse Aides (CNAs) based on the level of care required by current inpatients (a facility-level decision). These are the decisions on which this work focuses.

To approach this multi-level support problem, we first present the findings from a retrospective analysis performed on a large-scale data set that discusses factors currently influencing how decisions are made within SNFs. We then present a simulator calibrated using real-world data which allows an RL agent to take actions within a simplified SNF environment and observe the result of those actions at each timestep.

Modifications of Formal Systems and Gödel’s First Incompleteness Theorem

Christopher Maligec
Theoretical Computer Science

The Gödel sentence “This sentence is unprovable in system S” is not provable in S yet true (in all standard models of S) for any consistent logical system S capable of representing all the concepts of arithmetic in the language of arithmetic. Gödel’s result is important for Computer Science as it shows that there is no theorem-prover that can derive a theorem for each true formula of mathematics. As a result, any study involving Gödel sentences will have an impact on the study of theorem-provers. This thesis seeks to circumvent Gödel’s specific result, where he used a Gödel sentence, by creating a chain of two systems, S1 and B (which we shall call the binary system S1B), where the Gödel sentence of S1 is provable in B, but B has no Gödel sentence of its own. Furthermore, sentences such as “this sentence is not provable in S1, and this sentence is not provable in B” cannot hold either.

Faster and Secure Autonomous Vehicle’s OTA Update System: Using 5G Network QoS and PBFT Consensus in Hyperledger Fabric Blockchain

Sadia Yeasmin
Computer Systems and Networks

The advent of connected autonomous vehicles (CAVs) is bringing forth a revolutionary new era of technology transforming transportation. For traffic to be optimized and safe, efficient vehicle-to-everything collaboration and improved autonomous vehicle (AV) decision-making are crucial. It becomes essential to make decisions in real-time using information from vehicle sensors, software, and traffic data. As a part of such an In-Vehicle Network (IVN), the over-the-air (OTA) software update service in CAVs needs to be facilitated rapidly, reliably, and securely. In such environments, data optimization and forecasting to manage network traffic and sustain Quality of Service (QoS) is the primary necessity for users. Fifth generation (5G) is a wireless network technology designed to provide efficient QoS by enabling high data rates, low latency, high reliability, and high availability. However, by taking advantage of vulnerabilities, attackers may quickly target the OTA software update as part of botnets to execute cyber attacks. This work makes two contributions: a) it implements Practical Byzantine Fault Tolerance (PBFT) as the consensus mechanism, together with a distributed firewall, to ensure the Hyperledger Fabric (HLF) Blockchain model is secure and tamper-proof and can detect and prevent cyber attacks in CAVs’ OTA update systems, and b) it introduces the first-ever software tool model to collect 5G network QoS data, including throughput, latency, jitter, and packet loss, from a large-scale field test by uploading and downloading different file types to and from a controlled environment-based server.

Protein interaction site prediction using contextual embeddings

SeyedMohsen Hosseini
Bioinformatics

Proteins are crucial for cellular functions, and while some proteins work independently, most require collaboration with other proteins. Therefore, understanding the binding sites that facilitate protein interactions is vital. PITHIA is a deep learning model that predicts protein interaction sites using alignment, attention, and embedding techniques, which are powerful tools in bioinformatics. The MSA-transformer, which combines attention with multiple sequence alignments, generates a language model that surpasses prior unsupervised approaches. In this talk, we will review some state-of-the-art protein contextual embedding models, and we will demonstrate how one of these embeddings is applied in PITHIA. The PITHIA architecture employs attention and carefully selects candidates while incorporating contextual embeddings from the MSA-transformer. PITHIA outperforms its competition on five datasets, with the closest competitor trailing by as much as 35% in terms of the area under the precision-recall curve.

Balanced Dense Multivariate Multiplication

Haoze Yuan
Computer Algebra

We propose general preprocessing techniques to reshape dense multivariate polynomials over finite fields, in order to minimize the cost of memory accesses, while preserving sufficient parallelism, so as to reduce the running time of polynomial multiplication in multi-threaded implementations.

Improved Explanations of Black-Box Models in Natural Language Processing

Anemily Sippola
Artificial Intelligence

State-of-the-art models in Natural Language Processing (NLP) have grown increasingly opaque, with billions of parameters and training examples, while at the same time automated decision-making processes are coming under increased scrutiny by government and other regulatory bodies. Currently in NLP, explainable AI (XAI) methods dealing with a single decision rely on raw text or embedding values. This ignores the rich set of linguistic features that exist both in a sentence (part-of-speech tags, dependency parse graphs, etc.) and in a language as a whole (synonyms, antonyms, word senses, etc.). Much work has been done with probing tasks to evaluate how a model handles a single linguistic feature globally, more recently using null-space projection to remove information from the embedding space. Additionally, work has been done that creates pre-trained embeddings with labeled dimensions, giving meaning to otherwise opaque values. This research focuses on creating alternative methods to generate explanations for a single decision using linguistic features, intended to be used in conjunction with existing XAI methods. Probing task methods will be built upon so that observations about a set of linguistic features can be made, and methods will be explored to extract linguistic information from a model's embeddings using pre-trained embeddings with labeled dimensions. A public framework will be created implementing each of these methods, and human evaluations of the proposed explanations will be performed. Augmenting NLP explanations should give both Machine Learning experts and non-experts a greater ability to evaluate models and understand decision making, allow businesses and researchers to better prepare for and meet incoming regulations, and give laypersons a better understanding of automated decisions that affect their lives.

Modular Intersect

Juan Pablo Gonzalez Trochez
Computer Algebra

One of the core commands in the RegularChains library is Triangularize. The underlying algorithm decomposes the solution set of a polynomial system into geometrically meaningful components represented by regular chains. This algorithm works by repeatedly calling a procedure, called Intersect, which computes the common zeros of a polynomial p and a regular chain T. As the number of variables of p and T, as well as their degrees, increase, the call Intersect(p, T) becomes more and more computationally expensive. It was observed in (C. Chen and M. Moreno Maza, JSC 2012) that when the input polynomial system is zero-dimensional and T is one-dimensional, this cost can be substantially reduced. The method proposed by the authors is a probabilistic algorithm based on evaluation and interpolation techniques. This is the type of method which is typically challenging to implement in a high-level language like Maple's language, as a sharp control of computing resources (in particular memory) is needed.

Class Overwhelms: Mutual Conditional Blended-Target Domain Adaptation

Pengcheng Xu
Computer Vision and Image Analysis

Current methods of blended-target domain adaptation (BTDA) usually infer or consider domain label information but underemphasize hybrid categorical feature structures of targets, which yields limited performance, especially under the label distribution shift. We demonstrate that domain labels are not directly necessary for BTDA if categorical distributions of various domains are sufficiently aligned, even when facing the imbalance of domains and the label distribution shift of classes. However, we observe that the cluster assumption in BTDA does not comprehensively hold. The hybrid categorical feature space hinders the modeling of categorical distributions and the generation of reliable pseudo labels for categorical alignment. To address these issues, we propose a categorical domain discriminator guided by uncertainty to explicitly model and directly align categorical distributions P(Z|Y). Simultaneously, we utilize the low-level features to augment the single source features with diverse target styles to rectify the biased classifier P(Y|Z) among diverse targets. Such a mutual conditional alignment of P(Z|Y) and P(Y|Z) forms a mutually reinforced mechanism. Our approach outperforms the state-of-the-art in BTDA even compared with methods utilizing domain labels, especially under the label distribution shift, and in single target DA on DomainNet.

Multifunctionality in a Fly-Inspired Reservoir Computer

Jacob Morra
Artificial Intelligence

Multifunctionality, the capacity for a neural network to accomplish multiple input-driven tasks simultaneously, is an emerging evaluation metric in Reservoir Computing. This phenomenon has been well observed in human and other animal brains; in particular, in the fruit fly olfactory region. In this work, we investigate the extent to which an (olfactory) fly brain connectome can capture such multifunctionality in a Reservoir Computing context, here on the Seeing Double problem. We furthermore explore the nonlinear and chaotic dynamics of this network while varying its spectral radius, which is known to have a powerful impact on network stability. Compared to the conventional Reservoir Computer, we report that this network obtains a greater capacity for multifunctionality; is multifunctional across a broader hyperparameter range; and retains stability where dynamics are typically chaotic.

Addressing Data Heterogeneity by Uncovering Subpopulations using Sum-Product Networks

Ghazaleh Noroozi
Databases

Data heterogeneity often refers to the presence of different types or formats of data within a dataset or across multiple datasets. From another perspective, data heterogeneity refers to differences in the generative processes that produce the data. Heterogeneity in machine learning can be detrimental because it can lead to unfair models with less accurate predictions for protected attributes such as race, gender, or socioeconomic status. One way to mitigate the effects of data heterogeneity is to cluster the data into more homogeneous groups and train separate models for each group. By doing so, the models can be optimized for the specific characteristics of each group, leading to improved fairness and accuracy. In this project, we make use of Sum-Product Networks (SPNs). SPNs are probabilistic graphical models that allow for tractable inference and can represent complex probability distributions. Another interesting characteristic of SPNs that we take advantage of is their ability to implicitly model latent variables. More specifically, we learn an SPN on the data and augment it to explicitly represent the underlying latent variables. We then estimate the latent variables for each data record by performing probabilistic inference on the augmented SPN. Finally, we use the latent variables to cluster the data into homogeneous groups.

Density And Dynamic Time Warping Based Spatial Clustering For Appliance Operation Modes

Kareem Jaradat
Distributed Systems and Applications

Household demand response (DR) is an important research problem that aims to modify consumers' energy consumption. One of the promising areas is clustering Appliance Operation Modes (AOMs) and inducing DR by promoting consumption patterns that use less energy-intensive modes. This work proposes a novel clustering approach (DDTWSC) which aims to cluster AOMs based on the similarity of the appliance load profiles (SUPs). DDTWSC leverages the power of the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to partition the appliance load profiles into clusters of similar profiles that share the same AOM. Within DBSCAN, the Dynamic Time Warping (DTW) algorithm is used to measure the similarity between SUPs. The resulting clustering is evaluated on two publicly available datasets, namely RAE and UK-DALE. The Silhouette score is used to measure the performance of the proposed technique in clustering SUPs. DDTWSC demonstrated a significant improvement in the results compared to similar previous work in the literature.

Vehicle–Pedestrian Collision Avoidance System Exploiting Lightweight Smartphone

Moinul Sayed
Distributed Systems and Applications

Road accidents cause injuries and financial suffering in our society. Potential collisions between automobiles and pedestrians should be spotted prior to their occurrence in order to offer early warnings. In recent years, numerous solutions have been presented to avoid vehicle-pedestrian accidents. However, the majority of these systems need substantial infrastructure, which is costly, difficult to implement on a large scale, and incurs heavy maintenance costs. We propose a collision avoidance system that utilizes smartphones and requires no additional hardware resources. Our proposed system includes a lightweight app that generates trajectories as predictions of future locations on the user's side and sends them to the cloud. The trajectory updates are processed on the cloud to find potential collisions and send alerts to the possibly colliding devices in advance. The smartphone app consumes little power, as it requires only location updates and does not use any other sensor. Real road experiments show impressive accuracy in generating timely, relevant warnings.

Talks Schedule

Activity	Room	Time
Registration	MC 312 (Grad Lounge)	8:00am - 9:00am
Presentations	See Timetable	9:00am - 12:00pm
Break	-	12:00pm - 12:15pm
Keynote	MC 110	12:15pm - 1:15pm
Closing Ceremony	MC 110	1:15pm - 1:45pm
Lunch	MC 312 (Grad Lounge)	1:45pm - 3:00pm
Info Session	MC 300	2:15pm

Keynote talk is available on Zoom -- Meeting ID: 914 2742 2947 Passcode: 299778

Timetable

Time slot	MC110 Algorithms	MC105B Artificial Intelligence	MC320 Computer Systems
9:00AM	Andrew Bloch-Hansen	Mathias Babin	Kareem Jaradat
9:30AM	Christopher Maligec	Caro Strickland	Moinul Sayed
10:00AM	Haoze Yuan	Anemily Sippola	Sadia Yeasmin
10:30AM	Juan Pablo Gonzalez Trochez	Jacob Morra	Chongju Mai
11:00AM	Pengcheng Xu	Mozghan Salimiparsa	Ghazaleh Noroozi
11:30AM	SeyedMohsen Hosseini	-	-
Judges	Mostafa Milani, Zubair Fadlullah	Hanan Lutfiyya, Alex Brandt	Lucian Ilie, Kaizhong Zhang
Session Chairs	Yalda Mohsenzadeh	Roberto Solis-Oba	Mike Domaratzki