CS 4490z/4460z Project Topics Fall 2010



Prof. Mike Bauer's Topics

bauer@uwo.ca

Parallel software for analyzing X-ray diffraction data

This project deals with the development of efficient parallel software to analyze data from X-ray diffraction experiments. The work is primarily software development; no in-depth understanding of chemistry, X-ray diffraction, or the structure of materials is required. The project will build on existing analysis code. X-ray diffraction using Laue patterns is a rapid and sensitive approach to detecting the structures of complex minerals, even on a micron scale. A rapid method has been developed at Western that should be capable of identifying many (but not all) such structures. This approach uses the crystal cell parameter index that is widely available to scientists through previous diffraction studies. The objective of this project is to build software that can take these parameters and identify the structures using existing software.

Prof. Lucian Ilie's Topics

ilie@csd.uwo.ca

Genome Assembly

Genome assembly is the process of taking a large number of short DNA sequences, all of which were generated by a shotgun sequencing project, and putting them back together to create a representation of the original chromosomes from which the DNA originated. Very recently, high-throughput sequencing technologies such as Illumina's Genome Analyzer, ABI's SOLiD, and Roche's 454 were developed to produce huge amounts of data. Their applications, including whole-genome sequencing and resequencing, SNP discovery, identification of copy number variations, chromosomal rearrangements, etc., are revolutionizing biological research. Analyzing such data is of crucial importance but is not an easy task. The most important problem is genome assembly, and current assemblers remain in an unsatisfactory state. I am working with Prof. Roberto Solis-Oba and Ph.D. student Md. Bahlul Haider on a new approach to genome assembly. An undergraduate student would help with programming. A publication in a top journal and/or conference is expected, on which the undergraduate student would be a co-author. All that is required is good C++ programming skills and motivation.
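
To make the idea of "putting reads back together" concrete, here is a minimal C++ sketch of a textbook greedy overlap-merge strategy. It is only an illustration and is not the new approach mentioned above; the function names overlap and greedy_assemble are hypothetical, and the sketch assumes error-free reads from a single strand with no repeats.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest suffix of `a` that is also a prefix of `b`.
std::size_t overlap(const std::string& a, const std::string& b) {
    const std::size_t max_len = std::min(a.size(), b.size());
    for (std::size_t len = max_len; len > 0; --len) {
        if (a.compare(a.size() - len, len, b, 0, len) == 0) return len;
    }
    return 0;
}

// Greedy assembly (illustrative only): repeatedly merge the ordered pair of
// reads with the largest suffix-prefix overlap until one sequence remains.
std::string greedy_assemble(std::vector<std::string> reads) {
    while (reads.size() > 1) {
        std::size_t best = 0, bi = 0, bj = 1;
        for (std::size_t i = 0; i < reads.size(); ++i) {
            for (std::size_t j = 0; j < reads.size(); ++j) {
                if (i == j) continue;
                const std::size_t ov = overlap(reads[i], reads[j]);
                if (ov > best) { best = ov; bi = i; bj = j; }
            }
        }
        // Append the non-overlapping tail of the second read to the first.
        std::string merged = reads[bi] + reads[bj].substr(best);
        // Erase the higher index first so the lower index stays valid.
        reads.erase(reads.begin() + static_cast<std::ptrdiff_t>(std::max(bi, bj)));
        reads.erase(reads.begin() + static_cast<std::ptrdiff_t>(std::min(bi, bj)));
        reads.push_back(merged);
    }
    return reads.empty() ? std::string() : reads.front();
}
```

For example, the reads ATGCGT, CGTTAC, and TACGGA overlap pairwise by three bases and are merged into a single sequence. Real data contains sequencing errors, repeats, and reads from both strands, which is why simple greedy merging is not sufficient in practice.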

Prof. Mahmoud El-Sakka's Topics

elsakka@csd.uwo.ca

Still-Image compression

It is widely believed that a picture is worth more than a thousand words. However, dealing with digital pictures (images) requires far more computer memory and transmission time than plain text does. To "efficiently" handle the huge amount of data associated with images, compression schemes are needed. In this project, the details of a few still-image compression schemes will be provided. The student will be asked to implement these schemes and compare their performance. This project can accommodate more than one student, where each student will be given a different set of still-image compression schemes to implement.
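
The schemes to be implemented are not named here. Purely as an illustration of what "implement and compare" might look like, the following C++ sketch shows run-length encoding, a very simple lossless scheme, applied to 8-bit grayscale pixels, together with a compression-ratio measure. The names Run, rle_encode, rle_decode, and compression_ratio are hypothetical, and the two-bytes-per-run storage cost is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One run: (pixel value, run length in 1..255). Assumed to cost 2 bytes.
using Run = std::pair<std::uint8_t, std::uint8_t>;

// Collapse consecutive equal pixels into (value, count) runs.
std::vector<Run> rle_encode(const std::vector<std::uint8_t>& pixels) {
    std::vector<Run> runs;
    for (std::uint8_t p : pixels) {
        if (!runs.empty() && runs.back().first == p && runs.back().second < 255) {
            ++runs.back().second;  // extend the current run
        } else {
            runs.emplace_back(p, static_cast<std::uint8_t>(1));  // start a new run
        }
    }
    return runs;
}

// Expand the runs back into the original pixel sequence (lossless).
std::vector<std::uint8_t> rle_decode(const std::vector<Run>& runs) {
    std::vector<std::uint8_t> pixels;
    for (const Run& r : runs) {
        pixels.insert(pixels.end(), static_cast<std::size_t>(r.second), r.first);
    }
    return pixels;
}

// Original size divided by encoded size; values above 1 mean the data shrank.
double compression_ratio(std::size_t original_bytes, std::size_t num_runs) {
    if (num_runs == 0) return 0.0;
    return static_cast<double>(original_bytes) / (2.0 * static_cast<double>(num_runs));
}
```

Comparing schemes would then amount to computing such a ratio (and, for lossy schemes, a quality measure) over the same set of test images.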

Prof. Mike Katchabaw's Topics

katchab@csd.uwo.ca

Game Design and Development

I am taking on students interested in studying various aspects of game design and development. In particular, we have several on-going projects involving characters in games, both from artificial intelligence and storytelling perspectives. We have developed new approaches to characters for video games enabling more realistic and dynamic behaviour, driven by elements of personality, emotion, social ties and roles, physiological needs, and so on, and there is much more to study in this area. Specific project ideas include the following:
Dialogue synthesis. Based on a very successful project last year, we are looking to continue work on a system for constructing believable in-game dialogue for non-player characters dynamically, at run-time within a game. If we want truly dynamic and realistic behaviour from our characters, we need them to be able to converse in this fashion as well.
Automated character creation. We have many interesting models that are used in defining characters, but we need algorithms and tools to allow us to create characters automatically. If we are to have a population of hundreds or thousands of characters, it becomes resource intensive for game designers and storytellers to create them one at a time manually. If we are to create an entire population, there are numerous interesting aspects to study, including the formation of social networks, the roles of heredity and environment in the development of characters, the assignment of careers to individuals in the population, and so on. Many interesting issues to study here! We have done some initial work in this area, but there is still much to do ...
Story/drama management. As we empower characters by allowing them to act more dynamically and realistically given the virtual world they are in, it becomes more and more difficult for these characters to follow story and narrative direction without additional work. Certain behaviours might be realistic given their situation but, from a dramatic perspective, would not build the story in a way that is satisfying for the player. (A classic example is your typical James Bond story. At some point, the super villain captures the super spy; instead of being done away with immediately, the spy is allowed to live, only to later escape and put an end to the super villain's plans. A more logical super villain would immediately kill someone of James Bond's reputation, but that would put a premature end to the story ...) Instead of allowing complete freedom to a game's characters, their activity has to be overseen, at least to a certain extent, by a story or drama manager that ensures that certain things either do or do not happen in the game. This requires us to consider how to encode story, make assertions or queries to the story, and so on. This is an interesting, yet challenging problem.
Automated story construction. A different approach to the above story problem is to allow characters to do anything within the game, even if it runs against preconceived notions of what the story of the game is to be about. When a character does something against that vision of the story, the story is automatically restructured to take this into consideration and allow the game to proceed in new directions regardless. (For example, a villain in the game could be killed prematurely by someone other than the player who decides to be a hero ... so then what happens? The story would need to adjust by having a new villain emerge, or by doing something else to give the player a reason to continue playing.) Constructing story in this fashion has its own interesting problems, which require interesting solutions.
Character and story performance. As we make for richer stories and characters in games, we increase the demand on already scarce computing resources (namely CPU cycles and memory ...). We need ways of improving gameplay experiences from these perspectives while maintaining acceptable levels of performance and resource consumption. We have already started work in this area, but there is still more that needs to be done ... a background or interest in operating systems and in systems and software performance would be an asset here!
Of course, there are numerous other interesting issues and problems in this area, and so I would welcome other ideas!

Virtual and Augmented Reality

Last year, we started an interesting and exciting project with neurologists at London Health Sciences Centre relating to the use of virtual and augmented reality in the treatment and rehabilitation of patients with various neurological conditions. The goal was to create various scenarios from home and daily life in a clinic environment, allowing patients to be examined and assessed in completing tasks that carry over to their home environments, ultimately leading to an improved quality of living. For that project, a virtual reality environment was created using Valve's Source engine and a virtual reality visor system. In this year's project, the goal is to extend this work, exploring new visor technologies as well as motion capture/sensor systems to better position patients in the virtual world and capture their movements. This will allow us to create new scenarios for treatment and rehabilitation beyond relatively simple navigation tasks. This is a challenging project, but also potentially very rewarding!

Visualization of high fidelity financial data
co-supervised with James McInnes of Cyborg Trading Systems (http://www.cyborgtrading.com)

In finance, visualizing market data can help algorithm developers see patterns. The sheer volume of data is staggering, and there are valuable clues that lie in this information. We will be releasing a product shortly that will allow traders to record this data and use it for algorithm development. If we can also add a visualization tool, then I think it would be a valuable asset in the data mining effort. We could then also use the visualization tool to create real-time trading signals that could be executed by our main trading engine. We would provide a lot of the guidance in terms of what kind of data and analysis would be useful. I think it would be an interesting project that applies statistical methods and visualization techniques from computer science.

Automated pattern recognition in financial data
co-supervised with James McInnes of Cyborg Trading Systems (http://www.cyborgtrading.com)

Short-term market patterns are very important for a successful trading strategy. Algorithms that can recognize these patterns and adapt to them in real time are a very active area of research in finance. We could take a small subset of the market data and use some machine learning to look for patterns in the data. We could then either link this in with a visualization tool, or generate trading signals directly from the data. I have some good ideas on what types of patterns and signals would be useful, so we are not looking at a massive sample space here. In any case, this would be a perfect problem for any student who is interested in machine learning techniques in computer science.

Prof. Sylvia Osborn's Topics

sylvia@csd.uwo.ca

There is a CFI project awarded to Neal Ferris in the Anthropology Department called Capacities for Sustainable Archaeology. In association with McMaster University, they are developing curation and analytical facilities to house thousands of archaeological collections from across southern Ontario. The CS 4490 project would involve designing a database to store the information that the archaeologists have, and designing several interfaces for entering and querying data. One interface would be for people entering new data. One search interface might integrate with ESRI (GIS software) to find similar objects in the database found "near" a particular one. The design of the data repository and interfaces would be done to suit the requirements of the archaeologists. A prototype will need to be developed.

The project will be co-supervised by
Prof. Rhonda R. Bathurst
Department of Anthropology
e-mail: rhonda.bathurst@uwo.ca

Prof. Peter Rogan's Topics

progan@csd.uwo.ca

Our laboratory developed software for computing DNA sequence intervals that are found at unique locations in genomes. This software uses the method described in US Pat. No. 7,734,424 to design DNA probes for genomic disease detection. The method circumvents the requirement to compare a catalog of repetitive sequences with the genomic target sequence for unique sequence identification. The milestones of this project are (1) to complete the ongoing processing of the human genome reference sequence for ab initio identification of a genome-wide set of single copy probes; (2) a sensitivity and specificity analysis of the genomic coverage of the deduced intervals compared to repeat-masked sequences, and characterization of the gene and conserved non-genic content; (3) design of oligonucleotides for fluorescent in situ hybridization (FISH) and (4) genomic microarrays for array comparative genomic hybridization; and (5) testing and production of FISH probes and/or a microarray. Milestone 1 is largely completed, but projects can also be developed to tune parameters for single copy interval detection in other complete genome sequences and to compare these to the current probe set from the reference genome. Currently, two CSD students are engaged in this project. There are opportunities for CS thesis students to participate and contribute to Milestones 2 and 3 and to the extension of Milestone 1 described above.

Prof. Sheng Yu's Topics

syu@csd.uwo.ca

Context-Free Language Extension to Grail+

Grail+ is a symbolic manipulation system for automata and formal language objects. The system was developed in the early 1990s in our department and has been used by researchers in many different countries. It is one of the earliest and most important tools for manipulating automata and formal language objects in the world.
The Grail+ system includes many different models, including regular expressions, nondeterministic and deterministic finite automata, alternating finite automata, and cover automata. The system contains a large number of operations on those models and transformations among them. However, all of the models and operations of the current system concern only regular languages and a number of their subfamilies. This project is to extend the current Grail+ so that it will include models of context-free languages: context-free grammars and deterministic and nondeterministic pushdown automata, and a number of operations on the new models.
Context-free languages are the most commonly used formal languages in practice among all the languages in the Chomsky hierarchy. Most programming languages are considered deterministic context-free languages in their syntax. So, the new extension will be a very important part of Grail+. The project includes the following tasks:
Note that all the new representations should be consistent with the original representations of Grail+. The new representations should be readable and easy to manipulate. The operations should be convenient to use and maintain. All the implementations are in C++. They should be fully tested for both correctness and efficiency. They should also be able to handle grammars and automata of very large sizes.
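
As an illustration of the kind of operation a context-free extension might offer, the following C++ sketch implements the standard CYK membership test for a grammar in Chomsky normal form. It is a sketch under stated assumptions only: the CnfGrammar structure and the cyk_accepts function are hypothetical and do not reflect the actual representations or interfaces of Grail+.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory grammar in Chomsky normal form (not Grail+'s format).
struct CnfGrammar {
    struct Terminal { char lhs; char symbol; };       // rule A -> a
    struct Binary   { char lhs; char left, right; };  // rule A -> B C
    char start;
    std::vector<Terminal> terminals;
    std::vector<Binary> binaries;
};

// CYK membership test: returns true if the grammar derives `word`.
bool cyk_accepts(const CnfGrammar& g, const std::string& word) {
    const std::size_t n = word.size();
    if (n == 0) return false;  // the empty word is not handled in this sketch
    // table[len - 1][i] holds the nonterminals deriving word.substr(i, len).
    std::vector<std::vector<std::set<char>>> table(
        n, std::vector<std::set<char>>(n));
    for (std::size_t i = 0; i < n; ++i) {
        for (const auto& t : g.terminals) {
            if (t.symbol == word[i]) table[0][i].insert(t.lhs);
        }
    }
    for (std::size_t len = 2; len <= n; ++len) {
        for (std::size_t i = 0; i + len <= n; ++i) {
            for (std::size_t split = 1; split < len; ++split) {
                const auto& left = table[split - 1][i];
                const auto& right = table[len - split - 1][i + split];
                for (const auto& b : g.binaries) {
                    if (left.count(b.left) && right.count(b.right)) {
                        table[len - 1][i].insert(b.lhs);
                    }
                }
            }
        }
    }
    return table[n - 1][0].count(g.start) > 0;
}
```

For instance, the grammar S -> AB | AT, T -> SB, A -> a, B -> b generates the words a^n b^n for n >= 1, so cyk_accepts returns true for "aabb" and false for "aab". The cubic running time of CYK is one reason efficiency on very large grammars would need separate attention.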