Our research focuses on the development of new methodologies for the management and data mining of large-scale object-relatio

IDeAL Research Group

Our research at the Intelligent Decision and Data Analysis Lab (IDeAL) is driven by the current unprecedented accumulation of massive, real-time databases. Examples include real-time sales data from retail superstores, scientific data, worldwide economic indicators, and data collected from smartphones.

We apply the end results of our research within a number of diverse domains, including anthropometry and health care.

Contact:

Please contact us at hviktor{at}uottawa.ca if you are interested in our research. We are associated with the TAMALE group here at the University of Ottawa.

Current projects include the following:

- Mining real-time data streams. We are developing data mining algorithms to build just-in-time models to facilitate real-time decision making against database streams. We follow an incremental, any-time learning paradigm, where the models evolve as the data changes. We favour a “user-in-the-loop” paradigm, where we use active learning techniques to choose the best training instances for the users.

- Location-aware data cube construction and mining. Our goal is to answer the following question: “Given a very large database with multiple users who require relevant and up-to-date information, what should be done differently if each user’s profile, location and situation are known?” To this end, we are creating a system that dynamically computes the specific data cube that is most relevant to a particular user’s location, current needs and perspective. Further, we are implementing a recommender system that uses this reduced cube to provide the user with personalized, situation-aware recommendations as she travels.

- Proteomics and rational drug design. We are collaborating with the National Research Council of Canada (NRC) on a proteomics initiative. In this work, we are investigating the use of data mining and computational intelligence techniques for rational drug design.

- Green database management and data mining. We have started work in this area, where our aim is to develop sustainable algorithms that guarantee fast answers while minimizing power consumption.
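To make the incremental, user-in-the-loop idea from the data stream project concrete, here is a minimal sketch. It is not the group's algorithm: it assumes a simple online logistic model and uncertainty sampling as the active learning criterion, and the names `OnlineLogisticModel`, `run_stream`, `make_stream`, `oracle` and `uncertainty_margin` are hypothetical. The model is updated one instance at a time, and the user (simulated here by `oracle`) is asked for a label only when the model is unsure.

```python
import math
import random


class OnlineLogisticModel:
    """Logistic regression updated one instance at a time (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One stochastic gradient step on the log loss for a single instance.
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def run_stream(stream, oracle, model, uncertainty_margin=0.2):
    """Visit each instance once; query the oracle only for uncertain instances.

    Returns the number of label queries made.
    """
    queries = 0
    for x in stream:
        p = model.predict_proba(x)
        if abs(p - 0.5) < uncertainty_margin:  # model is unsure: ask the user
            model.update(x, oracle(x))
            queries += 1
    return queries


def make_stream(n, seed=0):
    """Hypothetical synthetic stream of 2-D points in [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]


def oracle(x):
    """Stand-in for the user: the true label is 1 when x[0] + x[1] > 0."""
    return 1 if x[0] + x[1] > 0 else 0


model = OnlineLogisticModel(n_features=2)
stream = make_stream(2000)
queries = run_stream(stream, oracle, model)
print(f"labels requested: {queries} of {len(stream)}")
```

In this sketch the model can be queried at any point in the stream, and the number of label requests drops as the model becomes more confident. A real system would also need to handle concept drift, which this example does not address.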

The following past projects have been completed successfully:

- Relational database mining. This refers to the problem setting where data resides in multiple tables (or relations) contained in a relational database. Consider a database containing terabytes or petabytes of data. In this case, the evaluation of a hypothesis may involve hundreds of thousands of tuples spread over multiple tables, leading to computationally expensive multiple joins that cannot be assumed to fit in main memory. Furthermore, current state-of-the-art systems involve object-relational databases that also contain multimedia content such as 2D images or 3D objects. We have developed the so-called IDeAL2 utility-based environment to directly mine data contained in object-relational databases, focusing on techniques for classification and clustering.

- Finding clothes that fit. In the apparel industry, an important challenge is to produce garments that fit various populations well. However, repeated studies of customers’ levels of satisfaction indicate that this is often not the case. The following questions come to mind. What, then, are the typical body profiles of a population? Are there significant differences between populations, and if so, which body measurements need special care when, for example, designing garments for Italian females? Within a population, would it be possible to identify the measurements that are of importance for different sizes and genders? Furthermore, assume that we have access to an accurate anthropometric database. Would there, then, be a way to guide the data mining process to discover only those body measurements that are of most interest to apparel designers? To this end, we investigated new approaches to explore a database containing anthropometric measurements and 3D body scans of samples of the North American, Italian and Dutch populations.

- Preserving software-dependent data over a very long time. The rapid changes in technology in general, and in Internet-related technologies in particular, make the long-term preservation of e-data an important challenge. Our objective was to better understand the intrinsic subtleties of preserving e-data over 50 years or more. To this end, our research aimed to create an environment to study the long-term preservation of e-data. We focused our attention on preserving multimedia and relational data that depend on software components for future use. This research resulted in the IDeAL long-term experimental environment, containing a persistent data webhouse, together with archiving and indexing, retrieval, and trend analysis modules for handling the evolving e-data.

- Managing and exploring Cultural Heritage repositories. We studied the efficient management and exploration of very large repositories of 2D images and 3D objects for the modelling and reconstitution of complex heritage sites, and applied our methodology to a variety of real cases.
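The multi-table setting described under relational database mining can be illustrated with a small sketch. It does not reproduce the IDeAL2 environment; it only shows one generic way to turn related tables into a single feature table, by joining and aggregating (often called propositionalization). The `customer` and `purchase` tables, their columns, and the function `flatten_features` are hypothetical, and the example uses an in-memory SQLite database from the Python standard library.

```python
import sqlite3


def flatten_features(conn):
    """Join two related tables and aggregate to one feature row per customer."""
    query = """
        SELECT c.id,
               c.region,
               COUNT(p.amount)              AS n_purchases,
               COALESCE(SUM(p.amount), 0.0) AS total_amount
        FROM customer AS c
        LEFT JOIN purchase AS p ON p.customer_id = c.id
        GROUP BY c.id, c.region
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


# Hypothetical toy database with a one-to-many relationship.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE purchase (customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'north'), (2, 'south'), (3, 'north');
    INSERT INTO purchase VALUES (1, 10.0), (1, 30.0), (2, 5.0);
    """
)

rows = flatten_features(conn)
for row in rows:
    print(row)
```

The LEFT JOIN keeps customers with no purchases, so every entity still gets a feature row. On terabyte-scale data, joins like this one are exactly the expensive step that motivates mining the relational structure directly rather than flattening it first.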

Collaborators and Sponsors:

- IBM Canada

- National Research Council of Canada (NRC)

- University of Bari, Italy

- Telfer School of Management at the University of Ottawa

- Canada Foundation for Innovation (CFI)

- Ontario Innovation Trust (OIT)

- Ontario Research Network for E-Commerce (ORNEC)

- Natural Sciences and Engineering Research Council of Canada (NSERC)