Current Projects

Medical Imaging Informatics

NSF REU Program in Medical Informatics

The main objectives of the Medical Informatics (MedIX) program are to encourage talented undergraduates to pursue graduate education and to expose students to interdisciplinary research, especially at the border of information technology and medicine. All of the projects on which students will work are inspired by state-of-the-art research questions in imaging informatics. These range from traditional image processing (e.g., liver segmentation and computer-aided diagnosis, breast density assessment for cancer detection), to structured reporting and natural language processing of radiology reports, to workflow and process re-engineering, to the application of data mining and ontology-based means for image annotation and markup (e.g., lung nodule detection and interpretation). The Program has been sponsored by the National Science Foundation since 2005 and is hosted by two interdisciplinary laboratories: the Medical Informatics Laboratory at DePaul University and the Imaging Research Institute at the University of Chicago. For more information, please visit the Program's website at:


Computer-aided detection, diagnosis, and characterization for lung nodule interpretation

Early diagnosis and treatment of lung cancer offer hope for improving the outcomes of patients with this most common cause of cancer death. Early diagnosis depends upon radiologists identifying small pulmonary nodules, using high-resolution computed tomography (CT) imaging to detect and assess small focal anomalies. Although CT is more sensitive than chest X-rays for imaging pulmonary nodules, it requires radiologists to review tens to hundreds of images for every patient case. In the Intelligent Multimedia Processing (IMP) lab and the Medical Imaging Informatics Lab, we are developing novel computer-aided approaches to address this workload challenge and to act as "second readers" in the interpretation process. Our main projects are in the areas of 1) computer-aided detection (CADe), 2) computer-aided diagnosis (CADx), and 3) computer-aided diagnostic characterization (CADc). For more information on these projects, please visit the Publications page of the IMP Lab website at:


Bridging the gap between human and computer interpretation of similarity in the medical domain

Content-Based Image Retrieval (CBIR) aims to retrieve images relevant to a query image and has the potential to be used as a decision support tool for evidence-based medicine and case-based reasoning. However, to serve as decision support tools, CBIR systems must produce similarity rankings that agree with the similarity opinions of expert radiologists. The difference between content-based similarity results and human-based similarity is called the "semantic gap" in the imaging research community. Our work focuses on reducing this semantic gap by investigating computer-based similarity measures and image features that are close to the human perception of similarity and that encode the visual content of an image similarly to human vision. For more information on these projects, please visit the Publications page of the IMP Lab website at:
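One way to quantify how closely a computer-based ranking agrees with a radiologist's ranking is a rank correlation. The sketch below uses Spearman's rank correlation as an illustration only; the project description does not say which agreement measure is used, and the function names and the sample scores are hypothetical. It assumes there are no tied scores.

```python
def ranks(values):
    """Return rank positions (1 = largest value); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(a, b):
    """Spearman rank correlation between two equally long score lists."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical similarity scores for five retrieved images.
radiologist = [5, 4, 3, 2, 1]            # expert similarity ratings
computer = [0.9, 0.7, 0.8, 0.2, 0.1]     # content-based similarity scores

rho = spearman(radiologist, computer)
print(f"rank agreement: {rho:.2f}")
```

A value near 1 means the two rankings order the images almost identically; lower values indicate a wider gap between the computed and the human notion of similarity.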


Prediction of Chronic Fatigue Syndrome

CFS (Chronic Fatigue Syndrome) is a chronic condition with symptoms that are severe but often difficult to detect upon physical examination. They include debilitating fatigue, headaches, and unrefreshing sleep. The objective is to determine which self-report survey responses indicate CFS and to develop a system that can differentiate among subjects with CFS, subjects with regular fatigue, and healthy subjects. This project is in collaboration with the Psychology department and focuses on using machine learning techniques and other analyses to help researchers discover the causes of CFS.


Tracking Illness with Tweets

The goal of this project is to use classified and geo-coded Twitter data to identify and track trends in illness propagation. So far, the project has identified two categories of potential trends that need to be detected: a) Local spike: detecting and reporting a "significant" increase in illness occurrence inside a particular geographical area. Such a determination cannot be based on absolute numbers alone because the same change may be significant in some areas but not in others. b) Spreading pattern: identifying a pattern of increase in the number of sightings of a particular illness based on an underlying (possibly unknown) cause. The most obvious example is the spread of a contagious disease.
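To illustrate why a local spike cannot be judged on absolute numbers alone, the sketch below compares each area's latest count with that area's own historical baseline using a z-score. This is only one possible approach and is not described in the project itself; the function name, the threshold of 3.0, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def local_spikes(history, current, threshold=3.0):
    """Return areas whose latest count is unusually high for that area.

    history: dict mapping area name -> list of past counts per period
    current: dict mapping area name -> count in the latest period
    threshold: z-score at or above which an increase is flagged (assumed value)
    """
    spikes = {}
    for area, past in history.items():
        if len(past) < 2:
            continue  # not enough data to estimate variability
        baseline = mean(past)
        spread = stdev(past)
        if spread == 0:
            continue  # flat baseline: z-score is undefined in this sketch
        z = (current.get(area, 0) - baseline) / spread
        if z >= threshold:
            spikes[area] = round(z, 2)
    return spikes


# Hypothetical weekly illness-related tweet counts per area.
history = {
    "downtown": [100, 110, 90, 105, 95],
    "suburb": [2, 3, 1, 2, 2],
}
current = {"downtown": 112, "suburb": 12}

print(local_spikes(history, current))
```

In this made-up data both areas rise by about the same absolute amount, but only the suburb is flagged, because a jump of ten reports is far outside its usual variation while it is within the normal range for downtown.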

Web Data Mining, Web Personalization, and Recommender Systems

Using data mining and recommender systems to facilitate large-scale requirements processes

Requirements-related problems account for numerous project failures and translate into significant waste. In many cases, these problems originate from inadequacies in the human-intensive task of eliciting stakeholders' needs, and the subsequent problems of transforming them into a set of clearly articulated and prioritized requirements. This project is intended to produce a new framework that utilizes data mining and recommender systems techniques to process and analyze high volumes of unstructured data in order to facilitate large-scale and broadly inclusive requirements processes. The goal of this project is to develop a robust requirements elicitation framework and an associated library of tools which can be used to augment the functionality of wikis, forums, and specialized management tools used in the requirements domain. Specifically, we will enhance requirements clustering techniques by incorporating prior knowledge and user-derived constraints, and we will develop a contextualized recommender system designed to facilitate appropriate placement of stakeholders into requirements discussion forums generated in the clustering phase. For more information on this and other projects in the Center for Web Intelligence, please contact Professor Bamshad Mobasher or visit:


Ontology-based user modeling for web personalization and recommendation

Users and site owners are increasingly relying on personalization and recommendation software to enable the navigation of large information spaces and the selection of pertinent items. Effective personalization of information access involves two important challenges: accurately identifying the user's context, and organizing the information in a way that matches that particular context. To address these challenges, intelligent information systems must have the ability to seamlessly integrate knowledge representing the current user context; long-term user profiles, representing established preferences; as well as knowledge from ontologies that provide an explicit representation of the domain of interest. Such systems should be able to leverage a variety of sources of evidence, including the social knowledge derived collaboratively from peer users. The goal of this project is to develop a framework for ontological user modeling and study how it can be used in a variety of Web personalization tasks such as search and recommendation. Our aim is to create user profiles that contain both the statistical detail that implicitly or explicitly represents the user's behavior and the semantic richness that can indicate what that behavior might mean. This, in turn, allows the system to distinguish among different contexts whose associated concepts occupy disparate places in the ontology. For more information on this and other projects in the Center for Web Intelligence, please contact Professor Bamshad Mobasher or visit:


Recommender Systems for the Social Web

The influence of social media on the way people use the Web is evident from the immense popularity of sites such as Facebook, Delicious, YouTube, and others. Web users are no longer simple consumers of information interacting with the Web through predefined navigational structures. Instead, social web technologies enable users to actively interact with other users and to participate in describing, creating, or sharing resources. An important part of the social web tapestry is the process of social annotation through which users can associate labels (such as ratings, tags, reviews, etc.) with resources or other users. Annotations create explicit or implicit connections and information channels between users, resources, and labels. The complexity and the dynamic nature of the structures created by these new modes of interaction can provide a challenging environment for users. Recommender systems that assist users in their information seeking and resource sharing activities can therefore play an essential role in the evolution of the social Web. Our goal in this project is to develop a framework for the construction of effective recommender systems for social Web environments, and particularly for social annotation systems. We are conducting empirical analyses across several dimensions, namely, the recommendation tasks (such as resource recommendation, label recommendation, user recommendation), recommendation algorithms (including collaborative and content-based, graph-based, and model-based approaches), and the type of social annotation system (such as those designed for information sharing and discovery versus those designed for social networking). For more information on this and other projects in the Center for Web Intelligence, please contact Professor Bamshad Mobasher or visit:


Trustworthy and Secure Recommender Systems for the Web

Social applications such as recommender systems rely on user input and user profiles to learn and adapt their output for target users. These applications are therefore vulnerable to attacks by rogue agents that may disguise themselves as ordinary users and attempt to affect the output of recommender systems or to subvert social web sites for their own ends. Such attacks are frequent enough that sites must adopt strong countermeasures or be quickly overwhelmed. These countermeasures are, to date, almost entirely manual (responding to customer complaints, monitoring logs for suspicious activities) and entirely post-hoc. The need for such an ongoing investment in manual defensive measures limits the adoption of social web technologies and limits the potential that such systems might achieve. The cyber security community has, so far, had little to say to the operators of open social sites. In this research project, we focus on studying the security properties of open, user-adaptive social web applications. Through the study of the underlying data mining and machine learning algorithms, attack modeling, and empirical evaluation, we will develop attack detection approaches that substantially eliminate the threat posed by attackers seeking to bias the system output in their favor. This work is an extension of our prior research on the security of recommender systems.

Multi-dimensional recommendation in Complex Heterogeneous Networks

This project brings together recent work in both complex heterogeneous information networks and recommender systems to create a recommendation framework suitable for supporting information access in dynamic heterogeneous networks. It looks to personalization techniques to generate recommendations and personalized measures when assessing the effectiveness of different recommendation algorithms. Research has also shown the importance of a multi-criteria approach to measuring recommender system effectiveness. In addition to accuracy, therefore, this project looks at the ability of a system to provide a diverse set of recommendations and to avoid excessive focus on popular items. Our approach is to build a weighted hybrid of low-dimensional recommenders. The scope of the hybrid is controlled by evaluating the information-theoretic contribution of each potential recommendation dimension.
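The idea of a weighted hybrid can be sketched as a weighted sum of the scores produced by several component recommenders. The code below is a minimal illustration under assumptions: the component names, scores, and weights are hypothetical, and the weights are set by hand here, whereas the project selects components by their information-theoretic contribution (a step not shown).

```python
def hybrid_scores(component_scores, weights):
    """Combine per-component item scores into one weighted score per item.

    component_scores: dict mapping component name -> {item: score}
    weights: dict mapping component name -> non-negative weight
    """
    total = sum(weights.get(name, 0.0) for name in component_scores)
    if total <= 0:
        raise ValueError("at least one component needs a positive weight")
    combined = {}
    for name, scores in component_scores.items():
        w = weights.get(name, 0.0) / total  # normalize weights to sum to 1
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + w * score
    return combined


def recommend(component_scores, weights, k=2):
    """Return the k items with the highest combined score."""
    combined = hybrid_scores(component_scores, weights)
    return sorted(combined, key=lambda item: (-combined[item], item))[:k]


# Hypothetical scores from two low-dimensional recommenders.
component_scores = {
    "user_item": {"a": 0.9, "b": 0.2, "c": 0.4},
    "user_tag": {"a": 0.1, "b": 0.8, "c": 0.5},
}
weights = {"user_item": 1.0, "user_tag": 3.0}  # hand-picked for illustration

print(recommend(component_scores, weights))
```

Setting a component's weight to zero removes it from the hybrid, which is one simple way to limit the hybrid's scope to the dimensions that contribute the most.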

Leadership Hospitality Projects

Food and Beverage Analytics and Optimization Modeling

The project is a multi-outlet M-Factor Productivity Analysis of productivity and efficiency through the application of data envelopment analysis (DEA). Traditional partial-factor productivity statistics, such as meals per labor hour, simply do not adequately reflect the many factors that influence the metric. Hence, the DEA methodology will be deployed in this study to examine the individual variables of each food and beverage (FB) outlet as one holistic productivity metric, one that includes traditional operational variables such as revenue, profit, food cost, and labor cost, as well as unconventional variables such as NOI per square foot and REVPASH (revenue per available seat hour), to determine best-practice modeling and illuminate holistic resource allocation constructs.
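As a small worked example of one of the variables named above, REVPASH is revenue divided by the number of available seat hours (seats multiplied by hours open). The sketch below computes it for two outlets; the outlet names and figures are hypothetical, and the DEA step itself is not shown.

```python
def revpash(revenue, seats, hours_open):
    """Revenue per available seat hour: revenue / (seats * hours open)."""
    seat_hours = seats * hours_open
    if seat_hours <= 0:
        raise ValueError("seats and hours_open must both be positive")
    return revenue / seat_hours


# Hypothetical daily figures: (revenue in dollars, seats, hours open).
outlets = {
    "cafe": (2400.0, 40, 10),
    "bistro": (3600.0, 90, 8),
}

for name, (revenue, seats, hours_open) in outlets.items():
    print(f"{name}: REVPASH = {revpash(revenue, seats, hours_open):.2f}")
```

With these made-up numbers the bistro earns more revenue overall, yet the cafe earns more per available seat hour, which shows why a single partial measure can give a misleading picture of productivity.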

Restaurant Revenue Analytics and Predictive–Profit Optimization Scenarios

This project aims to introduce predictive revenue optimization and scenario planning constructs to improve restaurant profitability. The scope of the study includes an exploratory research analysis examining a top-25-grossing US restaurant operation and offers multiple revenue optimization scenarios based on REVPASH-duration, price elasticity of demand, average check, and menu item analysis factors.

Other Applications

In Search of Perfection: Describing the Users and Non-Users of Body Enhancement

The American Society for Aesthetic Plastic Surgery (ASAPS 2006) reports that there were nearly 11.5 million cosmetic surgical and nonsurgical procedures performed in the United States in 2005. Roughly $12.4 billion was spent on them even though medical research shows that many of the products and services used for body enhancement carry with them the possibility of significant risk. In a society where consumers spend billions of dollars every year on items and professional expenses meant to help improve or change their physical appearance and performance (Blendon et al. 2001), identifying and understanding the consumers who use or avoid these enhancing items is critical, although existing knowledge about them is somewhat limited. Using a multi-method approach, data was collected from over 500 people, measuring a variety of personal characteristics such as vanity, self-esteem, level of information search, influence of social norms, and previous history and knowledge of the topic. Through cluster analysis, a typology of consumers is developed that describes the characteristics and behavioral patterns of users, non-users, and potential users of body-enhancing products and services. This research extends previous consumer behavior literature on aesthetic surgery and can be used by physicians for forecasting or marketing strategy creation. Additionally, the research can help public policy makers develop regulatory programs that protect consumer interests. Future research that should be conducted to track this ever-expanding field of personal consumption is also discussed.

Previous Projects

Analysis of legionellosis occurrence

The project is a collaboration of the Data Mining and Predictive Analytics center with the Chicago Public Health Office. A group of CDM graduate students under the supervision of DAMPA faculty is analyzing data on cases of legionellosis in the city of Chicago over the past 10 years. Previous studies have linked increased risk of legionellosis to summertime seasonality and specific weather-related events, such as increases in humidity and temperature. The goal of the analysis is to evaluate the effect of weather-related events and of temporal trends on the occurrence of the disease in Chicago. The results of the study will help develop a predictive model to identify periods with higher risk of legionellosis occurrences.


Big Shoulders Fund project

Big Shoulders Fund is a not-for-profit organization that provides support to Chicago Catholic schools in the neediest areas of the city. The organization’s funding programs help schools and their students in various ways including scholarships, facility improvement and faculty support.

The Data Mining and Predictive Analytics center is collaborating with BSF to analyze and manage the data collected by the organization and evaluate the success of their activities. The project aims to identify key measures of success that can be used by the organization to determine the differential impact of their programs on student enrollment, quality of education, and academic achievement for BSF schools and students.


A data-driven typology of urban communities

The goal of the project is to use data mining techniques to predict changes in urban communities leading to gentrification or abandonment. The study analyzes data on the socioeconomic and housing characteristics of Chicago community areas, and employs multivariate statistical methods and sequence analysis techniques to create a typology representing the social diversity of Chicago neighborhoods and to understand the factors that affect mobility and home ownership.


If you are interested in working on any of these projects as part of your capstone project or independent study, please contact Dr. Daniela Raicu or Dr. Raffaella Settimi.