DATA MINING RESEARCH GROUP
Data Mining has been established as a vital technology of the New
Economy. Made possible by the large amount of data already captured in
computers (an amount which grows daily) and powered by the union of
technologies ranging from Statistics to Artificial Intelligence to
Databases, Data Mining is an indispensable tool for any organization
that wants to be competitive in its market segment. By leveraging the
data they collect and extracting information from it, organizations
can make more informed decisions about every aspect of their
operations, from supply to customer management to internal production.
The Data Mining group at the Computer Engineering and Computer
Science Department of the Speed Scientific School, University of
Louisville, was officially created at its first meeting on February 22,
2001. The creation of the group is not only an answer to the need for
R&D in the area of Data Mining, but also a recognition that many of the
department's faculty members already possessed relevant knowledge and
experience in this and related areas.
The goals of the group are:
- To leverage the individual knowledge and the practical and research
experience of its members to achieve results beyond the reach of
individuals, through collaboration and pooling of resources.
- To advance the state of the art in the theory and practice of Data
Mining, establishing the department as a national center for research in
the area.
- To serve the community of high-tech companies in Louisville and
the state of Kentucky by creating mutually beneficial relationships with
any interested industry partner. The group will offer very flexible
arrangements to industrial partners, ranging from simple sponsorships or
temporary consulting services to support for long-term projects.
MEMBERS
At present, the group's members are Dr. Patricia Cerrito, Dr. Ahmed
Desoky, Dr. Adel Elmaghraby, Dr. James Graham, Dr. Mehmet Kantardzic,
Dr. Anup Kumar, Dr. Ramohan Ragade, and Dr. Antonio Badia. They are all
faculty in the Computer Engineering and Computer Science department at
the Speed Scientific School, University of Louisville, except
Dr. Cerrito, who is with the Mathematics department in the College of
Arts and Sciences. Together, they bring expertise not only in data
mining, but also in related areas such as knowledge discovery, machine
learning, neural networks, distributed systems, data visualization,
data quality and databases. They have published more than fifty
scientific papers on data mining and related areas in international
conferences and journals. Besides this academic activity, they have
participated in several projects with local industry, hospitals and the
UofL School of Medicine (see the descriptions of some projects below),
and they bring more than twenty years of collective real-world
experience.
Past Projects
The members of the group have considerable practical experience in
applying Data Mining techniques to real problems, as well as experience
in related areas. The following is a brief selection of past projects
by members of the group.
- Machine Learning for Drug Design. This project, developed
jointly with the Institute for Molecular Diversity and Drug Design at the
University of Louisville and the Department of Biostatistics and
Medical Informatics at the University of Wisconsin, involves the use of
machine learning for structure-activity prediction in drug design. In
its simplest form, structure-activity prediction is an ideal example of
an inductive machine learning problem. Active molecules comprise the
set of positive examples, while those molecules with little, or no,
activity comprise the set of negative examples. The machine learning
task is to learn a structured description that will distinguish the
positive examples from the negative ones. Concurrently there have been
new advances in combinatorial chemistry, in which mixtures of thousands
of chemical compounds can be synthesized and tested for bio-activity.
Applying machine learning methods for structure-activity prediction
to the results of combinatorial chemistry experiments promises to be a
major advance in pharmacophore identification and drug design.
- Heuristic Search in Planning Sequences of Views for 3D Object
Recognition, a contract with Space and Naval Warfare Systems Center,
San Diego, CA. This ongoing project uses sequences of images with the
objective of finding comprehensible rules and relations in data
sets. Often these rules are abstractions and require the application of
machine learning techniques such as decision tree learners, artificial neural
networks and/or sparse correlation kernel analysis. Usually two
steps are performed: Data Preprocessing (feature extraction) and
Data Analysis (prediction, modeling and performing).
- Improvements in Diagnostics using Data Mining
Technologies. This project, conducted jointly with the H. Lee Moffitt Cancer Center
and Research Institute of the University of South Florida, studied the
way that Polycythemia Rubra Vera is diagnosed and discovered improved
methods of diagnosis which use fewer (but more relevant) tests.
- Model of Decision Making in a Selection of Laparoscopic
Methods, a joint project with the Department of Gynecology, University
of Louisville.
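As a rough illustration of the inductive learning setting described
under Machine Learning for Drug Design above (active molecules as
positive examples, inactive molecules as negative examples), the
following Python sketch searches for the single feature that best
separates the two sets. It is not the method used in the project: the
feature names, the toy molecules and the function
best_single_feature_rule are all hypothetical and chosen only for
illustration.

```python
from typing import Dict, List, Tuple

# A molecule is described by hypothetical yes/no structural features.
Molecule = Dict[str, bool]


def best_single_feature_rule(
    positives: List[Molecule], negatives: List[Molecule]
) -> Tuple[str, bool, float]:
    """Find the rule 'active iff molecule[feature] == value' that
    classifies the most training examples correctly.

    Returns (feature, value, training accuracy).
    """
    features = sorted(positives[0].keys())
    total = len(positives) + len(negatives)
    best: Tuple[str, bool, float] = ("", True, -1.0)
    for feature in features:
        for value in (True, False):
            # Positives are correct when they match the rule,
            # negatives are correct when they do not.
            correct = sum(1 for m in positives if m[feature] == value)
            correct += sum(1 for m in negatives if m[feature] != value)
            accuracy = correct / total
            if accuracy > best[2]:
                best = (feature, value, accuracy)
    return best


# Hypothetical toy data: three active and two inactive molecules.
active = [
    {"has_ring": True, "has_hydroxyl": True, "is_charged": False},
    {"has_ring": True, "has_hydroxyl": False, "is_charged": False},
    {"has_ring": True, "has_hydroxyl": True, "is_charged": True},
]
inactive = [
    {"has_ring": False, "has_hydroxyl": True, "is_charged": False},
    {"has_ring": False, "has_hydroxyl": False, "is_charged": True},
]

rule = best_single_feature_rule(active, inactive)
print(rule)  # ('has_ring', True, 1.0)
```

Real structure-activity prediction works with far richer structural
descriptions than a handful of yes/no features; the sketch only shows
the shape of the learning task, in which a description is chosen because
it separates the positive examples from the negative ones.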
Materials
Note: These tutorials were not made by the members of this
group. The authors are indicated in the documents. They are here only
as a reference and as a service to whoever visits this page.
A tutorial on Web Mining for Personalization from PKDD 2001 is available here in PDF format.
A tutorial on Text Mining, also from PKDD 2001, is available here in PDF format.
A tutorial on Data Mining for Bio-Informatics is available here in PowerPoint format.
How to Contact Us
If you are interested in contacting the group, please send e-mail to
abadia@louisville.edu
or call 852-0478 on working days from 9:30 am to 4:30 pm. Please do not
hesitate to get in touch with us if you would like further
information!