Intelligent Data Analysis
Dr. Christian Borgelt
The main objectives of the Intelligent Data Analysis and Graphical Models Research Unit are the development of intelligent methods for analyzing data and their application to real-world problems. We emphasize the treatment of complex data, especially fuzzy data and graph data, and focus on statistical methods, graphical modeling and frequent pattern mining as core approaches.
There is a growing need for intelligent data analysis methods. Modern information technology, which produces more powerful computers every year, makes it possible to collect, transfer, combine, and store huge amounts of complex data at very low cost. As a consequence, virtually every company and every scientific and governmental institution collects data in electronic form on a massive scale. However, even though any single bit of information can easily be retrieved and simple aggregations can be computed, general patterns, structures, and regularities often go undetected. To find these patterns, and thus to exploit more of the information contained in the available data, we need intelligent tools that help us transform raw data into useful knowledge. The insights gained may then be used to increase turnover and profit, to improve product quality and customer satisfaction, or more generally to better understand the domain from which the data originate.
To meet these needs, the Intelligent Data Analysis and Graphical Models Research Unit develops and implements a large variety of intelligent data analysis and data mining methods and applies them to real-world problems. Extending classical statistics, we develop methods for parameter estimation, hypothesis testing, and regression models, especially for fuzzy data. We also work on non-parametric inferential statistics, time series analysis, robust statistics, functional data analysis, probabilistic clustering, and graphical models (in the sense of Bayes and Markov networks), and we implement these methods in standard software environments such as R, C++, and Java. We investigate frequent pattern mining methods, for transactional data as well as for graph data, in order to improve their performance and their usefulness in practice. We apply them, for example, in neurobiology (to the analysis of parallel spike trains) and in biochemistry (for drug design and protein folding/unfolding). In addition, we try to improve classical machine learning and soft computing approaches such as decision and regression trees, Bayes classifiers, artificial neural networks, and fuzzy clustering. Finally, in an EU FP7 project called BISON, we are working on graph-based knowledge representation, analysis, and retrieval with the goal of supporting creative information discovery.