Research Environment for Data Analysis Tool Allocators

Authors: Jeff Wilkinson, Robert Levinson

Abstract

Intelligent data analysis implies the reasoned application of autonomous or semi-autonomous tools to data sets drawn from problem domains. Automation of this process of reasoning about analysis (based on factors such as available computational resources, cost of analysis, risk of failure, lessons learned from past errors, and tentative structural models of problem domains) is highly non-trivial. By casting the problem of reasoning about analysis (MetaReasoning) as yet another data analysis problem domain, we have previously [R. Levinson and J. Wilkinson, in Advances in Intelligent Data Analysis, edited by X. Liu, P. Cohen, and M. Berthold, volume LNCS 1280, Springer-Verlag, Berlin, pp. 89–100, 1997] presented a design framework, MetaReasoning for Data Analysis Tool Allocation (MRDATA). Crucial to this framework is the ability of a Tool Allocator to track resource consumption (i.e. processor time and memory usage) by the Tools it employs, as well as the ability to allocate measured quantities of resources to these Tools. In order to test implementations of the MRDATA design, we now implement a Runtime Environment for Data Analysis Tool Allocation, RE:DATA. Tool Allocators run as processes under RE:DATA, are allotted system resources, and may use these resources to run their Tools as spawned sub-processes. We also present designs of native RE:DATA implementations of analysis tools used by MRDATA: K-Nearest Neighbor Tables, Regression Trees, Interruptible (“Any-Time”) Regression Trees, and “Hierarchy Diffusion” Temporal Difference Learners. Preliminary results are discussed and techniques for integration with non-native tools are explored.
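The resource-tracking idea in the abstract (a Tool Allocator that grants a measured amount of processor time to a Tool run as a spawned sub-process, then records what was actually consumed) can be illustrated with a minimal sketch. This is not the RE:DATA implementation; the class `ToolAllocator`, the method `run_tool`, and the budget fields are hypothetical names chosen for illustration. It assumes a Unix-like system, uses Python's standard `resource` and `subprocess` modules, and covers CPU time only, not memory.

```python
import resource
import subprocess
import sys


class ToolAllocator:
    """Hypothetical allocator holding a CPU-time budget in seconds."""

    def __init__(self, cpu_budget_s):
        self.cpu_budget_s = float(cpu_budget_s)

    def run_tool(self, code, cpu_grant_s):
        """Run `code` in a child Python process under a CPU-time cap.

        Returns (exit_code, cpu_seconds_consumed) and charges the
        consumed time against the allocator's remaining budget.
        """
        if cpu_grant_s > self.cpu_budget_s:
            raise ValueError("grant exceeds remaining budget")

        def cap_cpu():
            # Runs in the child just before exec. Only the soft limit is
            # lowered; exceeding it delivers SIGXCPU to the child.
            soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
            limit = max(1, int(cpu_grant_s))  # RLIMIT_CPU is in whole seconds
            if hard != resource.RLIM_INFINITY:
                limit = min(limit, hard)
            resource.setrlimit(resource.RLIMIT_CPU, (limit, hard))

        # RUSAGE_CHILDREN accumulates CPU time of waited-for children,
        # so the difference around the call is this Tool's consumption.
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        proc = subprocess.run([sys.executable, "-c", code], preexec_fn=cap_cpu)
        after = resource.getrusage(resource.RUSAGE_CHILDREN)

        used = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
        self.cpu_budget_s -= used
        return proc.returncode, used


if __name__ == "__main__":
    allocator = ToolAllocator(cpu_budget_s=5.0)
    status, used = allocator.run_tool("print(sum(range(100000)))", cpu_grant_s=2)
    print(f"exit={status} used={used:.3f}s remaining={allocator.cpu_budget_s:.3f}s")
```

In this sketch the operating system enforces the grant, while the allocator only does the bookkeeping. A memory grant could be handled analogously with `resource.RLIMIT_AS`, which is likewise an assumption and not something the abstract specifies.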

Keywords: analysis strategies, limited resources, reinforcement learning

Paper URL: https://doi.org/10.1023/A:1008382825019