: Malu Castellanos, Umeshwar Dayal, Renée J. Miller
: Enabling Real-Time Business Intelligence: Third International Workshop, BIRTE 2009, Held at the 35th International Conference on Very Large Databases, VLDB 2009, Lyon, France, August 24, 2009, Revised Selected Papers
: Springer-Verlag
: 9783642145599
: 1
: CHF 43.90
:
: Other
: English
: 181
: DRM
: PC/MAC/eReader/Tablet
: PDF
This book constitutes the thoroughly refereed post-conference proceedings of the Third International Workshop on Business Intelligence for the Real-Time Enterprise, BIRTE 2009, held in Lyon, France, in August 2009, in conjunction with VLDB 2009, the International Conference on Very Large Data Bases. The volume contains the carefully reviewed selected papers from the workshop: one of the two keynotes, six research papers, two industrial papers, and one experimental paper, as well as the basic statements from the panel discussion on "Merging OLTP and OLAP". The topical focus is on models and concepts, architectures, case studies, and applications of technologies for real-time enterprise business intelligence.
Preface
Organization
Table of Contents
Queries over Unstructured Data: Probabilistic Methods to the Rescue (Keynote)
Unstructured Data in Enterprises
Probabilistic Models for Information Extraction
Representing Noisy Extractions as Imprecise Databases
Multi-attribute Extractions
Imprecise Data Models for Representing Uncertainty of De-duplication
Probability of Two Records Being Duplicates
Probability over Entity Groupings
Queries over Imprecise Duplicates
Concluding Remarks
References
Federated Stream Processing Support for Real-Time Business Intelligence Applications
Introduction
Related Work
The MaxStream Federated Stream Processing System
Architecture
Two Key Building Blocks
Hybrid Queries: Using Persistence with Streams
Using MaxStream in Real-Time BI Scenarios
Reducing Latency in Event-Driven Business Intelligence
Persistent Events in Supply-Chain Monitoring
Other Real-Time BI Applications
Feasibility Study
Conclusions and Future Directions
References
VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans
Introduction
Preliminaries
Review of the Chain Scheduling
Problem Definition
The VPipe Execution Scheme
Change of Operator Logic
Discussion
Stochastic Analysis of Chain
System Model Basis
Case 1: System Analysis for SOS Synchronization
Case 2: System Analysis for IDS Synchronization
Performance Study
Experiment 1: Response Time Comparison
Experiment 2: Broken Pipeline Probability
Related Work
Conclusion
References
Ad-Hoc Queries over Document Collections – A Case Study
Introduction
Query Planning and Query Plan Execution
Understanding “Human-Powered” Query Execution Strategies
Elementary Plan Operators
The Coverage-Join (CJ) and Density-Join (DJ) Operator
Example Query and Example Plans
Plan Enumeration
Case Study
Heuristics for Plan Selection
Results and Discussion
Related Work
Summary and Future Work
References
Appendix: Implementing the KEYWORD-Operator
ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP
Introduction
Grouping Analysis: A Retrospective
Group by
Cubes
Grouping Variables and the MD-Join
Windows
MapReduce
Associated Sets (ASSET) Queries
Definitions
SQL Syntax
DataMingler: A Spreadsheet-Like GUI
ASSET Queries and Data Streams (COSTES)
Financial Application Motivating Examples
COSTES: Continuous Spreadsheet-Like Computations
ASSET Queries and Persistent Data Sources (ASSET QE)
Social Networks: A Motivating Example
ASSET Query Engine (QE)
Conclusions and Future Work
References
Evaluation of Load Scheduling Strategies for Real-Time Data Warehouse Environments
Introduction
System Model and Problem Statement
System Architecture
Workload Model
Scheduling Performance Objective
Problem Statement
Scheduling Policies
Scheduling Algorithms for Push-Based Update Propagation
Evaluation and Discussion
Simulation Framework
Effect of the Data Production Process Length
Comparison of Local and Global Scheduling
Effects of Stage-Concurrent and Long-Running Updates
Ratio of Stage-Concurrent Updates
Pruning of Irretrievable Queries
Effects of Long-Running Update and Queries during Runtime
Related Work
Conclusion
References
Near Real-Time Data Warehousing Using State-of-the-Art ETL Tools
Near Real-Time Data Warehousing
Related Work
Data Wareh