| Preface | 6 |
|---|---|
| The CHIL Consortium | 8 |
| Acknowledgments | 9 |
| Contents | 14 |
| List of Figures | 17 |
| List of Tables | 20 |
| The CHIL Vision and Framework | 22 |
| 1 Computers in the Human Interaction Loop | 23 |
| Perceptual Technologies | 27 |
| 2 Perceptual Technologies: Analyzing the Who, What, Where of Human Interaction | 28 |
| 3 Person Tracking | 30 |
| 3.1 Goals and Challenges | 31 |
| 3.2 Difficulties and Lessons Learned | 33 |
| 3.3 Results and Highlights | 34 |
| References | 39 |
| 4 Multimodal Person Identification | 42 |
| 4.1 Speaker Identification | 44 |
| 4.2 Face Identification | 44 |
| 4.3 Multimodal Person Identification | 47 |
| 4.4 Lessons Learned | 47 |
| References | 49 |
| 5 Estimation of Head Pose | 51 |
| 5.1 Single-Camera Head Pose Estimation | 52 |
| 5.2 Multicamera Head Pose Estimation | 54 |
| 5.3 Conclusion and Future Work | 58 |
| References | 59 |
| 6 Automatic Speech Recognition | 61 |
| 6.1 The ASR Framework in CHIL | 62 |
| 6.2 ASR Preprocessing Steps | 64 |
| 6.3 Main ASR Techniques and Highlights | 66 |
| 6.4 An ASR System Example | 70 |
| 6.5 Experimental Results | 72 |
| 6.6 Conclusions and Discussion | 73 |
| References | 74 |
| 7 Acoustic Event Detection and Classification | 78 |
| 7.1 Acoustic Event Classification | 79 |
| 7.2 Acoustic Event Detection | 83 |
| 7.3 Demonstrations of Acoustic Event Detection | 88 |
| 7.4 Conclusions and Remaining Challenges | 90 |
| References | 90 |
| 8 Language Technologies: Question Answering in Speech Transcripts | 91 |
| 8.1 Question Answering | 92 |
| 8.2 Question Answering: From Written to Spoken Language | 93 |
| 8.3 Fast Question Answering | 95 |
| 8.4 The QAST 2007 Evaluation | 98 |
| 8.5 Conclusions and Discussion | 101 |
| References | 102 |
| 9 Extracting Interaction Cues: Focus of Attention, Body Pose, and Gestures | 103 |
| 9.1 From Head Pose to Focus of Attention | 104 |
| 9.2 Determining Focus of Attention in Dynamic Environments | 105 |
| 9.3 Tracking Body Pose | 106 |
| 9.4 Pointing Gesture and Hand-Raising Detection | 107 |
| 9.5 Detection of Fine-Scale Gestures | 108 |
| References | 109 |
| 10 Emotion Recognition | 110 |
| 10.1 Emotion Recognition for the Socially Supportive Workspaces Scenario | 111 |
| 10.2 Emotion Recognition for the Connector Agent Scenario | 114 |
| 10.3 Discussion | 117 |
| 10.4 Conclusion | 120 |
| References | 120 |
| 11 Activity Classification | 121 |
| 11.1 Visual Activities Recognition in a Smart-Room Environment Using a Probabilistic Syntactic Approach | 122 |
| 11.2 Person Activity Classification Using Gestures | 123 |
| 11.3 Activity Recognition and Room-Level Tracking in an Office Environment | 128 |
| 11.4 Conclusion | 131 |
| References | 132 |
| 12 Situation Modeling | 134 |
| 12.1 Defining Concepts: Role, Relation, Situation, and Situation Network | 135 |
| 12.2 Implementations of the Situation Model | 138 |
| 12.3 Perspective: Automatic Acquisition and Adaptation of Situation Models Based on User Feedback | 143 |
| 12.4 Conclusion | 143 |
| References | 144 |
| 13 Targeted Audio | 146 |
| References | 154 |
| 14 Multimodal Interaction Control | 155 |
| 14.1 Interaction Control in Spoken Dialog Systems | 156 |
| 14.2 Multimodal Output and Interaction Control | 162 |
| References | 167 |
| 15 Perceptual Component Evaluation and Data Collection | 170 |
| 15.1 CHIL Data Overview | 172 |
| 15.2 CHIL Corpus Annotations | 180 |
| 15.3 CHIL Evaluations Overview | 183 |
| 15.4 Conclusions | 186 |
| References | 186 |
| Services | 188 |
| 16 User-Centered Design of CHIL Services: Introduction | 189 |
| 16.1 Methodology | 192 |
| 16.2 Methodological Issues | 193 |
| 16.3 Overview of Part III | 194 |
| References | 195 |
| 17 The Collaborative Workspace: A Co-located Tabletop Device to Support Meetings | 197 |
| 17.1 Related Work | 198 |
| 17.2 User-Centered Design of a Tabletop Interface | 201 |
| 17.3 Initial User Study: Whiteboard as Mock-up | 201 |
| 17.4 The Collaborative Workspace: First Design | 206 |
| 17.5 The Second User Study | 207 |
| 17.6 Re-Thinking the Collaborative Workspace | 210 |
| 17.7 The Collaborative Workspace: Second Design | 211 |
| 17.8 The Third User Study | 211 |
| References | 214 |
| 18 The Memory Jog Service | 216 |
| 18.1 The AIT Memory Jog Service for Meeting, Lecture and Presentation Support | 216 |
| 18.2 The UPC Memory Jog Service | 229 |
| References | 242 |
| 19 The Connector Service: Representing Availability for Mobile Communication | 244 |
| 19.1 The Always-On World: Benefits and Burdens of Mobile Communication | 244 |
| 19.2 The Connector: Representing the Receiver’s Plans to the Sender | 245 |
| 19.3 Situated Aspects of Availability | 249 |
| 19.4 New Communication Modalities: Implicit Availability Representations | 258 |
| 19.5 Conclusions | 262 |
| References | 264 |
| 20 Relational Cockpit | 266 |
| 20.1 Prototype | 267 |
| 20.2 Evaluation | 270 |
| 20.3 Results | 273 |
| 20.4 Conclusion and Lessons Learned | 276 |
| References | 278 |
| 21 Automatic Relational Reporting to Support Group Dynamics | 280 |
| 21.1 Background and Related Work | 280 |
| 21.2 The Survival Task Experiment | 282 |
| 21.3 T
|